Time Series

This page describes the novel Time Series data model we propose, including its data structures and operations. The model is well suited to representing data and metadata on stocks. It consists of time-stamped values organized in vectors, the timestamps themselves forming a calendar time series. The calendar is assumed to be regular and gives a time value (e.g., a date) for each entry. The unit of time is selected when the series is created; it can be any unit from the millisecond to the year, with the day being a common choice. The model also integrates two null values for reconciling conflicting information and capturing missing values. Moreover, time series can be manipulated easily through algebraic expressions built from simple but generic operators. These operators are essentially an adaptation of relational ones (e.g., restriction and join), extended with support for sliding windows (e.g., moving average or relative strength index) and for the application of second-order functions.

Time Series Structure

A time series (TS) is a potentially infinite vector of values; in practice it holds a finite number n of entries. The vector is associated with a calendar giving, for each instant in time, the index of the corresponding entry. Time can be of different granularities (e.g., second, hour, day, and week). The following table illustrates a TS for times 0 to 5.

Time Series A
0 1 2 3 4 5
10 7 13 5 8 12

In general, the content of a time series entry (say TS[i] to designate entry i) is a real number, although typed objects can be envisioned. In any case, the entry value can be unknown (denoted ?) or not existing (denoted !). Thus, two null values are managed in our model. Notice that time series are potentially infinite. As real-world series are not, we virtually pad them by repeating the leftmost value for time indexes less than 0. Thus, we consider that TS[i] = TS[0] for i < 0. This simplifies several algorithms, notably for sliding windows.
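The structure above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names Null and entry are hypothetical, and representing a series as a plain list is a simplification that ignores the calendar.

```python
from enum import Enum


class Null(Enum):
    """The two null values of the model (class name is hypothetical)."""
    UNKNOWN = "?"        # a value exists but is not known
    NOT_EXISTING = "!"   # no value exists for this instant


def entry(ts, i):
    """Return TS[i], padding on the left: TS[i] = TS[0] for i < 0."""
    return ts[0] if i < 0 else ts[i]


A = [10, 7, 13, 5, 8, 12]

# A window ending at time 0 reaches into the virtual padding.
window = [entry(A, i) for i in range(-2, 1)]
print(window)  # [10, 10, 10]
```

With this padding, a sliding window never has to special-case the start of the series.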

Time Series Algebra

Linear Operations

Time series constitute a linear vector space, i.e., a mathematical structure formed by a collection of vectors that may be added (addition is denoted +) together and multiplied (multiplication is denoted *) by numbers. Scalars are real numbers in our case.

Relational Operations

Relational operations on time series are selection and projection. A selection, written filter(input TS, condition), returns a time series that keeps the input value where the condition is satisfied and holds NULL (denoted !) elsewhere. More formally, we denote the selection φcond(TS). For example, filter(A, <= 10) (equivalently φ<=10(A)) gives a time series with the original value A[i] when A[i] is less than or equal to 10 and NULL otherwise. Applied to time series A, it yields the following time series B:

Time Series B
0 1 2 3 4 5
10 7 ! 5 8 !
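As a minimal sketch of the selection, assuming a series is a plain Python list and NULL is represented by the string "!" (both are illustrative choices, as is the name filter_ts):

```python
NULL = "!"  # "not existing" null value, represented here as a string


def filter_ts(ts, condition):
    """Keep each value that satisfies the condition, put NULL elsewhere."""
    return [v if v != NULL and condition(v) else NULL for v in ts]


A = [10, 7, 13, 5, 8, 12]
B = filter_ts(A, lambda v: v <= 10)
print(B)  # [10, 7, '!', 5, 8, '!']
```

Note that the result keeps the same length and calendar as the input; entries are nulled out rather than removed.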

To illustrate algebraic expressions and operations with null values, we give below the time series C = A + 2*B. Notice that NULL added to or multiplied by a value gives NULL.

Time Series C
0 1 2 3 4 5
30 21 ! 15 24 !
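The null-propagating arithmetic behind C can be sketched as follows. The function names add and scale are hypothetical, and only the "!" null is handled; how the unknown value ? combines with ! is not specified here and is left out.

```python
NULL = "!"  # "not existing" null value, represented here as a string


def add(x, y):
    """Entry-wise sum; NULL plus anything is NULL."""
    return [NULL if NULL in (a, b) else a + b for a, b in zip(x, y)]


def scale(k, x):
    """Multiply every entry by the scalar k; NULL stays NULL."""
    return [NULL if a == NULL else k * a for a in x]


A = [10, 7, 13, 5, 8, 12]
B = [10, 7, NULL, 5, 8, NULL]
C = add(A, scale(2, B))
print(C)  # [30, 21, '!', 15, 24, '!']
```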

As mentioned in the introduction, time series can have different time scales. Calendars can differ in time units and in positions in time. To reconcile time series with different calendars, we introduce the synchronize operator synchronize(A, cal, fun), where A is the time series to transform, cal a calendar (given as a time series of times), and fun a function. The operator uses the function either to group the values that fall within one time unit or to split a value across several time units. This operator is useful when multiple time series come from multiple sources at different scales; it is formally denoted Θfun(TS1, TS2).
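The grouping case of synchronize can be sketched as below. Several points are assumptions made for illustration only: a series is a list of (time, value) pairs, the calendar is a sorted list of period start times, a period runs up to the next start, and an empty period yields the "!" null. The splitting case is not covered.

```python
NULL = "!"  # "not existing" null value, represented here as a string


def synchronize(ts, cal, fun):
    """Group the values of ts by the periods of cal and reduce them with fun.

    ts  : list of (time, value) pairs
    cal : sorted list of period start times (assumed representation)
    fun : function applied to the list of values falling in one period
    """
    result = []
    for k, start in enumerate(cal):
        # A period ends where the next one starts; the last one is open-ended.
        end = cal[k + 1] if k + 1 < len(cal) else float("inf")
        group = [v for t, v in ts if start <= t < end]
        result.append(fun(group) if group else NULL)
    return result


daily = list(enumerate([10, 7, 13, 5, 8, 12]))  # values at days 0..5
print(synchronize(daily, [0, 3], sum))  # [30, 25]
print(synchronize(daily, [0, 3], max))  # [13, 12]
```

Passing sum, max, or a last-value function as fun gives different condensations of the same series onto the coarser calendar.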

Other binary operations such as intersection, union, and join of TS can be introduced. For our financial application we only need a kind of join on calendars, denoted ∩fun. It merges two series with the same calendar by applying the function fun to corresponding entries, i.e., it replaces the two values X[i] and Y[i] with the same time index i by fun(X[i], Y[i]). Let us point out that:

  • A ∩- B = A - B

  • A ∩+ B = A + B

Hence, ∩fun is a more general operation that also covers the full (entry-wise) product of TS. For example, D = B ∩* C is the full product of B and C, given below:

Time Series D
0 1 2 3 4 5
300 147 ! 75 192 !
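A minimal sketch of the join, assuming both series are plain lists over the same calendar and that NULL (the string "!" here) propagates as it does for addition; the function name join is hypothetical:

```python
import operator

NULL = "!"  # "not existing" null value, represented here as a string


def join(x, y, fun):
    """Combine entries with the same time index using fun; NULL propagates."""
    return [NULL if NULL in (a, b) else fun(a, b) for a, b in zip(x, y)]


A = [10, 7, 13, 5, 8, 12]
B = [10, 7, NULL, 5, 8, NULL]
C = [30, 21, NULL, 15, 24, NULL]

D = join(B, C, operator.mul)
print(D)                         # [300, 147, '!', 75, 192, '!']
print(join(A, B, operator.sub))  # [0, 0, '!', 0, 0, '!']
```

Because fun is a parameter, the same operator yields sums, differences, products, or any user-defined pairwise rule.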

Windowing Operations

Most indicators proposed to predict the future values of time series (e.g., stock prices) work on sliding time periods called windows. Let [xi-w, xi-w+1, ..., xi] be a window vector sliding over a time series TS as i goes from 0 (the beginning of the series) to n (the last time of the series). To derive a new value from a window, a function must be applied to the corresponding vector. To apply such functions while sliding over a series, we introduce a new algebraic operator called apply, written apply(TS, win, fun). Notice that apply aggregates ordered window values, not only unordered sets of values as classical SQL aggregates do. The parameter win is the window size.

Hence, the function can be a simple aggregate that does not take the vector order into account, such as MIN, MAX, AVG, SUM, ..., or a function defined on ordered vectors, such as MOM, EXP, RSI, ... More precisely:

  • MOM, the momentum, calculated as (TS[i] - TS[i-w]); it is the difference of the series at lag w.

  • EXP, the exponential average, using the formula Output[i] = Multiplier * Input[i] + (1 - Multiplier) * Output[i-1].

  • RSI, the relative strength index; applied to a window at time i, it computes the ratio 100 * SUM(Pos[i]) / (SUM(Pos[i]) - SUM(Neg[i])), where Pos[i] is the positive gain (TS[i] - TS[i-1]) if > 0 and Neg[i] is the negative loss (TS[i] - TS[i-1]) if < 0.

We formally denote Ωfun(TS, win) the application of a window function fun to a window of size win sliding over TS. To illustrate, we give below the time series E = ΩAVG(A, 3) (equivalently apply(A, 3, AVG)), which computes the moving average of A with a sliding window of 3.

Time Series E
0 1 2 3 4 5
10 9 10 8.33 8.67 8.33
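The moving average can be reproduced with a short sketch of apply. It assumes that a window of size win holds the win most recent entries ending at time i, that the series is padded on the left with its first value, and that the series contains no null values; under this reading the entry at time 3 is (7 + 13 + 5) / 3. The helper names are illustrative.

```python
def apply(ts, win, fun):
    """Slide a window of `win` entries over ts and apply fun to each window."""
    def entry(i):
        # Left padding: TS[i] = TS[0] for i < 0.
        return ts[0] if i < 0 else ts[i]

    return [
        fun([entry(j) for j in range(i - win + 1, i + 1)])
        for i in range(len(ts))
    ]


def avg(window):
    """Arithmetic mean of the window values."""
    return sum(window) / len(window)


A = [10, 7, 13, 5, 8, 12]
E = apply(A, 3, avg)
print([round(v, 2) for v in E])  # [10.0, 9.0, 10.0, 8.33, 8.67, 8.33]
```

Since the window is passed to fun as an ordered list, order-sensitive functions can be plugged in the same way as plain aggregates such as min or max.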

We finally give below the time series F = apply(A, 3, RSI), which computes the relative strength index of A with a sliding window of 3.

Time Series F
0 1 2 3 4 5
50 0 66 43 27 100
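The values of F can be checked with the sketch below. It rests on a few assumptions: a window of 3 holds the three most recent entries (hence two successive differences), the series is padded on the left with its first value, and a window with no gain and no loss yields 50, a convention inferred from the entry at time 0. The function names are illustrative.

```python
def apply(ts, win, fun):
    """Slide a window of `win` entries over ts (left-padded with ts[0])."""
    def entry(i):
        return ts[0] if i < 0 else ts[i]

    return [
        fun([entry(j) for j in range(i - win + 1, i + 1)])
        for i in range(len(ts))
    ]


def rsi(window):
    """Relative strength index over the successive differences of a window."""
    diffs = [b - a for a, b in zip(window, window[1:])]
    gains = sum(d for d in diffs if d > 0)    # SUM(Pos), >= 0
    losses = sum(d for d in diffs if d < 0)   # SUM(Neg), <= 0
    if gains == 0 and losses == 0:
        return 50.0  # assumed convention for a flat window
    return 100 * gains / (gains - losses)


A = [10, 7, 13, 5, 8, 12]
F = apply(A, 3, rsi)
print([round(v, 2) for v in F])  # [50.0, 0.0, 66.67, 42.86, 27.27, 100.0]
```

These results agree with Time Series F up to rounding to whole numbers.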

     

In summary, our model is quite powerful: it incorporates time series with calendars and null values (unknown and not existing), combined with classical operators (+, *, φ) and with less classical, more generic ones (Θ, ∩, Ω), which are parametrized by a user function. It is powerful enough to model most indicators and buy/sell rules in the stock market technical analysis domain.