Re: [R] Functional data analysis for unequal length and unequal width time series

2018-12-17 Thread Jeff Newmiller
You will learn something useful if you search for "rolling join". The zoo 
package can handle this, as can the data.table package (read the vignette).
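
For illustration, a rolling join pairs each time in one series with the most recent observation at or before it in the other. The sketch below does this in base R with findInterval() instead of zoo or data.table, a substitution made here only to keep the example free of package dependencies. The data frames ser1 and ser2 reuse the first two series from the original post, with the time column renamed to t for convenience.

```r
# First two series from the original post, stored as data frames.
ser1 <- data.frame(
  t  = c(24.51, 24.67, 24.91, 24.95, 25.10, 25.35, 25.50, 25.55, 25.67),
  V1 = c(-0.1710, -0.0824, -0.0419, -0.0416, -0.0216, -0.0792, -0.0656,
         -0.0273, -0.0589)
)
ser2 <- data.frame(
  t  = c(24.5, 24.67, 24.91, 24.98, 25.14, 25.38),
  V2 = c(-0.0280, -0.1980, -0.2556, 0.3131, 0.3231, 0.2264)
)

# For each time in ser1, index of the last ser2 time that is <= that time.
idx <- findInterval(ser1$t, ser2$t)
idx[idx == 0L] <- NA  # no earlier ser2 observation to roll forward

# Rolling (last-observation-carried-forward) join of ser2 onto ser1's times.
joined <- data.frame(t = ser1$t, V1 = ser1$V1, V2_locf = ser2$V2[idx])
print(joined)
```

The dedicated packages handle ties, rolling direction and large data more conveniently than this hand-rolled version.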

Your decision to pad with NA at the end was ill-considered... the first point 
of your first series is between the first two points of your second series... 
you need to interleave the points somehow.
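
To make the interleaving concrete, here is a small base-R sketch that merges the first two series from the original post on the union of their time points. The column name t and the use of merge() with all = TRUE are choices made for this illustration, not something prescribed in the thread.

```r
# First two series from the original post, with a shared time column name.
ser1 <- data.frame(
  t  = c(24.51, 24.67, 24.91, 24.95, 25.10, 25.35, 25.50, 25.55, 25.67),
  V1 = c(-0.1710, -0.0824, -0.0419, -0.0416, -0.0216, -0.0792, -0.0656,
         -0.0273, -0.0589)
)
ser2 <- data.frame(
  t  = c(24.5, 24.67, 24.91, 24.98, 25.14, 25.38),
  V2 = c(-0.0280, -0.1980, -0.2556, 0.3131, 0.3231, 0.2264)
)

# Full outer join on time: one row per distinct time point, sorted by t.
# NA marks the times at which a series was not observed.
merged <- merge(ser1, ser2, by = "t", all = TRUE)
print(merged)
```

In the result the NAs are scattered through both columns instead of sitting in a block at the end, which is where an interpolation step would fill them in.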

You will need to decide whether you want to use piecewise linear approximation 
(as with the base "approx" function) or the more stable 
last-observation-carried-forward ("locf") or cubic splines or something more 
exotic like Fourier interpolation to identify the new interpolated "y" values 
in each series.
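
As a rough comparison of the first three options, the sketch below evaluates the second series from the original post at the time points of the first, using only base R (approx() and spline()). The variable names v2_linear, v2_locf and v2_spline are made up for this example.

```r
t1 <- c(24.51, 24.67, 24.91, 24.95, 25.10, 25.35, 25.50, 25.55, 25.67)
t2 <- c(24.5, 24.67, 24.91, 24.98, 25.14, 25.38)
V2 <- c(-0.0280, -0.1980, -0.2556, 0.3131, 0.3231, 0.2264)

# Piecewise linear interpolation; NA outside the observed range of t2.
v2_linear <- approx(t2, V2, xout = t1)$y

# Last observation carried forward: a step function holding the left value.
# Also NA outside the observed range of t2 (default rule = 1).
v2_locf <- approx(t2, V2, xout = t1, method = "constant", f = 0)$y

# Natural cubic spline through the observed points; note that spline()
# extrapolates beyond the range of t2 instead of returning NA.
v2_spline <- spline(t2, V2, xout = t1, method = "natural")$y

print(cbind(t1, v2_linear, v2_locf, v2_spline))
```

The last three time points of t1 lie beyond the end of t2, so the linear and locf columns are NA there while the spline column contains extrapolated values that should be treated with caution.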

You can avoid the rolling join if you intend to resample the series to have 
points at regular intervals.  Just apply your preferred interpolation technique 
with your intended mesh of regular time values to each of your series in turn 
and then use cbind with the results.
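
A minimal sketch of that resampling step, in base R, might look like the following. It assumes linear interpolation, a mesh of 12 points, and a mesh restricted to the time range that all four example series cover; each of those is an arbitrary choice for illustration, and the list name series and its elements s1 to s4 are invented here.

```r
# The four series from the original post, stored as (t, v) pairs.
series <- list(
  s1 = list(t = c(24.51, 24.67, 24.91, 24.95, 25.10, 25.35, 25.50, 25.55, 25.67),
            v = c(-0.1710, -0.0824, -0.0419, -0.0416, -0.0216, -0.0792,
                  -0.0656, -0.0273, -0.0589)),
  s2 = list(t = c(24.5, 24.67, 24.91, 24.98, 25.14, 25.38),
            v = c(-0.0280, -0.1980, -0.2556, 0.3131, 0.3231, 0.2264)),
  s3 = list(t = c(24.51, 24.67, 24.91, 24.95, 25.10, 25.35, 25.50, 25.55,
                  25.65, 25.88, 25.97, 25.99),
            v = c(0.0897, -0.0533, -0.3497, -0.5684, -0.4294, -0.1109,
                  0.0352, 0.0550, -0.0536, 0.0185, -0.0295, -0.0324)),
  s4 = list(t = c(24.5, 24.67, 24.71, 24.98, 25.17),
            v = c(-0.0280, -0.1980, -0.2556, 0.3131, 0.3231))
)

# Regular mesh over the time range covered by every series.
lo <- max(sapply(series, function(s) min(s$t)))
hi <- min(sapply(series, function(s) max(s$t)))
mesh <- seq(lo, hi, length.out = 12)

# Interpolate each series onto the mesh in turn, then bind the columns.
# rule = 2 only guards against round-off at the two ends of the mesh.
resampled <- do.call(cbind, lapply(series, function(s) {
  approx(s$t, s$v, xout = mesh, rule = 2)$y
}))
print(cbind(t = mesh, resampled))
```

The result has one row per mesh point and one column per series, with no NA values. Restricting the mesh to the common range discards the tails of the longer series; whether that is acceptable depends on the analysis.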

I don't know anything about the package you mention, but getting time series 
data aligned is a common preprocessing step for many time series analyses.

Oh, and you should probably be familiar with the CRAN Time Series Task View 
[1].

PS you should provide a link back to your original posting when moving the 
conversation to a different venue in case the discussion doesn't stay dead 
there.

[1] https://cran.r-project.org/web/views/TimeSeries.html

On December 17, 2018 8:50:09 AM PST, so...@iastate.edu wrote:
>Dear All,
>I apologize if you have already seen this on Stack Overflow. I
>have not received any response there, so I am posting for help here.
>
>I have data on 1318 time series. Many of these series are of unequal
>length. In addition, the series are not all observed at the same time
>points. For example, consider the following four series:
>
>t1 <- c(24.51, 24.67, 24.91, 24.95, 25.10, 25.35, 25.50, 25.55, 25.67)
>V1 <- c(-0.1710, -0.0824, -0.0419, -0.0416, -0.0216, -0.0792, -0.0656,
>-0.0273, -0.0589)
>ser1 <- cbind(t1, V1)
>
>t2 <- c(24.5, 24.67, 24.91, 24.98, 25.14, 25.38)
>V2 <- c(-0.0280, -0.1980, -0.2556, 0.3131, 0.3231, 0.2264)
>ser2 <- cbind(t2, V2)
>
>t3 <- c(24.51, 24.67, 24.91, 24.95, 25.10, 25.35, 25.50, 25.55, 25.65,
>25.88, 25.97, 25.99)
>V3 <- c(0.0897, -0.0533, -0.3497, -0.5684, -0.4294, -0.1109, 0.0352,
>0.0550, -0.0536, 0.0185, -0.0295, -0.0324)
>ser3 <- cbind(t3, V3)
>
>t4 <- c(24.5, 24.67, 24.71, 24.98, 25.17)
>V4 <- c(-0.0280, -0.1980, -0.2556, 0.3131, 0.3231)
>ser4 <- cbind(t4, V4)
>
>Here t1, t2, t3, t4 are the time points and V1, V2, V3, V4 are the
>observations made at those time points. The time points in the actual
>data are Julian dates, so they look like these but are much larger
>decimal figures such as 2452450.6225.
>
>I am trying to cluster these time series using a functional data approach,
>for which I am using the "funFEM" package in R. The examples provided are
>for equispaced, equal-length time series, so I am not sure how to use
>the package for my data. Initially I tried making all the time
>series equal in length to the series with the largest number of
>observations (here ser3) by padding the shorter series with NAs.
>Following this approach, I built ser2 as
>
>t2_n <- c(24.5, 24.67, 24.91, 24.98, 25.14, 25.38, 25.50, 25.55, 25.65,
>25.88, 25.97, 25.99)
>V2_na <- c(V2, rep(NA, 6))
>ser2_na <- cbind(t2_n, V2_na)
>
>Note that to make t2 equal in length to t3 I appended the last 6 time
>points from t3. To make V2 equal in length to V3 I added NAs.
>
>Then I created my data matrix as
>
>dat <- rbind(V1_na, V2_na, V3, V4_na)
>
>The code I used was
>
>require(funFEM)
>basis<- create.fourier.basis(c(min(t3), max(t3)), nbasis = 25) 
>fdobj <- smooth.basis(c(min(t3), max(t3)) ,dat, basis)$fd
>
>Note that the range is constructed using the maximum and minimum time
>points of the ser3 series.
>
>res <- funFEM(fdobj, K = 2:9, model = "all", crit = "bic", init =
>"random") 
>
>But this gives me an error
>
>Error in svd(X) : infinite or missing values in 'x'.
>
>Can anyone please tell me how to handle this dataset with this package,
>or suggest an alternative package?
>
>Sincerely,
>Souradeep
>
>__
>R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>https://stat.ethz.ch/mailman/listinfo/r-help
>PLEASE do read the posting guide
>http://www.R-project.org/posting-guide.html
>and provide commented, minimal, self-contained, reproducible code.

-- 
Sent from my phone. Please excuse my brevity.



Re: [R] Functional data analysis for unequal length and unequal width time series

2018-12-17 Thread Bert Gunter
Specialized: Probably need to email the maintainer. See ?maintainer
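
For readers unfamiliar with it, maintainer() is a function in the utils package that returns the Maintainer field of an installed package. The sketch below queries the base "stats" package only so that it runs anywhere; substituting "funFEM" assumes that package is installed.

```r
# maintainer() reads the Maintainer field from an installed package's
# DESCRIPTION file and returns it as a character string (NA if not installed).
who <- utils::maintainer("stats")
print(who)

# Hypothetical usage for the package discussed here, if it is installed:
# utils::maintainer("funFEM")
```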

Cheers,
Bert

Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )




[R] Functional data analysis for unequal length and unequal width time series

2018-12-17 Thread soura
Dear All,
I apologize if you have already seen this on Stack Overflow. I
have not received any response there, so I am posting for help here.

I have data on 1318 time series. Many of these series are of unequal
length. In addition, the series are not all observed at the same time
points. For example, consider the following four series:

t1 <- c(24.51, 24.67, 24.91, 24.95, 25.10, 25.35, 25.50, 25.55, 25.67)
V1 <- c(-0.1710, -0.0824, -0.0419, -0.0416, -0.0216, -0.0792, -0.0656,
-0.0273, -0.0589)
ser1 <- cbind(t1, V1)

t2 <- c(24.5, 24.67, 24.91, 24.98, 25.14, 25.38)
V2 <- c(-0.0280, -0.1980, -0.2556, 0.3131, 0.3231, 0.2264)
ser2 <- cbind(t2, V2)

t3 <- c(24.51, 24.67, 24.91, 24.95, 25.10, 25.35, 25.50, 25.55, 25.65,
25.88, 25.97, 25.99)
V3 <- c(0.0897, -0.0533, -0.3497, -0.5684, -0.4294, -0.1109, 0.0352,
0.0550, -0.0536, 0.0185, -0.0295, -0.0324)
ser3 <- cbind(t3, V3)

t4 <- c(24.5, 24.67, 24.71, 24.98, 25.17)
V4 <- c(-0.0280, -0.1980, -0.2556, 0.3131, 0.3231)
ser4 <- cbind(t4, V4)

Here t1, t2, t3, t4 are the time points and V1, V2, V3, V4 are the
observations made at those time points. The time points in the actual
data are Julian dates, so they look like these but are much larger
decimal figures such as 2452450.6225.

I am trying to cluster these time series using a functional data approach,
for which I am using the "funFEM" package in R. The examples provided are
for equispaced, equal-length time series, so I am not sure how to use
the package for my data. Initially I tried making all the time
series equal in length to the series with the largest number of
observations (here ser3) by padding the shorter series with NAs.
Following this approach, I built ser2 as

t2_n <- c(24.5, 24.67, 24.91, 24.98, 25.14, 25.38, 25.50, 25.55, 25.65,
25.88, 25.97, 25.99)
V2_na <- c(V2, rep(NA, 6))
ser2_na <- cbind(t2_n, V2_na)

Note that to make t2 equal in length to t3 I appended the last 6 time
points from t3. To make V2 equal in length to V3 I added NAs.

Then I created my data matrix as

dat <- rbind(V1_na, V2_na, V3, V4_na)

The code I used was

require(funFEM)
basis<- create.fourier.basis(c(min(t3), max(t3)), nbasis = 25) 
fdobj <- smooth.basis(c(min(t3), max(t3)) ,dat, basis)$fd

Note that the range is constructed using the maximum and minimum time
points of the ser3 series.

res <- funFEM(fdobj, K = 2:9, model = "all", crit = "bic", init =
"random") 

But this gives me an error

Error in svd(X) : infinite or missing values in 'x'.

Can anyone please tell me how to handle this dataset with this package,
or suggest an alternative package?

Sincerely,
Souradeep