[jira] [Commented] (SPARK-7316) Add step capability to RDD sliding window

Joseph K. Bradley (JIRA) Wed, 06 May 2015 13:02:51 -0700

    [ https://issues.apache.org/jira/browse/SPARK-7316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14531296#comment-14531296 ]


Joseph K. Bradley commented on SPARK-7316:
------------------------------------------

This definitely makes sense for time series data.

> Add step capability to RDD sliding window
> -----------------------------------------
>
>                 Key: SPARK-7316
>                 URL: https://issues.apache.org/jira/browse/SPARK-7316
>             Project: Spark
>          Issue Type: Improvement
>          Components: MLlib
>    Affects Versions: 1.3.0
>            Reporter: Alexander Ulanov
>             Fix For: 1.4.0
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> RDDFunctions in MLlib contains a sliding window implementation with a fixed
> step of 1. Users should be able to define the step, so this capability should
> be implemented. Although one can generate sliding windows with step 1 and then
> keep only every Nth window, that can take much more time and disk space,
> depending on the window and step sizes. For example, if the window size is
> 1000, step 1 generates roughly a thousand times more data than the initial
> dataset. That makes no sense if you need only every Nth window, for which the
> generated data would be about 1000/N times the size of the initial dataset.
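
To make the size argument in the quoted description concrete, here is a minimal sketch in plain Python. It is not Spark code and does not use the RDDFunctions API. The function name `sliding_windows`, its `step` parameter, and the sample sizes are assumptions chosen for illustration only. It compares "generate every step-1 window, then keep every Nth" with "generate only every Nth window" and counts how many elements each approach materializes.

```python
def sliding_windows(data, window, step=1):
    """Yield full windows of length `window`, one every `step` elements.

    Hypothetical helper for illustration; not part of Spark.
    """
    if window < 1 or step < 1:
        raise ValueError("window and step must be positive")
    for start in range(0, len(data) - window + 1, step):
        yield data[start:start + window]


data = list(range(10_000))   # assumed toy dataset
window, step = 1000, 100     # assumed window size and step N

# Approach 1: build every step-1 window, then keep every Nth one.
filtered = [w for i, w in enumerate(sliding_windows(data, window))
            if i % step == 0]

# Approach 2: build only the windows that are actually needed.
direct = list(sliding_windows(data, window, step))

# Both approaches select the same windows.
assert filtered == direct

# Elements materialized by each approach before any filtering.
step1_elements = sum(len(w) for w in sliding_windows(data, window))
direct_elements = sum(len(w) for w in direct)

print(step1_elements / len(data))   # close to the window size
print(direct_elements / len(data))  # close to window / step
```

Under these assumptions, the step-1 approach materializes about `window` times the input, while the stepped approach materializes about `window / step` times the input, which matches the estimate in the description.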



