Hi all,

I created a JIRA issue to discuss adding specialized RDDs for "dimensional" data (not sure what else to call it), such as time series and spatial data. I think Spark could be a better time series and/or spatial "database" than the existing approaches.
https://issues.apache.org/jira/browse/SPARK-4727

I saw that MLlib supports some time series operations in 1.2.0-rc1, but I think specialized RDDs could optimize partitioning and algorithms better than a regular RDD can. For example, spatial data could be partitioned into a grid.

Any feedback would be great!

Thanks,
RJ Nowling

--
em rnowl...@gmail.com
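
P.S. To make the grid idea a bit more concrete, here is a rough sketch in plain Python (not Spark code, and not a proposed API). The names grid_cell and partition_points, the uniform cell size, and the clamping of out-of-bounds points to edge cells are all assumptions of mine for illustration. In Spark, a function like grid_cell could presumably serve as the partition function behind a custom partitioner, so that nearby points land in the same partition.

```python
from collections import defaultdict


def grid_cell(x, y, min_x, min_y, cell_size, cols, rows):
    """Map a 2D point to a partition id on a uniform cols x rows grid.

    Hypothetical helper: points outside the grid are clamped to the
    nearest edge cell rather than rejected.
    """
    col = int((x - min_x) // cell_size)
    row = int((y - min_y) // cell_size)
    col = min(max(col, 0), cols - 1)  # clamp to valid column range
    row = min(max(row, 0), rows - 1)  # clamp to valid row range
    return row * cols + col           # row-major cell id


def partition_points(points, min_x, min_y, cell_size, cols, rows):
    """Group (x, y) points by grid cell id (a stand-in for partitions)."""
    parts = defaultdict(list)
    for x, y in points:
        cell = grid_cell(x, y, min_x, min_y, cell_size, cols, rows)
        parts[cell].append((x, y))
    return dict(parts)


if __name__ == "__main__":
    pts = [(0.5, 0.5), (0.7, 0.2), (3.5, 0.5), (0.5, 1.5)]
    # 4 x 4 grid of unit cells with its origin at (0, 0)
    for cell, members in sorted(partition_points(pts, 0.0, 0.0, 1.0, 4, 4).items()):
        print(cell, members)
```

A range query over a bounding box would then only need to touch the cells that overlap the box, which is the kind of optimization a regular hash-partitioned RDD cannot offer.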