Hi all,

I created a JIRA issue to discuss adding specialized RDDs for "dimensional"
data (I'm not sure what else to call it), such as time series and spatial
data.  I think Spark could be a better time series and/or spatial "database"
than the existing approaches out there.

https://issues.apache.org/jira/browse/SPARK-4727

I saw that MLlib supports some operations for time series in 1.2.0-rc1, but
I think that specialized RDDs could optimize partitioning and algorithms
better than a regular RDD can.  For example, spatial data could be
partitioned into a grid.
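
To make the grid idea concrete, here is a rough sketch in plain Python (no
Spark dependency) of how a point could be mapped to a partition index on a
fixed grid. The function name grid_cell, its parameters, and the sample
points are all hypothetical and only for illustration; this is not an
existing Spark API, just one way such a partitioner might assign keys.

```python
import math


def grid_cell(x, y, min_x, min_y, cell_size, cols, rows):
    """Map a 2D point to a partition index on a cols x rows grid.

    Hypothetical helper for illustration only. Cells are numbered in
    row-major order, starting from the cell whose corner is (min_x, min_y).
    """
    col = int(math.floor((x - min_x) / cell_size))
    row = int(math.floor((y - min_y) / cell_size))
    # Clamp so points outside the grid bounds fall into an edge cell
    # instead of producing an invalid partition index.
    col = min(max(col, 0), cols - 1)
    row = min(max(row, 0), rows - 1)
    return row * cols + col


# Group a few sample points by grid cell (2 x 2 grid, unit-sized cells).
points = [(0.5, 0.5), (1.5, 0.2), (0.1, 1.9), (1.9, 1.9)]
partitions = {}
for x, y in points:
    idx = grid_cell(x, y, 0.0, 0.0, 1.0, 2, 2)
    partitions.setdefault(idx, []).append((x, y))
```

With a mapping like this, points that are close in space tend to land in the
same partition, so a neighborhood or bounding-box query would only need to
touch the cells it overlaps. A specialized RDD could presumably use the same
idea as its partitioning function, with time buckets playing the role of
grid cells for time series.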

Any feedback would be great!

Thanks,
RJ Nowling

-- 
em rnowl...@gmail.com
