[ https://issues.apache.org/jira/browse/SPARK-4727?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14234399#comment-14234399 ]
RJ Nowling commented on SPARK-4727:
-----------------------------------

Thanks, Jeremy! Your work may cover my needs, and if not, it seems like a great place to contribute to!

Was there some talk about encouraging people to build Spark libraries and putting together a community list? I'd love to see this sort of work advertised more.

> Add "dimensional" RDDs (time series, spatial)
> ---------------------------------------------
>
>                 Key: SPARK-4727
>                 URL: https://issues.apache.org/jira/browse/SPARK-4727
>             Project: Spark
>          Issue Type: Brainstorming
>          Components: Spark Core
>    Affects Versions: 1.1.0
>            Reporter: RJ Nowling
>
> Certain types of data (time series, spatial) can benefit from specialized
> RDDs. I'd like to open a discussion about this.
> For example, time series data should be ordered by time and would benefit
> from operations like:
> * Subsampling (taking every n data points)
> * Signal processing (correlations, FFTs, filtering)
> * Windowing functions
> Spatial data benefits from ordering and partitioning along a 2D or 3D grid.
> For example, path finding algorithms can be optimized by only comparing points
> within a set distance, which can be computed more efficiently by partitioning
> data into a grid.
> Although the operations on time series and spatial data may be different,
> there is some commonality in the sense of the data having ordered dimensions
> and the implementations may overlap.
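
To make the grid-partitioning idea from the issue description concrete, here is a minimal sketch in plain Python. It is not Spark code and not a proposed API: the names grid_cell and pairs_within are hypothetical, and it assumes the cell size equals the search radius, so that any two points within that radius must fall in the same or adjacent cells. A distributed version would presumably key each point by its cell and partition on that key.

```python
from collections import defaultdict
from itertools import product
from math import dist, floor


def grid_cell(point, cell_size):
    """Map a point (a tuple of floats) to the integer coordinates of its grid cell."""
    return tuple(floor(c / cell_size) for c in point)


def pairs_within(points, radius):
    """Return sorted index pairs (i, j), i < j, whose points lie within `radius`.

    Assumption for this sketch: cells are `radius` wide, so only the same
    cell and its immediate neighbours need to be checked for each point.
    """
    # Bucket point indices by grid cell.
    cells = defaultdict(list)
    for i, p in enumerate(points):
        cells[grid_cell(p, radius)].append(i)

    dims = len(points[0]) if points else 0
    # Offsets to the cell itself and all adjacent cells (3**dims of them).
    offsets = list(product((-1, 0, 1), repeat=dims))

    found = set()
    for cell, members in cells.items():
        for off in offsets:
            neighbour = tuple(c + o for c, o in zip(cell, off))
            # .get avoids creating empty buckets while iterating.
            for j in cells.get(neighbour, ()):
                for i in members:
                    if i < j and dist(points[i], points[j]) <= radius:
                        found.add((i, j))
    return sorted(found)


if __name__ == "__main__":
    sample = [(0.0, 0.0), (0.5, 0.5), (3.0, 3.0), (3.2, 3.1), (10.0, 10.0)]
    print(pairs_within(sample, 1.0))
```

Each point is compared only against points in nearby cells rather than against the whole data set, which is the saving the description refers to. The same code handles 3D points, since the neighbour offsets are generated from the dimensionality of the input.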