[jira] [Updated] (SPARK-9744) Add Java RDD method to map with lag and lead

2015-08-07 Thread Jerry Z (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-9744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jerry Z updated SPARK-9744:
---
Summary: Add Java RDD method to map with lag and lead  (was: Add RDD method 
to map with lag and lead)

 Add Java RDD method to map with lag and lead
 

 Key: SPARK-9744
 URL: https://issues.apache.org/jira/browse/SPARK-9744
 Project: Spark
  Issue Type: Wish
Reporter: Jerry Z

 To avoid zipping with index and doing numerous maps and joins, it would help 
 to have a single map-like method call that takes two additional parameters: 
 (1) a list of offsets (negative for lag, 0 for current, positive for lead) 
 and (2) a default value. The other difference from map is that the function 
 takes an argument of List<T> and not just T.
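
 A minimal plain-Java sketch of the semantics requested above, run on an 
 in-memory list rather than an RDD. The class name LagLeadSketch and the method 
 name mapWithLagAndLead are hypothetical and are not part of any Spark API. It 
 also assumes the default value is what fills a slot whose offset falls outside 
 the data, which the description does not spell out.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

// Hypothetical names throughout: nothing here is an existing Spark API.
class LagLeadSketch {

    // For each position i, collect the elements at i + offset for every offset
    // (negative = lag, 0 = current, positive = lead). When i + offset falls
    // outside the data, defaultValue is used instead (an assumption about the
    // intended meaning of the "default value" parameter). fn then maps the
    // resulting List<T> window to one output element.
    static <T, R> List<R> mapWithLagAndLead(List<T> data,
                                            List<Integer> offsets,
                                            T defaultValue,
                                            Function<List<T>, R> fn) {
        List<R> out = new ArrayList<>(data.size());
        for (int i = 0; i < data.size(); i++) {
            List<T> window = new ArrayList<>(offsets.size());
            for (int offset : offsets) {
                int j = i + offset;
                window.add(j >= 0 && j < data.size() ? data.get(j) : defaultValue);
            }
            out.add(fn.apply(window));
        }
        return out;
    }

    public static void main(String[] args) {
        List<Integer> data = Arrays.asList(10, 20, 30, 40);
        // Next element minus previous element, with 0 standing in at the edges.
        List<Integer> result = mapWithLagAndLead(
                data, Arrays.asList(-1, 0, 1), 0,
                w -> w.get(2) - w.get(0));
        System.out.println(result); // [20, 20, 20, -30]
    }
}
```

 The window passed to the function has one entry per offset, in the order the 
 offsets were given, which is why the function takes List<T> instead of T.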



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-9744) Add Java RDD method to map with lag and lead

2015-08-07 Thread Sean Owen (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-9744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Owen updated SPARK-9744:
-
Priority: Minor  (was: Major)

This sounds a bit like a windowing function. This is something you can do with 
a zip-with-index and join -- does that not work?
Or you can get the same effect efficiently* with mapPartitions, using Scala 
collections methods to look at pairs of elements at a time. This also depends 
on the elements being sorted.

*You'd fail to compare elements across partition boundaries; sometimes that is 
a non-starter, sometimes that's fine.
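
The per-partition idea and its boundary caveat can be illustrated with a small 
sketch. It is written in plain Java rather than with Scala collections, and it 
simulates partitions as in-memory lists instead of calling Spark. The class 
name PartitionPairsSketch and the method name adjacentDiffs are hypothetical. 
The method body is roughly what one might pass to mapPartitions.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Plain-Java stand-in for per-partition logic; names are hypothetical.
class PartitionPairsSketch {

    // Walk one partition's iterator and emit (current - previous) for each
    // adjacent pair. The first element of a partition has no predecessor
    // visible here, so it produces no output.
    static List<Integer> adjacentDiffs(Iterator<Integer> partition) {
        List<Integer> out = new ArrayList<>();
        if (!partition.hasNext()) {
            return out;
        }
        int previous = partition.next();
        while (partition.hasNext()) {
            int current = partition.next();
            out.add(current - previous);
            previous = current;
        }
        return out;
    }

    public static void main(String[] args) {
        // Two simulated partitions of the sorted sequence 1, 3, 6, 10.
        List<List<Integer>> partitions = Arrays.asList(
                Arrays.asList(1, 3), Arrays.asList(6, 10));
        List<Integer> diffs = new ArrayList<>();
        for (List<Integer> p : partitions) {
            diffs.addAll(adjacentDiffs(p.iterator()));
        }
        // The pair (3, 6) straddles the partition boundary and is never compared.
        System.out.println(diffs); // [2, 4]
    }
}
```

Run over the whole sequence as one partition, the same method would also yield 
the difference 3 for the pair (3, 6); splitting the data is what drops it.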

