[ https://issues.apache.org/jira/browse/SPARK-18688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15715201#comment-15715201 ]
Herman van Hovell commented on SPARK-18688:
-------------------------------------------

For my reference: Could you give an example of how you would write this using a cartesian join?

> Interpolated time series join
> -----------------------------
>
> Key: SPARK-18688
> URL: https://issues.apache.org/jira/browse/SPARK-18688
> Project: Spark
> Issue Type: Improvement
> Components: SQL
> Affects Versions: 2.0.2
> Reporter: Jarno Seppanen
>
> Time series joins are very common in analytics tasks. A simple example would
> be joining the newest value of number of followers from data frame F with
> sessions from data frame S. Currently, a cross join is needed for such joins
> in Spark, making them practically impossible.
> Example syntax:
> {noformat}
> SELECT l.account_id, l.time AS login_time, f.num_followers
> FROM account_login l
> LEFT JOIN follower_count_changed f
> ON (f.account_id = l.account_id
> AND l.time INTERPOLATE PREVIOUS VALUE f.time)
> {noformat}
> In essence, I'd like to have support for efficiently running joins like
> INTERPOLATE PREVIOUS VALUE joins in Vertica [1].
> Thanks for your consideration,
> Jarno
> [1]
> https://my.vertica.com/docs/7.1.x/HTML/index.htm#Authoring/SQLReferenceManual/LanguageElements/Predicates/INTERPOLATE.htm

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
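
For illustration only, here is one way the "previous value" join from the issue might be written without INTERPOLATE: an inequality join that pairs each login with every earlier follower row for the same account, an aggregate that picks the latest of those rows, and a join back to fetch its value. This is a sketch and not necessarily the formulation the reporter has in mind. The table and column names come from the example in the issue; the sample rows, the `latest` CTE, and the `previous_value_join` helper are assumptions made up for the sketch, and the SQL is run through Python's standard `sqlite3` module rather than Spark.

```python
import sqlite3

# Table and column names follow the example in the issue; the rows are made up.
SETUP = """
CREATE TABLE account_login (account_id INTEGER, time INTEGER);
CREATE TABLE follower_count_changed (
    account_id INTEGER, time INTEGER, num_followers INTEGER);
INSERT INTO account_login VALUES (1, 10), (1, 25), (2, 5);
INSERT INTO follower_count_changed VALUES
    (1, 5, 100), (1, 20, 150), (1, 30, 200), (2, 7, 50);
"""

# Step 1 (CTE "latest", a hypothetical name): inequality join on time plus
# MAX() to find, per login, the newest follower row at or before the login.
# Step 2: join back on that timestamp to pick up num_followers.
QUERY = """
WITH latest AS (
    SELECT l.account_id, l.time AS login_time, MAX(f.time) AS f_time
    FROM account_login l
    LEFT JOIN follower_count_changed f
      ON f.account_id = l.account_id AND f.time <= l.time
    GROUP BY l.account_id, l.time
)
SELECT latest.account_id, latest.login_time, f.num_followers
FROM latest
LEFT JOIN follower_count_changed f
  ON f.account_id = latest.account_id AND f.time = latest.f_time
ORDER BY latest.account_id, latest.login_time
"""


def previous_value_join():
    """Run the sketch against an in-memory SQLite database and return rows."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.executescript(SETUP)
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for row in previous_value_join():
        print(row)
```

The intermediate result of the first join grows with the number of earlier follower rows per login, which is the cost the issue is asking to avoid. Logins with no earlier follower row keep a NULL num_followers, and two follower rows sharing the same account and timestamp would produce duplicate output rows.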