[ https://issues.apache.org/jira/browse/SPARK-10868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14944108#comment-14944108 ]
Reynold Xin commented on SPARK-10868:
-------------------------------------

[~MartinSenne] this makes sense, and shouldn't be too hard to do. Are you interested in submitting a pull request for this?

> monotonicallyIncreasingId() supports offset for indexing
> --------------------------------------------------------
>
>                 Key: SPARK-10868
>                 URL: https://issues.apache.org/jira/browse/SPARK-10868
>             Project: Spark
>          Issue Type: New Feature
>          Components: SQL
>    Affects Versions: 1.5.0
>            Reporter: Martin Senne
>
> With SPARK-7135 and https://github.com/apache/spark/pull/5709,
> `monotonicallyIncreasingID()` allows creating an index column with unique
> ids. The indexing always starts at 0 (no offset).
>
> *Feature wish*
> Having a parameter `offset`, such that the function can be used as
> {{monotonicallyIncreasingID( offset )}}
> and indexing _starts at *offset* instead of 0_.
>
> *Use-case*
> Add rows to a DataFrame that has already been written to a DB (via
> _.write.jdbc(...)_).
> In detail:
> - A DataFrame *A* (containing an ID column) with indices from 0 to 199
> in that column already exists in the DB.
> - New rows need to be added to *A*. This includes
> -- Creating a DataFrame *A'* with the new rows, but without an id column
> -- Adding the index column to *A'*, this time starting at *200*, as there are
> already entries with ids from 0 to 199 (*here, monotonicallyIncreasingID( 200
> ) is required.*)
> -- Taking the union of *A* and *A'*
> -- Storing the result in the DB
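
The requested offset can be modeled in plain Python. This is a sketch only, not Spark code, and it is not taken from the issue. It assumes the ID layout described in the Spark API documentation for this function: partition index in the upper 31 bits, row position within the partition in the lower 33 bits. The names `ids_with_offset` and `ROW_BITS` are hypothetical.

```python
# Plain-Python model of the assumed ID layout; this does not call Spark.
ROW_BITS = 33  # assumption: lower 33 bits hold the row position in a partition


def ids_with_offset(partition_sizes, offset=0):
    """Return one list of IDs per partition, each ID shifted by `offset`."""
    return [
        [offset + (partition << ROW_BITS) + row for row in range(size)]
        for partition, size in enumerate(partition_sizes)
    ]


# Two partitions holding 2 and 3 new rows, with indexing starting at 200.
ids = ids_with_offset([2, 3], offset=200)
flat = [i for part in ids for i in part]
```

Under that assumption the shifted IDs stay unique and increasing, but they are consecutive only within one partition: the first partition yields 200 and 201, while the second starts at 200 + 2**33. Continuing exactly at 200, 201, 202, ... as in the use-case would therefore require all new rows to sit in a single partition.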