[ https://issues.apache.org/jira/browse/SPARK-10868?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Martin Senne updated SPARK-10868:
---------------------------------
    Description:
With SPARK-7135 and https://github.com/apache/spark/pull/5709, `monotonicallyIncreasingID()` allows creating an index column with unique ids. The indexing always starts at 0 (no offset).

*Feature wish*
Having a parameter `offset`, such that the function can be used as
{{monotonicallyIncreasingID( offset )}}
and indexing *starts at `offset` instead of 0*.

*Use-case*
Add rows to a DataFrame that is already written to a DB (via _.write.jdbc(...)_).
In detail:
- A DataFrame *A* (containing an ID column) with indices from 0 to 199 in that column exists in the DB.
- New rows need to be added to *A*. This includes
-- Creating a DataFrame *A'* with the new rows, but without an id column
-- Adding the index column to *A'*, this time starting at *200*, as there are already entries with ids from 0 to 199 (*here, monotonicallyIncreasingID( 200 ) is required.*)
-- Taking the union of *A* and *A'*
-- Storing the result into the DB

  was:
With SPARK-7135 and https://github.com/apache/spark/pull/5709, `monotonicallyIncreasingID()` allows creating an index column with unique ids. The indexing always starts at 0 (no offset).

**Feature wish**: Having a parameter `offset`, such that the function can be used as
{{monotonicallyIncreasingID( offset )}}
and indexing *starts at `offset` instead of 0*.

**Background and justification**: Add rows to a DataFrame that is already written to a DB (via .jdbc).
In detail:
- A DataFrame A (containing an ID column) with indices from 0 to 199 in that column exists in the DB.
- New rows need to be added to A. This includes
-- Creating a DataFrame **A'** with the new rows, but without an id column
-- Adding the index column to **A'**, this time starting at **200**, as there are already entries with ids from 0 to 199 (**here is where monotonicallyIncreasingID( 200 ) is required.**)
-- Taking the union of **A** and **A'**
-- Storing the result into the DB

> monotonicallyIncreasingId() supports offset for indexing
> --------------------------------------------------------
>
>                 Key: SPARK-10868
>                 URL: https://issues.apache.org/jira/browse/SPARK-10868
>             Project: Spark
>          Issue Type: New Feature
>          Components: SQL
>    Affects Versions: 1.5.0
>            Reporter: Martin Senne
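
The use-case steps in the description can be sketched in plain Python, without Spark, to show what the requested `offset` parameter would do. This is only an illustration under stated assumptions: `with_index` and its `offset` argument are hypothetical names, rows are modelled as dictionaries, and ids are assumed to be consecutive, as in the 0 to 199 example from the issue.

```python
# Plain-Python model of the requested behaviour; no Spark required.
# `with_index` and its `offset` parameter are hypothetical names.

def with_index(rows, offset=0, id_column="id"):
    """Return copies of `rows` with a consecutive id column starting at `offset`."""
    return [dict(row, **{id_column: offset + i}) for i, row in enumerate(rows)]


# DataFrame A: 200 rows already stored in the DB with ids 0..199.
a = with_index([{"value": "old-%d" % i} for i in range(200)])

# DataFrame A': new rows that do not yet have an id column.
a_new = [{"value": "new-%d" % i} for i in range(3)]

# Start numbering where A ends, which is what the `offset` argument would do.
next_id = max(row["id"] for row in a) + 1
a_new_indexed = with_index(a_new, offset=next_id)

# Union of A and A'; in Spark this is the step before writing back via JDBC.
combined = a + a_new_indexed
ids = [row["id"] for row in combined]
```

Deriving the offset from the largest existing id, rather than from a row count, keeps the new ids unique even if the existing ids have gaps. That matters because Spark documents the ids it generates as unique and increasing but not necessarily consecutive.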