Jean, I see your point. For the incremental data, which is very small, I should make sure that the PARTITION BY in the OVER(PARTITION BY ...) is left out so that all the data will be in one partition when assigned a row number. The query below should avoid any problems.
“SELECT ROW_NUMBER() OVER() + b.id_max AS id, a.* FROM source a CROSS JOIN (SELECT COALESCE(MAX(id),0) AS id_max FROM tmp_destination) b”. But initially, I’ll use the monotonicallyIncreasingId function when I first load the data. Thanks, Ben > On Oct 10, 2016, at 8:36 AM, Jean Georges Perrin <j...@jgp.net> wrote: > > Is there only one process adding rows? because this seems a little risky if > you have multiple threads doing that… > >> On Oct 8, 2016, at 1:43 PM, Benjamin Kim <bbuil...@gmail.com >> <mailto:bbuil...@gmail.com>> wrote: >> >> Mich, >> >> After much searching, I found and am trying to use “SELECT ROW_NUMBER() >> OVER() + b.id_max AS id, a.* FROM source a CROSS JOIN (SELECT >> COALESCE(MAX(id),0) AS id_max FROM tmp_destination) b”. I think this should >> do it. >> >> Thanks, >> Ben >> >> >>> On Oct 8, 2016, at 9:48 AM, Mich Talebzadeh <mich.talebza...@gmail.com >>> <mailto:mich.talebza...@gmail.com>> wrote: >>> >>> can you get the max value from the current table and start from MAX(ID) + >>> 1 assuming it is a numeric value (it should be)? >>> >>> HTH >>> >>> HTH >>> >>> Dr Mich Talebzadeh >>> >>> LinkedIn >>> https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw >>> >>> <https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw> >>> >>> http://talebzadehmich.wordpress.com <http://talebzadehmich.wordpress.com/> >>> >>> Disclaimer: Use it at your own risk. Any and all responsibility for any >>> loss, damage or destruction of data or any other property which may arise >>> from relying on this email's technical content is explicitly disclaimed. >>> The author will in no case be liable for any monetary damages arising from >>> such loss, damage or destruction. >>> >>> >>> On 8 October 2016 at 17:42, Benjamin Kim <bbuil...@gmail.com >>> <mailto:bbuil...@gmail.com>> wrote: >>> I have a table with data already in it that has primary keys generated by >>> the function monotonicallyIncreasingId. Now, I want to insert more data >>> into it with primary keys that will auto-increment from where the existing >>> data left off. How would I do this? There is no argument I can pass into >>> the function monotonicallyIncreasingId to seed it. >>> >>> Thanks, >>> Ben >>> >>> >>> --------------------------------------------------------------------- >>> To unsubscribe e-mail: user-unsubscr...@spark.apache.org >>> <mailto:user-unsubscr...@spark.apache.org> >>> >>> >> >