Re: Help in generating unique Id in spark row

2017-01-05 Thread Olivier Girardot
There is a way: you can use org.apache.spark.sql.functions.monotonicallyIncreasingId, which will give each row of your DataFrame a unique ID. On Tue, Oct 18, 2016 10:36 AM, ayan guha guha.a...@gmail.com wrote: Do you have any primary key or unique identifier in your data? Even if multiple
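A note on what these IDs look like: the Spark documentation describes the generated IDs as monotonically increasing and unique but not consecutive, with the partition index in the upper 31 bits and the record number within the partition in the lower 33 bits. The snippet below is only a plain-Python sketch of that layout, not Spark code; `sketch_id` and the sample partitions are hypothetical names made up for illustration.

```python
# Illustration only: mimics the documented ID layout in plain Python.
# It does not call Spark; sketch_id is a hypothetical helper.
RECORD_BITS = 33  # lower 33 bits hold the record number within a partition


def sketch_id(partition_index: int, record_number: int) -> int:
    """Put the partition index in the upper bits, the record number in the lower 33."""
    return (partition_index << RECORD_BITS) | record_number


# Two made-up partitions with three and two rows.
partitions = [["a", "b", "c"], ["d", "e"]]
ids = [
    sketch_id(p, r)
    for p, rows in enumerate(partitions)
    for r, _ in enumerate(rows)
]
print(ids)  # [0, 1, 2, 8589934592, 8589934593]
```

Because the value is derived from the partition index and the row's position in that partition, the IDs are unique within one run but have gaps, and the same row can receive a different ID if the data is partitioned or ordered differently on a later run.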

Re: Help in generating unique Id in spark row

2016-10-18 Thread ayan guha
Do you have any primary key or unique identifier in your data? Even if multiple columns can make a composite key? In other words, can your data contain two exactly identical rows with different unique IDs? Also, does the ID have to be numeric? You may want to pursue a hashing algorithm such as the SHA group to
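The hashing idea can be sketched roughly as follows. This is a minimal plain-Python illustration under assumptions, not the poster's actual method: `row_hash_id`, the column names, and the sample rows are all hypothetical, and SHA-256 is picked here only as one member of the SHA family.

```python
import hashlib


def row_hash_id(row: dict, key_columns: list) -> str:
    """Derive a deterministic ID from the columns that form the composite key."""
    # Join values with a separator unlikely to occur in the data, so that
    # ("ab", "c") and ("a", "bc") do not collapse into the same string.
    joined = "\x1f".join(str(row[col]) for col in key_columns)
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()


# Hypothetical sample rows and key columns.
row_a = {"user": "alice", "ts": "2016-10-17T19:27", "amount": 10}
row_b = {"user": "bob", "ts": "2016-10-17T19:27", "amount": 10}

id_first_run = row_hash_id(row_a, ["user", "ts"])
id_second_run = row_hash_id(row_a, ["user", "ts"])  # e.g. after a retry
id_other_row = row_hash_id(row_b, ["user", "ts"])
```

An ID built this way depends only on the row's content, so recomputing it after a failure gives the same value. The trade-off is the one raised in the questions above: two rows with identical key columns get the same ID, and the result is a hex string rather than a number.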

Re: Help in generating unique Id in spark row

2016-10-17 Thread Saurav Sinha
Can anyone help me out?
On Mon, Oct 17, 2016 at 7:27 PM, Saurav Sinha wrote:
> Hi,
>
> I am in a situation where I want to generate a unique ID for each row.
>
> I have used monotonicallyIncreasingId, but it gives increasing values
> and starts generating from the start again if it

Help in generating unique Id in spark row

2016-10-17 Thread Saurav Sinha
Hi, I am in a situation where I want to generate a unique ID for each row. I have used monotonicallyIncreasingId, but it gives increasing values and starts generating from the start again if it fails. I have two questions here: Q1. Does this method give me a unique ID even in a failure situation, because I want to