Not that I've seen, at least not in any worker-independent way. To guarantee consecutive values you'd have to create a UDF or some such that provides a new row id. This probably isn't an issue on small data sets, but it would cause a lot of added communication on larger clusters / datasets.
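
A minimal sketch of where that added communication comes from, written as plain Python rather than Spark code. Partitions are modeled as lists, and the helper name assign_consecutive_ids is hypothetical. Note that it swaps in the two-pass idea behind the RDD zipWithIndex method (count the rows per partition, then give each partition a starting offset) rather than a UDF; the first pass is the step where every worker has to report back before any ID can be assigned.

```python
from itertools import accumulate


def assign_consecutive_ids(partitions):
    """Assign IDs 0..n-1 across partitions (each partition is a list of rows).

    Plain-Python simulation, not Spark code: 'partitions' stands in for
    the per-worker slices of a distributed dataset.
    """
    # Pass 1: every partition reports its row count. On a cluster this is
    # the extra round of communication (all counts go to one place).
    counts = [len(p) for p in partitions]

    # Starting offset of each partition = total rows in earlier partitions.
    offsets = [0] + list(accumulate(counts))[:-1]

    # Pass 2: each partition numbers its own rows starting from its offset.
    return [
        [(offset + i, row) for i, row in enumerate(part)]
        for offset, part in zip(offsets, partitions)
    ]


# Hypothetical data: three partitions of (bid, val) rows.
parts = [[("a", 1), ("b", 2)], [("c", 3)], [("d", 4), ("e", 5)]]
result = assign_consecutive_ids(parts)
ids = [row_id for part in result for row_id, _ in part]
```

By contrast, monotonically_increasing_id needs no counting pass, since each partition can build its IDs from its own partition number; that is why its values are unique but not consecutive.
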
Mike

> On Aug 5, 2016, at 11:21 AM, janardhan shetty <janardhan...@gmail.com> wrote:
>
> Mike,
>
> Any suggestions on doing it for consecutive IDs?
>
>> On Aug 5, 2016 9:08 AM, "Tony Lane" <tonylane....@gmail.com> wrote:
>> Mike.
>>
>> I have figured out how to do this. Thanks for the suggestion. It works great.
>> I am trying to figure out the performance impact of this.
>>
>> thanks again
>>
>>
>>> On Fri, Aug 5, 2016 at 9:25 PM, Tony Lane <tonylane....@gmail.com> wrote:
>>> @mike - this looks great. How can I do this in Java? What is the
>>> performance implication on a large dataset?
>>>
>>> @sonal - I can't have a collision in the values.
>>>
>>>> On Fri, Aug 5, 2016 at 9:15 PM, Mike Metzger <m...@flexiblecreations.com>
>>>> wrote:
>>>> You can use the monotonically_increasing_id method to generate guaranteed
>>>> unique (but not necessarily consecutive) IDs by calling something like:
>>>>
>>>> df.withColumn("id", monotonically_increasing_id())
>>>>
>>>> You don't mention which language you're using, but you'll need to pull in
>>>> the sql.functions library.
>>>>
>>>> Mike
>>>>
>>>>> On Aug 5, 2016, at 9:11 AM, Tony Lane <tonylane....@gmail.com> wrote:
>>>>>
>>>>> Ayan - basically I have a dataset with this structure, where the bids are
>>>>> unique string values
>>>>>
>>>>> bid: String
>>>>> val: integer
>>>>>
>>>>> I need unique int values for these string bids to do some processing in
>>>>> the dataset
>>>>>
>>>>> like
>>>>>
>>>>> id: int (unique integer id for each bid)
>>>>> bid: String
>>>>> val: integer
>>>>>
>>>>>
>>>>>
>>>>> -Tony
>>>>>
>>>>>> On Fri, Aug 5, 2016 at 6:35 PM, ayan guha <guha.a...@gmail.com> wrote:
>>>>>> Hi
>>>>>>
>>>>>> Can you explain a little further?
>>>>>>
>>>>>> best
>>>>>> Ayan
>>>>>>
>>>>>>> On Fri, Aug 5, 2016 at 10:14 PM, Tony Lane <tonylane....@gmail.com>
>>>>>>> wrote:
>>>>>>> I have a row with a structure like
>>>>>>>
>>>>>>> identifier: String
>>>>>>> value: int
>>>>>>>
>>>>>>> All identifiers are unique, and I want to generate a unique long id for
>>>>>>> the data and get a row object back for further processing.
>>>>>>>
>>>>>>> I understand using the zipWithUniqueId function on RDD, but that would
>>>>>>> mean first converting to an RDD and then joining the RDD and dataset back
>>>>>>>
>>>>>>> What is the best way to do this?
>>>>>>>
>>>>>>> -Tony
>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> Best Regards,
>>>>>> Ayan Guha