If you can use a DataFrame, then you could use rank plus a window function, at the expense of an extra sort. Do you have an example of zipWithIndex not working? That seems surprising.

On Jul 23, 2016 10:24 PM, "Andrew Ehrlich" <and...@aehrlich.com> wrote:
> It’s hard to do in a distributed system. Maybe try generating a meaningful
> key using a timestamp plus hashed unique key fields in the record?
>
> > On Jul 23, 2016, at 7:53 PM, yeshwanth kumar <yeshwant...@gmail.com> wrote:
> >
> > Hi,
> >
> > I am doing a bulk load to HBase using Spark, in which I need to
> > generate a sequential key for each record; the key should be
> > sequential across all the executors.
> >
> > I tried zipWithIndex, but it didn't work because it gave an index
> > per executor, not across all executors.
> >
> > Looking for some suggestions.
> >
> > Thanks,
> > -Yeshwanth
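The rank-plus-window-function suggestion above amounts to imposing one global ordering and numbering rows along it. In PySpark that would look roughly like `row_number().over(Window.orderBy(...))`; the snippet below is a hedged, Spark-free simulation of the same idea, with hypothetical records and a hypothetical `ts` ordering field:

```python
# Spark-free simulation of numbering rows over a global ordering.
# In PySpark this would be roughly:
#   from pyspark.sql import Window, functions as F
#   df.withColumn("seq", F.row_number().over(Window.orderBy("ts")))
# The records and the "ts" field below are hypothetical.
records = [{"ts": 30, "v": "c"}, {"ts": 10, "v": "a"}, {"ts": 20, "v": "b"}]

ranked = [
    {**rec, "seq": i + 1}  # 1-based sequential key, as row_number() produces
    for i, rec in enumerate(sorted(records, key=lambda r: r["ts"]))
]
```

The extra sort mentioned in the reply is the cost of `Window.orderBy` here: establishing the global order is what makes the numbering sequential across executors.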
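On the zipWithIndex question: the technique it relies on is to count each partition first, then offset each partition's local indices by the cumulative count of the preceding partitions, which yields indices that are sequential across partitions. A minimal sketch of that mechanism, with plain lists standing in for executor-local partitions:

```python
# Sketch of the count-then-offset technique behind a global index:
# count each partition, then shift each partition's local indices by
# the total size of all preceding partitions.
partitions = [["a", "b"], ["c"], ["d", "e", "f"]]  # hypothetical data

# Cumulative start offset for each partition.
offsets = [0]
for part in partitions[:-1]:
    offsets.append(offsets[-1] + len(part))

indexed = [
    (elem, offsets[p] + i)  # global index = partition offset + local index
    for p, part in enumerate(partitions)
    for i, elem in enumerate(part)
]
```

This is why the reply finds the per-executor-index report surprising: the offset step is exactly what makes the result span all partitions rather than restart in each one.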