Re: How to generate a sequential key in rdd across executors

2016-07-24 Thread Marco Mistroni
Hi, how about creating an auto-increment column in HBase? HTH

On 24 Jul 2016 3:53 am, "yeshwanth kumar" wrote:
> Hi,
>
> I am doing a bulk load to HBase using Spark,
> in which I need to generate a sequential key for each record;
> the key should be sequential across all the

Re: How to generate a sequential key in rdd across executors

2016-07-24 Thread Pedro Rodriguez
If you can use a DataFrame, then you could use rank plus a window function, at the expense of an extra sort. Do you have an example of zipWithIndex not working? That seems surprising.

On Jul 23, 2016 10:24 PM, "Andrew Ehrlich" wrote:
> It’s hard to do in a distributed system.
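
For context on why the zipWithIndex report is surprising: Spark's RDD.zipWithIndex is documented to order indices by partition index first and then by position within each partition, so the result is sequential across the whole RDD rather than per executor. The snippet below is a plain-Python sketch of that two-pass idea (count per partition, then number from each partition's offset). It does not call Spark; the function name zip_with_index and the sample partitions are made up for illustration.

```python
from itertools import accumulate


def zip_with_index(partitions):
    """Assign a global 0-based index to every record across all partitions.

    Hypothetical helper that mimics the idea behind RDD.zipWithIndex;
    `partitions` is a list of lists standing in for RDD partitions.
    """
    # Pass 1: count the records in each partition.
    counts = [len(part) for part in partitions]
    # Each partition starts at the total size of all earlier partitions.
    offsets = [0] + list(accumulate(counts))[:-1]
    # Pass 2: each partition numbers its own records from its offset,
    # which needs no coordination between workers.
    return [
        [(record, offset + i) for i, record in enumerate(part)]
        for part, offset in zip(partitions, offsets)
    ]


# Three partitions of uneven size, as might live on different executors.
partitions = [["a", "b"], ["c"], ["d", "e", "f"]]
indexed = zip_with_index(partitions)
```

With the sample input, the indices run from 0 to 5 without gaps or repeats, even though each partition is numbered independently in the second pass.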

Re: How to generate a sequential key in rdd across executors

2016-07-23 Thread Andrew Ehrlich
It’s hard to do in a distributed system. Maybe try generating a meaningful key using a timestamp plus a hash of the unique key fields in the record?

> On Jul 23, 2016, at 7:53 PM, yeshwanth kumar wrote:
>
> Hi,
>
> I am doing a bulk load to HBase using Spark,
> in which I need to
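
One way to read the timestamp-plus-hash suggestion is sketched below in plain Python. The function name make_row_key, the field names, the "|" separator, the choice of SHA-1, and the 16-character digest prefix are all assumptions made for illustration; none of them comes from the thread.

```python
import hashlib
import time


def make_row_key(record, key_fields, ts_millis=None):
    """Build a key of the form <timestamp>-<hash of the unique fields>.

    Hypothetical helper: `record` is a dict and `key_fields` names the
    fields that together identify the record.
    """
    if ts_millis is None:
        ts_millis = int(time.time() * 1000)
    # Join the identifying fields into one string and hash it.
    raw = "|".join(str(record[field]) for field in key_fields)
    digest = hashlib.sha1(raw.encode("utf-8")).hexdigest()[:16]
    # Zero-pad the timestamp so string order matches numeric order.
    return f"{ts_millis:013d}-{digest}"


record = {"user_id": 42, "event": "click"}
key = make_row_key(record, ["user_id", "event"], ts_millis=1469300000000)
```

Keys built this way sort roughly by time and can be generated on any executor without coordination, but they are not gap-free sequential numbers, so this only fits if strict sequence is not a hard requirement.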

How to generate a sequential key in rdd across executors

2016-07-23 Thread yeshwanth kumar
Hi, I am doing a bulk load to HBase using Spark, in which I need to generate a sequential key for each record; the key should be sequential across all the executors. I tried zipWithIndex, but it didn't work, because zipWithIndex gives an index per executor, not across all executors. Looking for some