
Re: Can we get the partition Index in an UDF

2018-06-25 Thread Vadim Semenov

Try using `TaskContext`:

    import org.apache.spark.TaskContext
    val partitionId = TaskContext.getPartitionId()

On Mon, Jun 25, 2018 at 11:17 AM Lalwani, Jayesh wrote:
> We are trying to add a column to a Dataframe with some data that is seeded by
> some random data. We want to be able to
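For illustration, here is a minimal sketch of how the `TaskContext` suggestion might be used inside a UDF to derive a reproducible, partition-dependent value. It is not taken from the thread: the object name `PartitionSeedSketch`, the helper `mixSeed`, the base seed `42L`, and the column names are all hypothetical, and it assumes Spark 2.3 or later (for `asNondeterministic()`) running in local mode.

```scala
import org.apache.spark.TaskContext
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, udf}
import scala.util.Random
import scala.util.hashing.MurmurHash3

object PartitionSeedSketch {
  // Hypothetical helper: mixes the base seed, partition index, and a row key
  // into one seed, so the result depends on all three inputs.
  def mixSeed(baseSeed: Long, partitionId: Int, rowKey: Long): Long =
    MurmurHash3.productHash((baseSeed, partitionId, rowKey)).toLong

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[2]")
      .appName("partition-seed-sketch")
      .getOrCreate()

    val baseSeed = 42L // assumed user-controlled seed

    // The result depends on where the row is processed, not only on its
    // arguments, so the UDF is marked nondeterministic.
    val partitionIdUdf = udf(() => TaskContext.getPartitionId()).asNondeterministic()

    // Creates a fresh generator per row, seeded from (baseSeed, partition, row key).
    val seededRandomUdf = udf { (rowKey: Long) =>
      val seed = mixSeed(baseSeed, TaskContext.getPartitionId(), rowKey)
      new Random(seed).nextDouble()
    }.asNondeterministic()

    val df = spark.range(0, 8).repartition(2)
      .withColumn("partition_id", partitionIdUdf())
      .withColumn("random_value", seededRandomUdf(col("id")))

    df.show()
    spark.stop()
  }
}
```

Note that output like this repeats across runs only if each row lands in the same partition every time; if the partitioning changes, the partition index (and so the derived seed) changes with it.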

Can we get the partition Index in an UDF

2018-06-25 Thread Lalwani, Jayesh

We are trying to add a column to a Dataframe with some data that is seeded by some random data. We want to be able to control the seed, so multiple runs of the same transformation generate the same output. We also want to generate different random numbers for each partition. This is easy to do