from:"Bathi CCDB"

Re: Replacing groupByKey() with reduceByKey()

2018-08-06 Thread Bathi CCDB

> values of the result, thus I believe you can't just
> replace your groupByKey with that.
>
> Thanks & Regards
> Biplob Biswas
>
>
> On Sat, Aug 4, 2018 at 12:05 AM Bathi CCDB wrote:
>
>> I am trying to replace groupByKey() with reduceByKey(), I am a pyspark

Replacing groupByKey() with reduceByKey()

2018-08-03 Thread Bathi CCDB

I am trying to replace groupByKey() with reduceByKey(). I am a PySpark and Python newbie, and I am having a hard time figuring out the lambda function for the reduceByKey() operation. Here is the code:

    dd = hive_context.read.orc(orcfile_dir).rdd.map(lambda x: (x[0], x)).groupByKey(25).take(2)

Here
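For readers following the thread, here is a minimal sketch of one way the reduceByKey() lambda could be written so that the result keeps roughly the same shape as groupByKey(): wrap each row in a one-element list and merge lists per key. The helper names (to_keyed_list, merge_lists, group_with_reduce_by_key, simulate_locally) are hypothetical and not from the original post, and the rdd argument is assumed to be the RDD of rows from the question. Note that this only mimics the grouping; reduceByKey() generally pays off when the merge actually shrinks the data (a sum or a count, say), which is not the case when every row is kept.

```python
def to_keyed_list(row):
    """Key each row by its first field and wrap the row in a one-element list."""
    return (row[0], [row])


def merge_lists(left, right):
    """Associative merge passed to reduceByKey: concatenate two partial lists."""
    return left + right


def group_with_reduce_by_key(rdd, num_partitions=25):
    """Return (key, list_of_rows) pairs, similar in shape to groupByKey(25).

    Assumption: `rdd` is a PySpark RDD of rows, for example
    hive_context.read.orc(orcfile_dir).rdd as in the question.
    """
    return rdd.map(to_keyed_list).reduceByKey(merge_lists, numPartitions=num_partitions)


def simulate_locally(rows):
    """Pure-Python stand-in for the pipeline above, to check the two functions
    without a Spark cluster. Not part of Spark."""
    grouped = {}
    for key, value in map(to_keyed_list, rows):
        grouped[key] = merge_lists(grouped[key], value) if key in grouped else value
    return grouped
```

With a real RDD, the call would look like group_with_reduce_by_key(hive_context.read.orc(orcfile_dir).rdd).take(2). The values come back as plain Python lists rather than the iterable that groupByKey() yields.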