Re: partitionBy with partitioned column in output?

2018-02-26 Thread Alex Nastetsky

Yeah, was just discussing this with a co-worker and came to the same conclusion -- need to essentially create a copy of the partition column. Thanks. Hacky, but it works. Seems counter-intuitive that Spark would remove the column from the output... should at least give you an option to keep it.
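
A minimal sketch of the "copy the partition column" workaround described above, assuming a local Spark 2.x session. The column name `foo_part` and the object name are made up for illustration and do not come from the thread. Spark drops the columns named in `partitionBy` from the written records and encodes them in the directory names instead, so partitioning on the copy leaves the original `foo` in each JSON record.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.col

object KeepPartitionColumn {
  def main(args: Array[String]): Unit = {
    // Assumption: a local session is enough for this illustration.
    val spark = SparkSession.builder()
      .appName("keep-partition-column")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    // Same sample data as in the thread.
    val df = Seq((1, 10), (2, 20)).toDF("foo", "bar")

    df.withColumn("foo_part", col("foo")) // hypothetical name for the copy
      .write
      .partitionBy("foo_part")            // the copy goes into the directory names
      .json("json-out")                   // records still contain foo and bar

    spark.stop()
  }
}
```

With this layout the output should land under directories such as `json-out/foo_part=1/`, and each record inside keeps both `foo` and `bar`. The cost is a redundant column when the data is read back with partition discovery.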

Re: partitionBy with partitioned column in output?

2018-02-26 Thread naresh Goud

Does this help?

sc.parallelize(List((1, 10), (2, 20))).toDF("foo", "bar")
  .map(r => (r.getInt(0), (r.getInt(0), r.getInt(1))))  // keep a copy of foo inside the payload
  .toDF("foo", "data")
  .write.partitionBy("foo").json("json-out")

On Mon, Feb 26, 2018 at 4:28 PM, Alex Nastetsky wrote:
> Is there a way to make outputs created with "partitionBy" to