Right, from what I've read I figured I'd need a custom partitioner!
Documentation on this is super sparse; do you have any recommended links on
solving data skew and/or creating custom partitioners in Spark 1.4?
I'd also love to hear if this is an unusual problem with my type of set-up -
Afternoon all,
Really loving this project and the community behind it. Thank you all for
your hard work.
This past week, though, I've been having a hard time getting my first
deployed job to run without failing at the same point every time: Right
after a leftOuterJoin, most partitions (600
You can bump up the number of partitions via a parameter to the join
operator. However, you also have a data skew problem, which you need to
resolve with a reasonable partitioning function.
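
To make the suggestion above a bit more concrete, here is a minimal sketch of one common way to deal with a skewed join key: "salting" the hot keys on the large side and replicating the matching rows on the small side. It is plain Python with no Spark dependency, so it only simulates which partition each record would land in. All names and constants (`NUM_PARTITIONS`, `SALT_BUCKETS`, `salt_left`, `replicate_right`, and so on) are hypothetical, and this is an assumption about what a "reasonable" partitioning scheme could look like, not something the thread confirms. In PySpark, the partition count can be passed as `rdd.leftOuterJoin(other, numPartitions)`, and a function like `salted_partition` could be handed to `partitionBy(numPartitions, partitionFunc)`.

```python
import zlib
from collections import Counter

NUM_PARTITIONS = 8  # hypothetical partition count
SALT_BUCKETS = 4    # hypothetical; should not exceed NUM_PARTITIONS


def base_partition(key, num_partitions=NUM_PARTITIONS):
    # Deterministic stand-in for an ordinary hash partitioner.
    return zlib.crc32(str(key).encode("utf-8")) % num_partitions


def salted_partition(salted_key, num_partitions=NUM_PARTITIONS):
    # Offset the base partition by the salt, so the different salts of
    # one hot key land in different partitions.
    key, salt = salted_key
    return (base_partition(key, num_partitions) + salt) % num_partitions


def salt_left(records, hot_keys, salt_buckets=SALT_BUCKETS):
    # Large side of the join: hot keys get a round-robin salt, all other
    # keys get salt 0. (A real job would more likely use a random salt.)
    out = []
    for i, (key, value) in enumerate(records):
        salt = i % salt_buckets if key in hot_keys else 0
        out.append(((key, salt), value))
    return out


def replicate_right(records, hot_keys, salt_buckets=SALT_BUCKETS):
    # Small side of the join: copy each hot-key row once per salt so that
    # every salted key on the large side still finds its match.
    out = []
    for key, value in records:
        salts = range(salt_buckets) if key in hot_keys else [0]
        out.extend(((key, salt), value) for salt in salts)
    return out


def partition_sizes(records, partition_func, num_partitions=NUM_PARTITIONS):
    # Count how many records each partition would receive.
    counts = Counter(partition_func(key, num_partitions) for key, _ in records)
    return [counts.get(p, 0) for p in range(num_partitions)]


if __name__ == "__main__":
    # Made-up data: one hot key with 1000 rows plus 100 ordinary keys.
    left = [("hot", i) for i in range(1000)] + [("k%d" % i, i) for i in range(100)]
    print("plain :", partition_sizes(left, base_partition))
    print("salted:", partition_sizes(salt_left(left, {"hot"}), salted_partition))
```

The trade-off is that the small side grows by a factor of `SALT_BUCKETS` for each hot key, and the salt has to be stripped from the key again after the join. Simply raising the number of partitions does not help on its own, because every record with the same key still hashes to the same partition.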
On 7 Jul 2015 08:57, Mohammed Omer beancinemat...@gmail.com wrote:
Afternoon all,
Really loving this project and the