Hi,

I am doing a join of two RDDs, and it gives a different result (counting the
number of joined records) each time I run the code on the same input.
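Simplified, the job is essentially the minimal sketch below (file paths and
the tab-separated parsing are placeholders standing in for the real inputs;
"local[2]" stands in for the single-worker, two-core setup):

import org.apache.spark.SparkContext
import org.apache.spark.SparkContext._  // implicits that add join() to pair RDDs

object JoinCount {
  def main(args: Array[String]) {
    // "local[2]" mimics one worker with two cores; paths are placeholders.
    val sc = new SparkContext("local[2]", "JoinCount")

    // Parse each line into a (key, value) pair; tab-separated fields assumed.
    val left = sc.textFile("path/to/left").map { line =>
      val fields = line.split("\t")
      (fields(0), fields(1))
    }
    val right = sc.textFile("path/to/right").map { line =>
      val fields = line.split("\t")
      (fields(0), fields(1))
    }

    // This count comes out different on each run over the same input.
    println("joined records: " + left.join(right).count())
    sc.stop()
  }
}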

The input files are large enough to be divided into two splits. When the
program runs on two workers with a single core assigned to each, the output is
consistent and looks correct. But when a single worker is used with two or
more cores, the result appears to be random: every run produces a different
count of joined records.

Does this sound like a defect, or is there something I need to take care of
when using join? I am using spark-0.9.1.


Regards
Ajay
