Figured it out: I need to override method preferredLocation() in MyReceiver 
class. 

     On Wednesday, March 4, 2015 3:35 PM, Du Li <l...@yahoo-inc.com.INVALID> 
wrote:
   

 Hi,
I have a set of machines (say 5) and want to evenly launch a number (say 8) of 
kafka receivers on those machines. In my code I did something like the 
following, as suggested in the spark docs:        val streams = (1 to 
numReceivers).map(_ => ssc.receiverStream(new MyKafkaReceiver()))        
ssc.union(streams)
However, from the spark UI, I saw that some machines are not running any 
instance of the receiver while some get three. The mapping changed every time 
the system was restarted. This impacts the receiving and also the processing 
speeds.
I wonder if it's possible to control/suggest the distribution so that it would 
be more even. How is the decision made in spark?
Thanks,Du



   

Reply via email to