Re: Equal number of RDD Blocks
I also see that it's creating both receivers on the same executor, and that might be the cause of having more RDDs on one executor than the other. Can I tell Spark to create each receiver on a separate executor?

Regards,
Laeeq

On Monday, April 20, 2015 4:51 PM, Evo Eftimov evo.efti...@isecc.com wrote:

And what is the message rate of each topic, mate? That was the other part of the required clarifications.

From: Laeeq Ahmed [mailto:laeeqsp...@yahoo.com]
Sent: Monday, April 20, 2015 3:38 PM
To: Evo Eftimov; user@spark.apache.org
Subject: Re: Equal number of RDD Blocks

Hi,

I have two different topics and two Kafka receivers, one for each topic.

Regards,
Laeeq

On Monday, April 20, 2015 4:28 PM, Evo Eftimov evo.efti...@isecc.com wrote:

What is meant by “streams” here:

1. Two different DStream receivers producing two different DStreams, consuming from two different Kafka topics, each with a different message rate
2. One Kafka topic (hence only one message rate to consider) but with two different DStream receivers (i.e. running in parallel) giving rise to two different DStreams

From: Laeeq Ahmed [mailto:laeeqsp...@yahoo.com.INVALID]
Sent: Monday, April 20, 2015 3:15 PM
To: user@spark.apache.org
Subject: Equal number of RDD Blocks

Hi,

I have two streams of data from Kafka. How can I get an approximately equal number of RDD blocks on the two executors? Please see the attachment: one worker has 1785 RDD blocks and the other has 26.

Regards,
Laeeq
RE: Equal number of RDD Blocks
What is meant by “streams” here:

1. Two different DStream receivers producing two different DStreams, consuming from two different Kafka topics, each with a different message rate
2. One Kafka topic (hence only one message rate to consider) but with two different DStream receivers (i.e. running in parallel) giving rise to two different DStreams

From: Laeeq Ahmed [mailto:laeeqsp...@yahoo.com.INVALID]
Sent: Monday, April 20, 2015 3:15 PM
To: user@spark.apache.org
Subject: Equal number of RDD Blocks

Hi,

I have two streams of data from Kafka. How can I get an approximately equal number of RDD blocks on the two executors? Please see the attachment: one worker has 1785 RDD blocks and the other has 26.

Regards,
Laeeq
Re: Equal number of RDD Blocks
They both have the same message rate, 300 records/sec.

On Monday, April 20, 2015 4:51 PM, Evo Eftimov evo.efti...@isecc.com wrote:

And what is the message rate of each topic, mate? That was the other part of the required clarifications.

From: Laeeq Ahmed [mailto:laeeqsp...@yahoo.com]
Sent: Monday, April 20, 2015 3:38 PM
To: Evo Eftimov; user@spark.apache.org
Subject: Re: Equal number of RDD Blocks

Hi,

I have two different topics and two Kafka receivers, one for each topic.

Regards,
Laeeq

On Monday, April 20, 2015 4:28 PM, Evo Eftimov evo.efti...@isecc.com wrote:

What is meant by “streams” here:

1. Two different DStream receivers producing two different DStreams, consuming from two different Kafka topics, each with a different message rate
2. One Kafka topic (hence only one message rate to consider) but with two different DStream receivers (i.e. running in parallel) giving rise to two different DStreams

From: Laeeq Ahmed [mailto:laeeqsp...@yahoo.com.INVALID]
Sent: Monday, April 20, 2015 3:15 PM
To: user@spark.apache.org
Subject: Equal number of RDD Blocks

Hi,

I have two streams of data from Kafka. How can I get an approximately equal number of RDD blocks on the two executors? Please see the attachment: one worker has 1785 RDD blocks and the other has 26.

Regards,
Laeeq
Re: Equal number of RDD Blocks
Hi,

I have two different topics and two Kafka receivers, one for each topic.

Regards,
Laeeq

On Monday, April 20, 2015 4:28 PM, Evo Eftimov evo.efti...@isecc.com wrote:

What is meant by “streams” here:

1. Two different DStream receivers producing two different DStreams, consuming from two different Kafka topics, each with a different message rate
2. One Kafka topic (hence only one message rate to consider) but with two different DStream receivers (i.e. running in parallel) giving rise to two different DStreams

From: Laeeq Ahmed [mailto:laeeqsp...@yahoo.com.INVALID]
Sent: Monday, April 20, 2015 3:15 PM
To: user@spark.apache.org
Subject: Equal number of RDD Blocks

Hi,

I have two streams of data from Kafka. How can I get an approximately equal number of RDD blocks on the two executors? Please see the attachment: one worker has 1785 RDD blocks and the other has 26.

Regards,
Laeeq
RE: Equal number of RDD Blocks
And what is the message rate of each topic, mate? That was the other part of the required clarifications.

From: Laeeq Ahmed [mailto:laeeqsp...@yahoo.com]
Sent: Monday, April 20, 2015 3:38 PM
To: Evo Eftimov; user@spark.apache.org
Subject: Re: Equal number of RDD Blocks

Hi,

I have two different topics and two Kafka receivers, one for each topic.

Regards,
Laeeq

On Monday, April 20, 2015 4:28 PM, Evo Eftimov evo.efti...@isecc.com wrote:

What is meant by “streams” here:

1. Two different DStream receivers producing two different DStreams, consuming from two different Kafka topics, each with a different message rate
2. One Kafka topic (hence only one message rate to consider) but with two different DStream receivers (i.e. running in parallel) giving rise to two different DStreams

From: Laeeq Ahmed [mailto:laeeqsp...@yahoo.com.INVALID]
Sent: Monday, April 20, 2015 3:15 PM
To: user@spark.apache.org
Subject: Equal number of RDD Blocks

Hi,

I have two streams of data from Kafka. How can I get an approximately equal number of RDD blocks on the two executors? Please see the attachment: one worker has 1785 RDD blocks and the other has 26.

Regards,
Laeeq
RE: Equal number of RDD Blocks
Well, Spark Streaming is supposed to create and distribute the receivers on different cluster nodes. If you are saying that your receivers are actually running on the same node, then that node is probably getting most of the data in order to minimize network transfer costs.

If you want to distribute your data more evenly, you can partition it explicitly.

Also, contact Databricks to ask why the receivers are not being distributed across different cluster nodes.

From: Laeeq Ahmed [mailto:laeeqsp...@yahoo.com]
Sent: Monday, April 20, 2015 3:57 PM
To: Evo Eftimov; user@spark.apache.org
Subject: Re: Equal number of RDD Blocks

I also see that it's creating both receivers on the same executor, and that might be the cause of having more RDDs on one executor than the other. Can I tell Spark to create each receiver on a separate executor?

Regards,
Laeeq

On Monday, April 20, 2015 4:51 PM, Evo Eftimov evo.efti...@isecc.com wrote:

And what is the message rate of each topic, mate? That was the other part of the required clarifications.

From: Laeeq Ahmed [mailto:laeeqsp...@yahoo.com]
Sent: Monday, April 20, 2015 3:38 PM
To: Evo Eftimov; user@spark.apache.org
Subject: Re: Equal number of RDD Blocks

Hi,

I have two different topics and two Kafka receivers, one for each topic.

Regards,
Laeeq

On Monday, April 20, 2015 4:28 PM, Evo Eftimov evo.efti...@isecc.com wrote:

What is meant by “streams” here:

1. Two different DStream receivers producing two different DStreams, consuming from two different Kafka topics, each with a different message rate
2. One Kafka topic (hence only one message rate to consider) but with two different DStream receivers (i.e. running in parallel) giving rise to two different DStreams

From: Laeeq Ahmed [mailto:laeeqsp...@yahoo.com.INVALID]
Sent: Monday, April 20, 2015 3:15 PM
To: user@spark.apache.org
Subject: Equal number of RDD Blocks

Hi,

I have two streams of data from Kafka. How can I get an approximately equal number of RDD blocks on the two executors? Please see the attachment: one worker has 1785 RDD blocks and the other has 26.

Regards,
Laeeq
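
A note on the suggestion in this message to partition the data explicitly: in Spark Streaming this is usually done by calling repartition(numPartitions) on the DStream before further processing. The snippet below is not Spark code. It is a toy model in plain Python, using the block counts reported in this thread, of what an even redistribution across two executors would look like. The executor names and the even_split helper are hypothetical and exist only for illustration.

```python
def even_split(block_counts):
    """Return a new mapping with the same total number of blocks,
    spread as evenly as possible over the same executors.

    This models the *effect* of an explicit repartition, not how
    Spark implements it.
    """
    executors = sorted(block_counts)            # stable, deterministic order
    total = sum(block_counts.values())
    base, extra = divmod(total, len(executors))
    # The first `extra` executors each take one leftover block.
    return {
        name: base + (1 if index < extra else 0)
        for index, name in enumerate(executors)
    }


# Block counts from the thread: both receivers ended up on one executor.
# The executor names are made up for this example.
before = {"executor-1": 1785, "executor-2": 26}
after = even_split(before)

print("before:", before)
print("after: ", after)
```

In a real job the redistribution is not free: moving blocks between executors costs a shuffle over the network, so explicit repartitioning trades some transfer cost for a more balanced load.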