As suggested by Qiaou, I looked at the UI:

1) Under 'Stages', the only 'active' stage is: runJob at ReceiverTracker.scala:275
2) Under 'Executors', there's only 1 active task, but I don't see any output (or logs)
3) Under 'Streaming', there's one receiver called 'KafkaReciever-0', but 'Records in last batch' is 0.
Honestly, I think it's not connecting to my Kafka topic - possibly because I
need to pass the following parameter:

    metadata.broker.list -> <machine>:9092

But I don't know how to pass this to KafkaUtils.createStream(...). Could that
be the problem?

On Tue, Nov 4, 2014 at 11:12 PM, Akhil Das <ak...@sigmoidanalytics.com> wrote:

> Your code doesn't trigger any action. How about the following?
>
>     JavaStreamingContext ssc = new JavaStreamingContext(sparkConf,
>             new Duration(60 * 1 * 1000));
>
>     JavaPairReceiverInputDStream tweets = KafkaUtils.createStream(ssc,
>             "<machine>:2181", "1", map);
>
>     JavaDStream<String> statuses = tweets.map(
>             new Function<String, String>() {
>                 public String call(String status) {
>                     System.out.println(status);
>                     return status;
>                 }
>             }
>     );
>
>     statuses.print();
>
> Or you could use foreachRDD instead of map() if your intention is just
> printing.
>
> Thanks
> Best Regards
>
> On Wed, Nov 5, 2014 at 12:35 PM, Something Something
> <mailinglist...@gmail.com> wrote:
>
>> It's not local. My spark url is something like this:
>>
>>     String sparkUrl = "spark://<host name>:7077";
>>
>> On Tue, Nov 4, 2014 at 11:03 PM, Jain Rahul <ja...@ivycomptech.com> wrote:
>>
>>> I think you are running it locally.
>>> Do you have local[1] here for the master URL? If yes, change it to
>>> local[2] or a higher number of threads.
>>> It may also be due to a topic name mismatch.
>>>
>>>     sparkConf.setMaster("local[1]");
>>>
>>> Regards,
>>> Rahul
>>>
>>> From: Something Something <mailinglist...@gmail.com>
>>> Date: Wednesday, November 5, 2014 at 12:23 PM
>>> To: "Shao, Saisai" <saisai.s...@intel.com>
>>> Cc: "user@spark.apache.org" <user@spark.apache.org>
>>> Subject: Re: Kafka Consumer in Spark Streaming
>>>
>>> Added foreach as follows. Still don't see any output on my console.
>>> Would this go to the worker logs as Jerry indicated?
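(Editor's note, not part of the thread: the receiver-based `KafkaUtils.createStream` in Spark 1.x wraps Kafka's high-level consumer, which connects through ZooKeeper, so it reads `zookeeper.connect` and `group.id` rather than `metadata.broker.list` (that property belongs to the simple/direct consumer API). Spark 1.x does have a `createStream` overload that accepts a full Kafka parameter map. The sketch below builds such a map; the topic name `mytopic`, the group id, and the `auto.offset.reset` setting are illustrative placeholders, and the Spark call itself is shown only as a comment because it needs the spark-streaming-kafka jars.)

```java
import java.util.HashMap;
import java.util.Map;

// Hedged sketch: building a full Kafka parameter map for the
// receiver-based createStream overload in Spark 1.x.
public class KafkaParamsSketch {

    // The high-level consumer behind createStream connects via ZooKeeper;
    // metadata.broker.list would be ignored here.
    public static Map<String, String> kafkaParams(String zkQuorum, String groupId) {
        Map<String, String> params = new HashMap<String, String>();
        params.put("zookeeper.connect", zkQuorum);
        params.put("group.id", groupId);
        // Assumed tuning knob: re-read the topic from the beginning,
        // useful when checking whether any messages arrive at all.
        params.put("auto.offset.reset", "smallest");
        return params;
    }

    public static void main(String[] args) {
        Map<String, String> params = kafkaParams("<machine>:2181", "1");

        Map<String, Integer> topics = new HashMap<String, Integer>();
        topics.put("mytopic", 1); // topic name -> number of receiver threads

        // These maps would then feed the overload that accepts full
        // consumer properties (commented out: requires Spark on the classpath):
        //
        // JavaPairReceiverInputDStream<String, String> tweets =
        //     KafkaUtils.createStream(ssc, String.class, String.class,
        //         StringDecoder.class, StringDecoder.class,
        //         params, topics, StorageLevel.MEMORY_AND_DISK_SER_2());

        System.out.println(params.get("zookeeper.connect"));
    }
}
```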
>>>     JavaPairReceiverInputDStream tweets =
>>>             KafkaUtils.createStream(ssc, "<mymachine>:2181", "1", map);
>>>
>>>     JavaDStream<String> statuses = tweets.map(
>>>             new Function<String, String>() {
>>>                 public String call(String status) {
>>>                     return status;
>>>                 }
>>>             }
>>>     );
>>>
>>>     statuses.foreach(new Function<JavaRDD<String>, Void>() {
>>>         @Override
>>>         public Void call(JavaRDD<String> stringJavaRDD) throws Exception {
>>>             for (String str : stringJavaRDD.take(10)) {
>>>                 System.out.println("Message: " + str);
>>>             }
>>>             return null;
>>>         }
>>>     });
>>>
>>> On Tue, Nov 4, 2014 at 10:32 PM, Shao, Saisai <saisai.s...@intel.com> wrote:
>>>
>>>> If you're running in standalone mode, the log is under the
>>>> <SPARK_HOME>/work/ directory. I'm not sure about YARN or Mesos; you can
>>>> check the Spark documentation for the details.
>>>>
>>>> Thanks
>>>> Jerry
>>>>
>>>> *From:* Something Something [mailto:mailinglist...@gmail.com]
>>>> *Sent:* Wednesday, November 05, 2014 2:28 PM
>>>> *To:* Shao, Saisai
>>>> *Cc:* user@spark.apache.org
>>>> *Subject:* Re: Kafka Consumer in Spark Streaming
>>>>
>>>> The Kafka broker definitely has messages coming in. But your #2 point
>>>> is valid. Needless to say, I am a newbie to Spark. I can't figure out
>>>> where the 'executor' logs would be. How would I find them?
>>>>
>>>> All I see printed on my screen is this:
>>>>
>>>> 14/11/04 22:21:23 INFO Slf4jLogger: Slf4jLogger started
>>>> 14/11/04 22:21:23 INFO Remoting: Starting remoting
>>>> 14/11/04 22:21:24 INFO Remoting: Remoting started; listening on
>>>> addresses :[akka.tcp://spark@mymachine:60743]
>>>> 14/11/04 22:21:24 INFO Remoting: Remoting now listens on addresses:
>>>> [akka.tcp://spark@mymachine:60743]
>>>> 14/11/04 22:21:24 WARN NativeCodeLoader: Unable to load native-hadoop
>>>> library for your platform...
>>>> using builtin-java classes where applicable
>>>> 14/11/04 22:21:24 INFO JniBasedUnixGroupsMappingWithFallback: Falling
>>>> back to shell based
>>>>
>>>> -------------------------------------------
>>>> Time: 1415168520000 ms
>>>> -------------------------------------------
>>>> -------------------------------------------
>>>> Time: 1415168520000 ms
>>>> -------------------------------------------
>>>>
>>>> It keeps repeating this...
>>>>
>>>> On Tue, Nov 4, 2014 at 10:14 PM, Shao, Saisai <saisai.s...@intel.com> wrote:
>>>>
>>>> Hi, would you mind describing your problem a little more specifically?
>>>>
>>>> 1. Does the Kafka broker currently have no data feeding in?
>>>>
>>>> 2. This code will print the lines, but not on the driver side; the
>>>> code runs on the executor side, so you can check the logs in the worker
>>>> dir to see if there are any printed lines under that folder.
>>>>
>>>> 3. Did you see any exceptions when running the app? That will help
>>>> to pin down the problem.
>>>>
>>>> Thanks
>>>>
>>>> Jerry
>>>>
>>>> *From:* Something Something [mailto:mailinglist...@gmail.com]
>>>> *Sent:* Wednesday, November 05, 2014 1:57 PM
>>>> *To:* user@spark.apache.org
>>>> *Subject:* Kafka Consumer in Spark Streaming
>>>>
>>>> I have the following code in my program. I don't get any error, but it's
>>>> not consuming the messages either. Shouldn't the following code print the
>>>> line in the 'call' method? What am I missing?
>>>>
>>>> Please help. Thanks.
>>>>     JavaStreamingContext ssc = new JavaStreamingContext(sparkConf,
>>>>             new Duration(60 * 1 * 1000));
>>>>
>>>>     JavaPairReceiverInputDStream tweets =
>>>>             KafkaUtils.createStream(ssc, "<machine>:2181", "1", map);
>>>>
>>>>     JavaDStream<String> statuses = tweets.map(
>>>>             new Function<String, String>() {
>>>>                 public String call(String status) {
>>>>                     System.out.println(status);
>>>>                     return status;
>>>>                 }
>>>>             }
>>>>     );
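(Editor's note, not part of the thread: Akhil's diagnosis above, "your code doesn't trigger any action," is the core point. A DStream transformation such as `map()` is lazy; nothing executes until an output operation like `print()` or `foreachRDD()` is registered, and even then the `println` inside `map()` runs on the executors, not the driver console. The same lazy/eager split can be demonstrated without Spark using plain Java streams, where `map()` is an intermediate operation and `forEach()` is the terminal one; the class and field names below are illustrative.)

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Stream;

public class LazyMapDemo {
    // Records which elements the map() side effect actually touched,
    // standing in for the println inside the Spark map() function.
    static List<String> seen = new ArrayList<String>();

    // Like DStream.map(), Stream.map() is lazy: calling this alone
    // processes nothing and prints nothing.
    public static Stream<String> mapped(List<String> input) {
        return input.stream().map(s -> { seen.add(s); return s; });
    }

    public static void main(String[] args) {
        List<String> batch = Arrays.asList("msg-1", "msg-2");
        Stream<String> statuses = mapped(batch);
        System.out.println("after map: " + seen.size());      // prints 0

        // Only a terminal operation (the analogue of print()/foreachRDD)
        // forces the pipeline to run.
        statuses.forEach(s -> System.out.println("Message: " + s));
        System.out.println("after forEach: " + seen.size());  // prints 2
    }
}
```

This is why the original snippet compiles and runs without errors yet never shows a message: the pipeline is built but never triggered, and once it is, the output surfaces in the executor logs rather than the driver console.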