Hi Michael,

Thanks for the catch. I assume you meant *spark-streaming-kafka-0-10_2.11-2.1.0.jar*.

I added it on all Spark machines under SPARK_HOME/jars, but the same error still persists. Is that the right jar, or is there anything else I need to add?

Thanks!

On Tue, May 16, 2017 at 1:40 PM, Michael Armbrust <mich...@databricks.com> wrote:

> Looks like you are missing the kafka dependency.
>
> On Tue, May 16, 2017 at 1:04 PM, kant kodali <kanth...@gmail.com> wrote:
>
>> Looks like I am getting the following runtime exception. I am using Spark
>> 2.1.0 and the following jars:
>>
>> *spark-sql_2.11-2.1.0.jar*
>> *spark-sql-kafka-0-10_2.11-2.1.0.jar*
>> *spark-streaming_2.11-2.1.0.jar*
>>
>> Exception in thread "stream execution thread for [id =
>> fcfe1fa6-dab3-4769-9e15-e074af622cc1, runId =
>> 7c54940a-e453-41de-b256-049b539b59b1]"
>> java.lang.NoClassDefFoundError:
>> org/apache/kafka/common/serialization/ByteArrayDeserializer
>>   at org.apache.spark.sql.kafka010.KafkaSourceProvider.createSource(KafkaSourceProvider.scala:74)
>>   at org.apache.spark.sql.execution.datasources.DataSource.createSource(DataSource.scala:245)
>>   at org.apache.spark.sql.execution.streaming.StreamExecution$$anonfun$logicalPlan$1.applyOrElse(StreamExecution.scala:127)
>>   at org.apache.spark.sql.execution.streaming.StreamExecution$$anonfun$logicalPlan$1.applyOrElse(StreamExecution.scala:123)
>>   at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$3.apply(TreeNode.scala:288)
>>   at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$3.apply(TreeNode.scala:288)
>>   at org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(TreeNode.scala:70)
>>
>> On Tue, May 16, 2017 at 10:30 AM, Shixiong(Ryan) Zhu <shixi...@databricks.com> wrote:
>>
>>> The default "startingOffsets" is "latest". If you don't push any data
>>> after starting the query, it won't fetch anything. You can set it to
>>> "earliest" like ".option("startingOffsets", "earliest")" to start the
>>> stream from the beginning.
>>>
>>> On Tue, May 16, 2017 at 12:36 AM, kant kodali <kanth...@gmail.com> wrote:
>>>
>>>> Hi All,
>>>>
>>>> I have the following code:
>>>>
>>>> val ds = sparkSession.readStream()
>>>>   .format("kafka")
>>>>   .option("kafka.bootstrap.servers", bootstrapServers)
>>>>   .option("subscribe", topicName)
>>>>   .option("checkpointLocation", hdfsCheckPointDir)
>>>>   .load()
>>>>
>>>> val ds1 = ds.select($"value")
>>>> val query = ds1.writeStream.outputMode("append").format("console").start()
>>>> query.awaitTermination()
>>>>
>>>> There are no errors when I execute this code; however, I don't see any
>>>> data being printed to the console. When I run my standalone test Kafka
>>>> consumer jar, I can see that it is receiving messages, so I am not sure
>>>> what is going on with the above code. Any ideas?
>>>>
>>>> Thanks!
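[Editor's note] The `org.apache.kafka.common.serialization.ByteArrayDeserializer` class in the stack trace above comes from the Kafka client library (`kafka-clients`), which `spark-sql-kafka-0-10` depends on but which is not bundled with Spark itself, so copying only the `spark-*` jars into SPARK_HOME/jars leaves it unresolved. One way around hand-copying individual jars is to let `spark-submit` resolve the package and its transitive dependencies from Maven Central. A minimal sketch; the application jar name and the master URL are placeholders, not from the thread:

```shell
# Resolve spark-sql-kafka-0-10 and its transitive dependencies
# (including kafka-clients) from Maven Central at submit time.
# "my-streaming-app.jar" and the master URL are placeholders.
spark-submit \
  --master spark://master-host:7077 \
  --packages org.apache.spark:spark-sql-kafka-0-10_2.11:2.1.0 \
  my-streaming-app.jar
```

Alternatively, placing the matching `kafka-clients` jar (e.g. kafka-clients-0.10.0.1.jar for the Spark 2.1.0 Kafka connector) under SPARK_HOME/jars on every node, alongside the jars already listed, should also resolve the `NoClassDefFoundError`.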