Re: Streaming json records from kafka ... how can I process ... help please :)

2015-12-23 Thread Akhil
Akhil wrote
> You can do it like this:
> 
> lines.foreachRDD { jsonRDD =>
>   val data = sqlContext.read.json(jsonRDD)
>   data.registerTempTable("mytable")
>   // sql() is lazy; call an action such as show() so the query actually runs
>   sqlContext.sql("SELECT * FROM mytable").show()
> }

See
http://spark.apache.org/docs/latest/streaming-programming-guide.html#dataframe-and-sql-operations
and
http://spark.apache.org/docs/latest/sql-programming-guide.html#json-datasets
for more information.
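To show where that snippet fits, here is a sketch of a complete streaming job along the lines of those guides. This is only an illustration: the broker address, topic name, and batch interval are placeholders, and it assumes the Spark 1.x direct Kafka API used elsewhere in this thread.

```scala
import kafka.serializer.StringDecoder
import org.apache.spark.SparkConf
import org.apache.spark.sql.SQLContext
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka.KafkaUtils

object StreamingJsonExample {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("StreamingJsonExample")
    val ssc = new StreamingContext(conf, Seconds(10)) // placeholder batch interval

    // Placeholder broker list and topic name
    val kafkaParams = Map("metadata.broker.list" -> "localhost:9092")
    val topics = Set("json-topic")

    val messages = KafkaUtils.createDirectStream[String, String, StringDecoder, StringDecoder](
      ssc, kafkaParams, topics)

    // Each Kafka record is a (key, value) pair; the JSON payload is the value
    val lines = messages.map(_._2)

    lines.foreachRDD { jsonRDD =>
      // Reuse a singleton SQLContext across batches, as the streaming guide recommends
      val sqlContext = SQLContext.getOrCreate(jsonRDD.sparkContext)
      val data = sqlContext.read.json(jsonRDD)
      data.registerTempTable("mytable")
      sqlContext.sql("SELECT * FROM mytable").show()
    }

    ssc.start()
    ssc.awaitTermination()
  }
}
```

Note that `read.json` infers the schema per batch, so it is best suited to records with a consistent structure.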



--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/Streaming-json-records-from-kafka-how-can-I-process-help-please-tp25769p25782.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

-
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org



Re: Streaming json records from kafka ... how can I process ... help please :)

2015-12-23 Thread Akhil
You can do it like this:

lines.foreachRDD { jsonRDD =>
  val data = sqlContext.read.json(jsonRDD)
  data.registerTempTable("mytable")
  // sql() is lazy; call an action such as show() so the query actually runs
  sqlContext.sql("SELECT * FROM mytable").show()
}



--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/Streaming-json-records-from-kafka-how-can-I-process-help-please-tp25769p25781.html

-
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org



Re: Streaming json records from kafka ... how can I process ... help please :)

2015-12-23 Thread Gideon
What you wrote is inaccurate.
When you create a direct Kafka stream, what you actually get is a
DirectKafkaInputDStream, which extends DStream. Two of the methods a DStream
provides are map and print: when you call map on your
DirectKafkaInputDStream, what you get back is a MappedDStream. MappedDStream
also extends DStream, which means you can invoke print on it.
DStreams are the Spark Streaming abstraction that lets you operate on the RDDs
in the stream.

Regarding converting the JSON strings, I'm not quite sure what you mean,
but you can easily transform the data using the different methods on the
DStream objects you're getting (like your map example)

I hope that was clear
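The chaining described above can be sketched in a few lines, e.g. in the spark-shell. The socket source here is just a stand-in for the DirectKafkaInputDStream returned by KafkaUtils.createDirectStream, and the host/port are placeholders; the point is only that each map returns another DStream, so print stays available:

```scala
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.dstream.DStream

val ssc = new StreamingContext(sc, Seconds(10)) // sc: the shell's SparkContext

// Stand-in source; in this thread's case it would be the DStream from Kafka
val lines: DStream[String] = ssc.socketTextStream("localhost", 9999)

// map on a DStream returns a MappedDStream, which is itself a DStream...
val lengths: DStream[Int] = lines.map(_.length)

// ...so DStream operations such as print remain available after each map
lengths.print()

ssc.start()
ssc.awaitTermination()
```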



--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/Streaming-json-records-from-kafka-how-can-I-process-help-please-tp25769p25780.html

-
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org