Re: Starting Spark SQL thrift server from within a streaming app

2015-08-06 Thread Todd Nist
Well, the point of creating a thrift server would be to allow external
access to the data over JDBC / ODBC connections.  The spark-streamingsql
project leverages a standard Spark SQL context and then provides a means of
converting an incoming DStream into rows; look at the MessageToRow trait
in the KafkaSource class.
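
For what it's worth, here is a rough sketch of what "external access" could
look like from the client side. It assumes a thrift server is already running
on localhost at port 10000 (the usual default), that the Hive JDBC driver is
on the client classpath, and that a temp table named t has been registered in
the server's context. The object name ThriftClient, the jdbcUrl helper, and the
empty user/password are just placeholders for illustration.

```scala
import java.sql.DriverManager

object ThriftClient {

  // Builds a HiveServer2-style JDBC URL; host, port and database are assumptions.
  def jdbcUrl(host: String, port: Int, db: String): String =
    s"jdbc:hive2://$host:$port/$db"

  def main(args: Array[String]): Unit = {
    // Needs the Hive JDBC driver jar on the classpath.
    Class.forName("org.apache.hive.jdbc.HiveDriver")
    val conn = DriverManager.getConnection(jdbcUrl("localhost", 10000, "default"), "", "")
    try {
      val stmt = conn.createStatement()
      // "t" is assumed to be a temp table registered in the server's SQL context.
      val rs = stmt.executeQuery("SELECT * FROM t")
      while (rs.next()) {
        println(s"${rs.getInt(1)}\t${rs.getString(2)}")
      }
    } finally {
      conn.close()
    }
  }
}
```

Any JDBC / ODBC capable tool should be able to connect the same way; the code
above is only meant to show the shape of it.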

The example org.apache.spark.sql.streaming.examples.KafkaDDL should make
it clear, I think.

-Todd

Re: Starting Spark SQL thrift server from within a streaming app

2015-08-06 Thread Daniel Haviv
Thank you Todd,
How is the spark-streamingsql project different from starting a thrift
server in a streaming app?

Thanks again.
Daniel


Starting Spark SQL thrift server from within a streaming app

2015-08-05 Thread Daniel Haviv
Hi,
Is it possible to start the Spark SQL thrift server from within a streaming
app so the streamed data can be queried as it comes in?

Thank you.
Daniel

Re: Starting Spark SQL thrift server from within a streaming app

2015-08-05 Thread Todd Nist
Hi Daniel,

It is possible to create an instance of the SparkSQL Thrift server; however,
it seems like this project is what you may be looking for:

https://github.com/Intel-bigdata/spark-streamingsql

Not 100% sure what your use case is, but you can always convert the data into
a DataFrame and then issue a query against it.  If you want other systems to
be able to query it, there are numerous connectors to store data into Hive,
Cassandra, HBase, ElasticSearch, and so on.

To create an instance of a thrift server with its own SQL context, you would
do something like the following:

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.hive.HiveContext
import org.apache.spark.sql.hive.thriftserver._

object MyThriftServer {

  def main(args: Array[String]): Unit = {
    val sparkConf = new SparkConf()
      // master is passed to spark-submit, but could also be specified explicitly
      // .setMaster(sparkMaster)
      .setAppName("My ThriftServer")
      .set("spark.cores.max", "2")
    val sc = new SparkContext(sparkConf)
    val sqlContext = new HiveContext(sc)
    import sqlContext.implicits._

    // Register a small cached DataFrame as temp table "t" so clients can query it
    sc.makeRDD((1, "hello") :: (2, "world") :: Nil).toDF.cache().registerTempTable("t")

    // Start the thrift server on top of the existing SQL context
    HiveThriftServer2.startWithContext(sqlContext)
  }
}
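
If you wanted to try the same thing from inside a streaming app, a minimal
sketch might look like the one below. To be clear about what is assumed here:
the Event case class, the parse helper, the "id,msg" line format, the socket
source on localhost:9999, the table name events, and the batch / remember
durations are all made up for illustration. It also assumes the
spark-streaming and spark-hive-thriftserver modules are on the classpath.
Each batch simply replaces the temp table, so only the most recent batch is
visible to clients.

```scala
import scala.util.Try

import org.apache.spark.SparkConf
import org.apache.spark.sql.hive.HiveContext
import org.apache.spark.sql.hive.thriftserver.HiveThriftServer2
import org.apache.spark.streaming.{Minutes, Seconds, StreamingContext}

// Hypothetical record type for this sketch.
case class Event(id: Int, msg: String)

object StreamingThriftServer {

  // Parses an "id,msg" line; anything that does not fit is dropped.
  def parse(line: String): Option[Event] = line.split(",", 2) match {
    case Array(id, msg) => Try(id.trim.toInt).toOption.map(Event(_, msg))
    case _              => None
  }

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("StreamingThriftServer")
    val ssc = new StreamingContext(conf, Seconds(10))
    // Keep batch RDDs around a little longer so running queries still
    // find their input (the value is arbitrary).
    ssc.remember(Minutes(5))

    val sqlContext = new HiveContext(ssc.sparkContext)
    import sqlContext.implicits._

    // Expose this context to JDBC / ODBC clients.
    HiveThriftServer2.startWithContext(sqlContext)

    val lines = ssc.socketTextStream("localhost", 9999)
    lines.flatMap(line => parse(line).toList).foreachRDD { rdd =>
      // Re-registering under the same name replaces the previous batch.
      rdd.toDF().registerTempTable("events")
    }

    ssc.start()
    ssc.awaitTermination()
  }
}
```

Keeping more than the latest batch queryable (for example by accumulating
batches or writing them to one of the stores mentioned above) is left out of
this sketch.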

Again, I'm not really clear what your use case is, but it does sound like
the first link above is what you may want.

-Todd
