RE: Tableau + Spark SQL Thrift Server + Cassandra

2015-04-06 Thread Mohammed Guller
Sure, will do. I may not be able to get to it until next week, but will let you
know if I am able to crack the code.

Mohammed

Re: Tableau + Spark SQL Thrift Server + Cassandra

2015-04-03 Thread Todd Nist
What version of Cassandra are you using?  Are you using DSE or the stock
Apache Cassandra version?  I have connected it with DSE, but have not
attempted it with the standard Apache Cassandra version.

FWIW,
http://www.datastax.com/dev/blog/datastax-odbc-cql-connector-apache-cassandra-datastax-enterprise
provides an ODBC driver for accessing C* from Tableau.  Granted, it does not
provide all the goodness of Spark.  Are you attempting to leverage the
spark-cassandra-connector for this?



On Thu, Apr 2, 2015 at 10:20 PM, Mohammed Guller moham...@glassbeam.com
wrote:

 Hi –

 Is anybody using Tableau to analyze data in Cassandra through the Spark
 SQL Thrift Server?

 Thanks!

 Mohammed


RE: Tableau + Spark SQL Thrift Server + Cassandra

2015-04-03 Thread pawan kumar
Thanks, Mohammed. Will give it a try today. We would also need the Spark SQL
piece, as we are migrating our data store from Oracle to C* and it would be
easier to maintain all the reports rather than recreating each one from scratch.

Thanks,
Pawan Venugopal.

Re: Tableau + Spark SQL Thrift Server + Cassandra

2015-04-03 Thread pawan kumar
Hi Todd,

Thanks for the link. I would be interested in this solution. I am using DSE
for Cassandra. Could you provide me with info on connecting to DSE, either
through Tableau or Zeppelin? The goal here is to query Cassandra through Spark
SQL so that I can perform joins and group-bys in my queries. Are you able to
perform Spark SQL queries with Tableau?

Thanks,
Pawan Venugopal

RE: Tableau + Spark SQL Thrift Server + Cassandra

2015-04-03 Thread Mohammed Guller
Hi Todd,

We are using Apache C* 2.1.3, not DSE. We got Tableau to work directly with C*
using the ODBC driver, but now we would like to add Spark SQL to the mix. I
haven’t been able to find any documentation on how to make this combination
work.

We are using the Spark-Cassandra-Connector in our applications, but haven’t
been able to figure out how to get the Spark SQL Thrift Server to use it and
connect to C*. That is the missing piece. Once we solve that piece of the
puzzle, Tableau should be able to see the tables in C*.

Hi Pawan,
Tableau + C* is pretty straightforward, especially if you are using DSE.
Create a new DSN in Tableau using the ODBC driver that comes with DSE. Once you
connect, Tableau allows you to use a C* keyspace as the schema and column
families as tables.

Mohammed


Re: Tableau + Spark SQL Thrift Server + Cassandra

2015-04-03 Thread Todd Nist
Hi Mohammed,

Not sure if you have tried this or not.  You could try using the API below
to start the Thrift server with an existing context.

https://github.com/apache/spark/blob/master/sql/hive-thriftserver/src/main/scala/org/apache/spark/sql/hive/thriftserver/HiveThriftServer2.scala#L42

The one thing that Michael Armbrust @ Databricks recommended was this:

 You can start a JDBC server with an existing context.  See my answer here:
 http://apache-spark-user-list.1001560.n3.nabble.com/Standard-SQL-tool-access-to-SchemaRDD-td20197.html

So, something like this, based on an example from Cheng Lian:

*Server*

import org.apache.spark.sql.hive.HiveContext
import org.apache.spark.sql.catalyst.types._

val sparkContext = sc
import sparkContext._
val sqlContext = new HiveContext(sparkContext)
import sqlContext._
makeRDD((1, "hello") :: (2, "world") :: Nil).toSchemaRDD.cache().registerTempTable("t")
// replace the above with C* + the spark-cassandra-connector to generate a
// SchemaRDD and registerTempTable

import org.apache.spark.sql.hive.thriftserver._
HiveThriftServer2.startWithContext(sqlContext)
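
For the C* piece noted in the comment above, here is a rough, untested sketch
of what that replacement might look like. The keyspace, table, and case-class
names are hypothetical, and it assumes the spark-cassandra-connector is on the
classpath with spark.cassandra.connection.host set in the Spark config:

import com.datastax.spark.connector._

// Hypothetical row shape; adjust to match your C* schema.
case class Entry(id: Int, value: String)
// Read the (hypothetical) table my_keyspace.my_table through the connector.
val cassandraRdd = sc.cassandraTable[Entry]("my_keyspace", "my_table")
// The createSchemaRDD implicit from `import sqlContext._` turns the RDD of
// case classes into a SchemaRDD that can be cached and registered.
cassandraRdd.toSchemaRDD.cache().registerTempTable("entries")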

Then Startup

./bin/beeline -u jdbc:hive2://localhost:10000/default

0: jdbc:hive2://localhost:10000/default> select * from t;


I have not tried this yet from Tableau. My understanding is that the
tempTable is only valid as long as the sqlContext is, so if one terminates
the code representing the *Server* and then restarts the standard thrift
server, sbin/start-thriftserver.sh ..., the table won't be available.
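
One possible way around that, as an untested sketch (it assumes the
HiveContext is backed by a real, configured Hive metastore), is to save the
data as a permanent Hive table instead of a temp table:

// Persist through the Hive metastore rather than the context-local catalog,
// so a separately started thrift server on the same metastore can see it.
makeRDD((1, "hello") :: (2, "world") :: Nil).toSchemaRDD.saveAsTable("t_saved")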

Another possibility is to perhaps use the tuplejump cash project,
https://github.com/tuplejump/cash.

HTH.

-Todd


Re: Tableau + Spark SQL Thrift Server + Cassandra

2015-04-03 Thread Todd Nist
@Pawan

Not sure if you have seen this or not, but here is a good example by
Jonathan Lacefield of DataStax on hooking up Spark SQL with DSE; adding
Tableau is as simple as Mohammed stated with DSE:
https://github.com/jlacefie/sparksqltest.

HTH,
Todd


Re: Tableau + Spark SQL Thrift Server + Cassandra

2015-04-03 Thread Todd Nist
@Pawan,

So it's been a couple of months since I have had a chance to do anything
with Zeppelin, but here is a link to a post on what I did to get it working:
https://groups.google.com/forum/#!topic/zeppelin-developers/mCNdyOXNikI.
This may or may not work with the newer Zeppelin releases.

-Todd

Re: Tableau + Spark SQL Thrift Server + Cassandra

2015-04-03 Thread pawan kumar
Hi Todd,

Thanks for the help. So I was able to get DSE working with Tableau, as per
the link provided by Mohammed. Now I am trying to figure out whether I can
write Spark SQL queries from Tableau and get data from DSE. My end goal is
to get a web-based tool where I can write SQL queries that will pull data
from Cassandra.

With Zeppelin, I was able to build and run it on EC2, but I am not sure the
configuration is right. I am pointing to a Spark master that is a remote
DSE node, and all the Spark and Spark SQL dependencies are on the remote
node. I am not sure whether I need to install Spark and its dependencies on
the web UI (Zeppelin) node.

I am not sure whether talking about Zeppelin in this thread is appropriate.

Thanks once again for all the help.

Thanks,
Pawan Venugopal


RE: Tableau + Spark SQL Thrift Server + Cassandra

2015-04-03 Thread Mohammed Guller
Thanks, Todd.

It is an interesting idea; worth trying.

I think the cash project is old. The tuplejump guy has created another project 
called CalliopeServer2, which works like a charm with BI tools that use JDBC, 
but unfortunately Tableau throws an error when it connects to it.

Mohammed


Re: Tableau + Spark SQL Thrift Server + Cassandra

2015-04-03 Thread Todd Nist
Thanks, Mohammed.

I was aware of Calliope, but haven't used it since the
spark-cassandra-connector project got released. I was not aware of
CalliopeServer2; cool, thanks for sharing that one.

I would appreciate it if you could let me know how you decide to proceed with
this; I can see this coming up on my radar in the next few months. Thanks.

-Todd
