Re: Storage Handlers in Spark SQL

2014-08-26 Thread chutium
It seems he wants to query an RDBMS or Cassandra using Spark SQL, i.e.
multiple data sources for Spark SQL.

I looked through the link he posted:
https://docs.wso2.com/display/BAM241/Creating+Hive+Queries+to+Analyze+Data#CreatingHiveQueriestoAnalyzeData-CreatingHivetablesforvariousdatasources

Using their storage handlers, users can create a Hive external table from a
Cassandra (C*) table or an RDBMS table (via JDBC).

So Niranda, maybe you can take a look at this API:
https://issues.apache.org/jira/browse/SPARK-2179
There is some documentation in this pull request:
https://github.com/apache/spark/pull/1774

There is also an implementation in Spark SQL similar to your JDBC storage
handler. It could also serve as a sample use of the Public API for DataTypes
and Schema:
https://github.com/apache/spark/pull/1612
(https://issues.apache.org/jira/browse/SPARK-2710)

Also, in some other user list threads I saw that some kind of C* mapper is
being developed by DataStax?





Re: Storage Handlers in Spark SQL

2014-08-25 Thread Michael Armbrust
- dev list
+ user list

You should be able to query Spark SQL using JDBC, starting with the 1.1
release.  There is some documentation in the repo:
https://github.com/apache/spark/blob/master/docs/sql-programming-guide.md#running-the-thrift-jdbc-server
We'll update the official docs once the release is out.
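
As a rough illustration of what a JDBC client for the Thrift JDBC server could
look like, here is a minimal sketch. It assumes the server is already running
on localhost with the default port 10000, that the Hive JDBC driver is on the
classpath, and that the server runs in non-secure mode (local user name, blank
password). The class name ThriftJdbcSketch and its helper methods are
hypothetical names chosen for this example, not part of Spark.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

public class ThriftJdbcSketch {

    // Build a HiveServer2-style JDBC URL; the Thrift JDBC server speaks this protocol.
    static String jdbcUrl(String host, int port, String database) {
        return "jdbc:hive2://" + host + ":" + port + "/" + database;
    }

    // Run one query and print the first column of every returned row.
    static void printFirstColumn(String url, String sql) throws SQLException {
        // Assumption: non-secure mode, so the local user name and a blank password are used.
        String user = System.getProperty("user.name");
        try (Connection conn = DriverManager.getConnection(url, user, "");
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery(sql)) {
            while (rs.next()) {
                System.out.println(rs.getString(1));
            }
        }
    }

    public static void main(String[] args) throws SQLException {
        // Host, port and database are assumptions; adjust them to your setup.
        String url = jdbcUrl("localhost", 10000, "default");
        if (args.length == 0) {
            // Without a query argument, only show which URL would be used.
            System.out.println("Would connect to " + url);
            return;
        }
        // Otherwise treat the first argument as the SQL statement to run.
        printFirstColumn(url, args[0]);
    }
}
```

Passing a statement such as "SHOW TABLES" as the first argument runs it against
the server. Without arguments the program only prints the URL it would use, so
it can be tried before a server is available.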


On Thu, Aug 21, 2014 at 4:43 AM, Niranda Perera nira...@wso2.com wrote:

 Hi,

 I have been playing around with Spark for the past few days, evaluating
 the possibility of migrating from Hive/Hadoop to Spark (Spark SQL).

 I am working on the WSO2 Business Activity Monitor (WSO2 BAM,
 https://docs.wso2.com/display/BAM241/WSO2+Business+Activity+Monitor+Documentation
 ), which currently uses Hive. We are considering Spark as a
 successor to Hive, given its performance improvements.

 We currently use several custom storage handlers in Hive, for example
 the WSO2 JDBC and Cassandra storage handlers:
 https://docs.wso2.com/display/BAM241/JDBC+Storage+Handler+for+Hive

 https://docs.wso2.com/display/BAM241/Creating+Hive+Queries+to+Analyze+Data#CreatingHiveQueriestoAnalyzeData-cas

 I would like to know whether Spark SQL can work with these storage
 handlers (perhaps while using HiveContext)?

 Best regards
 --
 *Niranda Perera*
 Software Engineer, WSO2 Inc.
 Mobile: +94-71-554-8430
 Twitter: @n1r44 https://twitter.com/N1R44