[jira] [Updated] (SPARK-1054) Get Cassandra support in Spark Core/Spark Cassandra Module

2014-07-02 Thread Rohit Rai (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-1054?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rohit Rai updated SPARK-1054:
-

Summary: Get Cassandra support in Spark Core/Spark Cassandra Module  (was: 
Contribute Calliope Core to Spark as spark-cassandra)

 Get Cassandra support in Spark Core/Spark Cassandra Module
 --

 Key: SPARK-1054
 URL: https://issues.apache.org/jira/browse/SPARK-1054
 Project: Spark
  Issue Type: New Feature
  Components: Spark Core
Reporter: Rohit Rai
  Labels: calliope, cassandra

 Calliope is a library that provides an interface for consuming data from 
 Cassandra in Spark and storing RDDs from Spark to Cassandra. 
 Built as a wrapper over Cassandra's Hadoop I/O, it provides a simplified and 
 very generic API for reading data from and writing data to Cassandra. It 
 lets you consume data from legacy as well as CQL3 Cassandra storage. It 
 can also use C* to speed up your job by fetching only the relevant 
 data, using CQL3 and C*'s secondary indexes. Though it currently 
 uses only the Hadoop I/O formats for Cassandra, in the near future we see the 
 same API harnessing other means of consuming Cassandra data, such as using 
 the StorageProxy or even reading from SSTables directly.
 Beyond the basic data fetch functionality, the Calliope API uses Scala and 
 its implicit parameters and conversions to let you work at a higher 
 abstraction, dealing with tuples/objects instead of Cassandra's rows/columns 
 in your MapReduce jobs.
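 As a rough illustration of how implicit parameters can hide row-to-object 
 mapping, here is a minimal, dependency-free sketch. It is not Calliope's 
 actual API: RawRow, RowReader, Employee, and readAll are hypothetical names 
 made up for this example, and a plain Map stands in for a Cassandra row.

```scala
object ImplicitRowMapping {
  // Hypothetical stand-in for a raw Cassandra row: column name -> raw value.
  type RawRow = Map[String, String]

  // Type class describing how to turn a raw row into a domain object.
  trait RowReader[T] {
    def read(row: RawRow): T
  }

  final case class Employee(name: String, age: Int)

  object Employee {
    // Defined in the companion object so the compiler finds it through
    // implicit scope without any import at the call site.
    implicit val reader: RowReader[Employee] = new RowReader[Employee] {
      def read(row: RawRow): Employee =
        Employee(row("name"), row("age").toInt)
    }
  }

  // The caller names only the target type; the matching reader is
  // supplied by the compiler as an implicit parameter.
  def readAll[T](rows: Seq[RawRow])(implicit reader: RowReader[T]): Seq[T] =
    rows.map(reader.read)

  def main(args: Array[String]): Unit = {
    val rows: Seq[RawRow] = Seq(
      Map("name" -> "alice", "age" -> "31"),
      Map("name" -> "bob", "age" -> "45")
    )
    // Job code works with Employee objects rather than raw columns.
    val employees: Seq[Employee] = readAll[Employee](rows)
    employees.foreach(println)
  }
}
```

 In a real job the same pattern would sit behind whatever call produces the 
 RDD, so user code would only name the target type and never touch raw 
 columns.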
 Over the past few months we have seen the combination of Spark+Cassandra 
 gaining a lot of traction, and we feel Calliope provides the path of least 
 friction for developers to start working with this combination.
 We have been using this in production for over a year now, and the Calliope 
 early access repository has 30+ users. I am opening this issue to start a 
 discussion on whether we want Calliope to be part of Spark and, if so, what 
 would be involved in doing so.
 You can read more about Calliope here:
 http://tuplejump.github.io/calliope



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (SPARK-1054) Get Cassandra support in Spark Core/Spark Cassandra Module

2014-07-02 Thread Rohit Rai (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-1054?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14050544#comment-14050544
 ] 

Rohit Rai commented on SPARK-1054:
--

With https://github.com/datastax/cassandra-driver-spark now available from 
DataStax, we should work toward a unified, standard API in Spark that combines 
the best of both.
