Hi Antonio,
First, what version of the Spark Cassandra Connector are you using? You are
using Spark 1.3.1, which the Cassandra connector currently supports only in
builds from the master branch - the release with public artifacts supporting
Spark 1.3.1 is coming soon ;)
Please see
at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
15/06/01 16:43:30 WARN TaskSetManager: Lost task 1.0 in stage 61.0 (TID 82,
localhost): org.apache.spark.TaskKilledException
at
org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:194)
A G
2015-06-01 13:26 GMT+02:00 Helena
Consider using Cassandra with Spark Streaming and time series; Cassandra has
been doing time series for years.
Here are some snippets with Kafka streaming and writing/reading the data back:
Streaming _from_ Cassandra (CassandraInputDStream) is coming, BTW:
https://issues.apache.org/jira/browse/SPARK-6283
I am working on it now.
Helena
@helenaedelson
On Mar 23, 2015, at 5:22 AM, Khanderao Kand Gmail khanderao.k...@gmail.com
wrote:
Rizal anriza...@gmail.com wrote:
Helena,
The CassandraInputDStream sounds interesting. I don't find many details in the
JIRA though. Do you have more details on what it tries to achieve?
Thanks,
Anwar.
On Tue, Mar 24, 2015 at 2:39 PM, Helena Edelson helena.edel...@datastax.com
Hi Cui,
What version of Spark are you using? There was a bug ticket that may be related
to this, fixed in core/src/main/scala/org/apache/spark/rdd/RDD.scala and
merged into versions 1.3.0 and 1.2.1. If you are using 1.1.1, that may be the
reason, but it's a stretch
[MonthlyCommits]}
.saveToCassandra("githubstats", "monthly_commits")
HELENA EDELSON
Senior Software Engineer, DSE Analytics
On Mar 5, 2015, at 9:33 AM, Ted Yu yuzhih...@gmail.com wrote:
Cui:
You can check messages.partitions.size to determine whether messages is an
empty RDD.
Cheers
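A minimal sketch of that check inside a streaming job (keyspace and table names are hypothetical; assumes `messages` is a DStream and the Spark Cassandra Connector implicits are on the classpath):

```scala
import com.datastax.spark.connector._

// Guard against empty micro-batches before writing to Cassandra.
// Note: partitions.size can be non-zero even when the RDD holds no
// records, so rdd.take(1) is the stronger check on older Spark
// versions that lack RDD.isEmpty.
messages.foreachRDD { rdd =>
  if (rdd.partitions.size > 0 && rdd.take(1).nonEmpty) {
    rdd.saveToCassandra("my_keyspace", "my_table") // hypothetical names
  }
}
```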
I am curious why you use the 1.0.4 Java artifact with the latest 1.1.0? This
might be your compilation problem - the older Java module version.
<dependency>
  <groupId>com.datastax.spark</groupId>
  <artifactId>spark-cassandra-connector_2.10</artifactId>
  <version>1.1.0</version>
</dependency>
<dependency>
One solution can be found here:
https://spark.apache.org/docs/1.1.0/sql-programming-guide.html#json-datasets
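The linked guide's approach looks roughly like this on Spark 1.1 (a sketch; the HDFS path and table name are hypothetical, and `sc` is assumed to be an existing SparkContext):

```scala
import org.apache.spark.sql.SQLContext

val sqlContext = new SQLContext(sc)

// jsonFile infers the schema from the JSON records stored in HDFS.
val events = sqlContext.jsonFile("hdfs:///data/events.json")
events.printSchema()

// Once registered as a temporary table, it can be queried with SQL.
events.registerTempTable("events")
val counts = sqlContext.sql("SELECT COUNT(*) FROM events")
```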
- Helena
@helenaedelson
On Dec 13, 2014, at 11:18 AM, Madabhattula Rajesh Kumar mrajaf...@gmail.com
wrote:
Hi Team,
I have a large JSON file in Hadoop. Could you please let me know
You can just do something like this, the Spark Cassandra Connector handles the
rest
KafkaUtils.createStream[String, String, StringDecoder, StringDecoder](
  ssc, kafkaParams, Map(KafkaTopicRaw -> 10), StorageLevel.DISK_ONLY_2)
  .map { case (_, line) => line.split(",") }
.
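The chain above is truncated; a self-contained sketch of the whole pattern (the topic, keyspace, table, and case class are hypothetical, and `ssc`/`kafkaParams` are assumed to be in scope):

```scala
import kafka.serializer.StringDecoder
import org.apache.spark.storage.StorageLevel
import org.apache.spark.streaming.kafka.KafkaUtils
import com.datastax.spark.connector.streaming._

// Hypothetical row type matching the CSV fields on each Kafka message.
case class RawEvent(id: String, payload: String)

KafkaUtils.createStream[String, String, StringDecoder, StringDecoder](
    ssc, kafkaParams, Map("raw_events" -> 10), StorageLevel.DISK_ONLY_2)
  .map { case (_, line) =>
    val fields = line.split(",")
    RawEvent(fields(0), fields(1))
  }
  .saveToCassandra("my_keyspace", "raw_events")
```

The connector's streaming implicits add saveToCassandra directly to the DStream, so no explicit foreachRDD is needed for a plain write.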
Thanks and Regards,
Md. Aiman Sarosh.
Accenture Services Pvt. Ltd.
Mob #: (+91) - 9836112841.
From: Helena Edelson helena.edel...@datastax.com
Sent: Friday, December 5, 2014 6:26 PM
To: Sarosh, M.
Cc: user@spark.apache.org
Subject: Re: Spark-Streaming: output to cassandra
You
I encounter no issues with streaming from Kafka to Spark in 1.1.0. Do you
perhaps have a version conflict?
Helena
On Nov 13, 2014 12:55 AM, Jay Vyas jayunit100.apa...@gmail.com wrote:
Yup, very important that n > 1 for Spark Streaming jobs. If local, use
local[2].
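A minimal sketch of why n > 1 matters (app name and batch interval are arbitrary):

```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

// A receiver occupies one core; with local[1] it starves the processing
// tasks and the job produces no output. local[2] leaves a core free.
val conf = new SparkConf().setMaster("local[2]").setAppName("streaming-demo")
val ssc = new StreamingContext(conf, Seconds(5))
```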
The thing to remember is
Hi,
It looks like you are building from master
(spark-cassandra-connector-assembly-1.2.0).
- Append this to your com.google.guava declaration: % provided
- Be sure your version of the connector dependency is the same as the assembly
build. For instance, if you are using 1.1.0-beta1, build your
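In build.sbt terms, that advice looks roughly like this (the guava version is a placeholder; the point is matching the connector version to your assembly build):

```scala
libraryDependencies ++= Seq(
  // Same version as the assembly you built, e.g. 1.1.0-beta1.
  "com.datastax.spark" %% "spark-cassandra-connector" % "1.1.0-beta1",
  // Marked provided so it is not bundled into the fat jar.
  "com.google.guava" % "guava" % "16.0.1" % "provided"
)
```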
Hi Harold,
Can you include the versions of spark and spark-cassandra-connector you are
using?
Thanks!
Helena
@helenaedelson
On Oct 30, 2014, at 12:58 PM, Harold Nguyen har...@nexgate.com wrote:
Hi all,
I'd like to be able to modify values in a DStream, and then send it off to an
Hi Shahab,
I’m just curious, are you explicitly needing to use thrift? Just using the
connector with spark does not require any thrift dependencies.
Simply: "com.datastax.spark" %% "spark-cassandra-connector" % "1.1.0-beta1"
But to your question, you declare the keyspace but also unnecessarily
,
Harold
On Fri, Oct 31, 2014 at 10:31 AM, Helena Edelson
helena.edel...@datastax.com wrote:
Hi Harold,
Can you include the versions of spark and spark-cassandra-connector you are
using?
Thanks!
Helena
@helenaedelson
On Oct 30, 2014, at 12:58 PM, Harold Nguyen har...@nexgate.com
Hi Harold,
This is a great use case, and here is how you could do it, for example, with
Spark Streaming:
Using a Kafka stream:
https://github.com/killrweather/killrweather/blob/master/killrweather-app/src/main/scala/com/datastax/killrweather/KafkaStreamingActor.scala#L50
Save raw data to
,
"org.slf4j" % "slf4j-api" % "1.7.7",
"org.slf4j" % "slf4j-simple" % "1.7.7",
"org.clapper" %% "grizzled-slf4j" % "1.0.2",
"log4j" % "log4j" % "1.2.17"
On Fri, Oct 31, 2014 at 6:42 PM, Helena Edelson helena.edel...@datastax.com
wrote:
Hi Shahab,
I’m just curious, are you explicitly needing
-cassandra-connector/src/main/scala/com/datastax/spark/connector/rdd/CassandraRDD.scala#L26-L37
Cheers,
Helena
@helenaedelson
On Oct 30, 2014, at 1:12 PM, Helena Edelson helena.edel...@datastax.com wrote:
Hi Shahab,
-How many spark/cassandra nodes are in your cluster?
-What is your deploy
Nice!
- Helena
@helenaedelson
On Oct 29, 2014, at 12:01 PM, Mike Sukmanowsky mike.sukmanow...@gmail.com
wrote:
Hey all,
Just thought I'd share this with the list in case any one else would benefit.
Currently working on a proper integration of PySpark and DataStax's new
Hi Harold,
It seems like, based on your previous post, you are using one version of the
connector as a dependency yet building the assembly jar from master? You were
using 1.1.0-alpha3 (you can upgrade to alpha4, beta coming this week) yet your
assembly is
absolutely new to Spark
and Scala and sbt). I'll write a blog post on how to get this working later,
in case it can help someone.
I really appreciate the help!
Harold
On Tue, Oct 28, 2014 at 11:55 AM, Helena Edelson
helena.edel...@datastax.com wrote:
Hi Harold,
It seems like, based
Hi Sasi,
Thrift is not needed to integrate Cassandra with Spark. In fact the only dep
you need is spark-cassandra-connector_2.10-1.1.0-alpha3.jar, and you can
upgrade to alpha4; we’re publishing beta very soon. For future reference,
questions/tickets can be created
Hi,
It is very easy to integrate Cassandra in a use case such as this. For
instance, do your joins in Spark and your data storage in Cassandra, which
allows a very flexible schema, unlike a relational DB, and is much faster,
fault tolerant, and with Spark and colocation WRT data
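A sketch of that pattern with the connector (keyspace, tables, and columns are hypothetical; assumes an existing SparkContext `sc` with the connector on the classpath):

```scala
import com.datastax.spark.connector._

val users = sc.cassandraTable("shop", "users")
  .map(row => (row.getInt("user_id"), row.getString("name")))
val orders = sc.cassandraTable("shop", "orders")
  .map(row => (row.getInt("user_id"), row.getDouble("total")))

// The join runs in Spark; the storage stays in Cassandra.
users.join(orders)
  .map { case (userId, (name, total)) => (userId, name, total) }
  .saveToCassandra("shop", "user_order_totals",
    SomeColumns("user_id", "name", "total"))
```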