We do - using Spark streaming, Kafka, HDFS all collocated on the same nodes.
Works great so far.
Spark picks up the location information and reads data from the partitions
hosted by the local broker, showing up as NODE_LOCAL in the UI.
You also need to look at the locality options
The direct stream already uses the kafka leader for a given partition as
the preferred location.
I don't run kafka on the same nodes as spark, and I don't know anyone who
does, so that situation isn't particularly well tested.
On Mon, Sep 21, 2015 at 1:15 PM, Ashish Soni
You can't set it to less than 1
Just set it to max int if that's really what you want to do
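A minimal sketch of the "set it to max int" advice, assuming the setting in question is Spark's task retry limit (`spark.task.maxFailures` here is an assumption; substitute whichever retry knob you actually mean):

```scala
// Hypothetical sketch: raise the task retry limit so transient rebalances
// don't kill the app. spark.task.maxFailures is an assumption here.
import org.apache.spark.SparkConf

val conf = new SparkConf()
  .setAppName("resilient-kafka-stream")
  .set("spark.task.maxFailures", Int.MaxValue.toString) // effectively unbounded retries
```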
On Mon, Aug 31, 2015 at 6:00 AM, Shushant Arora
wrote:
Say my cluster intermittently takes a long time to rebalance for some reason.
To handle that, can I have infinite retries instead of killing the app? What
should the value of retries be: will -1 work, or something else?
On Thu, Aug 27, 2015 at 6:46 PM, Cody Koeninger
Dears,
I need to commit the DB transaction once per partition, not per row.
The code below didn't work for me:
rdd.mapPartitions(partitionOfRecords => {
  DBConnectionInit()
  val results = partitionOfRecords.map(..)
  DBConnection.commit()
})
Best regards,
Ahmed Atef Nawwar
Data Management
Map is lazy. You need an actual action, or nothing will happen. Use
foreachPartition, or do an empty foreach after the map.
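The foreachPartition version of the snippet above can be sketched like this (`DBConnectionInit`, `process`, `commit` are placeholders for the poster's own DB layer):

```scala
// foreachPartition is an action, so this actually runs; mapPartitions alone
// is a lazy transformation and does nothing without an action.
rdd.foreachPartition { partitionOfRecords =>
  val conn = DBConnectionInit()                       // one connection per partition
  partitionOfRecords.foreach(record => process(conn, record))
  conn.commit()                                       // one commit per partition
  conn.close()
}
```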
On Thu, Aug 27, 2015 at 8:53 AM, Ahmed Nawar ahmed.na...@gmail.com wrote:
What's the default buffer in spark streaming 1.3 for kafka messages?
Say in this run it has to fetch messages from offset 1 to 1. Will it fetch
all in one go, or does it internally fetch messages in smaller batches?
Is there any setting to configure the number of offsets fetched in one
batch?
see http://kafka.apache.org/documentation.html#consumerconfigs
fetch.message.max.bytes
in the kafka params passed to the constructor
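The fetch size setting goes into the kafkaParams map handed to the direct stream constructor; a sketch (Spark 1.3 Scala API; broker list and topic are placeholders):

```scala
import org.apache.spark.streaming.kafka.KafkaUtils
import kafka.serializer.StringDecoder

val kafkaParams = Map(
  "metadata.broker.list" -> "broker1:9092,broker2:9092",
  "fetch.message.max.bytes" -> (8 * 1024 * 1024).toString // 8 MB per fetch request
)
val stream = KafkaUtils.createDirectStream[String, String, StringDecoder, StringDecoder](
  ssc, kafkaParams, Set("mytopic"))
```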
On Wed, Aug 26, 2015 at 10:39 AM, Shushant Arora shushantaror...@gmail.com
wrote:
Hi
My streaming application gets killed with below error
15/08/26 21:55:20 ERROR kafka.DirectKafkaInputDStream:
ArrayBuffer(kafka.common.NotLeaderForPartitionException,
kafka.common.NotLeaderForPartitionException,
kafka.common.NotLeaderForPartitionException,
The exception comes when the client also has many connections to some other
external server.
So I think the exception is coming from a client-side issue only; on the
server side there is no issue.
I want to understand: is the executor (simple consumer) not making a new
connection to the kafka broker at the start of each
When trying the consumer without external connections, or with a low number
of external connections, it works fine.
So the doubt is how the socket got closed:
java.io.EOFException: Received -1 when reading from channel, socket
has likely been closed.
On Sat, Aug 22, 2015 at 7:24 PM, Akhil Das
Can you try some other consumer and see if the issue still exists?
On Aug 22, 2015 12:47 AM, Shushant Arora shushantaror...@gmail.com
wrote:
I think you can also give this consumer a try:
http://spark-packages.org/package/dibbhatt/kafka-spark-consumer in your
environment. This has been running fine for topics with a large number of
Kafka partitions (200) like yours without any issue; no issue with
connections either, as this consumer re-uses them.
To be perfectly clear, the direct kafka stream will also recover from any
failures, because it does the simplest thing possible - fail the task and
let spark retry it.
If you're consistently having socket closed problems on one task after
another, there's probably something else going on in your
It comes at the start of each task when there is new data inserted in kafka
(the data inserted is very little).
The kafka topic has 300 partitions; the data inserted is ~10 MB.
Tasks fail and are retried, the retries succeed, and after a certain number
of failed tasks Spark kills the job.
On Sat, Aug 22, 2015 at 2:08 AM,
Sounds like that's happening consistently, not an occasional network
problem?
Look at the Kafka broker logs
Make sure you've configured the correct kafka broker hosts / ports (note
that direct stream does not use zookeeper host / port).
Make sure that host / port is reachable from your driver
That looks like you are choking your kafka machine. Do a top on the kafka
machines and see the workload, it may happen that you are spending too much
time on disk io etc.
On Aug 21, 2015 7:32 AM, Cody Koeninger c...@koeninger.org wrote:
Hi
Getting the below error in spark streaming 1.3 while consuming from kafka
using the direct kafka stream. A few tasks fail in each run.
What is the reason for / solution to this error?
15/08/21 08:54:54 ERROR executor.Executor: Exception in task 262.0 in
stage 130.0 (TID 16332)
There's a long recent thread in this list about stopping apps, subject was
stopping spark stream app
At 1 second I wouldn't run repeated rdds, no.
I'd take a look at subclassing, personally (you'll have to rebuild the
streaming kafka project since a lot is private), but if topic changes don't
Hi Cody,
by start/stopping, do you mean the streaming context or the app entirely?
From what I understand once a streaming context has been stopped it cannot
be restarted, but I also haven't found a way to stop the app
programmatically.
The batch duration will probably be around 1-10 seconds. I
Hi all,
I want to write a Spark Streaming program that listens to Kafka for a list
of topics.
The list of topics that I want to consume is stored in a DB and might
change dynamically. I plan to periodically refresh this list of topics in
the Spark Streaming app.
My question is: is it possible to
The current kafka stream implementation assumes the set of topics doesn't
change during operation.
You could either take a crack at writing a subclass that does what you
need; stop/start; or if your batch duration isn't too small, you could run
it as a series of RDDs (using the existing
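The "series of RDDs" option can be sketched with `KafkaUtils.createRDD` (a sketch, assuming Spark 1.3; the topic, offsets, and `kafkaParams` are placeholders you would compute each batch):

```scala
import org.apache.spark.streaming.kafka.{KafkaUtils, OffsetRange}
import kafka.serializer.StringDecoder

// Choose topics and offset ranges yourself on every iteration.
val offsetRanges = Array(
  OffsetRange("topicA", partition = 0, fromOffset = 0L, untilOffset = 1000L)
)
val rdd = KafkaUtils.createRDD[String, String, StringDecoder, StringDecoder](
  sc, kafkaParams, offsetRanges)
```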
to KafkaUtils.createDirectStream to get
access to all of the MessageAndMetadata, including partition and offset, on
a per-message basis.
On Tue, Jul 28, 2015 at 7:48 AM, Shushant Arora shushantaror...@gmail.com
wrote:
Hi
I am processing kafka messages using spark streaming 1.3.
I am using mapPartitions function to process kafka message.
How can I access offset no of individual message getting being processed.
JavaPairInputDStream<byte[], byte[]> directKafkaStream
    = KafkaUtils.createDirectStream
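For per-message offsets, the createDirectStream overload that takes a messageHandler exposes the full MessageAndMetadata; a sketch (assuming `kafkaParams` and starting offsets are defined elsewhere):

```scala
import kafka.common.TopicAndPartition
import kafka.message.MessageAndMetadata
import kafka.serializer.StringDecoder
import org.apache.spark.streaming.kafka.KafkaUtils

// Placeholder starting offsets.
val fromOffsets: Map[TopicAndPartition, Long] = Map(TopicAndPartition("mytopic", 0) -> 0L)

val stream = KafkaUtils.createDirectStream[String, String, StringDecoder, StringDecoder,
    (String, Int, Long, String)](
  ssc, kafkaParams, fromOffsets,
  (mmd: MessageAndMetadata[String, String]) =>
    (mmd.topic, mmd.partition, mmd.offset, mmd.message())) // topic, partition, offset, payload
```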
we use simple Kafka API that does not use Zookeeper and offsets
tracked only by Spark Streaming within its checkpoints. This
eliminates inconsistencies between Spark Streaming and Zookeeper/Kafka,
and
so each record is received by Spark Streaming effectively exactly once
despite failures
Why are the offsets not tracked in zookeeper; is it because zookeeper is not
efficient for high writes and reads are not strictly consistent?
So do we have to call context.checkpoint(hdfsdir)? Or is there an implicit
checkpoint location? Does hdfs get used for that small data (just the
offsets)?
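On the checkpoint question: with the direct stream you do set the checkpoint directory yourself on the context; a sketch (the HDFS path and batch interval are placeholders):

```scala
import org.apache.spark.streaming.{Seconds, StreamingContext}

val ssc = new StreamingContext(sparkConf, Seconds(10))
// Offsets (and other small metadata) are stored under this directory.
ssc.checkpoint("hdfs:///user/me/streaming-checkpoints")
```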
In the receiver based approach, If the receiver crashes for any reason
(receiver crashed or executor crashed) the receiver should get restarted on
another executor and should start reading data from the offset present in
the zookeeper. There is some chance of data loss, which can be alleviated using
Hi,
There is another option to try for Receiver Based Low Level Kafka Consumer
which is part of Spark-Packages (
http://spark-packages.org/package/dibbhatt/kafka-spark-consumer) . This can
be used with WAL as well for end to end zero data loss.
This is also Reliable Receiver and Commit offset to
The receiver-based kafka createStream in spark 1.2 uses zookeeper to store
offsets. If you want finer-grained control over offsets, you can update
the values in zookeeper yourself before starting the job.
createDirectStream in spark 1.3 is still marked as experimental, and
subject to change.
Read the spark streaming guide ad the kafka integration guide for a better
understanding of how the receiver based stream works.
Capacity planning is specific to your environment and what the job is
actually doing; you'll need to determine it empirically.
On Friday, June 26, 2015, Shushant Arora
In 1.2, how do I handle offset management after the stream application starts
in each job? Should I commit the offset manually after job completion?
And what is the recommended number of consumer threads? Say I have 300
partitions in the kafka cluster. Load is ~1 million events per second. Each
event is ~500 bytes.
I am using spark streaming 1.2.
If the processing executors crash, will the receiver reset the offset back to
the last processed offset?
If the receiver itself crashes, is there a way to reset the offset, other
than to smallest or largest, without restarting the streaming application?
Is spark streaming 1.3
, 2015 at 11:56 AM, Shushant Arora shushantaror...@gmail.com
wrote:
hi
While using spark streaming (1.2) with kafka, I am getting the below error
and receivers are getting killed, but jobs get scheduled at each stream
interval.
15/06/23 18:42:35 WARN TaskSetManager: Lost task 0.1 in stage 18.0 (TID 82,
ip(XX)): java.io.IOException: Failed to connect to ip
make the
assembly, marking those dependencies as scope=provided.
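A sketch of what that looks like in an sbt build (versions are placeholders; the kafka integration jar still gets bundled, Spark itself does not):

```scala
// build.sbt fragment: keep spark out of the fat jar, bundle the kafka module.
libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-streaming"       % "1.2.1" % "provided",
  "org.apache.spark" %% "spark-streaming-kafka" % "1.2.1"  // goes into the assembly
)
```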
On Tue, Jun 23, 2015 at 11:56 AM, Shushant Arora
shushantaror...@gmail.com wrote:
Hi,
What are some of the good/adopted approaches to monitoring Spark Streaming
from Kafka? I see that there are things like
http://quantifind.github.io/KafkaOffsetMonitor, for example. Do they all
assume that Receiver-based streaming is used?
Then note that one disadvantage of this approach (Receiverless Approach, #2
TD
On Mon, Jun 1, 2015 at 2:23 PM, dgoldenberg dgoldenberg...@gmail.com
wrote:
/22/monitoring-stream-processing-tools-cassandra-kafka-and-spark/
Otis
On Mon, Jun 1, 2015 at 5:23 PM, dgoldenberg dgoldenberg...@gmail.com
wrote:
Hi guys,
I'm using spark streaming with kafka... On the local machine (started as a
java application without using spark-submit) it works: it connects to kafka
and does the job (*). I tried to put it into a spark docker container (hadoop
2.6, spark 1.3.1, tried spark-submit with local[5] and yarn-client too) but I'm
as not to overwhelm
the Spark consumers?
What would be some of the ways to throttle the streamed messages so that
the
consumers don't run out of memory?
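Spark Streaming exposes rate-limit settings for both consumption models; a sketch (the values are placeholders to tune empirically):

```scala
val conf = new org.apache.spark.SparkConf()
  // Receiver-based streams: max records per second per receiver.
  .set("spark.streaming.receiver.maxRate", "10000")
  // Direct (receiverless) streams, Spark 1.3+: max records per second per kafka partition.
  .set("spark.streaming.kafka.maxRatePerPartition", "1000")
```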
Hi,
As Spark streaming is being nicely integrated with consuming messages from
Kafka, so I thought of asking the forum, that is there any implementation
available for pushing data to Kafka from Spark Streaming too?
Any link(s) will be helpful.
Thanks and Regards,
Twinkle
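As far as I know there is no built-in Kafka sink in Spark Streaming at this point; a common pattern is to use the plain Kafka producer inside foreachPartition (a sketch; broker list, topic name, and the `stream` variable are placeholders):

```scala
import java.util.Properties
import org.apache.kafka.clients.producer.{KafkaProducer, ProducerRecord}

stream.foreachRDD { rdd =>
  rdd.foreachPartition { records =>
    // One producer per partition, created on the executor.
    val props = new Properties()
    props.put("bootstrap.servers", "broker1:9092")
    props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer")
    props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer")
    val producer = new KafkaProducer[String, String](props)
    records.foreach(r => producer.send(new ProducerRecord("out-topic", r.toString)))
    producer.close()
  }
}
```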
.) You can say each receiver will run on a single core.
Thanks
Best Regards
On Wed, Apr 15, 2015 at 3:46 PM, Shushant Arora shushantaror...@gmail.com
wrote:
Hi
I want to understand the flow of spark streaming with kafka.
In spark Streaming, are the executor nodes the same at each run of the
streaming interval, or does the cluster manager assign new executor nodes at
each stream interval for processing that batch's input? If yes, then at each
batch interval new executors
...@sigmoidanalytics.com
*Date:* 2015-04-15 19:12
*To:* Shushant Arora shushantaror...@gmail.com
*CC:* user user@spark.apache.org
*Subject:* Re: spark streaming with kafka
Once you start your streaming application to read from Kafka, it will
launch receivers on the executor nodes. And you can see them
Or you could build an uber jar ( you could google that )
https://eradiating.wordpress.com/2015/02/15/getting-spark-streaming-on-kafka-to-work/
--- Original Message ---
From: Akhil Das ak...@sigmoidanalytics.com
Sent: April 4, 2015 11:52 PM
To: Priya Ch learnings.chitt...@gmail.com
Cc: user
Somewhat agree on subclassing and its issues. It looks like the alternative
in spark 1.3.0 is to create a custom build. Is there an enhancement filed for
this? If not, I'll file one.
Thanks!
-neelesh
On Wed, Apr 1, 2015 at 12:46 PM, Tathagata Das t...@databricks.com wrote:
The challenge of
from the IDE, the application runs fine.
But when I submit the same to the spark cluster in standalone mode, I end up
with the following exception:
java.lang.ClassNotFoundException:
org/apache/spark/streaming/kafka/KafkaUtils.
I am using spark-1.2.1. When I checked the source files of streaming, the
source files related to kafka are missing. Are these not included in the
spark-1.3.0 and spark-1.2.1 versions?
With receivers, it was pretty obvious which code ran where - each receiver
occupied a core and ran on the workers. However, with the new kafka direct
input streams, it's hard for me to understand where the code that's reading
from kafka brokers runs. Does it run on the driver (I hope not), or does
Thanks Cody, that was really helpful. I have a much better understanding
now. One last question - Kafka topics are initialized once in the driver,
is there an easy way of adding/removing topics on the fly?
KafkaRDD#getPartitions() seems to be computed only once, and no way of
refreshing them.
https://github.com/koeninger/kafka-exactly-once/blob/master/blogpost.md
The kafka consumers run in the executors.
On Wed, Apr 1, 2015 at 11:18 AM, Neelesh neele...@gmail.com wrote:
If you want to change topics from batch to batch, you can always just
create a KafkaRDD repeatedly.
The streaming code as it stands assumes a consistent set of topics though.
The implementation is private, so you can't subclass it without building your
own spark.
On Wed, Apr 1, 2015 at 1:09 PM,
As I said in the original ticket, I think the implementation classes should
be exposed so that people can subclass and override compute() to suit their
needs.
Just adding a function from Time => Set[TopicAndPartition] wouldn't be
sufficient for some of my current production use cases.
compute()
Thanks Cody!
On Wed, Apr 1, 2015 at 11:21 AM, Cody Koeninger c...@koeninger.org wrote:
If you want to change topics from batch to batch, you can always just
create a KafkaRDD repeatedly.
The streaming code as it stands assumes a consistent set of topics
though. The implementation is
We should be able to support that use case in the direct API. It may be as
simple as allowing the users to pass on a function that returns the set of
topic+partitions to read from.
That is, a function (Time) => Set[TopicAndPartition]. This gets called every
batch interval before the offsets are
The challenge of opening up these internal classes to public (even with
Developer API tag) is that it prevents us from making non-trivial changes
without breaking API compatibility for all those who had subclassed. It's a
tradeoff that is hard to optimize. That's why we favor exposing more
optional
Hello,
@Akhil Das I'm trying to use the experimental API
https://github.com/apache/spark/blob/master/examples/scala-2.10/src/main/scala/org/apache/spark/examples/streaming/DirectKafkaWordCount.scala
Can you show us the output of DStream#print() if you have it ?
Thanks
On Tue, Mar 31, 2015 at 2:55 AM, Nicolas Phung nicolas.ph...@gmail.com
wrote:
On Mon, Mar 30, 2015 at 11:05 AM, Nicolas Phung nicolas.ph...@gmail.com
wrote:
Hello,
I'm using spark-streaming-kafka 1.3.0 with the new consumer Approach 2:
Direct Approach (No Receivers) (
http://spark.apache.org/docs/latest/streaming-kafka-integration.html). I'm
using the following code snippets :
// Create direct kafka stream with brokers and topics
val messages
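The code snippet above is cut off at `val messages`; a typical Spark 1.3 direct-stream setup looks roughly like this (a sketch; broker list and topic names are placeholders):

```scala
import kafka.serializer.StringDecoder
import org.apache.spark.streaming.kafka.KafkaUtils

// Create direct kafka stream with brokers and topics
val kafkaParams = Map("metadata.broker.list" -> "broker1:9092,broker2:9092")
val topics = Set("topicA", "topicB")
val messages = KafkaUtils.createDirectStream[String, String, StringDecoder, StringDecoder](
  ssc, kafkaParams, topics)
```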
I want to write a spark streaming consumer for kafka in java. I want to
process the data in real-time as well as store the data in hdfs in
year/month/day/hour/ format. I am not sure how to achieve this. Should I
write separate kafka consumers, one for writing data to HDFS and one for
spark
), it will easily put the data into the
directory structure you are after.
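One way to get the year/month/day/hour layout from the same streaming job, sketched with foreachRDD (the base path and `stream` variable are placeholders):

```scala
import java.text.SimpleDateFormat
import java.util.Date

stream.foreachRDD { (rdd, time) =>
  // Derive the output directory from the batch time.
  val fmt = new SimpleDateFormat("yyyy/MM/dd/HH")
  val dir = s"hdfs:///data/events/${fmt.format(new Date(time.milliseconds))}"
  if (!rdd.isEmpty()) rdd.saveAsTextFile(dir)
}
```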
On Fri, Feb 6, 2015 at 12:19 AM, Mohit Durgapal durgapalmo...@gmail.com
wrote:
(in which case refer to #1).
On Fri Feb 06 2015 at 6:16:39 AM Mohit Durgapal durgapalmo...@gmail.com
wrote:
Maybe, you can use alternative kafka receiver which I wrote:
https://github.com/mykidong/spark-kafka-simple-consumer-receiver
- Kidong.
On Wed, Jan 21, 2015 at 7:46 AM, firemonk9 dhiraj.peech...@gmail.com
wrote:
Hi,
I am having similar issues. Have you found any resolution ?
Thank you
I have the same issue.
- Original Message -
From: Rasika Pohankar rasikapohan...@gmail.com
Sent: 18/01/2015 18:48
To: user@spark.apache.org user@spark.apache.org
Subject: Spark Streaming with Kafka
I am using Spark Streaming to process data received through Kafka. The Spark
if the problem was in that version. But it is still happening after
upgrading as well.
Is this a known issue? Can someone please help?
Thanking you.
There is a WIP pull request[1] working on this, it should be merged
into master soon.
[1] https://github.com/apache/spark/pull/3715
On Fri, Dec 19, 2014 at 2:15 AM, Oleg Ruchovets oruchov...@gmail.com wrote:
Hi,
I've just seen that spark streaming supports python from version 1.2.
Question: does spark streaming (python version) support kafka integration?
Thanks
Oleg.
Hi,
While running my spark streaming application built on spark 1.1.0 I am
getting below error.
*14/11/18 15:35:30 ERROR ReceiverTracker: Deregistered receiver for stream
0: Error starting receiver 0 - java.lang.AbstractMethodError*
* at org.apache.spark.Logging$class.log(Logging.scala:52)*
* at
Hi,
do you have some logging backend (log4j, logback) on your classpath? This
seems a bit like there is no particular implementation of the abstract
`log()` method available.
Tobias