The exception also occurs when the client has many connections open to another
external server.
So I think the exception is coming from a client-side issue only; on the
server side there is no issue.
I want to understand whether the executor (simple consumer) is not making a
new connection to the Kafka broker at the start of each
HI All,
Currently using DSE 4.7 and Spark version 1.2.2.
Regards,
Satish
On Fri, Aug 21, 2015 at 7:30 PM, java8964 java8...@hotmail.com wrote:
What version of Spark are you using, or does it come with DSE 4.7?
We just cannot reproduce it in Spark.
yzhang@localhost$ more test.spark
val pairs =
Hmm, for a single-core VM you will have to run it in local mode (specifying
master=local[4]). The flag is available in all versions of Spark, I guess.
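For reference, a minimal sketch of what this looks like in Scala (the app name is a placeholder; assumes a Spark 1.x environment):

```scala
// Sketch only: run Spark in local mode with 4 worker threads.
// This works even on a single-core VM: local[4] means 4 threads
// in one JVM, not 4 physical cores.
import org.apache.spark.{SparkConf, SparkContext}

val conf = new SparkConf()
  .setAppName("local-mode-demo") // placeholder name
  .setMaster("local[4]")         // 4 threads in a single JVM
val sc = new SparkContext(conf)
```

Equivalently, spark-submit --master local[4] sets the same thing from the command line.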
On Aug 22, 2015 5:04 AM, Sateesh Kavuri sateesh.kav...@gmail.com wrote:
Thanks Akhil. Does this mean that the executor running in the VM can
1. How do I work with partitions in Spark Streaming from Kafka?
2. How do I create partitions in Spark Streaming from Kafka?
When I send messages from a Kafka topic that has three partitions, will Spark
listen for the messages when I call KafkaUtils.createStream or
createDirectStream with local[4]?
Now i
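To illustrate the two APIs asked about above, here is a hedged sketch (broker, ZooKeeper, group, and topic names are placeholders; the direct stream requires Spark 1.3+):

```scala
// Sketch only: the two ways to consume a 3-partition Kafka topic
// from Spark Streaming 1.x.
import kafka.serializer.StringDecoder
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka.KafkaUtils

val conf = new SparkConf().setAppName("kafka-sketch").setMaster("local[4]")
val ssc  = new StreamingContext(conf, Seconds(5))

// Receiver-based stream: the 3 is the number of consumer threads,
// usually matched to the topic's partition count.
val receiverStream =
  KafkaUtils.createStream(ssc, "zk-host:2181", "my-group", Map("my-topic" -> 3))

// Direct stream (Spark 1.3+): one RDD partition per Kafka partition,
// so a 3-partition topic yields 3 Spark partitions automatically.
val directStream =
  KafkaUtils.createDirectStream[String, String, StringDecoder, StringDecoder](
    ssc, Map("metadata.broker.list" -> "broker:9092"), Set("my-topic"))

directStream.map(_._2).print() // values only
ssc.start()
ssc.awaitTermination()
```

Note that with the receiver-based stream, local[4] matters: one thread is taken by the receiver itself, so at least two threads are needed for any processing to happen.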
Hi Rishitesh,
We are not using any RDDs to parallelize the processing, and all of the
algorithm runs on a single core (and in a single thread). The parallelism
is done at the user level.
The disk I/O can be started in a separate thread, but then the executor will
not be able to take up more jobs, since
On trying the consumer without external connections, or with a low number of
external connections, it works fine -
so the doubt is how the socket got closed -
java.io.EOFException: Received -1 when reading from channel, socket
has likely been closed.
On Sat, Aug 22, 2015 at 7:24 PM, Akhil Das
subscribe
-
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org
Reposting my question from SO:
http://stackoverflow.com/questions/32161865/elasticsearch-analyze-not-compatible-with-spark-in-python
I'm using the elasticsearch-py client within PySpark with Python 3, and I'm
running into a problem using the analyze() function with ES in conjunction
with an RDD.
Hi Akhil,
Think of the scenario as running a piece of code in plain Java with
multiple threads. Let's say there are 4 threads spawned by a Java process to
handle reading from a database, some processing, and storing to a database. In
this process, while a thread is performing database I/O, the CPU
Last time I checked, Camus doesn't support storing data as Parquet, which
is a deal-breaker for me. Otherwise it works well for my Kafka topics with
low data volume.
I am currently using Spark Streaming to ingest data, generate semi-real-time
stats, publish to a dashboard, and dump the full dataset
https://www.youtube.com/watch?v=umDr0mPuyQc
On Sat, Aug 22, 2015 at 8:01 AM, Ted Yu yuzhih...@gmail.com wrote:
See http://spark.apache.org/community.html
Cheers
On Sat, Aug 22, 2015 at 2:51 AM, Lars Hermes
li...@hermes-it-consulting.de wrote:
subscribe
Hi All,
We have a Spark standalone cluster running 1.4.1, and we are setting
spark.io.compression.codec to lzf.
I have a long-running interactive application which behaves as normal,
but after a few days I get the following exception in multiple jobs. Any
ideas on what could be causing this?
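For context, this is how such a codec setting is typically applied (a sketch; the app name is a placeholder, and the valid codec values here are as documented for Spark 1.4):

```scala
// Sketch only: select the LZF codec for shuffle/broadcast/RDD
// compression instead of the 1.4 default (snappy).
import org.apache.spark.SparkConf

val conf = new SparkConf()
  .setAppName("lzf-demo") // placeholder
  .set("spark.io.compression.codec", "lzf") // alternatives: "snappy", "lz4"
```

The same property can also be set in spark-defaults.conf or via --conf on spark-submit.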
Currently I am using Spark 0.9, and I wrote code in Java for Spark SQL on my
data. Now I want to use Spark 1.4, so how do I do that, and what changes do I
have to make for the tables? I have a .sql file, a pom file, and a .py file.
I am using S3 for storage.
Interesting. TD, can you please throw some light on why this is, and point
to the relevant code in the Spark repo? It will help in a better understanding
of things that can affect a long-running streaming job.
On Aug 21, 2015 1:44 PM, Tathagata Das t...@databricks.com wrote:
Could you periodically
Thanks Akhil. Does this mean that the executor running in the VM can spawn
two concurrent jobs on the same core? If this is the case, this is what we
are looking for. Also, which version of Spark is this flag in?
Thanks,
Sateesh
On Sat, Aug 22, 2015 at 1:44 AM, Akhil Das
On trying the consumer without external connections, or with a low
number of external connections, it works fine -
so the doubt is how the socket got closed -
15/08/21 08:54:54 ERROR executor.Executor: Exception in task 262.0 in
stage 130.0 (TID 16332)
java.io.EOFException: Received -1 when reading
See http://spark.apache.org/community.html
Cheers
On Sat, Aug 22, 2015 at 2:51 AM, Lars Hermes li...@hermes-it-consulting.de
wrote:
subscribe
In Spark 1.4, there was considerable refactoring around the interaction with
Hive, such as SPARK-7491.
It would not be straightforward to port ORC support to 1.3.
FYI
On Fri, Aug 21, 2015 at 10:21 PM, dong.yajun dongt...@gmail.com wrote:
hi Ted,
thanks for your reply, are there any other way to
Can you try some other consumer and see if the issue still exists?
On Aug 22, 2015 12:47 AM, Shushant Arora shushantaror...@gmail.com
wrote:
The exception also occurs when the client has many connections open to another
external server.
So I think the exception is coming from a client-side issue
I think you can also give this consumer a try:
http://spark-packages.org/package/dibbhatt/kafka-spark-consumer in your
environment. This has been running fine for topics with a large number of
Kafka partitions (200) like yours without any issue.. no issue with
connections, as this consumer re-uses
To be perfectly clear, the direct Kafka stream will also recover from any
failures, because it does the simplest thing possible - it fails the task and
lets Spark retry it.
If you're consistently having socket-closed problems on one task after
another, there's probably something else going on in your
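The retry behaviour described above is bounded by a standard Spark setting; as a sketch (the value 8 is just an example):

```scala
// Sketch only: a failed task (e.g. a Kafka fetch hitting a closed
// socket) is re-attempted up to spark.task.maxFailures times
// (default 4) before the whole stage is aborted.
import org.apache.spark.SparkConf

val conf = new SparkConf()
  .set("spark.task.maxFailures", "8")
```

Raising this can mask a flaky broker connection, but as noted above, consistent socket-closed failures usually point to a problem outside Spark.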