It would seem that I have hit SPARK-10787, an OOME
during ClosureCleaner#ensureSerializable(). I am trying to run LSH over a
SparseVector consisting of ~4M features with no more than 3K non-zero
values per vector. I am hitting this OOME before the hashes are even
calculated.
I know the issue is re
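For anyone trying to reproduce this, here is a minimal sketch of the setup being described; the data and parameter values are illustrative assumptions, using Spark ML's MinHashLSH:

import org.apache.spark.ml.feature.MinHashLSH
import org.apache.spark.ml.linalg.Vectors

// Toy stand-in for the real data: sparse vectors with ~4M features
// and only a handful of non-zero entries each.
val df = spark.createDataFrame(Seq(
  (0, Vectors.sparse(4000000, Seq((3, 1.0), (17, 1.0)))),
  (1, Vectors.sparse(4000000, Seq((17, 1.0), (42, 1.0))))
)).toDF("id", "features")

val lsh = new MinHashLSH()
  .setNumHashTables(5)
  .setInputCol("features")
  .setOutputCol("hashes")

// The OOME reported above fires during closure cleaning,
// i.e. before these hashes are ever materialized.
val model = lsh.fit(df)
val hashed = model.transform(df)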
Hi All,
can we expect a UUID type in Spark 2.3? It looks like it could help a lot of
downstream sources with modeling.
Thanks!
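Until such a type lands, a common workaround is to model UUIDs as plain strings; a minimal sketch, assuming any existing DataFrame:

import java.util.UUID
import org.apache.spark.sql.functions.udf

val df = spark.range(3).toDF("id")                  // stand-in DataFrame
val makeUuid = udf(() => UUID.randomUUID().toString)
val withIds = df.withColumn("uuid", makeUuid())     // UUID modeled as a string column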
Hi list,
I have a Spark cluster with 3 nodes. I'm calling spark-shell with some
packages to connect to AWS S3 and Cassandra:
spark-shell \
  --packages org.apache.hadoop:hadoop-aws:2.7.3,com.amazonaws:aws-java-sdk:1.7.4,datastax:spark-cassandra-connector:2.0.6-s_2.11 \
  --conf spark.cassandra.co
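For comparison, a full invocation along these lines might look as follows; spark.cassandra.connection.host is the connector's standard setting, and the host value here is a placeholder:

spark-shell \
  --packages org.apache.hadoop:hadoop-aws:2.7.3,com.amazonaws:aws-java-sdk:1.7.4,datastax:spark-cassandra-connector:2.0.6-s_2.11 \
  --conf spark.cassandra.connection.host=192.168.1.10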
Hi Vishnu/Jacek:
Thanks for your responses.
Jacek - At the moment, the notion of time in my use case is processing time.
Vishnu - Spark documentation
(https://spark.apache.org/docs/latest/structured-streaming-programming-guide.html)
does indicate that it can dedup using a watermark. So I believe th
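For reference, the documented pattern looks roughly like this; the column names and the rate source are stand-ins for the real input:

import org.apache.spark.sql.functions.expr

// Toy streaming source standing in for the real stream.
val streamingDf = spark.readStream
  .format("rate")
  .load()
  .withColumnRenamed("timestamp", "eventTime")
  .withColumn("guid", expr("CAST(value % 10 AS STRING)"))

// Dedup with a watermark: duplicate guids arriving within the
// 10-minute watermark window are dropped, and state stays bounded.
val deduped = streamingDf
  .withWatermark("eventTime", "10 minutes")
  .dropDuplicates("guid", "eventTime")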
Hello,
I would like to limit task duration to prevent big tasks such as « SELECT * FROM
toto », or to limit the CPU time and then kill the task/job.
Is that possible? (A kind of watchdog)
Many thanks,
Thomas Decaux
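There is no built-in watchdog that I know of, but one sketch under that assumption is to run the job in a job group with interruptOnCancel and cancel it from a timer thread. The group name and the 60-second budget below are illustrative:

import java.util.concurrent.{Executors, TimeUnit}

// Tag everything submitted from this thread with a cancellable group.
sc.setJobGroup("watched-query", "guarded big SELECT", interruptOnCancel = true)

// Watchdog: cancel the whole group if it is still running after 60s.
val watchdog = Executors.newSingleThreadScheduledExecutor()
watchdog.schedule(new Runnable {
  def run(): Unit = sc.cancelJobGroup("watched-query")
}, 60, TimeUnit.SECONDS)

val rows = spark.sql("SELECT * FROM toto").collect()  // the guarded job
watchdog.shutdownNow()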
Turns out it was the master recovery directory that was messing things up. What was written there was from Spark 2.0.2, and after replacing the master, the recovery process would fail with that error, but there were no clues that that was what was happening.
What version of java?
On Feb 1, 2018 11:30 AM, "Mihai Iacob" wrote:
> I am setting up a Spark 2.2.1 cluster; however, when I bring up the master
> and workers (both on Spark 2.2.1) I get this error. I tried Spark 2.2.0 and
> got the same error. It works fine on Spark 2.0.2. Have you seen this
>
I am using Spark 2.1.0
On Fri, Feb 2, 2018 at 5:08 PM, Pralabh Kumar
wrote:
> Hi
>
> I am performing a broadcast join where my small table is 1 GB. I am
> getting the following error.
>
> I am using
>
>
> org.apache.spark.SparkException: Kryo serialization failed: Buffer
> overflow. Available: 0, required: 28869232. To avoid this, increase
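For context, this is the explicit form of a broadcast join in the DataFrame API; the toy tables below are stand-ins for the real 1 GB case:

import org.apache.spark.sql.functions.broadcast

val largeDf = spark.range(1000000L).toDF("key")   // stand-in big side
val smallDf = spark.range(100L).toDF("key")       // stand-in small side

// Explicit hint: ship smallDf to every executor instead of shuffling.
val joined = largeDf.join(broadcast(smallDf), Seq("key"))

A 1 GB broadcast table still has to be serialized in one piece, which is where the Kryo buffer cap in the quoted error comes in.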
Hi
I am performing a broadcast join where my small table is 1 GB. I am getting
the following error.
I am using
org.apache.spark.SparkException: Kryo serialization failed: Buffer overflow.
Available: 0, required: 28869232. To avoid this, increase
spark.kryoserializer.buffer.max value.
I increased the value via
spark.conf.set("spark.kryose
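Worth noting: spark.kryoserializer.buffer.max is read when the executors start, so setting it with spark.conf.set on a live session generally has no effect. A sketch of setting it up front; the 1g value is an assumption:

import org.apache.spark.sql.SparkSession

// Set the Kryo buffer cap before the session/executors start.
val spark = SparkSession.builder()
  .appName("broadcast-join")
  .config("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
  .config("spark.kryoserializer.buffer.max", "1g")  // must stay below 2048m
  .getOrCreate()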
Great to hear two different viewpoints, and thanks a lot for your input,
Michael. For now, our application performs an ETL process: it reads
data from Kafka, stores it in HBase, then performs basic enhancement
and pushes the data out on a Kafka topic.
We have a conflict of opinion here as few p
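For context, the Kafka legs of such a pipeline in Structured Streaming look roughly like this; the broker, topic names, and checkpoint path are assumptions, and the HBase/enhancement steps are omitted:

// Read raw events from Kafka.
val input = spark.readStream
  .format("kafka")
  .option("kafka.bootstrap.servers", "broker:9092")
  .option("subscribe", "raw-events")
  .load()

// ... store to HBase and apply the basic enhancement here (omitted) ...

// Push the enhanced records out on another Kafka topic.
val query = input
  .selectExpr("CAST(key AS STRING) AS key", "CAST(value AS STRING) AS value")
  .writeStream
  .format("kafka")
  .option("kafka.bootstrap.servers", "broker:9092")
  .option("topic", "enhanced-events")
  .option("checkpointLocation", "/tmp/etl-checkpoint")
  .start()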