[ 
https://issues.apache.org/jira/browse/SPARK-20389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16482408#comment-16482408
 ] 

Arseniy Tashoyan commented on SPARK-20389:
------------------------------------------

I confirm: the issue is still present in Spark 2.2.1. I hit it when running 
KMeans on a dataset of ~4M records while trying to get ~500 clusters.

I tried to avoid the problem by increasing spark.sql.shuffle.partitions to a 
large number like 500, but then I got OutOfMemoryError: Java heap space on the 
executors. So this workaround is suitable only when a capable cluster is at 
hand.

> Upgrade kryo to fix NegativeArraySizeException
> ----------------------------------------------
>
>                 Key: SPARK-20389
>                 URL: https://issues.apache.org/jira/browse/SPARK-20389
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core, Spark Submit
>    Affects Versions: 2.1.0
>         Environment: Linux, Centos7, jdk8
>            Reporter: Georg Heiler
>            Priority: Major
>
> I am experiencing an issue with Kryo when writing parquet files. Similar to 
> https://github.com/broadinstitute/gatk/issues/1524, a 
> NegativeArraySizeException occurs. Apparently this is fixed in a current 
> Kryo version, but Spark is still using the very old Kryo 3.3. 
> Can you please upgrade to a fixed Kryo version?



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
