[ https://issues.apache.org/jira/browse/SPARK-20389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16482408#comment-16482408 ]
Arseniy Tashoyan commented on SPARK-20389:
------------------------------------------

I confirm that the issue is still present in Spark 2.2.1. I hit it when running KMeans on a dataset of ~4M records while trying to get ~500 clusters. I tried to avoid the problem by increasing spark.sql.shuffle.partitions to a large number like 500, but then I got OutOfMemoryError: Java heap space on the executors. So this workaround is suitable only with a capable cluster at hand. (A minimal sketch of this workaround follows at the end of this message.)

> Upgrade kryo to fix NegativeArraySizeException
> ----------------------------------------------
>
>                 Key: SPARK-20389
>                 URL: https://issues.apache.org/jira/browse/SPARK-20389
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core, Spark Submit
>    Affects Versions: 2.1.0
>        Environment: Linux, CentOS 7, JDK 8
>            Reporter: Georg Heiler
>            Priority: Major
>
> I am experiencing an issue with Kryo when writing parquet files. Similar to
> https://github.com/broadinstitute/gatk/issues/1524, a
> NegativeArraySizeException occurs. Apparently this is fixed in a current Kryo
> version. Spark is still using the very old 3.3 Kryo.
> Can you please upgrade to a fixed Kryo version?
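For reference, a minimal sketch of the workaround described in the comment above, assuming Spark 2.2.x with the spark.ml API. The application name, input path, and feature column name are hypothetical; only the spark.sql.shuffle.partitions setting and the cluster count come from the comment itself.

    import org.apache.spark.ml.clustering.KMeans
    import org.apache.spark.sql.SparkSession

    val spark = SparkSession.builder()
      .appName("KMeansShufflePartitionsWorkaround") // hypothetical app name
      .getOrCreate()

    // Raise the shuffle partition count from the default of 200 so that each
    // serialized shuffle block stays smaller. As noted above, the executors
    // still need sufficient heap, otherwise an OutOfMemoryError: Java heap
    // space appears instead of the NegativeArraySizeException.
    spark.conf.set("spark.sql.shuffle.partitions", "500")

    // Hypothetical input: a parquet dataset with a pre-assembled
    // "features" vector column.
    val dataset = spark.read.parquet("/path/to/features.parquet")

    val kmeans = new KMeans()
      .setK(500)                  // ~500 clusters, as in the comment above
      .setFeaturesCol("features")

    val model = kmeans.fit(dataset)

Whether this sidesteps the exception depends on how much data each partition ends up holding; on a small cluster, the executor heap (e.g. spark-submit --executor-memory) may need to be raised as well, which is exactly the trade-off the comment describes.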