Giambattista created SPARK-19796:
------------------------------------

             Summary: taskScheduler fails serializing long statements received by thrift server
                 Key: SPARK-19796
                 URL: https://issues.apache.org/jira/browse/SPARK-19796
             Project: Spark
          Issue Type: Bug
          Components: Spark Core
    Affects Versions: 2.2.0
            Reporter: Giambattista
This problem was observed after the changes made for SPARK-17931. In my use case I am sending very long insert statements to the Spark thrift server, and they fail at TaskDescription.scala:89 because writeUTF cannot write strings longer than 64 KB (see https://www.drillio.com/en/2009/java-encoded-string-too-long-64kb-limit/ for a description of the issue). As suggested by Imran Rashid, I tracked down the offending key: it is "spark.job.description", and it contains the complete SQL statement.

The problem can be reproduced by creating a table:

    create table test (a int) using parquet

and then sending an insert statement generated like this:

    scala> val r = 1 to 128000
    scala> println("insert into table test values (" + r.mkString("),(") + ")")

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
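To illustrate the underlying limit (not part of the original report): `java.io.DataOutputStream.writeUTF` writes a 2-byte length prefix, so it throws `UTFDataFormatException` whenever the modified-UTF-8 encoding of the string exceeds 65535 bytes. A minimal Java sketch, with the hypothetical class name `WriteUtfLimit`:

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UTFDataFormatException;

public class WriteUtfLimit {
    // Returns true if DataOutputStream.writeUTF rejects a string of `len` ASCII chars.
    static boolean tooLongForWriteUtf(int len) throws IOException {
        StringBuilder sb = new StringBuilder(len);
        for (int i = 0; i < len; i++) sb.append('x');
        DataOutputStream out = new DataOutputStream(new ByteArrayOutputStream());
        try {
            // writeUTF's 2-byte length prefix caps the encoded payload at 65535 bytes
            out.writeUTF(sb.toString());
            return false;
        } catch (UTFDataFormatException e) {
            return true;
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("70000 chars too long: " + tooLongForWriteUtf(70000));
        System.out.println("100 chars too long:   " + tooLongForWriteUtf(100));
    }
}
```

A "spark.job.description" value holding the full 128000-row insert statement is far over this cap, which is why serialization fails.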