[ https://issues.apache.org/jira/browse/SPARK-19796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15893376#comment-15893376 ]
Kay Ousterhout commented on SPARK-19796:
----------------------------------------

Do you think we should (separately) fix the underlying problem? Specifically, we could:

(a) Not send the SPARK_JOB_DESCRIPTION property to the workers, since it is only used on the master for the UI (and while users *could* access it, the variable name SPARK_JOB_DESCRIPTION is Spark-private, which suggests that it shouldn't be used by users). Perhaps this is too risky because users could be using it?

(b) Truncate SPARK_JOB_DESCRIPTION to something reasonable (100 characters?) before sending it to the workers. This is more backwards compatible if users are actually reading the property, but maybe a useless intermediate approach?

(c) (Possibly in addition to one of the above) Log a warning if any of the properties is longer than 100 characters (or some other threshold).

Thoughts? I can file a JIRA if you think any of these is worthwhile. (Illustrative sketches of the failure mode and of options (b)/(c) follow below.)

> taskScheduler fails serializing long statements received by thrift server
> -------------------------------------------------------------------------
>
>                 Key: SPARK-19796
>                 URL: https://issues.apache.org/jira/browse/SPARK-19796
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 2.2.0
>            Reporter: Giambattista
>            Priority: Blocker
>
> This problem was observed after the changes made for SPARK-17931.
> In my use case I'm sending very long insert statements to the Spark Thrift server, and they fail at TaskDescription.scala:89 because writeUTF fails when asked to write strings longer than 64 KB (see https://www.drillio.com/en/2009/java-encoded-string-too-long-64kb-limit/ for a description of the issue).
> As suggested by Imran Rashid, I tracked down the offending key: it is "spark.job.description", and it contains the complete SQL statement.
> The problem can be reproduced by creating a table like:
>   create table test (a int) using parquet
> and by sending an insert statement like:
>   scala> val r = 1 to 128000
>   scala> println("insert into table test values (" + r.mkString("),(") + ")")
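
For reference, the 64 KB limit described above is inherent to java.io.DataOutputStream.writeUTF, which prefixes the encoded string with an unsigned 16-bit length. A minimal, self-contained Scala sketch of the failure mode (the names here are illustrative, not Spark code):

    import java.io.{ByteArrayOutputStream, DataOutputStream, UTFDataFormatException}

    object WriteUtfLimitDemo {
      def main(args: Array[String]): Unit = {
        val out = new DataOutputStream(new ByteArrayOutputStream())
        // writeUTF stores the payload length in 2 bytes, so any string whose
        // modified-UTF-8 encoding exceeds 65535 bytes cannot be written.
        val longStatement = "x" * 70000
        try {
          out.writeUTF(longStatement)
        } catch {
          case e: UTFDataFormatException =>
            println("writeUTF failed as expected: " + e.getMessage)
        }
      }
    }

This is the same limit TaskDescription hits when spark.job.description carries the full SQL text.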
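
And a minimal sketch of what options (b) and (c) combined might look like; the helper name, threshold, and logging call are assumptions for illustration, not Spark's actual internals:

    object TaskPropertySanitizer {
      // Hypothetical threshold; the comment above suggests ~100 characters.
      private val MaxValueLength = 100

      // Option (b): truncate the value before it is serialized into the
      // task description; option (c): warn so the user knows it happened.
      def sanitize(key: String, value: String): String = {
        if (value.length <= MaxValueLength) {
          value
        } else {
          System.err.println(
            s"WARN: property $key is ${value.length} characters long; " +
            s"truncating to $MaxValueLength before sending it to executors")
          value.substring(0, MaxValueLength)
        }
      }
    }

Truncating at serialization time keeps the full description available on the driver for the UI while keeping task messages small.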