[ 
https://issues.apache.org/jira/browse/SPARK-19796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15893376#comment-15893376
 ] 

Kay Ousterhout commented on SPARK-19796:
----------------------------------------

Do you think we should (separately) fix the underlying problem?  Specifically, 
we could:

(a) not send the SPARK_JOB_DESCRIPTION property to the workers, since it's only 
used on the master for the UI (and while users *could* access it, the variable 
name SPARK_JOB_DESCRIPTION is spark-private, which suggests that it shouldn't 
be used by users).  Perhaps this is too risky because users could be using it?

(b) Truncate SPARK_JOB_DESCRIPTION to something reasonable (100 characters?) 
before sending it to the workers.  This is more backwards compatible if users 
are actually reading the property, but maybe a useless intermediate approach?

(c) (Possibly in addition to one of the above) Log a warning if any of the 
properties is longer than 100 characters (or some threshold).

Thoughts?  I can file a JIRA if you think any of these is worthwhile.

> taskScheduler fails serializing long statements received by thrift server
> -------------------------------------------------------------------------
>
>                 Key: SPARK-19796
>                 URL: https://issues.apache.org/jira/browse/SPARK-19796
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 2.2.0
>            Reporter: Giambattista
>            Priority: Blocker
>
> This problem was observed after the changes made for SPARK-17931.
> In my use-case I'm sending very long insert statements to Spark thrift server 
> and they are failing at TaskDescription.scala:89 because writeUTF fails if 
> requested to write strings longer than 64Kb (see 
> https://www.drillio.com/en/2009/java-encoded-string-too-long-64kb-limit/ for 
> a description of the issue).
> As suggested by Imran Rashid I tracked down the offending key: it is 
> "spark.job.description" and it contains the complete SQL statement.
> The problem can be reproduced by creating a table like:
> create table test (a int) using parquet
> and by sending an insert statement like:
> scala> val r = 1 to 128000
> scala> println("insert into table test values (" + r.mkString("),(") + ")")



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org

Reply via email to