[ https://issues.apache.org/jira/browse/SPARK-35667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17359710#comment-17359710 ]
yuanxm edited comment on SPARK-35667 at 6/9/21, 11:36 AM:
----------------------------------------------------------

Tried 3.1.2 and 2.4.8 with spark.dynamicAllocation.enabled=false; both behave normally.

[~angerszhuuu] [~maropu]


was (Author: yuanxm5):

Tried 3.1.2; the problem still occurs.

[~angerszhuuu] [~maropu]

> spark.speculation causes incorrect query results with TRANSFORM
> ---------------------------------------------------------------
>
>                 Key: SPARK-35667
>                 URL: https://issues.apache.org/jira/browse/SPARK-35667
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.4.8
>            Reporter: yuanxm
>            Priority: Major
>         Attachments: image-2021-06-08-10-02-34-979.png
>
>
> The following SQL sometimes returns incorrect results when spark.speculation is
> true:
> {code:java}
> SELECT count(1)
> FROM
>   (SELECT TRANSFORM(tmpa1.*) USING "python test.py" AS (dt)
>    FROM
>      (SELECT dt
>       FROM test_table) tmpa1) tmpa2{code}
> With spark.speculation=true, the count is lower than the correct value.
> Incorrect results become more likely as the number of speculative tasks
> increases.
> `test.py`:
> {code:java}
> import sys
>
> # Rewrite each input row as tab-separated fields, one output row per input row.
> for line in sys.stdin:
>     line = line.strip()
>     arr = line.split()
>     print("\t".join(arr)){code}
>
> spark-sql command:
> {code:java}
> ./bin/spark-sql --master yarn \
>   --conf spark.speculation=true \
>   --conf spark.shuffle.service.enabled=true \
>   --conf spark.dynamicAllocation.enabled=true \
>   --conf spark.dynamicAllocation.executorIdleTimeout=5s \
>   --conf spark.dynamicAllocation.initialExecutor=1 \
>   --conf spark.dynamicAllocation.maxExecutors=40
> {code}
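
As a sanity check on the reproduction, the sketch below restates the logic of `test.py` as a function and runs it on a small in-memory input. It is only an illustration under stated assumptions: the names `transform_line` and `run` and the sample rows are hypothetical and not part of the report, and it is written for Python 3. It shows that the script writes exactly one output line per input line, which suggests the lower count is unlikely to originate in the script itself.

```python
import io


def transform_line(line):
    # Same logic as test.py: strip, split on whitespace, rejoin with tabs.
    return "\t".join(line.strip().split())


def run(stdin, stdout):
    # Writes exactly one output line per input line, including blank ones.
    count = 0
    for line in stdin:
        stdout.write(transform_line(line) + "\n")
        count += 1
    return count


# Hypothetical sample input standing in for the dt column of test_table.
sample = io.StringIO("2021-06-01\n2021-06-02\n\n2021-06-03\n")
out = io.StringIO()
rows_in = run(sample, out)

# Every written line ends with "\n", so the split yields one trailing empty item.
rows_out = len(out.getvalue().split("\n")) - 1
```

Here `rows_in` and `rows_out` are equal, so outside Spark the script preserves the number of lines it reads.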