[jira] [Comment Edited] (SPARK-24578) Reading remote cache block behavior changes and causes timeout issue

2018-06-19 Thread Wenbo Zhao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16517532#comment-16517532 ] Wenbo Zhao edited comment on SPARK-24578 at 6/19/18 8:55 PM: -

[jira] [Commented] (SPARK-24578) Reading remote cache block behavior changes and causes timeout issue

2018-06-19 Thread Wenbo Zhao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16517532#comment-16517532 ] Wenbo Zhao commented on SPARK-24578: woop, [~attilapiros], sorry, I didn't you have

[jira] [Commented] (SPARK-24578) Reading remote cache block behavior changes and causes timeout issue

2018-06-19 Thread Wenbo Zhao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16517465#comment-16517465 ] Wenbo Zhao commented on SPARK-24578: [~attilapiros] if don't mind, I could create a

[jira] [Commented] (SPARK-24578) Reading remote cache block behavior changes and causes timeout issue

2018-06-19 Thread Wenbo Zhao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16517458#comment-16517458 ] Wenbo Zhao commented on SPARK-24578: Hi [~vanzin].  the commit  [https://github.com/

[jira] [Commented] (SPARK-24578) Reading remote cache block behavior changes and causes timeout issue

2018-06-19 Thread Wenbo Zhao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16517441#comment-16517441 ] Wenbo Zhao commented on SPARK-24578: [~irashid] Yes, that is exactly what I saw in o

[jira] [Commented] (SPARK-24578) Reading remote cache block behavior changes and causes timeout issue

2018-06-19 Thread Wenbo Zhao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16517440#comment-16517440 ] Wenbo Zhao commented on SPARK-24578: Hi [~attilapiros], I guess what you suggest is 

[jira] [Commented] (SPARK-24578) Reading remote cache block behavior changes and causes timeout issue

2018-06-19 Thread Wenbo Zhao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16517180#comment-16517180 ] Wenbo Zhao commented on SPARK-24578: After digging more details, this commit  [http

[jira] [Commented] (SPARK-24578) Reading remote cache block behavior changes and causes timeout issue

2018-06-18 Thread Wenbo Zhao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16516621#comment-16516621 ] Wenbo Zhao commented on SPARK-24578: Hi [~irashid], many thanks for clarifying my qu

[jira] [Commented] (SPARK-24578) Reading remote cache block behavior changes and causes timeout issue

2018-06-18 Thread Wenbo Zhao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16516609#comment-16516609 ] Wenbo Zhao commented on SPARK-24578: For now, we could reproduce this issue in compl

[jira] [Updated] (SPARK-24578) Reading remote cache block behavior changes and causes timeout issue

2018-06-18 Thread Wenbo Zhao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24578?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenbo Zhao updated SPARK-24578: --- Description: After Spark 2.3, we observed lots of errors like the following in some of our producti

[jira] [Commented] (SPARK-24578) Reading remote cache block behavior changes and causes timeout issue

2018-06-18 Thread Wenbo Zhao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16516602#comment-16516602 ] Wenbo Zhao commented on SPARK-24578: [~irashid] We didn't touch "spark.maxRemoteBloc

[jira] [Updated] (SPARK-24578) Reading remote cache block behavior changes and causes timeout issue

2018-06-18 Thread Wenbo Zhao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24578?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenbo Zhao updated SPARK-24578: --- Description: After Spark 2.3, we observed lots of errors like the following in some of our producti

[jira] [Commented] (SPARK-24578) Reading remote cache block behavior changes and causes timeout issue

2018-06-18 Thread Wenbo Zhao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16515858#comment-16515858 ] Wenbo Zhao commented on SPARK-24578: An easier reproduciable cluster setting is 10 e

[jira] [Updated] (SPARK-24578) Reading remote cache block behavior changes and causes timeout issue

2018-06-18 Thread Wenbo Zhao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24578?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenbo Zhao updated SPARK-24578: --- Description: After Spark 2.3, we observed lots of errors like the following in some of our producti

[jira] [Updated] (SPARK-24578) Reading remote cache block behavior changes and causes timeout issue

2018-06-18 Thread Wenbo Zhao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24578?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenbo Zhao updated SPARK-24578: --- Description: After Spark 2.3, we observed lots of errors like the following {code:java} 18/06/15 20:

[jira] [Updated] (SPARK-24578) Reading remote cache block behavior changes and causes timeout issue

2018-06-18 Thread Wenbo Zhao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24578?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenbo Zhao updated SPARK-24578: --- Description: After Spark 2.3, we observed lots of errors like the following   {code:java} 18/06/15

[jira] [Created] (SPARK-24578) Reading remote cache block behavior changes and causes timeout issue

2018-06-18 Thread Wenbo Zhao (JIRA)
Wenbo Zhao created SPARK-24578: -- Summary: Reading remote cache block behavior changes and causes timeout issue Key: SPARK-24578 URL: https://issues.apache.org/jira/browse/SPARK-24578 Project: Spark

[jira] [Commented] (SPARK-24373) "df.cache() df.count()" no longer eagerly caches data when the analyzed plans are different after re-analyzing the plans

2018-05-29 Thread Wenbo Zhao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16493569#comment-16493569 ] Wenbo Zhao commented on SPARK-24373: [~mgaido] Thanks. I didn't look the comment car

[jira] [Issue Comment Deleted] (SPARK-24373) "df.cache() df.count()" no longer eagerly caches data when the analyzed plans are different after re-analyzing the plans

2018-05-29 Thread Wenbo Zhao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24373?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenbo Zhao updated SPARK-24373: --- Comment: was deleted (was: Same question as [~icexelloss]. Also, any plan to make your fix into a m

[jira] [Commented] (SPARK-24373) "df.cache() df.count()" no longer eagerly caches data when the analyzed plans are different after re-analyzing the plans

2018-05-29 Thread Wenbo Zhao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16493545#comment-16493545 ] Wenbo Zhao commented on SPARK-24373: Same question as [~icexelloss]. Also, any plan

[jira] [Commented] (SPARK-24373) "df.cache() df.count()" no longer eagerly caches data

2018-05-24 Thread Wenbo Zhao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16489820#comment-16489820 ] Wenbo Zhao commented on SPARK-24373: I guess we should use `planWithBarrier` in the '

[jira] [Commented] (SPARK-24373) "df.cache() df.count()" no longer eagerly caches data

2018-05-24 Thread Wenbo Zhao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16489534#comment-16489534 ] Wenbo Zhao commented on SPARK-24373: It is not apparently to me that they are the sam

[jira] [Comment Edited] (SPARK-24373) "df.cache() df.count()" no longer eagerly caches data

2018-05-24 Thread Wenbo Zhao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16489233#comment-16489233 ] Wenbo Zhao edited comment on SPARK-24373 at 5/24/18 3:37 PM: -

[jira] [Commented] (SPARK-24373) "df.cache() df.count()" no longer eagerly caches data

2018-05-24 Thread Wenbo Zhao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16489233#comment-16489233 ] Wenbo Zhao commented on SPARK-24373: I turned on the log trace of RuleExecutor and fo

[jira] [Updated] (SPARK-24373) Spark Dataset groupby.agg/count doesn't respect cache with UDF

2018-05-23 Thread Wenbo Zhao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24373?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenbo Zhao updated SPARK-24373: --- Description: Here is the code to reproduce in local mode {code:java} scala> val df = sc.range(1, 2).t

[jira] [Updated] (SPARK-24373) Spark Dataset groupby.agg/count doesn't respect cache with UDF

2018-05-23 Thread Wenbo Zhao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24373?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenbo Zhao updated SPARK-24373: --- Description: Here is the code to reproduce in local mode {code:java} scala> val df = sc.range(1, 2).t

[jira] [Updated] (SPARK-24373) Spark Dataset groupby.agg/count doesn't respect cache with UDF

2018-05-23 Thread Wenbo Zhao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24373?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenbo Zhao updated SPARK-24373: --- Summary: Spark Dataset groupby.agg/count doesn't respect cache with UDF (was: Spark Dataset groupby.

[jira] [Created] (SPARK-24373) Spark Dataset groupby.agg/count doesn't respect cache

2018-05-23 Thread Wenbo Zhao (JIRA)
Wenbo Zhao created SPARK-24373: -- Summary: Spark Dataset groupby.agg/count doesn't respect cache Key: SPARK-24373 URL: https://issues.apache.org/jira/browse/SPARK-24373 Project: Spark Issue Type: