[GitHub] spark issue #19763: [SPARK-22537][core] Aggregation of map output statistics...

2017-11-24 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/19763 thanks, merging to master!

[GitHub] spark issue #19763: [SPARK-22537][core] Aggregation of map output statistics...

2017-11-24 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19763 Merged build finished. Test PASSed.

[GitHub] spark issue #19763: [SPARK-22537][core] Aggregation of map output statistics...

2017-11-24 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19763 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/84161/

[GitHub] spark issue #19763: [SPARK-22537][core] Aggregation of map output statistics...

2017-11-24 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19763 **[Test build #84161 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84161/testReport)** for PR 19763 at commit

[GitHub] spark issue #19763: [SPARK-22537][core] Aggregation of map output statistics...

2017-11-24 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19763 **[Test build #84161 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84161/testReport)** for PR 19763 at commit

[GitHub] spark issue #19763: [SPARK-22537][core] Aggregation of map output statistics...

2017-11-24 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/19763 retest this please

[GitHub] spark issue #19763: [SPARK-22537][core] Aggregation of map output statistics...

2017-11-24 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/19763 LGTM

[GitHub] spark issue #19763: [SPARK-22537][core] Aggregation of map output statistics...

2017-11-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19763 Merged build finished. Test PASSed.

[GitHub] spark issue #19763: [SPARK-22537][core] Aggregation of map output statistics...

2017-11-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19763 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/84134/

[GitHub] spark issue #19763: [SPARK-22537][core] Aggregation of map output statistics...

2017-11-23 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19763 **[Test build #84134 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84134/testReport)** for PR 19763 at commit

[GitHub] spark issue #19763: [SPARK-22537][core] Aggregation of map output statistics...

2017-11-23 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19763 **[Test build #84134 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84134/testReport)** for PR 19763 at commit

[GitHub] spark issue #19763: [SPARK-22537][core] Aggregation of map output statistics...

2017-11-23 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/19763 LGTM

[GitHub] spark issue #19763: [SPARK-22537][core] Aggregation of map output statistics...

2017-11-23 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/19763 retest this please

[GitHub] spark issue #19763: [SPARK-22537][core] Aggregation of map output statistics...

2017-11-23 Thread gczsjdy
Github user gczsjdy commented on the issue: https://github.com/apache/spark/pull/19763 @cloud-fan It seems Jenkins hasn't started?

[GitHub] spark issue #19763: [SPARK-22537][core] Aggregation of map output statistics...

2017-11-22 Thread zsxwing
Github user zsxwing commented on the issue: https://github.com/apache/spark/pull/19763
> We can shut down the pool after some certain idle time, but not sure if it's worth the complexity
Yeah, that's just what the cached thread pool does :)
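For readers unfamiliar with the behaviour zsxwing refers to: in the JDK, `Executors.newCachedThreadPool()` builds a `ThreadPoolExecutor` with zero core threads, a `SynchronousQueue`, and a 60-second keep-alive, so idle worker threads exit on their own. The following is only a minimal sketch of that behaviour, not code from the PR. The class name `CachedPoolSketch`, the helper names, and the shortened keep-alive are assumptions made for illustration.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class CachedPoolSketch {
    // Same shape as Executors.newCachedThreadPool(), but with a configurable
    // keep-alive so the idle shutdown can be observed quickly (hypothetical helper).
    static ThreadPoolExecutor newCachedPool(long keepAliveMillis) {
        return new ThreadPoolExecutor(
                0, Integer.MAX_VALUE,
                keepAliveMillis, TimeUnit.MILLISECONDS,
                new SynchronousQueue<Runnable>());
    }

    // Returns {pool size while tasks are running, pool size after the pool went idle}.
    static int[] poolSizeBusyThenIdle(int tasks, long keepAliveMillis) {
        ThreadPoolExecutor pool = newCachedPool(keepAliveMillis);
        try {
            CountDownLatch release = new CountDownLatch(1);
            for (int i = 0; i < tasks; i++) {
                // Each task blocks until released, so every task gets its own thread.
                pool.execute(() -> {
                    try {
                        release.await();
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
            }
            int busy = pool.getPoolSize();
            release.countDown();

            // Poll until the idle workers have timed out, with a generous upper bound.
            long deadline = System.nanoTime()
                    + TimeUnit.MILLISECONDS.toNanos(keepAliveMillis + 5000);
            while (pool.getPoolSize() > 0 && System.nanoTime() < deadline) {
                Thread.sleep(10);
            }
            return new int[] {busy, pool.getPoolSize()};
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new RuntimeException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        int[] sizes = poolSizeBusyThenIdle(3, 50);
        System.out.println("threads while busy: " + sizes[0]);
        System.out.println("threads after idle: " + sizes[1]);
    }
}
```

The point of the sketch is that no explicit idle-shutdown logic is needed: the keep-alive timeout retires the worker threads, and the pool creates new ones when work arrives again.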

[GitHub] spark issue #19763: [SPARK-22537][core] Aggregation of map output statistics...

2017-11-22 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/19763 OK to test

[GitHub] spark issue #19763: [SPARK-22537][core] Aggregation of map output statistics...

2017-11-17 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/19763 cc @zsxwing

[GitHub] spark issue #19763: [SPARK-22537][core] Aggregation of map output statistics...

2017-11-16 Thread gczsjdy
Github user gczsjdy commented on the issue: https://github.com/apache/spark/pull/19763 Actually, the time gap is O(number of mappers * shuffle partitions). In this case, the number of mappers is not very large, while users are more likely to get slowed down when they run on a big data
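To make the O(number of mappers * shuffle partitions) cost concrete, here is a minimal sketch of summing per-partition output sizes across all mappers, first on one thread and then with the partition range split across a small thread pool. It is not the code from the PR. The class `MapStatsSketch`, the `sizes[mapper][partition]` layout, and the method names are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class MapStatsSketch {
    // sizes[m][p] = bytes mapper m wrote for shuffle partition p (hypothetical layout).
    // The nested loop touches every (mapper, partition) pair once: O(mappers * partitions).
    static long[] aggregateSequential(long[][] sizes, int numPartitions) {
        long[] totals = new long[numPartitions];
        for (long[] mapperSizes : sizes) {
            for (int p = 0; p < numPartitions; p++) {
                totals[p] += mapperSizes[p];
            }
        }
        return totals;
    }

    // Same total work, but each thread owns a disjoint slice of partitions,
    // so no two threads ever write to the same element of totals.
    static long[] aggregateParallel(long[][] sizes, int numPartitions, int threads) {
        long[] totals = new long[numPartitions];
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            int chunk = (numPartitions + threads - 1) / threads; // ceiling division
            List<Future<?>> futures = new ArrayList<>();
            for (int start = 0; start < numPartitions; start += chunk) {
                final int from = start;
                final int to = Math.min(start + chunk, numPartitions);
                futures.add(pool.submit(() -> {
                    for (long[] mapperSizes : sizes) {
                        for (int p = from; p < to; p++) {
                            totals[p] += mapperSizes[p];
                        }
                    }
                }));
            }
            for (Future<?> f : futures) {
                f.get(); // wait for completion; also makes the writes visible here
            }
        } catch (Exception e) {
            throw new RuntimeException(e);
        } finally {
            pool.shutdown();
        }
        return totals;
    }

    public static void main(String[] args) {
        long[][] sizes = {{1, 2, 3}, {10, 20, 30}};
        System.out.println(Arrays.toString(aggregateSequential(sizes, 3)));
        System.out.println(Arrays.toString(aggregateParallel(sizes, 3, 2)));
    }
}
```

Splitting by partition rather than by mapper avoids any locking or merging of partial results. Whether the extra threads pay off depends on how large mappers * partitions is, which is the question the reviewers raise in this thread.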

[GitHub] spark issue #19763: [SPARK-22537][core] Aggregation of map output statistics...

2017-11-16 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/19763 That doesn't look like a significant difference.

[GitHub] spark issue #19763: [SPARK-22537][core] Aggregation of map output statistics...

2017-11-16 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/19763 Seems like it's not a big deal for the end-to-end performance?

[GitHub] spark issue #19763: [SPARK-22537][core] Aggregation of map output statistics...

2017-11-16 Thread gczsjdy
Github user gczsjdy commented on the issue: https://github.com/apache/spark/pull/19763 This happens a lot in our TPC-DS 100TB test. We have an Intel Xeon CPU E5-2699 v4 @2.2GHz CPU as master, which will influence the driver's performance. And we set `spark.sql.shuffle.partitions` to

[GitHub] spark issue #19763: [SPARK-22537][core] Aggregation of map output statistics...

2017-11-15 Thread CodingCat
Github user CodingCat commented on the issue: https://github.com/apache/spark/pull/19763 my question is "how many times have we seen this operation of collecting statistics be the bottleneck?"