Github user LuciferYang commented on the issue:
https://github.com/apache/spark/pull/23214
As @cloud-fan said `the hash join metrics is wrongly implemented`, we will
partial revert # SPARK-21052, no longer need this patch, close it ~
---
Github user LuciferYang commented on the issue:
https://github.com/apache/spark/pull/23214
On the other hand, if is only a `multi-thread problem`, may not affect
performance because there is no synchronized code part ...
---
Github user cloud-fan commented on the issue:
https://github.com/apache/spark/pull/23214
I think there is a problem, but no one found out because it's only about
metrics.
---
-
To unsubscribe, e-mail:
Github user LuciferYang commented on the issue:
https://github.com/apache/spark/pull/23214
```
For broadcast hash join, we will copy the broadcasted hash relation to
avoid multi-thread problem, via HashedRelation.asReadOnlyCopy. However, this is
a shallow copy, the
Github user viirya commented on the issue:
https://github.com/apache/spark/pull/23214
Thanks for doing this. I think we are more close to the root cause.
---
-
To unsubscribe, e-mail:
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/23214
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/99651/
Test FAILed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/23214
Merged build finished. Test FAILed.
---
-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/23214
**[Test build #99651 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99651/testReport)**
for PR 23214 at commit
Github user cloud-fan commented on the issue:
https://github.com/apache/spark/pull/23214
It's easy to track `numKeyLookups` at `HashedRelation`, but it's hard to
track `numProbes`. One idea is, we pass a `MutableInt` to
`LongToUnsafeRowMap.getValue` as a parameter, and in the method
Github user cloud-fan commented on the issue:
https://github.com/apache/spark/pull/23214
I might know the root cause: `LongToUnsafeRowMap` is acutally accessed by
multiple threads.
For broadcast hash join, we will copy the broadcasted hash relation to
avoid multi-thread
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/23214
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/23214
Merged build finished. Test PASSed.
---
-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/23214
**[Test build #99651 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99651/testReport)**
for PR 23214 at commit
Github user cloud-fan commented on the issue:
https://github.com/apache/spark/pull/23214
ok to test
---
-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail:
Github user LuciferYang commented on the issue:
https://github.com/apache/spark/pull/23214
@adrian-wang ok~ I will add some comments to explain the reason
---
-
To unsubscribe, e-mail:
Github user LuciferYang commented on the issue:
https://github.com/apache/spark/pull/23214
@JkSelf thx~
---
-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail:
Github user JkSelf commented on the issue:
https://github.com/apache/spark/pull/23214
@LuciferYang the patch is fine in my test environment.
@adrian-wang I will run all the tpcds queries in spark2.3 and spark2.3
with this patch later.
---
Github user adrian-wang commented on the issue:
https://github.com/apache/spark/pull/23214
maybe add some detailed test result in description and explain the reason
for this in code comment?
---
-
To unsubscribe,
Github user LuciferYang commented on the issue:
https://github.com/apache/spark/pull/23214
ping @viirya
---
-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail:
Github user LuciferYang commented on the issue:
https://github.com/apache/spark/pull/23214
cc @cloud-fan , help to review this patch?
---
-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/23214
Can one of the admins verify this patch?
---
-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional
Github user LuciferYang commented on the issue:
https://github.com/apache/spark/pull/23214
cc @JkSelf help to check this patch.
---
-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/23214
Can one of the admins verify this patch?
---
-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/23214
Can one of the admins verify this patch?
---
-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional
24 matches
Mail list logo