Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19560
Merged build finished. Test PASSed.
---
-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19560
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/91092/
Test PASSed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/19560
**[Test build #91092 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/91092/testReport)**
for PR 19560 at commit
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/19560
**[Test build #91092 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/91092/testReport)**
for PR 19560 at commit
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19560
Merged build finished. Test PASSed.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19560
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/88775/
Test PASSed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/19560
**[Test build #88775 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/88775/testReport)**
for PR 19560 at commit
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/19560
**[Test build #88775 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/88775/testReport)**
for PR 19560 at commit
Github user jinxing64 commented on the issue:
https://github.com/apache/spark/pull/19560
@wangyum
Makes sense.
You can also try the approach in this PR.
If there are many (tens of thousands of) ETLs in the warehouse, we cannot
afford to give that many hints or fix all the
Github user wangyum commented on the issue:
https://github.com/apache/spark/pull/19560
I also hit this issue:
```sql
select * from A join B on a.key = b.key
```
Table A is small, but table B is big and table B's stats are incorrect, so
it will broadcast table B.
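
To make the failure mode above concrete, here is a minimal sketch of a size-threshold check, assuming the broadcast decision compares a table's estimated size against `spark.sql.autoBroadcastJoinThreshold` (10 MB by default). The helper `can_broadcast` and the table sizes are hypothetical and purely illustrative; this is not Spark's planner code.

```python
# Default value of spark.sql.autoBroadcastJoinThreshold: 10 MB.
AUTO_BROADCAST_JOIN_THRESHOLD = 10 * 1024 * 1024


def can_broadcast(size_in_bytes, threshold=AUTO_BROADCAST_JOIN_THRESHOLD):
    """Hypothetical helper: a join side qualifies for broadcast
    when its estimated size fits within the threshold."""
    return 0 <= size_in_bytes <= threshold


# Illustrative numbers only (assumptions, not taken from the thread).
recorded_size_b = 1 * 1024 * 1024   # stale size recorded in the metastore for B
actual_size_b = 50 * 1024 ** 3      # what B actually occupies on the file system

# Decision based on the stale stats: B looks small enough to broadcast.
print(can_broadcast(recorded_size_b))
# Decision based on the real size: B is far too large to broadcast.
print(can_broadcast(actual_size_b))
```

With stale stats the check passes even though the real table is several orders of magnitude over the threshold, which is how a large table can end up being broadcast.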
Github user gatorsmile commented on the issue:
https://github.com/apache/spark/pull/19560
I can see the value and also the potential extra overhead (more expensive
for object stores), although this does not resolve the root cause.
Before we provide adaptive runtime
Github user jinxing64 commented on the issue:
https://github.com/apache/spark/pull/19560
>My main concern is that we'd better not put the burden on Spark to deal with
metastore failures
I think this makes sense. I was also thinking about this when proposing this
pr. I do agree
Github user wzhfy commented on the issue:
https://github.com/apache/spark/pull/19560
My main concern is that we'd better not put the burden on Spark to deal with
metastore failures, because Spark doesn't have control over metastores. The
system using Spark and metastore should be
Github user wzhfy commented on the issue:
https://github.com/apache/spark/pull/19560
> Users usually do not know there is an error in the stats.
Aren't there any exceptions or error messages when updating table/stats
fails? I suppose the system is able to know it through logging or
Github user jinxing64 commented on the issue:
https://github.com/apache/spark/pull/19560
@wzhfy
Thanks for the comment;
I see your point.
In my cluster, the namenode is under heavy pressure, so errors in stats are
quite likely. Users usually do not know there is an error in the stats.
Github user wzhfy commented on the issue:
https://github.com/apache/spark/pull/19560
I wonder when this config should be used. If the user knows there is some error
in the stats, why not just analyze the table (specify "noscan" if only size is
needed)? This can fix the problem instead of
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19560
Merged build finished. Test PASSed.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19560
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/83008/
Test PASSed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/19560
**[Test build #83008 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83008/testReport)**
for PR 19560 at commit
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/19560
**[Test build #83008 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83008/testReport)**
for PR 19560 at commit
Github user jinxing64 commented on the issue:
https://github.com/apache/spark/pull/19560
@viirya
Thanks a lot for the comments.
1. In the current change, I verify the stats against the file system only when
the relation is under a join.
2. I added a warning when the size from the file system
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19560
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/83002/
Test PASSed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19560
Merged build finished. Test PASSed.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/19560
**[Test build #83002 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83002/testReport)**
for PR 19560 at commit
Github user jinxing64 commented on the issue:
https://github.com/apache/spark/pull/19560
@gatorsmile @dongjoon-hyun
Thanks a lot for looking into this.
This PR aims to avoid OOM if the metastore fails to update table properties
after the data has already been produced. With the