Github user sujith71955 commented on the issue:
https://github.com/apache/spark/pull/22758
cc @wzhfy
---
-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail:
Github user sujith71955 commented on the issue:
https://github.com/apache/spark/pull/22758
Do we need to handle this scenario? Do we have any PR for handling this
issue?
Github user sujith71955 commented on the issue:
https://github.com/apache/spark/pull/22758
I think this issue should not be in the Improvement category; it should be
a Critical Bug, since it affects normal join query performance. I hope we
address this issue.
"Insert query flow
Github user sujith71955 commented on the issue:
https://github.com/apache/spark/pull/22758
Thanks for the comment, Sean. There are certain areas where I found
inconsistencies; if I get some input from experts, I think I can update
the PR, if we are planning to tackle this
Github user srowen commented on the issue:
https://github.com/apache/spark/pull/22758
I don't know this code well enough to review. I think there is skepticism
from people who know this code about whether this change is correct and
beneficial. If there's doubt, I think it should be
Github user sujith71955 commented on the issue:
https://github.com/apache/spark/pull/22758
cc @srowen
Github user sujith71955 commented on the issue:
https://github.com/apache/spark/pull/22758
@cloud-fan @HyukjinKwon @wangyum Any suggestions on this issue? Because
of this defect we are facing some performance issues in our customer
environment. Requesting you all to please have a
Github user sujith71955 commented on the issue:
https://github.com/apache/spark/pull/22758
@cloud-fan @HyukjinKwon @srowen
As a result of my observations above:
a) I have a doubt: if we expect the stats to estimate the data size from
the files, then why in the
Github user sujith71955 commented on the issue:
https://github.com/apache/spark/pull/22758
@cloud-fan Shall I update this PR based on the second approach? Will that
be fine? I also tested with the second approach, and the use cases
mentioned in this JIRA are working fine.
Github user sujith71955 commented on the issue:
https://github.com/apache/spark/pull/22758
> In order to make this flow consistent, either
> a) we need to record HiveStats for the insert command flow and always
> consider these stats while computing
> OR
> b) As mentioned above in
Github user sujith71955 commented on the issue:
https://github.com/apache/spark/pull/22758
In order to make this flow consistent, either
a) we need to record HiveStats for the insert command flow and always
consider these stats while computing
OR
b) As mentioned above in
Github user sujith71955 commented on the issue:
https://github.com/apache/spark/pull/22758
> I think the cost of getting the stats from `HadoopFileSystem` may be quite
> high.
Then we should always depend on HiveStats to get the statistics, which is
what happens now as well, but
Github user wangyum commented on the issue:
https://github.com/apache/spark/pull/22758
I think the cost of getting the stats from `HadoopFileSystem` may be quite high.
Github user sujith71955 commented on the issue:
https://github.com/apache/spark/pull/22758
@cloud-fan
I can think of one solution: in the DetermineStats flow we can add one
more condition to not update the stats for convertible relations, since we
always get the stats from
Github user sujith71955 commented on the issue:
https://github.com/apache/spark/pull/22758
@cloud-fan Please find my understanding of the flow below; it's a
bit tricky :)
Let's elaborate on this flow; maybe we will get more suggestions.
Step 1: insert command
Github user sujith71955 commented on the issue:
https://github.com/apache/spark/pull/22758
@srowen @cloud-fan @HyukjinKwon @felixcheung
@wangyum I think this PR should also solve the problem mentioned in
SPARK-25403.
Please review and provide any suggestions. Thanks, all.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/22758
Can one of the admins verify this patch?