Github user cloud-fan commented on the issue:
https://github.com/apache/spark/pull/20023
thanks, merging to master/2.3!
---
-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands,
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/20023
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/86330/
Test PASSed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/20023
Merged build finished. Test PASSed.
---
-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/20023
**[Test build #86330 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86330/testReport)**
for PR 20023 at commit
Github user cloud-fan commented on the issue:
https://github.com/apache/spark/pull/20023
LGTM, pending jenkins
---
-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail:
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/20023
**[Test build #86330 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86330/testReport)**
for PR 20023 at commit
Github user gatorsmile commented on the issue:
https://github.com/apache/spark/pull/20023
LGTM except a few comments about the doc
---
-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/20023
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/86271/
Test FAILed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/20023
Merged build finished. Test FAILed.
---
-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/20023
**[Test build #86271 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86271/testReport)**
for PR 20023 at commit
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/20023
Merged build finished. Test PASSed.
---
-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/20023
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/86277/
Test PASSed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/20023
**[Test build #86277 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86277/testReport)**
for PR 20023 at commit
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/20023
Merged build finished. Test PASSed.
---
-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/20023
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/86276/
Test PASSed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/20023
**[Test build #86276 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86276/testReport)**
for PR 20023 at commit
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/20023
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/86272/
Test PASSed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/20023
Merged build finished. Test PASSed.
---
-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/20023
**[Test build #86272 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86272/testReport)**
for PR 20023 at commit
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/20023
**[Test build #86277 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86277/testReport)**
for PR 20023 at commit
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/20023
**[Test build #86276 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86276/testReport)**
for PR 20023 at commit
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/20023
**[Test build #86272 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86272/testReport)**
for PR 20023 at commit
Github user cloud-fan commented on the issue:
https://github.com/apache/spark/pull/20023
LGTM.
One thing we can improve is the golden file test framework. I found we
sometimes repeat the test cases with a config on and off. We should write the
test cases once and list the
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/20023
**[Test build #86271 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86271/testReport)**
for PR 20023 at commit
Github user mgaido91 commented on the issue:
https://github.com/apache/spark/pull/20023
the test failure is unrelated. A proof is that I just updated the migration
section in the last commit and the previous one was passing all the tests.
@cloud-fan @gatorsmile any more
Github user mgaido91 commented on the issue:
https://github.com/apache/spark/pull/20023
Jenkins, retest this please
---
-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail:
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/20023
Merged build finished. Test FAILed.
---
-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/20023
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/86260/
Test FAILed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/20023
**[Test build #86260 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86260/testReport)**
for PR 20023 at commit
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/20023
**[Test build #86260 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86260/testReport)**
for PR 20023 at commit
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/20023
Merged build finished. Test PASSed.
---
-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/20023
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/86182/
Test PASSed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/20023
**[Test build #86182 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86182/testReport)**
for PR 20023 at commit
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/20023
**[Test build #86182 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86182/testReport)**
for PR 20023 at commit
Github user mgaido91 commented on the issue:
https://github.com/apache/spark/pull/20023
@gatorsmile in the migration guide should I put this change in the section
Spark 2.2 -> Spark 2.3? Will this change be in Spark 2.3?
---
Github user gatorsmile commented on the issue:
https://github.com/apache/spark/pull/20023
Update the PR description to explain the related difference between Hive
and MS SQL Server and also document the solution this PR uses.
---
Github user gatorsmile commented on the issue:
https://github.com/apache/spark/pull/20023
Since this introduces a behavior change, please update the [migration guide
of Spark
SQL](https://spark.apache.org/docs/latest/sql-programming-guide.html#migration-guide)
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/20023
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/86138/
Test PASSed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/20023
Merged build finished. Test PASSed.
---
-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/20023
**[Test build #86138 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86138/testReport)**
for PR 20023 at commit
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/20023
**[Test build #86138 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86138/testReport)**
for PR 20023 at commit
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/20023
Merged build finished. Test FAILed.
---
-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/20023
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/86137/
Test FAILed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/20023
**[Test build #86137 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86137/testReport)**
for PR 20023 at commit
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/20023
**[Test build #86137 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86137/testReport)**
for PR 20023 at commit
Github user mgaido91 commented on the issue:
https://github.com/apache/spark/pull/20023
sorry @gatorsmile, but I can't follow you. Why do you say it is different
from the doc? I can't see any difference, sorry. May you explain me please?
Thanks.
---
Github user gatorsmile commented on the issue:
https://github.com/apache/spark/pull/20023
@mgaido91 At least it is different from the doc
https://docs.microsoft.com/en-us/sql/t-sql/data-types/precision-scale-and-length-transact-sql
---
Github user mgaido91 commented on the issue:
https://github.com/apache/spark/pull/20023
@gatorsmile as far as I know, Hive implementation is consistent with MS SQL
Server 2006 and the implementation is taken by the description of its behavior
in MSDN blogs. DB2 implementation I think
Github user gatorsmile commented on the issue:
https://github.com/apache/spark/pull/20023
The proposal of this PR is to follow Hive instead of MS SQL Server and DB2.
I read the comments in HIVE-15331. Nobody pointed out why Hive decided to
introduce new behaviors which are actually
Github user mgaido91 commented on the issue:
https://github.com/apache/spark/pull/20023
any more comments @gatorsmile @cloud-fan @dongjoon-hyun @viirya @hvanhovell
?
---
-
To unsubscribe, e-mail:
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/20023
Merged build finished. Test PASSed.
---
-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/20023
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/85914/
Test PASSed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/20023
**[Test build #85914 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85914/testReport)**
for PR 20023 at commit
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/20023
**[Test build #85914 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85914/testReport)**
for PR 20023 at commit
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/20023
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/85863/
Test FAILed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/20023
Build finished. Test FAILed.
---
-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands,
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/20023
**[Test build #85863 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85863/testReport)**
for PR 20023 at commit
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/20023
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/85864/
Test FAILed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/20023
Merged build finished. Test FAILed.
---
-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/20023
**[Test build #85864 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85864/testReport)**
for PR 20023 at commit
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/20023
**[Test build #85864 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85864/testReport)**
for PR 20023 at commit
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/20023
**[Test build #85863 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85863/testReport)**
for PR 20023 at commit
Github user mgaido91 commented on the issue:
https://github.com/apache/spark/pull/20023
thanks @gatorsmile, do you have any suggestion for the conf name?
---
-
To unsubscribe, e-mail:
Github user gatorsmile commented on the issue:
https://github.com/apache/spark/pull/20023
Sounds OK to me. Add a conf for this behavior change and also document the
behavior change in migration section?
---
-
To
Github user cloud-fan commented on the issue:
https://github.com/apache/spark/pull/20023
hmmm, even we can add many modes like hive mode, postgres mode, etc. we
should still make the default/native mode reasonable. It seems more reasonable
to truncate the values instead of returning
Github user mgaido91 commented on the issue:
https://github.com/apache/spark/pull/20023
@cloud-fan thanks for your feedback. Honestly I don't have a string opinion
on how we should behave in that case. That's why I wanted some feedbacks from
the community on that point, while this PR
Github user cloud-fan commented on the issue:
https://github.com/apache/spark/pull/20023
I think for big data, throwing exception during runtime is pretty bad. I
think it's ok to be incompatible with SQL standard for some cases, and return
null.
Personally I feel it's ok to
Github user mgaido91 commented on the issue:
https://github.com/apache/spark/pull/20023
@gatorsmile created #20084 for adding tests. Thanks.
---
-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For
Github user gatorsmile commented on the issue:
https://github.com/apache/spark/pull/20023
The problem found by this PR is just one of the issues that returns NULL
when SPARK SQL is unable to process it. Below is another example. I believe we
can find more.
```SQL
SELECT
Github user mgaido91 commented on the issue:
https://github.com/apache/spark/pull/20023
Thanks @gatorsmile. Then should I create a follow up PR for #20008 in order
to cover the cases 2 and 3 before going on with this PR or can we go on with
this PR and the test cases added in this
Github user gatorsmile commented on the issue:
https://github.com/apache/spark/pull/20023
Following ANSI SQL compliance sounds good to me. However, many details are
vendor-specific. That means, the query results still varies even if we can be
100% ANSI SQL compliant.
To
Github user mgaido91 commented on the issue:
https://github.com/apache/spark/pull/20023
Thanks for your analysis @gatorsmile. Actually the rule you specified for
Oracle is what it uses when casting, rather then when doing arithmetic
operations. Yes DB2 has rather different rules to
Github user gatorsmile commented on the issue:
https://github.com/apache/spark/pull/20023
Thanks for your careful summary!
We do not have a SQLCA. Thus, it is hard for us to send a warning message
back like [what DB2
Github user mgaido91 commented on the issue:
https://github.com/apache/spark/pull/20023
@gatorsmile that is scenario 3. I will explain you why and after I will do
and errata corrige of the summary I did in my last e-mail, because I made a
mistake about how DB2 computes the result
Github user gatorsmile commented on the issue:
https://github.com/apache/spark/pull/20023
```
db2 => create table decimals_test(id int, a decimal(31,18), b
decimal(31,18))
DB2I The SQL command completed successfully.
db2 => insert into decimals_test values (1, 2.33,
Github user mgaido91 commented on the issue:
https://github.com/apache/spark/pull/20023
@gatorsmile I answered to your comments about DB2 in the e-mail.
@cloud-fan that would help, but not solve the problem. It would just make
the problem being generated by bigger numbers.
Github user cloud-fan commented on the issue:
https://github.com/apache/spark/pull/20023
it looks like other databases are very careful about precision lose.
However, following DB2 and throwing exception is pretty bad for big data
applications, but returning null is also bad as it
Github user gatorsmile commented on the issue:
https://github.com/apache/spark/pull/20023
Regarding the rules for deciding precision and scales, DB2 z/OS also has
its own rules:
Github user gatorsmile commented on the issue:
https://github.com/apache/spark/pull/20023
In DB2, `a * b` will just stop with an error message: SQL0802N Arithmetic
overflow or other arithmetic exception occurred.
Thus, it might be more straightforward for us to follow what
Github user mgaido91 commented on the issue:
https://github.com/apache/spark/pull/20023
@gatorsmile, please refer to the [e-mail to the dev mail
Github user gatorsmile commented on the issue:
https://github.com/apache/spark/pull/20023
Thanks for your efforts! This change needs more careful review and
investigation.
Could you post the outputs of Oracle, DB2, SQL server and Hive? Are their
results consistent?
---
Github user viirya commented on the issue:
https://github.com/apache/spark/pull/20023
I think that we should be careful on suddenly changing behavior.
---
-
To unsubscribe, e-mail:
Github user dongjoon-hyun commented on the issue:
https://github.com/apache/spark/pull/20023
I thought adding more cases into `decimals.sql`. :) But, for now, never
mind about that~
---
-
To unsubscribe, e-mail:
Github user mgaido91 commented on the issue:
https://github.com/apache/spark/pull/20023
Thank you for your review @dongjoon-hyun. I think what we can do is add
more test to the whitelist in `HiveCompatibilitySuite`, updating them according
to HIVE-15331. Were you thinking to this or
Github user mgaido91 commented on the issue:
https://github.com/apache/spark/pull/20023
@hvanhovell, as far as 1 is regarded, I was referring to [this
comment](https://github.com/apache/spark/pull/19449#pullrequestreview-67789784)
and [this
Github user dongjoon-hyun commented on the issue:
https://github.com/apache/spark/pull/20023
Thank you for pining me, @mgaido91 . The approach of PR looks good to me.
BTW, do we need to borrow more Hive 2.2 test cases?
---
Github user hvanhovell commented on the issue:
https://github.com/apache/spark/pull/20023
I don't fully agree :)...
1. You can use `SQLConf.get` for this. Or you can wire up the rules using
the `SessionStateBuilder`.
2. I am reluctant to change this for a minor version. I
Github user mgaido91 commented on the issue:
https://github.com/apache/spark/pull/20023
thanks for looking at this @hvanhovell. The reasons why I didn't introduce
a configuration variable for this behavior are:
1. As far as I know, currently there is no way to read reliably
Github user hvanhovell commented on the issue:
https://github.com/apache/spark/pull/20023
In am generally in favor of following the SQL standard. How about we do
this. Let's make the standard behavior the default, and add a flag to revert to
the old behavior. This allows us to ease
Github user mgaido91 commented on the issue:
https://github.com/apache/spark/pull/20023
@cloud-fan yes, Hive changed and most important at the moment we are not
compliant with SQL standard. So currently Spark is returning results which are
different from Hive and not compliant with
Github user cloud-fan commented on the issue:
https://github.com/apache/spark/pull/20023
Ideally we should not change behaviors as possible as we can, but since
this behavior is from Hive and Hive also changed it, might be OK to follow Hive
and also change it? cc @hvanhovell too
Github user mgaido91 commented on the issue:
https://github.com/apache/spark/pull/20023
@cloud-fan @dongjoon-hyun @gatorsmile @rxin @viirya I saw you worked on
this files. Maybe you can help reviewing the PR. For further details about the
reasons of this PR, please refer to the
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/20023
Merged build finished. Test PASSed.
---
-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/20023
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/85116/
Test PASSed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/20023
**[Test build #85116 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85116/testReport)**
for PR 20023 at commit
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/20023
**[Test build #85116 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85116/testReport)**
for PR 20023 at commit
96 matches
Mail list logo