date:20190714

[GitHub] [spark] felixcheung commented on issue #24939: [SPARK-18569][ML][R] Support RFormula arithmetic, I() and spark functions

2019-07-14 Thread GitBox

felixcheung commented on issue #24939: [SPARK-18569][ML][R] Support RFormula 
arithmetic, I() and spark functions
URL: https://github.com/apache/spark/pull/24939#issuecomment-511281967
 
 
   can we get more test coverage for this change? 
https://github.com/apache/spark/pull/24939/commits/88be1dcc67283c90922512477062805f63105703
   seems a bit more involved


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] [spark] AmplabJenkins removed a comment on issue #25155: [SPARK-28392][SQL][TESTS] Add traits for UDF and PostgreSQL tests to share initialization

2019-07-14 Thread GitBox

AmplabJenkins removed a comment on issue #25155: [SPARK-28392][SQL][TESTS] Add 
traits for UDF and PostgreSQL tests to share initialization
URL: https://github.com/apache/spark/pull/25155#issuecomment-511280685
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/107655/
   Test PASSed.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] [spark] AmplabJenkins removed a comment on issue #25155: [SPARK-28392][SQL][TESTS] Add traits for UDF and PostgreSQL tests to share initialization

2019-07-14 Thread GitBox

AmplabJenkins removed a comment on issue #25155: [SPARK-28392][SQL][TESTS] Add 
traits for UDF and PostgreSQL tests to share initialization
URL: https://github.com/apache/spark/pull/25155#issuecomment-511280682
 
 
   Merged build finished. Test PASSed.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] [spark] AmplabJenkins commented on issue #25155: [SPARK-28392][SQL][TESTS] Add traits for UDF and PostgreSQL tests to share initialization

2019-07-14 Thread GitBox

AmplabJenkins commented on issue #25155: [SPARK-28392][SQL][TESTS] Add traits 
for UDF and PostgreSQL tests to share initialization
URL: https://github.com/apache/spark/pull/25155#issuecomment-511280682
 
 
   Merged build finished. Test PASSed.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] [spark] AmplabJenkins commented on issue #25155: [SPARK-28392][SQL][TESTS] Add traits for UDF and PostgreSQL tests to share initialization

2019-07-14 Thread GitBox

AmplabJenkins commented on issue #25155: [SPARK-28392][SQL][TESTS] Add traits 
for UDF and PostgreSQL tests to share initialization
URL: https://github.com/apache/spark/pull/25155#issuecomment-511280685
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/107655/
   Test PASSed.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] [spark] SparkQA commented on issue #25155: [SPARK-28392][SQL][TESTS] Add traits for UDF and PostgreSQL tests to share initialization

2019-07-14 Thread GitBox

SparkQA commented on issue #25155: [SPARK-28392][SQL][TESTS] Add traits for UDF 
and PostgreSQL tests to share initialization
URL: https://github.com/apache/spark/pull/25155#issuecomment-511280421
 
 
   **[Test build #107655 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/107655/testReport)**
 for PR 25155 at commit 
[`feea2d9`](https://github.com/apache/spark/commit/feea2d9b19f0948adaed8b9705cc28b366e3f1d5).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] [spark] felixcheung commented on issue #24991: [SPARK-28188] Materialize Dataframe API

2019-07-14 Thread GitBox

felixcheung commented on issue #24991: [SPARK-28188] Materialize Dataframe API
URL: https://github.com/apache/spark/pull/24991#issuecomment-511280363
 
 
   > @rxin, this runs the query up to the point where `materialize` is called. 
The underlying RDD can then pick up from the last shuffle the next time it is 
used. This works better than caching in most cases when using dynamic 
allocation because executors are not sitting idle, but work can be resumed and 
shared across queries. We could rename the method if that would be more clear.
   > 
   > @srowen, I've seen this suggested on the dev list a few times and I think 
it is a good idea to add it. There is not guarantee that `count` does the same 
thing -- it could be optimized -- and it is a little tricky to get this to work 
with the dataset API. This version creates a new DataFrame from the underlying 
RDD so that the work is reused from the last shuffle, instead of allowing the 
planner to re-optimize with later changes (usually projections) and discard the 
intermediate result. We have found this really useful for better control over 
the planner, as well as to cache data using the shuffle system.
   
   I have to agree with this - I've seen `count()` or `cache()` mis-used too 
many times and too many times people need to go back to clean up and remove all 
calls to `count()`. So much so I'm planning to write a optimizer rule to remove 
them. I'm only partly kidding.
   
   Maybe this isn't the API for it, and that's ok, let's improve it then and 
make good suggestion to the community/contributor etc.
   
   I'm not sure `df.write.format("noop").save` is a good suggestion to general 
spark user.
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] [spark] SparkQA removed a comment on issue #25155: [SPARK-28392][SQL][TESTS] Add traits for UDF and PostgreSQL tests to share initialization

2019-07-14 Thread GitBox

SparkQA removed a comment on issue #25155: [SPARK-28392][SQL][TESTS] Add traits 
for UDF and PostgreSQL tests to share initialization
URL: https://github.com/apache/spark/pull/25155#issuecomment-511259721
 
 
   **[Test build #107655 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/107655/testReport)**
 for PR 25155 at commit 
[`feea2d9`](https://github.com/apache/spark/commit/feea2d9b19f0948adaed8b9705cc28b366e3f1d5).


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] [spark] felixcheung edited a comment on issue #24991: [SPARK-28188] Materialize Dataframe API

2019-07-14 Thread GitBox

felixcheung edited a comment on issue #24991: [SPARK-28188] Materialize 
Dataframe API
URL: https://github.com/apache/spark/pull/24991#issuecomment-511280363
 
 
   > @rxin, this runs the query up to the point where `materialize` is called. 
The underlying RDD can then pick up from the last shuffle the next time it is 
used. This works better than caching in most cases when using dynamic 
allocation because executors are not sitting idle, but work can be resumed and 
shared across queries. We could rename the method if that would be more clear.
   > 
   > @srowen, I've seen this suggested on the dev list a few times and I think 
it is a good idea to add it. There is not guarantee that `count` does the same 
thing -- it could be optimized -- and it is a little tricky to get this to work 
with the dataset API. This version creates a new DataFrame from the underlying 
RDD so that the work is reused from the last shuffle, instead of allowing the 
planner to re-optimize with later changes (usually projections) and discard the 
intermediate result. We have found this really useful for better control over 
the planner, as well as to cache data using the shuffle system.
   
   I have to agree with this - I've seen `count()` or `cache()` mis-used too 
many times and too many times people need to go back to clean up and remove all 
calls to `count()`. So much so I'm planning to write an optimizer rule to 
remove them. I'm only partly kidding.
   
   Maybe this isn't the API for it, and that's ok, let's improve it then and 
make good suggestion to the community/contributor etc.
   
   I'm not sure `df.write.format("noop").save` is a good suggestion to general 
spark user.
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] [spark] AmplabJenkins removed a comment on issue #25111: [SPARK-28346][SQL] clone the query plan between analyzer, optimizer and planner

2019-07-14 Thread GitBox

AmplabJenkins removed a comment on issue #25111: [SPARK-28346][SQL] clone the 
query plan between analyzer, optimizer and planner
URL: https://github.com/apache/spark/pull/25111#issuecomment-511278971
 
 
   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/107657/
   Test FAILed.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] [spark] AmplabJenkins commented on issue #25111: [SPARK-28346][SQL] clone the query plan between analyzer, optimizer and planner

2019-07-14 Thread GitBox

AmplabJenkins commented on issue #25111: [SPARK-28346][SQL] clone the query 
plan between analyzer, optimizer and planner
URL: https://github.com/apache/spark/pull/25111#issuecomment-511278971
 
 
   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/107657/
   Test FAILed.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] [spark] SparkQA removed a comment on issue #25111: [SPARK-28346][SQL] clone the query plan between analyzer, optimizer and planner

2019-07-14 Thread GitBox

SparkQA removed a comment on issue #25111: [SPARK-28346][SQL] clone the query 
plan between analyzer, optimizer and planner
URL: https://github.com/apache/spark/pull/25111#issuecomment-511262826
 
 
   **[Test build #107657 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/107657/testReport)**
 for PR 25111 at commit 
[`f614cc6`](https://github.com/apache/spark/commit/f614cc6cd4fe16e493d912493884883f34a77f30).


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] [spark] AmplabJenkins removed a comment on issue #25111: [SPARK-28346][SQL] clone the query plan between analyzer, optimizer and planner

2019-07-14 Thread GitBox

AmplabJenkins removed a comment on issue #25111: [SPARK-28346][SQL] clone the 
query plan between analyzer, optimizer and planner
URL: https://github.com/apache/spark/pull/25111#issuecomment-511278968
 
 
   Merged build finished. Test FAILed.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] [spark] AmplabJenkins commented on issue #25111: [SPARK-28346][SQL] clone the query plan between analyzer, optimizer and planner

2019-07-14 Thread GitBox

AmplabJenkins commented on issue #25111: [SPARK-28346][SQL] clone the query 
plan between analyzer, optimizer and planner
URL: https://github.com/apache/spark/pull/25111#issuecomment-511278968
 
 
   Merged build finished. Test FAILed.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] [spark] SparkQA commented on issue #25111: [SPARK-28346][SQL] clone the query plan between analyzer, optimizer and planner

2019-07-14 Thread GitBox

SparkQA commented on issue #25111: [SPARK-28346][SQL] clone the query plan 
between analyzer, optimizer and planner
URL: https://github.com/apache/spark/pull/25111#issuecomment-511278862
 
 
   **[Test build #107657 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/107657/testReport)**
 for PR 25111 at commit 
[`f614cc6`](https://github.com/apache/spark/commit/f614cc6cd4fe16e493d912493884883f34a77f30).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] [spark] AmplabJenkins removed a comment on issue #25133: [SPARK-28365][ML] Fallback locale to en_US in StopWordsRemover if system default locale isn't in available locales in JVM

2019-07-14 Thread GitBox

AmplabJenkins removed a comment on issue #25133: [SPARK-28365][ML] Fallback 
locale to en_US in StopWordsRemover if system default locale isn't in available 
locales in JVM
URL: https://github.com/apache/spark/pull/25133#issuecomment-511278744
 
 
   Merged build finished. Test PASSed.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] [spark] AmplabJenkins commented on issue #25133: [SPARK-28365][ML] Fallback locale to en_US in StopWordsRemover if system default locale isn't in available locales in JVM

2019-07-14 Thread GitBox

AmplabJenkins commented on issue #25133: [SPARK-28365][ML] Fallback locale to 
en_US in StopWordsRemover if system default locale isn't in available locales 
in JVM
URL: https://github.com/apache/spark/pull/25133#issuecomment-511278746
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/107660/
   Test PASSed.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] [spark] AmplabJenkins removed a comment on issue #25133: [SPARK-28365][ML] Fallback locale to en_US in StopWordsRemover if system default locale isn't in available locales in JVM

2019-07-14 Thread GitBox

AmplabJenkins removed a comment on issue #25133: [SPARK-28365][ML] Fallback 
locale to en_US in StopWordsRemover if system default locale isn't in available 
locales in JVM
URL: https://github.com/apache/spark/pull/25133#issuecomment-511278746
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/107660/
   Test PASSed.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] [spark] AmplabJenkins commented on issue #25133: [SPARK-28365][ML] Fallback locale to en_US in StopWordsRemover if system default locale isn't in available locales in JVM

2019-07-14 Thread GitBox

AmplabJenkins commented on issue #25133: [SPARK-28365][ML] Fallback locale to 
en_US in StopWordsRemover if system default locale isn't in available locales 
in JVM
URL: https://github.com/apache/spark/pull/25133#issuecomment-511278744
 
 
   Merged build finished. Test PASSed.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] [spark] SparkQA removed a comment on issue #25133: [SPARK-28365][ML] Fallback locale to en_US in StopWordsRemover if system default locale isn't in available locales in JVM

2019-07-14 Thread GitBox

SparkQA removed a comment on issue #25133: [SPARK-28365][ML] Fallback locale to 
en_US in StopWordsRemover if system default locale isn't in available locales 
in JVM
URL: https://github.com/apache/spark/pull/25133#issuecomment-511269919
 
 
   **[Test build #107660 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/107660/testReport)**
 for PR 25133 at commit 
[`f985eba`](https://github.com/apache/spark/commit/f985eba8c006327f934c254fedf91ff3987b7297).


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] [spark] SparkQA commented on issue #25133: [SPARK-28365][ML] Fallback locale to en_US in StopWordsRemover if system default locale isn't in available locales in JVM

2019-07-14 Thread GitBox

SparkQA commented on issue #25133: [SPARK-28365][ML] Fallback locale to en_US 
in StopWordsRemover if system default locale isn't in available locales in JVM
URL: https://github.com/apache/spark/pull/25133#issuecomment-511278616
 
 
   **[Test build #107660 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/107660/testReport)**
 for PR 25133 at commit 
[`f985eba`](https://github.com/apache/spark/commit/f985eba8c006327f934c254fedf91ff3987b7297).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] [spark] sujith71955 commented on issue #24903: [SPARK-28084][SQL] Resolving the partition column name based on the resolver in sql load command

2019-07-14 Thread GitBox

sujith71955 commented on issue #24903: [SPARK-28084][SQL] Resolving the 
partition column name based on the resolver in sql load command 
URL: https://github.com/apache/spark/pull/24903#issuecomment-511277045
 
 
   @dongjoon-hyun  Gentle ping, Let me know for any inputs. thanks all for the 
valuable inputs.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] [spark] AmplabJenkins removed a comment on issue #25047: [WIP][SPARK-27371][CORE] Support GPU-aware resources scheduling in Standalone

2019-07-14 Thread GitBox

AmplabJenkins removed a comment on issue #25047: [WIP][SPARK-27371][CORE] 
Support GPU-aware resources scheduling in Standalone
URL: https://github.com/apache/spark/pull/25047#issuecomment-511276916
 
 
   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/107659/
   Test FAILed.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] [spark] SparkQA removed a comment on issue #25047: [WIP][SPARK-27371][CORE] Support GPU-aware resources scheduling in Standalone

2019-07-14 Thread GitBox

SparkQA removed a comment on issue #25047: [WIP][SPARK-27371][CORE] Support 
GPU-aware resources scheduling in Standalone
URL: https://github.com/apache/spark/pull/25047#issuecomment-511265225
 
 
   **[Test build #107659 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/107659/testReport)**
 for PR 25047 at commit 
[`e19a9b6`](https://github.com/apache/spark/commit/e19a9b670b25ffc00f1a16eef89febfe244a4088).


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] [spark] AmplabJenkins removed a comment on issue #25047: [WIP][SPARK-27371][CORE] Support GPU-aware resources scheduling in Standalone

2019-07-14 Thread GitBox

AmplabJenkins removed a comment on issue #25047: [WIP][SPARK-27371][CORE] 
Support GPU-aware resources scheduling in Standalone
URL: https://github.com/apache/spark/pull/25047#issuecomment-511276913
 
 
   Merged build finished. Test FAILed.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] [spark] AmplabJenkins commented on issue #25047: [WIP][SPARK-27371][CORE] Support GPU-aware resources scheduling in Standalone

2019-07-14 Thread GitBox

AmplabJenkins commented on issue #25047: [WIP][SPARK-27371][CORE] Support 
GPU-aware resources scheduling in Standalone
URL: https://github.com/apache/spark/pull/25047#issuecomment-511276913
 
 
   Merged build finished. Test FAILed.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] [spark] AmplabJenkins commented on issue #25047: [WIP][SPARK-27371][CORE] Support GPU-aware resources scheduling in Standalone

2019-07-14 Thread GitBox

AmplabJenkins commented on issue #25047: [WIP][SPARK-27371][CORE] Support 
GPU-aware resources scheduling in Standalone
URL: https://github.com/apache/spark/pull/25047#issuecomment-511276916
 
 
   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/107659/
   Test FAILed.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] [spark] SparkQA commented on issue #25047: [WIP][SPARK-27371][CORE] Support GPU-aware resources scheduling in Standalone

2019-07-14 Thread GitBox

SparkQA commented on issue #25047: [WIP][SPARK-27371][CORE] Support GPU-aware 
resources scheduling in Standalone
URL: https://github.com/apache/spark/pull/25047#issuecomment-511276873
 
 
   **[Test build #107659 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/107659/testReport)**
 for PR 25047 at commit 
[`e19a9b6`](https://github.com/apache/spark/commit/e19a9b670b25ffc00f1a16eef89febfe244a4088).
* This patch **fails PySpark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] [spark] wangyum commented on a change in pull request #25092: [SPARK-28312][SQL][TEST] Port numeric.sql

2019-07-14 Thread GitBox

wangyum commented on a change in pull request #25092: [SPARK-28312][SQL][TEST] 
Port numeric.sql
URL: https://github.com/apache/spark/pull/25092#discussion_r303289190
 
 

 ##
 File path: sql/core/src/test/resources/sql-tests/inputs/pgSQL/numeric.sql
 ##
 @@ -0,0 +1,1095 @@
+--
+-- Portions Copyright (c) 1996-2019, PostgreSQL Global Development Group
+--
+--
+-- NUMERIC
+-- 
https://github.com/postgres/postgres/blob/REL_12_BETA2/src/test/regress/sql/numeric.sql
+--
+
+CREATE TABLE num_data (id int, val decimal(38,10)) USING parquet;
+CREATE TABLE num_exp_add (id1 int, id2 int, expected decimal(38,10)) USING 
parquet;
+CREATE TABLE num_exp_sub (id1 int, id2 int, expected decimal(38,10)) USING 
parquet;
+CREATE TABLE num_exp_div (id1 int, id2 int, expected decimal(38,10)) USING 
parquet;
+CREATE TABLE num_exp_mul (id1 int, id2 int, expected decimal(38,10)) USING 
parquet;
+CREATE TABLE num_exp_sqrt (id int, expected decimal(38,10)) USING parquet;
+CREATE TABLE num_exp_ln (id int, expected decimal(38,10)) USING parquet;
+CREATE TABLE num_exp_log10 (id int, expected decimal(38,10)) USING parquet;
+CREATE TABLE num_exp_power_10_ln (id int, expected decimal(38,10)) USING 
parquet;
+
+CREATE TABLE num_result (id1 int, id2 int, result decimal(38,10)) USING 
parquet;
+
+
+-- **
+-- * The following EXPECTED results are computed by bc(1)
+-- * with a scale of 200
+-- **
+
+-- BEGIN TRANSACTION;
+INSERT INTO num_exp_add VALUES (0,0,'0');
+INSERT INTO num_exp_sub VALUES (0,0,'0');
+INSERT INTO num_exp_mul VALUES (0,0,'0');
+INSERT INTO num_exp_div VALUES (0,0,'NaN');
+INSERT INTO num_exp_add VALUES (0,1,'0');
+INSERT INTO num_exp_sub VALUES (0,1,'0');
+INSERT INTO num_exp_mul VALUES (0,1,'0');
+INSERT INTO num_exp_div VALUES (0,1,'NaN');
+INSERT INTO num_exp_add VALUES (0,2,'-34338492.215397047');
+INSERT INTO num_exp_sub VALUES (0,2,'34338492.215397047');
+INSERT INTO num_exp_mul VALUES (0,2,'0');
+INSERT INTO num_exp_div VALUES (0,2,'0');
+INSERT INTO num_exp_add VALUES (0,3,'4.31');
+INSERT INTO num_exp_sub VALUES (0,3,'-4.31');
+INSERT INTO num_exp_mul VALUES (0,3,'0');
+INSERT INTO num_exp_div VALUES (0,3,'0');
+INSERT INTO num_exp_add VALUES (0,4,'7799461.4119');
+INSERT INTO num_exp_sub VALUES (0,4,'-7799461.4119');
+INSERT INTO num_exp_mul VALUES (0,4,'0');
+INSERT INTO num_exp_div VALUES (0,4,'0');
+INSERT INTO num_exp_add VALUES (0,5,'16397.038491');
+INSERT INTO num_exp_sub VALUES (0,5,'-16397.038491');
+INSERT INTO num_exp_mul VALUES (0,5,'0');
+INSERT INTO num_exp_div VALUES (0,5,'0');
+INSERT INTO num_exp_add VALUES (0,6,'93901.57763026');
+INSERT INTO num_exp_sub VALUES (0,6,'-93901.57763026');
+INSERT INTO num_exp_mul VALUES (0,6,'0');
+INSERT INTO num_exp_div VALUES (0,6,'0');
+INSERT INTO num_exp_add VALUES (0,7,'-83028485');
+INSERT INTO num_exp_sub VALUES (0,7,'83028485');
+INSERT INTO num_exp_mul VALUES (0,7,'0');
+INSERT INTO num_exp_div VALUES (0,7,'0');
+INSERT INTO num_exp_add VALUES (0,8,'74881');
+INSERT INTO num_exp_sub VALUES (0,8,'-74881');
+INSERT INTO num_exp_mul VALUES (0,8,'0');
+INSERT INTO num_exp_div VALUES (0,8,'0');
+INSERT INTO num_exp_add VALUES (0,9,'-24926804.045047420');
+INSERT INTO num_exp_sub VALUES (0,9,'24926804.045047420');
+INSERT INTO num_exp_mul VALUES (0,9,'0');
+INSERT INTO num_exp_div VALUES (0,9,'0');
+INSERT INTO num_exp_add VALUES (1,0,'0');
+INSERT INTO num_exp_sub VALUES (1,0,'0');
+INSERT INTO num_exp_mul VALUES (1,0,'0');
+INSERT INTO num_exp_div VALUES (1,0,'NaN');
+INSERT INTO num_exp_add VALUES (1,1,'0');
+INSERT INTO num_exp_sub VALUES (1,1,'0');
+INSERT INTO num_exp_mul VALUES (1,1,'0');
+INSERT INTO num_exp_div VALUES (1,1,'NaN');
+INSERT INTO num_exp_add VALUES (1,2,'-34338492.215397047');
+INSERT INTO num_exp_sub VALUES (1,2,'34338492.215397047');
+INSERT INTO num_exp_mul VALUES (1,2,'0');
+INSERT INTO num_exp_div VALUES (1,2,'0');
+INSERT INTO num_exp_add VALUES (1,3,'4.31');
+INSERT INTO num_exp_sub VALUES (1,3,'-4.31');
+INSERT INTO num_exp_mul VALUES (1,3,'0');
+INSERT INTO num_exp_div VALUES (1,3,'0');
+INSERT INTO num_exp_add VALUES (1,4,'7799461.4119');
+INSERT INTO num_exp_sub VALUES (1,4,'-7799461.4119');
+INSERT INTO num_exp_mul VALUES (1,4,'0');
+INSERT INTO num_exp_div VALUES (1,4,'0');
+INSERT INTO num_exp_add VALUES (1,5,'16397.038491');
+INSERT INTO num_exp_sub VALUES (1,5,'-16397.038491');
+INSERT INTO num_exp_mul VALUES (1,5,'0');
+INSERT INTO num_exp_div VALUES (1,5,'0');
+INSERT INTO num_exp_add VALUES (1,6,'93901.57763026');
+INSERT INTO num_exp_sub VALUES (1,6,'-93901.57763026');
+INSERT INTO num_exp_mul VALUES (1,6,'0');
+INSERT INTO num_exp_div VALUES (1,6,'0');
+INSERT INTO num_exp_add VALUES (1,7,'-83028485');
+INSERT INTO num_exp_sub VALUES (1,7,'83028485');
+INSERT INTO num_exp_mul VALUES (1,7,'0');
+INSERT INTO num_exp_div VALUES (1,7,'0');
+INSERT INTO num_exp_add VALUES (1,8,'74881');
+INSERT INTO num_exp_sub VALUES (1,8,'-74881');
+INSERT INTO

[GitHub] [spark] AmplabJenkins removed a comment on issue #24793: [SPARK-27944][ML] Unify the behavior of checking empty output column names

2019-07-14 Thread GitBox

AmplabJenkins removed a comment on issue #24793: [SPARK-27944][ML] Unify the 
behavior of checking empty output column names
URL: https://github.com/apache/spark/pull/24793#issuecomment-511275368
 
 
   Merged build finished. Test PASSed.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] [spark] AmplabJenkins removed a comment on issue #24793: [SPARK-27944][ML] Unify the behavior of checking empty output column names

2019-07-14 Thread GitBox

AmplabJenkins removed a comment on issue #24793: [SPARK-27944][ML] Unify the 
behavior of checking empty output column names
URL: https://github.com/apache/spark/pull/24793#issuecomment-511275374
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/12793/
   Test PASSed.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] [spark] AmplabJenkins commented on issue #24793: [SPARK-27944][ML] Unify the behavior of checking empty output column names

2019-07-14 Thread GitBox

AmplabJenkins commented on issue #24793: [SPARK-27944][ML] Unify the behavior 
of checking empty output column names
URL: https://github.com/apache/spark/pull/24793#issuecomment-511275374
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/12793/
   Test PASSed.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] [spark] AmplabJenkins commented on issue #24793: [SPARK-27944][ML] Unify the behavior of checking empty output column names

2019-07-14 Thread GitBox

AmplabJenkins commented on issue #24793: [SPARK-27944][ML] Unify the behavior 
of checking empty output column names
URL: https://github.com/apache/spark/pull/24793#issuecomment-511275368
 
 
   Merged build finished. Test PASSed.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] [spark] dongjoon-hyun closed pull request #25150: [SPARK-28384][SQL][TEST] Port select_distinct.sql

2019-07-14 Thread GitBox

dongjoon-hyun closed pull request #25150: [SPARK-28384][SQL][TEST] Port 
select_distinct.sql
URL: https://github.com/apache/spark/pull/25150
 
 
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] [spark] AmplabJenkins removed a comment on issue #25092: [SPARK-28312][SQL][TEST] Port numeric.sql

2019-07-14 Thread GitBox

AmplabJenkins removed a comment on issue #25092: [SPARK-28312][SQL][TEST] Port 
numeric.sql
URL: https://github.com/apache/spark/pull/25092#issuecomment-511273476
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/12792/
   Test PASSed.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] [spark] SparkQA commented on issue #25092: [SPARK-28312][SQL][TEST] Port numeric.sql

2019-07-14 Thread GitBox

SparkQA commented on issue #25092: [SPARK-28312][SQL][TEST] Port numeric.sql
URL: https://github.com/apache/spark/pull/25092#issuecomment-511273751
 
 
   **[Test build #107664 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/107664/testReport)**
 for PR 25092 at commit 
[`bc11e02`](https://github.com/apache/spark/commit/bc11e02fba3f0f9627e25048dbbd4d2daa7b).


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] [spark] AmplabJenkins removed a comment on issue #25150: [SPARK-28384][SQL][TEST] Port select_distinct.sql

2019-07-14 Thread GitBox

AmplabJenkins removed a comment on issue #25150: [SPARK-28384][SQL][TEST] Port 
select_distinct.sql
URL: https://github.com/apache/spark/pull/25150#issuecomment-511273458
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/12790/
   Test PASSed.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] [spark] AmplabJenkins removed a comment on issue #25150: [SPARK-28384][SQL][TEST] Port select_distinct.sql

2019-07-14 Thread GitBox

AmplabJenkins removed a comment on issue #25150: [SPARK-28384][SQL][TEST] Port 
select_distinct.sql
URL: https://github.com/apache/spark/pull/25150#issuecomment-511273453
 
 
   Merged build finished. Test PASSed.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] [spark] AmplabJenkins removed a comment on issue #25156: [SPARK-28394][SCHEDULER]: Performance Metrics for Driver

2019-07-14 Thread GitBox

AmplabJenkins removed a comment on issue #25156: [SPARK-28394][SCHEDULER]: 
Performance Metrics for Driver
URL: https://github.com/apache/spark/pull/25156#issuecomment-511273373
 
 
   Can one of the admins verify this patch?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] [spark] AmplabJenkins removed a comment on issue #25092: [SPARK-28312][SQL][TEST] Port numeric.sql

2019-07-14 Thread GitBox

AmplabJenkins removed a comment on issue #25092: [SPARK-28312][SQL][TEST] Port 
numeric.sql
URL: https://github.com/apache/spark/pull/25092#issuecomment-511273471
 
 
   Merged build finished. Test PASSed.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] [spark] AmplabJenkins removed a comment on issue #25130: [SPARK-28359][SQL][PYTHON][TESTS] Make integrated UDF tests robust by making UDFs (virtually) no-op

2019-07-14 Thread GitBox

AmplabJenkins removed a comment on issue #25130: 
[SPARK-28359][SQL][PYTHON][TESTS] Make integrated UDF tests robust by making 
UDFs (virtually) no-op
URL: https://github.com/apache/spark/pull/25130#issuecomment-511273444
 
 
   Merged build finished. Test PASSed.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] [spark] AmplabJenkins removed a comment on issue #25130: [SPARK-28359][SQL][PYTHON][TESTS] Make integrated UDF tests robust by making UDFs (virtually) no-op

2019-07-14 Thread GitBox

AmplabJenkins removed a comment on issue #25130: 
[SPARK-28359][SQL][PYTHON][TESTS] Make integrated UDF tests robust by making 
UDFs (virtually) no-op
URL: https://github.com/apache/spark/pull/25130#issuecomment-511273449
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/12791/
   Test PASSed.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] [spark] SparkQA commented on issue #25130: [SPARK-28359][SQL][PYTHON][TESTS] Make integrated UDF tests robust by making UDFs (virtually) no-op

2019-07-14 Thread GitBox

SparkQA commented on issue #25130: [SPARK-28359][SQL][PYTHON][TESTS] Make 
integrated UDF tests robust by making UDFs (virtually) no-op
URL: https://github.com/apache/spark/pull/25130#issuecomment-511273759
 
 
   **[Test build #107663 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/107663/testReport)**
 for PR 25130 at commit 
[`1164ee9`](https://github.com/apache/spark/commit/1164ee940e507fc209e57933a5598ffd13e925fd).


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] [spark] HyukjinKwon commented on a change in pull request #25147: [SPARK-28333][SQL] NULLS FIRST for DESC and NULLS LAST for ASC

2019-07-14 Thread GitBox

HyukjinKwon commented on a change in pull request #25147: [SPARK-28333][SQL] 
NULLS FIRST for DESC and NULLS LAST for ASC
URL: https://github.com/apache/spark/pull/25147#discussion_r303287018
 
 

 ##
 File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/SortOrder.scala
 ##
 @@ -36,12 +36,12 @@ abstract sealed class NullOrdering {
 
 case object Ascending extends SortDirection {
   override def sql: String = "ASC"
-  override def defaultNullOrdering: NullOrdering = NullsFirst
+  override def defaultNullOrdering: NullOrdering = NullsLast
 }
 
 case object Descending extends SortDirection {
   override def sql: String = "DESC"
-  override def defaultNullOrdering: NullOrdering = NullsLast
+  override def defaultNullOrdering: NullOrdering = NullsFirst
 
 Review comment:
   I think the current behaviour already makes sense. I wouldn't fix it. cc 
@dongjoon-hyun, @wangyum, @gatorsmile WDYT?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] [spark] AmplabJenkins commented on issue #25156: [SPARK-28394][SCHEDULER]: Performance Metrics for Driver

2019-07-14 Thread GitBox

AmplabJenkins commented on issue #25156: [SPARK-28394][SCHEDULER]: Performance 
Metrics for Driver
URL: https://github.com/apache/spark/pull/25156#issuecomment-511273709
 
 
   Can one of the admins verify this patch?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] [spark] AmplabJenkins commented on issue #25092: [SPARK-28312][SQL][TEST] Port numeric.sql

2019-07-14 Thread GitBox

AmplabJenkins commented on issue #25092: [SPARK-28312][SQL][TEST] Port 
numeric.sql
URL: https://github.com/apache/spark/pull/25092#issuecomment-511273471
 
 
   Merged build finished. Test PASSed.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] [spark] AmplabJenkins commented on issue #25156: [SPARK-28394][SCHEDULER]: Performance Metrics for Driver

2019-07-14 Thread GitBox

AmplabJenkins commented on issue #25156: [SPARK-28394][SCHEDULER]: Performance 
Metrics for Driver
URL: https://github.com/apache/spark/pull/25156#issuecomment-511273373
 
 
   Can one of the admins verify this patch?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] [spark] AmplabJenkins removed a comment on issue #25156: [SPARK-28394][SCHEDULER]: Performance Metrics for Driver

2019-07-14 Thread GitBox

AmplabJenkins removed a comment on issue #25156: [SPARK-28394][SCHEDULER]: 
Performance Metrics for Driver
URL: https://github.com/apache/spark/pull/25156#issuecomment-511273301
 
 
   Can one of the admins verify this patch?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] [spark] AmplabJenkins commented on issue #25130: [SPARK-28359][SQL][PYTHON][TESTS] Make integrated UDF tests robust by making UDFs (virtually) no-op

2019-07-14 Thread GitBox

AmplabJenkins commented on issue #25130: [SPARK-28359][SQL][PYTHON][TESTS] Make 
integrated UDF tests robust by making UDFs (virtually) no-op
URL: https://github.com/apache/spark/pull/25130#issuecomment-511273449
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/12791/
   Test PASSed.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] [spark] AmplabJenkins commented on issue #25150: [SPARK-28384][SQL][TEST] Port select_distinct.sql

2019-07-14 Thread GitBox

AmplabJenkins commented on issue #25150: [SPARK-28384][SQL][TEST] Port 
select_distinct.sql
URL: https://github.com/apache/spark/pull/25150#issuecomment-511273458
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/12790/
   Test PASSed.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] [spark] AmplabJenkins commented on issue #25150: [SPARK-28384][SQL][TEST] Port select_distinct.sql

2019-07-14 Thread GitBox

AmplabJenkins commented on issue #25150: [SPARK-28384][SQL][TEST] Port 
select_distinct.sql
URL: https://github.com/apache/spark/pull/25150#issuecomment-511273453
 
 
   Merged build finished. Test PASSed.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] [spark] wangyum commented on issue #24931: [SPARK-28129][SQL][TEST] Port float8.sql

2019-07-14 Thread GitBox

wangyum commented on issue #24931: [SPARK-28129][SQL][TEST] Port float8.sql
URL: https://github.com/apache/spark/pull/24931#issuecomment-511273469
 
 
   Thank you @dongjoon-hyun . I will update it later.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] [spark] AmplabJenkins commented on issue #25092: [SPARK-28312][SQL][TEST] Port numeric.sql

2019-07-14 Thread GitBox

AmplabJenkins commented on issue #25092: [SPARK-28312][SQL][TEST] Port 
numeric.sql
URL: https://github.com/apache/spark/pull/25092#issuecomment-511273476
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/12792/
   Test PASSed.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] [spark] AmplabJenkins commented on issue #25130: [SPARK-28359][SQL][PYTHON][TESTS] Make integrated UDF tests robust by making UDFs (virtually) no-op

2019-07-14 Thread GitBox

AmplabJenkins commented on issue #25130: [SPARK-28359][SQL][PYTHON][TESTS] Make 
integrated UDF tests robust by making UDFs (virtually) no-op
URL: https://github.com/apache/spark/pull/25130#issuecomment-511273444
 
 
   Merged build finished. Test PASSed.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] [spark] AmplabJenkins commented on issue #25156: [SPARK-28394][SCHEDULER]: Performance Metrics for Driver

2019-07-14 Thread GitBox

AmplabJenkins commented on issue #25156: [SPARK-28394][SCHEDULER]: Performance 
Metrics for Driver
URL: https://github.com/apache/spark/pull/25156#issuecomment-511273301
 
 
   Can one of the admins verify this patch?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] [spark] wangyum commented on a change in pull request #25092: [SPARK-28312][SQL][TEST] Port numeric.sql

2019-07-14 Thread GitBox

wangyum commented on a change in pull request #25092: [SPARK-28312][SQL][TEST] 
Port numeric.sql
URL: https://github.com/apache/spark/pull/25092#discussion_r303286758
 
 

 ##
 File path: sql/core/src/test/resources/sql-tests/inputs/pgSQL/numeric.sql
 ##
 @@ -0,0 +1,1095 @@
+--
+-- Portions Copyright (c) 1996-2019, PostgreSQL Global Development Group
+--
+--
+-- NUMERIC
+-- 
https://github.com/postgres/postgres/blob/REL_12_BETA2/src/test/regress/sql/numeric.sql
+--
+
+CREATE TABLE num_data (id int, val decimal(38,10)) USING parquet;
 
 Review comment:
   OK. Added it.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] [spark] HyukjinKwon removed a comment on issue #25130: [SPARK-28359][SQL][PYTHON][TESTS] Make integrated UDF tests robust by making UDFs (virtually) no-op

2019-07-14 Thread GitBox

HyukjinKwon removed a comment on issue #25130: 
[SPARK-28359][SQL][PYTHON][TESTS] Make integrated UDF tests robust by making 
UDFs (virtually) no-op
URL: https://github.com/apache/spark/pull/25130#issuecomment-511263305
 
 
   retest this please


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] [spark] hthuynh2 opened a new pull request #25156: initial commit

2019-07-14 Thread GitBox

hthuynh2 opened a new pull request #25156: initial commit
URL: https://github.com/apache/spark/pull/25156
 
 
   ## What changes were proposed in this pull request?
   Description
   This PR added the framework for monitoring the schedulers and displaying 
performance metrics to the UI. More specifically, for each event type that is 
handled by the DAGScheduler, CoarsedGrainedSchedulerBackend, and 
TaskResultGetter, it keeps track of the following information: number of 
handled events of this type, number of pending events of this type, average 
time to handle an event of this type, total time spending on handling all 
events of this type.
   
   ---
   Design and implementation:
   General Idea: 
   For each event type, every time we handle an event of this type, we collect 
the information (e.g. the time taken to handle this event, number of pending 
events,...), and having a background thread that periodically collects the 
information and display to the UI.
   
   UI Explanation
   https://user-images.githubusercontent.com/15680678/61195997-50592900-a691-11e9-8e8b-5a4faa5ecc05.png;>
   
   Explanation of columns:
   - Event Type: The type of event that the information in this row below to
   - Number of Handled Events: The number of events of this type that is handled
   - Number of Pending Events: The number of events of this type that is 
pending in the queue
   - Processing Speed: The average time that is used to handle an event of this 
type
   - Total time: The total time that is used to handled all events of this type
   - Timestamp: The time at the end of the interval that we collect the 
information displayed in this row.
   
   Driver Schedulers Summary Metrics table
   - This table contains information accumulatively collected since the 
application starts (e.g. the “Number of Handled Events” is the total number of 
events that are handled since the application starts)
   
   Driver Schedulers Busy Intervals Metrics table
   - This table contains the information for only some busy intervals. The 
purpose of this table is to pinpoint the time that the scheduler is busy. An 
interval is considered busy if the total time spending on handling all events 
of a type during this interval is greater than 70% of the interval length (This 
70% is configurable). And the information displayed is the information 
collected during that interval only (e.g. the “Number of Handled Events” is the 
number of events that are handled during that interval). Note that we only keep 
track of top 5 busiest intervals sorted by the “total time” (This number 5 is 
configurable). 
   
   An example of how to read the table and how it can be useful for monitoring 
and debugging the application:
   For an example shown in the figure below, let’s look at the first row of the 
“Driver Schedulers Busy Intervals Metrics” table. This row contains the 
information about the ReviveOffers event that is handled by the 
CoarseGrainedSchedulerBackend. There are 12 ReviveOffers events that is handled 
by the CoarseGrainedSchedulerBackend during the interval the ends at 
"2019/06/19 04:03:13". The number of ReviveOffers events that are pending is 
N/A (This is because getting the number of pending events that are waiting to 
be handled by the CoarseGrainedSchedulerBackend is difficult, so I leave it as 
N/A for now). The time for the CoarseGrainedSchedulerBackend to handle an 
ReviveOffers is 763ms. The total time spending on handling all 12 ReviveOffers 
events during this interval is 9s (For this example, the interval length is 
10s). Finally, this interval ends at 2019/06/19 04:03:13.
   
   From the information above, we can easily see that it takes too long for the 
CoarseGrainedSchedulerBackend to handle the event ReviveOffers (for some 
interval, it takes up to 700ms to handle just one ReviveOffers event).  This is 
actually an issue that is mentioned in 
[SPARK-26755](https://issues.apache.org/jira/browse/SPARK-26755). And this 
example is the application mentioned in [PR 
#23677](https://github.com/apache/spark/pull/23677). We can see that having 
these metrics can help to identify the bottleneck for the application. 
   
   ---
   Implementation:
   Introduce 2 new classes:
   - SchedulerEventHandlingMetricTracker: This is used to keep track of the 
information for an event type
   - SchedulerMetricsManager: This is used to manage all the 
SchedulerEventHandlingMetricTracker for all event types
   -
   Configurations:
   spark.scheduler.metric.compute.enabled: Indicate if this feature is enable 
or not (default value is false)
   spark.scheduler.metric.compute.interval: This is the interval length 
(default value is 10s)
   spark.scheduler.metric.compute.numTopBusiestInterval: This is the number of 
busiest intervals that we will keep (default value is 5)
   spark.scheduler.metric.compute.busyIntervalThreshold: This is the threshold 
(in percent) for an interval to be considered busy

[GitHub] [spark] wangyum commented on a change in pull request #25150: [SPARK-28384][SQL][TEST] Port select_distinct.sql

2019-07-14 Thread GitBox

wangyum commented on a change in pull request #25150: [SPARK-28384][SQL][TEST] 
Port select_distinct.sql
URL: https://github.com/apache/spark/pull/25150#discussion_r303286587
 
 

 ##
 File path: 
sql/core/src/test/resources/sql-tests/inputs/pgSQL/select_distinct.sql
 ##
 @@ -0,0 +1,83 @@
+--
+-- Portions Copyright (c) 1996-2019, PostgreSQL Global Development Group
+--
+--
+-- SELECT_DISTINCT
+-- 
https://github.com/postgres/postgres/blob/REL_12_BETA2/src/test/regress/sql/select_distinct.sql
+--
+
+CREATE OR REPLACE TEMPORARY VIEW tmp AS
+SELECT two, stringu1, ten, string4
+FROM onek;
+
+--
+-- awk '{print $3;}' onek.data | sort -n | uniq
+--
+SELECT DISTINCT two FROM tmp ORDER BY 1;
+
+--
+-- awk '{print $5;}' onek.data | sort -n | uniq
+--
+SELECT DISTINCT ten FROM tmp ORDER BY 1;
+
+--
+-- awk '{print $16;}' onek.data | sort -d | uniq
+--
+SELECT DISTINCT string4 FROM tmp ORDER BY 1;
+
+-- [SPARK-28010] Support ORDER BY ... USING syntax
+--
+-- awk '{print $3,$16,$5;}' onek.data | sort -d | uniq |
+-- sort +0n -1 +1d -2 +2n -3
+--
 
 Review comment:
   OK. Added it.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] [spark] dongjoon-hyun commented on a change in pull request #25092: [SPARK-28312][SQL][TEST] Port numeric.sql

2019-07-14 Thread GitBox

dongjoon-hyun commented on a change in pull request #25092: 
[SPARK-28312][SQL][TEST] Port numeric.sql
URL: https://github.com/apache/spark/pull/25092#discussion_r303286519
 
 

 ##
 File path: sql/core/src/test/resources/sql-tests/inputs/pgSQL/numeric.sql
 ##
 @@ -0,0 +1,1095 @@
+--
+-- Portions Copyright (c) 1996-2019, PostgreSQL Global Development Group
+--
+--
+-- NUMERIC
+-- 
https://github.com/postgres/postgres/blob/REL_12_BETA2/src/test/regress/sql/numeric.sql
+--
+
+CREATE TABLE num_data (id int, val decimal(38,10)) USING parquet;
+CREATE TABLE num_exp_add (id1 int, id2 int, expected decimal(38,10)) USING 
parquet;
+CREATE TABLE num_exp_sub (id1 int, id2 int, expected decimal(38,10)) USING 
parquet;
+CREATE TABLE num_exp_div (id1 int, id2 int, expected decimal(38,10)) USING 
parquet;
+CREATE TABLE num_exp_mul (id1 int, id2 int, expected decimal(38,10)) USING 
parquet;
+CREATE TABLE num_exp_sqrt (id int, expected decimal(38,10)) USING parquet;
+CREATE TABLE num_exp_ln (id int, expected decimal(38,10)) USING parquet;
+CREATE TABLE num_exp_log10 (id int, expected decimal(38,10)) USING parquet;
+CREATE TABLE num_exp_power_10_ln (id int, expected decimal(38,10)) USING 
parquet;
+
+CREATE TABLE num_result (id1 int, id2 int, result decimal(38,10)) USING 
parquet;
+
+
+-- **
+-- * The following EXPECTED results are computed by bc(1)
+-- * with a scale of 200
+-- **
+
+-- BEGIN TRANSACTION;
+INSERT INTO num_exp_add VALUES (0,0,'0');
+INSERT INTO num_exp_sub VALUES (0,0,'0');
+INSERT INTO num_exp_mul VALUES (0,0,'0');
+INSERT INTO num_exp_div VALUES (0,0,'NaN');
+INSERT INTO num_exp_add VALUES (0,1,'0');
+INSERT INTO num_exp_sub VALUES (0,1,'0');
+INSERT INTO num_exp_mul VALUES (0,1,'0');
+INSERT INTO num_exp_div VALUES (0,1,'NaN');
+INSERT INTO num_exp_add VALUES (0,2,'-34338492.215397047');
+INSERT INTO num_exp_sub VALUES (0,2,'34338492.215397047');
+INSERT INTO num_exp_mul VALUES (0,2,'0');
+INSERT INTO num_exp_div VALUES (0,2,'0');
+INSERT INTO num_exp_add VALUES (0,3,'4.31');
+INSERT INTO num_exp_sub VALUES (0,3,'-4.31');
+INSERT INTO num_exp_mul VALUES (0,3,'0');
+INSERT INTO num_exp_div VALUES (0,3,'0');
+INSERT INTO num_exp_add VALUES (0,4,'7799461.4119');
+INSERT INTO num_exp_sub VALUES (0,4,'-7799461.4119');
+INSERT INTO num_exp_mul VALUES (0,4,'0');
+INSERT INTO num_exp_div VALUES (0,4,'0');
+INSERT INTO num_exp_add VALUES (0,5,'16397.038491');
+INSERT INTO num_exp_sub VALUES (0,5,'-16397.038491');
+INSERT INTO num_exp_mul VALUES (0,5,'0');
+INSERT INTO num_exp_div VALUES (0,5,'0');
+INSERT INTO num_exp_add VALUES (0,6,'93901.57763026');
+INSERT INTO num_exp_sub VALUES (0,6,'-93901.57763026');
+INSERT INTO num_exp_mul VALUES (0,6,'0');
+INSERT INTO num_exp_div VALUES (0,6,'0');
+INSERT INTO num_exp_add VALUES (0,7,'-83028485');
+INSERT INTO num_exp_sub VALUES (0,7,'83028485');
+INSERT INTO num_exp_mul VALUES (0,7,'0');
+INSERT INTO num_exp_div VALUES (0,7,'0');
+INSERT INTO num_exp_add VALUES (0,8,'74881');
+INSERT INTO num_exp_sub VALUES (0,8,'-74881');
+INSERT INTO num_exp_mul VALUES (0,8,'0');
+INSERT INTO num_exp_div VALUES (0,8,'0');
+INSERT INTO num_exp_add VALUES (0,9,'-24926804.045047420');
+INSERT INTO num_exp_sub VALUES (0,9,'24926804.045047420');
+INSERT INTO num_exp_mul VALUES (0,9,'0');
+INSERT INTO num_exp_div VALUES (0,9,'0');
+INSERT INTO num_exp_add VALUES (1,0,'0');
+INSERT INTO num_exp_sub VALUES (1,0,'0');
+INSERT INTO num_exp_mul VALUES (1,0,'0');
+INSERT INTO num_exp_div VALUES (1,0,'NaN');
+INSERT INTO num_exp_add VALUES (1,1,'0');
+INSERT INTO num_exp_sub VALUES (1,1,'0');
+INSERT INTO num_exp_mul VALUES (1,1,'0');
+INSERT INTO num_exp_div VALUES (1,1,'NaN');
+INSERT INTO num_exp_add VALUES (1,2,'-34338492.215397047');
+INSERT INTO num_exp_sub VALUES (1,2,'34338492.215397047');
+INSERT INTO num_exp_mul VALUES (1,2,'0');
+INSERT INTO num_exp_div VALUES (1,2,'0');
+INSERT INTO num_exp_add VALUES (1,3,'4.31');
+INSERT INTO num_exp_sub VALUES (1,3,'-4.31');
+INSERT INTO num_exp_mul VALUES (1,3,'0');
+INSERT INTO num_exp_div VALUES (1,3,'0');
+INSERT INTO num_exp_add VALUES (1,4,'7799461.4119');
+INSERT INTO num_exp_sub VALUES (1,4,'-7799461.4119');
+INSERT INTO num_exp_mul VALUES (1,4,'0');
+INSERT INTO num_exp_div VALUES (1,4,'0');
+INSERT INTO num_exp_add VALUES (1,5,'16397.038491');
+INSERT INTO num_exp_sub VALUES (1,5,'-16397.038491');
+INSERT INTO num_exp_mul VALUES (1,5,'0');
+INSERT INTO num_exp_div VALUES (1,5,'0');
+INSERT INTO num_exp_add VALUES (1,6,'93901.57763026');
+INSERT INTO num_exp_sub VALUES (1,6,'-93901.57763026');
+INSERT INTO num_exp_mul VALUES (1,6,'0');
+INSERT INTO num_exp_div VALUES (1,6,'0');
+INSERT INTO num_exp_add VALUES (1,7,'-83028485');
+INSERT INTO num_exp_sub VALUES (1,7,'83028485');
+INSERT INTO num_exp_mul VALUES (1,7,'0');
+INSERT INTO num_exp_div VALUES (1,7,'0');
+INSERT INTO num_exp_add VALUES (1,8,'74881');
+INSERT INTO num_exp_sub VALUES (1,8,'-74881');
+INSERT

[GitHub] [spark] SparkQA commented on issue #25150: [SPARK-28384][SQL][TEST] Port select_distinct.sql

2019-07-14 Thread GitBox

SparkQA commented on issue #25150: [SPARK-28384][SQL][TEST] Port 
select_distinct.sql
URL: https://github.com/apache/spark/pull/25150#issuecomment-511272636
 
 
   **[Test build #107662 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/107662/testReport)**
 for PR 25150 at commit 
[`e06f467`](https://github.com/apache/spark/commit/e06f4677606f2029f66f2264fa4d293f4ac56e52).


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] [spark] cloud-fan commented on issue #25035: [SPARK-28235][SQL] Sum of decimals should return a decimal with MAX_PRECISION

2019-07-14 Thread GitBox

cloud-fan commented on issue #25035: [SPARK-28235][SQL] Sum of decimals should 
return a decimal with MAX_PRECISION
URL: https://github.com/apache/spark/pull/25035#issuecomment-511272440
 
 
   I do feel it's just a heuristic. Even if you set the precision to 38 it can 
still overflow. I don't think this is a better solution compared to `p + 10`, 
as it's already hard to hit overflow according to 
https://github.com/apache/spark/pull/25035#discussion_r300271855


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] [spark] HyukjinKwon commented on issue #24232: [SPARK-27297] [SQL] Add higher order functions to scala API

2019-07-14 Thread GitBox

HyukjinKwon commented on issue #24232: [SPARK-27297] [SQL] Add higher order 
functions to scala API
URL: https://github.com/apache/spark/pull/24232#issuecomment-511272125
 
 
   ping @nvander1 are you able to compile locally?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] [spark] shivusondur commented on a change in pull request #25147: [SPARK-28333][SQL] NULLS FIRST for DESC and NULLS LAST for ASC

2019-07-14 Thread GitBox

shivusondur commented on a change in pull request #25147: [SPARK-28333][SQL] 
NULLS FIRST for DESC and NULLS LAST for ASC
URL: https://github.com/apache/spark/pull/25147#discussion_r303286142
 
 

 ##
 File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/SortOrder.scala
 ##
 @@ -36,12 +36,12 @@ abstract sealed class NullOrdering {
 
 case object Ascending extends SortDirection {
   override def sql: String = "ASC"
-  override def defaultNullOrdering: NullOrdering = NullsFirst
+  override def defaultNullOrdering: NullOrdering = NullsLast
 }
 
 case object Descending extends SortDirection {
   override def sql: String = "DESC"
-  override def defaultNullOrdering: NullOrdering = NullsLast
+  override def defaultNullOrdering: NullOrdering = NullsFirst
 
 Review comment:
   @HyukjinKwon 
   From the below link,  spark is consistent with **mysql, sqlserver**
   But inconsitent with **DB2, Oracle, Postgresql**
   
https://docs.mendix.com/refguide/null-ordering-behavior#3-overview-of-default-nulls-sort-order
   
   
![image](https://user-images.githubusercontent.com/7912929/61195812-3dbe0e80-a6e8-11e9-90e1-bb496843a2bd.png)
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] [spark] AmplabJenkins removed a comment on issue #25117: [Python] Validate LongType in _make_type_verifier

2019-07-14 Thread GitBox

AmplabJenkins removed a comment on issue #25117: [Python] Validate LongType in 
_make_type_verifier
URL: https://github.com/apache/spark/pull/25117#issuecomment-511271629
 
 
   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/107661/
   Test FAILed.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] [spark] AmplabJenkins commented on issue #25117: [Python] Validate LongType in _make_type_verifier

2019-07-14 Thread GitBox

AmplabJenkins commented on issue #25117: [Python] Validate LongType in 
_make_type_verifier
URL: https://github.com/apache/spark/pull/25117#issuecomment-511271628
 
 
   Merged build finished. Test FAILed.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] [spark] SparkQA removed a comment on issue #25117: [Python] Validate LongType in _make_type_verifier

2019-07-14 Thread GitBox

SparkQA removed a comment on issue #25117: [Python] Validate LongType in 
_make_type_verifier
URL: https://github.com/apache/spark/pull/25117#issuecomment-511269920
 
 
   **[Test build #107661 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/107661/testReport)**
 for PR 25117 at commit 
[`1d8761a`](https://github.com/apache/spark/commit/1d8761ae3d426d3fb2a6a577e6f245154b35a59d).


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] [spark] SparkQA commented on issue #25117: [Python] Validate LongType in _make_type_verifier

2019-07-14 Thread GitBox

SparkQA commented on issue #25117: [Python] Validate LongType in 
_make_type_verifier
URL: https://github.com/apache/spark/pull/25117#issuecomment-511271594
 
 
   **[Test build #107661 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/107661/testReport)**
 for PR 25117 at commit 
[`1d8761a`](https://github.com/apache/spark/commit/1d8761ae3d426d3fb2a6a577e6f245154b35a59d).
* This patch **fails PySpark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] [spark] AmplabJenkins removed a comment on issue #25117: [Python] Validate LongType in _make_type_verifier

2019-07-14 Thread GitBox

AmplabJenkins removed a comment on issue #25117: [Python] Validate LongType in 
_make_type_verifier
URL: https://github.com/apache/spark/pull/25117#issuecomment-511271628
 
 
   Merged build finished. Test FAILed.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] [spark] AmplabJenkins commented on issue #25117: [Python] Validate LongType in _make_type_verifier

2019-07-14 Thread GitBox

AmplabJenkins commented on issue #25117: [Python] Validate LongType in 
_make_type_verifier
URL: https://github.com/apache/spark/pull/25117#issuecomment-511271629
 
 
   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/107661/
   Test FAILed.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] [spark] cloud-fan commented on issue #24798: [SPARK-27724][SQL] Implement REPLACE TABLE and REPLACE TABLE AS SELECT with V2

2019-07-14 Thread GitBox

cloud-fan commented on issue #24798: [SPARK-27724][SQL] Implement REPLACE TABLE 
and REPLACE TABLE AS SELECT with V2
URL: https://github.com/apache/spark/pull/24798#issuecomment-511271344
 
 
   I was not saying that REPLACE TABLE isn't important, I just want to 
emphasize that, it's better to have smaller Spark PRs when possible. Many 
people try to enforce it when reviewing PRs, and I don't want to break the rule 
here. It not only easy the review, but also make the commit history clearer.
   
   This PR does 2 things:
   1.  introduce `StagingTableCatalog` API
   2. implement REPLACE TABLE (depends on 1)
   
   We can create a new PR for `StagingTableCatalog`, and keep this PR for 
review/discussion. After the new PR is merged, rebase this PR. This is also 
what I do for individual stuff in big PRs: 
https://github.com/apache/spark/pull/25107


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] [spark] dongjoon-hyun commented on a change in pull request #25092: [SPARK-28312][SQL][TEST] Port numeric.sql

2019-07-14 Thread GitBox

dongjoon-hyun commented on a change in pull request #25092: 
[SPARK-28312][SQL][TEST] Port numeric.sql
URL: https://github.com/apache/spark/pull/25092#discussion_r303285429
 
 

 ##
 File path: sql/core/src/test/resources/sql-tests/inputs/pgSQL/numeric.sql
 ##
 @@ -0,0 +1,1095 @@
+--
+-- Portions Copyright (c) 1996-2019, PostgreSQL Global Development Group
+--
+--
+-- NUMERIC
+-- 
https://github.com/postgres/postgres/blob/REL_12_BETA2/src/test/regress/sql/numeric.sql
+--
+
+CREATE TABLE num_data (id int, val decimal(38,10)) USING parquet;
 
 Review comment:
   `decimal(38,10)` vs `numeric(210,10)`? Is this difference covered by 
SPARK-28318? Could you add a JIRA ID before this line?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] [spark] dongjoon-hyun closed pull request #25151: [SPARK-28387][SQL][TEST] Port select_having.sql

2019-07-14 Thread GitBox

dongjoon-hyun closed pull request #25151: [SPARK-28387][SQL][TEST] Port 
select_having.sql
URL: https://github.com/apache/spark/pull/25151
 
 
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] [spark] dongjoon-hyun commented on a change in pull request #25150: [SPARK-28384][SQL][TEST] Port select_distinct.sql

2019-07-14 Thread GitBox

dongjoon-hyun commented on a change in pull request #25150: 
[SPARK-28384][SQL][TEST] Port select_distinct.sql
URL: https://github.com/apache/spark/pull/25150#discussion_r303284929
 
 

 ##
 File path: 
sql/core/src/test/resources/sql-tests/inputs/pgSQL/select_distinct.sql
 ##
 @@ -0,0 +1,83 @@
+--
+-- Portions Copyright (c) 1996-2019, PostgreSQL Global Development Group
+--
+--
+-- SELECT_DISTINCT
+-- 
https://github.com/postgres/postgres/blob/REL_12_BETA2/src/test/regress/sql/select_distinct.sql
+--
+
+CREATE OR REPLACE TEMPORARY VIEW tmp AS
+SELECT two, stringu1, ten, string4
+FROM onek;
+
+--
+-- awk '{print $3;}' onek.data | sort -n | uniq
+--
+SELECT DISTINCT two FROM tmp ORDER BY 1;
+
+--
+-- awk '{print $5;}' onek.data | sort -n | uniq
+--
+SELECT DISTINCT ten FROM tmp ORDER BY 1;
+
+--
+-- awk '{print $16;}' onek.data | sort -d | uniq
+--
+SELECT DISTINCT string4 FROM tmp ORDER BY 1;
+
+-- [SPARK-28010] Support ORDER BY ... USING syntax
+--
+-- awk '{print $3,$16,$5;}' onek.data | sort -d | uniq |
+-- sort +0n -1 +1d -2 +2n -3
+--
 
 Review comment:
   Could you put the original query?
   ```
   SELECT DISTINCT two, string4, ten
  FROM tmp
  ORDER BY two using <, string4 using <, ten using <;
   ```


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] [spark] HyukjinKwon commented on a change in pull request #25115: [SPARK-28351][SQL] Support DELETE in DataSource V2

2019-07-14 Thread GitBox

HyukjinKwon commented on a change in pull request #25115: [SPARK-28351][SQL] 
Support DELETE in DataSource V2
URL: https://github.com/apache/spark/pull/25115#discussion_r303284920
 
 

 ##
 File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/sources/filters.scala
 ##
 @@ -163,6 +187,7 @@ case class IsNotNull(attribute: String) extends Filter {
  */
 @Stable
 case class And(left: Filter, right: Filter) extends Filter {
+  override def sql: String = s"${left.sql} AND ${right.sql}"
 
 Review comment:
   > for simple case like DELETE by filters in this pr, just pass the filter to 
datasource is more suitable, a 'spark job' is not needed.
   
   BTW, I don't know why this is needed but it won't work with the combinations 
of `AND` and `OR`. `a OR b AND C`


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] [spark] cloud-fan commented on issue #24991: [SPARK-28188] Materialize Dataframe API

2019-07-14 Thread GitBox

cloud-fan commented on issue #24991: [SPARK-28188] Materialize Dataframe API
URL: https://github.com/apache/spark/pull/24991#issuecomment-511269953
 
 
   It's simple to write to noop sink: `df.write.format("noop").save`, why do we 
need this extra public API?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] [spark] SparkQA commented on issue #25133: [SPARK-28365][ML] Fallback locale to en_US in StopWordsRemover if system default locale isn't in available locales in JVM

2019-07-14 Thread GitBox

SparkQA commented on issue #25133: [SPARK-28365][ML] Fallback locale to en_US 
in StopWordsRemover if system default locale isn't in available locales in JVM
URL: https://github.com/apache/spark/pull/25133#issuecomment-511269919
 
 
   **[Test build #107660 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/107660/testReport)**
 for PR 25133 at commit 
[`f985eba`](https://github.com/apache/spark/commit/f985eba8c006327f934c254fedf91ff3987b7297).


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] [spark] SparkQA commented on issue #25117: [Python] Validate LongType in _make_type_verifier

2019-07-14 Thread GitBox

SparkQA commented on issue #25117: [Python] Validate LongType in 
_make_type_verifier
URL: https://github.com/apache/spark/pull/25117#issuecomment-511269920
 
 
   **[Test build #107661 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/107661/testReport)**
 for PR 25117 at commit 
[`1d8761a`](https://github.com/apache/spark/commit/1d8761ae3d426d3fb2a6a577e6f245154b35a59d).


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] [spark] AmplabJenkins removed a comment on issue #25117: [Python] Validate LongType in _make_type_verifier

2019-07-14 Thread GitBox

AmplabJenkins removed a comment on issue #25117: [Python] Validate LongType in 
_make_type_verifier
URL: https://github.com/apache/spark/pull/25117#issuecomment-511269682
 
 
   Merged build finished. Test PASSed.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] [spark] AmplabJenkins removed a comment on issue #25117: [Python] Validate LongType in _make_type_verifier

2019-07-14 Thread GitBox

AmplabJenkins removed a comment on issue #25117: [Python] Validate LongType in 
_make_type_verifier
URL: https://github.com/apache/spark/pull/25117#issuecomment-511269683
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/12789/
   Test PASSed.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] [spark] AmplabJenkins removed a comment on issue #25133: [SPARK-28365][ML] Fallback locale to en_US in StopWordsRemover if system default locale isn't in available locales in JVM

2019-07-14 Thread GitBox

AmplabJenkins removed a comment on issue #25133: [SPARK-28365][ML] Fallback 
locale to en_US in StopWordsRemover if system default locale isn't in available 
locales in JVM
URL: https://github.com/apache/spark/pull/25133#issuecomment-511269672
 
 
   Merged build finished. Test PASSed.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] [spark] AmplabJenkins commented on issue #25133: [SPARK-28365][ML] Fallback locale to en_US in StopWordsRemover if system default locale isn't in available locales in JVM

2019-07-14 Thread GitBox

AmplabJenkins commented on issue #25133: [SPARK-28365][ML] Fallback locale to 
en_US in StopWordsRemover if system default locale isn't in available locales 
in JVM
URL: https://github.com/apache/spark/pull/25133#issuecomment-511269672
 
 
   Merged build finished. Test PASSed.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] [spark] AmplabJenkins commented on issue #25133: [SPARK-28365][ML] Fallback locale to en_US in StopWordsRemover if system default locale isn't in available locales in JVM

2019-07-14 Thread GitBox

AmplabJenkins commented on issue #25133: [SPARK-28365][ML] Fallback locale to 
en_US in StopWordsRemover if system default locale isn't in available locales 
in JVM
URL: https://github.com/apache/spark/pull/25133#issuecomment-511269675
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/12788/
   Test PASSed.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] [spark] AmplabJenkins commented on issue #25117: [Python] Validate LongType in _make_type_verifier

2019-07-14 Thread GitBox

AmplabJenkins commented on issue #25117: [Python] Validate LongType in 
_make_type_verifier
URL: https://github.com/apache/spark/pull/25117#issuecomment-511269683
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/12789/
   Test PASSed.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] [spark] AmplabJenkins removed a comment on issue #25133: [SPARK-28365][ML] Fallback locale to en_US in StopWordsRemover if system default locale isn't in available locales in JVM

2019-07-14 Thread GitBox

AmplabJenkins removed a comment on issue #25133: [SPARK-28365][ML] Fallback 
locale to en_US in StopWordsRemover if system default locale isn't in available 
locales in JVM
URL: https://github.com/apache/spark/pull/25133#issuecomment-511269675
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/12788/
   Test PASSed.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] [spark] AmplabJenkins commented on issue #25117: [Python] Validate LongType in _make_type_verifier

2019-07-14 Thread GitBox

AmplabJenkins commented on issue #25117: [Python] Validate LongType in 
_make_type_verifier
URL: https://github.com/apache/spark/pull/25117#issuecomment-511269682
 
 
   Merged build finished. Test PASSed.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] [spark] HyukjinKwon commented on issue #25117: [Python] Validate LongType in _make_type_verifier

2019-07-14 Thread GitBox

HyukjinKwon commented on issue #25117: [Python] Validate LongType in 
_make_type_verifier
URL: https://github.com/apache/spark/pull/25117#issuecomment-511269586
 
 
   ok to test


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] [spark] AmplabJenkins removed a comment on issue #25117: [Python] Validate LongType in _make_type_verifier

2019-07-14 Thread GitBox

AmplabJenkins removed a comment on issue #25117: [Python] Validate LongType in 
_make_type_verifier
URL: https://github.com/apache/spark/pull/25117#issuecomment-510499121
 
 
   Can one of the admins verify this patch?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] [spark] HyukjinKwon commented on issue #25117: [Python] Validate LongType in _make_type_verifier

2019-07-14 Thread GitBox

HyukjinKwon commented on issue #25117: [Python] Validate LongType in 
_make_type_verifier
URL: https://github.com/apache/spark/pull/25117#issuecomment-511269564
 
 
   Can you show the output before / after? Also, please file a JIRA. See 
https://spark.apache.org/contributing.html


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] [spark] cloud-fan commented on a change in pull request #25130: [SPARK-28359][SQL][PYTHON][TESTS] Make integrated UDF tests robust by making UDFs (virtually) no-op

2019-07-14 Thread GitBox

cloud-fan commented on a change in pull request #25130: 
[SPARK-28359][SQL][PYTHON][TESTS] Make integrated UDF tests robust by making 
UDFs (virtually) no-op
URL: https://github.com/apache/spark/pull/25130#discussion_r303283620
 
 

 ##
 File path: 
sql/core/src/test/scala/org/apache/spark/sql/IntegratedUDFTestUtils.scala
 ##
 @@ -35,8 +36,12 @@ import org.apache.spark.sql.types.StringType
  * This object targets to integrate various UDF test cases so that Scalar UDF, 
Python UDF and
  * Scalar Pandas UDFs can be tested in SBT & Maven tests.
  *
- * The available UDFs cast input to strings, which take one column as input 
and return a string
- * type column as output.
+ * The available UDFs are special. It defines an UDF wrapped by cast. So, 
Input column is casted
+ * into string, UDF returns strings as are, and then output column is casted 
back to the input
+ * column. In this way, UDF is virtually no-op.
+ *
+ * Note that, due to this implementation limitation, complex types such as 
map, array and struct
+ * types do not work this this UDF because they cannot be same after the cast 
roundtrip.
 
 Review comment:
   double `this`?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] [spark] dongjoon-hyun commented on issue #24931: [SPARK-28129][SQL][TEST] Port float8.sql

2019-07-14 Thread GitBox

dongjoon-hyun commented on issue #24931: [SPARK-28129][SQL][TEST] Port 
float8.sql
URL: https://github.com/apache/spark/pull/24931#issuecomment-511268877
 
 
   Hi, @wangyum .
   Since #25041 is merged, could you update this PR once more? Thanks!


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] [spark] dongjoon-hyun edited a comment on issue #25146: [SPARK-28152][SQL] Mapped ShortType to SMALLINT and FloatType to REAL for MsSqlServerDialect

2019-07-14 Thread GitBox

dongjoon-hyun edited a comment on issue #25146: [SPARK-28152][SQL] Mapped 
ShortType to SMALLINT and FloatType to REAL for MsSqlServerDialect
URL: https://github.com/apache/spark/pull/25146#issuecomment-511268693
 
 
   Could you elaborate your PR description a little bit more, @shivsood ? 
Please refer the other commit logs.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] [spark] dongjoon-hyun commented on issue #25146: [SPARK-28152][SQL] Mapped ShortType to SMALLINT and FloatType to REAL for MsSqlServerDialect

2019-07-14 Thread GitBox

dongjoon-hyun commented on issue #25146: [SPARK-28152][SQL] Mapped ShortType to 
SMALLINT and FloatType to REAL for MsSqlServerDialect
URL: https://github.com/apache/spark/pull/25146#issuecomment-511268693
 
 
   Could you elaborate your PR description a little bit more, @shivsood ?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] [spark] dongjoon-hyun commented on a change in pull request #25146: [SPARK-28152][SQL] Mapped ShortType to SMALLINT and FloatType to REAL for MsSqlServerDialect

2019-07-14 Thread GitBox

dongjoon-hyun commented on a change in pull request #25146: [SPARK-28152][SQL] 
Mapped ShortType to SMALLINT and FloatType to REAL for MsSqlServerDialect
URL: https://github.com/apache/spark/pull/25146#discussion_r303283367
 
 

 ##
 File path: sql/core/src/test/scala/org/apache/spark/sql/jdbc/JDBCSuite.scala
 ##
 @@ -895,8 +895,20 @@ class JDBCSuite extends QueryTest
   "BIT")
 
assert(msSqlServerDialect.getJDBCType(BinaryType).map(_.databaseTypeDefinition).get
 ==
   "VARBINARY(MAX)")
+
assert(msSqlServerDialect.getJDBCType(ShortType).map(_.databaseTypeDefinition).get
 ==
+  "SMALLINT")
   }
 
+  test("SPARK-28152 MsSqlServerDialect catalyst type mapping") {
+val msSqlServerDialect = JdbcDialects.get("jdbc:sqlserver")
+val metadata = new MetadataBuilder().putLong("scale", 1)
+assert(msSqlServerDialect.getCatalystType(java.sql.Types.SMALLINT, 
"SMALLINT", 1,
+  metadata).get == ShortType)
+assert(msSqlServerDialect.getCatalystType(java.sql.Types.REAL, "REAL", 1,
+  metadata).get == FloatType)
+  }
+
+
 
 Review comment:
   nit. Let's remove this redundant empty line.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] [spark] HyukjinKwon commented on a change in pull request #25147: [SPARK-28333][SQL] NULLS FIRST for DESC and NULLS LAST for ASC

2019-07-14 Thread GitBox

HyukjinKwon commented on a change in pull request #25147: [SPARK-28333][SQL] 
NULLS FIRST for DESC and NULLS LAST for ASC
URL: https://github.com/apache/spark/pull/25147#discussion_r303283180
 
 

 ##
 File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/SortOrder.scala
 ##
 @@ -36,12 +36,12 @@ abstract sealed class NullOrdering {
 
 case object Ascending extends SortDirection {
   override def sql: String = "ASC"
-  override def defaultNullOrdering: NullOrdering = NullsFirst
+  override def defaultNullOrdering: NullOrdering = NullsLast
 }
 
 case object Descending extends SortDirection {
   override def sql: String = "DESC"
-  override def defaultNullOrdering: NullOrdering = NullsLast
+  override def defaultNullOrdering: NullOrdering = NullsFirst
 
 Review comment:
   I think this is a breaking change and workaround looks pretty easy. Can you 
check what other DBMSs do? If it's consistent with some other DBMSs, I think we 
shouldn't fix it and resolve the ticket as `Won't Do`


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] [spark] AmplabJenkins commented on issue #24128: [SPARK-27188][SS] FileStreamSink: provide a new option to have retention on output files

2019-07-14 Thread GitBox

AmplabJenkins commented on issue #24128: [SPARK-27188][SS] FileStreamSink: 
provide a new option to have retention on output files
URL: https://github.com/apache/spark/pull/24128#issuecomment-511267376
 
 
   Can one of the admins verify this patch?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] [spark] HyukjinKwon commented on issue #25041: [SPARK-28133][SQL] Add acosh/asinh/atanh functions to SQL

2019-07-14 Thread GitBox

HyukjinKwon commented on issue #25041: [SPARK-28133][SQL] Add acosh/asinh/atanh 
functions to SQL
URL: https://github.com/apache/spark/pull/25041#issuecomment-511266861
 
 
   +1 looks fine to me too.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] [spark] Tonix517 commented on issue #25041: [SPARK-28133][SQL] Add acosh/asinh/atanh functions to SQL

2019-07-14 Thread GitBox

Tonix517 commented on issue #25041: [SPARK-28133][SQL] Add acosh/asinh/atanh 
functions to SQL
URL: https://github.com/apache/spark/pull/25041#issuecomment-511266819
 
 
   @dongjoon-hyun @mgaido91 Thank you very much for all your guidance and 
patience. Learned a lot :)


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] [spark] dongjoon-hyun closed pull request #25041: [SPARK-28133][SQL] Add acosh/asinh/atanh functions to SQL

2019-07-14 Thread GitBox

dongjoon-hyun closed pull request #25041: [SPARK-28133][SQL] Add 
acosh/asinh/atanh functions to SQL
URL: https://github.com/apache/spark/pull/25041
 
 
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] [spark] dongjoon-hyun commented on issue #25041: [SPARK-28133][SQL] Add acosh/asinh/atanh functions to SQL

2019-07-14 Thread GitBox

dongjoon-hyun commented on issue #25041: [SPARK-28133][SQL] Add 
acosh/asinh/atanh functions to SQL
URL: https://github.com/apache/spark/pull/25041#issuecomment-511265725
 
 
   Thank you, @mgaido91 . 
   
   Yes, I tried in several use cases including `CAST/CTAS/INSERT/WHERE` for the 
second case. The argument of that function is stored into `double` primitive 
variables like `double project_value_1` or `double filter_value_2` already. It 
turns out pure `double` operations.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

1 2 3 4 5 >

1 - 100 of 447 matches

Mail list logo