[GitHub] spark issue #21939: [SPARK-23874][SQL][PYTHON] Upgrade Apache Arrow to 0.10....

2018-08-17 Thread yhuai
Github user yhuai commented on the issue: https://github.com/apache/spark/pull/21939 got it. Thank you! --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h

[GitHub] spark issue #21939: [SPARK-23874][SQL][PYTHON] Upgrade Apache Arrow to 0.10....

2018-08-17 Thread yhuai
Github user yhuai commented on the issue: https://github.com/apache/spark/pull/21939 @shaneknapp what was the version of pyarrow in that build? 0.8 or 0.10?

[GitHub] spark issue #21939: [SPARK-23874][SQL][PYTHON] Upgrade Apache Arrow to 0.10....

2018-08-17 Thread yhuai
Github user yhuai commented on the issue: https://github.com/apache/spark/pull/21939 @BryanCutler So, for this upgrade, even though the JVM-side dependency is 0.10, pyspark can work with any pyarrow version from 0.8 to 0.10 without problems

[GitHub] spark issue #22003: [SPARK-25019][BUILD] Fix orc dependency to use the same ...

2018-08-06 Thread yhuai
Github user yhuai commented on the issue: https://github.com/apache/spark/pull/22003 @dongjoon-hyun no problem. Thank you!

spark git commit: [SPARK-25019][BUILD] Fix orc dependency to use the same exclusion rules

2018-08-06 Thread yhuai
Repository: spark Updated Branches: refs/heads/master 51e2b38d9 -> 278984d5a [SPARK-25019][BUILD] Fix orc dependency to use the same exclusion rules ## What changes were proposed in this pull request? During upgrading Apache ORC to 1.5.2

[GitHub] spark issue #22003: [SPARK-25019][BUILD] Fix orc dependency to use the same ...

2018-08-06 Thread yhuai
Github user yhuai commented on the issue: https://github.com/apache/spark/pull/22003 lgtm. Merging to master.

[GitHub] spark pull request #22003: [SPARK-25019][BUILD] Fix orc dependency to use th...

2018-08-06 Thread yhuai
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/22003#discussion_r207986831 --- Diff: sql/core/pom.xml --- @@ -90,39 +90,11 @@ org.apache.orc orc-core ${orc.classifier

[GitHub] spark pull request #22003: [SPARK-25019][BUILD] Fix orc dependency to use th...

2018-08-06 Thread yhuai
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/22003#discussion_r207962501 --- Diff: sql/core/pom.xml --- @@ -90,39 +90,11 @@ org.apache.orc orc-core ${orc.classifier

[GitHub] spark pull request #22003: [SPARK-25019][BUILD] Fix orc dependency to use th...

2018-08-06 Thread yhuai
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/22003#discussion_r207888608 --- Diff: sql/core/pom.xml --- @@ -90,39 +90,11 @@ org.apache.orc orc-core ${orc.classifier

spark git commit: [SPARK-24895] Remove spotbugs plugin

2018-07-24 Thread yhuai
Repository: spark Updated Branches: refs/heads/master d4a277f0c -> fc21f192a [SPARK-24895] Remove spotbugs plugin ## What changes were proposed in this pull request? Spotbugs maven plugin was a recently added plugin before 2.4.0 snapshot artifacts were broken. To ensure it does not affect

[GitHub] spark issue #21865: [SPARK-24895] Remove spotbugs plugin

2018-07-24 Thread yhuai
Github user yhuai commented on the issue: https://github.com/apache/spark/pull/21865 lgtm. I am merging this PR to master branch. Then, I will kick off https://amplab.cs.berkeley.edu/jenkins/view/Spark%20Packaging/job/spark-master-maven-snapshots

[GitHub] spark issue #21865: [SPARK-24895] Remove spotbugs plugin

2018-07-24 Thread yhuai
Github user yhuai commented on the issue: https://github.com/apache/spark/pull/21865 cc @HyukjinKwon @kiszk I will merge this PR once it passes the test.

svn commit: r25324 - /dev/spark/v2.3.0-rc5-bin/ /release/spark/spark-2.3.0/

2018-02-27 Thread yhuai
Author: yhuai Date: Wed Feb 28 07:25:53 2018 New Revision: 25324 Log: Releasing Apache Spark 2.3.0 Added: release/spark/spark-2.3.0/ - copied from r25323, dev/spark/v2.3.0-rc5-bin/ Removed: dev/spark/v2.3.0-rc5-bin

[GitHub] spark pull request #20473: [SPARK-23300][TESTS] Prints out if Pandas and PyA...

2018-02-01 Thread yhuai
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/20473#discussion_r16362 --- Diff: python/run-tests.py --- @@ -151,6 +151,38 @@ def parse_opts(): return opts +def _check_dependencies(python_exec

[GitHub] spark pull request #19872: [SPARK-22274][PYTHON][SQL] User-defined aggregati...

2018-02-01 Thread yhuai
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/19872#discussion_r165449847 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/planning/patterns.scala --- @@ -199,7 +200,7 @@ object ExtractFiltersAndInnerJoins

[GitHub] spark pull request #20473: [SPARK-23300][TESTS] Prints out if Pandas and PyA...

2018-02-01 Thread yhuai
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/20473#discussion_r165445947 --- Diff: python/run-tests.py --- @@ -151,6 +151,38 @@ def parse_opts(): return opts +def _check_dependencies(python_exec

[GitHub] spark pull request #20473: [SPARK-23300][TESTS] Prints out if Pandas and PyA...

2018-02-01 Thread yhuai
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/20473#discussion_r165445232 --- Diff: python/run-tests.py --- @@ -151,6 +151,38 @@ def parse_opts(): return opts +def _check_dependencies(python_exec

[GitHub] spark issue #20465: [SPARK-23292][TEST] always run python tests

2018-01-31 Thread yhuai
Github user yhuai commented on the issue: https://github.com/apache/spark/pull/20465 So, jenkins jobs run those tests with python3? If so, I feel better because those tests are not completely skipped in Jenkins. If it is hard to make them run with python 2, let's have a log

[GitHub] spark issue #20465: [SPARK-23292][TEST] always run python tests

2018-01-31 Thread yhuai
Github user yhuai commented on the issue: https://github.com/apache/spark/pull/20465 @felixcheung jenkins is actually skipping those tests (see the failure of this pr). It makes sense to provide a way to allow developers to not run those tests. But, I'd prefer that we run those tests

[GitHub] spark pull request #19872: [SPARK-22274][PYTHON][SQL] User-defined aggregati...

2018-01-31 Thread yhuai
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/19872#discussion_r165253818 --- Diff: python/pyspark/sql/tests.py --- @@ -4353,6 +4347,446 @@ def test_unsupported_types(self): df.groupby('id').apply(f).collect

[GitHub] spark pull request #19872: [SPARK-22274][PYTHON][SQL] User-defined aggregati...

2018-01-31 Thread yhuai
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/19872#discussion_r165253514 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/planning/patterns.scala --- @@ -199,7 +200,7 @@ object ExtractFiltersAndInnerJoins

[GitHub] spark pull request #19872: [SPARK-22274][PYTHON][SQL] User-defined aggregati...

2018-01-31 Thread yhuai
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/19872#discussion_r165220142 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/planning/patterns.scala --- @@ -199,7 +200,7 @@ object ExtractFiltersAndInnerJoins

[GitHub] spark pull request #20037: [SPARK-22849] ivy.retrieve pattern should also co...

2018-01-23 Thread yhuai
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/20037#discussion_r163463718 --- Diff: core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala --- @@ -1271,7 +1271,7 @@ private[spark] object SparkSubmitUtils

[GitHub] spark issue #20110: [SPARK-22313][PYTHON][FOLLOWUP] Explicitly import warnin...

2017-12-28 Thread yhuai
Github user yhuai commented on the issue: https://github.com/apache/spark/pull/20110 Thank you! Let's also check the build result to make sure `pyspark.streaming.tests.FlumePollingStreamTests` is indeed triggered (I hit this issue while running this test

[GitHub] spark pull request #19535: [SPARK-22313][PYTHON] Mark/print deprecation warn...

2017-12-28 Thread yhuai
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/19535#discussion_r159019845 --- Diff: python/pyspark/streaming/flume.py --- @@ -54,8 +54,13 @@ def createStream(ssc, hostname, port, :param bodyDecoder: A function used

[GitHub] spark pull request #19535: [SPARK-22313][PYTHON] Mark/print deprecation warn...

2017-12-28 Thread yhuai
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/19535#discussion_r159013024 --- Diff: python/pyspark/streaming/flume.py --- @@ -54,8 +54,13 @@ def createStream(ssc, hostname, port, :param bodyDecoder: A function used

[GitHub] spark pull request #5604: [SPARK-1442][SQL] Window Function Support for Spar...

2017-12-19 Thread yhuai
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/5604#discussion_r157933488 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/windowExpressions.scala --- @@ -0,0 +1,340 @@ +/* + * Licensed

[GitHub] spark issue #19448: [SPARK-22217] [SQL] ParquetFileFormat to support arbitra...

2017-10-13 Thread yhuai
Github user yhuai commented on the issue: https://github.com/apache/spark/pull/19448 Thank you :)

[GitHub] spark issue #19448: [SPARK-22217] [SQL] ParquetFileFormat to support arbitra...

2017-10-13 Thread yhuai
Github user yhuai commented on the issue: https://github.com/apache/spark/pull/19448 I am not really worried about this particular change. It's already merged and it seems a small and safe change. I am not planning to revert it. But, in general, let's avoid merging changes

[GitHub] spark issue #19448: [SPARK-22217] [SQL] ParquetFileFormat to support arbitra...

2017-10-13 Thread yhuai
Github user yhuai commented on the issue: https://github.com/apache/spark/pull/19448 @HyukjinKwon branch-2.2 is a maintenance branch, so I am not sure it is appropriate to merge this change to branch-2.2 since it is not really a bug fix. If the doc is not accurate, we should fix

[GitHub] spark issue #19149: [SPARK-21652][SQL][FOLLOW-UP] Fix rule conflict between ...

2017-09-29 Thread yhuai
Github user yhuai commented on the issue: https://github.com/apache/spark/pull/19149 Can we add a test?

[GitHub] spark pull request #19080: [SPARK-21865][SQL] simplify the distribution sema...

2017-08-30 Thread yhuai
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/19080#discussion_r136214689 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/physical/partitioning.scala --- @@ -30,18 +30,43 @@ import

[GitHub] spark issue #19080: [SPARK-21865][SQL] simplify the distribution semantic of...

2017-08-30 Thread yhuai
Github user yhuai commented on the issue: https://github.com/apache/spark/pull/19080 Have a question after reading the new approach. Let's say that we have a join like `T1 JOIN T2 on T1.a = T2.a`. Also `T1` is hash partitioned by the value of `T1.a` and it has 10 partitions, and `T2

[3/3] spark-website git commit: Add the news about spark-summit-eu-2017 agenda

2017-08-28 Thread yhuai
Add the news about spark-summit-eu-2017 agenda Project: http://git-wip-us.apache.org/repos/asf/spark-website/repo Commit: http://git-wip-us.apache.org/repos/asf/spark-website/commit/35eb1471 Tree: http://git-wip-us.apache.org/repos/asf/spark-website/tree/35eb1471 Diff:

[1/3] spark-website git commit: Add the news about spark-summit-eu-2017 agenda

2017-08-28 Thread yhuai
Repository: spark-website Updated Branches: refs/heads/asf-site cca972e7f -> 35eb14717 http://git-wip-us.apache.org/repos/asf/spark-website/blob/35eb1471/site/releases/spark-release-1-3-0.html -- diff --git

[2/3] spark-website git commit: Add the news about spark-summit-eu-2017 agenda

2017-08-28 Thread yhuai
http://git-wip-us.apache.org/repos/asf/spark-website/blob/35eb1471/site/news/spark-accepted-into-apache-incubator.html -- diff --git a/site/news/spark-accepted-into-apache-incubator.html

[GitHub] spark issue #18944: [SPARK-21732][SQL]Lazily init hive metastore client

2017-08-14 Thread yhuai
Github user yhuai commented on the issue: https://github.com/apache/spark/pull/18944 lgtm --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so

spark git commit: [SPARK-21111][TEST][2.2] Fix the test failure of describe.sql

2017-06-15 Thread yhuai
Repository: spark Updated Branches: refs/heads/branch-2.2 76ee41fd7 -> a585c870a [SPARK-21111][TEST][2.2] Fix the test failure of describe.sql ## What changes were proposed in this pull request? Test failed in `describe.sql`. We need to fix the related bug introduced in

[GitHub] spark issue #18316: [SPARK-21111] [TEST] [2.2] Fix the test failure of descr...

2017-06-15 Thread yhuai
Github user yhuai commented on the issue: https://github.com/apache/spark/pull/18316 Thanks! I have merged this pr to branch-2.2.

[GitHub] spark issue #18316: [SPARK-21111] [TEST] [2.2] Fix the test failure of descr...

2017-06-15 Thread yhuai
Github user yhuai commented on the issue: https://github.com/apache/spark/pull/18316 thanks! merging to branch-2.2

[GitHub] spark issue #18316: [SPARK-21111] [TEST] [2.2] Fix the test failure of descr...

2017-06-15 Thread yhuai
Github user yhuai commented on the issue: https://github.com/apache/spark/pull/18316 lgtm

[GitHub] spark issue #18064: [SPARK-20213][SQL] Fix DataFrameWriter operations in SQL...

2017-06-08 Thread yhuai
Github user yhuai commented on the issue: https://github.com/apache/spark/pull/18064 My suggestion was about getting changes on the interfaces of ExecutedCommandExec and SaveIntoDataSourceCommand to separate prs. It will help code review (both speed and quality).

[GitHub] spark issue #18148: [SPARK-20926][SQL] Removing exposures to guava library c...

2017-06-07 Thread yhuai
Github user yhuai commented on the issue: https://github.com/apache/spark/pull/18148 @vanzin Seems merging to branch-2.2 was an accident? Since it is not really a bug fix, should we revert it from branch-2.2 and just keep it in the master?

[GitHub] spark issue #18064: [SPARK-20213][SQL] Fix DataFrameWriter operations in SQL...

2017-06-07 Thread yhuai
Github user yhuai commented on the issue: https://github.com/apache/spark/pull/18064 I just came across this pr. I have one piece of general feedback. It will be great if we can make a pr have a single purpose. This pr contains different kinds of changes in order to fix the UI. If refactoring

spark git commit: Revert "[SPARK-20946][SQL] simplify the config setting logic in SparkSession.getOrCreate"

2017-06-02 Thread yhuai
Repository: spark Updated Branches: refs/heads/branch-2.2 6c628e75e -> b560c975b Revert "[SPARK-20946][SQL] simplify the config setting logic in SparkSession.getOrCreate" This reverts commit e11d90bf8deb553fd41b8837e3856c11486c2503. Project:

[GitHub] spark issue #18172: [SPARK-20946][SQL] simplify the config setting logic in ...

2017-06-02 Thread yhuai
Github user yhuai commented on the issue: https://github.com/apache/spark/pull/18172 Reverting this because it breaks repl tests.

[GitHub] spark pull request #17617: [SPARK-20244][Core] Handle incorrect bytesRead me...

2017-06-02 Thread yhuai
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/17617#discussion_r119938185 --- Diff: core/src/main/scala/org/apache/spark/deploy/SparkHadoopUtil.scala --- @@ -143,14 +144,29 @@ class SparkHadoopUtil extends Logging

[GitHub] spark issue #17763: [SPARK-13747][Core]Add ThreadUtils.awaitReady and disall...

2017-05-17 Thread yhuai
Github user yhuai commented on the issue: https://github.com/apache/spark/pull/17763 lgtm

[GitHub] spark issue #17666: [SPARK-20311][SQL] Support aliases for table value funct...

2017-05-09 Thread yhuai
Github user yhuai commented on the issue: https://github.com/apache/spark/pull/17666 I have reverted this change from both master and branch-2.2. I have reopened the jira.

spark git commit: Revert "[SPARK-20311][SQL] Support aliases for table value functions"

2017-05-09 Thread yhuai
Repository: spark Updated Branches: refs/heads/branch-2.2 9e8d23b3a -> d191b962d Revert "[SPARK-20311][SQL] Support aliases for table value functions" This reverts commit 714811d0b5bcb5d47c39782ff74f898d276ecc59. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit:

spark git commit: Revert "[SPARK-20311][SQL] Support aliases for table value functions"

2017-05-09 Thread yhuai
Repository: spark Updated Branches: refs/heads/master ac1ab6b9d -> f79aa285c Revert "[SPARK-20311][SQL] Support aliases for table value functions" This reverts commit 714811d0b5bcb5d47c39782ff74f898d276ecc59. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit:

[GitHub] spark issue #17666: [SPARK-20311][SQL] Support aliases for table value funct...

2017-05-09 Thread yhuai
Github user yhuai commented on the issue: https://github.com/apache/spark/pull/17666 I am going to revert this PR from master and branch-2.2.

[GitHub] spark issue #17666: [SPARK-20311][SQL] Support aliases for table value funct...

2017-05-09 Thread yhuai
Github user yhuai commented on the issue: https://github.com/apache/spark/pull/17666 @maropu Sorry. I think this PR introduces a regression. ``` scala> spark.sql("select * from range(1, 10) cross join range(1, 10)").explain ==

[GitHub] spark issue #17905: [SPARK-20661][SPARKR][TEST][FOLLOWUP] SparkR tableNames(...

2017-05-08 Thread yhuai
Github user yhuai commented on the issue: https://github.com/apache/spark/pull/17905 i see. I think https://github.com/apache/spark/pull/17905/commits/d4c1a9db25ee7386f7b12e4dabb54210a9892510 is good. How about we get it checked in first (after jenkins passes)?

[GitHub] spark issue #17905: [SPARK-20661][SPARKR][TEST][FOLLOWUP] SparkR tableNames(...

2017-05-08 Thread yhuai
Github user yhuai commented on the issue: https://github.com/apache/spark/pull/17905 lgtm

[GitHub] spark issue #17905: [SPARK-20661][SPARKR][TEST][FOLLOWUP] SparkR tableNames(...

2017-05-08 Thread yhuai
Github user yhuai commented on the issue: https://github.com/apache/spark/pull/17905 @falaki's PR did not actually trigger that test.

[GitHub] spark issue #17905: [SPARK-20661][SPARKR][TEST][FOLLOWUP] SparkR tableNames(...

2017-05-08 Thread yhuai
Github user yhuai commented on the issue: https://github.com/apache/spark/pull/17905 @felixcheung you are right. That is the problem.

[GitHub] spark issue #17903: [SPARK-20661][SparkR][Test] SparkR tableNames() test fai...

2017-05-08 Thread yhuai
Github user yhuai commented on the issue: https://github.com/apache/spark/pull/17903 I do not think https://github.com/apache/spark/pull/17649 caused the problem. I saw failures without that internally.

spark git commit: [SPARK-20661][SPARKR][TEST] SparkR tableNames() test fails

2017-05-08 Thread yhuai
Repository: spark Updated Branches: refs/heads/branch-2.2 23681e9ca -> 4179ffc03 [SPARK-20661][SPARKR][TEST] SparkR tableNames() test fails ## What changes were proposed in this pull request? Cleaning existing temp tables before running tableNames tests ## How was this patch tested? SparkR

spark git commit: [SPARK-20661][SPARKR][TEST] SparkR tableNames() test fails

2017-05-08 Thread yhuai
Repository: spark Updated Branches: refs/heads/master 829cd7b8b -> 2abfee18b [SPARK-20661][SPARKR][TEST] SparkR tableNames() test fails ## What changes were proposed in this pull request? Cleaning existing temp tables before running tableNames tests ## How was this patch tested? SparkR Unit

[GitHub] spark issue #17903: [SPARK-20661][SparkR][Test] SparkR tableNames() test fai...

2017-05-08 Thread yhuai
Github user yhuai commented on the issue: https://github.com/apache/spark/pull/17903 Thanks @falaki. Merging to master and branch-2.2.

[GitHub] spark issue #17903: [SPARK-20661][SparkR][Test] SparkR tableNames() test fai...

2017-05-08 Thread yhuai
Github user yhuai commented on the issue: https://github.com/apache/spark/pull/17903 Seems 2.2 build is fine. But, I'd like to get this merged in branch-2.2 since this test will fail if any previous tests leak tables.

[GitHub] spark issue #17903: [SPARK-20661][SparkR][Test] SparkR tableNames() test fai...

2017-05-08 Thread yhuai
Github user yhuai commented on the issue: https://github.com/apache/spark/pull/17903 @felixcheung fyi. I think the main problem of this test is that it will be broken if tests executed before this one leak any table. I think this change makes sense. I will merge it once it passes

[GitHub] spark issue #17892: [SPARK-20626][SPARKR] address date test warning with tim...

2017-05-08 Thread yhuai
Github user yhuai commented on the issue: https://github.com/apache/spark/pull/17892 @felixcheung Seems master build is broken because R tests are broken (https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test/job/spark-master-test-sbt-hadoop-2.7/2844/console). I am not sure

[GitHub] spark issue #17746: [SPARK-20449][ML] Upgrade breeze version to 0.13.1

2017-05-01 Thread yhuai
Github user yhuai commented on the issue: https://github.com/apache/spark/pull/17746 @dbtsai Thanks for the explanation and the context :)

[GitHub] spark issue #17746: [SPARK-20449][ML] Upgrade breeze version to 0.13.1

2017-05-01 Thread yhuai
Github user yhuai commented on the issue: https://github.com/apache/spark/pull/17746 Can I ask how we decided to merge this dependency change after the release branch was cut (especially since this change affects user code)?

[GitHub] spark issue #17659: [SPARK-20358] [core] Executors failing stage on interrup...

2017-04-20 Thread yhuai
Github user yhuai commented on the issue: https://github.com/apache/spark/pull/17659 lgtm. Merging to master and branch-2.2.

spark git commit: [SPARK-20217][CORE] Executor should not fail stage if killed task throws non-interrupted exception

2017-04-05 Thread yhuai
Repository: spark Updated Branches: refs/heads/master 4000f128b -> 5142e5d4e [SPARK-20217][CORE] Executor should not fail stage if killed task throws non-interrupted exception ## What changes were proposed in this pull request? If tasks throw non-interrupted exceptions on kill (e.g.

[GitHub] spark issue #17531: [SPARK-20217][core] Executor should not fail stage if ki...

2017-04-05 Thread yhuai
Github user yhuai commented on the issue: https://github.com/apache/spark/pull/17531 Thanks. Merging to master.

[GitHub] spark issue #17531: [SPARK-20217][core] Executor should not fail stage if ki...

2017-04-05 Thread yhuai
Github user yhuai commented on the issue: https://github.com/apache/spark/pull/17531 test this please

[GitHub] spark issue #17423: [SPARK-20088] Do not create new SparkContext in SparkR c...

2017-03-26 Thread yhuai
Github user yhuai commented on the issue: https://github.com/apache/spark/pull/17423 got it. Thanks :)

[GitHub] spark issue #17423: [SPARK-20088] Do not create new SparkContext in SparkR c...

2017-03-25 Thread yhuai
Github user yhuai commented on the issue: https://github.com/apache/spark/pull/17423 @felixcheung `SparkContext.getOrCreate` is the preferred way to create a SparkContext. So, even if we have a check, it is still better to use `getOrCreate`.

spark git commit: [SPARK-19620][SQL] Fix incorrect exchange coordinator id in the physical plan

2017-03-10 Thread yhuai
Repository: spark Updated Branches: refs/heads/master fcb68e0f5 -> dd9049e04 [SPARK-19620][SQL] Fix incorrect exchange coordinator id in the physical plan ## What changes were proposed in this pull request? When adaptive execution is enabled, an exchange coordinator is used in the Exchange

[GitHub] spark issue #16952: [SPARK-19620][SQL]Fix incorrect exchange coordinator id ...

2017-03-10 Thread yhuai
Github user yhuai commented on the issue: https://github.com/apache/spark/pull/16952 LGTM. Merging to master.

[GitHub] spark issue #17156: [SPARK-19816][SQL][Tests] Fix an issue that DataFrameCal...

2017-03-03 Thread yhuai
Github user yhuai commented on the issue: https://github.com/apache/spark/pull/17156 merged to branch-2.1

spark git commit: [SPARK-19816][SQL][TESTS] Fix an issue that DataFrameCallbackSuite doesn't recover the log level

2017-03-03 Thread yhuai
Repository: spark Updated Branches: refs/heads/branch-2.1 da04d45c2 -> 664c9795c [SPARK-19816][SQL][TESTS] Fix an issue that DataFrameCallbackSuite doesn't recover the log level ## What changes were proposed in this pull request? "DataFrameCallbackSuite.execute callback functions when a

[GitHub] spark issue #17156: [SPARK-19816][SQL][Tests] Fix an issue that DataFrameCal...

2017-03-03 Thread yhuai
Github user yhuai commented on the issue: https://github.com/apache/spark/pull/17156 Let's also merge this to branch-2.1.

[GitHub] spark issue #16917: [SPARK-19529][BRANCH-1.6] Backport PR #16866 to branch-1...

2017-02-27 Thread yhuai
Github user yhuai commented on the issue: https://github.com/apache/spark/pull/16917 Let's use a meaningful title in future :)

[GitHub] spark issue #16935: [SPARK-19604] [TESTS] Log the start of every Python test

2017-02-15 Thread yhuai
Github user yhuai commented on the issue: https://github.com/apache/spark/pull/16935 cool. It has been merged.

spark git commit: [SPARK-19604][TESTS] Log the start of every Python test

2017-02-15 Thread yhuai
lso log the start of a test. So, if a test is hanging, we can tell which test file is running. ## How was this patch tested? This is a change for python tests. Author: Yin Huai <yh...@databricks.com> Closes #16935 from yhuai/SPARK-19604. (cherry picked fr

[GitHub] spark issue #16935: [SPARK-19604] [TESTS] Log the start of every Python test

2017-02-15 Thread yhuai
Github user yhuai commented on the issue: https://github.com/apache/spark/pull/16935 Seems I cannot merge now... Will try again later.

[GitHub] spark issue #16935: [SPARK-19604] [TESTS] Log the start of every Python test

2017-02-15 Thread yhuai
Github user yhuai commented on the issue: https://github.com/apache/spark/pull/16935 ok. Nothing new to add. I will merge this to master and branch-2.1 (in case we want to debug any python test hanging issue in branch-2.1).

[GitHub] spark issue #16935: [SPARK-19604] [TESTS] Log the start of every Python test

2017-02-14 Thread yhuai
Github user yhuai commented on the issue: https://github.com/apache/spark/pull/16935 Let's not merge it right now. I may need to log more.

[GitHub] spark pull request #16935: [SPARK-19604] [TESTS] Log the start of every Pyth...

2017-02-14 Thread yhuai
GitHub user yhuai opened a pull request: https://github.com/apache/spark/pull/16935 [SPARK-19604] [TESTS] Log the start of every Python test ## What changes were proposed in this pull request? Right now, we only have info level log after we finish the tests of a Python
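
The idea described in this pull request (log when a test starts, not only when it finishes, so a hang can be traced to the test that was running) can be sketched roughly as below. This is only an illustration under assumptions, not the actual patch: `run_logged`, `run_one`, and the logger name are hypothetical.

```python
import logging
import time

# Hypothetical logger name, used only for this sketch.
LOGGER = logging.getLogger("run-tests-sketch")


def run_logged(test_names, run_one):
    """Run each test via run_one(name), logging before and after it runs."""
    durations = {}
    for name in test_names:
        # Logged before the test starts, so a hang still leaves a record
        # of which test was running.
        LOGGER.info("Starting test: %s", name)
        started = time.monotonic()
        run_one(name)
        durations[name] = time.monotonic() - started
        LOGGER.info("Finished test: %s in %.2fs", name, durations[name])
    return durations
```

With only a "finished" message, a hanging test produces no output at all; with the "starting" message, the last line in the log names the test that is stuck.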

[GitHub] spark issue #16894: [SPARK-17897] [SQL] [BACKPORT-2.0] Fixed IsNotNull Const...

2017-02-12 Thread yhuai
Github user yhuai commented on the issue: https://github.com/apache/spark/pull/16894 thanks!

[GitHub] spark issue #16067: [SPARK-17897] [SQL] Fixed IsNotNull Constraint Inference...

2017-02-10 Thread yhuai
Github user yhuai commented on the issue: https://github.com/apache/spark/pull/16067 @gatorsmile can we also add it in branch-2.0? Thanks!

spark git commit: [SPARK-19295][SQL] IsolatedClientLoader's downloadVersion should log the location of downloaded metastore client jars

2017-01-19 Thread yhuai
ion of those downloaded jars when `spark.sql.hive.metastore.jars` is set to `maven`. ## How was this patch tested? jenkins Author: Yin Huai <yh...@databricks.com> Closes #16649 from yhuai/SPARK-19295. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.a

[GitHub] spark issue #16649: [SPARK-19295] [SQL] IsolatedClientLoader's downloadVersi...

2017-01-19 Thread yhuai
Github user yhuai commented on the issue: https://github.com/apache/spark/pull/16649 Cool I am merging this to master.

[GitHub] spark issue #16645: [SPARK-19290][SQL] add a new extending interface in Anal...

2017-01-19 Thread yhuai
Github user yhuai commented on the issue: https://github.com/apache/spark/pull/16645 My main concern with this PR is whether people will think it is recommended to add new batches to force those rules to run in a certain order. For these resolution rules, we can also use conditions

[GitHub] spark pull request #16649: [SPARK-19295] [SQL] IsolatedClientLoader's downlo...

2017-01-19 Thread yhuai
GitHub user yhuai opened a pull request: https://github.com/apache/spark/pull/16649 [SPARK-19295] [SQL] IsolatedClientLoader's downloadVersion should log the location of downloaded metastore client jars ## What changes were proposed in this pull request? This will help

spark git commit: Update known_translations for contributor names

2017-01-18 Thread yhuai
uai <yh...@databricks.com> Closes #16628 from yhuai/known_translations. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/0c923185 Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/0c923185 Diff: http://git-wip-us.a

[GitHub] spark issue #16613: [SPARK-19024][SQL] Implement new approach to write a per...

2017-01-18 Thread yhuai
Github user yhuai commented on the issue: https://github.com/apache/spark/pull/16613 Never mind. On second thought, the feature flag does not really buy us anything. We just store the original view definition and the column mapping in the metastore. So, I think it is fine to just do

[GitHub] spark issue #16628: Update known_translations for contributor names

2017-01-18 Thread yhuai
Github user yhuai commented on the issue: https://github.com/apache/spark/pull/16628 I am merging this to master.

[GitHub] spark issue #14204: [SPARK-16520] [WEBUI] Link executors to corresponding wo...

2017-01-18 Thread yhuai
Github user yhuai commented on the issue: https://github.com/apache/spark/pull/14204 OK, I agree. Originally, I thought it would be helpful to figure out the worker that an executor belongs to. But if it does not provide very useful information, I am fine to drop

[GitHub] spark issue #16628: Update known_translations for contributor names

2017-01-18 Thread yhuai
Github user yhuai commented on the issue: https://github.com/apache/spark/pull/16628 done

[GitHub] spark issue #16613: [SPARK-19024][SQL] Implement new approach to write a per...

2017-01-18 Thread yhuai
Github user yhuai commented on the issue: https://github.com/apache/spark/pull/16613 Is there a feature flag that is used to determine if we use this new approach? I feel it would be good to have an internal feature flag to determine the code path. So, if there is something wrong

[GitHub] spark issue #16517: [SPARK-18243][SQL] Port Hive writing to use FileFormat i...

2017-01-17 Thread yhuai
Github user yhuai commented on the issue: https://github.com/apache/spark/pull/16517 Looks good to me. @gatorsmile can you explain your concerns? I am wondering what kinds of cases you think HiveFileFormat may not be able to handle.

[GitHub] spark pull request #16517: [SPARK-18243][SQL] Port Hive writing to use FileF...

2017-01-17 Thread yhuai
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/16517#discussion_r96566857 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/execution/InsertIntoHiveTable.scala --- @@ -276,40 +276,31 @@ case class InsertIntoHiveTable

[GitHub] spark pull request #16517: [SPARK-18243][SQL] Port Hive writing to use FileF...

2017-01-17 Thread yhuai
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/16517#discussion_r96566523 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/execution/InsertIntoHiveTable.scala --- @@ -276,40 +276,31 @@ case class InsertIntoHiveTable

[GitHub] spark pull request #16517: [SPARK-18243][SQL] Port Hive writing to use FileF...

2017-01-17 Thread yhuai
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/16517#discussion_r96566290 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/execution/HiveFileFormat.scala --- @@ -0,0 +1,135 @@ +/* + * Licensed to the Apache
