[GitHub] spark issue #22725: [SPARK-25753][[CORE]fix reading small files via BinaryFi...
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/22725 It still has `[[` before `CORE`. :) --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22775: [SPARK-24709][SQL][FOLLOW-UP] Make schema_of_json's inpu...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/22775 Ah.. let me rebase and sync the tests
[GitHub] spark issue #22775: [SPARK-24709][SQL][FOLLOW-UP] Make schema_of_json's inpu...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22775 Merged build finished. Test FAILed.
[GitHub] spark issue #22775: [SPARK-24709][SQL][FOLLOW-UP] Make schema_of_json's inpu...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22775 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/97631/ Test FAILed.
[GitHub] spark issue #22775: [SPARK-24709][SQL][FOLLOW-UP] Make schema_of_json's inpu...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22775 **[Test build #97631 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97631/testReport)** for PR 22775 at commit [`9cb0b94`](https://github.com/apache/spark/commit/9cb0b946f95f881ba9203f785cc6446f93244157). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #22781: [MINOR][DOC] Update the building doc to use Maven 3.5.4 ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22781 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/97639/ Test PASSed.
[GitHub] spark issue #22781: [MINOR][DOC] Update the building doc to use Maven 3.5.4 ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22781 Merged build finished. Test PASSed.
[GitHub] spark issue #22781: [MINOR][DOC] Update the building doc to use Maven 3.5.4 ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22781 **[Test build #97639 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97639/testReport)** for PR 22781 at commit [`0af884c`](https://github.com/apache/spark/commit/0af884c83999769625b53b852dc546c49976fae1). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #22756: [SPARK-25758][ML] Deprecate computeCost on BisectingKMea...
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22756 shall we revert it from master as well? At least we need to update the message `This method is deprecated and will be removed in 3.0.0.`
[GitHub] spark issue #22781: [MINOR][DOC] Update the building doc to use Maven 3.5.4 ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22781 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/4148/ Test PASSed.
[GitHub] spark issue #22781: [MINOR][DOC] Update the building doc to use Maven 3.5.4 ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22781 **[Test build #97639 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97639/testReport)** for PR 22781 at commit [`0af884c`](https://github.com/apache/spark/commit/0af884c83999769625b53b852dc546c49976fae1).
[GitHub] spark issue #22781: [MINOR][DOC] Update the building doc to use Maven 3.5.4 ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22781 Merged build finished. Test PASSed.
[GitHub] spark pull request #22781: [MINOR][DOC] Update the building doc to use Maven...
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/22781#discussion_r226816760 --- Diff: docs/building-spark.md --- @@ -12,7 +12,7 @@ redirect_from: "building-with-maven.html" ## Apache Maven The Maven-based build is the build of reference for Apache Spark. -Building Spark using Maven requires Maven 3.3.9 or newer and Java 8+. +Building Spark using Maven requires Maven 3.5.4 and Java 8. --- End diff -- Maven 3.5.4 is the latest version and there is no newer one for now, so we cannot guarantee whether a newer Maven will work.
[GitHub] spark issue #22781: [MINOR][DOC] Fix the building document to describe Java ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22781 Merged build finished. Test PASSed.
[GitHub] spark issue #22781: [MINOR][DOC] Fix the building document to describe Java ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22781 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/97637/ Test PASSed.
[GitHub] spark issue #22781: [MINOR][DOC] Fix the building document to describe Java ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22781 **[Test build #97637 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97637/testReport)** for PR 22781 at commit [`82069cd`](https://github.com/apache/spark/commit/82069cd1da847fec76daadb502fdf57fdbdebbfb). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request #22781: [MINOR][DOC] Fix the building document to describ...
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/22781#discussion_r226816703 --- Diff: docs/building-spark.md --- @@ -12,7 +12,7 @@ redirect_from: "building-with-maven.html" ## Apache Maven The Maven-based build is the build of reference for Apache Spark. -Building Spark using Maven requires Maven 3.3.9 or newer and Java 8+. +Building Spark using Maven requires Maven 3.3.9 or newer and Java 8. --- End diff -- Thank you, @cloud-fan and @HyukjinKwon .
[GitHub] spark issue #22763: [SPARK-25764][ML][EXAMPLES] Update BisectingKMeans examp...
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/22763 Ur, @mengxr and @gatorsmile seem to have decided to revert #22756 on `branch-2.4` only.
[GitHub] spark issue #22750: [SPARK-25747][SQL] remove ColumnarBatchScan.needsUnsafeR...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22750 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/4147/ Test PASSed.
[GitHub] spark pull request #22781: [MINOR][DOC] Fix the building document to describ...
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/22781#discussion_r226816661 --- Diff: docs/building-spark.md --- @@ -12,7 +12,7 @@ redirect_from: "building-with-maven.html" ## Apache Maven The Maven-based build is the build of reference for Apache Spark. -Building Spark using Maven requires Maven 3.3.9 or newer and Java 8+. +Building Spark using Maven requires Maven 3.3.9 or newer and Java 8. --- End diff -- +1
[GitHub] spark issue #22750: [SPARK-25747][SQL] remove ColumnarBatchScan.needsUnsafeR...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22750 **[Test build #97638 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97638/testReport)** for PR 22750 at commit [`0dae572`](https://github.com/apache/spark/commit/0dae5723e815bd8c49982530068aa3d45187bfde).
[GitHub] spark issue #22750: [SPARK-25747][SQL] remove ColumnarBatchScan.needsUnsafeR...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22750 Merged build finished. Test PASSed.
[GitHub] spark pull request #22781: [MINOR][DOC] Fix the building document to describ...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/22781#discussion_r226816630 --- Diff: docs/building-spark.md --- @@ -12,7 +12,7 @@ redirect_from: "building-with-maven.html" ## Apache Maven The Maven-based build is the build of reference for Apache Spark. -Building Spark using Maven requires Maven 3.3.9 or newer and Java 8+. +Building Spark using Maven requires Maven 3.3.9 or newer and Java 8. --- End diff -- let's change to the expected maven version, thanks!
[GitHub] spark issue #22501: [SPARK-25492][TEST] Refactor WideSchemaBenchmark to use ...
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22501 seems jenkins is broken, cc @shaneknapp
```
Command "/tmp/tmp.JfFHaoRFPU/3.5/bin/python -c "import setuptools, tokenize;__file__='/home/jenkins/workspace/SparkPullRequestBuilder/python/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" develop --no-deps" failed with error code 1 in /home/jenkins/workspace/SparkPullRequestBuilder/python/
You are using pip version 10.0.1, however version 18.1 is available.
You should consider upgrading via the 'pip install --upgrade pip' command.
Cleaning up temporary directory - /tmp/tmp.JfFHaoRFPU
[error] running /home/jenkins/workspace/SparkPullRequestBuilder/dev/run-pip-tests ; received return code 1
```
[GitHub] spark pull request #22781: [MINOR][DOC] Fix the building document to describ...
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/22781#discussion_r226816563 --- Diff: docs/building-spark.md --- @@ -12,7 +12,7 @@ redirect_from: "building-with-maven.html" ## Apache Maven The Maven-based build is the build of reference for Apache Spark. -Building Spark using Maven requires Maven 3.3.9 or newer and Java 8+. +Building Spark using Maven requires Maven 3.3.9 or newer and Java 8. --- End diff -- Ur, @kiszk and @cloud-fan , @HyukjinKwon . It seems the Maven version is also outdated. We updated the Maven version in https://github.com/apache/spark/pull/21905 and https://github.com/apache/spark/pull/21920 . Should we change it together? Or is Maven 3.3.9 still valid to compile Spark?
[GitHub] spark issue #22750: [SPARK-25747][SQL] remove ColumnarBatchScan.needsUnsafeR...
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22750 retest this please
[GitHub] spark issue #22750: [SPARK-25747][SQL] remove ColumnarBatchScan.needsUnsafeR...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22750 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/97628/ Test FAILed.
[GitHub] spark issue #22750: [SPARK-25747][SQL] remove ColumnarBatchScan.needsUnsafeR...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22750 Merged build finished. Test FAILed.
[GitHub] spark issue #22750: [SPARK-25747][SQL] remove ColumnarBatchScan.needsUnsafeR...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22750 **[Test build #97628 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97628/testReport)** for PR 22750 at commit [`0dae572`](https://github.com/apache/spark/commit/0dae5723e815bd8c49982530068aa3d45187bfde). * This patch **fails PySpark pip packaging tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #22781: [MINOR][DOC] Fix the building document to describe Java ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22781 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/4146/ Test PASSed.
[GitHub] spark issue #22781: [MINOR][DOC] Fix the building document to describe Java ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22781 **[Test build #97637 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97637/testReport)** for PR 22781 at commit [`82069cd`](https://github.com/apache/spark/commit/82069cd1da847fec76daadb502fdf57fdbdebbfb).
[GitHub] spark issue #22781: [MINOR][DOC] Fix the building document to describe Java ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22781 Merged build finished. Test PASSed.
[GitHub] spark pull request #22781: [MINOR][DOC] Fix the building document to describ...
GitHub user dongjoon-hyun opened a pull request: https://github.com/apache/spark/pull/22781 [MINOR][DOC] Fix the building document to describe Java 8 only instead of `Java 8+` ## What changes were proposed in this pull request? Since the community has not tested Java 9 ~ 11 so far, fix the document to describe Java 8 only. ## How was this patch tested? N/A (This is a document-only change.) You can merge this pull request into a Git repository by running: $ git pull https://github.com/dongjoon-hyun/spark SPARK-JDK-DOC Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/22781.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #22781 commit 82069cd1da847fec76daadb502fdf57fdbdebbfb Author: Dongjoon Hyun Date: 2018-10-20T04:33:06Z [MINOR][DOC] Fix the building document to describe Java 8 only instead of `Java 8+`
[GitHub] spark issue #22501: [SPARK-25492][TEST] Refactor WideSchemaBenchmark to use ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22501 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/97627/ Test FAILed.
[GitHub] spark issue #22501: [SPARK-25492][TEST] Refactor WideSchemaBenchmark to use ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22501 Merged build finished. Test FAILed.
[GitHub] spark issue #22501: [SPARK-25492][TEST] Refactor WideSchemaBenchmark to use ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22501 **[Test build #97627 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97627/testReport)** for PR 22501 at commit [`64e5ede`](https://github.com/apache/spark/commit/64e5ede51fcc900d51256d421d86939b202f3d75). * This patch **fails PySpark pip packaging tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #22666: [SPARK-25672][SQL] schema_of_csv() - schema inference fr...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22666 **[Test build #97636 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97636/testReport)** for PR 22666 at commit [`bd79d87`](https://github.com/apache/spark/commit/bd79d87af764f3368cc7c8ad4048bd9d95a8da38).
[GitHub] spark issue #22778: [SPARK-25784][SQL] Infer filters from constraints after ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22778 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/97629/ Test FAILed.
[GitHub] spark issue #22778: [SPARK-25784][SQL] Infer filters from constraints after ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22778 Merged build finished. Test FAILed.
[GitHub] spark issue #22778: [SPARK-25784][SQL] Infer filters from constraints after ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22778 **[Test build #97629 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97629/testReport)** for PR 22778 at commit [`c8d1b91`](https://github.com/apache/spark/commit/c8d1b91b93e7ad05ca0bd17984fad1c30062d504). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #22780: [DOC][MINOR] Fix minor error in the code of graphx guide
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22780 Merged build finished. Test PASSed.
[GitHub] spark issue #22780: [DOC][MINOR] Fix minor error in the code of graphx guide
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22780 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/97633/ Test PASSed.
[GitHub] spark issue #22780: [DOC][MINOR] Fix minor error in the code of graphx guide
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22780 **[Test build #97633 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97633/testReport)** for PR 22780 at commit [`35c77a5`](https://github.com/apache/spark/commit/35c77a5ef7b493ac0973ce0af6a66cbdbbbc749b). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #22776: [SPARK-25779][SQL][TESTS] Remove SQL query tests for fun...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22776 Merged build finished. Test PASSed.
[GitHub] spark issue #22776: [SPARK-25779][SQL][TESTS] Remove SQL query tests for fun...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22776 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/4145/ Test PASSed.
[GitHub] spark pull request #22666: [SPARK-25672][SQL] schema_of_csv() - schema infer...
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/22666#discussion_r226814727
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/functions.scala ---
@@ -3886,6 +3886,31 @@ object functions {
     withExpr(new CsvToStructs(e.expr, schema.expr, options.asScala.toMap))
   }
+
+  /**
+   * Parses a column containing a CSV string and infers its schema.
+   *
+   * @param e a string column containing CSV data.
+   *
+   * @group collection_funcs
+   * @since 3.0.0
+   */
+  def schema_of_csv(e: Column): Column = withExpr(new SchemaOfCsv(e.expr))
+
+  /**
+   * Parses a column containing a CSV string and infers its schema using options.
+   *
+   * @param e a string column containing CSV data.
+   * @param options options to control how the CSV is parsed. accepts the same options and the
+   *                json data source. See [[DataFrameReader#csv]].
+   * @return a column with string literal containing schema in DDL format.
+   *
+   * @group collection_funcs
+   * @since 3.0.0
+   */
+  def schema_of_csv(e: Column, options: java.util.Map[String, String]): Column = {
--- End diff --
`schema_of_json` also has only the Java-specific overload (I actually suggested minimising the exposed functions), since a Java-specific one can be used from the Scala side but a Scala-specific one can't be used from the Java side.
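The point about Java-specific overloads can be illustrated with a minimal sketch. Everything here is hypothetical: `OverloadDemo` and `describeOptions` are stand-ins invented for illustration, not Spark APIs. The only thing they share with the `schema_of_csv` overload above is the `java.util.Map[String, String]` parameter type. The sketch assumes `scala.collection.JavaConverters`, which matches the Scala versions of that era; on Scala 2.13+ `scala.jdk.CollectionConverters._` is the non-deprecated equivalent.

```scala
// Assumption: JavaConverters (Scala 2.11/2.12 era); deprecated but still present on 2.13.
import scala.collection.JavaConverters._

object OverloadDemo {
  // Hypothetical stand-in for an API that exposes only a Java-friendly signature.
  // It renders the options as sorted key=value pairs so the call path is visible.
  def describeOptions(options: java.util.Map[String, String]): String =
    options.asScala.toSeq.sorted.map { case (k, v) => s"$k=$v" }.mkString(",")

  def main(args: Array[String]): Unit = {
    // Scala caller: a single .asJava adapts an immutable Scala Map.
    val fromScala = describeOptions(Map("sep" -> ";", "header" -> "true").asJava)

    // Java-style caller: passes a java.util.HashMap directly, no Scala collection types involved.
    val javaMap = new java.util.HashMap[String, String]()
    javaMap.put("sep", ";")
    javaMap.put("header", "true")
    val fromJava = describeOptions(javaMap)

    println(fromScala) // header=true,sep=;
    println(fromJava)  // header=true,sep=;
  }
}
```

A signature taking `scala.collection.immutable.Map` would be awkward to call from Java, which is why exposing only the `java.util.Map` variant keeps the public surface small while serving both languages.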
[GitHub] spark issue #22776: [SPARK-25779][SQL][TESTS] Remove SQL query tests for fun...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22776 **[Test build #97635 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97635/testReport)** for PR 22776 at commit [`7b1490a`](https://github.com/apache/spark/commit/7b1490a80879ae2b356f430d743fc9ef3ab5cd7e).
[GitHub] spark issue #22749: [WIP][SPARK-25746][SQL] Refactoring ExpressionEncoder to...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22749 Merged build finished. Test FAILed.
[GitHub] spark issue #22749: [WIP][SPARK-25746][SQL] Refactoring ExpressionEncoder to...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22749 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/97630/ Test FAILed.
[GitHub] spark issue #22749: [WIP][SPARK-25746][SQL] Refactoring ExpressionEncoder to...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22749 **[Test build #97630 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97630/testReport)** for PR 22749 at commit [`efbc3fc`](https://github.com/apache/spark/commit/efbc3fc05ce42bda932b54c78c7f7b7eca90419f). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #22662: [SPARK-25627][TEST] Reduce test time for ContinuousStres...
Github user viirya commented on the issue: https://github.com/apache/spark/pull/22662 ping @tdas and @zsxwing Can you take a look at this? Thanks.
[GitHub] spark issue #22779: [SPARK-25786][CORE]If the ByteBuffer.hasArray is false ,...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22779 Merged build finished. Test PASSed.
[GitHub] spark issue #22780: [DOC][MINOR] Fix minor error in the code of graphx guide
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22780 **[Test build #97633 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97633/testReport)** for PR 22780 at commit [`35c77a5`](https://github.com/apache/spark/commit/35c77a5ef7b493ac0973ce0af6a66cbdbbbc749b).
[GitHub] spark issue #22779: [SPARK-25786][CORE]If the ByteBuffer.hasArray is false ,...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22779 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/4144/ Test PASSed.
[GitHub] spark issue #22779: [SPARK-25786][CORE]If the ByteBuffer.hasArray is false ,...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22779 **[Test build #97634 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97634/testReport)** for PR 22779 at commit [`1913fe6`](https://github.com/apache/spark/commit/1913fe674cd6ee948eb6bc1d0ef548c649a7adcf).
[GitHub] spark issue #22780: [DOC][MINOR] Fix minor error in the code of graphx guide
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22780 Merged build finished. Test PASSed.
[GitHub] spark issue #22780: [DOC][MINOR] Fix minor error in the code of graphx guide
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22780 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/4143/ Test PASSed.
[GitHub] spark pull request #22780: [DOC][MINOR] Fix minor error in the code of graph...
GitHub user WeichenXu123 opened a pull request: https://github.com/apache/spark/pull/22780 [DOC][MINOR] Fix minor error in the code of graphx guide ## What changes were proposed in this pull request? Fix minor error in the code "sketch of pregel implementation" of GraphX guide ## How was this patch tested? N/A You can merge this pull request into a Git repository by running: $ git pull https://github.com/WeichenXu123/spark minor_doc_update1 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/22780.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #22780 commit 35c77a5ef7b493ac0973ce0af6a66cbdbbbc749b Author: WeichenXu Date: 2018-10-20T02:52:38Z init pr
[GitHub] spark issue #22779: [SPARK-25786][CORE]If the ByteBuffer.hasArray is false ,...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22779 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/97632/ Test FAILed.
[GitHub] spark issue #22779: [SPARK-25786][CORE]If the ByteBuffer.hasArray is false ,...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22779 Merged build finished. Test FAILed.
[GitHub] spark issue #22779: [SPARK-25786][CORE]If the ByteBuffer.hasArray is false ,...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22779

**[Test build #97632 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97632/testReport)** for PR 22779 at commit [`943e398`](https://github.com/apache/spark/commit/943e3988dcb70d17e65b5e508f6f35b87fc71d28).

* This patch **fails Scala style tests**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #22779: [SPARK-25786][CORE]If the ByteBuffer.hasArray is false ,...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22779 **[Test build #97632 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97632/testReport)** for PR 22779 at commit [`943e398`](https://github.com/apache/spark/commit/943e3988dcb70d17e65b5e508f6f35b87fc71d28).
[GitHub] spark issue #22779: [SPARK-25786][CORE]If the ByteBuffer.hasArray is false ,...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22779 Merged build finished. Test PASSed.
[GitHub] spark issue #22779: [SPARK-25786][CORE]If the ByteBuffer.hasArray is false ,...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22779 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/4142/ Test PASSed.
[GitHub] spark pull request #22779: [SPARK-25786][CORE]If the ByteBuffer.hasArray is ...
GitHub user 10110346 opened a pull request: https://github.com/apache/spark/pull/22779

[SPARK-25786][CORE]If the ByteBuffer.hasArray is false , it will throw UnsupportedOperationException for Kryo

## What changes were proposed in this pull request?

The Kryo `deserialize` method takes a `ByteBuffer` as its input parameter. If that buffer is not backed by an accessible byte array, the call throws `UnsupportedOperationException`.

Exception Info:
```
java.lang.UnsupportedOperationException was thrown.
java.lang.UnsupportedOperationException
    at java.nio.ByteBuffer.array(ByteBuffer.java:994)
    at org.apache.spark.serializer.KryoSerializerInstance.deserialize(KryoSerializer.scala:362)
```

## How was this patch tested?

Added a unit test

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/10110346/spark InputStreamKryo

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/22779.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #22779

commit 943e3988dcb70d17e65b5e508f6f35b87fc71d28
Author: liuxian
Date: 2018-10-19T11:08:10Z

    fix
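For readers unfamiliar with the failure mode described in this pull request, the following is a minimal sketch in plain `java.nio`, not the code of the patch itself. The class `ByteBufferBytes` and the method `readableBytes` are hypothetical names introduced only for illustration. It shows one way to read a buffer's bytes without assuming that `ByteBuffer.array()` is available, which is the call that fails in the stack trace above.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

// Hypothetical helper for illustration; not part of Spark or of the patch.
final class ByteBufferBytes {

    /** Copies the readable bytes of a buffer without assuming a backing array. */
    public static byte[] readableBytes(ByteBuffer buffer) {
        if (buffer.hasArray()) {
            // Heap buffer with an accessible array: honor the array offset and position.
            int start = buffer.arrayOffset() + buffer.position();
            return Arrays.copyOfRange(buffer.array(), start, start + buffer.remaining());
        }
        // Direct or read-only buffer: array() would throw, so copy through a
        // duplicate, which leaves the caller's position and limit untouched.
        byte[] out = new byte[buffer.remaining()];
        buffer.duplicate().get(out);
        return out;
    }

    public static void main(String[] args) {
        ByteBuffer direct = ByteBuffer.allocateDirect(4);
        direct.put(new byte[] {1, 2, 3, 4});
        direct.flip();
        System.out.println("hasArray: " + direct.hasArray());
        System.out.println(Arrays.toString(readableBytes(direct)));
    }
}
```

Copying through `duplicate()` is a design choice made for this sketch: it costs one extra array allocation but does not disturb the caller's buffer state. Whether the actual patch copies the bytes or streams from the buffer is not stated in the message above.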
[GitHub] spark issue #22775: [SPARK-24709][SQL][FOLLOW-UP] Make schema_of_json's inpu...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22775 **[Test build #97631 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97631/testReport)** for PR 22775 at commit [`9cb0b94`](https://github.com/apache/spark/commit/9cb0b946f95f881ba9203f785cc6446f93244157).
[GitHub] spark issue #22775: [SPARK-24709][SQL][FOLLOW-UP] Make schema_of_json's inpu...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22775 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/4141/ Test PASSed.
[GitHub] spark issue #22775: [SPARK-24709][SQL][FOLLOW-UP] Make schema_of_json's inpu...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22775 Merged build finished. Test PASSed.
[GitHub] spark issue #22288: [SPARK-22148][SPARK-15815][Scheduler] Acquire new execut...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22288 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/97625/ Test FAILed.
[GitHub] spark issue #22288: [SPARK-22148][SPARK-15815][Scheduler] Acquire new execut...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22288 Merged build finished. Test FAILed.
[GitHub] spark issue #22288: [SPARK-22148][SPARK-15815][Scheduler] Acquire new execut...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22288

**[Test build #97625 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97625/testReport)** for PR 22288 at commit [`b2d0d40`](https://github.com/apache/spark/commit/b2d0d40771534291bdd5a1e3ebfc2c0c227c5956).

* This patch **fails PySpark pip packaging tests**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #22504: [SPARK-25118][Submit] Persist Driver Logs in Client mode...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22504 Merged build finished. Test FAILed.
[GitHub] spark issue #22504: [SPARK-25118][Submit] Persist Driver Logs in Client mode...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22504 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/97624/ Test FAILed.
[GitHub] spark issue #22749: [WIP][SPARK-25746][SQL] Refactoring ExpressionEncoder to...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22749 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/4140/ Test PASSed.
[GitHub] spark issue #22749: [WIP][SPARK-25746][SQL] Refactoring ExpressionEncoder to...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22749 Merged build finished. Test PASSed.
[GitHub] spark issue #22504: [SPARK-25118][Submit] Persist Driver Logs in Client mode...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22504

**[Test build #97624 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97624/testReport)** for PR 22504 at commit [`0bd33f6`](https://github.com/apache/spark/commit/0bd33f6b2fccc45fcd99de735b8a25465aedb325).

* This patch **fails PySpark pip packaging tests**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #22749: [WIP][SPARK-25746][SQL] Refactoring ExpressionEncoder to...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22749 **[Test build #97630 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97630/testReport)** for PR 22749 at commit [`efbc3fc`](https://github.com/apache/spark/commit/efbc3fc05ce42bda932b54c78c7f7b7eca90419f).
[GitHub] spark pull request #22773: [SPARK-25785][SQL] Add prettyNames for from_json,...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/22773
[GitHub] spark issue #22773: [SPARK-25785][SQL] Add prettyNames for from_json, to_jso...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/22773 Thank you @viirya and @dongjoon-hyun.
[GitHub] spark issue #22773: [SPARK-25785][SQL] Add prettyNames for from_json, to_jso...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/22773 Merged to master.
[GitHub] spark issue #22725: [SPARK-25753][[CORE][FOLLOW-UP]fix reading small files v...
Github user 10110346 commented on the issue: https://github.com/apache/spark/pull/22725 OK, thanks @dongjoon-hyun
[GitHub] spark issue #22775: [SPARK-24709][SQL][FOLLOW-UP] Make schema_of_json's inpu...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/22775 Yup, will fix.
[GitHub] spark issue #22773: [MINOR][SQL] Add prettyNames for from_json, to_json, fro...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/22773 Other JIRAs have different fixed versions. Let me create a new JIRA then.
[GitHub] spark issue #22624: [SPARK-23781][CORE] Merge token renewer functionality in...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22624 Merged build finished. Test PASSed.
[GitHub] spark issue #22624: [SPARK-23781][CORE] Merge token renewer functionality in...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22624 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/97623/ Test PASSed.
[GitHub] spark issue #22624: [SPARK-23781][CORE] Merge token renewer functionality in...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22624

**[Test build #97623 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97623/testReport)** for PR 22624 at commit [`b3a282e`](https://github.com/apache/spark/commit/b3a282e9d9cc0a6202091603e84c1ce0d50269b7).

* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark pull request #22721: [SPARK-25403][SQL] Refreshes the table after inse...
Github user wangyum commented on a diff in the pull request: https://github.com/apache/spark/pull/22721#discussion_r226812885

--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/InsertIntoHadoopFsRelationCommand.scala ---
@@ -189,6 +189,7 @@ case class InsertIntoHadoopFsRelationCommand(
       sparkSession.catalog.refreshByPath(outputPath.toString)
       if (catalogTable.nonEmpty) {
+        sparkSession.catalog.refreshTable(catalogTable.get.identifier.quotedString)
--- End diff --

OK, Fixed.
[GitHub] spark issue #22778: [SPARK-25784][SQL] Infer filters from constraints after ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22778 **[Test build #97629 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97629/testReport)** for PR 22778 at commit [`c8d1b91`](https://github.com/apache/spark/commit/c8d1b91b93e7ad05ca0bd17984fad1c30062d504).
[GitHub] spark issue #22778: [SPARK-25784][SQL] Infer filters from constraints after ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22778 Merged build finished. Test PASSed.
[GitHub] spark issue #22778: [SPARK-25784][SQL] Infer filters from constraints after ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22778 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/4139/ Test PASSed.
[GitHub] spark pull request #22547: [SPARK-25528][SQL] data source V2 read side API r...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/22547#discussion_r226812577

--- Diff: sql/core/src/main/java/org/apache/spark/sql/sources/v2/Format.java ---
@@ -0,0 +1,60 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.sources.v2;
+
+import org.apache.spark.annotation.InterfaceStability;
+import org.apache.spark.sql.sources.DataSourceRegister;
+import org.apache.spark.sql.types.StructType;
+
+/**
+ * The base interface for data source v2. Implementations must have a public, 0-arg constructor.
+ *
+ * The major responsibility of this interface is to return a {@link Table} for read/write.
+ */
+@InterfaceStability.Evolving
+public interface Format extends DataSourceV2 {
--- End diff --

The write API has not been migrated yet and still needs `DataSourceV2`.
[GitHub] spark pull request #22778: [SPARK-25784][SQL] Infer filters from constraints...
GitHub user wangyum opened a pull request: https://github.com/apache/spark/pull/22778

[SPARK-25784][SQL] Infer filters from constraints after rewriting predicate subquery

## What changes were proposed in this pull request?

Infer filters from constraints after rewriting predicate subquery.

## How was this patch tested?

unit tests and benchmark tests

```scala
withTempView("t1", "t2") {
  withTempDir { dir =>
    spark.range(300)
      .selectExpr("cast(null as int) as c1", "if(id % 2 = 0, null, id) as c2", "id as c3")
      .coalesce(1)
      .orderBy("c2")
      .write
      .mode("overwrite")
      .option("parquet.block.size", 10485760)
      .parquet(dir.getCanonicalPath)

    spark.read.parquet(dir.getCanonicalPath).createTempView("t1")
    spark.read.parquet(dir.getCanonicalPath).createTempView("t2")

    Seq("c1", "c2", "c3").foreach { column =>
      val benchmark = new Benchmark(s"join key $column", 10)
      Seq(false, true).foreach { inferFilters =>
        benchmark.addCase(s"Is infer filters $inferFilters", numIters = 5) { _ =>
          withSQLConf(SQLConf.CONSTRAINT_PROPAGATION_ENABLED.key -> inferFilters.toString) {
            sql(s"select t1.* from t1 where t1.$column in (select $column from t2)").count()
          }
        }
      }
      benchmark.run()
    }
  }
}
```

```
Java HotSpot(TM) 64-Bit Server VM 1.8.0_151-b12 on Mac OS X 10.12.6
Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz
join key c1:               Best/Avg Time(ms)   Rate(M/s)   Per Row(ns)   Relative
Is infer filters false           2005 / 2163         0.0   200481431.0       1.0X
Is infer filters true             190 /  207         0.0    18962935.7      10.6X

Java HotSpot(TM) 64-Bit Server VM 1.8.0_151-b12 on Mac OS X 10.12.6
Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz
join key c2:               Best/Avg Time(ms)   Rate(M/s)   Per Row(ns)   Relative
Is infer filters false           2368 / 2498         0.0   236803743.1       1.0X
Is infer filters true            1234 / 1268         0.0   123443912.3       1.9X

Java HotSpot(TM) 64-Bit Server VM 1.8.0_151-b12 on Mac OS X 10.12.6
Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz
join key c3:               Best/Avg Time(ms)   Rate(M/s)   Per Row(ns)   Relative
Is infer filters false           2754 / 2907         0.0   275376009.7       1.0X
Is infer filters true            2237 / 2255         0.0   223739457.8       1.2X
```

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/wangyum/spark SPARK-25784

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/22778.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #22778

commit c8d1b91b93e7ad05ca0bd17984fad1c30062d504
Author: Yuming Wang
Date: 2018-10-20T01:39:51Z

    Infer filters from constraints after rewriting predicate subquery
[GitHub] spark issue #22750: [SPARK-25747][SQL] remove ColumnarBatchScan.needsUnsafeR...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22750 **[Test build #97628 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97628/testReport)** for PR 22750 at commit [`0dae572`](https://github.com/apache/spark/commit/0dae5723e815bd8c49982530068aa3d45187bfde).
[GitHub] spark issue #22750: [SPARK-25747][SQL] remove ColumnarBatchScan.needsUnsafeR...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22750 Merged build finished. Test PASSed.
[GitHub] spark issue #22750: [SPARK-25747][SQL] remove ColumnarBatchScan.needsUnsafeR...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22750 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/4138/ Test PASSed.
[GitHub] spark pull request #22750: [SPARK-25747][SQL] remove ColumnarBatchScan.needs...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/22750#discussion_r226812447

--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/ColumnarBatchScan.scala ---
@@ -164,12 +162,11 @@ private[sql] trait ColumnarBatchScan extends CodegenSupport {
     val outputVars = output.zipWithIndex.map { case (a, i) =>
--- End diff --

good catch!
[GitHub] spark issue #22501: [SPARK-25492][TEST] Refactor WideSchemaBenchmark to use ...
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22501 Thank you all for refreshing the benchmarks and results! It's very helpful. If possible, can we post the perf regressions we found in the umbrella JIRA? Then people can see whether each perf regression is reasonable (if we have addressed it) or investigate how the regression was introduced. Thanks!
[GitHub] spark issue #22763: [SPARK-25764][ML][EXAMPLES] Update BisectingKMeans examp...
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22763 This has been reverted from master/2.4
[GitHub] spark pull request #22764: [SPARK-25765][ML] Add training cost to BisectingK...
Github user WeichenXu123 commented on a diff in the pull request: https://github.com/apache/spark/pull/22764#discussion_r226812121

--- Diff: mllib/src/main/scala/org/apache/spark/mllib/clustering/BisectingKMeansModel.scala ---
@@ -225,13 +227,14 @@ object BisectingKMeansModel extends Loader[BisectingKMeansModel] {
       assert(formatVersion == thisFormatVersion)
       val rootId = (metadata \ "rootId").extract[Int]
       val distanceMeasure = (metadata \ "distanceMeasure").extract[String]
+      val trainingCost = (metadata \ "trainingCost").extract[Double]
--- End diff --

@mgaido91 What about this approach? In the `ml` reader, we can load `trainingCost`, construct an `mllib.clustering.BisectingKMeansModel` with the `trainingCost` argument, and then construct an `ml.clustering.BisectingKMeansModel` to wrap that `mllib` model.

**Then, in the `ml` reader, we can check the Spark major/minor version to keep backwards compatibility, which is the more important thing.**

**In the `mllib` loader we can skip loading `trainingCost`: because `mllib` is deprecated, we can skip adding new features to it, but we cannot break backwards compatibility (your change breaks backwards compatibility).**
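The version check proposed in this comment could look roughly like the sketch below. It is an illustration in plain Java under stated assumptions, not Spark's reader code. The class `TrainingCostCompat`, its methods, the `Map`-based metadata, and the idea of passing in the first version that wrote the field (`firstMajor`, `firstMinor`) are all hypothetical; the thread does not say which release would first persist `trainingCost`.

```java
import java.util.Map;

// Hypothetical sketch of a version-gated metadata read; not Spark's API.
final class TrainingCostCompat {

    /** Parses "major.minor[.rest]" (for example "2.4.0-SNAPSHOT") into {major, minor}. */
    public static int[] majorMinor(String version) {
        String[] parts = version.split("\\.");
        return new int[] {Integer.parseInt(parts[0]), Integer.parseInt(parts[1])};
    }

    /**
     * Returns the saved training cost when the model was written by a version
     * that persisted it; otherwise returns the caller's fallback so that
     * models saved by older versions still load.
     * firstMajor/firstMinor are assumed inputs naming the first such version.
     */
    public static double trainingCost(String savedVersion, Map<String, Double> metadata,
                                      int firstMajor, int firstMinor, double fallback) {
        int[] v = majorMinor(savedVersion);
        boolean fieldWritten =
            v[0] > firstMajor || (v[0] == firstMajor && v[1] >= firstMinor);
        if (!fieldWritten) {
            return fallback; // older model: the field was never saved
        }
        Double cost = metadata.get("trainingCost");
        if (cost == null) {
            throw new IllegalStateException("trainingCost missing from metadata");
        }
        return cost;
    }
}
```

Gating on the saved version rather than on the presence of the key means a newer model with a missing field fails loudly instead of silently falling back. That choice is made for this sketch only and is not something the comment above prescribes.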
[GitHub] spark issue #22501: [SPARK-25492][TEST] Refactor WideSchemaBenchmark to use ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22501 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/4137/ Test PASSed.