[GitHub] spark issue #17417: [DOCS] Docs-only improvements
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17417 **[Test build #75388 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75388/testReport)** for PR 17417 at commit [`ae57b33`](https://github.com/apache/spark/commit/ae57b33a12e26c2b2c512d35c33ff8663f4f3373).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.

---
If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA.
---
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #16578: [SPARK-4502][SQL] Parquet nested column pruning
Github user Gauravshah commented on the issue: https://github.com/apache/spark/pull/16578 Can I do something to help this pull request?
[GitHub] spark issue #17450: [SPARK-20121][SQL] simplify NullPropagation with NullInt...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17450 **[Test build #75390 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75390/testReport)** for PR 17450 at commit [`63287ef`](https://github.com/apache/spark/commit/63287ef766b779255054bc463a9e3f9e49149083).
[GitHub] spark issue #17443: [DOCS][MINOR] Fixed a few typos in the Structured Stream...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17443 **[Test build #3620 has finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3620/testReport)** for PR 17443 at commit [`a2faf88`](https://github.com/apache/spark/commit/a2faf88e61c2b634dd81fdaea565234bc5d6012d).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #17443: [DOCS][MINOR] Fixed a few typos in the Structured Stream...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17443 **[Test build #3620 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3620/testReport)** for PR 17443 at commit [`a2faf88`](https://github.com/apache/spark/commit/a2faf88e61c2b634dd81fdaea565234bc5d6012d).
[GitHub] spark issue #17443: [DOCS][MINOR] Fixed a few typos in the Structured Stream...
Github user aseigneurin commented on the issue: https://github.com/apache/spark/pull/17443 @srowen does this mean I should open a JIRA for this?
[GitHub] spark issue #17458: [SPARK-20127][CORE] few warning have been fixed which In...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17458 **[Test build #3619 has finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3619/testReport)** for PR 17458 at commit [`4788bbe`](https://github.com/apache/spark/commit/4788bbef279828d96feead227b4b450e96493d4c).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #17472: [SPARK-19999]: Workaround JDK-8165231 to identify PPC64 ...
Github user samelamin commented on the issue: https://github.com/apache/spark/pull/17472 @gatorsmile @srowen how does it look now?
[GitHub] spark issue #17430: [SPARK-20096][Spark Submit][Minor]Expose the right queue...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17430 Merged build finished. Test PASSed.
[GitHub] spark issue #17430: [SPARK-20096][Spark Submit][Minor]Expose the right queue...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17430 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/75387/
[GitHub] spark issue #17430: [SPARK-20096][Spark Submit][Minor]Expose the right queue...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17430 **[Test build #75387 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75387/testReport)** for PR 17430 at commit [`77dadfd`](https://github.com/apache/spark/commit/77dadfdf116c7b6e0385582ced746d83a960adb5).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #17472: [SPARK-19999]: Workaround JDK-8165231 to identify PPC64 ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17472 Merged build finished. Test PASSed.
[GitHub] spark issue #17472: [SPARK-19999]: Workaround JDK-8165231 to identify PPC64 ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17472 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/75386/
[GitHub] spark issue #17472: [SPARK-19999]: Workaround JDK-8165231 to identify PPC64 ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17472 **[Test build #75386 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75386/testReport)** for PR 17472 at commit [`632161b`](https://github.com/apache/spark/commit/632161b299ac37a598083b1c2995e9becbddc33c).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #17477: [SPARK-18692][BUILD][DOCS] Test Java 8 unidoc build on J...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17477 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/75385/
[GitHub] spark issue #17477: [SPARK-18692][BUILD][DOCS] Test Java 8 unidoc build on J...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17477 Merged build finished. Test PASSed.
[GitHub] spark issue #17477: [SPARK-18692][BUILD][DOCS] Test Java 8 unidoc build on J...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17477 **[Test build #75385 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75385/testReport)** for PR 17477 at commit [`7a7cf04`](https://github.com/apache/spark/commit/7a7cf04db7c2a1ffe1fc3cde1e1abdc99481b618).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark pull request #17476: [SPARK-20151][SQL] Account for partition pruning ...
Github user adrian-ionescu commented on a diff in the pull request: https://github.com/apache/spark/pull/17476#discussion_r108906608

```
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/CatalogFileIndex.scala ---
@@ -111,7 +113,8 @@ private class PrunedInMemoryFileIndex(
     sparkSession: SparkSession,
     tableBasePath: Path,
     fileStatusCache: FileStatusCache,
-    override val partitionSpec: PartitionSpec)
+    override val partitionSpec: PartitionSpec,
+    override val metadataOpsTimeNs: Option[Long])
--- End diff --
```

Add param doc, as it's not immediately obvious what a user is supposed to supply here. I'd say something like "time it took to obtain the partitionSpec from the Hive metastore", but maybe that's too specific..
[GitHub] spark issue #17416: [SPARK-20075][CORE][WIP] Support classifier, packaging i...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17416 Merged build finished. Test FAILed.
[GitHub] spark issue #17416: [SPARK-20075][CORE][WIP] Support classifier, packaging i...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17416 **[Test build #75389 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75389/testReport)** for PR 17416 at commit [`dec7bfb`](https://github.com/apache/spark/commit/dec7bfb8911fc03c8813fe809c3b09014a659791).
* This patch **fails Scala style tests**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #17416: [SPARK-20075][CORE][WIP] Support classifier, packaging i...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17416 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/75389/
[GitHub] spark issue #17416: [SPARK-20075][CORE][WIP] Support classifier, packaging i...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17416 **[Test build #75389 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75389/testReport)** for PR 17416 at commit [`dec7bfb`](https://github.com/apache/spark/commit/dec7bfb8911fc03c8813fe809c3b09014a659791).
[GitHub] spark pull request #17329: [SPARK-19991]FileSegmentManagedBuffer performance...
Github user witgo closed the pull request at: https://github.com/apache/spark/pull/17329
[GitHub] spark issue #17416: [SPARK-20075][CORE][WIP] Support classifier, packaging i...
Github user srowen commented on the issue: https://github.com/apache/spark/pull/17416 Yeah, it clearly downloads the right models .jar file, puts it in the .ivy cache correctly (I can find it there as expected) but then `spark.jars` isn't set correctly: ``` ... (spark.jars,file:/Users/srowen/.ivy2/jars/edu.stanford.nlp_stanford-corenlp-3.4.1.jar) ... ``` Let me push some more updates including the one you suggested, which didn't seem to change this, but I wouldn't expect it to. Still tracing down exactly what sets this value of spark.jars.
[GitHub] spark issue #17417: [DOCS] Docs-only improvements
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17417 **[Test build #75388 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75388/testReport)** for PR 17417 at commit [`ae57b33`](https://github.com/apache/spark/commit/ae57b33a12e26c2b2c512d35c33ff8663f4f3373).
[GitHub] spark pull request #17452: [SPARK-20123][build]$SPARK_HOME variable might ha...
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/17452#discussion_r108898852

```
--- Diff: dev/make-distribution.sh ---
@@ -217,43 +217,43 @@ fi
 # Make R package - this is used for both CRAN release and packing R layout into distribution
 if [ "$MAKE_R" == "true" ]; then
   echo "Building R source package"
-  R_PACKAGE_VERSION=`grep Version $SPARK_HOME/R/pkg/DESCRIPTION | awk '{print $NF}'`
+  R_PACKAGE_VERSION=`grep Version "$SPARK_HOME/R/pkg/DESCRIPTION" | awk '{print $NF}'`
   pushd "$SPARK_HOME/R" > /dev/null
   # Build source package and run full checks
   # Do not source the check-cran.sh - it should be run from where it is for it to set SPARK_HOME
-  NO_TESTS=1 "$SPARK_HOME/"R/check-cran.sh
+  NO_TESTS=1 "$SPARK_HOME/R/check-cran.sh"
   # Move R source package to match the Spark release version if the versions are not the same.
   # NOTE(shivaram): `mv` throws an error on Linux if source and destination are same file
   if [ "$R_PACKAGE_VERSION" != "$VERSION" ]; then
-    mv $SPARK_HOME/R/SparkR_"$R_PACKAGE_VERSION".tar.gz $SPARK_HOME/R/SparkR_"$VERSION".tar.gz
+    mv "$SPARK_HOME/R/SparkR_$R_PACKAGE_VERSION.tar.gz" "$SPARK_HOME/R/SparkR_$VERSION.tar.gz"
   fi
   # Install source package to get it to generate vignettes rds files, etc.
-  VERSION=$VERSION "$SPARK_HOME/"R/install-source-package.sh
+  VERSION=$VERSION "$SPARK_HOME/R/install-source-package.sh"
   popd > /dev/null
 else
   echo "Skipping building R source package"
 fi

 # Copy other things
-mkdir "$DISTDIR"/conf
-cp "$SPARK_HOME"/conf/*.template "$DISTDIR"/conf
+mkdir "$DISTDIR/conf"
+cp "$SPARK_HOME"/conf/*.template "$DISTDIR/conf"
--- End diff --
```

Sorry, yes that's also correct. But right now this is just one big argument to cp, not 2 or more. Doesn't it need to be more like `cp "$SPARK_HOME"/conf/*.template "$DISTDIR/conf"`?
[GitHub] spark pull request #17452: [SPARK-20123][build]$SPARK_HOME variable might ha...
Github user zuotingbing commented on a diff in the pull request: https://github.com/apache/spark/pull/17452#discussion_r108898299

```
--- Diff: dev/make-distribution.sh ---
@@ -217,43 +217,43 @@ fi
 # Make R package - this is used for both CRAN release and packing R layout into distribution
 if [ "$MAKE_R" == "true" ]; then
   echo "Building R source package"
-  R_PACKAGE_VERSION=`grep Version $SPARK_HOME/R/pkg/DESCRIPTION | awk '{print $NF}'`
+  R_PACKAGE_VERSION=`grep Version "$SPARK_HOME/R/pkg/DESCRIPTION" | awk '{print $NF}'`
   pushd "$SPARK_HOME/R" > /dev/null
   # Build source package and run full checks
   # Do not source the check-cran.sh - it should be run from where it is for it to set SPARK_HOME
-  NO_TESTS=1 "$SPARK_HOME/"R/check-cran.sh
+  NO_TESTS=1 "$SPARK_HOME/R/check-cran.sh"
   # Move R source package to match the Spark release version if the versions are not the same.
   # NOTE(shivaram): `mv` throws an error on Linux if source and destination are same file
   if [ "$R_PACKAGE_VERSION" != "$VERSION" ]; then
-    mv $SPARK_HOME/R/SparkR_"$R_PACKAGE_VERSION".tar.gz $SPARK_HOME/R/SparkR_"$VERSION".tar.gz
+    mv "$SPARK_HOME/R/SparkR_$R_PACKAGE_VERSION.tar.gz" "$SPARK_HOME/R/SparkR_$VERSION.tar.gz"
   fi
   # Install source package to get it to generate vignettes rds files, etc.
-  VERSION=$VERSION "$SPARK_HOME/"R/install-source-package.sh
+  VERSION=$VERSION "$SPARK_HOME/R/install-source-package.sh"
   popd > /dev/null
 else
   echo "Skipping building R source package"
 fi

 # Copy other things
-mkdir "$DISTDIR"/conf
-cp "$SPARK_HOME"/conf/*.template "$DISTDIR"/conf
+mkdir "$DISTDIR/conf"
+cp "$SPARK_HOME"/conf/*.template "$DISTDIR/conf"
--- End diff --
```

Oh no, it would be wrong if we quoted the arg `$SPARK_HOME/conf/*.template` as a whole. It already works well, and the debug output is as follows:

```
+ mkdir '/home/spark build/spark/dist/conf'
+ cp '/home/spark build/spark/conf/docker.properties.template' '/home/spark build/spark/conf/fairscheduler.xml.template' '/home/spark build/spark/conf/log4j.properties.template' '/home/spark build/spark/conf/metrics.properties.template' '/home/spark build/spark/conf/slaves.template' '/home/spark build/spark/conf/spark-defaults.conf.template' '/home/spark build/spark/conf/spark-env.sh.template' '/home/spark build/spark/dist/conf'
```
[GitHub] spark issue #17458: [SPARK-20127][CORE] few warning have been fixed which In...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/17458 It looks good to me too.
[GitHub] spark issue #17458: [SPARK-20127][CORE] few warning have been fixed which In...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17458 **[Test build #3619 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3619/testReport)** for PR 17458 at commit [`4788bbe`](https://github.com/apache/spark/commit/4788bbef279828d96feead227b4b450e96493d4c).
[GitHub] spark issue #17458: [SPARK-20127][CORE] few warning have been fixed which In...
Github user dbolshak commented on the issue: https://github.com/apache/spark/pull/17458 Corrected. Please take a look. There is only one change (in 2 places) related to calling toSeq which has a comment, but it's not clear whether I can leave my change or not.
[GitHub] spark pull request #17458: [SPARK-20127][CORE] few warning have been fixed w...
Github user dbolshak commented on a diff in the pull request: https://github.com/apache/spark/pull/17458#discussion_r108890304

```
--- Diff: core/src/main/scala/org/apache/spark/ui/jobs/AllStagesPage.scala ---
@@ -34,9 +34,9 @@ private[ui] class AllStagesPage(parent: StagesTab) extends WebUIPage("") {
     listener.synchronized {
       val activeStages = listener.activeStages.values.toSeq
       val pendingStages = listener.pendingStages.values.toSeq
-      val completedStages = listener.completedStages.reverse.toSeq
+      val completedStages = listener.completedStages.reverse
--- End diff --
```

It's not clear to me: are you requesting that this change be reverted, or can we leave my changes?
[GitHub] spark issue #17430: [SPARK-20096][Spark Submit][Minor]Expose the right queue...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17430 **[Test build #75387 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75387/testReport)** for PR 17430 at commit [`77dadfd`](https://github.com/apache/spark/commit/77dadfdf116c7b6e0385582ced746d83a960adb5).
[GitHub] spark issue #17477: [SPARK-18692][BUILD][DOCS] Test Java 8 unidoc build on J...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17477 Merged build finished. Test PASSed.
[GitHub] spark issue #17477: [SPARK-18692][BUILD][DOCS] Test Java 8 unidoc build on J...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17477 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/75381/
[GitHub] spark issue #17477: [SPARK-18692][BUILD][DOCS] Test Java 8 unidoc build on J...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17477 **[Test build #75381 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75381/testReport)** for PR 17477 at commit [`7ddb6eb`](https://github.com/apache/spark/commit/7ddb6eb11ed17c87355826db5bf2512b785042f5). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #17443: typos
Github user srowen commented on the issue: https://github.com/apache/spark/pull/17443 @aseigneurin Please read http://spark.apache.org/contributing.html too. As you might imagine, "typos" isn't useful as a title. Write "[MINOR]" too.
[GitHub] spark issue #17472: [SPARK-19999]: Workaround JDK-8165231 to identify PPC64 ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17472 **[Test build #75386 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75386/testReport)** for PR 17472 at commit [`632161b`](https://github.com/apache/spark/commit/632161b299ac37a598083b1c2995e9becbddc33c).
[GitHub] spark issue #17430: [SPARK-20096][Spark Submit][Minor]Expose the right queue...
Github user srowen commented on the issue: https://github.com/apache/spark/pull/17430 Jenkins retest this please
[GitHub] spark pull request #17452: [SPARK-20123][build]$SPARK_HOME variable might ha...
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/17452#discussion_r108885749 --- Diff: dev/make-distribution.sh --- @@ -217,43 +217,43 @@ fi # Make R package - this is used for both CRAN release and packing R layout into distribution if [ "$MAKE_R" == "true" ]; then echo "Building R source package" - R_PACKAGE_VERSION=`grep Version $SPARK_HOME/R/pkg/DESCRIPTION | awk '{print $NF}'` + R_PACKAGE_VERSION=`grep Version "$SPARK_HOME/R/pkg/DESCRIPTION" | awk '{print $NF}'` pushd "$SPARK_HOME/R" > /dev/null # Build source package and run full checks # Do not source the check-cran.sh - it should be run from where it is for it to set SPARK_HOME - NO_TESTS=1 "$SPARK_HOME/"R/check-cran.sh + NO_TESTS=1 "$SPARK_HOME/R/check-cran.sh" # Move R source package to match the Spark release version if the versions are not the same. # NOTE(shivaram): `mv` throws an error on Linux if source and destination are same file if [ "$R_PACKAGE_VERSION" != "$VERSION" ]; then -mv $SPARK_HOME/R/SparkR_"$R_PACKAGE_VERSION".tar.gz $SPARK_HOME/R/SparkR_"$VERSION".tar.gz +mv "$SPARK_HOME/R/SparkR_$R_PACKAGE_VERSION.tar.gz" "$SPARK_HOME/R/SparkR_$VERSION.tar.gz" fi # Install source package to get it to generate vignettes rds files, etc. - VERSION=$VERSION "$SPARK_HOME/"R/install-source-package.sh + VERSION=$VERSION "$SPARK_HOME/R/install-source-package.sh" popd > /dev/null else echo "Skipping building R source package" fi # Copy other things -mkdir "$DISTDIR"/conf -cp "$SPARK_HOME"/conf/*.template "$DISTDIR"/conf +mkdir "$DISTDIR/conf" +cp "$SPARK_HOME"/conf/*.template "$DISTDIR/conf" --- End diff -- I think the two args need quoting separately?
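srowen's point about quoting the two arguments separately can be sketched in a few lines of shell (the paths below are hypothetical stand-ins, not Spark's real ones): the glob must sit outside the quotes so it expands, while each variable stays quoted so paths containing spaces survive word splitting.

```shell
# Hypothetical directories standing in for $SPARK_HOME and $DISTDIR.
SPARK_HOME="/tmp/spark home"
DISTDIR="/tmp/dist dir"
mkdir -p "$SPARK_HOME/conf" "$DISTDIR/conf"
touch "$SPARK_HOME/conf/a.template" "$SPARK_HOME/conf/b.template"

# Correct: the variable is quoted, the glob is outside the quotes,
# and the destination is quoted as its own word.
cp "$SPARK_HOME"/conf/*.template "$DISTDIR/conf"
ls "$DISTDIR/conf"

# Wrong: quoting the whole word passes the literal '*' through,
# so cp looks for a file actually named '*.template'.
cp "$SPARK_HOME/conf/*.template" "$DISTDIR/conf" 2>/dev/null \
  || echo "literal glob was not expanded"
```

Each glob expansion result is a single word even when the directory name contains spaces, which is why `"$SPARK_HOME"/conf/*.template` is safe while quoting the glob itself is not.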
[GitHub] spark issue #17417: [DOCS] Docs-only improvements
Github user srowen commented on the issue: https://github.com/apache/spark/pull/17417 Looks good, just needs a rebase now
[GitHub] spark pull request #17472: [SPARK-19999]: Accommodate a new architecture tha...
Github user samelamin commented on a diff in the pull request: https://github.com/apache/spark/pull/17472#discussion_r108884506 --- Diff: common/unsafe/src/main/java/org/apache/spark/unsafe/Platform.java --- @@ -46,18 +46,22 @@ private static final boolean unaligned; static { boolean _unaligned; -// use reflection to access unaligned field -try { - Class bitsClass = -Class.forName("java.nio.Bits", false, ClassLoader.getSystemClassLoader()); - Method unalignedMethod = bitsClass.getDeclaredMethod("unaligned"); - unalignedMethod.setAccessible(true); - _unaligned = Boolean.TRUE.equals(unalignedMethod.invoke(null)); -} catch (Throwable t) { - // We at least know x86 and x64 support unaligned access. - String arch = System.getProperty("os.arch", ""); - //noinspection DynamicRegexReplaceableByCompiledPattern - _unaligned = arch.matches("^(i[3-6]86|x86(_64)?|x64|amd64|aarch64)$"); +String arch = System.getProperty("os.arch", ""); +if (arch.matches("^(ppc64le | ppc64)$")) { --- End diff -- haha it's fine, call it rubber duck programming, eh? Sure, I will make those changes, but I have no rights to rename the JIRA. So I will rename this PR; can you get yourself or someone to rename the JIRA?
[GitHub] spark pull request #17472: [SPARK-19999]: Accommodate a new architecture tha...
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/17472#discussion_r108884076 --- Diff: common/unsafe/src/main/java/org/apache/spark/unsafe/Platform.java --- @@ -46,18 +46,22 @@ private static final boolean unaligned; static { boolean _unaligned; -// use reflection to access unaligned field -try { - Class bitsClass = -Class.forName("java.nio.Bits", false, ClassLoader.getSystemClassLoader()); - Method unalignedMethod = bitsClass.getDeclaredMethod("unaligned"); - unalignedMethod.setAccessible(true); - _unaligned = Boolean.TRUE.equals(unalignedMethod.invoke(null)); -} catch (Throwable t) { - // We at least know x86 and x64 support unaligned access. - String arch = System.getProperty("os.arch", ""); - //noinspection DynamicRegexReplaceableByCompiledPattern - _unaligned = arch.matches("^(i[3-6]86|x86(_64)?|x64|amd64|aarch64)$"); +String arch = System.getProperty("os.arch", ""); +if (arch.matches("^(ppc64le | ppc64)$")) { --- End diff -- Sorry, I'm writing one thing and thinking another. I mean alignment, not endianness. Which architectures do you know allow unaligned access? I'd presume all PPC does, and I assume the JDK issue means "PPC64 (big-endian) but also PPC64 little-endian". OK, to be conservative, maybe just check the strings "ppc64" and "ppc64le" as you intended. However your regex doesn't work. You have extra whitespace. Just instead check `arch.equals(...) || arch.equals(...)` Also the PR title should match the JIRA title. I was commenting that it differed here in referring to a flaky test, but it isn't. To be very clear, I propose you rename both to perhaps: "Workaround JDK-8165231 to identify PPC64 architectures as supporting unaligned access"
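The whitespace problem srowen points out can be checked quickly with `grep -E`, whose alternation semantics match Java's `String.matches` for this pattern (a sketch; the `match` helper below is hypothetical, not part of the PR):

```shell
# Prints "yes" if the whole string matches the extended regex, else "no".
match() { printf '%s' "$1" | grep -Eq "$2" && echo yes || echo no; }

# The proposed pattern: the spaces are part of the alternatives, so it
# only matches "ppc64le " (trailing space) or " ppc64" (leading space).
match "ppc64le" '^(ppc64le | ppc64)$'   # no
match "ppc64"   '^(ppc64le | ppc64)$'   # no

# With the stray spaces removed, both architecture strings match.
match "ppc64le" '^(ppc64le|ppc64)$'     # yes
match "ppc64"   '^(ppc64le|ppc64)$'     # yes
```

Plain string comparison, as suggested in the review (`arch.equals("ppc64") || arch.equals("ppc64le")`, or simply `arch.startsWith("ppc64")`), sidesteps the regex pitfall entirely.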
[GitHub] spark issue #17467: [SPARK-20140][DStream] Remove hardcoded kinesis retry wa...
Github user yssharma commented on the issue: https://github.com/apache/spark/pull/17467 You're a gem @HyukjinKwon 💯. I will wait for Tathagata and Burak's inputs then :)
[GitHub] spark pull request #17442: [SPARK-20107][DOC] Add spark.hadoop.mapreduce.fil...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/17442
[GitHub] spark issue #17477: [SPARK-18692][BUILD][DOCS] Test Java 8 unidoc build on J...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17477 **[Test build #75385 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75385/testReport)** for PR 17477 at commit [`7a7cf04`](https://github.com/apache/spark/commit/7a7cf04db7c2a1ffe1fc3cde1e1abdc99481b618).
[GitHub] spark pull request #17477: [SPARK-18692][BUILD][DOCS] Test Java 8 unidoc bui...
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/17477#discussion_r108882566 --- Diff: core/src/main/scala/org/apache/spark/scheduler/TaskSchedulerImpl.scala --- @@ -704,12 +704,12 @@ private[spark] object TaskSchedulerImpl { * Used to balance containers across hosts. * * Accepts a map of hosts to resource offers for that host, and returns a prioritized list of - * resource offers representing the order in which the offers should be used. The resource + * resource offers representing the order in which the offers should be used. The resource * offers are ordered such that we'll allocate one container on each host before allocating a * second container on any host, and so on, in order to reduce the damage if a host fails. * - * For example, given <h1, [o1, o2, o3]>, <h2, [o4]>, <h3, [o5, o6]>, returns - * [o1, o5, o4, 02, o6, o3] + * For example, given a map consisting of h1 to [o1, o2, o3], h2 to [o4] and h3 to [o5, o6], + * returns a list, [o1, o5, o4, o2, o6, o3]. --- End diff -- There look to be a few typos here: 02 -> o2 and h1 -> h3.
[GitHub] spark issue #17442: [SPARK-20107][DOC] Add spark.hadoop.mapreduce.fileoutput...
Github user srowen commented on the issue: https://github.com/apache/spark/pull/17442 Merged to master
[GitHub] spark issue #17251: [SPARK-19910][SQL] `stack` should not reject NULL values...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17251 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/75384/ Test PASSed.
[GitHub] spark issue #17251: [SPARK-19910][SQL] `stack` should not reject NULL values...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17251 Merged build finished. Test PASSed.
[GitHub] spark issue #17251: [SPARK-19910][SQL] `stack` should not reject NULL values...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17251 **[Test build #75384 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75384/testReport)** for PR 17251 at commit [`2150ce5`](https://github.com/apache/spark/commit/2150ce552a7a02d656329761e04a7fcb38e5e648). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request #17472: [SPARK-19999]: Accommodate a new architecture tha...
Github user samelamin commented on a diff in the pull request: https://github.com/apache/spark/pull/17472#discussion_r108881923 --- Diff: common/unsafe/src/main/java/org/apache/spark/unsafe/Platform.java --- @@ -46,18 +46,22 @@ private static final boolean unaligned; static { boolean _unaligned; -// use reflection to access unaligned field -try { - Class bitsClass = -Class.forName("java.nio.Bits", false, ClassLoader.getSystemClassLoader()); - Method unalignedMethod = bitsClass.getDeclaredMethod("unaligned"); - unalignedMethod.setAccessible(true); - _unaligned = Boolean.TRUE.equals(unalignedMethod.invoke(null)); -} catch (Throwable t) { - // We at least know x86 and x64 support unaligned access. - String arch = System.getProperty("os.arch", ""); - //noinspection DynamicRegexReplaceableByCompiledPattern - _unaligned = arch.matches("^(i[3-6]86|x86(_64)?|x64|amd64|aarch64)$"); +String arch = System.getProperty("os.arch", ""); +if (arch.matches("^(ppc64le | ppc64)$")) { --- End diff -- yeah sorry, complete fail, didn't see that lol. With regards to a big-endian PPC64 arch, I don't think so, but I am not 100% sure. That said, as it stands it isn't supported and the test fails. So happy to change the PR or even close it if you do not see any value it adds
[GitHub] spark pull request #17329: [SPARK-19991]FileSegmentManagedBuffer performance...
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/17329#discussion_r108881597 --- Diff: common/network-common/src/main/java/org/apache/spark/network/buffer/FileSegmentManagedBuffer.java --- @@ -37,13 +37,24 @@ * A {@link ManagedBuffer} backed by a segment in a file. */ public final class FileSegmentManagedBuffer extends ManagedBuffer { - private final TransportConf conf; + private final boolean lazyFileDescriptor; + private final int memoryMapBytes; private final File file; private final long offset; private final long length; public FileSegmentManagedBuffer(TransportConf conf, File file, long offset, long length) { -this.conf = conf; +this(conf.lazyFileDescriptor(), conf.memoryMapBytes(), file, offset, length); + } + + public FileSegmentManagedBuffer( --- End diff -- Ping @witgo to update or close
[GitHub] spark pull request #17472: [SPARK-19999]: Accommodate a new architecture tha...
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/17472#discussion_r108880865 --- Diff: common/unsafe/src/main/java/org/apache/spark/unsafe/Platform.java --- @@ -46,18 +46,22 @@ private static final boolean unaligned; static { boolean _unaligned; -// use reflection to access unaligned field -try { - Class bitsClass = -Class.forName("java.nio.Bits", false, ClassLoader.getSystemClassLoader()); - Method unalignedMethod = bitsClass.getDeclaredMethod("unaligned"); - unalignedMethod.setAccessible(true); - _unaligned = Boolean.TRUE.equals(unalignedMethod.invoke(null)); -} catch (Throwable t) { - // We at least know x86 and x64 support unaligned access. - String arch = System.getProperty("os.arch", ""); - //noinspection DynamicRegexReplaceableByCompiledPattern - _unaligned = arch.matches("^(i[3-6]86|x86(_64)?|x64|amd64|aarch64)$"); +String arch = System.getProperty("os.arch", ""); +if (arch.matches("^(ppc64le | ppc64)$")) { --- End diff -- Both of those start with ppc64, what do you mean?
[GitHub] spark pull request #17458: [SPARK-20127][CORE] few warning have been fixed w...
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/17458#discussion_r108880632 --- Diff: core/src/main/scala/org/apache/spark/ui/WebUI.scala --- @@ -116,15 +116,15 @@ private[spark] abstract class WebUI( * @param path Path in UI to unmount. */ def removeStaticHandler(path: String): Unit = { -handlers.find(_.getContextPath() == path).foreach(detachHandler) +handlers.find(_.getContextPath == path).foreach(detachHandler) --- End diff -- @dbolshak this still isn't addressing comments that have been made several times. You can appreciate why I don't think it's worth the time to make changes like this given how much discussion it takes. Here's another example. Please read _all_ previous comments and address all of them or close this.
[GitHub] spark issue #17479: [SPARK-20154][Web UI]In web ui,http://ip:4040/executors/...
Github user srowen commented on the issue: https://github.com/apache/spark/pull/17479 I don't think this is worth changing.
[GitHub] spark pull request #17324: [SPARK-19969] [ML] Imputer doc and example
Github user MLnick commented on a diff in the pull request: https://github.com/apache/spark/pull/17324#discussion_r108880024 --- Diff: examples/src/main/python/ml/imputer_example.py --- @@ -0,0 +1,50 @@ +# +# Licensed to the Apache Software Foundation (ASF) under one or more +# contributor license agreements. See the NOTICE file distributed with +# this work for additional information regarding copyright ownership. +# The ASF licenses this file to You under the Apache License, Version 2.0 +# (the "License"); you may not use this file except in compliance with +# the License. You may obtain a copy of the License at +# +#http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +# + +# $example on$ +from pyspark.ml.feature import Imputer +# $example off$ +from pyspark.sql import SparkSession + +""" +An example demonstrating Imputer. +Run with: + bin/spark-submit examples/src/main/python/ml/imputer_example.py +""" + +if __name__ == "__main__": +spark = SparkSession\ +.builder\ +.appName("ImputerExample")\ +.getOrCreate() + +# $example on$ +df = spark.createDataFrame([ +(1.0, float("nan")), +(2.0, float("nan")), +(float("nan"), 3.0), +(4.0, 4.0), +(5.0, 5.0) +], ["a", "b"]) + +imputer = Imputer(inputCols=["a", "b"], outputCols=["out_a", "out_b"]) +model = imputer.fit(df) + +model.transform(df).select("a", "b", "out_a", "out_b").show() --- End diff -- In previous comment I wasn't totally clear, sorry! I mean let's _only_ have the `transform(df).show()` - so we can remove the `select` here as it's unnecessary.
[GitHub] spark pull request #17479: [SPARK-20154][Web UI]In web ui,http://ip:4040/exe...
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/17479#discussion_r108879364 --- Diff: core/src/main/resources/org/apache/spark/ui/static/executorspage-template.html --- @@ -24,7 +24,7 @@ Summary RDD Blocks Storage Memory + title="Memory used / total available memory for storage of data like RDD partitions cached in memory. ">Storage Memory used/total --- End diff -- Let's post before/after snapshots here and defer to committers. Personally, I am not sure too because the tooltip explains it quite clearly. BTW, I think we should capitalise it just to be consistent with other pages. ![2017-03-30 6 21 24](https://cloud.githubusercontent.com/assets/6477701/24497335/0f62dabc-1576-11e7-9ec4-ad0f3cd8206c.png) ![2017-03-30 6 21 18](https://cloud.githubusercontent.com/assets/6477701/24497339/15032e36-1576-11e7-99d7-8493eea3ae92.png) ![2017-03-30 6 21 15](https://cloud.githubusercontent.com/assets/6477701/24497341/1810be18-1576-11e7-907f-858f692b522b.png) ![2017-03-30 6 21 12](https://cloud.githubusercontent.com/assets/6477701/24497344/1b4013ae-1576-11e7-9599-9f91e4f89317.png)
[GitHub] spark pull request #17472: [SPARK-19999]: Fix for flakey tests due to java.n...
Github user samelamin commented on a diff in the pull request: https://github.com/apache/spark/pull/17472#discussion_r108879329 --- Diff: common/unsafe/src/main/java/org/apache/spark/unsafe/Platform.java --- @@ -46,18 +46,22 @@ private static final boolean unaligned; static { boolean _unaligned; -// use reflection to access unaligned field -try { - Class bitsClass = -Class.forName("java.nio.Bits", false, ClassLoader.getSystemClassLoader()); - Method unalignedMethod = bitsClass.getDeclaredMethod("unaligned"); - unalignedMethod.setAccessible(true); - _unaligned = Boolean.TRUE.equals(unalignedMethod.invoke(null)); -} catch (Throwable t) { - // We at least know x86 and x64 support unaligned access. - String arch = System.getProperty("os.arch", ""); - //noinspection DynamicRegexReplaceableByCompiledPattern - _unaligned = arch.matches("^(i[3-6]86|x86(_64)?|x64|amd64|aarch64)$"); +String arch = System.getProperty("os.arch", ""); +if (arch.matches("^(ppc64le | ppc64)$")) { --- End diff -- that's fine, but it'll end up being `arch.startsWith("ppc64") || arch.startsWith("ppc64le")`. I thought this was less code, but startsWith is more readable. Are you happy with that approach?
[GitHub] spark pull request #17472: [SPARK-19999]: Fix for flakey tests due to java.n...
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/17472#discussion_r108878462 --- Diff: common/unsafe/src/main/java/org/apache/spark/unsafe/Platform.java --- @@ -46,18 +46,22 @@ private static final boolean unaligned; static { boolean _unaligned; -// use reflection to access unaligned field -try { - Class bitsClass = -Class.forName("java.nio.Bits", false, ClassLoader.getSystemClassLoader()); - Method unalignedMethod = bitsClass.getDeclaredMethod("unaligned"); - unalignedMethod.setAccessible(true); - _unaligned = Boolean.TRUE.equals(unalignedMethod.invoke(null)); -} catch (Throwable t) { - // We at least know x86 and x64 support unaligned access. - String arch = System.getProperty("os.arch", ""); - //noinspection DynamicRegexReplaceableByCompiledPattern - _unaligned = arch.matches("^(i[3-6]86|x86(_64)?|x64|amd64|aarch64)$"); +String arch = System.getProperty("os.arch", ""); +if (arch.matches("^(ppc64le | ppc64)$")) { --- End diff -- Still, why not `arch.startsWith("ppc64")`?
[GitHub] spark pull request #17479: [SPARK-20154][Web UI]In web ui,http://ip:4040/exe...
Github user guoxiaolongzte commented on a diff in the pull request: https://github.com/apache/spark/pull/17479#discussion_r108876916 --- Diff: core/src/main/resources/org/apache/spark/ui/static/executorspage-template.html --- @@ -24,7 +24,7 @@ Summary RDD Blocks Storage Memory + title="Memory used / total available memory for storage of data like RDD partitions cached in memory. ">Storage Memory used/total --- End diff -- The memory figures are separated from each other, and if the title separates them too, it is easier for the user to observe and understand. For this slight UI change, the advantages outweigh the disadvantages.
[GitHub] spark pull request #17479: [SPARK-20154][Web UI]In web ui,http://ip:4040/exe...
Github user jerryshao commented on a diff in the pull request: https://github.com/apache/spark/pull/17479#discussion_r108875608 --- Diff: core/src/main/resources/org/apache/spark/ui/static/executorspage-template.html --- @@ -24,7 +24,7 @@ Summary RDD Blocks Storage Memory + title="Memory used / total available memory for storage of data like RDD partitions cached in memory. ">Storage Memory used/total --- End diff -- I don't disagree with your comments, but from my point of view it is slightly duplicated and not really necessary. I believe most users understand what this column means, so I'm conservative about such a change.
[GitHub] spark pull request #17458: [SPARK-20127][CORE] few warning have been fixed w...
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/17458#discussion_r108874836 --- Diff: core/src/main/scala/org/apache/spark/ui/storage/RDDPage.scala --- @@ -42,7 +42,7 @@ private[ui] class RDDPage(parent: StorageTab) extends WebUIPage("rdd") { val blockPage = Option(parameterBlockPage).map(_.toInt).getOrElse(1) val blockSortColumn = Option(parameterBlockSortColumn).getOrElse("Block Name") -val blockSortDesc = Option(parameterBlockSortDesc).map(_.toBoolean).getOrElse(false) +val blockSortDesc = Option(parameterBlockSortDesc).exists(_.toBoolean) --- End diff -- This also looks like the same instance. Let's revert this change too.
[GitHub] spark pull request #17458: [SPARK-20127][CORE] few warning have been fixed w...
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/17458#discussion_r108874623 --- Diff: core/src/main/scala/org/apache/spark/ui/jobs/UIData.scala --- @@ -180,8 +180,8 @@ private[spark] object UIData { speculative = taskInfo.speculative ) newTaskInfo.gettingResultTime = taskInfo.gettingResultTime - newTaskInfo.setAccumulables(taskInfo.accumulables.filter { -accum => !accum.internal && accum.metadata != Some(AccumulatorContext.SQL_ACCUM_IDENTIFIER) + newTaskInfo.setAccumulables(taskInfo.accumulables.filter { acc => +!acc.internal && !acc.metadata.contains(AccumulatorContext.SQL_ACCUM_IDENTIFIER) --- End diff -- We should revert this one too for the same reason in https://github.com/apache/spark/pull/17458/files#r108432518
[GitHub] spark pull request #17458: [SPARK-20127][CORE] few warning have been fixed w...
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/17458#discussion_r108874489 --- Diff: core/src/main/scala/org/apache/spark/ui/jobs/StagesTab.scala --- @@ -35,7 +35,7 @@ private[ui] class StagesTab(parent: SparkUI) extends SparkUITab(parent, "stages" attachPage(new StagePage(this)) attachPage(new PoolPage(this)) - def isFairScheduler: Boolean = progressListener.schedulingMode == Some(SchedulingMode.FAIR) + def isFairScheduler: Boolean = progressListener.schedulingMode.contains(SchedulingMode.FAIR) --- End diff -- It seems this is not reverted yet.
[GitHub] spark issue #17472: [SPARK-19999]: Fix for flakey tests due to java.nio.Bits...
Github user samelamin commented on the issue: https://github.com/apache/spark/pull/17472 Can someone tell me why the tests failed? Is it another flaky test, or something else? The message below doesn't make sense to me :( > Traceback (most recent call last): > File "./dev/run-tests-jenkins.py", line 226, in > main() > File "./dev/run-tests-jenkins.py", line 213, in main > test_result_code, test_result_note = run_tests(tests_timeout) > File "./dev/run-tests-jenkins.py", line 140, in run_tests > test_result_note = ' * This patch **fails %s**.' % failure_note_by_errcode[test_result_code] > KeyError: -9
[GitHub] spark issue #17472: [SPARK-19999]: Fix for flakey tests due to java.nio.Bits...
Github user samelamin commented on the issue: https://github.com/apache/spark/pull/17472 jenkins test this please
[GitHub] spark issue #17458: [SPARK-20127][CORE] few warning have been fixed which In...
Github user dbolshak commented on the issue: https://github.com/apache/spark/pull/17458 @srowen, @HyukjinKwon could you please merge the PR, if it's OK of course?
[GitHub] spark issue #15326: [SPARK-17759] [CORE] Avoid adding duplicate schedulables
Github user erenavsarogullari commented on the issue: https://github.com/apache/spark/pull/15326 Hi @kayousterhout and @markhamstra, - This PR is ready for review in the light of the first-added-wins (Schedulables: `Pool` / `TaskSetManager`) pattern. All feedback is welcome. - Two TaskSchedulerImpl unit test cases failed due to duplicate TaskSet submission, so they have also been fixed via the latest commit: dd302ffb44e01280b42d130bee3ede9c81fd4839 - Also, jenkins needs to be triggered. Thanks.
[GitHub] spark issue #17324: [SPARK-19969] [ML] Imputer doc and example
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17324 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/75383/ Test PASSed.
[GitHub] spark issue #17324: [SPARK-19969] [ML] Imputer doc and example
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17324 Merged build finished. Test PASSed.
[GitHub] spark issue #17324: [SPARK-19969] [ML] Imputer doc and example
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17324 **[Test build #75383 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75383/testReport)** for PR 17324 at commit [`48a1361`](https://github.com/apache/spark/commit/48a136133fe83b5e4c2408e4391c15fdefead901). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #17468: [SPARK-20143][SQL] DataType.fromJson should throw an exc...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/17468 @gatorsmile, could you take a look please?
[GitHub] spark issue #17478: [SPARK-18901][ML]:Require in LR LogisticAggregator is re...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17478 Merged build finished. Test PASSed.
[GitHub] spark issue #17478: [SPARK-18901][ML]:Require in LR LogisticAggregator is re...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17478 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/75382/ Test PASSed.
[GitHub] spark issue #17478: [SPARK-18901][ML]:Require in LR LogisticAggregator is re...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17478 **[Test build #75382 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75382/testReport)** for PR 17478 at commit [`44305bf`](https://github.com/apache/spark/commit/44305bf67a1fecd1923f9ee57122147efabb5702). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request #17479: [SPARK-20154][Web UI]In web ui,http://ip:4040/exe...
Github user guoxiaolongzte commented on a diff in the pull request: https://github.com/apache/spark/pull/17479#discussion_r108863264 --- Diff: core/src/main/resources/org/apache/spark/ui/static/executorspage-template.html --- @@ -24,7 +24,7 @@ Summary RDD Blocks Storage Memory + title="Memory used / total available memory for storage of data like RDD partitions cached in memory. ">Storage Memory used/total --- End diff -- This change makes the column easier for users to understand and observe, and it makes the Web UI clearer and friendlier.
[GitHub] spark pull request #17479: [SPARK-20154][Web UI]In web ui,http://ip:4040/exe...
Github user jerryshao commented on a diff in the pull request: https://github.com/apache/spark/pull/17479#discussion_r108862872 --- Diff: core/src/main/resources/org/apache/spark/ui/static/executorspage-template.html --- @@ -24,7 +24,7 @@ Summary RDD Blocks Storage Memory + title="Memory used / total available memory for storage of data like RDD partitions cached in memory. ">Storage Memory used/total --- End diff -- Is it necessary to change this? I think what the tooltip says is quite clear if you hover over this column.
[GitHub] spark issue #17479: [SPARK-20154][Web UI]In web ui,http://ip:4040/executors/...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/17479 I think a UI change requires a screenshot, as written above. It seems trivial though.
[GitHub] spark issue #17479: [SPARK-20154][Web UI]In web ui,http://ip:4040/executors/...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17479 Can one of the admins verify this patch?
[GitHub] spark pull request #17479: [SPARK-20154][Web UI]In web ui,http://ip:4040/exe...
GitHub user guoxiaolongzte opened a pull request: https://github.com/apache/spark/pull/17479 [SPARK-20154][Web UI]In web ui,http://ip:4040/executors/,the title 'Storage Memory' should… … modify 'Storage Memory used/total' ## What changes were proposed in this pull request? (Please fill in changes proposed in this fix) ## How was this patch tested? (Please explain how this patch was tested. E.g. unit tests, integration tests, manual tests) (If this patch involves UI changes, please attach a screenshot; otherwise, remove this) Please review http://spark.apache.org/contributing.html before opening a pull request. You can merge this pull request into a Git repository by running: $ git pull https://github.com/guoxiaolongzte/spark SPARK-20154 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/17479.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #17479 commit d9cd198b99797929d85fa206738cd772a2e63147 Author: 郭小龙 10207633 Date: 2017-03-30T07:31:49Z In web ui,http://ip:4040/executors/,the title 'Storage Memory' should modify 'Storage Memory used/total'
[GitHub] spark pull request #17419: [SPARK-19634][ML] Multivariate summarizer - dataf...
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/17419#discussion_r108858117 --- Diff: mllib/src/main/scala/org/apache/spark/ml/stat/Summarizer.scala --- @@ -0,0 +1,746 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.spark.ml.stat + +import breeze.{linalg => la} +import breeze.linalg.{Vector => BV} +import breeze.numerics + +import org.apache.spark.SparkException +import org.apache.spark.annotation.Since +import org.apache.spark.internal.Logging +import org.apache.spark.ml.linalg.{DenseVector, SparseVector, Vector, Vectors, VectorUDT} +import org.apache.spark.sql.Column +import org.apache.spark.sql.catalyst.InternalRow +import org.apache.spark.sql.catalyst.expressions.{Expression, UnsafeArrayData, UnsafeProjection, UnsafeRow} +import org.apache.spark.sql.catalyst.expressions.aggregate.{AggregateExpression, Complete, TypedImperativeAggregate} +import org.apache.spark.sql.types._ + + +/** + * A builder object that provides summary statistics about a given column. + * + * Users should not directly create such builders, but instead use one of the methods in + * [[Summarizer]]. 
+ */ +@Since("2.2.0") +abstract class SummaryBuilder { + /** + * Returns an aggregate object that contains the summary of the column with the requested metrics. + * @param column a column that contains Vector object. + * @return an aggregate column that contains the statistics. The exact content of this + * structure is determined during the creation of the builder. + */ + @Since("2.2.0") + def summary(column: Column): Column +} + +/** + * Tools for vectorized statistics on MLlib Vectors. + * + * The methods in this package provide various statistics for Vectors contained inside DataFrames. + * + * This class lets users pick the statistics they would like to extract for a given column. Here is + * an example in Scala: + * {{{ + * val dataframe = ... // Some dataframe containing a feature column + * val allStats = dataframe.select(Summarizer.metrics("min", "max").summary($"features")) + * val Row(min_, max_) = allStats.first() + * }}} + * + * If one wants to get a single metric, shortcuts are also available: + * {{{ + * val meanDF = dataframe.select(Summarizer.mean($"features")) + * val Row(mean_) = meanDF.first() + * }}} + */ +@Since("2.2.0") +object Summarizer extends Logging { + + import SummaryBuilderImpl._ + + /** + * Given a list of metrics, provides a builder that it turns computes metrics from a column. + * + * See the documentation of [[Summarizer]] for an example. + * + * The following metrics are accepted (case sensitive): + * - mean: a vector that contains the coefficient-wise mean. + * - variance: a vector tha contains the coefficient-wise variance. + * - count: the count of all vectors seen. + * - numNonzeros: a vector with the number of non-zeros for each coefficients + * - max: the maximum for each coefficient. + * - min: the minimum for each coefficient. + * - normL2: the Euclidian norm for each coefficient. + * - normL1: the L1 norm of each coefficient (sum of the absolute values). 
+ * @param firstMetric the metric being provided + * @param metrics additional metrics that can be provided. + * @return a builder. + * @throws IllegalArgumentException if one of the metric names is not understood. + */ + @Since("2.2.0") + def metrics(firstMetric: String, metrics: String*): SummaryBuilder = { +val (typedMetrics, computeMetrics) = getRelevantMetrics(Seq(firstMetric) ++ metrics) +new SummaryBuilderImpl(typedMetrics, computeMetrics) + } + + def mean(col: Column): Column = getSingleMetric(col, "mean") + + def variance(col: Column): Column = getSingleMetric(col, "variance") +
[GitHub] spark pull request #17477: [SPARK-18692][BUILD][DOCS] Test Java 8 unidoc bui...
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/17477#discussion_r108857749 --- Diff: core/src/main/scala/org/apache/spark/scheduler/TaskSchedulerImpl.scala --- @@ -704,12 +704,12 @@ private[spark] object TaskSchedulerImpl { * Used to balance containers across hosts. * * Accepts a map of hosts to resource offers for that host, and returns a prioritized list of - * resource offers representing the order in which the offers should be used. The resource + * resource offers representing the order in which the offers should be used. The resource * offers are ordered such that we'll allocate one container on each host before allocating a * second container on any host, and so on, in order to reduce the damage if a host fails. * - * For example, given , , , returns - * [o1, o5, o4, 02, o6, o3] + * For example, given maps from h1 to [o1, o2, o3], from h2 to [o4] and from h3 to [o5, o6], + * returns a list, [o1, o5, o4, o2, o6, o3]. --- End diff -- There look to be a few typos here: 02 -> o2, and h1 -> h3.
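The ordering described in the corrected scaladoc — one container on each host before any host gets a second, yielding [o1, o5, o4, o2, o6, o3] for the example offers — can be reproduced by round-robin over the hosts, visiting hosts with more offers first. The sketch below is a simplified re-implementation for illustration only, not Spark's actual `TaskSchedulerImpl` code, and all names are made up:

```java
import java.util.*;

public class OfferBalancer {
    // Round-robin across hosts, visiting hosts with more offers first, so one
    // container lands on every host before any host receives a second one.
    static List<String> prioritize(Map<String, List<String>> offersByHost) {
        List<Deque<String>> queues = new ArrayList<>();
        offersByHost.values().forEach(v -> queues.add(new ArrayDeque<>(v)));
        queues.sort((a, b) -> b.size() - a.size());   // hosts with more offers first
        List<String> out = new ArrayList<>();
        boolean progress = true;
        while (progress) {                            // one pass = one offer per host
            progress = false;
            for (Deque<String> q : queues) {
                if (!q.isEmpty()) { out.add(q.poll()); progress = true; }
            }
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, List<String>> offers = new LinkedHashMap<>();
        offers.put("h1", Arrays.asList("o1", "o2", "o3"));
        offers.put("h2", Arrays.asList("o4"));
        offers.put("h3", Arrays.asList("o5", "o6"));
        System.out.println(prioritize(offers));   // [o1, o5, o4, o2, o6, o3]
    }
}
```

This reproduces the documented output because the stable sort orders the hosts h1 (3 offers), h3 (2), h2 (1), and each round-robin pass takes one offer from each non-empty host.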
[GitHub] spark pull request #17419: [SPARK-19634][ML] Multivariate summarizer - dataf...
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/17419#discussion_r108856798 --- Diff: mllib/src/main/scala/org/apache/spark/ml/stat/Summarizer.scala --- @@ -0,0 +1,746 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.spark.ml.stat + +import breeze.{linalg => la} +import breeze.linalg.{Vector => BV} +import breeze.numerics + +import org.apache.spark.SparkException +import org.apache.spark.annotation.Since +import org.apache.spark.internal.Logging +import org.apache.spark.ml.linalg.{DenseVector, SparseVector, Vector, Vectors, VectorUDT} +import org.apache.spark.sql.Column +import org.apache.spark.sql.catalyst.InternalRow +import org.apache.spark.sql.catalyst.expressions.{Expression, UnsafeArrayData, UnsafeProjection, UnsafeRow} +import org.apache.spark.sql.catalyst.expressions.aggregate.{AggregateExpression, Complete, TypedImperativeAggregate} +import org.apache.spark.sql.types._ + + +/** + * A builder object that provides summary statistics about a given column. + * + * Users should not directly create such builders, but instead use one of the methods in + * [[Summarizer]]. 
+ */ +@Since("2.2.0") +abstract class SummaryBuilder { + /** + * Returns an aggregate object that contains the summary of the column with the requested metrics. + * @param column a column that contains Vector object. + * @return an aggregate column that contains the statistics. The exact content of this + * structure is determined during the creation of the builder. + */ + @Since("2.2.0") + def summary(column: Column): Column +} + +/** + * Tools for vectorized statistics on MLlib Vectors. + * + * The methods in this package provide various statistics for Vectors contained inside DataFrames. + * + * This class lets users pick the statistics they would like to extract for a given column. Here is + * an example in Scala: + * {{{ + * val dataframe = ... // Some dataframe containing a feature column + * val allStats = dataframe.select(Summarizer.metrics("min", "max").summary($"features")) + * val Row(min_, max_) = allStats.first() + * }}} + * + * If one wants to get a single metric, shortcuts are also available: + * {{{ + * val meanDF = dataframe.select(Summarizer.mean($"features")) + * val Row(mean_) = meanDF.first() + * }}} + */ +@Since("2.2.0") +object Summarizer extends Logging { + + import SummaryBuilderImpl._ + + /** + * Given a list of metrics, provides a builder that it turns computes metrics from a column. + * + * See the documentation of [[Summarizer]] for an example. + * + * The following metrics are accepted (case sensitive): + * - mean: a vector that contains the coefficient-wise mean. + * - variance: a vector tha contains the coefficient-wise variance. + * - count: the count of all vectors seen. + * - numNonzeros: a vector with the number of non-zeros for each coefficients + * - max: the maximum for each coefficient. + * - min: the minimum for each coefficient. + * - normL2: the Euclidian norm for each coefficient. + * - normL1: the L1 norm of each coefficient (sum of the absolute values). 
+ * @param firstMetric the metric being provided + * @param metrics additional metrics that can be provided. + * @return a builder. + * @throws IllegalArgumentException if one of the metric names is not understood. + */ + @Since("2.2.0") + def metrics(firstMetric: String, metrics: String*): SummaryBuilder = { +val (typedMetrics, computeMetrics) = getRelevantMetrics(Seq(firstMetric) ++ metrics) +new SummaryBuilderImpl(typedMetrics, computeMetrics) + } + + def mean(col: Column): Column = getSingleMetric(col, "mean") + + def variance(col: Column): Column = getSingleMetric(col, "variance") +
[GitHub] spark issue #17251: [SPARK-19910][SQL] `stack` should not reject NULL values...
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/17251 Thank you so much for the review, @gatorsmile. I updated the PR.
[GitHub] spark issue #17251: [SPARK-19910][SQL] `stack` should not reject NULL values...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17251 **[Test build #75384 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75384/testReport)** for PR 17251 at commit [`2150ce5`](https://github.com/apache/spark/commit/2150ce552a7a02d656329761e04a7fcb38e5e648).
[GitHub] spark issue #17324: [SPARK-19969] [ML] Imputer doc and example
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17324 **[Test build #75383 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75383/testReport)** for PR 17324 at commit [`48a1361`](https://github.com/apache/spark/commit/48a136133fe83b5e4c2408e4391c15fdefead901).
[GitHub] spark pull request #17251: [SPARK-19910][SQL] `stack` should not reject NULL...
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/17251#discussion_r108855184

```diff
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala ---
@@ -590,6 +591,21 @@ object TypeCoercion {
   }

   /**
+   * Coerces NullTypes of a Stack function to the corresponding column types.
```

--- End diff -- Thanks.
[GitHub] spark issue #17324: [SPARK-19969] [ML] Imputer doc and example
Github user MLnick commented on the issue: https://github.com/apache/spark/pull/17324 Jenkins retest this please
[GitHub] spark pull request #17251: [SPARK-19910][SQL] `stack` should not reject NULL...
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/17251#discussion_r108854836

```diff
--- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercionSuite.scala ---
@@ -707,6 +707,36 @@ class TypeCoercionSuite extends PlanTest {
     )
   }

+  test("type coercion for Stack") {
+    val rule = TypeCoercion.StackCoercion
+
+    ruleTest(rule,
+      Stack(Seq(Literal(3), Literal(1), Literal(2), Literal(null))),
+      Stack(Seq(Literal(3), Literal(1), Literal(2), Literal.create(null, IntegerType))))
+    ruleTest(rule,
+      Stack(Seq(Literal(3), Literal(1.0), Literal(null), Literal(3.0))),
+      Stack(Seq(Literal(3), Literal(1.0), Literal.create(null, DoubleType), Literal(3.0))))
+    ruleTest(rule,
+      Stack(Seq(Literal(3), Literal(null), Literal("2"), Literal("3"))),
+      Stack(Seq(Literal(3), Literal.create(null, StringType), Literal("2"), Literal("3"))))
+
+    ruleTest(rule,
+      Stack(Seq(Literal(2),
+        Literal(1), Literal("2"),
+        Literal(null), Literal(null))),
+      Stack(Seq(Literal(2),
+        Literal(1), Literal("2"),
+        Literal.create(null, IntegerType), Literal.create(null, StringType))))
+
+    ruleTest(rule,
+      Stack(Seq(Subtract(Literal(3), Literal(1)),
+        Literal(1), Literal("2"),
+        Literal(null), Literal(null))),
+      Stack(Seq(Subtract(Literal(3), Literal(1)),
+        Literal(1), Literal("2"),
+        Literal.create(null, IntegerType), Literal.create(null, StringType))))
```

--- End diff -- Right, for that one, I'll add the following [here](https://github.com/apache/spark/pull/17251/files/c9510847c8eeb5f5da3b63c38ac835d1c3491815#diff-01ecdd038c5c2f53f38118912210fef8R722).

```scala
ruleTest(rule,
  Stack(Seq(Literal(3), Literal(null), Literal(null), Literal(null))),
  Stack(Seq(Literal(3), Literal(null), Literal(null), Literal(null))))
```
[GitHub] spark pull request #17251: [SPARK-19910][SQL] `stack` should not reject NULL...
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/17251#discussion_r108854853

```diff
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/GeneratorFunctionSuite.scala ---
@@ -39,9 +39,9 @@ class GeneratorFunctionSuite extends QueryTest with SharedSQLContext {
     checkAnswer(df.selectExpr("stack(3, 1, 2, 3)"), Row(1) :: Row(2) :: Row(3) :: Nil)
     checkAnswer(df.selectExpr("stack(4, 1, 2, 3)"), Row(1) :: Row(2) :: Row(3) :: Row(null) :: Nil)
-    // Various column types
-    checkAnswer(df.selectExpr("stack(3, 1, 1.1, 'a', 2, 2.2, 'b', 3, 3.3, 'c')"),
-      Row(1, 1.1, "a") :: Row(2, 2.2, "b") :: Row(3, 3.3, "c") :: Nil)
+    // Various column types and null values
+    checkAnswer(df.selectExpr("stack(3, 1, 1.1, null, 2, null, 'b', null, 3.3, 'c')"),
+      Row(1, 1.1, null) :: Row(2, null, "b") :: Row(null, 3.3, "c") :: Nil)
```

--- End diff -- Yep.
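The test cases above rely on how `stack(n, v1, v2, ...)` lays its arguments out row-major over `n` rows, padding the final row with NULLs when values run out. A minimal standalone sketch of just that layout, assuming nothing from Spark (plain Scala, with `Option` standing in for SQL values and `None` for NULL; the function name is illustrative):

```scala
// Illustrative sketch (not Spark's implementation) of stack's row layout:
// values are split across numRows rows in row-major order, and cells with
// no corresponding value become NULL (None here).
def stack(numRows: Int, values: Seq[Option[Any]]): Seq[Seq[Option[Any]]] = {
  // Number of output columns per row, rounding up so every value gets a cell.
  val numFields = math.ceil(values.length.toDouble / numRows).toInt
  (0 until numRows).map { row =>
    (0 until numFields).map { col =>
      val i = row * numFields + col
      if (i < values.length) values(i) else None // missing cells become NULL
    }
  }
}
```

For example, asking for four rows over three values yields a fourth all-`None` row, mirroring the `Row(null)` produced by `stack(4, 1, 2, 3)` in the suite above.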
[GitHub] spark pull request #17251: [SPARK-19910][SQL] `stack` should not reject NULL...
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/17251#discussion_r108854595

```diff
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/generators.scala ---
@@ -156,9 +157,21 @@ case class Stack(children: Seq[Expression]) extends Generator {
     }
   }

+  private def findDataType(column: Integer): DataType = {
+    // Find the first data type except NullType
+    for (i <- (column + 1) until children.length by numFields) {
+      if (children(i).dataType != NullType) {
+        return children(i).dataType
+      }
+    }
+    // If all values of the column are NullType, use it.
+    children(column + 1).dataType
+  }
+
   override def elementSchema: StructType =
     StructType(children.tail.take(numFields).zipWithIndex.map {
-      case (e, index) => StructField(s"col$index", e.dataType)
+      case (e, index) if e.dataType != NullType => StructField(s"col$index", e.dataType)
+      case (_, index) => StructField(s"col$index", findDataType(index))
```

--- End diff -- Sure!
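The index arithmetic in `findDataType` can be exercised on its own: column `col`'s values sit at child indices `col + 1, col + 1 + numFields, col + 1 + 2*numFields, ...`, where the `+ 1` skips the leading row count. A self-contained sketch of that lookup over plain type-name strings instead of Spark's `DataType` (the function name and string encoding are illustrative, not Spark API):

```scala
// Illustrative sketch of the column-type lookup in findDataType above.
// children is laid out as (numRows, r0c0, r0c1, ..., r1c0, r1c1, ...),
// so a column's values recur every numFields entries, starting at col + 1.
def findColumnType(children: Seq[String], numFields: Int, column: Int): String = {
  // Scan the column and return the first type that is not NullType.
  for (i <- (column + 1) until children.length by numFields) {
    if (children(i) != "NullType") return children(i)
  }
  // Every value in the column is NULL, so NullType is the only choice left.
  children(column + 1)
}
```

The all-NULL fallback in the last line is what keeps a column like `stack(3, null, null, null)` at `NullType` rather than failing.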
[GitHub] spark issue #17478: [SPARK-18901][ML]:Require in LR LogisticAggregator is re...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17478 **[Test build #75382 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75382/testReport)** for PR 17478 at commit [`44305bf`](https://github.com/apache/spark/commit/44305bf67a1fecd1923f9ee57122147efabb5702).
[GitHub] spark issue #17251: [SPARK-19910][SQL] `stack` should not reject NULL values...
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/17251 LGTM except a few comments. cc @cloud-fan for the final sign-off.
[GitHub] spark issue #17375: [SPARK-19019][PYTHON][BRANCH-1.6] Fix hijacked `collecti...
Github user zero323 commented on the issue: https://github.com/apache/spark/pull/17375 @holdenk At least some users have more control over the Python environment than over the whole cluster setup, and with Anaconda now defaulting to 3.6, it is an annoyance. Assuming there will be a 1.6.4, it makes more sense to patch than to document a maximum supported version.
[GitHub] spark pull request #17478: [SPARK-18901][ML]:Require in LR LogisticAggregato...
GitHub user wangmiao1981 opened a pull request: https://github.com/apache/spark/pull/17478 [SPARK-18901][ML]: Require in LR LogisticAggregator is redundant ## What changes were proposed in this pull request? In MultivariateOnlineSummarizer, `add` and `merge` already check the weights and feature sizes, so the checks in LR are redundant; this PR removes them. ## How was this patch tested? Existing tests. You can merge this pull request into a Git repository by running: $ git pull https://github.com/wangmiao1981/spark logit Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/17478.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #17478 commit 44305bf67a1fecd1923f9ee57122147efabb5702 Author: wm...@hotmail.com Date: 2017-03-30T07:00:18Z remove redudant check
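The redundancy the PR describes can be seen in miniature: when the aggregate a caller delegates to already enforces an invariant with its own `require`, a caller-side check repeats the same validation and adds nothing. A toy sketch under that assumption, not Spark's actual classes (`Summarizer` and `addInstance` are made-up names):

```scala
// Illustrative-only sketch of the redundancy described in the PR: the
// summarizer's own add() already validates dimensions, so a wrapper
// repeating that require() before delegating would be dead weight.
class Summarizer(numFeatures: Int) {
  private var count = 0L

  def add(features: Array[Double]): this.type = {
    // The summarizer itself enforces the feature-size invariant...
    require(features.length == numFeatures,
      s"Dimensions mismatch: expected $numFeatures but got ${features.length}")
    count += 1
    this
  }

  // ...so callers can delegate without re-checking the size.
  def addInstance(features: Array[Double]): this.type = add(features)

  def n: Long = count
}
```

A malformed input still fails with the same `IllegalArgumentException` whether or not the wrapper repeats the check, which is why removing the outer `require` is behavior-preserving.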
[GitHub] spark issue #17472: [SPARK-19999]: Fix for flakey tests due to java.nio.Bits...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17472 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/75380/ Test FAILed.
[GitHub] spark issue #17472: [SPARK-19999]: Fix for flakey tests due to java.nio.Bits...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17472 Merged build finished. Test FAILed.
[GitHub] spark issue #17476: [SPARK-20151][SQL] Account for partition pruning in scan...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17476 Merged build finished. Test PASSed.
[GitHub] spark issue #17476: [SPARK-20151][SQL] Account for partition pruning in scan...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17476 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/75379/ Test PASSed.
[GitHub] spark issue #17477: [SPARK-18692][BUILD][DOCS] Test Java 8 unidoc build on J...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/17477 FYI, if I haven't missed something, all the cases are instances of the ones previously fixed. cc @joshrosen, @srowen and @jkbradley. Could you take a look and see if it makes sense?