[GitHub] spark issue #17417: [DOCS] Docs-only improvements
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17417 **[Test build #75388 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75388/testReport)** for PR 17417 at commit [`ae57b33`](https://github.com/apache/spark/commit/ae57b33a12e26c2b2c512d35c33ff8663f4f3373).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.

---
If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA.
---
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #16578: [SPARK-4502][SQL] Parquet nested column pruning
Github user Gauravshah commented on the issue: https://github.com/apache/spark/pull/16578 Can I do something to help this pull request?
[GitHub] spark issue #17450: [SPARK-20121][SQL] simplify NullPropagation with NullInt...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17450 **[Test build #75390 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75390/testReport)** for PR 17450 at commit [`63287ef`](https://github.com/apache/spark/commit/63287ef766b779255054bc463a9e3f9e49149083).
[GitHub] spark issue #17443: [DOCS][MINOR] Fixed a few typos in the Structured Stream...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17443 **[Test build #3620 has finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3620/testReport)** for PR 17443 at commit [`a2faf88`](https://github.com/apache/spark/commit/a2faf88e61c2b634dd81fdaea565234bc5d6012d).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #17443: [DOCS][MINOR] Fixed a few typos in the Structured Stream...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17443 **[Test build #3620 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3620/testReport)** for PR 17443 at commit [`a2faf88`](https://github.com/apache/spark/commit/a2faf88e61c2b634dd81fdaea565234bc5d6012d).
[GitHub] spark issue #17443: [DOCS][MINOR] Fixed a few typos in the Structured Stream...
Github user aseigneurin commented on the issue: https://github.com/apache/spark/pull/17443 @srowen does this mean I should open a JIRA for this?
[GitHub] spark issue #17458: [SPARK-20127][CORE] few warning have been fixed which In...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17458 **[Test build #3619 has finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3619/testReport)** for PR 17458 at commit [`4788bbe`](https://github.com/apache/spark/commit/4788bbef279828d96feead227b4b450e96493d4c).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #17472: [SPARK-19999]: Workaround JDK-8165231 to identify PPC64 ...
Github user samelamin commented on the issue: https://github.com/apache/spark/pull/17472 @gatorsmile @srowen how does it look now?
[GitHub] spark issue #17430: [SPARK-20096][Spark Submit][Minor]Expose the right queue...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17430 Merged build finished. Test PASSed.
[GitHub] spark issue #17430: [SPARK-20096][Spark Submit][Minor]Expose the right queue...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17430 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/75387/
[GitHub] spark issue #17430: [SPARK-20096][Spark Submit][Minor]Expose the right queue...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17430 **[Test build #75387 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75387/testReport)** for PR 17430 at commit [`77dadfd`](https://github.com/apache/spark/commit/77dadfdf116c7b6e0385582ced746d83a960adb5).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #17472: [SPARK-19999]: Workaround JDK-8165231 to identify PPC64 ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17472 Merged build finished. Test PASSed.
[GitHub] spark issue #17472: [SPARK-19999]: Workaround JDK-8165231 to identify PPC64 ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17472 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/75386/
[GitHub] spark issue #17472: [SPARK-19999]: Workaround JDK-8165231 to identify PPC64 ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17472 **[Test build #75386 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75386/testReport)** for PR 17472 at commit [`632161b`](https://github.com/apache/spark/commit/632161b299ac37a598083b1c2995e9becbddc33c).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #17477: [SPARK-18692][BUILD][DOCS] Test Java 8 unidoc build on J...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17477 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/75385/
[GitHub] spark issue #17477: [SPARK-18692][BUILD][DOCS] Test Java 8 unidoc build on J...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17477 Merged build finished. Test PASSed.
[GitHub] spark issue #17477: [SPARK-18692][BUILD][DOCS] Test Java 8 unidoc build on J...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17477 **[Test build #75385 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75385/testReport)** for PR 17477 at commit [`7a7cf04`](https://github.com/apache/spark/commit/7a7cf04db7c2a1ffe1fc3cde1e1abdc99481b618).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark pull request #17476: [SPARK-20151][SQL] Account for partition pruning ...
Github user adrian-ionescu commented on a diff in the pull request: https://github.com/apache/spark/pull/17476#discussion_r108906608

```
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/CatalogFileIndex.scala ---
@@ -111,7 +113,8 @@ private class PrunedInMemoryFileIndex(
     sparkSession: SparkSession,
     tableBasePath: Path,
     fileStatusCache: FileStatusCache,
-    override val partitionSpec: PartitionSpec)
+    override val partitionSpec: PartitionSpec,
+    override val metadataOpsTimeNs: Option[Long])
--- End diff --
```

Add param doc, as it's not immediately obvious what a user is supposed to supply here. I'd say something like "time it took to obtain the partitionSpec from the Hive metastore", but maybe that's too specific..
[GitHub] spark issue #17416: [SPARK-20075][CORE][WIP] Support classifier, packaging i...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17416 Merged build finished. Test FAILed.
[GitHub] spark issue #17416: [SPARK-20075][CORE][WIP] Support classifier, packaging i...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17416 **[Test build #75389 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75389/testReport)** for PR 17416 at commit [`dec7bfb`](https://github.com/apache/spark/commit/dec7bfb8911fc03c8813fe809c3b09014a659791).
* This patch **fails Scala style tests**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #17416: [SPARK-20075][CORE][WIP] Support classifier, packaging i...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17416 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/75389/
[GitHub] spark issue #17416: [SPARK-20075][CORE][WIP] Support classifier, packaging i...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17416 **[Test build #75389 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75389/testReport)** for PR 17416 at commit [`dec7bfb`](https://github.com/apache/spark/commit/dec7bfb8911fc03c8813fe809c3b09014a659791).
[GitHub] spark pull request #17329: [SPARK-19991]FileSegmentManagedBuffer performance...
Github user witgo closed the pull request at: https://github.com/apache/spark/pull/17329
[GitHub] spark issue #17416: [SPARK-20075][CORE][WIP] Support classifier, packaging i...
Github user srowen commented on the issue: https://github.com/apache/spark/pull/17416 Yeah, it clearly downloads the right models .jar file, puts it in the .ivy cache correctly (I can find it there as expected) but then `spark.jars` isn't set correctly: ``` ... (spark.jars,file:/Users/srowen/.ivy2/jars/edu.stanford.nlp_stanford-corenlp-3.4.1.jar) ... ``` Let me push some more updates including the one you suggested, which didn't seem to change this, but I wouldn't expect it to. Still tracing down exactly what sets this value of spark.jars.
[GitHub] spark issue #17417: [DOCS] Docs-only improvements
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17417 **[Test build #75388 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75388/testReport)** for PR 17417 at commit [`ae57b33`](https://github.com/apache/spark/commit/ae57b33a12e26c2b2c512d35c33ff8663f4f3373).
[GitHub] spark pull request #17452: [SPARK-20123][build]$SPARK_HOME variable might ha...
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/17452#discussion_r108898852

```
--- Diff: dev/make-distribution.sh ---
@@ -217,43 +217,43 @@ fi
 # Make R package - this is used for both CRAN release and packing R layout into distribution
 if [ "$MAKE_R" == "true" ]; then
   echo "Building R source package"
-  R_PACKAGE_VERSION=`grep Version $SPARK_HOME/R/pkg/DESCRIPTION | awk '{print $NF}'`
+  R_PACKAGE_VERSION=`grep Version "$SPARK_HOME/R/pkg/DESCRIPTION" | awk '{print $NF}'`
   pushd "$SPARK_HOME/R" > /dev/null
   # Build source package and run full checks
   # Do not source the check-cran.sh - it should be run from where it is for it to set SPARK_HOME
-  NO_TESTS=1 "$SPARK_HOME/"R/check-cran.sh
+  NO_TESTS=1 "$SPARK_HOME/R/check-cran.sh"
   # Move R source package to match the Spark release version if the versions are not the same.
   # NOTE(shivaram): `mv` throws an error on Linux if source and destination are same file
   if [ "$R_PACKAGE_VERSION" != "$VERSION" ]; then
-    mv $SPARK_HOME/R/SparkR_"$R_PACKAGE_VERSION".tar.gz $SPARK_HOME/R/SparkR_"$VERSION".tar.gz
+    mv "$SPARK_HOME/R/SparkR_$R_PACKAGE_VERSION.tar.gz" "$SPARK_HOME/R/SparkR_$VERSION.tar.gz"
   fi
   # Install source package to get it to generate vignettes rds files, etc.
-  VERSION=$VERSION "$SPARK_HOME/"R/install-source-package.sh
+  VERSION=$VERSION "$SPARK_HOME/R/install-source-package.sh"
   popd > /dev/null
 else
   echo "Skipping building R source package"
 fi

 # Copy other things
-mkdir "$DISTDIR"/conf
-cp "$SPARK_HOME"/conf/*.template "$DISTDIR"/conf
+mkdir "$DISTDIR/conf"
+cp "$SPARK_HOME"/conf/*.template "$DISTDIR/conf"
--- End diff --
```

Sorry, yes that's also correct. But right now this is just one big argument to cp, not 2 or more. Doesn't it need to be more like `cp "$SPARK_HOME"/conf/*.template "$DISTDIR/conf"`?
[GitHub] spark pull request #17452: [SPARK-20123][build]$SPARK_HOME variable might ha...
Github user zuotingbing commented on a diff in the pull request: https://github.com/apache/spark/pull/17452#discussion_r108898299

```
--- Diff: dev/make-distribution.sh ---
@@ -217,43 +217,43 @@ fi
 # Make R package - this is used for both CRAN release and packing R layout into distribution
 if [ "$MAKE_R" == "true" ]; then
   echo "Building R source package"
-  R_PACKAGE_VERSION=`grep Version $SPARK_HOME/R/pkg/DESCRIPTION | awk '{print $NF}'`
+  R_PACKAGE_VERSION=`grep Version "$SPARK_HOME/R/pkg/DESCRIPTION" | awk '{print $NF}'`
   pushd "$SPARK_HOME/R" > /dev/null
   # Build source package and run full checks
   # Do not source the check-cran.sh - it should be run from where it is for it to set SPARK_HOME
-  NO_TESTS=1 "$SPARK_HOME/"R/check-cran.sh
+  NO_TESTS=1 "$SPARK_HOME/R/check-cran.sh"
   # Move R source package to match the Spark release version if the versions are not the same.
   # NOTE(shivaram): `mv` throws an error on Linux if source and destination are same file
   if [ "$R_PACKAGE_VERSION" != "$VERSION" ]; then
-    mv $SPARK_HOME/R/SparkR_"$R_PACKAGE_VERSION".tar.gz $SPARK_HOME/R/SparkR_"$VERSION".tar.gz
+    mv "$SPARK_HOME/R/SparkR_$R_PACKAGE_VERSION.tar.gz" "$SPARK_HOME/R/SparkR_$VERSION.tar.gz"
   fi
   # Install source package to get it to generate vignettes rds files, etc.
-  VERSION=$VERSION "$SPARK_HOME/"R/install-source-package.sh
+  VERSION=$VERSION "$SPARK_HOME/R/install-source-package.sh"
   popd > /dev/null
 else
   echo "Skipping building R source package"
 fi

 # Copy other things
-mkdir "$DISTDIR"/conf
-cp "$SPARK_HOME"/conf/*.template "$DISTDIR"/conf
+mkdir "$DISTDIR/conf"
+cp "$SPARK_HOME"/conf/*.template "$DISTDIR/conf"
--- End diff --
```

Oh no, it would be wrong if we quoted the arg `$SPARK_HOME/conf/*.template` as a whole. It already works well, and the debug output is as follows:

```
+ mkdir '/home/spark build/spark/dist/conf'
+ cp '/home/spark build/spark/conf/docker.properties.template' '/home/spark build/spark/conf/fairscheduler.xml.template' '/home/spark build/spark/conf/log4j.properties.template' '/home/spark build/spark/conf/metrics.properties.template' '/home/spark build/spark/conf/slaves.template' '/home/spark build/spark/conf/spark-defaults.conf.template' '/home/spark build/spark/conf/spark-env.sh.template' '/home/spark build/spark/dist/conf'
```
[GitHub] spark issue #17458: [SPARK-20127][CORE] few warning have been fixed which In...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/17458 It looks good to me too.
[GitHub] spark issue #17458: [SPARK-20127][CORE] few warning have been fixed which In...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17458 **[Test build #3619 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3619/testReport)** for PR 17458 at commit [`4788bbe`](https://github.com/apache/spark/commit/4788bbef279828d96feead227b4b450e96493d4c).
[GitHub] spark issue #17458: [SPARK-20127][CORE] few warning have been fixed which In...
Github user dbolshak commented on the issue: https://github.com/apache/spark/pull/17458 Corrected. Please take a look. There is only one change (in 2 places) related to calling toSeq which has a comment, but it's not clear whether I can leave my change or not.
[GitHub] spark pull request #17458: [SPARK-20127][CORE] few warning have been fixed w...
Github user dbolshak commented on a diff in the pull request: https://github.com/apache/spark/pull/17458#discussion_r108890304

```
--- Diff: core/src/main/scala/org/apache/spark/ui/jobs/AllStagesPage.scala ---
@@ -34,9 +34,9 @@ private[ui] class AllStagesPage(parent: StagesTab) extends WebUIPage("") {
     listener.synchronized {
       val activeStages = listener.activeStages.values.toSeq
       val pendingStages = listener.pendingStages.values.toSeq
-      val completedStages = listener.completedStages.reverse.toSeq
+      val completedStages = listener.completedStages.reverse
--- End diff --
```

It's not clear to me: are you requesting that this change be reverted, or can we leave my changes?
[GitHub] spark issue #17430: [SPARK-20096][Spark Submit][Minor]Expose the right queue...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17430 **[Test build #75387 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75387/testReport)** for PR 17430 at commit [`77dadfd`](https://github.com/apache/spark/commit/77dadfdf116c7b6e0385582ced746d83a960adb5).
[GitHub] spark issue #17477: [SPARK-18692][BUILD][DOCS] Test Java 8 unidoc build on J...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17477 Merged build finished. Test PASSed.
[GitHub] spark issue #17477: [SPARK-18692][BUILD][DOCS] Test Java 8 unidoc build on J...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17477 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/75381/
[GitHub] spark issue #17477: [SPARK-18692][BUILD][DOCS] Test Java 8 unidoc build on J...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17477 **[Test build #75381 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75381/testReport)** for PR 17477 at commit [`7ddb6eb`](https://github.com/apache/spark/commit/7ddb6eb11ed17c87355826db5bf2512b785042f5). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #17443: typos
Github user srowen commented on the issue: https://github.com/apache/spark/pull/17443 @aseigneurin Please read http://spark.apache.org/contributing.html too. As you might imagine, "typos" isn't useful as a title. Write "[MINOR]" too.
[GitHub] spark issue #17472: [SPARK-19999]: Workaround JDK-8165231 to identify PPC64 ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17472 **[Test build #75386 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75386/testReport)** for PR 17472 at commit [`632161b`](https://github.com/apache/spark/commit/632161b299ac37a598083b1c2995e9becbddc33c).
[GitHub] spark issue #17430: [SPARK-20096][Spark Submit][Minor]Expose the right queue...
Github user srowen commented on the issue: https://github.com/apache/spark/pull/17430 Jenkins retest this please
[GitHub] spark pull request #17452: [SPARK-20123][build]$SPARK_HOME variable might ha...
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/17452#discussion_r108885749 --- Diff: dev/make-distribution.sh --- @@ -217,43 +217,43 @@ fi # Make R package - this is used for both CRAN release and packing R layout into distribution if [ "$MAKE_R" == "true" ]; then echo "Building R source package" - R_PACKAGE_VERSION=`grep Version $SPARK_HOME/R/pkg/DESCRIPTION | awk '{print $NF}'` + R_PACKAGE_VERSION=`grep Version "$SPARK_HOME/R/pkg/DESCRIPTION" | awk '{print $NF}'` pushd "$SPARK_HOME/R" > /dev/null # Build source package and run full checks # Do not source the check-cran.sh - it should be run from where it is for it to set SPARK_HOME - NO_TESTS=1 "$SPARK_HOME/"R/check-cran.sh + NO_TESTS=1 "$SPARK_HOME/R/check-cran.sh" # Move R source package to match the Spark release version if the versions are not the same. # NOTE(shivaram): `mv` throws an error on Linux if source and destination are same file if [ "$R_PACKAGE_VERSION" != "$VERSION" ]; then -mv $SPARK_HOME/R/SparkR_"$R_PACKAGE_VERSION".tar.gz $SPARK_HOME/R/SparkR_"$VERSION".tar.gz +mv "$SPARK_HOME/R/SparkR_$R_PACKAGE_VERSION.tar.gz" "$SPARK_HOME/R/SparkR_$VERSION.tar.gz" fi # Install source package to get it to generate vignettes rds files, etc. - VERSION=$VERSION "$SPARK_HOME/"R/install-source-package.sh + VERSION=$VERSION "$SPARK_HOME/R/install-source-package.sh" popd > /dev/null else echo "Skipping building R source package" fi # Copy other things -mkdir "$DISTDIR"/conf -cp "$SPARK_HOME"/conf/*.template "$DISTDIR"/conf +mkdir "$DISTDIR/conf" +cp "$SPARK_HOME"/conf/*.template "$DISTDIR/conf" --- End diff -- I think the two args need quoting separately?
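srowen's point about quoting the two arguments separately can be sketched in a few lines of shell (the paths below are hypothetical stand-ins, not Spark's real ones): the glob must sit outside the quotes so it expands, while each variable stays quoted so paths containing spaces survive word splitting.

```shell
# Hypothetical directories standing in for $SPARK_HOME and $DISTDIR.
SPARK_HOME="/tmp/spark home"
DISTDIR="/tmp/dist dir"
mkdir -p "$SPARK_HOME/conf" "$DISTDIR/conf"
touch "$SPARK_HOME/conf/a.template" "$SPARK_HOME/conf/b.template"

# Correct: the variable is quoted, the glob is outside the quotes,
# and the destination is quoted as its own word.
cp "$SPARK_HOME"/conf/*.template "$DISTDIR/conf"
ls "$DISTDIR/conf"

# Wrong: quoting the whole word passes the literal '*' through,
# so cp looks for a file actually named '*.template'.
cp "$SPARK_HOME/conf/*.template" "$DISTDIR/conf" 2>/dev/null \
  || echo "literal glob was not expanded"
```

Each glob expansion result is a single word even when the directory name contains spaces, which is why `"$SPARK_HOME"/conf/*.template` is safe while quoting the glob itself is not.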
[GitHub] spark issue #17417: [DOCS] Docs-only improvements
Github user srowen commented on the issue: https://github.com/apache/spark/pull/17417 Looks good, just needs a rebase now
[GitHub] spark pull request #17472: [SPARK-19999]: Accommodate a new architecture tha...
Github user samelamin commented on a diff in the pull request: https://github.com/apache/spark/pull/17472#discussion_r108884506 --- Diff: common/unsafe/src/main/java/org/apache/spark/unsafe/Platform.java --- @@ -46,18 +46,22 @@ private static final boolean unaligned; static { boolean _unaligned; -// use reflection to access unaligned field -try { - Class bitsClass = -Class.forName("java.nio.Bits", false, ClassLoader.getSystemClassLoader()); - Method unalignedMethod = bitsClass.getDeclaredMethod("unaligned"); - unalignedMethod.setAccessible(true); - _unaligned = Boolean.TRUE.equals(unalignedMethod.invoke(null)); -} catch (Throwable t) { - // We at least know x86 and x64 support unaligned access. - String arch = System.getProperty("os.arch", ""); - //noinspection DynamicRegexReplaceableByCompiledPattern - _unaligned = arch.matches("^(i[3-6]86|x86(_64)?|x64|amd64|aarch64)$"); +String arch = System.getProperty("os.arch", ""); +if (arch.matches("^(ppc64le | ppc64)$")) { --- End diff -- haha it's fine, call it rubber duck programming, eh? Sure, I will make those changes, but I have no rights to rename the JIRA. So I will rename this PR; can you get yourself or someone to rename the JIRA?
[GitHub] spark pull request #17472: [SPARK-19999]: Accommodate a new architecture tha...
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/17472#discussion_r108884076 --- Diff: common/unsafe/src/main/java/org/apache/spark/unsafe/Platform.java --- @@ -46,18 +46,22 @@ private static final boolean unaligned; static { boolean _unaligned; -// use reflection to access unaligned field -try { - Class bitsClass = -Class.forName("java.nio.Bits", false, ClassLoader.getSystemClassLoader()); - Method unalignedMethod = bitsClass.getDeclaredMethod("unaligned"); - unalignedMethod.setAccessible(true); - _unaligned = Boolean.TRUE.equals(unalignedMethod.invoke(null)); -} catch (Throwable t) { - // We at least know x86 and x64 support unaligned access. - String arch = System.getProperty("os.arch", ""); - //noinspection DynamicRegexReplaceableByCompiledPattern - _unaligned = arch.matches("^(i[3-6]86|x86(_64)?|x64|amd64|aarch64)$"); +String arch = System.getProperty("os.arch", ""); +if (arch.matches("^(ppc64le | ppc64)$")) { --- End diff -- Sorry, I'm writing one thing and thinking another. I mean alignment, not endianness. Which architectures do you know allow unaligned access? I'd presume all PPC does, and I assume the JDK issue means "PPC64 (big-endian) but also PPC64 little-endian". OK, to be conservative, maybe just check the strings "ppc64" and "ppc64le" as you intended. However your regex doesn't work. You have extra whitespace. Just instead check `arch.equals(...) || arch.equals(...)` Also the PR title should match the JIRA title. I was commenting that it differed here in referring to a flaky test, but it isn't. To be very clear, I propose you rename both to perhaps: "Workaround JDK-8165231 to identify PPC64 architectures as supporting unaligned access"
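The whitespace problem srowen points out can be checked quickly with `grep -E`, whose alternation semantics match Java's `String.matches` for this pattern (a sketch; the `match` helper below is hypothetical, not part of the PR):

```shell
# Prints "yes" if the whole string matches the extended regex, else "no".
match() { printf '%s' "$1" | grep -Eq "$2" && echo yes || echo no; }

# The proposed pattern: the spaces are part of the alternatives, so it
# only matches "ppc64le " (trailing space) or " ppc64" (leading space).
match "ppc64le" '^(ppc64le | ppc64)$'   # no
match "ppc64"   '^(ppc64le | ppc64)$'   # no

# With the stray spaces removed, both architecture strings match.
match "ppc64le" '^(ppc64le|ppc64)$'     # yes
match "ppc64"   '^(ppc64le|ppc64)$'     # yes
```

Plain string comparison, as suggested in the review (`arch.equals("ppc64") || arch.equals("ppc64le")`, or simply `arch.startsWith("ppc64")`), sidesteps the regex pitfall entirely.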
[GitHub] spark issue #17467: [SPARK-20140][DStream] Remove hardcoded kinesis retry wa...
Github user yssharma commented on the issue: https://github.com/apache/spark/pull/17467 You're a gem @HyukjinKwon 💯. I will wait for Tathagata and Burak's inputs then :)
[GitHub] spark pull request #17442: [SPARK-20107][DOC] Add spark.hadoop.mapreduce.fil...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/17442
[GitHub] spark issue #17477: [SPARK-18692][BUILD][DOCS] Test Java 8 unidoc build on J...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17477 **[Test build #75385 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75385/testReport)** for PR 17477 at commit [`7a7cf04`](https://github.com/apache/spark/commit/7a7cf04db7c2a1ffe1fc3cde1e1abdc99481b618).
[GitHub] spark pull request #17477: [SPARK-18692][BUILD][DOCS] Test Java 8 unidoc bui...
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/17477#discussion_r108882566 --- Diff: core/src/main/scala/org/apache/spark/scheduler/TaskSchedulerImpl.scala --- @@ -704,12 +704,12 @@ private[spark] object TaskSchedulerImpl { * Used to balance containers across hosts. * * Accepts a map of hosts to resource offers for that host, and returns a prioritized list of - * resource offers representing the order in which the offers should be used. The resource + * resource offers representing the order in which the offers should be used. The resource * offers are ordered such that we'll allocate one container on each host before allocating a * second container on any host, and so on, in order to reduce the damage if a host fails. * - * For example, given <h1, [o1, o2, o3]>, <h2, [o4]>, <h3, [o5, o6]>, returns - * [o1, o5, o4, 02, o6, o3] + * For example, given a map consisting of h1 to [o1, o2, o3], h2 to [o4] and h3 to [o5, o6], + * returns a list, [o1, o5, o4, o2, o6, o3]. --- End diff -- There look to be a few typos here: 02 -> o2 and h1 -> h3.
[GitHub] spark issue #17442: [SPARK-20107][DOC] Add spark.hadoop.mapreduce.fileoutput...
Github user srowen commented on the issue: https://github.com/apache/spark/pull/17442 Merged to master
[GitHub] spark issue #17251: [SPARK-19910][SQL] `stack` should not reject NULL values...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17251 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/75384/ Test PASSed.
[GitHub] spark issue #17251: [SPARK-19910][SQL] `stack` should not reject NULL values...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17251 Merged build finished. Test PASSed.
[GitHub] spark issue #17251: [SPARK-19910][SQL] `stack` should not reject NULL values...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17251 **[Test build #75384 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75384/testReport)** for PR 17251 at commit [`2150ce5`](https://github.com/apache/spark/commit/2150ce552a7a02d656329761e04a7fcb38e5e648). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request #17472: [SPARK-19999]: Accommodate a new architecture tha...
Github user samelamin commented on a diff in the pull request: https://github.com/apache/spark/pull/17472#discussion_r108881923 --- Diff: common/unsafe/src/main/java/org/apache/spark/unsafe/Platform.java --- @@ -46,18 +46,22 @@ private static final boolean unaligned; static { boolean _unaligned; -// use reflection to access unaligned field -try { - Class bitsClass = -Class.forName("java.nio.Bits", false, ClassLoader.getSystemClassLoader()); - Method unalignedMethod = bitsClass.getDeclaredMethod("unaligned"); - unalignedMethod.setAccessible(true); - _unaligned = Boolean.TRUE.equals(unalignedMethod.invoke(null)); -} catch (Throwable t) { - // We at least know x86 and x64 support unaligned access. - String arch = System.getProperty("os.arch", ""); - //noinspection DynamicRegexReplaceableByCompiledPattern - _unaligned = arch.matches("^(i[3-6]86|x86(_64)?|x64|amd64|aarch64)$"); +String arch = System.getProperty("os.arch", ""); +if (arch.matches("^(ppc64le | ppc64)$")) { --- End diff -- yeah sorry, complete fail, didn't see that lol. With regards to a big-endian PPC64 arch, I don't think so, but I am not 100% sure. That said, as it stands it isn't supported and the test fails. So happy to change the PR or even close it if you do not see any value it adds
[GitHub] spark pull request #17329: [SPARK-19991]FileSegmentManagedBuffer performance...
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/17329#discussion_r108881597 --- Diff: common/network-common/src/main/java/org/apache/spark/network/buffer/FileSegmentManagedBuffer.java --- @@ -37,13 +37,24 @@ * A {@link ManagedBuffer} backed by a segment in a file. */ public final class FileSegmentManagedBuffer extends ManagedBuffer { - private final TransportConf conf; + private final boolean lazyFileDescriptor; + private final int memoryMapBytes; private final File file; private final long offset; private final long length; public FileSegmentManagedBuffer(TransportConf conf, File file, long offset, long length) { -this.conf = conf; +this(conf.lazyFileDescriptor(), conf.memoryMapBytes(), file, offset, length); + } + + public FileSegmentManagedBuffer( --- End diff -- Ping @witgo to update or close
[GitHub] spark pull request #17472: [SPARK-19999]: Accommodate a new architecture tha...
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/17472#discussion_r108880865 --- Diff: common/unsafe/src/main/java/org/apache/spark/unsafe/Platform.java --- @@ -46,18 +46,22 @@ private static final boolean unaligned; static { boolean _unaligned; -// use reflection to access unaligned field -try { - Class bitsClass = -Class.forName("java.nio.Bits", false, ClassLoader.getSystemClassLoader()); - Method unalignedMethod = bitsClass.getDeclaredMethod("unaligned"); - unalignedMethod.setAccessible(true); - _unaligned = Boolean.TRUE.equals(unalignedMethod.invoke(null)); -} catch (Throwable t) { - // We at least know x86 and x64 support unaligned access. - String arch = System.getProperty("os.arch", ""); - //noinspection DynamicRegexReplaceableByCompiledPattern - _unaligned = arch.matches("^(i[3-6]86|x86(_64)?|x64|amd64|aarch64)$"); +String arch = System.getProperty("os.arch", ""); +if (arch.matches("^(ppc64le | ppc64)$")) { --- End diff -- Both of those start with ppc64, what do you mean?
[GitHub] spark pull request #17458: [SPARK-20127][CORE] few warning have been fixed w...
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/17458#discussion_r108880632 --- Diff: core/src/main/scala/org/apache/spark/ui/WebUI.scala --- @@ -116,15 +116,15 @@ private[spark] abstract class WebUI( * @param path Path in UI to unmount. */ def removeStaticHandler(path: String): Unit = { -handlers.find(_.getContextPath() == path).foreach(detachHandler) +handlers.find(_.getContextPath == path).foreach(detachHandler) --- End diff -- @dbolshak this still isn't addressing comments that have been made several times. You can appreciate why I don't think it's worth the time to make changes like this given how much discussion it takes. Here's another example. Please read _all_ previous comments and address all of them or close this.
[GitHub] spark issue #17479: [SPARK-20154][Web UI]In web ui,http://ip:4040/executors/...
Github user srowen commented on the issue: https://github.com/apache/spark/pull/17479 I don't think this is worth changing.
[GitHub] spark pull request #17324: [SPARK-19969] [ML] Imputer doc and example
Github user MLnick commented on a diff in the pull request: https://github.com/apache/spark/pull/17324#discussion_r108880024 --- Diff: examples/src/main/python/ml/imputer_example.py --- @@ -0,0 +1,50 @@ +# +# Licensed to the Apache Software Foundation (ASF) under one or more +# contributor license agreements. See the NOTICE file distributed with +# this work for additional information regarding copyright ownership. +# The ASF licenses this file to You under the Apache License, Version 2.0 +# (the "License"); you may not use this file except in compliance with +# the License. You may obtain a copy of the License at +# +#http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +# + +# $example on$ +from pyspark.ml.feature import Imputer +# $example off$ +from pyspark.sql import SparkSession + +""" +An example demonstrating Imputer. +Run with: + bin/spark-submit examples/src/main/python/ml/imputer_example.py +""" + +if __name__ == "__main__": +spark = SparkSession\ +.builder\ +.appName("ImputerExample")\ +.getOrCreate() + +# $example on$ +df = spark.createDataFrame([ +(1.0, float("nan")), +(2.0, float("nan")), +(float("nan"), 3.0), +(4.0, 4.0), +(5.0, 5.0) +], ["a", "b"]) + +imputer = Imputer(inputCols=["a", "b"], outputCols=["out_a", "out_b"]) +model = imputer.fit(df) + +model.transform(df).select("a", "b", "out_a", "out_b").show() --- End diff -- In previous comment I wasn't totally clear, sorry! I mean let's _only_ have the `transform(df).show()` - so we can remove the `select` here as it's unnecessary.
[GitHub] spark pull request #17479: [SPARK-20154][Web UI]In web ui,http://ip:4040/exe...
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/17479#discussion_r108879364 --- Diff: core/src/main/resources/org/apache/spark/ui/static/executorspage-template.html --- @@ -24,7 +24,7 @@ Summary RDD Blocks Storage Memory + title="Memory used / total available memory for storage of data like RDD partitions cached in memory. ">Storage Memory used/total --- End diff -- Let's post before/after snapshots here and defer to committers. Personally, I am not sure too because the tooltip explains it quite clearly. BTW, I think we should capitalise it just to be consistent with other pages. ![2017-03-30 6 21 24](https://cloud.githubusercontent.com/assets/6477701/24497335/0f62dabc-1576-11e7-9ec4-ad0f3cd8206c.png) ![2017-03-30 6 21 18](https://cloud.githubusercontent.com/assets/6477701/24497339/15032e36-1576-11e7-99d7-8493eea3ae92.png) ![2017-03-30 6 21 15](https://cloud.githubusercontent.com/assets/6477701/24497341/1810be18-1576-11e7-907f-858f692b522b.png) ![2017-03-30 6 21 12](https://cloud.githubusercontent.com/assets/6477701/24497344/1b4013ae-1576-11e7-9599-9f91e4f89317.png)
[GitHub] spark pull request #17472: [SPARK-19999]: Fix for flakey tests due to java.n...
Github user samelamin commented on a diff in the pull request: https://github.com/apache/spark/pull/17472#discussion_r108879329 --- Diff: common/unsafe/src/main/java/org/apache/spark/unsafe/Platform.java --- @@ -46,18 +46,22 @@ private static final boolean unaligned; static { boolean _unaligned; -// use reflection to access unaligned field -try { - Class bitsClass = -Class.forName("java.nio.Bits", false, ClassLoader.getSystemClassLoader()); - Method unalignedMethod = bitsClass.getDeclaredMethod("unaligned"); - unalignedMethod.setAccessible(true); - _unaligned = Boolean.TRUE.equals(unalignedMethod.invoke(null)); -} catch (Throwable t) { - // We at least know x86 and x64 support unaligned access. - String arch = System.getProperty("os.arch", ""); - //noinspection DynamicRegexReplaceableByCompiledPattern - _unaligned = arch.matches("^(i[3-6]86|x86(_64)?|x64|amd64|aarch64)$"); +String arch = System.getProperty("os.arch", ""); +if (arch.matches("^(ppc64le | ppc64)$")) { --- End diff -- that's fine, but it'll end up being `arch.startsWith("ppc64") || arch.startsWith("ppc64le")`. I thought this was less code, but startsWith is more readable. Are you happy with that approach?
[GitHub] spark pull request #17472: [SPARK-19999]: Fix for flakey tests due to java.n...
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/17472#discussion_r108878462 --- Diff: common/unsafe/src/main/java/org/apache/spark/unsafe/Platform.java --- @@ -46,18 +46,22 @@ private static final boolean unaligned; static { boolean _unaligned; -// use reflection to access unaligned field -try { - Class bitsClass = -Class.forName("java.nio.Bits", false, ClassLoader.getSystemClassLoader()); - Method unalignedMethod = bitsClass.getDeclaredMethod("unaligned"); - unalignedMethod.setAccessible(true); - _unaligned = Boolean.TRUE.equals(unalignedMethod.invoke(null)); -} catch (Throwable t) { - // We at least know x86 and x64 support unaligned access. - String arch = System.getProperty("os.arch", ""); - //noinspection DynamicRegexReplaceableByCompiledPattern - _unaligned = arch.matches("^(i[3-6]86|x86(_64)?|x64|amd64|aarch64)$"); +String arch = System.getProperty("os.arch", ""); +if (arch.matches("^(ppc64le | ppc64)$")) { --- End diff -- Still, why not `arch.startsWith("ppc64")`?
[GitHub] spark pull request #17479: [SPARK-20154][Web UI]In web ui,http://ip:4040/exe...
Github user guoxiaolongzte commented on a diff in the pull request: https://github.com/apache/spark/pull/17479#discussion_r108876916 --- Diff: core/src/main/resources/org/apache/spark/ui/static/executorspage-template.html --- @@ -24,7 +24,7 @@ Summary RDD Blocks Storage Memory + title="Memory used / total available memory for storage of data like RDD partitions cached in memory. ">Storage Memory used/total --- End diff -- The memory figures are separated from each other, and if the title separates them too, it is easier for the user to observe and understand. For this slight UI change, the advantages outweigh the disadvantages.
[GitHub] spark pull request #17479: [SPARK-20154][Web UI]In web ui,http://ip:4040/exe...
Github user jerryshao commented on a diff in the pull request: https://github.com/apache/spark/pull/17479#discussion_r108875608 --- Diff: core/src/main/resources/org/apache/spark/ui/static/executorspage-template.html --- @@ -24,7 +24,7 @@ Summary RDD Blocks Storage Memory + title="Memory used / total available memory for storage of data like RDD partitions cached in memory. ">Storage Memory used/total --- End diff -- I don't disagree with your comments, but from my point of view it is slightly duplicated and not really necessary. I believe most users understand what this column means, so I'm conservative about such a change.
[GitHub] spark pull request #17458: [SPARK-20127][CORE] few warning have been fixed w...
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/17458#discussion_r108874836 --- Diff: core/src/main/scala/org/apache/spark/ui/storage/RDDPage.scala --- @@ -42,7 +42,7 @@ private[ui] class RDDPage(parent: StorageTab) extends WebUIPage("rdd") { val blockPage = Option(parameterBlockPage).map(_.toInt).getOrElse(1) val blockSortColumn = Option(parameterBlockSortColumn).getOrElse("Block Name") -val blockSortDesc = Option(parameterBlockSortDesc).map(_.toBoolean).getOrElse(false) +val blockSortDesc = Option(parameterBlockSortDesc).exists(_.toBoolean) --- End diff -- This also looks like the same instance. Let's revert this change too.
[GitHub] spark pull request #17458: [SPARK-20127][CORE] few warning have been fixed w...
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/17458#discussion_r108874623 --- Diff: core/src/main/scala/org/apache/spark/ui/jobs/UIData.scala --- @@ -180,8 +180,8 @@ private[spark] object UIData { speculative = taskInfo.speculative ) newTaskInfo.gettingResultTime = taskInfo.gettingResultTime - newTaskInfo.setAccumulables(taskInfo.accumulables.filter { -accum => !accum.internal && accum.metadata != Some(AccumulatorContext.SQL_ACCUM_IDENTIFIER) + newTaskInfo.setAccumulables(taskInfo.accumulables.filter { acc => +!acc.internal && !acc.metadata.contains(AccumulatorContext.SQL_ACCUM_IDENTIFIER) --- End diff -- We should revert this one too for the same reason in https://github.com/apache/spark/pull/17458/files#r108432518
[GitHub] spark pull request #17458: [SPARK-20127][CORE] few warning have been fixed w...
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/17458#discussion_r108874489 --- Diff: core/src/main/scala/org/apache/spark/ui/jobs/StagesTab.scala --- @@ -35,7 +35,7 @@ private[ui] class StagesTab(parent: SparkUI) extends SparkUITab(parent, "stages" attachPage(new StagePage(this)) attachPage(new PoolPage(this)) - def isFairScheduler: Boolean = progressListener.schedulingMode == Some(SchedulingMode.FAIR) + def isFairScheduler: Boolean = progressListener.schedulingMode.contains(SchedulingMode.FAIR) --- End diff -- It seems this is not reverted yet.
[GitHub] spark issue #17472: [SPARK-19999]: Fix for flakey tests due to java.nio.Bits...
Github user samelamin commented on the issue: https://github.com/apache/spark/pull/17472 Can someone tell me why the tests failed? Is it another flaky test, or something else? The message below doesn't make sense to me :( > Traceback (most recent call last): > File "./dev/run-tests-jenkins.py", line 226, in > main() > File "./dev/run-tests-jenkins.py", line 213, in main > test_result_code, test_result_note = run_tests(tests_timeout) > File "./dev/run-tests-jenkins.py", line 140, in run_tests > test_result_note = ' * This patch **fails %s**.' % failure_note_by_errcode[test_result_code] > KeyError: -9
[GitHub] spark issue #17472: [SPARK-19999]: Fix for flakey tests due to java.nio.Bits...
Github user samelamin commented on the issue: https://github.com/apache/spark/pull/17472 jenkins test this please
[GitHub] spark issue #17458: [SPARK-20127][CORE] few warning have been fixed which In...
Github user dbolshak commented on the issue: https://github.com/apache/spark/pull/17458 @srowen, @HyukjinKwon could you please merge the PR, if it's OK of course?
[GitHub] spark issue #15326: [SPARK-17759] [CORE] Avoid adding duplicate schedulables
Github user erenavsarogullari commented on the issue: https://github.com/apache/spark/pull/15326 Hi @kayousterhout and @markhamstra, - This PR is ready for review in the light of the first-added-wins (Schedulables: `Pool` / `TaskSetManager`) pattern. All feedback is welcome. - Two TaskSchedulerImpl unit test cases failed due to duplicate TaskSet submission, so they have also been fixed via the latest commit: dd302ffb44e01280b42d130bee3ede9c81fd4839 - Also, jenkins needs to be triggered. Thanks.
[GitHub] spark issue #17324: [SPARK-19969] [ML] Imputer doc and example
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17324 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/75383/ Test PASSed.
[GitHub] spark issue #17324: [SPARK-19969] [ML] Imputer doc and example
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17324 Merged build finished. Test PASSed.
[GitHub] spark issue #17324: [SPARK-19969] [ML] Imputer doc and example
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17324 **[Test build #75383 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75383/testReport)** for PR 17324 at commit [`48a1361`](https://github.com/apache/spark/commit/48a136133fe83b5e4c2408e4391c15fdefead901). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #17468: [SPARK-20143][SQL] DataType.fromJson should throw an exc...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/17468 @gatorsmile, could you take a look please?
[GitHub] spark issue #17478: [SPARK-18901][ML]:Require in LR LogisticAggregator is re...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17478 Merged build finished. Test PASSed.
[GitHub] spark issue #17478: [SPARK-18901][ML]:Require in LR LogisticAggregator is re...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17478 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/75382/ Test PASSed.
[GitHub] spark issue #17478: [SPARK-18901][ML]:Require in LR LogisticAggregator is re...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17478 **[Test build #75382 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75382/testReport)** for PR 17478 at commit [`44305bf`](https://github.com/apache/spark/commit/44305bf67a1fecd1923f9ee57122147efabb5702). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request #17479: [SPARK-20154][Web UI]In web ui,http://ip:4040/exe...
Github user guoxiaolongzte commented on a diff in the pull request: https://github.com/apache/spark/pull/17479#discussion_r108863264 --- Diff: core/src/main/resources/org/apache/spark/ui/static/executorspage-template.html --- @@ -24,7 +24,7 @@ Summary RDD Blocks Storage Memory + title="Memory used / total available memory for storage of data like RDD partitions cached in memory. ">Storage Memory used/total --- End diff -- This change makes the column easier for users to understand and observe, and it makes the Web UI clearer and friendlier.
[GitHub] spark pull request #17479: [SPARK-20154][Web UI]In web ui,http://ip:4040/exe...
Github user jerryshao commented on a diff in the pull request: https://github.com/apache/spark/pull/17479#discussion_r108862872 --- Diff: core/src/main/resources/org/apache/spark/ui/static/executorspage-template.html --- @@ -24,7 +24,7 @@ Summary RDD Blocks Storage Memory + title="Memory used / total available memory for storage of data like RDD partitions cached in memory. ">Storage Memory used/total --- End diff -- Is it necessary to change this? I think what the tooltip says is quite clear if you hover over this column.
[GitHub] spark issue #17479: [SPARK-20154][Web UI]In web ui,http://ip:4040/executors/...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/17479 I think a UI change requires a screenshot, as written above. It seems trivial though.
[GitHub] spark issue #17479: [SPARK-20154][Web UI]In web ui,http://ip:4040/executors/...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17479 Can one of the admins verify this patch?
[GitHub] spark pull request #17479: [SPARK-20154][Web UI]In web ui,http://ip:4040/exe...
GitHub user guoxiaolongzte opened a pull request: https://github.com/apache/spark/pull/17479 [SPARK-20154][Web UI]In web ui,http://ip:4040/executors/,the title 'Storage Memory' should… … modify 'Storage Memory used/total' ## What changes were proposed in this pull request? (Please fill in changes proposed in this fix) ## How was this patch tested? (Please explain how this patch was tested. E.g. unit tests, integration tests, manual tests) (If this patch involves UI changes, please attach a screenshot; otherwise, remove this) Please review http://spark.apache.org/contributing.html before opening a pull request. You can merge this pull request into a Git repository by running: $ git pull https://github.com/guoxiaolongzte/spark SPARK-20154 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/17479.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #17479 commit d9cd198b99797929d85fa206738cd772a2e63147 Author: 郭小龙 10207633 Date: 2017-03-30T07:31:49Z In web ui,http://ip:4040/executors/,the title 'Storage Memory' should modify 'Storage Memory used/total'
[GitHub] spark pull request #17419: [SPARK-19634][ML] Multivariate summarizer - dataf...
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/17419#discussion_r108858117 --- Diff: mllib/src/main/scala/org/apache/spark/ml/stat/Summarizer.scala --- @@ -0,0 +1,746 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.spark.ml.stat + +import breeze.{linalg => la} +import breeze.linalg.{Vector => BV} +import breeze.numerics + +import org.apache.spark.SparkException +import org.apache.spark.annotation.Since +import org.apache.spark.internal.Logging +import org.apache.spark.ml.linalg.{DenseVector, SparseVector, Vector, Vectors, VectorUDT} +import org.apache.spark.sql.Column +import org.apache.spark.sql.catalyst.InternalRow +import org.apache.spark.sql.catalyst.expressions.{Expression, UnsafeArrayData, UnsafeProjection, UnsafeRow} +import org.apache.spark.sql.catalyst.expressions.aggregate.{AggregateExpression, Complete, TypedImperativeAggregate} +import org.apache.spark.sql.types._ + + +/** + * A builder object that provides summary statistics about a given column. + * + * Users should not directly create such builders, but instead use one of the methods in + * [[Summarizer]]. 
+ */ +@Since("2.2.0") +abstract class SummaryBuilder { + /** + * Returns an aggregate object that contains the summary of the column with the requested metrics. + * @param column a column that contains Vector object. + * @return an aggregate column that contains the statistics. The exact content of this + * structure is determined during the creation of the builder. + */ + @Since("2.2.0") + def summary(column: Column): Column +} + +/** + * Tools for vectorized statistics on MLlib Vectors. + * + * The methods in this package provide various statistics for Vectors contained inside DataFrames. + * + * This class lets users pick the statistics they would like to extract for a given column. Here is + * an example in Scala: + * {{{ + * val dataframe = ... // Some dataframe containing a feature column + * val allStats = dataframe.select(Summarizer.metrics("min", "max").summary($"features")) + * val Row(min_, max_) = allStats.first() + * }}} + * + * If one wants to get a single metric, shortcuts are also available: + * {{{ + * val meanDF = dataframe.select(Summarizer.mean($"features")) + * val Row(mean_) = meanDF.first() + * }}} + */ +@Since("2.2.0") +object Summarizer extends Logging { + + import SummaryBuilderImpl._ + + /** + * Given a list of metrics, provides a builder that it turns computes metrics from a column. + * + * See the documentation of [[Summarizer]] for an example. + * + * The following metrics are accepted (case sensitive): + * - mean: a vector that contains the coefficient-wise mean. + * - variance: a vector tha contains the coefficient-wise variance. + * - count: the count of all vectors seen. + * - numNonzeros: a vector with the number of non-zeros for each coefficients + * - max: the maximum for each coefficient. + * - min: the minimum for each coefficient. + * - normL2: the Euclidian norm for each coefficient. + * - normL1: the L1 norm of each coefficient (sum of the absolute values). 
+ * @param firstMetric the metric being provided + * @param metrics additional metrics that can be provided. + * @return a builder. + * @throws IllegalArgumentException if one of the metric names is not understood. + */ + @Since("2.2.0") + def metrics(firstMetric: String, metrics: String*): SummaryBuilder = { +val (typedMetrics, computeMetrics) = getRelevantMetrics(Seq(firstMetric) ++ metrics) +new SummaryBuilderImpl(typedMetrics, computeMetrics) + } + + def mean(col: Column): Column = getSingleMetric(col, "mean") + + def variance(col: Column): Column = getSingleMetric(col, "variance") +
[GitHub] spark pull request #17477: [SPARK-18692][BUILD][DOCS] Test Java 8 unidoc bui...
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/17477#discussion_r108857749 --- Diff: core/src/main/scala/org/apache/spark/scheduler/TaskSchedulerImpl.scala --- @@ -704,12 +704,12 @@ private[spark] object TaskSchedulerImpl { * Used to balance containers across hosts. * * Accepts a map of hosts to resource offers for that host, and returns a prioritized list of - * resource offers representing the order in which the offers should be used. The resource + * resource offers representing the order in which the offers should be used. The resource * offers are ordered such that we'll allocate one container on each host before allocating a * second container on any host, and so on, in order to reduce the damage if a host fails. * - * For example, given , , , returns - * [o1, o5, o4, 02, o6, o3] + * For example, given maps from h1 to [o1, o2, o3], from h2 to [o4] and from h3 to [o5, o6], + * returns a list, [o1, o5, o4, o2, o6, o3]. --- End diff -- There look to be a few typos here: 02 -> o2, and h1 -> h3.
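The ordering described in the corrected scaladoc — one container on each host before any host gets a second, yielding [o1, o5, o4, o2, o6, o3] for the example offers — can be reproduced by round-robin over the hosts, visiting hosts with more offers first. The sketch below is a simplified re-implementation for illustration only, not Spark's actual `TaskSchedulerImpl` code, and all names are made up:

```java
import java.util.*;

public class OfferBalancer {
    // Round-robin across hosts, visiting hosts with more offers first, so one
    // container lands on every host before any host receives a second one.
    static List<String> prioritize(Map<String, List<String>> offersByHost) {
        List<Deque<String>> queues = new ArrayList<>();
        offersByHost.values().forEach(v -> queues.add(new ArrayDeque<>(v)));
        queues.sort((a, b) -> b.size() - a.size());   // hosts with more offers first
        List<String> out = new ArrayList<>();
        boolean progress = true;
        while (progress) {                            // one pass = one offer per host
            progress = false;
            for (Deque<String> q : queues) {
                if (!q.isEmpty()) { out.add(q.poll()); progress = true; }
            }
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, List<String>> offers = new LinkedHashMap<>();
        offers.put("h1", Arrays.asList("o1", "o2", "o3"));
        offers.put("h2", Arrays.asList("o4"));
        offers.put("h3", Arrays.asList("o5", "o6"));
        System.out.println(prioritize(offers));   // [o1, o5, o4, o2, o6, o3]
    }
}
```

This reproduces the documented output because the stable sort orders the hosts h1 (3 offers), h3 (2), h2 (1), and each round-robin pass takes one offer from each non-empty host.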
[GitHub] spark pull request #17419: [SPARK-19634][ML] Multivariate summarizer - dataf...
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/17419#discussion_r108856798 --- Diff: mllib/src/main/scala/org/apache/spark/ml/stat/Summarizer.scala --- @@ -0,0 +1,746 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.spark.ml.stat + +import breeze.{linalg => la} +import breeze.linalg.{Vector => BV} +import breeze.numerics + +import org.apache.spark.SparkException +import org.apache.spark.annotation.Since +import org.apache.spark.internal.Logging +import org.apache.spark.ml.linalg.{DenseVector, SparseVector, Vector, Vectors, VectorUDT} +import org.apache.spark.sql.Column +import org.apache.spark.sql.catalyst.InternalRow +import org.apache.spark.sql.catalyst.expressions.{Expression, UnsafeArrayData, UnsafeProjection, UnsafeRow} +import org.apache.spark.sql.catalyst.expressions.aggregate.{AggregateExpression, Complete, TypedImperativeAggregate} +import org.apache.spark.sql.types._ + + +/** + * A builder object that provides summary statistics about a given column. + * + * Users should not directly create such builders, but instead use one of the methods in + * [[Summarizer]]. 
+ */ +@Since("2.2.0") +abstract class SummaryBuilder { + /** + * Returns an aggregate object that contains the summary of the column with the requested metrics. + * @param column a column that contains Vector object. + * @return an aggregate column that contains the statistics. The exact content of this + * structure is determined during the creation of the builder. + */ + @Since("2.2.0") + def summary(column: Column): Column +} + +/** + * Tools for vectorized statistics on MLlib Vectors. + * + * The methods in this package provide various statistics for Vectors contained inside DataFrames. + * + * This class lets users pick the statistics they would like to extract for a given column. Here is + * an example in Scala: + * {{{ + * val dataframe = ... // Some dataframe containing a feature column + * val allStats = dataframe.select(Summarizer.metrics("min", "max").summary($"features")) + * val Row(min_, max_) = allStats.first() + * }}} + * + * If one wants to get a single metric, shortcuts are also available: + * {{{ + * val meanDF = dataframe.select(Summarizer.mean($"features")) + * val Row(mean_) = meanDF.first() + * }}} + */ +@Since("2.2.0") +object Summarizer extends Logging { + + import SummaryBuilderImpl._ + + /** + * Given a list of metrics, provides a builder that it turns computes metrics from a column. + * + * See the documentation of [[Summarizer]] for an example. + * + * The following metrics are accepted (case sensitive): + * - mean: a vector that contains the coefficient-wise mean. + * - variance: a vector tha contains the coefficient-wise variance. + * - count: the count of all vectors seen. + * - numNonzeros: a vector with the number of non-zeros for each coefficients + * - max: the maximum for each coefficient. + * - min: the minimum for each coefficient. + * - normL2: the Euclidian norm for each coefficient. + * - normL1: the L1 norm of each coefficient (sum of the absolute values). 
+ * @param firstMetric the metric being provided + * @param metrics additional metrics that can be provided. + * @return a builder. + * @throws IllegalArgumentException if one of the metric names is not understood. + */ + @Since("2.2.0") + def metrics(firstMetric: String, metrics: String*): SummaryBuilder = { +val (typedMetrics, computeMetrics) = getRelevantMetrics(Seq(firstMetric) ++ metrics) +new SummaryBuilderImpl(typedMetrics, computeMetrics) + } + + def mean(col: Column): Column = getSingleMetric(col, "mean") + + def variance(col: Column): Column = getSingleMetric(col, "variance") +
[GitHub] spark issue #17251: [SPARK-19910][SQL] `stack` should not reject NULL values...
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/17251 Thank you so much for the review, @gatorsmile. I updated the PR.
[GitHub] spark issue #17251: [SPARK-19910][SQL] `stack` should not reject NULL values...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17251 **[Test build #75384 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75384/testReport)** for PR 17251 at commit [`2150ce5`](https://github.com/apache/spark/commit/2150ce552a7a02d656329761e04a7fcb38e5e648).
[GitHub] spark issue #17324: [SPARK-19969] [ML] Imputer doc and example
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17324 **[Test build #75383 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75383/testReport)** for PR 17324 at commit [`48a1361`](https://github.com/apache/spark/commit/48a136133fe83b5e4c2408e4391c15fdefead901).
[GitHub] spark pull request #17251: [SPARK-19910][SQL] `stack` should not reject NULL...
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/17251#discussion_r108855184

```diff
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala ---
@@ -590,6 +591,21 @@ object TypeCoercion {
   }

   /**
+   * Coerces NullTypes of a Stack function to the corresponding column types.
```

--- End diff -- Thanks.
[GitHub] spark issue #17324: [SPARK-19969] [ML] Imputer doc and example
Github user MLnick commented on the issue: https://github.com/apache/spark/pull/17324 Jenkins retest this please
[GitHub] spark pull request #17251: [SPARK-19910][SQL] `stack` should not reject NULL...
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/17251#discussion_r108854836

```diff
--- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercionSuite.scala ---
@@ -707,6 +707,36 @@ class TypeCoercionSuite extends PlanTest {
     )
   }

+  test("type coercion for Stack") {
+    val rule = TypeCoercion.StackCoercion
+
+    ruleTest(rule,
+      Stack(Seq(Literal(3), Literal(1), Literal(2), Literal(null))),
+      Stack(Seq(Literal(3), Literal(1), Literal(2), Literal.create(null, IntegerType))))
+    ruleTest(rule,
+      Stack(Seq(Literal(3), Literal(1.0), Literal(null), Literal(3.0))),
+      Stack(Seq(Literal(3), Literal(1.0), Literal.create(null, DoubleType), Literal(3.0))))
+    ruleTest(rule,
+      Stack(Seq(Literal(3), Literal(null), Literal("2"), Literal("3"))),
+      Stack(Seq(Literal(3), Literal.create(null, StringType), Literal("2"), Literal("3"))))
+
+    ruleTest(rule,
+      Stack(Seq(Literal(2),
+        Literal(1), Literal("2"),
+        Literal(null), Literal(null))),
+      Stack(Seq(Literal(2),
+        Literal(1), Literal("2"),
+        Literal.create(null, IntegerType), Literal.create(null, StringType))))
+
+    ruleTest(rule,
+      Stack(Seq(Subtract(Literal(3), Literal(1)),
+        Literal(1), Literal("2"),
+        Literal(null), Literal(null))),
+      Stack(Seq(Subtract(Literal(3), Literal(1)),
+        Literal(1), Literal("2"),
+        Literal.create(null, IntegerType), Literal.create(null, StringType))))
```

--- End diff -- Right, for that one, I'll add the following [here](https://github.com/apache/spark/pull/17251/files/c9510847c8eeb5f5da3b63c38ac835d1c3491815#diff-01ecdd038c5c2f53f38118912210fef8R722).

```scala
ruleTest(rule,
  Stack(Seq(Literal(3), Literal(null), Literal(null), Literal(null))),
  Stack(Seq(Literal(3), Literal(null), Literal(null), Literal(null))))
```
[GitHub] spark pull request #17251: [SPARK-19910][SQL] `stack` should not reject NULL...
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/17251#discussion_r108854853

```diff
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/GeneratorFunctionSuite.scala ---
@@ -39,9 +39,9 @@ class GeneratorFunctionSuite extends QueryTest with SharedSQLContext {
     checkAnswer(df.selectExpr("stack(3, 1, 2, 3)"), Row(1) :: Row(2) :: Row(3) :: Nil)
     checkAnswer(df.selectExpr("stack(4, 1, 2, 3)"), Row(1) :: Row(2) :: Row(3) :: Row(null) :: Nil)
-    // Various column types
-    checkAnswer(df.selectExpr("stack(3, 1, 1.1, 'a', 2, 2.2, 'b', 3, 3.3, 'c')"),
-      Row(1, 1.1, "a") :: Row(2, 2.2, "b") :: Row(3, 3.3, "c") :: Nil)
+    // Various column types and null values
+    checkAnswer(df.selectExpr("stack(3, 1, 1.1, null, 2, null, 'b', null, 3.3, 'c')"),
+      Row(1, 1.1, null) :: Row(2, null, "b") :: Row(null, 3.3, "c") :: Nil)
```

--- End diff -- Yep.
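The test cases above rely on how `stack(n, v1, v2, ...)` lays its arguments out row-major over `n` rows, padding the final row with NULLs when values run out. A minimal standalone sketch of just that layout, assuming nothing from Spark (plain Scala, with `Option` standing in for SQL values and `None` for NULL; the function name is illustrative):

```scala
// Illustrative sketch (not Spark's implementation) of stack's row layout:
// values are split across numRows rows in row-major order, and cells with
// no corresponding value become NULL (None here).
def stack(numRows: Int, values: Seq[Option[Any]]): Seq[Seq[Option[Any]]] = {
  // Number of output columns per row, rounding up so every value gets a cell.
  val numFields = math.ceil(values.length.toDouble / numRows).toInt
  (0 until numRows).map { row =>
    (0 until numFields).map { col =>
      val i = row * numFields + col
      if (i < values.length) values(i) else None // missing cells become NULL
    }
  }
}
```

For example, asking for four rows over three values yields a fourth all-`None` row, mirroring the `Row(null)` produced by `stack(4, 1, 2, 3)` in the suite above.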
[GitHub] spark pull request #17251: [SPARK-19910][SQL] `stack` should not reject NULL...
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/17251#discussion_r108854595

```diff
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/generators.scala ---
@@ -156,9 +157,21 @@ case class Stack(children: Seq[Expression]) extends Generator {
     }
   }

+  private def findDataType(column: Integer): DataType = {
+    // Find the first data type except NullType
+    for (i <- (column + 1) until children.length by numFields) {
+      if (children(i).dataType != NullType) {
+        return children(i).dataType
+      }
+    }
+    // If all values of the column are NullType, use it.
+    children(column + 1).dataType
+  }
+
   override def elementSchema: StructType =
     StructType(children.tail.take(numFields).zipWithIndex.map {
-      case (e, index) => StructField(s"col$index", e.dataType)
+      case (e, index) if e.dataType != NullType => StructField(s"col$index", e.dataType)
+      case (_, index) => StructField(s"col$index", findDataType(index))
```

--- End diff -- Sure!
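The index arithmetic in `findDataType` can be exercised on its own: column `col`'s values sit at child indices `col + 1, col + 1 + numFields, col + 1 + 2*numFields, ...`, where the `+ 1` skips the leading row count. A self-contained sketch of that lookup over plain type-name strings instead of Spark's `DataType` (the function name and string encoding are illustrative, not Spark API):

```scala
// Illustrative sketch of the column-type lookup in findDataType above.
// children is laid out as (numRows, r0c0, r0c1, ..., r1c0, r1c1, ...),
// so a column's values recur every numFields entries, starting at col + 1.
def findColumnType(children: Seq[String], numFields: Int, column: Int): String = {
  // Scan the column and return the first type that is not NullType.
  for (i <- (column + 1) until children.length by numFields) {
    if (children(i) != "NullType") return children(i)
  }
  // Every value in the column is NULL, so NullType is the only choice left.
  children(column + 1)
}
```

The all-NULL fallback in the last line is what keeps a column like `stack(3, null, null, null)` at `NullType` rather than failing.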
[GitHub] spark issue #17478: [SPARK-18901][ML]:Require in LR LogisticAggregator is re...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17478 **[Test build #75382 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75382/testReport)** for PR 17478 at commit [`44305bf`](https://github.com/apache/spark/commit/44305bf67a1fecd1923f9ee57122147efabb5702).
[GitHub] spark issue #17251: [SPARK-19910][SQL] `stack` should not reject NULL values...
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/17251 LGTM except a few comments. cc @cloud-fan for the final sign-off.
[GitHub] spark issue #17375: [SPARK-19019][PYTHON][BRANCH-1.6] Fix hijacked `collecti...
Github user zero323 commented on the issue: https://github.com/apache/spark/pull/17375 @holdenk At least some users have more control over the Python environment than over the whole cluster setup, and with Anaconda now defaulting to 3.6, it is an annoyance. Assuming there will be a 1.6.4, it makes more sense to patch than to document a maximum supported version.
[GitHub] spark pull request #17478: [SPARK-18901][ML]:Require in LR LogisticAggregato...
GitHub user wangmiao1981 opened a pull request: https://github.com/apache/spark/pull/17478 [SPARK-18901][ML]: Require in LR LogisticAggregator is redundant ## What changes were proposed in this pull request? In MultivariateOnlineSummarizer, `add` and `merge` already check the weights and feature sizes, so the checks in LR are redundant; this PR removes them. ## How was this patch tested? Existing tests. You can merge this pull request into a Git repository by running: $ git pull https://github.com/wangmiao1981/spark logit Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/17478.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #17478 commit 44305bf67a1fecd1923f9ee57122147efabb5702 Author: wm...@hotmail.com Date: 2017-03-30T07:00:18Z remove redudant check
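The redundancy the PR describes can be seen in miniature: when the aggregate a caller delegates to already enforces an invariant with its own `require`, a caller-side check repeats the same validation and adds nothing. A toy sketch under that assumption, not Spark's actual classes (`Summarizer` and `addInstance` are made-up names):

```scala
// Illustrative-only sketch of the redundancy described in the PR: the
// summarizer's own add() already validates dimensions, so a wrapper
// repeating that require() before delegating would be dead weight.
class Summarizer(numFeatures: Int) {
  private var count = 0L

  def add(features: Array[Double]): this.type = {
    // The summarizer itself enforces the feature-size invariant...
    require(features.length == numFeatures,
      s"Dimensions mismatch: expected $numFeatures but got ${features.length}")
    count += 1
    this
  }

  // ...so callers can delegate without re-checking the size.
  def addInstance(features: Array[Double]): this.type = add(features)

  def n: Long = count
}
```

A malformed input still fails with the same `IllegalArgumentException` whether or not the wrapper repeats the check, which is why removing the outer `require` is behavior-preserving.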
[GitHub] spark issue #17472: [SPARK-19999]: Fix for flakey tests due to java.nio.Bits...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17472 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/75380/ Test FAILed.
[GitHub] spark issue #17472: [SPARK-19999]: Fix for flakey tests due to java.nio.Bits...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17472 Merged build finished. Test FAILed.
[GitHub] spark issue #17476: [SPARK-20151][SQL] Account for partition pruning in scan...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17476 Merged build finished. Test PASSed.
[GitHub] spark issue #17476: [SPARK-20151][SQL] Account for partition pruning in scan...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17476 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/75379/ Test PASSed.
[GitHub] spark issue #17477: [SPARK-18692][BUILD][DOCS] Test Java 8 unidoc build on J...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/17477 FYI, if I haven't missed something, all the cases are instances of the ones previously fixed. cc @joshrosen, @srowen and @jkbradley. Could you take a look and see if it makes sense?