[GitHub] spark pull request: [SPARK-11195][CORE] Use correct classloader fo...

2015-11-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/9367#issuecomment-153543299
  
Merged build started.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: Update LDAOptimizer.scala

2015-11-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/9456#issuecomment-153541597
  
Can one of the admins verify this patch?





[GitHub] spark pull request: Update LDAOptimizer.scala

2015-11-03 Thread a1singh
GitHub user a1singh opened a pull request:

https://github.com/apache/spark/pull/9456

Update LDAOptimizer.scala

In file LDAOptimizer.scala:

Line 441: since `idx` was never used, replaced the unnecessary 
zipWithIndex.foreach with foreach.

-  nonEmptyDocs.zipWithIndex.foreach { case ((_, termCounts: Vector), idx: Int) =>
+  nonEmptyDocs.foreach { case (_, termCounts: Vector) =>

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/a1singh/spark master

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/9456.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #9456


commit ba0879bb8d9949146ca72401d195fcfb04edc3ba
Author: a1singh 
Date:   2015-11-04T01:22:33Z

Update LDAOptimizer.scala

line 441: since idx was never used, replaced unrequired 
zipWithIndex.foreach with foreach







[GitHub] spark pull request: [SPARK-10863][SPARKR] Method coltypes() to get...

2015-11-03 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/8984#issuecomment-153541323
  
**[Test build #44986 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44986/consoleFull)**
 for PR 8984 at commit 
[`df6606b`](https://github.com/apache/spark/commit/df6606b8832c9979df87c4c878e098a59cb04707).





[GitHub] spark pull request: [WIP] [SPARK-9656] [MLlib] [Python] Add missin...

2015-11-03 Thread holdenk
Github user holdenk commented on a diff in the pull request:

https://github.com/apache/spark/pull/9441#discussion_r43831883
  
--- Diff: python/pyspark/mllib/linalg/distributed.py ---
@@ -500,6 +661,25 @@ def numCols(self):
 """
 return self._java_matrix_wrapper.call("numCols")
 
+def transpose(self):
+"""
+Transpose this CoordinateMatrix.
--- End diff --

Ah, OK. Matrices.scala (the root class) indicates that it shares the same 
data type, but I forgot to look at the CoordinateMatrix underneath. 
Sorry about that.





[GitHub] spark pull request: [SPARK-10863][SPARKR] Method coltypes() to get...

2015-11-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8984#issuecomment-153540902
  
Merged build started.





[GitHub] spark pull request: [SPARK-10863][SPARKR] Method coltypes() to get...

2015-11-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8984#issuecomment-153540890
  
 Merged build triggered.





[GitHub] spark pull request: [WIP] [SPARK-9656] [MLlib] [Python] Add missin...

2015-11-03 Thread holdenk
Github user holdenk commented on a diff in the pull request:

https://github.com/apache/spark/pull/9441#discussion_r43831697
  
--- Diff: python/pyspark/mllib/linalg/distributed.py ---
@@ -297,6 +444,20 @@ def numCols(self):
 """
 return self._java_matrix_wrapper.call("numCols")
 
+def computeGramianMatrix(self):
+"""
+Computes the Gramian matrix `A^T A`. Note that this cannot be
+computed on matrices with more than 65535 columns.
--- End diff --

That's a good question. I think it's totally reasonable to do this in a 
follow-up PR, since it's pretty unrelated; it would just be good to unify 
things while we are working on this. Just create a follow-up JIRA to do this :)





[GitHub] spark pull request: SPARK-4921. TaskSetManager.dequeueTask returns...

2015-11-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/3816#issuecomment-153540579
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44967/
Test PASSed.





[GitHub] spark pull request: SPARK-4921. TaskSetManager.dequeueTask returns...

2015-11-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/3816#issuecomment-153540577
  
Merged build finished. Test PASSed.





[GitHub] spark pull request: SPARK-4921. TaskSetManager.dequeueTask returns...

2015-11-03 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/3816#issuecomment-153540430
  
**[Test build #44967 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44967/consoleFull)**
 for PR 3816 at commit 
[`247ce55`](https://github.com/apache/spark/commit/247ce5587b8a06fe586a2730f1ad2df4ab7f79dc).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: [SPARK-11275][SQL] Reimplement Expand as a Gen...

2015-11-03 Thread chenghao-intel
Github user chenghao-intel commented on the pull request:

https://github.com/apache/spark/pull/9429#issuecomment-153540121
  
After checking with Hive, for example:
```
hive> select sum(a-b) as ab from mytable group by b with rollup;
FAILED: SemanticException [Error 10210]: Grouping sets aggregations (with 
rollups or cubes) are not allowed if aggregation function parameters overlap 
with the aggregation functions columns
```
Hive actually doesn't support aggregation function parameters that overlap 
in this way. We can probably make a simple fix on top of the current master 
branch if we need to support that. 

Also, after double-checking: on the master branch, the `Expand` operator 
benefits from expression constant folding, which means better performance 
than a reimplementation based on a UDTF, so I am struggling a little to decide 
which approach is better for the implementation.





[GitHub] spark pull request: [WIP] [SPARK-9656] [MLlib] [Python] Add missin...

2015-11-03 Thread dusenberrymw
Github user dusenberrymw commented on the pull request:

https://github.com/apache/spark/pull/9441#issuecomment-153539972
  
@holdenk Great, thanks for the feedback!





[GitHub] spark pull request: [WIP] [SPARK-9656] [MLlib] [Python] Add missin...

2015-11-03 Thread dusenberrymw
Github user dusenberrymw commented on a diff in the pull request:

https://github.com/apache/spark/pull/9441#discussion_r43831342
  
--- Diff: python/pyspark/mllib/linalg/distributed.py ---
@@ -500,6 +661,25 @@ def numCols(self):
 """
 return self._java_matrix_wrapper.call("numCols")
 
+def transpose(self):
+"""
+Transpose this CoordinateMatrix.
+
+>>> entries = sc.parallelize([MatrixEntry(0, 0, 1.2),
+...   MatrixEntry(1, 0, 2),
+...   MatrixEntry(2, 1, 3.7)])
+>>> mat = CoordinateMatrix(entries)
+>>> mat_transposed = mat.transpose()
+
--- End diff --

Yeah, I like the visual clarity when viewing these tests on the Python 
docs, as it helps indicate that the following two tests rely on the data 
structures formed above.  This is generally the pattern I've followed with 
these classes for cases with >1 test.
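As a rough illustration of what the doctest setup above exercises, the following pure-Python sketch transposes a matrix stored in coordinate form by swapping each entry's row and column indices. The `Entry` tuple and `transpose_entries` function are hypothetical stand-ins for this example, not the PySpark `MatrixEntry` or `CoordinateMatrix` API.

```python
from collections import namedtuple

# Hypothetical local stand-in for a coordinate-format entry: (row, column, value).
Entry = namedtuple("Entry", ["i", "j", "value"])


def transpose_entries(entries):
    """Transpose a coordinate-format matrix by swapping row and column indices."""
    return [Entry(e.j, e.i, e.value) for e in entries]


# Same data as the doctest setup: shared once, then reused by later checks.
entries = [Entry(0, 0, 1.2), Entry(1, 0, 2.0), Entry(2, 1, 3.7)]
transposed = transpose_entries(entries)
```

Building `entries` once and reusing it for several checks mirrors the pattern described above, where the tests that follow rely on data structures formed at the top of the docstring.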





[GitHub] spark pull request: [WIP] [SPARK-9656] [MLlib] [Python] Add missin...

2015-11-03 Thread dusenberrymw
Github user dusenberrymw commented on a diff in the pull request:

https://github.com/apache/spark/pull/9441#discussion_r43831207
  
--- Diff: python/pyspark/mllib/linalg/distributed.py ---
@@ -500,6 +661,25 @@ def numCols(self):
 """
 return self._java_matrix_wrapper.call("numCols")
 
+def transpose(self):
+"""
+Transpose this CoordinateMatrix.
--- End diff --

I think that is just the case for the `BlockMatrix` type.





[GitHub] spark pull request: [SPARK-11491] Update build to use Scala 2.10.5

2015-11-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/9450#issuecomment-153539495
  
Merged build finished. Test PASSed.





[GitHub] spark pull request: [SPARK-11491] Update build to use Scala 2.10.5

2015-11-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/9450#issuecomment-153539496
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44969/
Test PASSed.





[GitHub] spark pull request: [WIP] [SPARK-9656] [MLlib] [Python] Add missin...

2015-11-03 Thread dusenberrymw
Github user dusenberrymw commented on a diff in the pull request:

https://github.com/apache/spark/pull/9441#discussion_r43831148
  
--- Diff: python/pyspark/mllib/linalg/distributed.py ---
@@ -297,6 +444,20 @@ def numCols(self):
 """
 return self._java_matrix_wrapper.call("numCols")
 
+def computeGramianMatrix(self):
+"""
+Computes the Gramian matrix `A^T A`. Note that this cannot be
+computed on matrices with more than 65535 columns.
--- End diff --

Agreed.  Would it be reasonable to include that in this PR?





[GitHub] spark pull request: [SPARK-11491] Update build to use Scala 2.10.5

2015-11-03 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/9450#issuecomment-153539361
  
**[Test build #44969 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44969/consoleFull)**
 for PR 9450 at commit 
[`6ffe0b0`](https://github.com/apache/spark/commit/6ffe0b0540007a96a75d48a75303aad0b45fc9b0).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: [SPARK-11389][CORE] Add support for off-heap m...

2015-11-03 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/9344#issuecomment-153538637
  
**[Test build #44985 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44985/consoleFull)**
 for PR 9344 at commit 
[`a0c5668`](https://github.com/apache/spark/commit/a0c5668fa3044d3934cd3a5934794a5ae569838b).





[GitHub] spark pull request: [SPARK-11495] Fix potential socket / file hand...

2015-11-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/9455#issuecomment-153538651
  
Merged build finished. Test FAILed.





[GitHub] spark pull request: [SPARK-11495] Fix potential socket / file hand...

2015-11-03 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/9455#issuecomment-153538647
  
**[Test build #44984 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44984/consoleFull)**
 for PR 9455 at commit 
[`ab34781`](https://github.com/apache/spark/commit/ab3478124bea8cddeb1dd35611ae0f5cf4a67588).
 * This patch **fails to build**.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
   * `public final class UnsafeSorterSpillReader extends UnsafeSorterIterator implements Closeable`
   * `case class ExecutorLostFailure(`
   * `case class DecimalLit(chars: String) extends Token`





[GitHub] spark pull request: [SPARK-11495] Fix potential socket / file hand...

2015-11-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/9455#issuecomment-153538652
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44984/
Test FAILed.





[GitHub] spark pull request: [SPARK-11495] Fix potential socket / file hand...

2015-11-03 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/9455#issuecomment-153537697
  
**[Test build #44984 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44984/consoleFull)**
 for PR 9455 at commit 
[`ab34781`](https://github.com/apache/spark/commit/ab3478124bea8cddeb1dd35611ae0f5cf4a67588).





[GitHub] spark pull request: [SPARK-11389][CORE] Add support for off-heap m...

2015-11-03 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/9344#issuecomment-153537707
  
**[Test build #44982 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44982/consoleFull)**
 for PR 9344 at commit 
[`a0c5668`](https://github.com/apache/spark/commit/a0c5668fa3044d3934cd3a5934794a5ae569838b).





[GitHub] spark pull request: [WIP][SPARK-11217][ML] save/load for non-meta ...

2015-11-03 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/9454#issuecomment-153537621
  
**[Test build #44980 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44980/consoleFull)**
 for PR 9454 at commit 
[`e01e92d`](https://github.com/apache/spark/commit/e01e92d92f3f799356dd6a8cebc60002899090e9).





[GitHub] spark pull request: [SPARK-7097][SQL]: Partitioned tables should o...

2015-11-03 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5668#issuecomment-153537468
  
[Test build #44983 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44983/console)
 for PR 5668 at commit 
[`dd630e7`](https://github.com/apache/spark/commit/dd630e7667ae54b19946318eb6d8f7fd3180e40e).
 * This patch **fails to build**.
 * This patch **does not merge cleanly**.
 * This patch adds no public classes.





[GitHub] spark pull request: [SPARK-7097][SQL]: Partitioned tables should o...

2015-11-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5668#issuecomment-153537473
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44983/
Test FAILed.





[GitHub] spark pull request: [SPARK-7097][SQL]: Partitioned tables should o...

2015-11-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5668#issuecomment-153537472
  
Build finished. Test FAILed.





[GitHub] spark pull request: [SPARK-11389][CORE] Add support for off-heap m...

2015-11-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/9344#issuecomment-153537163
  
Merged build started.





[GitHub] spark pull request: [SPARK-11495] Fix potential socket / file hand...

2015-11-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/9455#issuecomment-153537143
  
 Merged build triggered.





[GitHub] spark pull request: [SPARK-11495] Fix potential socket / file hand...

2015-11-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/9455#issuecomment-153537164
  
Merged build started.





[GitHub] spark pull request: [SPARK-11389][CORE] Add support for off-heap m...

2015-11-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/9344#issuecomment-153537150
  
 Merged build triggered.





[GitHub] spark pull request: [SPARK-11336] Add links to example codes

2015-11-03 Thread mengxr
Github user mengxr commented on the pull request:

https://github.com/apache/spark/pull/9320#issuecomment-153537027
  
When we build the docs, there is a flag `PRODUCTION`. I think when this flag 
is on, we should use the Spark version in the GitHub link; otherwise we should 
use `master` instead. There is a link verification step in our release process. 
I'm a little worried about having broken links.
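A minimal sketch of the selection logic suggested above, written in Python purely for illustration (the actual docs build is not claimed to work this way). The function names, the `v` tag prefix, and the example path are all assumptions introduced for this example.

```python
def github_ref(production, spark_version):
    """Pick the Git ref for example links.

    Assumption: release tags are named "v" + version; adjust if the
    repository uses a different tag scheme.
    """
    return "v" + spark_version if production else "master"


def example_url(path, production, spark_version):
    """Build a GitHub link to an example file at the chosen ref."""
    ref = github_ref(production, spark_version)
    return "https://github.com/apache/spark/blob/" + ref + "/" + path


# Hypothetical path and version, used only to show the two outcomes.
release_link = example_url("examples/Foo.scala", True, "1.6.0")
dev_link = example_url("examples/Foo.scala", False, "1.6.0")
```

Pinning release docs to a version ref keeps links stable after files move on `master`, which is the broken-link concern raised above.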





[GitHub] spark pull request: [SPARK-11389][CORE] Add support for off-heap m...

2015-11-03 Thread JoshRosen
Github user JoshRosen commented on the pull request:

https://github.com/apache/spark/pull/9344#issuecomment-153537096
  
Jenkins, retest this please.





[GitHub] spark pull request: [SPARK-7097][SQL]: Partitioned tables should o...

2015-11-03 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5668#issuecomment-153536924
  
[Test build #44983 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44983/consoleFull) for PR 5668 at commit [`dd630e7`](https://github.com/apache/spark/commit/dd630e7667ae54b19946318eb6d8f7fd3180e40e).





[GitHub] spark pull request: [SPARK-11433] [SQL] Cleanup the subquery name ...

2015-11-03 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/9385#issuecomment-153536898
  
**[Test build #44981 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44981/consoleFull)** for PR 9385 at commit [`a26763d`](https://github.com/apache/spark/commit/a26763d758bc58dacf81be171428ede215775532).





[GitHub] spark pull request: [SPARK-11495] Fix potential socket / file hand...

2015-11-03 Thread JoshRosen
Github user JoshRosen commented on a diff in the pull request:

https://github.com/apache/spark/pull/9455#discussion_r43830137
  
--- Diff: core/src/main/java/org/apache/spark/util/collection/unsafe/sort/UnsafeSorterSpillReader.java ---
@@ -31,10 +30,8 @@
  * Reads spill files written by {@link UnsafeSorterSpillWriter} (see that class for a description
  * of the file format).
  */
-public final class UnsafeSorterSpillReader extends UnsafeSorterIterator {
-  private static final Logger logger = LoggerFactory.getLogger(UnsafeSorterSpillReader.class);
+public final class UnsafeSorterSpillReader extends UnsafeSorterIterator implements Closeable {
--- End diff --

Of the potential leaks that were discovered, I think that this is the only 
one to be concerned about: it looks like `UnsafeSorterSpillReader` might leak 
an open `FileInputStream` if an exception occurred while reading records. The 
fix implemented here is to add a `close()` method for safely closing the 
reader's streams.

I chose to move the spill deletion logic out of the reader itself since it 
appears to be redundant with spill deletion code that lives elsewhere. We 
should audit the existing code to make sure that the chain of responsibility 
for cleaning up spill files is clearly defined.
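
To illustrate the leak pattern described above, here is a small Python sketch (not the Java change in this PR). `SpillReader` and its length-prefixed record format are hypothetical stand-ins; the point is only that the stream is closed whether reading finishes normally or fails partway through.

```python
import struct


class SpillReader:
    """Hypothetical reader of length-prefixed records; not Spark's class."""

    def __init__(self, path):
        self._stream = open(path, "rb")

    def records(self):
        try:
            while True:
                header = self._stream.read(4)
                if not header:
                    break  # clean end of file
                if len(header) < 4:
                    raise IOError("truncated record header")
                (length,) = struct.unpack(">i", header)
                payload = self._stream.read(length)
                if len(payload) < length:
                    raise IOError("truncated record payload")
                yield payload
        finally:
            # Runs on normal exhaustion and on errors alike.
            self.close()

    def close(self):
        # Safe to call more than once.
        if not self._stream.closed:
            self._stream.close()

    @property
    def closed(self):
        return self._stream.closed
```

Deleting the spill file is deliberately left out of the sketch, mirroring the decision above to keep deletion out of the reader.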





[GitHub] spark pull request: [WIP] [SPARK-9656] [MLlib] [Python] Add missin...

2015-11-03 Thread holdenk
Github user holdenk commented on a diff in the pull request:

https://github.com/apache/spark/pull/9441#discussion_r43830079
  
--- Diff: python/pyspark/mllib/linalg/distributed.py ---
@@ -500,6 +661,25 @@ def numCols(self):
 """
 return self._java_matrix_wrapper.call("numCols")
 
+def transpose(self):
+"""
+Transpose this CoordinateMatrix.
--- End diff --

Maybe mention that it shares the same underlying data, as noted in the
scaladoc.





[GitHub] spark pull request: [SPARK-11336] Add links to example codes

2015-11-03 Thread mengxr
Github user mengxr commented on a diff in the pull request:

https://github.com/apache/spark/pull/9320#discussion_r43830050
  
--- Diff: docs/_plugins/include_example.rb ---
@@ -38,7 +38,15 @@ def render(context)
   code = File.open(@file).read.encode("UTF-8")
   code = select_lines(code)
  
-  Pygments.highlight(code, :lexer => @lang)
+  rendered_code = Pygments.highlight(code, :lexer => @lang)
+
+  spark_version = site.config['SPARK_VERSION_SHORT']
+  hint = "Find full example code here: " \
--- End diff --

remove `class="hint"`, which is no longer needed





[GitHub] spark pull request: [SPARK-11389][CORE] Add support for off-heap m...

2015-11-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/9344#issuecomment-153536652
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44965/
Test FAILed.





[GitHub] spark pull request: [SPARK-11389][CORE] Add support for off-heap m...

2015-11-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/9344#issuecomment-153536651
  
Merged build finished. Test FAILed.





[GitHub] spark pull request: [SPARK-11389][CORE] Add support for off-heap m...

2015-11-03 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/9344#issuecomment-153536587
  
**[Test build #44965 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44965/consoleFull)** for PR 9344 at commit [`96705b8`](https://github.com/apache/spark/commit/96705b881caa82b87874d5145c16ce232c4a64a4).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
   * `class ExecutionMemoryPool(`
   * `abstract class MemoryPool(memoryManager: Object)`
   * `class StorageMemoryPool(memoryManager: Object) extends MemoryPool(memoryManager) with Logging`





[GitHub] spark pull request: [SPARK-11495] Fix potential socket / file hand...

2015-11-03 Thread JoshRosen
GitHub user JoshRosen opened a pull request:

https://github.com/apache/spark/pull/9455

[SPARK-11495] Fix potential socket / file handle leaks that were found via 
static analysis

The HP Fortify Open Source Review team 
(https://www.hpfod.com/open-source-review-project) reported a handful of 
potential resource leaks that were discovered using their static analysis tool. 
We should fix the issues identified by their scan.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/JoshRosen/spark fix-potential-resource-leaks

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/9455.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #9455


commit 279eb246baa73ac6397852e4935d41ef19eabbfc
Author: Josh Rosen 
Date:   2015-11-03T23:46:21Z

Fix possible leak in JavaCustomReceiver.

commit a632cd8463030187a70ca9590ea630e05cb127f2
Author: Josh Rosen 
Date:   2015-11-03T23:53:26Z

Fix potential leak in JavaReceiverAPISuite

commit 1c0433d189192308711c1c2ad4f4ebce2242c3d0
Author: Josh Rosen 
Date:   2015-11-03T23:55:06Z

Release bufferChunk in ChunkFetchIntegrationSuite

commit cafdea94fd8d215964be27631af41e1e27d2bc89
Author: Josh Rosen 
Date:   2015-11-04T00:30:28Z

Fix potential file stream leak in UnsafeSorterSpillReader.

commit 4119fa9a73520e1451843ce1a6af8749aa6c82d9
Author: Josh Rosen 
Date:   2015-11-04T00:33:13Z

Address potential leaks in TestShuffleDataContext.

commit ab3478124bea8cddeb1dd35611ae0f5cf4a67588
Author: Josh Rosen 
Date:   2015-11-04T00:35:23Z

Fix potential resource leak in ChunkFetchIntegrationSuite







[GitHub] spark pull request: [SPARK-7097][SQL]: Partitioned tables should o...

2015-11-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5668#issuecomment-153536466
  
Build started.





[GitHub] spark pull request: [SPARK-7097][SQL]: Partitioned tables should o...

2015-11-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5668#issuecomment-153536447
  
 Build triggered.





[GitHub] spark pull request: [WIP][SPARK-11217][ML] save/load for non-meta ...

2015-11-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/9454#issuecomment-153536443
  
Merged build started.





[GitHub] spark pull request: [WIP] [SPARK-9656] [MLlib] [Python] Add missin...

2015-11-03 Thread holdenk
Github user holdenk commented on a diff in the pull request:

https://github.com/apache/spark/pull/9441#discussion_r43829896
  
--- Diff: python/pyspark/mllib/linalg/distributed.py ---
@@ -297,6 +444,20 @@ def numCols(self):
 """
 return self._java_matrix_wrapper.call("numCols")
 
+def computeGramianMatrix(self):
+"""
+Computes the Gramian matrix `A^T A`. Note that this cannot be
+computed on matrices with more than 65535 columns.
--- End diff --

We should maybe also add this note about the maximum number of columns to
`IndexedRowMatrix.scala` for consistency.
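
One plausible reason for the 65535 limit, stated here as an assumption rather than something confirmed in this thread: if the Gramian of an `n`-column matrix is accumulated as a packed upper triangle in a single JVM array, that array needs `n * (n + 1) / 2` entries, and its length must fit in a signed 32-bit integer. The arithmetic can be checked with a short Python sketch (the names are illustrative):

```python
INT_MAX = 2**31 - 1  # upper bound on the length of a JVM array


def packed_upper_triangle_size(n):
    """Entries in the upper triangle (including the diagonal) of an n x n matrix."""
    return n * (n + 1) // 2


for n in (65535, 65536):
    size = packed_upper_triangle_size(n)
    print(n, size, size <= INT_MAX)
```

Under that assumption, 65535 columns still fit (2147450880 entries) while 65536 columns do not (2147516416 entries).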





[GitHub] spark pull request: [SPARK-11389][CORE] Add support for off-heap m...

2015-11-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/9344#issuecomment-153536455
  
Merged build started.





[GitHub] spark pull request: [SPARK-11433] [SQL] Cleanup the subquery name ...

2015-11-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/9385#issuecomment-153536456
  
Merged build started.





[GitHub] spark pull request: [WIP][SPARK-11217][ML] save/load for non-meta ...

2015-11-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/9454#issuecomment-153536434
  
 Merged build triggered.





[GitHub] spark pull request: [SPARK-11389][CORE] Add support for off-heap m...

2015-11-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/9344#issuecomment-153536439
  
 Merged build triggered.





[GitHub] spark pull request: [SPARK-11433] [SQL] Cleanup the subquery name ...

2015-11-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/9385#issuecomment-153536438
  
 Merged build triggered.





[GitHub] spark pull request: [WIP] [SPARK-9656] [MLlib] [Python] Add missin...

2015-11-03 Thread holdenk
Github user holdenk commented on a diff in the pull request:

https://github.com/apache/spark/pull/9441#discussion_r43829680
  
--- Diff: python/pyspark/mllib/linalg/distributed.py ---
@@ -500,6 +661,25 @@ def numCols(self):
 """
 return self._java_matrix_wrapper.call("numCols")
 
+def transpose(self):
+"""
+Transpose this CoordinateMatrix.
+
+>>> entries = sc.parallelize([MatrixEntry(0, 0, 1.2),
+...   MatrixEntry(1, 0, 2),
+...   MatrixEntry(2, 1, 3.7)])
+>>> mat = CoordinateMatrix(entries)
+>>> mat_transposed = mat.transpose()
+
--- End diff --

Is this blank line intentional?





[GitHub] spark pull request: [SPARK-11433] [SQL] Cleanup the subquery name ...

2015-11-03 Thread dbtsai
Github user dbtsai commented on the pull request:

https://github.com/apache/spark/pull/9385#issuecomment-153536227
  
Jenkins, add to whitelist





[GitHub] spark pull request: [SPARK-11433] [SQL] Cleanup the subquery name ...

2015-11-03 Thread dbtsai
Github user dbtsai commented on the pull request:

https://github.com/apache/spark/pull/9385#issuecomment-153536164
  
I think there is some issue in Jenkins. 





[GitHub] spark pull request: [SPARK-11329] [SQL] Cleanup from spark-11329 f...

2015-11-03 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/9442





[GitHub] spark pull request: [SPARK-10863][SPARKR] Method coltypes() to get...

2015-11-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8984#issuecomment-153535524
  
Merged build finished. Test FAILed.





[GitHub] spark pull request: [SPARK-10863][SPARKR] Method coltypes() to get...

2015-11-03 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/8984#issuecomment-153535482
  
**[Test build #44977 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44977/consoleFull)** for PR 8984 at commit [`24976fb`](https://github.com/apache/spark/commit/24976fb7ae222507f28beec47e77ef8b1136fa9a).
 * This patch **fails SparkR unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: [SPARK-4980] [MLlib] Add decay factors to stre...

2015-11-03 Thread mengxr
Github user mengxr commented on a diff in the pull request:

https://github.com/apache/spark/pull/8022#discussion_r43829408
  
--- Diff: mllib/src/main/scala/org/apache/spark/mllib/regression/StreamingLinearAlgorithm.scala ---
@@ -91,7 +112,22 @@ abstract class StreamingLinearAlgorithm[
 }
 data.foreachRDD { (rdd, time) =>
   if (!rdd.isEmpty) {
-model = Some(algorithm.run(rdd, model.get.weights))
+val newModel = algorithm.run(rdd, model.get.weights)
+
+val numNewDataPoints = rdd.count()
+val discount = getDiscount(numNewDataPoints)
+
+val updatedDataWeight = previousDataWeight * discount + numNewDataPoints
+// updatedDataWeight >= 1 because rdd is not empty;
+// no need to check division by zero in below
+val lambda = numNewDataPoints / updatedDataWeight
+
+BLAS.scal(lambda, newModel.weights)
+BLAS.axpy(1-lambda, model.get.weights, newModel.weights)
--- End diff --

Do we have any references for this merging scheme? I assume that it works in
many cases, but there is no theoretical guarantee.
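
For reference, the update in the diff above computes `lambda = numNewDataPoints / (previousDataWeight * discount + numNewDataPoints)` and then replaces the weights with `lambda * new + (1 - lambda) * old`. A small Python sketch of that arithmetic follows; the function name and the plain-list representation are illustrative, not Spark's API.

```python
def merge_weights(old, new, previous_data_weight, discount, num_new_points):
    """Blend old and new model weights as in the quoted diff (illustrative only)."""
    # Discount the weight of past data, then add the size of the new batch.
    updated_data_weight = previous_data_weight * discount + num_new_points
    # Share of the total (discounted) data that the new batch represents.
    lam = num_new_points / updated_data_weight
    # Equivalent of scal(lambda, new) followed by axpy(1 - lambda, old, new).
    merged = [lam * n + (1.0 - lam) * o for o, n in zip(old, new)]
    return merged, updated_data_weight


merged, weight = merge_weights([1.0, 0.0], [0.0, 1.0],
                               previous_data_weight=30.0,
                               discount=0.5,
                               num_new_points=5)
print(merged, weight)
```

In this example the new batch accounts for 5 of 20 effective data points, so the new model gets a quarter of the weight. This is a convex combination of the two weight vectors, which is why it is a heuristic rather than something with a general guarantee.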





[GitHub] spark pull request: [SPARK-10863][SPARKR] Method coltypes() to get...

2015-11-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8984#issuecomment-153535525
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44977/
Test FAILed.





[GitHub] spark pull request: [SPARK-11329] [SQL] Cleanup from spark-11329 f...

2015-11-03 Thread yhuai
Github user yhuai commented on the pull request:

https://github.com/apache/spark/pull/9442#issuecomment-153535172
  
Thanks! LGTM. Merging to master!





[GitHub] spark pull request: [SPARK-11486][SQL] Fix TungstenAggregate's han...

2015-11-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/9447#issuecomment-153535188
  
Merged build finished. Test PASSed.





[GitHub] spark pull request: [SPARK-11486][SQL] Fix TungstenAggregate's han...

2015-11-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/9447#issuecomment-153535190
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44964/
Test PASSed.





[GitHub] spark pull request: [SPARK-11486][SQL] Fix TungstenAggregate's han...

2015-11-03 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/9447#issuecomment-153535095
  
**[Test build #44964 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44964/consoleFull)** for PR 9447 at commit [`764a2b2`](https://github.com/apache/spark/commit/764a2b25b31c54c1774b3c1941ad200e40228cc2).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: [SPARK-11485][SQL] Make DataFrameHolder and Da...

2015-11-03 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/9445#issuecomment-153534880
  
**[Test build #44979 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44979/consoleFull)** for PR 9445 at commit [`f75c4a5`](https://github.com/apache/spark/commit/f75c4a5f330db23f0bd65fcc6043b9cbffc076aa).





[GitHub] spark pull request: [DOC] Missing link to R DataFrame API doc

2015-11-03 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/9394





[GitHub] spark pull request: [DOC] Missing link to R DataFrame API doc

2015-11-03 Thread Lewuathe
Github user Lewuathe commented on the pull request:

https://github.com/apache/spark/pull/9394#issuecomment-153534639
  
@felixcheung  @shivaram Thank you so much!





[GitHub] spark pull request: [SPARK-11484][WebUi] Using proxyBase set by sp...

2015-11-03 Thread vanzin
Github user vanzin commented on the pull request:

https://github.com/apache/spark/pull/9448#issuecomment-153534396
  
LGTM. My only concern is that someone might be using yarn-client mode with 
a different UI proxy than the RM; this change makes that impossible, since the 
AM always sets `spark.ui.proxyBase`. I find that case rather unlikely though.





[GitHub] spark pull request: [DOC] Missing link to R DataFrame API doc

2015-11-03 Thread shivaram
Github user shivaram commented on the pull request:

https://github.com/apache/spark/pull/9394#issuecomment-153534358
  
Thanks again. Merging this





[GitHub] spark pull request: [SPARK-11484][WebUi] Using proxyBase set by sp...

2015-11-03 Thread vanzin
Github user vanzin commented on a diff in the pull request:

https://github.com/apache/spark/pull/9448#discussion_r43828670
  
--- Diff: core/src/main/scala/org/apache/spark/ui/UIUtils.scala ---
@@ -143,12 +143,12 @@ private[spark] object UIUtils extends Logging {
 
   // Yarn has to go through a proxy so the base uri is provided and has to be on all links
   def uiRoot: String = {
-if (System.getenv("APPLICATION_WEB_PROXY_BASE") != null) {
-  System.getenv("APPLICATION_WEB_PROXY_BASE")
-} else if (System.getProperty("spark.ui.proxyBase") != null) {
+// SPARK-11484 - Use the proxyBase set by the AM, if not found then use env.
+if (System.getProperty("spark.ui.proxyBase") != null) {
--- End diff --

I'd use the opportunity to clean this up a bit:


sys.props.get("spark.ui.proxyBase").orElse(sys.env.get("APPLICATION_WEB_PROXY_BASE")).getOrElse("")
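
A minimal sketch of the Option-based lookup suggested above, assuming a hypothetical helper `resolveUiRoot` (not part of Spark) that takes the two lookups as parameters so the precedence can be exercised without touching real JVM state. It relies on `sys.props.get` and `sys.env.get` returning `Option[String]`:

```scala
object UiRootSketch {
  // Hypothetical helper, not part of Spark: the lookups are injected so the
  // precedence (system property first, then environment) is easy to check.
  def resolveUiRoot(
      props: String => Option[String],
      env: String => Option[String]): String =
    props("spark.ui.proxyBase")
      .orElse(env("APPLICATION_WEB_PROXY_BASE"))
      .getOrElse("")

  // The same lookup against the real JVM properties and environment.
  def uiRoot: String =
    resolveUiRoot(k => sys.props.get(k), k => sys.env.get(k))
}
```

Because `orElse` takes its argument by name, the environment is only consulted when the property is absent.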





[GitHub] spark pull request: [SPARK-11484][WebUi] Using proxyBase set by sp...

2015-11-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/9448#issuecomment-153534052
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44966/
Test PASSed.





[GitHub] spark pull request: [SPARK-11484][WebUi] Using proxyBase set by sp...

2015-11-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/9448#issuecomment-153534051
  
Merged build finished. Test PASSed.





[GitHub] spark pull request: [DOC] Missing link to R DataFrame API doc

2015-11-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/9394#issuecomment-153533981
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44976/
Test PASSed.





[GitHub] spark pull request: [DOC] Missing link to R DataFrame API doc

2015-11-03 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/9394#issuecomment-153533901
  
**[Test build #44976 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44976/consoleFull)**
 for PR 9394 at commit 
[`12f3a74`](https://github.com/apache/spark/commit/12f3a742db8f9dde54b7f8e1dfa9e880da1894a4).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: [DOC] Missing link to R DataFrame API doc

2015-11-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/9394#issuecomment-153533980
  
Merged build finished. Test PASSed.





[GitHub] spark pull request: [SPARK-11484][WebUi] Using proxyBase set by sp...

2015-11-03 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/9448#issuecomment-153533948
  
**[Test build #44966 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44966/consoleFull)**
 for PR 9448 at commit 
[`1b272a4`](https://github.com/apache/spark/commit/1b272a4724ee183a3a417bd2571b497ca1bd887d).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: [SPARK-11485][SQL] Make DataFrameHolder and Da...

2015-11-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/9445#issuecomment-153533742
  
Merged build finished. Test PASSed.





[GitHub] spark pull request: [SPARK-11485][SQL] Make DataFrameHolder and Da...

2015-11-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/9445#issuecomment-153533744
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44960/
Test PASSed.





[GitHub] spark pull request: [SPARK-11485][SQL] Make DataFrameHolder and Da...

2015-11-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/9445#issuecomment-153533639
  
Merged build started.





[GitHub] spark pull request: [SPARK-11485][SQL] Make DataFrameHolder and Da...

2015-11-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/9445#issuecomment-153533624
  
 Merged build triggered.





[GitHub] spark pull request: [SPARK-11485][SQL] Make DataFrameHolder and Da...

2015-11-03 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/9445#issuecomment-153533647
  
**[Test build #44960 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44960/consoleFull)**
 for PR 9445 at commit 
[`f623c5d`](https://github.com/apache/spark/commit/f623c5d545f51c9cfec6caf4bb6dbba97a92e320).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
   * `case class ExecutorLostFailure(`





[GitHub] spark pull request: [SPARK-11425] [SPARK-11486] Improve hybrid agg...

2015-11-03 Thread yhuai
Github user yhuai commented on a diff in the pull request:

https://github.com/apache/spark/pull/9383#discussion_r43828335
  
--- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/AggregationQuerySuite.scala ---
@@ -589,6 +588,13 @@ abstract class AggregationQuerySuite extends QueryTest with SQLTestUtils with Te
 }
   }
 
+  test("no aggregation function") {
+val df = sqlContext.range(20).selectExpr("id", "repeat(id, 1) as s")
+  .groupBy("s").count()
+  .groupBy().count()
+checkAnswer(df, Row(20) :: Nil)
+  }
--- End diff --

Maybe add a description of the original problem here, or just add a link?
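
For readers without a Spark session at hand, here is a rough sketch of what the quoted test expects, using plain Scala collections as a stand-in for the DataFrame calls. The object and value names are illustrative assumptions, and the sketch says nothing about the original bug itself:

```scala
object NoAggFunctionSketch {
  // Stand-in for sqlContext.range(20).selectExpr("id", "repeat(id, 1) as s"):
  // each id paired with its string form.
  val rows: Seq[(Long, String)] = (0L until 20L).map(id => (id, id.toString))

  // Stand-in for groupBy("s").count(): one (key, count) pair per distinct string.
  val perKey: Map[String, Int] =
    rows.groupBy(_._2).map { case (s, group) => (s, group.size) }

  // Stand-in for groupBy().count(): the number of rows the previous step produced.
  val total: Int = perKey.size
}
```

Since the 20 ids map to 20 distinct strings, the outer count is 20, which matches the `Row(20)` the test checks for.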





[GitHub] spark pull request: [SPARK-10949] Update Snappy version to 1.1.2

2015-11-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/9439#issuecomment-153533385
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44958/
Test PASSed.





[GitHub] spark pull request: [SPARK-10949] Update Snappy version to 1.1.2

2015-11-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/9439#issuecomment-153533381
  
Merged build finished. Test PASSed.





[GitHub] spark pull request: [SPARK-11425] [SPARK-11486] Improve hybrid agg...

2015-11-03 Thread yhuai
Github user yhuai commented on a diff in the pull request:

https://github.com/apache/spark/pull/9383#discussion_r43828243
  
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/UnsafeFixedWidthAggregationMapSuite.scala ---
@@ -183,31 +180,30 @@ class UnsafeFixedWidthAggregationMapSuite
   false // disable perf metrics
 )
 
+var map = createMap()
 val keys = randomStrings(1024).take(512)
 keys.foreach { keyString =>
   val buf = map.getAggregationBuffer(InternalRow(UTF8String.fromString(keyString)))
   buf.setInt(0, keyString.length)
   assert(buf != null)
 }
-
-// Convert the map into a sorter
 val sorter = map.destructAndCreateExternalSorter()
 
 // Add more keys to the sorter and make sure the results come out sorted.
 val additionalKeys = randomStrings(1024)
-val keyConverter = UnsafeProjection.create(groupKeySchema)
-val valueConverter = UnsafeProjection.create(aggBufferSchema)
-
+map = createMap()
 additionalKeys.zipWithIndex.foreach { case (str, i) =>
-  val k = InternalRow(UTF8String.fromString(str))
-  val v = InternalRow(str.length)
-  sorter.insertKV(keyConverter.apply(k), valueConverter.apply(v))
+  val buf = map.getAggregationBuffer(InternalRow(UTF8String.fromString(str)))
+  buf.setInt(0, str.length)
 
   if ((i % 100) == 0) {
-memoryManager.markExecutionAsOutOfMemoryOnce()
-sorter.closeCurrentPage()
+val sorter2 = map.destructAndCreateExternalSorter()
+sorter.merge(sorter2)
+map = createMap()
   }
 }
+val sorter2 = map.destructAndCreateExternalSorter()
+sorter.merge(sorter2)
--- End diff --

It looks like this updated test does not do exactly the same thing as the 
previous version. In the old version, after we got the sorter, we added new 
records to the sorter directly. In this updated version, we always use 
`merge` to add spilled files to the sorter. Should we just create new tests for 
this new behavior (using `merge`)?
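
To make the two code paths concrete, here is a toy sketch (an in-memory `ToySorter`, which is an assumption for illustration and not Spark's external sorter) contrasting direct insertion with batch-wise `merge`. Both paths yield the same sorted keys, yet they exercise different methods, which is why separate tests for each path may be worthwhile:

```scala
import scala.collection.mutable.ArrayBuffer

object SorterPathsSketch {
  // Toy stand-in for an external key/value sorter; everything stays in memory.
  final class ToySorter {
    private val records = ArrayBuffer.empty[(String, Int)]
    def insert(key: String, value: Int): Unit = records += ((key, value))
    def merge(other: ToySorter): Unit = records ++= other.records
    def sortedKeys: Seq[String] = records.map(_._1).sorted.toSeq
  }

  private def fromKeys(keys: Seq[String]): ToySorter = {
    val s = new ToySorter
    keys.foreach(k => s.insert(k, k.length))
    s
  }

  // Old test shape: obtain a sorter, then insert further records directly.
  def directInsert(first: Seq[String], more: Seq[String]): Seq[String] = {
    val sorter = fromKeys(first)
    more.foreach(k => sorter.insert(k, k.length))
    sorter.sortedKeys
  }

  // New test shape: further records always arrive through merge, one batch at a time.
  def viaMerge(first: Seq[String], more: Seq[String], batch: Int): Seq[String] = {
    val sorter = fromKeys(first)
    more.grouped(batch).foreach(chunk => sorter.merge(fromKeys(chunk)))
    sorter.sortedKeys
  }
}
```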





[GitHub] spark pull request: [SPARK-11489][SQL] Only include common first o...

2015-11-03 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/9446





[GitHub] spark pull request: [SPARK-10949] Update Snappy version to 1.1.2

2015-11-03 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/9439#issuecomment-153533167
  
**[Test build #44958 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44958/consoleFull)**
 for PR 9439 at commit 
[`f9a021b`](https://github.com/apache/spark/commit/f9a021bd7d75992d8a0f9c82ba33efa60a5df55b).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: [SPARK-11466] [core] Avoid mockito in multi-th...

2015-11-03 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/9425





[GitHub] spark pull request: [WIP][SPARK-11217][ML] save/load for non-meta ...

2015-11-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/9454#issuecomment-153532873
  
Merged build finished. Test FAILed.





[GitHub] spark pull request: [WIP][SPARK-11217][ML] save/load for non-meta ...

2015-11-03 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/9454#issuecomment-153532869
  
**[Test build #44978 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44978/consoleFull)**
 for PR 9454 at commit 
[`df81d61`](https://github.com/apache/spark/commit/df81d61f73c6a854913df638770f0b0409f046a3).
 * This patch **fails Scala style tests**.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
   * `abstract class Saver extends BaseSaveLoad `
   * `trait Saveable `
   * `abstract class Loader[T] extends BaseSaveLoad `
   * `trait Loadable[T] `





[GitHub] spark pull request: [WIP][SPARK-11217][ML] save/load for non-meta ...

2015-11-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/9454#issuecomment-153532876
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44978/
Test FAILed.





[GitHub] spark pull request: [SPARK-10863][SPARKR] Method coltypes() to get...

2015-11-03 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/8984#issuecomment-153532681
  
**[Test build #44977 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44977/consoleFull)**
 for PR 8984 at commit 
[`24976fb`](https://github.com/apache/spark/commit/24976fb7ae222507f28beec47e77ef8b1136fa9a).





[GitHub] spark pull request: [WIP][SPARK-11217][ML] save/load for non-meta ...

2015-11-03 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/9454#issuecomment-153532562
  
**[Test build #44978 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44978/consoleFull)**
 for PR 9454 at commit 
[`df81d61`](https://github.com/apache/spark/commit/df81d61f73c6a854913df638770f0b0409f046a3).





[GitHub] spark pull request: [SPARK-11425] [SPARK-11486] Improve Hybrid agg...

2015-11-03 Thread yhuai
Github user yhuai commented on a diff in the pull request:

https://github.com/apache/spark/pull/9383#discussion_r43827776
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/TungstenAggregationIterator.scala ---
@@ -502,44 +511,37 @@ class TungstenAggregationIterator(
 processRow(buffer, newInput)
   }
 } else {
-  while (!sortBased && inputIter.hasNext) {
+  var i = 0
+  while (inputIter.hasNext) {
 val newInput = inputIter.next()
 numInputRows += 1
 val groupingKey = groupProjection.apply(newInput)
-val buffer: UnsafeRow = hashMap.getAggregationBufferFromUnsafeRow(groupingKey)
+var buffer: UnsafeRow = null
+if (i < fallbackStartsAt) {
+  buffer = hashMap.getAggregationBufferFromUnsafeRow(groupingKey)
+}
 if (buffer == null) {
-  // buffer == null means that we could not allocate more memory.
-  // Now, we need to spill the map and switch to sort-based aggregation.
-  switchToSortBasedAggregation(groupingKey, newInput)
-} else {
-  processRow(buffer, newInput)
+  val sorter = hashMap.destructAndCreateExternalSorter()
+  if (externalSorter == null) {
+externalSorter = sorter
+  } else {
+externalSorter.merge(sorter)
+  }
+  i = 0
+  hashMap = createHashMap()
--- End diff --

I mean that, from this part of the code, it is not obvious when we free the map.





[GitHub] spark pull request: [SPARK-11489][SQL] Only include common first o...

2015-11-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/9446#issuecomment-153532514
  
Merged build finished. Test PASSed.





[GitHub] spark pull request: [SPARK-2960][Deploy] Support executing Spark f...

2015-11-03 Thread jerryshao
Github user jerryshao commented on a diff in the pull request:

https://github.com/apache/spark/pull/8669#discussion_r43827693
  
--- Diff: sbin/start-slaves.sh ---
@@ -52,11 +51,11 @@ if [ "$SPARK_MASTER_IP" = "" ]; then
 fi
 
 if [ "$START_TACHYON" == "true" ]; then
-  "$sbin/slaves.sh" cd "$SPARK_HOME" \; "$sbin"/../tachyon/bin/tachyon 
bootstrap-conf "$SPARK_MASTER_IP"
+  "${SPARK_HOME}/sbin/slaves.sh" cd "$SPARK_HOME" \; 
"${SPARK_HOME}/sbin"/../tachyon/bin/tachyon bootstrap-conf "$SPARK_MASTER_IP"
--- End diff --

Sorry @srowen, I hadn't noticed your comment. I will fix this today.





[GitHub] spark pull request: [SPARK-11489][SQL] Only include common first o...

2015-11-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/9446#issuecomment-153532516
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44963/
Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11425] [SPARK-11486] Improve Hybrid agg...

2015-11-03 Thread yhuai
Github user yhuai commented on a diff in the pull request:

https://github.com/apache/spark/pull/9383#discussion_r43827738
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/TungstenAggregationIterator.scala ---
@@ -502,44 +511,37 @@ class TungstenAggregationIterator(
 processRow(buffer, newInput)
   }
 } else {
-  while (!sortBased && inputIter.hasNext) {
+  var i = 0
+  while (inputIter.hasNext) {
 val newInput = inputIter.next()
 numInputRows += 1
 val groupingKey = groupProjection.apply(newInput)
-val buffer: UnsafeRow = hashMap.getAggregationBufferFromUnsafeRow(groupingKey)
+var buffer: UnsafeRow = null
+if (i < fallbackStartsAt) {
+  buffer = hashMap.getAggregationBufferFromUnsafeRow(groupingKey)
+}
 if (buffer == null) {
-  // buffer == null means that we could not allocate more memory.
-  // Now, we need to spill the map and switch to sort-based aggregation.
-  switchToSortBasedAggregation(groupingKey, newInput)
-} else {
-  processRow(buffer, newInput)
+  val sorter = hashMap.destructAndCreateExternalSorter()
+  if (externalSorter == null) {
+externalSorter = sorter
+  } else {
+externalSorter.merge(sorter)
+  }
+  i = 0
+  hashMap = createHashMap()
--- End diff --

Before we create the new map, I guess it will not hurt if we call 
`hashMap`? We would just make it clear that the previous map will be freed.
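
The fallback loop in the quoted diff can be sketched with toy types. Everything below is an illustrative assumption rather than Spark's implementation: `ToyMap`, its `free()` method, and `countByKey` are hypothetical names, and the "sorter" is just an in-memory buffer. The point is only to show where an explicit release would sit relative to creating the replacement map:

```scala
import scala.collection.mutable

object HybridAggSketch {
  // Toy stand-in for the aggregation hash map; free() is a hypothetical
  // release hook that clears the map and records that it was released.
  final class ToyMap {
    val counts = mutable.HashMap.empty[String, Long]
    var freed = false
    def free(): Unit = { counts.clear(); freed = true }
  }

  def countByKey(input: Iterator[String], fallbackStartsAt: Int): Map[String, Long] = {
    // Partial results handed over by each spilled map.
    val spilled = mutable.ArrayBuffer.empty[(String, Long)]
    var map = new ToyMap

    def spill(): Unit = {
      spilled ++= map.counts // hand the partial counts to the "sorter"
      map.free()             // explicit release before the replacement is created
      map = new ToyMap
    }

    for (key <- input) {
      // A new key that would push the map past the threshold triggers a spill.
      if (!map.counts.contains(key) && map.counts.size >= fallbackStartsAt) spill()
      map.counts(key) = map.counts.getOrElse(key, 0L) + 1L
    }
    spill()

    // Combine partial counts that share a key, as a sort-based pass would.
    spilled.groupBy(_._1).map { case (k, vs) => (k, vs.map(_._2).sum) }
  }
}
```

With the release placed directly before the new map is created, a reader of the loop can see where the old map's lifetime ends without following `destructAndCreateExternalSorter` into another file.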





[GitHub] spark pull request: [SPARK-11489][SQL] Only include common first o...

2015-11-03 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/9446#issuecomment-153532394
  
**[Test build #44963 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44963/consoleFull)**
 for PR 9446 at commit 
[`132ea9a`](https://github.com/apache/spark/commit/132ea9a1769bbaf1d0e0c662d26a947ef34dd73f).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org


