[GitHub] spark pull request: [SPARK-6390] [SQL] [MLlib] Port MatrixUDT to P...

2015-06-17 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/6354


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-6390] [SQL] [MLlib] Port MatrixUDT to P...

2015-06-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/6354#issuecomment-112886719
  
Merged build finished. Test PASSed.





[GitHub] spark pull request: [SPARK-6390] [SQL] [MLlib] Port MatrixUDT to P...

2015-06-17 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/6354#issuecomment-112886667
  
[Test build #35046 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/35046/console) for PR 6354 at commit [`fc4dc1e`](https://github.com/apache/spark/commit/fc4dc1e8d69e0eb6803fab23e8835b9753908f3a).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `class MatrixUDT(UserDefinedType):`






[GitHub] spark pull request: [SPARK-6390] [SQL] [MLlib] Port MatrixUDT to P...

2015-06-17 Thread davies
Github user davies commented on the pull request:

https://github.com/apache/spark/pull/6354#issuecomment-112875483
  
LGTM, waiting for the tests.





[GitHub] spark pull request: [SPARK-6390] [SQL] [MLlib] Port MatrixUDT to P...

2015-06-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/6354#issuecomment-112858102
  
Merged build started.





[GitHub] spark pull request: [SPARK-6390] [SQL] [MLlib] Port MatrixUDT to P...

2015-06-17 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/6354#issuecomment-112858304
  
[Test build #35046 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/35046/consoleFull) for PR 6354 at commit [`fc4dc1e`](https://github.com/apache/spark/commit/fc4dc1e8d69e0eb6803fab23e8835b9753908f3a).





[GitHub] spark pull request: [SPARK-6390] [SQL] [MLlib] Port MatrixUDT to P...

2015-06-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/6354#issuecomment-112858082
  
 Merged build triggered.





[GitHub] spark pull request: [SPARK-6390] [SQL] [MLlib] Port MatrixUDT to P...

2015-06-17 Thread MechCoder
Github user MechCoder commented on the pull request:

https://github.com/apache/spark/pull/6354#issuecomment-112857730
  
jenkins retest this please





[GitHub] spark pull request: [SPARK-6390] [SQL] [MLlib] Port MatrixUDT to P...

2015-06-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/6354#issuecomment-112830735
  
Merged build finished. Test FAILed.





[GitHub] spark pull request: [SPARK-6390] [SQL] [MLlib] Port MatrixUDT to P...

2015-06-17 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/6354#issuecomment-112830642
  
[Test build #35043 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/35043/console) for PR 6354 at commit [`fc4dc1e`](https://github.com/apache/spark/commit/fc4dc1e8d69e0eb6803fab23e8835b9753908f3a).
 * This patch **fails PySpark unit tests**.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `class MatrixUDT(UserDefinedType):`






[GitHub] spark pull request: [SPARK-6390] [SQL] [MLlib] Port MatrixUDT to P...

2015-06-17 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/6354#issuecomment-112805307
  
[Test build #35043 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/35043/consoleFull) for PR 6354 at commit [`fc4dc1e`](https://github.com/apache/spark/commit/fc4dc1e8d69e0eb6803fab23e8835b9753908f3a).





[GitHub] spark pull request: [SPARK-6390] [SQL] [MLlib] Port MatrixUDT to P...

2015-06-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/6354#issuecomment-112803651
  
Merged build started.





[GitHub] spark pull request: [SPARK-6390] [SQL] [MLlib] Port MatrixUDT to P...

2015-06-17 Thread MechCoder
Github user MechCoder commented on the pull request:

https://github.com/apache/spark/pull/6354#issuecomment-112803758
  
ping @davies anything left?





[GitHub] spark pull request: [SPARK-6390] [SQL] [MLlib] Port MatrixUDT to P...

2015-06-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/6354#issuecomment-112803607
  
 Merged build triggered.





[GitHub] spark pull request: [SPARK-6390] [SQL] [MLlib] Port MatrixUDT to P...

2015-06-01 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/6354#issuecomment-107773521
  
[Test build #868 has finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/868/consoleFull) for PR 6354 at commit [`c940a44`](https://github.com/apache/spark/commit/c940a44191e072894289e67924922860b30e4e8d).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `class MatrixUDT(UserDefinedType):`






[GitHub] spark pull request: [SPARK-6390] [SQL] [MLlib] Port MatrixUDT to P...

2015-06-01 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/6354#issuecomment-107751555
  
[Test build #868 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/868/consoleFull) for PR 6354 at commit [`c940a44`](https://github.com/apache/spark/commit/c940a44191e072894289e67924922860b30e4e8d).





[GitHub] spark pull request: [SPARK-6390] [SQL] [MLlib] Port MatrixUDT to P...

2015-05-26 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/6354#issuecomment-105685053
  
**[Test build #33533 timed out](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/33533/consoleFull)** for PR 6354 at commit [`c940a44`](https://github.com/apache/spark/commit/c940a44191e072894289e67924922860b30e4e8d) after a configured wait of `150m`.





[GitHub] spark pull request: [SPARK-6390] [SQL] [MLlib] Port MatrixUDT to P...

2015-05-26 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/6354#issuecomment-105685060
  
Merged build finished. Test FAILed.





[GitHub] spark pull request: [SPARK-6390] [SQL] [MLlib] Port MatrixUDT to P...

2015-05-26 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/6354#issuecomment-105685061
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/33533/
Test FAILed.





[GitHub] spark pull request: [SPARK-6390] [SQL] [MLlib] Port MatrixUDT to P...

2015-05-26 Thread davies
Github user davies commented on a diff in the pull request:

https://github.com/apache/spark/pull/6354#discussion_r31074471
  
--- Diff: python/pyspark/sql/dataframe.py ---
@@ -224,7 +224,10 @@ def schema(self):
         StructType(List(StructField(age,IntegerType,true),StructField(name,StringType,true)))
         """
         if self._schema is None:
-            self._schema = _parse_datatype_json_string(self._jdf.schema().json())
+            try:
+                self._schema = _parse_datatype_json_string(self._jdf.schema().json())
+            except AttributeError:
+                raise Exception("Unable to parse datatype from schema.")
--- End diff --

Could you put something about the original exception into the message?
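
One way to do this, sketched here with a hypothetical `load_schema` helper and a stand-in parser rather than the actual `DataFrame.schema` code, is to bind the caught exception and append its text to the new message:

```python
def load_schema(parse, json_string):
    """Hypothetical helper: call ``parse`` and re-raise AttributeError as a plain Exception."""
    try:
        return parse(json_string)
    except AttributeError as e:
        # Append the original error text so the root cause stays visible.
        raise Exception("Unable to parse datatype from schema. %s" % e)


def broken_parser(json_string):
    # Stand-in for a parser that fails while resolving a type (illustrative only).
    raise AttributeError("'module' object has no attribute 'MatrixUDT'")


try:
    load_schema(broken_parser, "{}")
except Exception as e:
    message = str(e)
```

Here `message` carries both the generic prefix and the text of the original `AttributeError`.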





[GitHub] spark pull request: [SPARK-6390] [SQL] [MLlib] Port MatrixUDT to P...

2015-05-26 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/6354#issuecomment-105644996
  
[Test build #33533 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/33533/consoleFull) for PR 6354 at commit [`c940a44`](https://github.com/apache/spark/commit/c940a44191e072894289e67924922860b30e4e8d).





[GitHub] spark pull request: [SPARK-6390] [SQL] [MLlib] Port MatrixUDT to P...

2015-05-26 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/6354#issuecomment-105644587
  
 Merged build triggered.





[GitHub] spark pull request: [SPARK-6390] [SQL] [MLlib] Port MatrixUDT to P...

2015-05-26 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/6354#issuecomment-105644614
  
Merged build started.





[GitHub] spark pull request: [SPARK-6390] [SQL] [MLlib] Port MatrixUDT to P...

2015-05-26 Thread MechCoder
Github user MechCoder commented on the pull request:

https://github.com/apache/spark/pull/6354#issuecomment-105644210
  
@davies Thanks, that worked. Could you give it a pass now?





[GitHub] spark pull request: [SPARK-6390] [SQL] [MLlib] Port MatrixUDT to P...

2015-05-26 Thread davies
Github user davies commented on the pull request:

https://github.com/apache/spark/pull/6354#issuecomment-105608870
  
@MechCoder I had the same problem, but it turned out that there was an outdated python/lib/pyspark.zip. After removing it, it worked fine.





[GitHub] spark pull request: [SPARK-6390] [SQL] [MLlib] Port MatrixUDT to P...

2015-05-26 Thread MechCoder
Github user MechCoder commented on the pull request:

https://github.com/apache/spark/pull/6354#issuecomment-105596699
  
Sure.

from pyspark.mllib.linalg import DenseMatrix, SparseMatrix, MatrixUDT
dm1 = DenseMatrix(3, 2, [0, 1, 4, 5, 9, 10])
sm1 = SparseMatrix(1, 1, [0, 1], [0], [2.0])
rdd = sc.parallelize([("dense", dm1)])
df = rdd.toDF() 
df.collect()

I know there is something silly, but I'm not able to figure it out on my own:





[GitHub] spark pull request: [SPARK-6390] [SQL] [MLlib] Port MatrixUDT to P...

2015-05-26 Thread davies
Github user davies commented on the pull request:

https://github.com/apache/spark/pull/6354#issuecomment-105574591
  
@MechCoder Could you add a unit test to reproduce the error?





[GitHub] spark pull request: [SPARK-6390] [SQL] [MLlib] Port MatrixUDT to P...

2015-05-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/6354#issuecomment-104879867
  
Merged build finished. Test PASSed.





[GitHub] spark pull request: [SPARK-6390] [SQL] [MLlib] Port MatrixUDT to P...

2015-05-23 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/6354#issuecomment-104879844
  
[Test build #33404 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/33404/consoleFull) for PR 6354 at commit [`aa9c391`](https://github.com/apache/spark/commit/aa9c3914d761c63ace177974b74d3eeb01d389e6).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `class MatrixUDT(UserDefinedType):`






[GitHub] spark pull request: [SPARK-6390] [SQL] [MLlib] Port MatrixUDT to P...

2015-05-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/6354#issuecomment-104879869
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/33404/
Test PASSed.





[GitHub] spark pull request: [SPARK-6390] [SQL] [MLlib] Port MatrixUDT to P...

2015-05-23 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/6354#issuecomment-104868248
  
[Test build #33404 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/33404/consoleFull) for PR 6354 at commit [`aa9c391`](https://github.com/apache/spark/commit/aa9c3914d761c63ace177974b74d3eeb01d389e6).





[GitHub] spark pull request: [SPARK-6390] [SQL] [MLlib] Port MatrixUDT to P...

2015-05-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/6354#issuecomment-104868178
  
 Merged build triggered.





[GitHub] spark pull request: [SPARK-6390] [SQL] [MLlib] Port MatrixUDT to P...

2015-05-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/6354#issuecomment-104868180
  
Merged build started.





[GitHub] spark pull request: [SPARK-6390] [SQL] [MLlib] Port MatrixUDT to P...

2015-05-23 Thread MechCoder
Github user MechCoder commented on the pull request:

https://github.com/apache/spark/pull/6354#issuecomment-104855363
  
>  pyUDT should be defined in MatrixUDT

Thanks. I am now able to create a dataframe, but when I do `df.collect()` it crashes.





[GitHub] spark pull request: [SPARK-6390] [SQL] [MLlib] Port MatrixUDT to P...

2015-05-22 Thread davies
Github user davies commented on the pull request:

https://github.com/apache/spark/pull/6354#issuecomment-104790464
  
Could you also catch the exception from `_parse_datatype_json_string` in `DataFrame.schema()` and raise a different one (for example, just `Exception()`)? Otherwise the AttributeError will be handled specially, causing an infinite loop.
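
A toy reproduction of this kind of loop, assuming a class whose `__getattr__` consults the same property that failed (the `ToyFrame` class and `outcome` helper are hypothetical, not PySpark code): Python falls back to `__getattr__` whenever attribute lookup raises AttributeError, including from inside a property getter, so the getter and the fallback can end up calling each other.

```python
class ToyFrame:
    """Toy class for illustration only; not the PySpark DataFrame."""

    def __init__(self, wrap_errors):
        self._wrap_errors = wrap_errors

    def _parse(self):
        # Stand-in for a schema parser that fails with AttributeError.
        raise AttributeError("pyUDT")

    @property
    def schema(self):
        if not self._wrap_errors:
            return self._parse()
        try:
            return self._parse()
        except AttributeError as e:
            # A different exception type does not trigger the __getattr__ fallback.
            raise Exception("Unable to parse datatype from schema. %s" % e)

    def __getattr__(self, name):
        # Invoked when normal lookup raises AttributeError; reading
        # self.schema here re-enters the failing property.
        if name not in self.schema:
            raise AttributeError(name)
        return name


def outcome(frame):
    """Return the name of the exception type raised by reading frame.schema."""
    try:
        frame.schema
    except RecursionError:
        return "RecursionError"
    except Exception as e:
        return type(e).__name__
    return "ok"


unwrapped = outcome(ToyFrame(wrap_errors=False))
wrapped = outcome(ToyFrame(wrap_errors=True))
```

In this sketch the unwrapped variant recurses until the interpreter's recursion limit is hit, while the wrapped variant fails once with a plain `Exception`.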





[GitHub] spark pull request: [SPARK-6390] [SQL] [MLlib] Port MatrixUDT to P...

2015-05-22 Thread davies
Github user davies commented on the pull request:

https://github.com/apache/spark/pull/6354#issuecomment-104790085
  
@MechCoder pyUDT should be defined in MatrixUDT





[GitHub] spark pull request: [SPARK-6390] [SQL] [MLlib] Port MatrixUDT to P...

2015-05-22 Thread davies
Github user davies commented on a diff in the pull request:

https://github.com/apache/spark/pull/6354#discussion_r30934342
  
--- Diff: python/pyspark/mllib/linalg.py ---
@@ -163,6 +163,59 @@ def simpleString(self):
         return "vector"
 
 
+class MatrixUDT(UserDefinedType):
+    """
+    SQL user-defined type (UDT) for Matrix.
+    """
+
+    @classmethod
+    def sqlType(cls):
+        return StructType([
+            StructField("type", ByteType(), False),
+            StructField("numRows", IntegerType(), False),
+            StructField("numCols", IntegerType(), False),
+            StructField("colPtrs", ArrayType(IntegerType(), False), True),
+            StructField("rowIndices", ArrayType(IntegerType(), False), True),
+            StructField("values", ArrayType(DoubleType(), False), True),
+            StructField("isTransposed", BooleanType(), False)])
+
+    @classmethod
+    def module(cls):
+        return "pyspark.mllib.linalg"
+
+    @classmethod
+    def scalaUDT(cls):
+        return "org.apache.spark.mllib.linalg.MatrixUDT"
+
+    def serialize(self, obj):
+        if isinstance(obj, SparseMatrix):
+            colPtrs = [int(i) for i in obj.colPtrs]
+            rowIndices = [int(i) for i in obj.rowIndices]
+            values = [float(v) for v in obj.values]
+            return (0, obj.numRows, obj.numCols, colPtrs,
+                    rowIndices, values, bool(obj.isTransposed))
+        elif isinstance(obj, DenseMatrix):
+            values = [float(v) for v in obj.values]
+            return (1, obj.numRows, obj.numCols, None, None, values,
+                    bool(obj.isTransposed))
+        else:
+            raise TypeError("cannot serialize %r of type %r" % (obj, type(obj)))
--- End diff --

The `repr(obj)` could be very long; I think just having `type(obj)` is fine.
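
A small sketch of the difference, using a hypothetical `unsupported_type_error` helper and a large list as the unsupported object: formatting only the type keeps the message short no matter how big the object is.

```python
def unsupported_type_error(obj):
    """Hypothetical helper: build a TypeError that names only the type of ``obj``."""
    return TypeError("cannot serialize type %r" % type(obj))


big = list(range(100000))

short_message = str(unsupported_type_error(big))
# For comparison: the message that also embeds repr(obj).
long_message = "cannot serialize %r of type %r" % (big, type(big))
```

With this input, `long_message` embeds every element of the list, while `short_message` stays a single short line.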





[GitHub] spark pull request: [SPARK-6390] [SQL] [MLlib] Port MatrixUDT to P...

2015-05-22 Thread davies
Github user davies commented on a diff in the pull request:

https://github.com/apache/spark/pull/6354#discussion_r30934294
  
--- Diff: python/pyspark/mllib/linalg.py ---
@@ -163,6 +163,59 @@ def simpleString(self):
         return "vector"
 
 
+class MatrixUDT(UserDefinedType):
+    """
+    SQL user-defined type (UDT) for Matrix.
+    """
+
+    @classmethod
+    def sqlType(cls):
+        return StructType([
+            StructField("type", ByteType(), False),
+            StructField("numRows", IntegerType(), False),
+            StructField("numCols", IntegerType(), False),
+            StructField("colPtrs", ArrayType(IntegerType(), False), True),
+            StructField("rowIndices", ArrayType(IntegerType(), False), True),
+            StructField("values", ArrayType(DoubleType(), False), True),
+            StructField("isTransposed", BooleanType(), False)])
+
+    @classmethod
+    def module(cls):
+        return "pyspark.mllib.linalg"
+
+    @classmethod
+    def scalaUDT(cls):
+        return "org.apache.spark.mllib.linalg.MatrixUDT"
+
+    def serialize(self, obj):
+        if isinstance(obj, SparseMatrix):
+            colPtrs = [int(i) for i in obj.colPtrs]
+            rowIndices = [int(i) for i in obj.rowIndices]
+            values = [float(v) for v in obj.values]
--- End diff --

Do we still need `values` for SparseMatrix? That could be HUGE.
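
For context, a small pure-Python sketch, assuming the compressed sparse column (CSC) layout implied by the `colPtrs` and `rowIndices` fields. Under that assumption `values` holds only the stored (non-zero) entries, so its length tracks the number of non-zeros rather than `numRows * numCols`, though it can still be large for matrices with many non-zeros. The helper `to_csc` is hypothetical and not part of pyspark:

```python
def to_csc(dense_rows):
    """Convert a list of rows into CSC arrays (col_ptrs, row_indices, values)."""
    num_rows = len(dense_rows)
    num_cols = len(dense_rows[0]) if num_rows else 0
    col_ptrs, row_indices, values = [0], [], []
    for j in range(num_cols):
        for i in range(num_rows):
            v = dense_rows[i][j]
            if v != 0:
                # Store only non-zero entries, column by column.
                row_indices.append(i)
                values.append(float(v))
        # Each pointer marks where the next column starts in `values`.
        col_ptrs.append(len(values))
    return col_ptrs, row_indices, values


# A 3x3 matrix with three non-zero entries.
col_ptrs, row_indices, values = to_csc([[1, 0, 0],
                                        [0, 0, 2],
                                        [0, 3, 0]])
```

Here the nine-cell matrix is represented by three stored values, one per non-zero entry.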





[GitHub] spark pull request: [SPARK-6390] [SQL] [MLlib] Port MatrixUDT to P...

2015-05-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/6354#issuecomment-104655938
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/33341/
Test PASSed.





[GitHub] spark pull request: [SPARK-6390] [SQL] [MLlib] Port MatrixUDT to P...

2015-05-22 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/6354#issuecomment-104655924
  
  [Test build #33341 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/33341/consoleFull)
 for   PR 6354 at commit 
[`62a2a7d`](https://github.com/apache/spark/commit/62a2a7d06aaba477b999e854c27b624123e7a006).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `class MatrixUDT(UserDefinedType):`






[GitHub] spark pull request: [SPARK-6390] [SQL] [MLlib] Port MatrixUDT to P...

2015-05-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/6354#issuecomment-104655937
  
Merged build finished. Test PASSed.





[GitHub] spark pull request: [SPARK-6390] [SQL] [MLlib] Port MatrixUDT to P...

2015-05-22 Thread MechCoder
Github user MechCoder commented on the pull request:

https://github.com/apache/spark/pull/6354#issuecomment-104635990
  
Serialization and deserialization work, but trying to create a
DataFrame from a Matrix gives me this error:

RuntimeError: maximum recursion depth exceeded in __instancecheck__

Code to reproduce:

from pyspark.mllib.linalg import DenseMatrix, SparseMatrix, MatrixUDT
dm1 = DenseMatrix(3, 2, [0, 1, 4, 5, 9, 10])
sm1 = SparseMatrix(1, 1, [0, 1], [0], [2.0])
rdd = sc.parallelize([("dense", dm1)])
rdd.toDF()

This fails with the above-mentioned error.

cc @davies @rxin Any thoughts would be appreciated.





[GitHub] spark pull request: [SPARK-6390] [SQL] [MLlib] Port MatrixUDT to P...

2015-05-22 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/6354#issuecomment-104624457
  
  [Test build #33341 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/33341/consoleFull)
 for   PR 6354 at commit 
[`62a2a7d`](https://github.com/apache/spark/commit/62a2a7d06aaba477b999e854c27b624123e7a006).





[GitHub] spark pull request: [SPARK-6390] [SQL] [MLlib] Port MatrixUDT to P...

2015-05-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/6354#issuecomment-104623765
  
 Merged build triggered.





[GitHub] spark pull request: [SPARK-6390] [SQL] [MLlib] Port MatrixUDT to P...

2015-05-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/6354#issuecomment-104623787
  
Merged build started.

