Repository: spark
Updated Branches:
  refs/heads/master 39e333ec4 -> 8ead999fd


[SPARK-5223] [MLlib] [PySpark] fix MapConverter and ListConverter in MLlib

It will introduce problems if the object in dict/list/tuple can not support by 
py4j, such as Vector.
Also, pickle may have better performance for larger object (less RPC).

In some cases that the object in dict/list can not be pickled (such as 
JavaObject), we should still use MapConvert/ListConvert.

This PR should be ported into branch-1.2

Author: Davies Liu <dav...@databricks.com>

Closes #4023 from davies/listconvert and squashes the following commits:

55d4ab2 [Davies Liu] fix MapConverter and ListConverter in MLlib


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/8ead999f
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/8ead999f
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/8ead999f

Branch: refs/heads/master
Commit: 8ead999fd627b12837fb2f082a0e76e9d121d269
Parents: 39e333e
Author: Davies Liu <dav...@databricks.com>
Authored: Tue Jan 13 12:50:31 2015 -0800
Committer: Xiangrui Meng <m...@databricks.com>
Committed: Tue Jan 13 12:50:31 2015 -0800

----------------------------------------------------------------------
 python/pyspark/mllib/common.py | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/spark/blob/8ead999f/python/pyspark/mllib/common.py
----------------------------------------------------------------------
diff --git a/python/pyspark/mllib/common.py b/python/pyspark/mllib/common.py
index 33c49e2..3c5ee66 100644
--- a/python/pyspark/mllib/common.py
+++ b/python/pyspark/mllib/common.py
@@ -18,7 +18,7 @@
 import py4j.protocol
 from py4j.protocol import Py4JJavaError
 from py4j.java_gateway import JavaObject
-from py4j.java_collections import MapConverter, ListConverter, JavaArray, 
JavaList
+from py4j.java_collections import ListConverter, JavaArray, JavaList
 
 from pyspark import RDD, SparkContext
 from pyspark.serializers import PickleSerializer, AutoBatchedSerializer
@@ -70,9 +70,7 @@ def _py2java(sc, obj):
         obj = _to_java_object_rdd(obj)
     elif isinstance(obj, SparkContext):
         obj = obj._jsc
-    elif isinstance(obj, dict):
-        obj = MapConverter().convert(obj, sc._gateway._gateway_client)
-    elif isinstance(obj, (list, tuple)):
+    elif isinstance(obj, list) and (obj or isinstance(obj[0], JavaObject)):
         obj = ListConverter().convert(obj, sc._gateway._gateway_client)
     elif isinstance(obj, JavaObject):
         pass


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org

Reply via email to