[jira] [Created] (SPARK-5224) parallelize list/ndarray is really slow

Davies Liu (JIRA) Tue, 13 Jan 2015 10:54:11 -0800

Davies Liu created SPARK-5224:
---------------------------------

             Summary: parallelize list/ndarray is really slow
                 Key: SPARK-5224
                 URL: https://issues.apache.org/jira/browse/SPARK-5224
             Project: Spark
          Issue Type: Bug
          Components: PySpark
    Affects Versions: 1.2.0
            Reporter: Davies Liu
            Priority: Blocker



The default batchSize was changed to 0 (batches are sized based on the size 
of the objects), but parallelize() still uses BatchedSerializer with batchSize=1.

Also, BatchedSerializer does not work well with list and numpy.ndarray.
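
As a rough illustration of why a batch size of 1 is costly, here is a minimal sketch that uses only the standard pickle module. It is not PySpark's actual code: batched() and dump_batched() are hypothetical helpers, and the batch size of 1024 is an arbitrary value chosen for the comparison.

```python
import pickle


def batched(items, batch_size):
    """Yield successive slices of at most batch_size items (hypothetical helper)."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


def dump_batched(items, batch_size):
    """Pickle each batch separately and return the list of serialized frames."""
    return [pickle.dumps(batch, protocol=2) for batch in batched(items, batch_size)]


data = list(range(10000))

# One pickle call and one frame per element, as with a batch size of 1.
one_by_one = dump_batched(data, 1)

# Far fewer pickle calls when elements are grouped (1024 is an assumed value).
in_batches = dump_batched(data, 1024)

print("frames:", len(one_by_one), "vs", len(in_batches))
print("bytes: ", sum(len(f) for f in one_by_one), "vs", sum(len(f) for f in in_batches))
```

With a batch size of 1, every element pays the fixed per-frame pickle overhead and a separate serialization call, which is the kind of cost this issue is about.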



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
