[ 
https://issues.apache.org/jira/browse/SPARK-3600?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiangrui Meng updated SPARK-3600:
---------------------------------
    Description: RDD's classTag is not passed through CacheManager, so 
RDD[Double] is cached in object arrays rather than primitive arrays. Every 
element is boxed, which leads to huge storage overhead. Fixing this requires 
passing the classTag down through many levels.  (was: 
RandomDataGenerator doesn't have a classTag or @specialized, so the generated 
RDDs are RDDs of objects, which causes huge storage overhead.)

> RDD[Double] doesn't use primitive arrays for caching
> ----------------------------------------------------
>
>                 Key: SPARK-3600
>                 URL: https://issues.apache.org/jira/browse/SPARK-3600
>             Project: Spark
>          Issue Type: Bug
>          Components: MLlib
>    Affects Versions: 1.1.0
>            Reporter: Xiangrui Meng
>            Assignee: Xiangrui Meng
>
> RDD's classTag is not passed through CacheManager, so RDD[Double] is cached 
> in object arrays rather than primitive arrays. Every element is boxed, which 
> leads to huge storage overhead. Fixing this requires passing the classTag 
> down through many levels.
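
The effect described in the issue can be reproduced outside Spark. The sketch 
below is not CacheManager code: `ClassTagArrays` and `buffer` are hypothetical 
names used only for illustration. It shows that the same values end up in a 
primitive array when a ClassTag[Double] is available, and in an object array 
of boxed values when only a ClassTag[Any] reaches the allocation site.

```scala
import scala.reflect.ClassTag

object ClassTagArrays {
  // Hypothetical helper (not part of Spark): allocates the backing array
  // from whatever ClassTag the caller supplies.
  def buffer[T](values: Seq[T])(implicit ct: ClassTag[T]): Array[T] = {
    val arr = ct.newArray(values.length)
    var i = 0
    values.foreach { v =>
      arr(i) = v
      i += 1
    }
    arr
  }

  def main(args: Array[String]): Unit = {
    val data = Seq(1.0, 2.0, 3.0)

    // ClassTag[Double] is inferred, so the backing array is primitive.
    val primitive = buffer(data)

    // The element type is widened to Any, as happens when the tag is lost
    // along the way; the backing array holds boxed java.lang.Double values.
    val boxed = buffer[Any](data)

    println(primitive.getClass.getComponentType)
    println(boxed.getClass.getComponentType)
  }
}
```

This is why the tag has to travel with the data: once any layer between the 
RDD and the code that allocates the buffer drops it, only the erased element 
type is left to allocate from.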



