[ https://issues.apache.org/jira/browse/SPARK-4590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14484595#comment-14484595 ]

Yi Liu commented on SPARK-4590:
-------------------------------

Agree that a parameter server is a good approach to solving the memory/network 
I/O issues when features are high-dimensional. We are also interested in this 
and have done some work on it.
IndexedRDD is a key-value store built on RDDs, which does not seem to be the 
right abstraction to support a PS. An RDD just represents a logical dataset 
that can always be reconstructed from its lineage; while the user can provide 
some hints, there is no mechanism to control exactly how the data is stored, 
distributed, replicated, etc., which a NoSQL-style system like a PS needs. 
[~rezazadeh], could you describe your plan in more detail?
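To make that concrete, the access pattern a PS needs looks roughly like the 
following client interface. This is purely hypothetical and for illustration 
only; none of these names exist in Spark or IndexedRDD.
{code:scala}
// Hypothetical PS client interface (illustrative only): fine-grained reads and
// writes against explicitly partitioned, mutable parameter state.
trait PSClient {
  // Fetch only the parameter values for the given (sparse) feature indices.
  def get(indices: Array[Long]): Array[Double]

  // Apply sparse additive updates on the server nodes that own those indices.
  def update(indices: Array[Long], deltas: Array[Double]): Unit
}
{code}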

In general, there are several ways to add a parameter server to Spark (a usage 
sketch of the executor-side access pattern follows this list):
# Define parameter server interfaces in Spark and let users plug in custom 
implementations, without shipping an implementation in Spark itself.
# Besides the interfaces, also ship a default implementation in Spark. Similar 
to the Spark shuffle service, each parameter server node would run as a service 
in a Spark Worker (standalone mode) or as an auxiliary service in the YARN 
NodeManager (Spark on YARN). The parameter server would be decentralized (no 
master node) and implemented in Java. Since a parameter server is 
memory-intensive and in this approach it shares a process with the Spark Worker 
or YARN NM, it is better to keep the parameter store off-heap.
# Similar to #2, ship a default implementation in Spark, but extend the Spark 
BlockManager to distribute the features across executors and support 
efficiently getting/updating a subset or range of features. The Java heap can 
be used here.
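As a rough illustration of how an algorithm would use such a PS from the 
executors, independent of which option is chosen: executors pull only the 
weights active in each example and push sparse deltas back, so the full model 
never passes through the driver. This builds on the hypothetical PSClient 
sketch above; how each executor obtains its client (e.g. via a serializable 
factory) is glossed over.
{code:scala}
import org.apache.spark.rdd.RDD

// Illustrative sparse training example type (not MLlib's LabeledPoint).
case class SparseExample(label: Double, indices: Array[Long], values: Array[Double])

// One pass of sparse SGD for least squares against a hypothetical PS.
def trainOneEpoch(data: RDD[SparseExample], ps: PSClient, stepSize: Double): Unit = {
  data.foreachPartition { iter =>
    iter.foreach { ex =>
      // Pull only the weights for features that appear in this example.
      val w = ps.get(ex.indices)
      // Sparse least-squares gradient: (w . x - y) * x on the active indices.
      val err = w.zip(ex.values).map { case (wi, xi) => wi * xi }.sum - ex.label
      val deltas = ex.values.map(xi => -stepSize * err * xi)
      // Push the sparse delta back to the owning parameter server nodes.
      ps.update(ex.indices, deltas)
    }
  }
}
{code}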

> Early investigation of parameter server
> ---------------------------------------
>
>                 Key: SPARK-4590
>                 URL: https://issues.apache.org/jira/browse/SPARK-4590
>             Project: Spark
>          Issue Type: Brainstorming
>          Components: ML, MLlib
>            Reporter: Xiangrui Meng
>            Assignee: Reza Zadeh
>
> In the current implementation of GLM solvers, we save intermediate models on 
> the driver node and update them through broadcast and aggregation. Even with 
> torrent broadcast and tree aggregation added in 1.1, it is hard to go beyond 
> ~10 million features. This JIRA is for investigating the parameter server 
> approach, including algorithm, infrastructure, and dependencies.
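
For reference, the broadcast-and-aggregate pattern described above looks 
roughly like the following simplified sketch of one gradient-descent step for 
least squares (not the actual MLlib code): both the broadcast weights and the 
aggregated gradient are full dense vectors that pass through the driver, which 
is what limits the feature count.
{code:scala}
import org.apache.spark.SparkContext
import org.apache.spark.rdd.RDD

// One gradient-descent step: broadcast current weights, tree-aggregate the
// gradient back to the driver, then update the model on the driver.
def gdStep(sc: SparkContext,
           data: RDD[(Double, Array[Double])],   // (label, dense features)
           weights: Array[Double],
           stepSize: Double): Array[Double] = {
  val bcWeights = sc.broadcast(weights)           // O(numFeatures) sent to each executor
  val gradient = data.treeAggregate(new Array[Double](weights.length))(
    (grad, point) => {                            // seqOp: accumulate per-record gradient
      val (label, x) = point
      val err = bcWeights.value.zip(x).map { case (w, xi) => w * xi }.sum - label
      var i = 0
      while (i < grad.length) { grad(i) += err * x(i); i += 1 }
      grad
    },
    (g1, g2) => {                                 // combOp: merge partial gradients in a tree
      var i = 0
      while (i < g1.length) { g1(i) += g2(i); i += 1 }
      g1
    })
  // Driver-side update: the full model and gradient live on the driver.
  weights.zip(gradient).map { case (w, g) => w - stepSize * g }
}
{code}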


