[jira] [Commented] (SPARK-6932) A Prototype of Parameter Server

2015-04-15 Thread sjk (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-6932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14497271#comment-14497271
 ] 

sjk commented on SPARK-6932:


SPARK-4590, just as its name "Early investigation of parameter server" suggests, 
focuses on investigating the feasibility and trade-offs of a parameter server. It 
is a brainstorming issue with low activity. Thanks to its contribution, we know 
it is feasible to implement a PS on Spark, and now it should move to the next 
stage. This issue focuses on the design and implementation details of integrating 
a parameter server into Spark, with better usability and performance. In fact, a 
prototype is already implemented and working, but some improvement is necessary 
before we commit it. And because there are changes to the core module, it is not 
appropriate to put it into spark-packages. [~srowen]

> A Prototype of Parameter Server
> ---
>
> Key: SPARK-6932
> URL: https://issues.apache.org/jira/browse/SPARK-6932
> Project: Spark
>  Issue Type: New Feature
>  Components: ML, MLlib
>Reporter: Qiping Li
>
> h2. Introduction
> As specified in 
> [SPARK-4590|https://issues.apache.org/jira/browse/SPARK-4590], it would be 
> very helpful to integrate a parameter server into Spark for machine learning 
> algorithms, especially for those with ultra-high-dimensional features. 
> After carefully studying the design doc of [Parameter 
> Servers|https://docs.google.com/document/d/1SX3nkmF41wFXAAIr9BgqvrHSS5mW362fJ7roBXJm06o/edit?usp=sharing] 
> and the [Factorbird|http://stanford.edu/~rezab/papers/factorbird.pdf] paper, 
> we propose a prototype of Parameter Server on Spark (PS-on-Spark), with 
> several key design concerns:
> * *User-friendly interface*
>   We carefully investigated most existing Parameter Server 
> systems (including [petuum|http://petuum.github.io], [parameter 
> server|http://parameterserver.org], and 
> [paracel|https://github.com/douban/paracel]) and designed a user-friendly 
> interface by absorbing the essence of all these systems. 
> * *Prototype of distributed array*
>   IndexedRDD (see 
> [SPARK-4590|https://issues.apache.org/jira/browse/SPARK-4590]) doesn't seem 
> to be a good option for the distributed array, because in most cases the 
> number of key updates per second is not very high. 
> So we implemented a distributed HashMap to store the parameters, which can 
> be easily extended for better performance.
> 
> * *Minimal code change*
>   Quite a lot of effort is put into avoiding code changes to Spark core. Tasks 
> that need the parameter server are still created and scheduled by Spark's 
> scheduler. Tasks communicate with the parameter server through a client object, 
> over *akka* or *netty*.
> With all these concerns we propose the following architecture:
> h2. Architecture
> !https://cloud.githubusercontent.com/assets/1285855/7158179/f2d25cc4-e3a9-11e4-835e-89681596c478.jpg!
> Data is stored in an RDD and is partitioned across workers. During each 
> iteration, each worker gets parameters from the parameter server, then computes 
> new parameters based on the old parameters and the data in its partition. Finally, 
> each worker pushes the updated parameters to the parameter server. A worker 
> communicates with the parameter server through a parameter server client, which 
> is initialized in the `TaskContext` of that worker.
> The current implementation is based on YARN cluster mode, 
> but it should not be a problem to port it to other modes. 
> h3. Interface
> We referred to existing parameter server systems (petuum, parameter server, 
> paracel) when designing the interface of the parameter server. 
> *`PSClient` provides the following interface for workers to use:*
> {code}
> //  get parameter indexed by key from parameter server
> def get[T](key: String): T
> // get multiple parameters from parameter server
> def multiGet[T](keys: Array[String]): Array[T]
> // add `delta` to the parameter indexed by `key`; 
> // if there are multiple `delta`s to apply to the same parameter,
> // use `reduceFunc` to reduce these `delta`s first.
> def update[T](key: String, delta: T, reduceFunc: (T, T) => T): Unit
> // update multiple parameters at the same time, using the same `reduceFunc`.
> def multiUpdate[T](keys: Array[String], delta: Array[T], 
> reduceFunc: (T, T) => T): Unit
> 
> // advance clock to indicate that current iteration is finished.
> def clock(): Unit
>  
> // block until all workers have reached this line of code.
> def sync(): Unit
> {code}
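> *For illustration, a minimal worker-side usage sketch (not part of the 
> proposal; the `"weights"` key, the `Array[Double]` model type, and the stubbed 
> delta computation are assumptions made for this example):*
> {code}
> // Hedged sketch: one iteration of a task running over one data partition.
> def runIteration(ps: PSClient, partition: Iterator[(Double, Array[Double])]): Unit = {
>   // fetch the current parameters from the parameter server
>   val weights = ps.get[Array[Double]]("weights")
>   // compute a local delta from this partition's data (stubbed out here)
>   val delta = Array.fill(weights.length)(0.0)
>   // push the delta; concurrent deltas on the same key are merged element-wise
>   ps.update[Array[Double]]("weights", delta,
>     (a: Array[Double], b: Array[Double]) => a.zip(b).map { case (x, y) => x + y })
>   // advance the clock for this iteration and wait for all other workers
>   ps.clock()
>   ps.sync()
> }
> {code}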
> *`PSContext` provides the following functions to use on the driver:*
> {code}
> // load parameters into the parameter server from an existing RDD.
> def loadPSModel[T](model: RDD[(String, T)]) 
> // fetch parameters from the parameter server to construct the model.
> def fetchPSModel[T](keys: Array[String]): Array[T]
> {code} 
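> *A hedged driver-side sketch (illustrative only; `sc`, `psContext`, the key 
> name and the model type are assumptions made for this example):*
> {code}
> // load an initial model into the parameter server before training
> val initialModel: RDD[(String, Array[Double])] =
>   sc.parallelize(Seq("weights" -> Array.fill(1000)(0.0)))
> psContext.loadPSModel(initialModel)
> // ... run training iterations as parameter server tasks ...
> // fetch the trained parameters back on the driver
> val trained = psContext.fetchPSModel[Array[Double]](Array("weights"))
> {code}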
> 
> *A new function has been added to `RDD` to run parameter server tasks:*
> {code}
> // run the provided 

[jira] [Created] (SPARK-6494) rdd polymorphic method zipPartitions refactor

2015-03-24 Thread sjk (JIRA)
sjk created SPARK-6494:
--

 Summary: rdd polymorphic method zipPartitions refactor
 Key: SPARK-6494
 URL: https://issues.apache.org/jira/browse/SPARK-6494
 Project: Spark
  Issue Type: Improvement
Reporter: sjk


There is no need for so many overloaded (polymorphic) methods; adding default 
parameter values instead is enough (a sketch of the idea follows below).
Also, change partitions.size to partitions.length, since partitions is an Array.
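
A hedged toy sketch of the refactor idea (illustrative only, not the actual 
RDD.zipPartitions code): a single method with a default `preservesPartitioning` 
value can replace an overload that differs only by that flag.

{code}
object ZipPartitionsSketch {
  // Before: two overloads that differ only in the boolean flag.
  def zipPartitions[A, B, V](as: Iterator[A], bs: Iterator[B])(
      f: (Iterator[A], Iterator[B]) => Iterator[V]): Iterator[V] =
    zipPartitions(as, bs, preservesPartitioning = false)(f)

  def zipPartitions[A, B, V](as: Iterator[A], bs: Iterator[B], preservesPartitioning: Boolean)(
      f: (Iterator[A], Iterator[B]) => Iterator[V]): Iterator[V] = f(as, bs)

  // After: one method with a default value covers both call sites.
  def zipPartitionsWithDefault[A, B, V](
      as: Iterator[A], bs: Iterator[B], preservesPartitioning: Boolean = false)(
      f: (Iterator[A], Iterator[B]) => Iterator[V]): Iterator[V] = f(as, bs)

  def main(args: Array[String]): Unit = {
    val zipped = zipPartitionsWithDefault(Iterator(1, 2), Iterator("a", "b")) {
      (xs, ys) => xs.zip(ys)
    }
    println(zipped.toList)  // List((1,a), (2,b))
  }
}
{code}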



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-5062) Pregel use aggregateMessage instead of mapReduceTriplets function

2015-01-29 Thread sjk (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-5062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14296601#comment-14296601
 ] 

sjk commented on SPARK-5062:


Does anyone care about this?

> Pregel use aggregateMessage instead of mapReduceTriplets function
> -
>
> Key: SPARK-5062
> URL: https://issues.apache.org/jira/browse/SPARK-5062
> Project: Spark
>  Issue Type: Wish
>  Components: GraphX
>Reporter: sjk
> Attachments: graphx_aggreate_msg.jpg
>
>
> Since Spark 1.2 introduced aggregateMessages to replace mapReduceTriplets, and 
> it indeed improves performance,
> it's time to replace mapReduceTriplets with aggregateMessages in Pregel.
> We can discuss it here.
> I have drawn a diagram of aggregateMessages to show why it can improve 
> performance.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-5062) Pregel use aggregateMessage instead of mapReduceTriplets function

2015-01-02 Thread sjk (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-5062?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

sjk updated SPARK-5062:
---
Attachment: graphx_aggreate_msg.jpg

> Pregel use aggregateMessage instead of mapReduceTriplets function
> -
>
> Key: SPARK-5062
> URL: https://issues.apache.org/jira/browse/SPARK-5062
> Project: Spark
>  Issue Type: Wish
>  Components: GraphX
>Reporter: sjk
> Attachments: graphx_aggreate_msg.jpg
>
>
> Since Spark 1.2 introduced aggregateMessages to replace mapReduceTriplets, and 
> it indeed improves performance,
> it's time to replace mapReduceTriplets with aggregateMessages in Pregel.
> We can discuss it here.
> I have drawn a diagram of aggregateMessages to show why it can improve 
> performance.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-5062) Pregel use aggregateMessage instead of mapReduceTriplets function

2015-01-02 Thread sjk (JIRA)
sjk created SPARK-5062:
--

 Summary: Pregel use aggregateMessage instead of mapReduceTriplets 
function
 Key: SPARK-5062
 URL: https://issues.apache.org/jira/browse/SPARK-5062
 Project: Spark
  Issue Type: Wish
  Components: GraphX
Reporter: sjk


Since Spark 1.2 introduced aggregateMessages to replace mapReduceTriplets, and it 
indeed improves performance,

it's time to replace mapReduceTriplets with aggregateMessages in Pregel.

We can discuss it here.

I have drawn a diagram of aggregateMessages to show why it can improve 
performance.
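
For illustration, a hedged sketch (not the proposed Pregel change itself) of what 
the migration looks like for a simple in-degree count; the graph type and the Int 
message type are assumptions made for this example:

{code}
import org.apache.spark.graphx._

// count incoming edges per vertex, once with each API
def inDegreesSketch(graph: Graph[Double, Int]): (VertexRDD[Int], VertexRDD[Int]) = {
  // old API (deprecated since 1.2): builds an Iterator of messages per triplet
  val viaMapReduce: VertexRDD[Int] = graph.mapReduceTriplets[Int](
    triplet => Iterator((triplet.dstId, 1)),
    (a, b) => a + b)

  // new API: writes messages through an EdgeContext, avoiding the per-triplet
  // Iterator allocation, and lets the caller declare via TripletFields which
  // triplet fields are actually needed
  val viaAggregate: VertexRDD[Int] = graph.aggregateMessages[Int](
    ctx => ctx.sendToDst(1),
    (a, b) => a + b,
    TripletFields.None)

  (viaMapReduce, viaAggregate)
}
{code}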




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-5036) Better support sending partial messages in Pregel API

2015-01-02 Thread sjk (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-5036?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

sjk updated SPARK-5036:
---
Description: 
Better support sending partial messages in Pregel API

1. The requirement

In many iterative graph algorithms, only a part of the vertexes (we call them 
ActiveVertexes) need to send messages to their neighbours in each iteration. In 
many cases, ActiveVertexes are the vertexes whose attributes changed between the 
previous and the current iteration. To implement this requirement, we can use the 
Pregel API plus a flag (e.g., `isAttrChanged: Boolean`) in each vertex's 
attribute. 

However, after the `aggregateMessages` or `mapReduceTriplets` of each iteration, we 
need to reset this flag to its initial value in every vertex, which requires a heavy 
`joinVertices`. 

We found a more efficient way to meet this requirement and want to discuss it 
here.


Look at a simple example:

In the i-th iteration, the previous attribute of each vertex is `Attr` and the 
newly computed attribute is `NewAttr`:

| VID | Attr | NewAttr | Neighbours |
|:----|:-----|:--------|:-----------|
| 1   | 4    | 5       | 2, 3       |
| 2   | 3    | 2       | 1, 4       |
| 3   | 2    | 2       | 1, 4       |
| 4   | 3    | 4       | 1, 2, 3    |

Our requirement is:

1.  Set each vertex's `Attr` to be `NewAttr` in the i-th iteration.
2.  For each vertex whose `Attr != NewAttr`, send a message to its neighbours 
in the next iteration's `aggregateMessages`.


We found it is hard to implement this requirement efficiently using the current 
Pregel API. The reason is that we not only need to perform `pregel()` to 
compute the `NewAttr` (2) but also need to perform an `outerJoinVertices()` to 
satisfy (1).

A simple idea is to keep an `isAttrChanged: Boolean` (solution 1) or a `flag: Int` 
(solution 2) in each vertex's attribute.

2. Two solutions  
---

2.1 Solution 1: label and reset `isAttrChanged: Boolean` in the vertex attribute

![alt text](s1.jpeg "Title")

1. Initialize messages with `aggregateMessages`;
   it returns a message RDD.
2. `innerJoin`:
   compute the messages on the receiving vertices and return a new VertexRDD 
   holding the values computed by the custom logic function `vprog`, setting 
   `isAttrChanged = true`.
3. `outerJoinVertices`:
   merge the changed vertices back into the whole graph; the graph is now new.
4. `aggregateMessages`: it returns a message RDD.
5. `joinVertices`: reset every vertex attribute's `isAttrChanged` to false.

```
//  here we reset isAttrChanged to false
g = updateG.joinVertices(updateG.vertices) {
  (vid, oriVertex, updateGVertex) => updateGVertex.reset()
}
```
Here we need to reset the vertex attribute object's flag to false:
if we don't reset `isAttrChanged`, the vertex will send messages in the next 
iteration unconditionally.

**result:**  

*   Edges: 890041895 
*   Vertices: 181640208
*   Iterations: 150
*   Total cost: 8.4h
*   cannot run to the point where 0 messages remain 


2.2 Solution 2: color the vertices

![alt text](s2.jpeg "Title")

Iteration process (a code sketch follows below):

1. `innerJoin`: 
   `vprog` is used as a partially applied function, e.g. `vprog(curIter, _: VertexId, 
_: VD, _: A)`, 
   with ` i = i + 1; val curIter = i`. 
   Inside `vprog`, the user can read `curIter` and assign it to `flag`.
2. `outerJoinVertices`:
   `graph = graph.outerJoinVertices(changedVerts) { (vid, old, newOpt) => 
newOpt.getOrElse(old)}.cache()`
3. `aggregateMessages`: 
   `sendMsg` is a partially applied function, e.g. `sendMsg(curIter, _: 
EdgeContext[VD, ED, A])`.
   **In `sendMsg`, compare `curIter` with `flag` to determine whether to 
send a message.**
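
A hedged, self-contained sketch of the idea (illustrative only, not the actual 
patch): each vertex attribute carries `flag`, the iteration in which its value 
last changed, and `sendMsg` compares `flag` with `curIter` to decide whether to 
send. The `Attr` case class, the min-style vertex program and the `Double` 
message type are assumptions made only for this example.

```
import org.apache.spark.graphx._

// flag records the iteration in which this vertex's value last changed
case class Attr(value: Double, flag: Int)

// one iteration: no global flag reset (and no extra joinVertices) is needed
def step(graph: Graph[Attr, Int], curIter: Int): Graph[Attr, Int] = {
  // only vertices that changed in the previous iteration send messages
  val messages: VertexRDD[Double] = graph.aggregateMessages[Double](
    ctx => if (ctx.srcAttr.flag == curIter - 1) ctx.sendToDst(ctx.srcAttr.value),
    (a, b) => math.min(a, b))
  // apply the messages; a vertex that actually changes records curIter in flag
  val changedVerts: VertexRDD[Attr] = graph.vertices.innerJoin(messages) {
    (vid, attr, msg) =>
      val newValue = math.min(attr.value, msg)   // placeholder vertex program
      if (newValue != attr.value) Attr(newValue, curIter) else attr
  }
  // merge the changed vertices back into the whole graph
  graph.outerJoinVertices(changedVerts) { (vid, old, updated) => updated.getOrElse(old) }
}
```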

**result:**

Raw data:

*   vertices: 181640208
*   edges: 890041895


|            | average cost per iteration | 150 iterations | 420 iterations |
|:-----------|:---------------------------|:---------------|:---------------|
| solution 1 | 188m                       | 7.8h           | cannot finish  |
| solution 2 | 24m                        | 1.2h           | 3.1h           |
| compare    | ~7x faster                 | ~6.5x faster   | finishes in 3.1h |


##  The end

I think the second solution (Pregel + a flag) is better.
It can really support iterative graph algorithms in which only part of the 
vertexes send messages to their neighbours in each iteration.

We will use it in a production environment.

PR: https://github.com/apache/spark/pull/3866

EOF


  was:
Better support sending partial messages in Pregel API

1. the reqirement

In many iterative graph algorithms, only a part of the vertexes (we call them 
ActiveVertexes) need to send messages to their neighbours in each iteration. In 
many cases, ActiveVertexes are the vertexes that their attributes do not change 
between the previous and current iteration. To implement this requirement, we 
can use Pregel API + a flag (e.g., `bool isAttrChanged`) in each vertex's 
attribute. 

However, after `aggregateMessage` or `mapReduceTriplets` of each iteration, we 
need to reset this flag to the init value in every vertex, which needs a heavy 
`joinVertices`. 

We find a more efficient way to meet this requirement and want to dis

[jira] [Updated] (SPARK-5036) Better support sending partial messages in Pregel API

2014-12-31 Thread sjk (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-5036?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

sjk updated SPARK-5036:
---
Attachment: s2.jpeg

> Better support sending partial messages in Pregel API
> -
>
> Key: SPARK-5036
> URL: https://issues.apache.org/jira/browse/SPARK-5036
> Project: Spark
>  Issue Type: Improvement
>  Components: GraphX
>Reporter: sjk
> Attachments: s1.jpeg, s2.jpeg
>
>
> Better support sending partial messages in Pregel API
> 1. the reqirement
> In many iterative graph algorithms, only a part of the vertexes (we call them 
> ActiveVertexes) need to send messages to their neighbours in each iteration. 
> In many cases, ActiveVertexes are the vertexes that their attributes do not 
> change between the previous and current iteration. To implement this 
> requirement, we can use Pregel API + a flag (e.g., `bool isAttrChanged`) in 
> each vertex's attribute. 
> However, after `aggregateMessage` or `mapReduceTriplets` of each iteration, 
> we need to reset this flag to the init value in every vertex, which needs a 
> heavy `joinVertices`. 
> We find a more efficient way to meet this requirement and want to discuss it 
> here.
> Look at a simple example as follows:
> In i-th iteartion, the previous attribute of each vertex is `Attr` and the 
> newly computed attribute is `NewAttr`:
> |VID| Attr| NewAttr| Neighbours|
> |:--|:----|:-------|:----------|
> | 1 | 4| 5| 2, 3 |
> | 2 | 3| 2| 1, 4 |
> | 3 | 2| 2| 1, 4 |
> | 4|  3| 4| 1, 2, 3 |
> Our requirement is that: 
> 1.Set each vertex's `Attr` to be `NewAttr` in i-th iteration
> 2.For each vertex whose `Attr!=NewAttr`, send message to its neighbours 
> in the next iteration's `aggregateMessage`.
> We found it is hard to implement this requirment using current Pregel API 
> efficiently. The reason is that we not only need to perform `pregel()` to  
> compute the `NewAttr`  (2) but also need to perform `outJoin()` to satisfy 
> (1).
> A simple idea is to keep a `isAttrChanged:Boolean` (solution 1)  or 
> `flag:Int` (solution 2) in each vertex's attribute.
>  2. two solution  
> ---
> 2.1 solution 1: label and reset `isAttrChanged:Boolean` of Vertex Attr
> ![alt text](s1.jpeg "Title")
> 1. init message by `aggregateMessage`
>   it return a messageRDD
> 2. `innerJoin`
>   compute the messages on the received vertex, return a new VertexRDD 
> which have the computed value by customed logic function `vprog`, set 
> `isAttrChanged = true`
> 3. `outerJoinVertices`
>   update the changed vertex to the whole graph. now the graph is new.
> 4. `aggregateMessage`. it return a messageRDD
> 5. `joinVertices`  reset every `isAttrChanged` of Vertex attr to false
>   ```
>   //  here reset the isAttrChanged to false
>   g = updateG.joinVertices(updateG.vertices) {
>   (vid, oriVertex, updateGVertex) => updateGVertex.reset()
>   }
>```
>here need to reset the vertex attribute object's variable as false
> if don't reset the `isAttrChanged`, it will send message next iteration 
> directly.
> **result:**  
> * Edge: 890041895 
> * Vertex: 181640208
> * Iterate: 150 times
> * Cost total: 8.4h
> * can't run until the 0 message 
> solution 2. color vertex
> ![alt text](s2.jpeg "Title")
> iterate process:
> 1. innerJoin 
>   `vprog` using as a partial function, looks like `vprog(curIter, _: 
> VertexId, _: VD, _: A)`
>   ` i = i + 1; val curIter = i`. 
>   in `vprog`, user can fetch `curIter` and assign it to `flag`.
> 2. outerJoinVertices
>   `graph = graph.outerJoinVertices(changedVerts) { (vid, old, newOpt) => 
> newOpt.getOrElse(old)}.cache()`
> 3. aggregateMessages 
>   sendMsg is partial function, looks like `sendMsg(curIter, _: 
> EdgeContext[VD, ED, A]`
>   **in `sendMsg`, compare `curIter` with `flag`, determine whether 
> sending message**
>   result
> raw data   from
> * vertex: 181640208
> * edge: 890041895
> |            | average cost per iteration | 150 iterations | 420 iterations |
> |:-----------|:---------------------------|:---------------|:---------------|
> | solution 1 | 188m                       | 7.8h           | cannot finish  |
> | solution 2 | 24m                        | 1.2h           | 3.1h           |
> | compare    | ~7x faster                 | ~6.5x faster   | finishes in 3.1h |
> 
> ##the end
> 
> i think the second solution(Pregel + a flag) is better.
> this can really support the iterative graph algorithms which only part of the 
> vertexes send messages to their neighbours in each iteration.
> we shall use it in product environment.
> EOF



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-5036) Better support sending partial messages in Pregel API

2014-12-31 Thread sjk (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-5036?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

sjk updated SPARK-5036:
---
Attachment: s1.jpeg

> Better support sending partial messages in Pregel API
> -
>
> Key: SPARK-5036
> URL: https://issues.apache.org/jira/browse/SPARK-5036
> Project: Spark
>  Issue Type: Improvement
>  Components: GraphX
>Reporter: sjk
> Attachments: s1.jpeg
>
>
> Better support sending partial messages in Pregel API
> 1. the reqirement
> In many iterative graph algorithms, only a part of the vertexes (we call them 
> ActiveVertexes) need to send messages to their neighbours in each iteration. 
> In many cases, ActiveVertexes are the vertexes that their attributes do not 
> change between the previous and current iteration. To implement this 
> requirement, we can use Pregel API + a flag (e.g., `bool isAttrChanged`) in 
> each vertex's attribute. 
> However, after `aggregateMessage` or `mapReduceTriplets` of each iteration, 
> we need to reset this flag to the init value in every vertex, which needs a 
> heavy `joinVertices`. 
> We find a more efficient way to meet this requirement and want to discuss it 
> here.
> Look at a simple example as follows:
> In i-th iteartion, the previous attribute of each vertex is `Attr` and the 
> newly computed attribute is `NewAttr`:
> |VID| Attr| NewAttr| Neighbours|
> |:--|:----|:-------|:----------|
> | 1 | 4| 5| 2, 3 |
> | 2 | 3| 2| 1, 4 |
> | 3 | 2| 2| 1, 4 |
> | 4|  3| 4| 1, 2, 3 |
> Our requirement is that: 
> 1.Set each vertex's `Attr` to be `NewAttr` in i-th iteration
> 2.For each vertex whose `Attr!=NewAttr`, send message to its neighbours 
> in the next iteration's `aggregateMessage`.
> We found it is hard to implement this requirment using current Pregel API 
> efficiently. The reason is that we not only need to perform `pregel()` to  
> compute the `NewAttr`  (2) but also need to perform `outJoin()` to satisfy 
> (1).
> A simple idea is to keep a `isAttrChanged:Boolean` (solution 1)  or 
> `flag:Int` (solution 2) in each vertex's attribute.
>  2. two solution  
> ---
> 2.1 solution 1: label and reset `isAttrChanged:Boolean` of Vertex Attr
> ![alt text](s1.jpeg "Title")
> 1. init message by `aggregateMessage`
>   it return a messageRDD
> 2. `innerJoin`
>   compute the messages on the received vertex, return a new VertexRDD 
> which have the computed value by customed logic function `vprog`, set 
> `isAttrChanged = true`
> 3. `outerJoinVertices`
>   update the changed vertex to the whole graph. now the graph is new.
> 4. `aggregateMessage`. it return a messageRDD
> 5. `joinVertices`  reset every `isAttrChanged` of Vertex attr to false
>   ```
>   //  here reset the isAttrChanged to false
>   g = updateG.joinVertices(updateG.vertices) {
>   (vid, oriVertex, updateGVertex) => updateGVertex.reset()
>   }
>```
>here need to reset the vertex attribute object's variable as false
> if don't reset the `isAttrChanged`, it will send message next iteration 
> directly.
> **result:**  
> * Edge: 890041895 
> * Vertex: 181640208
> * Iterate: 150 times
> * Cost total: 8.4h
> * can't run until the 0 message 
> solution 2. color vertex
> ![alt text](s2.jpeg "Title")
> iterate process:
> 1. innerJoin 
>   `vprog` using as a partial function, looks like `vprog(curIter, _: 
> VertexId, _: VD, _: A)`
>   ` i = i + 1; val curIter = i`. 
>   in `vprog`, user can fetch `curIter` and assign it to `flag`.
> 2. outerJoinVertices
>   `graph = graph.outerJoinVertices(changedVerts) { (vid, old, newOpt) => 
> newOpt.getOrElse(old)}.cache()`
> 3. aggregateMessages 
>   sendMsg is partial function, looks like `sendMsg(curIter, _: 
> EdgeContext[VD, ED, A]`
>   **in `sendMsg`, compare `curIter` with `flag`, determine whether 
> sending message**
>   result
> raw data   from
> * vertex: 181640208
> * edge: 890041895
> |            | average cost per iteration | 150 iterations | 420 iterations |
> |:-----------|:---------------------------|:---------------|:---------------|
> | solution 1 | 188m                       | 7.8h           | cannot finish  |
> | solution 2 | 24m                        | 1.2h           | 3.1h           |
> | compare    | ~7x faster                 | ~6.5x faster   | finishes in 3.1h |
> 
> ##the end
> 
> i think the second solution(Pregel + a flag) is better.
> this can really support the iterative graph algorithms which only part of the 
> vertexes send messages to their neighbours in each iteration.
> we shall use it in product environment.
> EOF



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-5036) Better support sending partial messages in Pregel API

2014-12-31 Thread sjk (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-5036?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

sjk updated SPARK-5036:
---
Description: 
Better support sending partial messages in Pregel API

1. the reqirement

In many iterative graph algorithms, only a part of the vertexes (we call them 
ActiveVertexes) need to send messages to their neighbours in each iteration. In 
many cases, ActiveVertexes are the vertexes that their attributes do not change 
between the previous and current iteration. To implement this requirement, we 
can use Pregel API + a flag (e.g., `bool isAttrChanged`) in each vertex's 
attribute. 

However, after `aggregateMessage` or `mapReduceTriplets` of each iteration, we 
need to reset this flag to the init value in every vertex, which needs a heavy 
`joinVertices`. 

We find a more efficient way to meet this requirement and want to discuss it 
here.


Look at a simple example as follows:

In i-th iteartion, the previous attribute of each vertex is `Attr` and the 
newly computed attribute is `NewAttr`:

|VID| Attr| NewAttr| Neighbours|
|:--|:----|:-------|:----------|
| 1 | 4| 5| 2, 3 |
| 2 | 3| 2| 1, 4 |
| 3 | 2| 2| 1, 4 |
| 4|  3| 4| 1, 2, 3 |

Our requirement is that: 

1.  Set each vertex's `Attr` to be `NewAttr` in i-th iteration
2.  For each vertex whose `Attr!=NewAttr`, send message to its neighbours 
in the next iteration's `aggregateMessage`.


We found it is hard to implement this requirment using current Pregel API 
efficiently. The reason is that we not only need to perform `pregel()` to  
compute the `NewAttr`  (2) but also need to perform `outJoin()` to satisfy (1).

A simple idea is to keep a `isAttrChanged:Boolean` (solution 1)  or `flag:Int` 
(solution 2) in each vertex's attribute.

 2. two solution  
---

2.1 solution 1: label and reset `isAttrChanged:Boolean` of Vertex Attr

![alt text](s1.jpeg "Title")

1. init message by `aggregateMessage`
it return a messageRDD
2. `innerJoin`
compute the messages on the received vertex, return a new VertexRDD 
which have the computed value by customed logic function `vprog`, set 
`isAttrChanged = true`
3. `outerJoinVertices`
update the changed vertex to the whole graph. now the graph is new.
4. `aggregateMessage`. it return a messageRDD
5. `joinVertices`  reset every `isAttrChanged` of Vertex attr to false

```
//  here reset the isAttrChanged to false
g = updateG.joinVertices(updateG.vertices) {
(vid, oriVertex, updateGVertex) => updateGVertex.reset()
}
   ```
   here need to reset the vertex attribute object's variable as false

if don't reset the `isAttrChanged`, it will send message next iteration 
directly.

**result:**  

*   Edge: 890041895 
*   Vertex: 181640208
*   Iterate: 150 times
*   Cost total: 8.4h
*   can't run until the 0 message 


solution 2. color vertex

![alt text](s2.jpeg "Title")

iterate process:

1. innerJoin 
  `vprog` using as a partial function, looks like `vprog(curIter, _: VertexId, 
_: VD, _: A)`
  ` i = i + 1; val curIter = i`. 
  in `vprog`, user can fetch `curIter` and assign it to `flag`.
2. outerJoinVertices
`graph = graph.outerJoinVertices(changedVerts) { (vid, old, newOpt) => 
newOpt.getOrElse(old)}.cache()`
3. aggregateMessages 
sendMsg is partial function, looks like `sendMsg(curIter, _: 
EdgeContext[VD, ED, A]`
**in `sendMsg`, compare `curIter` with `flag`, determine whether 
sending message**

result

raw data   from

*   vertex: 181640208
*   edge: 890041895


|            | average cost per iteration | 150 iterations | 420 iterations |
|:-----------|:---------------------------|:---------------|:---------------|
| solution 1 | 188m                       | 7.8h           | cannot finish  |
| solution 2 | 24m                        | 1.2h           | 3.1h           |
| compare    | ~7x faster                 | ~6.5x faster   | finishes in 3.1h |


##  the end

i think the second solution(Pregel + a flag) is better.
this can really support the iterative graph algorithms which only part of the 
vertexes send messages to their neighbours in each iteration.

we shall use it in product environment.

EOF


  was:
# Better support sending partial messages in Pregel API

### 1. the reqirement

In many iterative graph algorithms, only a part of the vertexes (we call them 
ActiveVertexes) need to send messages to their neighbours in each iteration. In 
many cases, ActiveVertexes are the vertexes that their attributes do not change 
between the previous and current iteration. To implement this requirement, we 
can use Pregel API + a flag (e.g., `bool isAttrChanged`) in each vertex's 
attribute. 

However, after `aggregateMessage` or `mapReduceTriplets` of each iteration, we 
need to reset this flag to the init value in every vertex, which needs a heavy 
`joinVertices`. 

We find a more efficient way to meet this requirement and want to discuss it 
here.


Look at a simple exa

[jira] [Created] (SPARK-5036) Better support sending partial messages in Pregel API

2014-12-31 Thread sjk (JIRA)
sjk created SPARK-5036:
--

 Summary: Better support sending partial messages in Pregel API
 Key: SPARK-5036
 URL: https://issues.apache.org/jira/browse/SPARK-5036
 Project: Spark
  Issue Type: Improvement
  Components: GraphX
Reporter: sjk


# Better support sending partial messages in Pregel API

### 1. the reqirement

In many iterative graph algorithms, only a part of the vertexes (we call them 
ActiveVertexes) need to send messages to their neighbours in each iteration. In 
many cases, ActiveVertexes are the vertexes that their attributes do not change 
between the previous and current iteration. To implement this requirement, we 
can use Pregel API + a flag (e.g., `bool isAttrChanged`) in each vertex's 
attribute. 

However, after `aggregateMessage` or `mapReduceTriplets` of each iteration, we 
need to reset this flag to the init value in every vertex, which needs a heavy 
`joinVertices`. 

We find a more efficient way to meet this requirement and want to discuss it 
here.


Look at a simple example as follows:

In i-th iteartion, the previous attribute of each vertex is `Attr` and the 
newly computed attribute is `NewAttr`:

|VID| Attr| NewAttr| Neighbours|
|:--|:----|:-------|:----------|
| 1 | 4| 5| 2, 3 |
| 2 | 3| 2| 1, 4 |
| 3 | 2| 2| 1, 4 |
| 4|  3| 4| 1, 2, 3 |

Our requirement is that: 

1.  Set each vertex's `Attr` to be `NewAttr` in i-th iteration
2.  For each vertex whose `Attr!=NewAttr`, send message to its neighbours 
in the next iteration's `aggregateMessage`.


We found it is hard to implement this requirment using current Pregel API 
efficiently. The reason is that we not only need to perform `pregel()` to  
compute the `NewAttr`  (2) but also need to perform `outJoin()` to satisfy (1).

A simple idea is to keep a `isAttrChanged:Boolean` (solution 1)  or `flag:Int` 
(solution 2) in each vertex's attribute.

### 2. two solution  
---

2.1 solution 1: label and reset `isAttrChanged:Boolean` of Vertex Attr

![alt text](s1.jpeg "Title")

1. init message by `aggregateMessage`
it return a messageRDD
2. `innerJoin`
compute the messages on the received vertex, return a new VertexRDD 
which have the computed value by customed logic function `vprog`, set 
`isAttrChanged = true`
3. `outerJoinVertices`
update the changed vertex to the whole graph. now the graph is new.
4. `aggregateMessage`. it return a messageRDD
5. `joinVertices`  reset every `isAttrChanged` of Vertex attr to false

```
//  here reset the isAttrChanged to false
g = updateG.joinVertices(updateG.vertices) {
(vid, oriVertex, updateGVertex) => updateGVertex.reset()
}
   ```
   here need to reset the vertex attribute object's variable as false

if don't reset the `isAttrChanged`, it will send message next iteration 
directly.

**result:**  

*   Edge: 890041895 
*   Vertex: 181640208
*   Iterate: 150 times
*   Cost total: 8.4h
*   can't run until the 0 message 


solution 2. color vertex

![alt text](s2.jpeg "Title")

iterate process:

1. innerJoin 
  `vprog` using as a partial function, looks like `vprog(curIter, _: VertexId, 
_: VD, _: A)`
  ` i = i + 1; val curIter = i`. 
  in `vprog`, user can fetch `curIter` and assign it to `flag`.
2. outerJoinVertices
`graph = graph.outerJoinVertices(changedVerts) { (vid, old, newOpt) => 
newOpt.getOrElse(old)}.cache()`
3. aggregateMessages 
sendMsg is partial function, looks like `sendMsg(curIter, _: 
EdgeContext[VD, ED, A]`
**in `sendMsg`, compare `curIter` with `flag`, determine whether 
sending message**

result

raw data   from

*   vertex: 181640208
*   edge: 890041895


|            | average cost per iteration | 150 iterations | 420 iterations |
|:-----------|:---------------------------|:---------------|:---------------|
| solution 1 | 188m                       | 7.8h           | cannot finish  |
| solution 2 | 24m                        | 1.2h           | 3.1h           |
| compare    | ~7x faster                 | ~6.5x faster   | finishes in 3.1h |


##  the end

i think the second solution(Pregel + a flag) is better.
this can really support the iterative graph algorithms which only part of the 
vertexes send messages to their neighbours in each iteration.

we shall use it in product environment.

EOF




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Reopened] (SPARK-3895) Scala style: Indentation of method

2014-10-13 Thread sjk (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-3895?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

sjk reopened SPARK-3895:


> Scala style: Indentation of method
> --
>
> Key: SPARK-3895
> URL: https://issues.apache.org/jira/browse/SPARK-3895
> Project: Spark
>  Issue Type: Sub-task
>  Components: Project Infra
>Reporter: sjk
>
> {code:title=core/src/main/scala/org/apache/spark/Aggregator.scala|borderStyle=solid}
> // for example
>   def combineCombinersByKey(iter: Iterator[_ <: Product2[K, C]], context: 
> TaskContext)
>   : Iterator[(K, C)] =
>   {
> ...
>   def combineValuesByKey(iter: Iterator[_ <: Product2[K, V]],
>  context: TaskContext): Iterator[(K, C)] = {
> {code}
> These do not conform to the 
> rule: https://cwiki.apache.org/confluence/display/SPARK/Spark+Code+Style+Guide
> There is a lot of code like this.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-3895) Scala style: Indentation of method

2014-10-13 Thread sjk (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-3895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14170541#comment-14170541
 ] 

sjk commented on SPARK-3895:


I have closed the PR whose code-format changes were too large.

But this sub-task is about existing code whose braces do not conform to the rule. 

Shall we follow the rule of the `Spark+Code+Style+Guide`?

> Scala style: Indentation of method
> --
>
> Key: SPARK-3895
> URL: https://issues.apache.org/jira/browse/SPARK-3895
> Project: Spark
>  Issue Type: Sub-task
>  Components: Project Infra
>Reporter: sjk
>
> {code:title=core/src/main/scala/org/apache/spark/Aggregator.scala|borderStyle=solid}
> // for example
>   def combineCombinersByKey(iter: Iterator[_ <: Product2[K, C]], context: 
> TaskContext)
>   : Iterator[(K, C)] =
>   {
> ...
>   def combineValuesByKey(iter: Iterator[_ <: Product2[K, V]],
>  context: TaskContext): Iterator[(K, C)] = {
> {code}
> These do not conform to the 
> rule: https://cwiki.apache.org/confluence/display/SPARK/Spark+Code+Style+Guide
> There is a lot of code like this.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-3895) Scala style: Indentation of method

2014-10-13 Thread sjk (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-3895?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

sjk updated SPARK-3895:
---
Description: 

{code:title=core/src/main/scala/org/apache/spark/Aggregator.scala|borderStyle=solid}
// for example
  def combineCombinersByKey(iter: Iterator[_ <: Product2[K, C]], context: 
TaskContext)
  : Iterator[(K, C)] =
  {

...

  def combineValuesByKey(iter: Iterator[_ <: Product2[K, V]],
 context: TaskContext): Iterator[(K, C)] = {

{code}

These do not conform to the 
rule: https://cwiki.apache.org/confluence/display/SPARK/Spark+Code+Style+Guide

There is a lot of code like this.


  was:


such as https://github.com/apache/spark/pull/2734

{code:title=core/src/main/scala/org/apache/spark/Aggregator.scala|borderStyle=solid}
// for example
  def combineCombinersByKey(iter: Iterator[_ <: Product2[K, C]], context: 
TaskContext)
  : Iterator[(K, C)] =
  {

...

  def combineValuesByKey(iter: Iterator[_ <: Product2[K, V]],
 context: TaskContext): Iterator[(K, C)] = {

{code}

These do not conform to the 
rule: https://cwiki.apache.org/confluence/display/SPARK/Spark+Code+Style+Guide

There is a lot of code like this.



> Scala style: Indentation of method
> --
>
> Key: SPARK-3895
> URL: https://issues.apache.org/jira/browse/SPARK-3895
> Project: Spark
>  Issue Type: Sub-task
>  Components: Project Infra
>Reporter: sjk
>
> {code:title=core/src/main/scala/org/apache/spark/Aggregator.scala|borderStyle=solid}
> // for example
>   def combineCombinersByKey(iter: Iterator[_ <: Product2[K, C]], context: 
> TaskContext)
>   : Iterator[(K, C)] =
>   {
> ...
>   def combineValuesByKey(iter: Iterator[_ <: Product2[K, V]],
>  context: TaskContext): Iterator[(K, C)] = {
> {code}
> These do not conform to the 
> rule: https://cwiki.apache.org/confluence/display/SPARK/Spark+Code+Style+Guide
> There is a lot of code like this.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Closed] (SPARK-3897) Scala style: format example code

2014-10-13 Thread sjk (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-3897?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

sjk closed SPARK-3897.
--

> Scala style: format example code
> 
>
> Key: SPARK-3897
> URL: https://issues.apache.org/jira/browse/SPARK-3897
> Project: Spark
>  Issue Type: Sub-task
>  Components: Project Infra
>Reporter: sjk
>
> https://github.com/apache/spark/pull/2754



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Reopened] (SPARK-3894) Scala style: line length increase to 120 for standard

2014-10-13 Thread sjk (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-3894?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

sjk reopened SPARK-3894:


Use 120 instead.

Scala function parameters affect code readability.

> Scala style: line length increase to 120 for standard
> -
>
> Key: SPARK-3894
> URL: https://issues.apache.org/jira/browse/SPARK-3894
> Project: Spark
>  Issue Type: Sub-task
>  Components: Project Infra
>Reporter: sjk
>
> 100 is too short
> our screens are bigger



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-3894) Scala style: line length increase to 120 for standard

2014-10-13 Thread sjk (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-3894?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

sjk updated SPARK-3894:
---
Summary: Scala style: line length increase to 120 for standard  (was: Scala 
style: line length increase to 160)

> Scala style: line length increase to 120 for standard
> -
>
> Key: SPARK-3894
> URL: https://issues.apache.org/jira/browse/SPARK-3894
> Project: Spark
>  Issue Type: Sub-task
>  Components: Project Infra
>Reporter: sjk
>
> 100 is too short
> our screens are bigger



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-3894) Scala style: line length increase to 160

2014-10-13 Thread sjk (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-3894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14170414#comment-14170414
 ] 

sjk commented on SPARK-3894:


Too many functions have more than four parameters, so their lines are longer than 
100 characters.
That is not friendly for reading code.

Maybe we can change the limit to 120.

For the existing code we change nothing; only newly merged code would use the 
120-character line length.

OK?

> Scala style: line length increase to 160
> 
>
> Key: SPARK-3894
> URL: https://issues.apache.org/jira/browse/SPARK-3894
> Project: Spark
>  Issue Type: Sub-task
>  Components: Project Infra
>Reporter: sjk
>
> 100 is too short
> our screens are bigger



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-3897) Scala style: format example code

2014-10-10 Thread sjk (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-3897?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

sjk updated SPARK-3897:
---
Description: https://github.com/apache/spark/pull/2754

> Scala style: format example code
> 
>
> Key: SPARK-3897
> URL: https://issues.apache.org/jira/browse/SPARK-3897
> Project: Spark
>  Issue Type: Sub-task
>  Components: Project Infra
>Reporter: sjk
>
> https://github.com/apache/spark/pull/2754



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-3897) Scala style: format example code

2014-10-10 Thread sjk (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-3897?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

sjk updated SPARK-3897:
---

https://github.com/apache/spark/pull/2754

> Scala style: format example code
> 
>
> Key: SPARK-3897
> URL: https://issues.apache.org/jira/browse/SPARK-3897
> Project: Spark
>  Issue Type: Sub-task
>  Components: Project Infra
>Reporter: sjk
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-3897) Scala style: format example code

2014-10-10 Thread sjk (JIRA)
sjk created SPARK-3897:
--

 Summary: Scala style: format example code
 Key: SPARK-3897
 URL: https://issues.apache.org/jira/browse/SPARK-3897
 Project: Spark
  Issue Type: Sub-task
Reporter: sjk






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-3896) checkSpeculatableTasks fast quit loop, invoking checkSpeculatableTasks is expensive

2014-10-09 Thread sjk (JIRA)
sjk created SPARK-3896:
--

 Summary: checkSpeculatableTasks fast quit loop, invoking 
checkSpeculatableTasks is expensive
 Key: SPARK-3896
 URL: https://issues.apache.org/jira/browse/SPARK-3896
 Project: Spark
  Issue Type: Improvement
Reporter: sjk






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-3895) Scala style: Indentation of method

2014-10-09 Thread sjk (JIRA)
sjk created SPARK-3895:
--

 Summary: Scala style: Indentation of method
 Key: SPARK-3895
 URL: https://issues.apache.org/jira/browse/SPARK-3895
 Project: Spark
  Issue Type: Sub-task
Reporter: sjk




such as https://github.com/apache/spark/pull/2734

{code:title=core/src/main/scala/org/apache/spark/Aggregator.scala|borderStyle=solid}
// for example
  def combineCombinersByKey(iter: Iterator[_ <: Product2[K, C]], context: 
TaskContext)
  : Iterator[(K, C)] =
  {

...

  def combineValuesByKey(iter: Iterator[_ <: Product2[K, V]],
 context: TaskContext): Iterator[(K, C)] = {

{code}

These do not conform to the 
rule: https://cwiki.apache.org/confluence/display/SPARK/Spark+Code+Style+Guide

There is a lot of code like this.




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-3854) Scala style: require spaces before `{`

2014-10-09 Thread sjk (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-3854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14166362#comment-14166362
 ] 

sjk commented on SPARK-3854:


I think there should be one space between such a symbol and the code that follows it.

> Scala style: require spaces before `{`
> --
>
> Key: SPARK-3854
> URL: https://issues.apache.org/jira/browse/SPARK-3854
> Project: Spark
>  Issue Type: Sub-task
>  Components: Project Infra
>Reporter: Josh Rosen
>
> We should require spaces before opening curly braces.  This isn't in the 
> style guide, but it probably should be:
> {code}
> // Correct:
> if (true) {
>   println("Wow!")
> }
> // Incorrect:
> if (true){
>println("Wow!")
> }
> {code}
> See https://github.com/apache/spark/pull/1658#discussion-diff-18611791 for an 
> example "in the wild."
> {{git grep "){"}} shows only a few occurrences of this style.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-3894) Scala style: line length increase to 160

2014-10-09 Thread sjk (JIRA)
sjk created SPARK-3894:
--

 Summary: Scala style: line length increase to 160
 Key: SPARK-3894
 URL: https://issues.apache.org/jira/browse/SPARK-3894
 Project: Spark
  Issue Type: Sub-task
Reporter: sjk


100 is too short

our screens are bigger




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-3893) declare mutableMap/mutableSet explicitly

2014-10-09 Thread sjk (JIRA)
sjk created SPARK-3893:
--

 Summary: declare  mutableMap/mutableSet explicitly
 Key: SPARK-3893
 URL: https://issues.apache.org/jira/browse/SPARK-3893
 Project: Spark
  Issue Type: Sub-task
  Components: Spark Core
Affects Versions: 1.1.0
Reporter: sjk



{code:java}
  // current
  val workers = new HashSet[WorkerInfo]
  // suggested
  val workers = new mutable.HashSet[WorkerInfo]
{code}

The other benefit is that it reminds us to consider whether an immutable 
collection could be used instead.

Most of the maps we use are mutable.




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-3781) code style format

2014-10-03 Thread sjk (JIRA)
sjk created SPARK-3781:
--

 Summary: code style format
 Key: SPARK-3781
 URL: https://issues.apache.org/jira/browse/SPARK-3781
 Project: Spark
  Issue Type: Improvement
Reporter: sjk






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org