[jira] [Commented] (GIRAPH-185) Improve concurrency of putMsg / putMsgList

2012-04-25 Thread Hyunsik Choi (JIRA)

[ 
https://issues.apache.org/jira/browse/GIRAPH-185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13262278#comment-13262278
 ] 

Hyunsik Choi commented on GIRAPH-185:
-

If there is a trade-off between performance and memory consumption, memory 
consumption seems more important in the current Giraph implementation. I also 
agree that some benchmarks are necessary.

> Improve concurrency of putMsg / putMsgList
> --
>
> Key: GIRAPH-185
> URL: https://issues.apache.org/jira/browse/GIRAPH-185
> Project: Giraph
> Issue Type: Improvement
> Components: graph
> Affects Versions: 0.2.0
> Reporter: Bo Wang
> Assignee: Bo Wang
> Fix For: 0.2.0
>
> Attachments: GIRAPH-185.patch, GIRAPH-185.patch
>
> Original Estimate: 2h
> Remaining Estimate: 2h
>
> Currently in putMsg / putMsgList, a synchronized block is used to protect 
> the whole transientInMessages when adding a new message. This lock blocks 
> other concurrent calls to putMsg / putMsgList and increases the response time. 
> We should use fine-grained locks to allow high concurrency in message 
> communication.
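
To make the contrast in the description concrete, here is a minimal sketch of
the two locking styles. The class names and the Long/String types are
hypothetical stand-ins, not Giraph's actual message store, and the second
variant uses concurrent collections rather than explicit per-vertex locks,
which is only one possible way to get finer granularity.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Queue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentLinkedQueue;

// Hypothetical stand-in: one monitor guards the whole map of inboxes.
class CoarseLockedStore {
    private final Map<Long, List<String>> inbox = new HashMap<>();

    // Every sender serializes on the same lock, whatever the target vertex.
    void putMsg(long vertexId, String msg) {
        synchronized (inbox) {
            inbox.computeIfAbsent(vertexId, k -> new ArrayList<>()).add(msg);
        }
    }

    int count(long vertexId) {
        synchronized (inbox) {
            List<String> msgs = inbox.get(vertexId);
            return msgs == null ? 0 : msgs.size();
        }
    }
}

// Hypothetical stand-in: contention is limited to the map bin and the queue
// of the target vertex, so senders to different vertices rarely interfere.
class FineGrainedStore {
    private final Map<Long, Queue<String>> inbox = new ConcurrentHashMap<>();

    void putMsg(long vertexId, String msg) {
        inbox.computeIfAbsent(vertexId, k -> new ConcurrentLinkedQueue<>()).add(msg);
    }

    int count(long vertexId) {
        Queue<String> msgs = inbox.get(vertexId);
        return msgs == null ? 0 : msgs.size();
    }
}

class PutMsgDemo {
    public static void main(String[] args) throws InterruptedException {
        FineGrainedStore store = new FineGrainedStore();
        Thread[] senders = new Thread[4];
        for (int t = 0; t < senders.length; t++) {
            senders[t] = new Thread(() -> {
                for (int i = 0; i < 1000; i++) {
                    store.putMsg(i % 10, "m");
                }
            });
            senders[t].start();
        }
        for (Thread s : senders) {
            s.join();
        }
        // Each of the 4 threads sends 100 messages to vertex 0.
        System.out.println("messages for vertex 0: " + store.count(0));
    }
}
```

Whether the second variant is actually faster, and what it costs in memory,
is exactly what a benchmark would need to show.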

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (GIRAPH-185) Improve concurrency of putMsg / putMsgList

2012-04-25 Thread Avery Ching (JIRA)

[ 
https://issues.apache.org/jira/browse/GIRAPH-185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13261906#comment-13261906
 ] 

Avery Ching commented on GIRAPH-185:


I agree that a benchmark should be done, although I expect the impact to be 
very small.  We should at least show it's not slower. =)






[jira] [Commented] (GIRAPH-185) Improve concurrency of putMsg / putMsgList

2012-04-25 Thread Claudio Martella (JIRA)

[ 
https://issues.apache.org/jira/browse/GIRAPH-185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13261854#comment-13261854
 ] 

Claudio Martella commented on GIRAPH-185:
-

Personally, I'd like to see some benchmarking on this issue. If we commit this, 
we should have an idea of the impact.






[jira] [Commented] (GIRAPH-185) Improve concurrency of putMsg / putMsgList

2012-04-25 Thread Bo Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/GIRAPH-185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13261830#comment-13261830
 ] 

Bo Wang commented on GIRAPH-185:


I checked the source and found the same thing. I think LinkedList should be OK 
in terms of space. ArrayList also has to keep empty space in its backing array 
for future insertions. Should we close this issue?
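
As a rough illustration of the spare capacity mentioned above, the sketch
below simulates a grow-by-half policy. The starting capacity of 10 and the
growth rule are assumptions modelled on OpenJDK 7 and later; other runtimes
may differ, and the class name is hypothetical.

```java
// Simulates an assumed ArrayList growth policy; it does not inspect a real list.
class ArrayListGrowthSketch {
    // Assumed policy: start with 10 slots, grow by half whenever the array is full.
    static int capacityAfter(int elements) {
        int capacity = 10;
        while (capacity < elements) {
            capacity += capacity >> 1;
        }
        return capacity;
    }

    public static void main(String[] args) {
        int[] sizes = {10, 11, 100, 1000};
        for (int n : sizes) {
            int capacity = capacityAfter(n);
            System.out.println(n + " elements -> capacity " + capacity
                    + ", unused slots " + (capacity - n));
        }
    }
}
```

Under these assumptions the unused slots stay below half of the element
count, and each unused slot costs one reference rather than a whole node.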






[jira] [Commented] (GIRAPH-185) Improve concurrency of putMsg / putMsgList

2012-04-25 Thread Claudio Martella (JIRA)

[ 
https://issues.apache.org/jira/browse/GIRAPH-185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13261787#comment-13261787
 ] 

Claudio Martella commented on GIRAPH-185:
-

Actually, I checked the source and there's no prev pointer: each Node has just 
a pointer to the payload and a pointer to the next node. The memory overhead 
should be small.
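
For a back-of-the-envelope feel for that overhead, here is a small estimate
of the size of a node holding only reference fields. The header sizes,
reference sizes and 8-byte alignment are assumptions about a typical 64-bit
HotSpot JVM, not measurements, and the class name is hypothetical.

```java
// Rough per-node size estimate; all layout numbers are assumptions, not measurements.
class NodeOverheadEstimate {
    static final int ALIGNMENT = 8; // assumed object alignment in bytes

    // Size of an object with only reference fields, rounded up to the alignment.
    static int nodeBytes(int headerBytes, int refBytes, int refFields) {
        int raw = headerBytes + refBytes * refFields;
        return (raw + ALIGNMENT - 1) / ALIGNMENT * ALIGNMENT;
    }

    public static void main(String[] args) {
        // Assumed: 12-byte header and 4-byte references with compressed references,
        // 16-byte header and 8-byte references without them.
        System.out.println("item + next, compressed refs:   " + nodeBytes(12, 4, 2));
        System.out.println("item + next, uncompressed refs: " + nodeBytes(16, 8, 2));
        System.out.println("with a third pointer, uncompressed refs: " + nodeBytes(16, 8, 3));
    }
}
```

Under these assumptions a node costs 24 or 32 bytes per message on top of
the message itself, which is the figure to weigh against one array slot per
message in an ArrayList.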






[jira] [Commented] (GIRAPH-185) Improve concurrency of putMsg / putMsgList

2012-04-25 Thread Claudio Martella (JIRA)

[ 
https://issues.apache.org/jira/browse/GIRAPH-185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13261774#comment-13261774
 ] 

Claudio Martella commented on GIRAPH-185:
-

ConcurrentLinkedQueue is going to be faster than a synchronized block, as it's 
just a CAS operation on the tail pointer, at least for the add() method, which 
appends to the tail of the queue. Also, ArrayList should in any case be slower 
at adding elements, as it requires expanding and copying the backing array 
when the allocated memory is exhausted.
Iteration could indeed be a bit slower than with an ArrayList because of 
poorer cache locality.

The memory overhead of each queue entry is indeed something that should be 
investigated. Worst case, one might think of copying the ConcurrentLinkedQueue 
implementation and removing the "prev" pointer, which we don't need.
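
For readers unfamiliar with the CAS pattern, here is a minimal sketch of a
lock-free insert. Note that this is deliberately not ConcurrentLinkedQueue's
algorithm: it is a simpler Treiber-style stack that pushes at the head, used
only to show the read, build, compare-and-set, retry loop. The class names
are hypothetical.

```java
import java.util.concurrent.atomic.AtomicReference;

// Treiber-style stack: a simplified illustration of a CAS retry loop.
class CasStack<T> {
    // Each node holds only the payload and a pointer to the next node.
    private static final class Node<T> {
        final T item;
        final Node<T> next;

        Node(T item, Node<T> next) {
            this.item = item;
            this.next = next;
        }
    }

    private final AtomicReference<Node<T>> head = new AtomicReference<>();

    // No lock is taken: if another thread changed head in between, retry.
    void push(T item) {
        Node<T> oldHead;
        Node<T> newHead;
        do {
            oldHead = head.get();
            newHead = new Node<>(item, oldHead);
        } while (!head.compareAndSet(oldHead, newHead));
    }

    // Linear walk; meant to be called once the writers have finished.
    int size() {
        int n = 0;
        for (Node<T> cur = head.get(); cur != null; cur = cur.next) {
            n++;
        }
        return n;
    }
}

class CasStackDemo {
    // Pushes perThread items from each of the given number of threads
    // and returns the final element count.
    static int pushConcurrently(int threads, int perThread) {
        CasStack<Integer> stack = new CasStack<>();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    stack.push(i);
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return stack.size();
    }

    public static void main(String[] args) {
        System.out.println("elements pushed: " + pushConcurrently(4, 1000));
    }
}
```

No push is lost even though no thread ever blocks; a failed compare-and-set
only costs one retry, which is why contention tends to be cheaper than with
a single monitor.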






[jira] [Commented] (GIRAPH-153) HBase/Accumulo Input and Output formats

2012-04-25 Thread Brian Femiano (JIRA)

[ 
https://issues.apache.org/jira/browse/GIRAPH-153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13261728#comment-13261728
 ] 

Brian Femiano commented on GIRAPH-153:
--

Updated the contrib Confluence wiki entry for clarity.

> HBase/Accumulo Input and Output formats
> ---
>
> Key: GIRAPH-153
> URL: https://issues.apache.org/jira/browse/GIRAPH-153
> Project: Giraph
> Issue Type: New Feature
> Components: bsp
> Affects Versions: 0.1.0
> Environment: Single host OSX 10.6.8 2.2 GHz Intel i7, 8GB
> Reporter: Brian Femiano
> Attachments: GIRAPH-153.1.patch, GIRAPH-153.patch
>
>
> Four abstract classes that wrap their respective delegate input/output 
> formats for easy hooks into vertex input format subclasses. I've included 
> some sample programs that show two very simple graph algorithms. I have a 
> graph generator that builds out a very simple directed structure, starting 
> with a few 'root' nodes. Root nodes are defined as nodes which are not listed 
> as a child anywhere in the graph.
> Algorithm 1) AccumuloRootMarker.java --> Accumulo as read/write source. 
> Every vertex starts out thinking it's a root. At superstep 0, each vertex 
> sends a message down to each child as a non-root notification. After 
> superstep 1, only root nodes will never have been messaged.
> Algorithm 2) TableRootMarker --> HBase as read/write source. Expands on A1 by 
> bundling the notification logic followed by root node propagation. Once we've 
> marked the appropriate nodes as roots, tell every child which roots it can be 
> traced back to via one or more spanning trees. This will take N + 2 
> supersteps, where N is the maximum number of hops from any root to any leaf, 
> plus 2 supersteps for the initial root flagging.
> I've included all relevant code plus DistributedCacheHelper.java for 
> recursive cache file and archive searches. It is more Hadoop-centric than 
> Giraph-centric, but these jobs use it, so I figured why not commit it here.
> These have been tested through the local JobRunner, pseudo-distributed on the 
> aforementioned hardware, and fully distributed on EC2. More details in the 
> comments.
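
The root-marking idea of Algorithm 1 can be sketched without Giraph, HBase or
Accumulo. The class, method names and sample graph below are hypothetical and
simply simulate the two supersteps over an in-memory adjacency map; this is
not the code in the attached patch.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Plain-Java simulation of the root-marking supersteps (hypothetical names).
class RootMarkerSketch {
    // children maps every vertex id to the ids of its children.
    static Set<String> findRoots(Map<String, List<String>> children) {
        // Every vertex starts out thinking it is a root.
        Set<String> roots = new TreeSet<>(children.keySet());

        // Superstep 0: each vertex sends a non-root notification to each child.
        Set<String> messaged = new HashSet<>();
        for (List<String> kids : children.values()) {
            messaged.addAll(kids);
        }

        // Superstep 1: any vertex that received a message is not a root.
        roots.removeAll(messaged);
        return roots;
    }

    // Small sample graph: a -> b, c; b -> d; e -> d; c and d are leaves.
    static Map<String, List<String>> sampleGraph() {
        Map<String, List<String>> g = new HashMap<>();
        g.put("a", Arrays.asList("b", "c"));
        g.put("b", Collections.singletonList("d"));
        g.put("c", Collections.<String>emptyList());
        g.put("d", Collections.<String>emptyList());
        g.put("e", Collections.singletonList("d"));
        return g;
    }

    public static void main(String[] args) {
        // Only a and e are never listed as a child.
        System.out.println(findRoots(sampleGraph())); // prints [a, e]
    }
}
```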
