[jira] [Commented] (GIRAPH-307) InputSplit list can be long with many workers (and locality info) and should not be re-created every time a worker calls reserveInputSplit()

2012-09-02 Thread Eli Reisman (JIRA)

[ 
https://issues.apache.org/jira/browse/GIRAPH-307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13447061#comment-13447061
 ] 

Eli Reisman commented on GIRAPH-307:


Going to rebase this now that 301 & 318 are in; will post a patch ASAP.



> InputSplit list can be long with many workers (and locality info) and should 
> not be re-created every time a worker calls reserveInputSplit()
> 
>
> Key: GIRAPH-307
> URL: https://issues.apache.org/jira/browse/GIRAPH-307
> Project: Giraph
> Issue Type: Improvement
> Components: bsp, graph
> Affects Versions: 0.2.0
> Reporter: Eli Reisman
> Assignee: Eli Reisman
> Priority: Minor
> Fix For: 0.2.0
>
> Attachments: GIRAPH-307-1.patch
>
>
> While instrumenting the INPUT_SUPERSTEP and watching various runs, I see the 
> input split list generated every time a worker calls reserveInputSplit is, 
> for all intents and purposes, immutable per job. Therefore, we can save a 
> fair amount of memory by not re-creating the list and re-querying ZooKeeper 
> on each pass to claim another split. Only the reserved and finished children 
> lists are ever mutated during the input phase of the job.
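
The idea in the description above can be sketched roughly as follows. This is a minimal illustration, not the code from the attached patches: the class names and the fetcher are hypothetical, and the Supplier stands in for whatever ZooKeeper query builds the split list. The list is built once, wrapped as unmodifiable, and handed back on every later call instead of being re-created.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.function.Supplier;

/**
 * Caches the per-job input split path list so it is fetched only once.
 * Hypothetical sketch: these names are not Giraph's actual classes.
 */
final class CachedInputSplitPaths {
  /** Stand-in for the expensive ZooKeeper child listing (assumption). */
  private final Supplier<List<String>> splitPathFetcher;
  /** Populated on first use; never modified afterwards. */
  private List<String> cachedPaths;

  CachedInputSplitPaths(Supplier<List<String>> splitPathFetcher) {
    this.splitPathFetcher = splitPathFetcher;
  }

  /** Returns the same unmodifiable list on every call after the first. */
  synchronized List<String> getPaths() {
    if (cachedPaths == null) {
      // Copy defensively so later changes by the fetcher cannot leak in.
      cachedPaths = Collections.unmodifiableList(
          new ArrayList<>(splitPathFetcher.get()));
    }
    return cachedPaths;
  }
}

final class InputSplitCacheDemo {
  public static void main(String[] args) {
    int[] fetchCount = {0};
    CachedInputSplitPaths cache = new CachedInputSplitPaths(() -> {
      fetchCount[0]++;
      // Made-up paths, for illustration only.
      return Arrays.asList("/inputSplits/0", "/inputSplits/1");
    });
    // Simulate a worker making several passes to claim splits.
    for (int pass = 0; pass < 3; pass++) {
      cache.getPaths();
    }
    System.out.println("fetches: " + fetchCount[0]);
  }
}
```

In this sketch only the split path list is cached. The reserved and finished children, which the description says do change during the input phase, would still have to be checked on each pass.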

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (GIRAPH-307) InputSplit list can be long with many workers (and locality info) and should not be re-created every time a worker calls reserveInputSplit()

2012-10-04 Thread Maja Kabiljo (JIRA)

[ 
https://issues.apache.org/jira/browse/GIRAPH-307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13469490#comment-13469490
 ] 

Maja Kabiljo commented on GIRAPH-307:

Looks good to me. Just one comment: can you please rename the test to reflect 
the class name change? 
Did you see any speed improvement from fewer ZooKeeper reads?

> Attachments: GIRAPH-307-1.patch, GIRAPH-307-2.patch


[jira] [Commented] (GIRAPH-307) InputSplit list can be long with many workers (and locality info) and should not be re-created every time a worker calls reserveInputSplit()

2012-10-05 Thread Maja Kabiljo (JIRA)

[ 
https://issues.apache.org/jira/browse/GIRAPH-307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13470401#comment-13470401
 ] 

Maja Kabiljo commented on GIRAPH-307:

I see, reading this list is fast compared to the other things happening at that 
time. Still, if we don't need to read it multiple times, we shouldn't.

Thanks, Eli, +1. Unless somebody has an objection, I'll commit this tonight.

> Attachments: GIRAPH-307-1.patch, GIRAPH-307-2.patch, 
> GIRAPH-307-3.patch


[jira] [Commented] (GIRAPH-307) InputSplit list can be long with many workers (and locality info) and should not be re-created every time a worker calls reserveInputSplit()

2012-10-06 Thread Maja Kabiljo (JIRA)

[ 
https://issues.apache.org/jira/browse/GIRAPH-307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13471052#comment-13471052
 ] 

Maja Kabiljo commented on GIRAPH-307:

If I see correctly, the build failed for some strange reason that has happened 
before. What do we do in this case?



[jira] [Commented] (GIRAPH-307) InputSplit list can be long with many workers (and locality info) and should not be re-created every time a worker calls reserveInputSplit()

2012-10-06 Thread Avery Ching (JIRA)

[ 
https://issues.apache.org/jira/browse/GIRAPH-307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13471061#comment-13471061
 ] 

Avery Ching commented on GIRAPH-307:


I just restarted it: https://builds.apache.org/job/Giraph-trunk-Commit/229/
Let's see how it does this time.



[jira] [Commented] (GIRAPH-307) InputSplit list can be long with many workers (and locality info) and should not be re-created every time a worker calls reserveInputSplit()

2012-10-06 Thread Avery Ching (JIRA)

[ 
https://issues.apache.org/jira/browse/GIRAPH-307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13471060#comment-13471060
 ] 

Avery Ching commented on GIRAPH-307:


In this case, it was a problem on the Hudson side.

https://builds.apache.org/job/Giraph-trunk-Commit/228/console

So log in to Hudson and run it again =).



[jira] [Commented] (GIRAPH-307) InputSplit list can be long with many workers (and locality info) and should not be re-created every time a worker calls reserveInputSplit()

2012-10-06 Thread Avery Ching (JIRA)

[ 
https://issues.apache.org/jira/browse/GIRAPH-307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13471065#comment-13471065
 ] 

Avery Ching commented on GIRAPH-307:


This one passed. https://builds.apache.org/job/Giraph-trunk-Commit/229/
