[jira] [Commented] (GIRAPH-37) Implement Netty-backed rpc solution

2011-09-19 Thread Dmitriy V. Ryaboy (JIRA)

[ 
https://issues.apache.org/jira/browse/GIRAPH-37?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13108141#comment-13108141
 ] 

Dmitriy V. Ryaboy commented on GIRAPH-37:
-

(moving my comment from email thread onto jira):

Note that finagle is not thrift specific. It's rpc protocol agnostic.
We can make a finagle-hadooprpc connector. Granted, the thrift
implementation is pretty hardened. Actually the fact that finagle is
independent of rpc framework may be another reason to use it -- flip
between hadooprpc and thrift depending on whether you want performance
or security.

> Implement Netty-backed rpc solution
> ---
>
> Key: GIRAPH-37
> URL: https://issues.apache.org/jira/browse/GIRAPH-37
> Project: Giraph
>  Issue Type: New Feature
>Reporter: Jakob Homan
>Assignee: Jakob Homan
>
> GIRAPH-12 considered replacing the current Hadoop based rpc method with 
> Netty, but went in another direction. I think there is still value in 
> this approach, and will also look at Finagle.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (GIRAPH-34) Failure of Vertex reflection for putVertexList from GIRAPH-27

2011-09-16 Thread Dmitriy V. Ryaboy (JIRA)

[ 
https://issues.apache.org/jira/browse/GIRAPH-34?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13106789#comment-13106789
 ] 

Dmitriy V. Ryaboy commented on GIRAPH-34:
-

Never mind, Writable is an envelope for the actual message, and it doesn't 
matter what we do to writable -- the concern here is calling methods on the 
contained message, and of course we can't control that.

+1

> Failure of Vertex reflection for putVertexList from GIRAPH-27 
> --
>
> Key: GIRAPH-34
> URL: https://issues.apache.org/jira/browse/GIRAPH-34
> Project: Giraph
>  Issue Type: Bug
>Reporter: Christian Kunz
>Assignee: Avery Ching
> Attachments: GIRAPH-34.patch
>
>
> Christian actually found this bug.  I am filing the JIRA on his behalf.  
> Here's my error when running TestVertexRangeBalancer.  
> java.lang.RuntimeException: java.io.IOException: Call to 
> returnwhose-lm/10.72.107.231:30002 failed on local exception: 
> java.io.EOFException
>   at 
> org.apache.giraph.comm.BasicRPCCommunications.sendVertexListReq(BasicRPCCommunications.java:768)
>   at 
> org.apache.giraph.graph.BspServiceWorker.exchangeVertexRanges(BspServiceWorker.java:1282)
>   at org.apache.giraph.graph.GraphMapper.map(GraphMapper.java:589)
>   at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
>   at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:763)
>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:369)
>   at org.apache.hadoop.mapred.Child$4.run(Child.java:259)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:396)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1059)
>   at org.apache.hadoop.mapred.Child.main(Child.java:253)
> Caused by: java.io.IOException: Call to returnwhose-lm/10.72.107.231:30002 
> failed on local exception: java.io.EOFException
>   at org.apache.hadoop.ipc.Client.wrapException(Client.java:1065)
>   at org.apache.hadoop.ipc.Client.call(Client.java:1033)
>   at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:224)
>   at $Proxy3.putVertexList(Unknown Source)
>   at 
> org.apache.giraph.comm.BasicRPCCommunications.sendVertexListReq(BasicRPCCommunications.java:766)
>   ... 10 more
> Caused by: java.io.EOFException
>   at java.io.DataInputStream.readInt(DataInputStream.java:375)
>   at 
> org.apache.hadoop.ipc.Client$Connection.receiveResponse(Client.java:767)
>   at org.apache.hadoop.ipc.Client$Connection.run(Client.java:712)
> I identified and fixed the issue by making BasicVertex implement Configurable 
> and making the graph state set in BasicRPCCommunications.  There is one more 
> error though that I'll try and solve before putting up a reviewboard.





[jira] [Commented] (GIRAPH-34) Failure of Vertex reflection for putVertexList from GIRAPH-27

2011-09-16 Thread Dmitriy V. Ryaboy (JIRA)

[ 
https://issues.apache.org/jira/browse/GIRAPH-34?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13106769#comment-13106769
 ] 

Dmitriy V. Ryaboy commented on GIRAPH-34:
-

Would it make sense to make the messages immutable in sendMsg?





[jira] [Commented] (GIRAPH-35) Modifying the site to indicated that Jake Mannix and Dmitriy Ryaboy are now Giraph committers

2011-09-15 Thread Dmitriy V. Ryaboy (JIRA)

[ 
https://issues.apache.org/jira/browse/GIRAPH-35?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13105800#comment-13105800
 ] 

Dmitriy V. Ryaboy commented on GIRAPH-35:
-

Uh.. +1? :)

> Modifying the site to indicated that Jake Mannix and Dmitriy Ryaboy are now 
> Giraph committers
> -
>
> Key: GIRAPH-35
> URL: https://issues.apache.org/jira/browse/GIRAPH-35
> Project: Giraph
>  Issue Type: Task
>Reporter: Avery Ching
>Assignee: Avery Ching
> Attachments: GIRAPH-35.patch
>
>






[jira] [Commented] (GIRAPH-28) Introduce new primitive-specific MutableVertex subclasses

2011-09-14 Thread Dmitriy V. Ryaboy (JIRA)

[ 
https://issues.apache.org/jira/browse/GIRAPH-28?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13105039#comment-13105039
 ] 

Dmitriy V. Ryaboy commented on GIRAPH-28:
-

Nice job, Jimmy. Looks like you managed to get rid of something that was 
sucking in memory in both Object and LDFD.. what was it?

> Introduce new primitive-specific MutableVertex subclasses
> -
>
> Key: GIRAPH-28
> URL: https://issues.apache.org/jira/browse/GIRAPH-28
> Project: Giraph
>  Issue Type: New Feature
>  Components: graph
>Affects Versions: 0.70.0
>Reporter: Jake Mannix
>Assignee: Jake Mannix
> Attachments: GIRAPH-28.diff, GIRAPH-28.diff
>
>
> As discussed on the list, 
> MutableVertex (for 
> example) could be highly optimized in its memory footprint if the vertex and 
> edge data were held in a form which minimized Java object usage.





[jira] [Commented] (GIRAPH-28) Introduce new primitive-specific MutableVertex subclasses

2011-09-14 Thread Dmitriy V. Ryaboy (JIRA)

[ 
https://issues.apache.org/jira/browse/GIRAPH-28?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13105040#comment-13105040
 ] 

Dmitriy V. Ryaboy commented on GIRAPH-28:
-

Er, Jake. Sorry. Clearly, I can't tell you remote search people apart :-P





[jira] [Commented] (GIRAPH-31) Hide the SortedMap<I, Edge<I,E>> in Vertex from client visibility (impl. detail), replace with appropriate accessor methods

2011-09-13 Thread Dmitriy V. Ryaboy (JIRA)

[ 
https://issues.apache.org/jira/browse/GIRAPH-31?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13103740#comment-13103740
 ] 

Dmitriy V. Ryaboy commented on GIRAPH-31:
-

Avery,
It seems like requiring all BasicVertex implementations to implement a sorted 
iterator even when they don't need it is a bit heavy-handed.

> Hide the SortedMap<I, Edge<I,E>> in Vertex from client visibility (impl. 
> detail), replace with appropriate accessor methods
> ---
>
> Key: GIRAPH-31
> URL: https://issues.apache.org/jira/browse/GIRAPH-31
> Project: Giraph
>  Issue Type: Improvement
>  Components: graph
>Affects Versions: 0.70.0
>Reporter: Jake Mannix
>Assignee: Jake Mannix
> Attachments: GIRAPH-31.diff
>
>
> As discussed on the list, and on GIRAPH-28, the SortedMap<I, Edge<I,E>> is an 
> implementation detail which need not be exposed to application developers - 
> they need to iterate over the edges, and possibly access them one-by-one, and 
> remove them (in the Mutable case), but they don't need the SortedMap, and 
> creating primitive-optimized BasicVertex implementations is hampered by the 
> fact that clients expect this Map to exist.





[jira] [Commented] (GIRAPH-31) Hide the SortedMap<I, Edge<I,E>> in Vertex from client visibility (impl. detail), replace with appropriate accessor methods

2011-09-13 Thread Dmitriy V. Ryaboy (JIRA)

[ 
https://issues.apache.org/jira/browse/GIRAPH-31?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13103442#comment-13103442
 ] 

Dmitriy V. Ryaboy commented on GIRAPH-31:
-

I was just commenting on the javadoc, not the implementation. Though now that 
you say that, I think you are right; false is a safer thing to do.





[jira] [Commented] (GIRAPH-31) Hide the SortedMap<I, Edge<I,E>> in Vertex from client visibility (impl. detail), replace with appropriate accessor methods

2011-09-13 Thread Dmitriy V. Ryaboy (JIRA)

[ 
https://issues.apache.org/jira/browse/GIRAPH-31?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13103412#comment-13103412
 ] 

Dmitriy V. Ryaboy commented on GIRAPH-31:
-

non-committer +1.

Please change javadoc for providesSortedIterator to not just say "@return true" 
-- implementations that override this to return false might forget to provide 
their own javadoc, inherit this, and thus claim behavior opposite from what 
they actually do.
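
A minimal sketch of the kind of javadoc wording this asks for. The class names are hypothetical and only providesSortedIterator comes from the patch under review; the point is that the @return text describes both outcomes, so it stays accurate when an override inherits it.

```java
// Hypothetical base class; only the Javadoc wording is the point here.
abstract class SortedIteratorContractSketch {
    /**
     * Reports whether this vertex iterates over its edges in sorted order.
     *
     * @return true if the edge iterator is guaranteed to be sorted,
     *         false if no ordering is guaranteed
     */
    boolean providesSortedIterator() {
        return true;
    }
}

// Hypothetical subclass that overrides the default without its own Javadoc.
final class UnsortedVertexSketch extends SortedIteratorContractSketch {
    // Inherits the comment above, which remains correct for a false return.
    @Override
    boolean providesSortedIterator() {
        return false;
    }
}
```

With a bare "@return true" on the base method, the generated documentation for UnsortedVertexSketch would state the opposite of what the override does.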





[jira] [Commented] (GIRAPH-28) Introduce new primitive-specific MutableVertex subclasses

2011-09-12 Thread Dmitriy V. Ryaboy (JIRA)

[ 
https://issues.apache.org/jira/browse/GIRAPH-28?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13103369#comment-13103369
 ] 

Dmitriy V. Ryaboy commented on GIRAPH-28:
-

This:

bq. Alternatively, Edge could act just like a typical Writable, and the 
Iterator<Edge<I,E>> iterates over the same Edge object setting different 
values on it as next() is called.





[jira] [Commented] (GIRAPH-28) Introduce new primitive-specific MutableVertex subclasses

2011-09-12 Thread Dmitriy V. Ryaboy (JIRA)

[ 
https://issues.apache.org/jira/browse/GIRAPH-28?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13103365#comment-13103365
 ] 

Dmitriy V. Ryaboy commented on GIRAPH-28:
-

I'd caution against the approach of using a MutatorIterator (that's my name for 
that pattern. Like it? :)).
It's effective, but leads to extremely confusing bugs when people try to do 
things like take the first three edges, etc. Presenting a familiar interface 
but providing a tricky unintuitive implementation is not super friendly to 
developers; I don't think we want people to have to study the API to such an 
extent they have to know these details.
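
To make the failure mode concrete, here is a small sketch of the pattern. SimpleEdge and ReusingEdgeIterator are hypothetical stand-ins, not Giraph classes; the iterator hands back one shared object and overwrites its fields on every next().

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Hypothetical stand-in for an edge; not Giraph's actual Edge class.
final class SimpleEdge {
    long target;
    float weight;
}

// Iterator that reuses a single SimpleEdge, mutating it on each next().
final class ReusingEdgeIterator implements Iterator<SimpleEdge> {
    private final long[] targets;
    private final float[] weights;
    private final SimpleEdge shared = new SimpleEdge();
    private int pos = 0;

    ReusingEdgeIterator(long[] targets, float[] weights) {
        this.targets = targets;
        this.weights = weights;
    }

    @Override
    public boolean hasNext() {
        return pos < targets.length;
    }

    @Override
    public SimpleEdge next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        shared.target = targets[pos];
        shared.weight = weights[pos];
        pos++;
        return shared; // same instance every time
    }
}

final class MutatorIteratorDemo {
    // Naively collects the first n edges, as a client might reasonably do.
    static List<SimpleEdge> firstN(Iterator<SimpleEdge> it, int n) {
        List<SimpleEdge> out = new ArrayList<>();
        while (it.hasNext() && out.size() < n) {
            out.add(it.next());
        }
        return out;
    }

    public static void main(String[] args) {
        Iterator<SimpleEdge> it = new ReusingEdgeIterator(
            new long[] {10L, 20L, 30L, 40L}, new float[] {1f, 2f, 3f, 4f});
        for (SimpleEdge e : firstN(it, 3)) {
            System.out.println(e.target); // every entry aliases the shared object
        }
    }
}
```

The returned list holds three references to the same object, so all three report the third edge's target. A caller has to copy each edge before calling next() again, which is exactly the kind of detail the interface does not advertise.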





[jira] [Commented] (GIRAPH-28) Introduce new primitive-specific MutableVertex subclasses

2011-09-12 Thread Dmitriy V. Ryaboy (JIRA)

[ 
https://issues.apache.org/jira/browse/GIRAPH-28?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13103354#comment-13103354
 ] 

Dmitriy V. Ryaboy commented on GIRAPH-28:
-

Technically you shouldn't *have* to use hasEdge when adding and removing if you 
don't care. removeEdge() can return null ambiguously (value was null, or no 
such edge existed), and if you care, you can use hasEdge(), but if you don't, 
you don't. addEdge() can be an upsert.
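
A rough sketch of those semantics, assuming a plain HashMap as the backing store. EdgeMapSketch and its signatures are hypothetical; only the method names addEdge, removeEdge and hasEdge come from the discussion.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch; not the MutableVertex API from the patch.
final class EdgeMapSketch {
    private final Map<Long, Float> edges = new HashMap<>();

    // Upsert: inserts the edge, or overwrites the value if it already exists.
    void addEdge(long target, Float value) {
        edges.put(target, value);
    }

    // Returns the removed value. A null result is ambiguous:
    // either no such edge existed, or the edge's value was null.
    Float removeEdge(long target) {
        return edges.remove(target);
    }

    // Callers who care about the distinction can check first.
    boolean hasEdge(long target) {
        return edges.containsKey(target);
    }

    public static void main(String[] args) {
        EdgeMapSketch v = new EdgeMapSketch();
        v.addEdge(1L, 1.0f);
        v.addEdge(1L, 2.0f); // overwrites rather than failing
        System.out.println(v.removeEdge(1L));
        System.out.println(v.removeEdge(1L)); // null: edge is gone
    }
}
```

Callers that do not care whether the edge existed never need to call hasEdge(); callers that do care pay for one extra lookup.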





[jira] [Commented] (GIRAPH-28) Introduce new primitive-specific MutableVertex subclasses

2011-09-11 Thread Dmitriy V. Ryaboy (JIRA)

[ 
https://issues.apache.org/jira/browse/GIRAPH-28?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13102412#comment-13102412
 ] 

Dmitriy V. Ryaboy commented on GIRAPH-28:
-

There is something that's eating up memory in the LDFDVertex implementation 
that's not the OpenLongFloatHashMap.

I changed my code to specifically test the memory consumption of the edge map, 
removing all the Vertex complexity, by just putting together 4 implementations 
of a most basic interface:

{code}
public static abstract class Vertex {
    public abstract void addEdge(Edge edge);
}
{code}

The 4 implementations are "Current" (a map of LongWritables to 
Edge), "Just Value" (a map of LongWritable to 
FloatWritable), "Primitive Map" (an OpenLongFloatHashMap), and "Primitive" (two 
arrays of longs and floats, which resize and copy the whole array on every edge 
add -- an obviously untenable, but least memory intensive, implementation).

The code is at https://gist.github.com/1210524
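
For readers without the gist at hand, the "Primitive" variant described above could look roughly like the following. This is an illustrative sketch, not the gist's code; PrimitiveEdgeStore and its accessors are hypothetical names.

```java
import java.util.Arrays;

// Hypothetical sketch of the "Primitive" variant: two parallel arrays that
// are reallocated and copied on every add.
final class PrimitiveEdgeStore {
    private long[] targets = new long[0];
    private float[] weights = new float[0];

    // O(n) copy per add, but no per-edge objects are allocated.
    void addEdge(long target, float weight) {
        int n = targets.length;
        targets = Arrays.copyOf(targets, n + 1);
        weights = Arrays.copyOf(weights, n + 1);
        targets[n] = target;
        weights[n] = weight;
    }

    int size() {
        return targets.length;
    }

    long targetAt(int i) {
        return targets[i];
    }

    float weightAt(int i) {
        return weights[i];
    }

    public static void main(String[] args) {
        PrimitiveEdgeStore store = new PrimitiveEdgeStore();
        store.addEdge(7L, 0.5f);
        store.addEdge(9L, 1.5f);
        System.out.println(store.size());
    }
}
```

Growing by exactly one slot keeps the arrays free of slack space, which is why this variant serves as a lower bound on memory, at the price of quadratic copying when many edges are added.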

The results are much more sensible. Switching to the PrimitiveMap should be 
huge savings; even without that, getting rid of the duplicated LongWritable is 
quite noticeable.

{code}
Current       : 0      144
Just Value    : 0      144
Primitive Map : 0      272
Primitive     : 0      56
Current       : 1      240
Just Value    : 1      216
Primitive Map : 1      272
Primitive     : 1      72
Current       : 10     1104
Just Value    : 10     864
Primitive Map : 10     528
Primitive     : 10     176
Current       : 100    10704
Just Value    : 100    8304
Primitive Map : 100    3728
Primitive     : 100    1256
Current       : 1000   104272
Just Value    : 1000   80272
Primitive Map : 1000   45976
Primitive     : 1000   12056
Current       : 10000  1025616
Just Value    : 10000  785616
Primitive Map : 10000  301192
Primitive     : 10000  120056
{code}


> Introduce new primitive-specific MutableVertex subclasses
> -
>
> Key: GIRAPH-28
> URL: https://issues.apache.org/jira/browse/GIRAPH-28
> Project: Giraph
>  Issue Type: New Feature
>  Components: graph
>Affects Versions: 0.70.0
>Reporter: Jake Mannix
>Assignee: Jake Mannix
> Attachments: GIRAPH-28.diff
>
>
> As discussed on the list, 
> MutableVertex (for 
> example) could be highly optimized in its memory footprint if the vertex and 
> edge data were held in a form which minimized Java object usage.





[jira] [Commented] (GIRAPH-28) Introduce new primitive-specific MutableVertex subclasses

2011-09-11 Thread Dmitriy V. Ryaboy (JIRA)

[ 
https://issues.apache.org/jira/browse/GIRAPH-28?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13102377#comment-13102377
 ] 

Dmitriy V. Ryaboy commented on GIRAPH-28:
-

I tried to benchmark the memory footprint on this and was surprised to find 
that it's actually larger than the existing implementation!

At Jake's suggestion, I also added a dummy "TinyVertex" that just does the 
simplest thing possible, by keeping an array of primitives and resizing it when 
needed. This gives us the lower bound on what our memory utilization could look 
like. The answer is, we could be 30x more efficient in terms of memory 
utilization, at the cost of some CPU.

Code is here: https://gist.github.com/1210245





[jira] [Commented] (GIRAPH-28) Introduce new primitive-specific MutableVertex subclasses

2011-09-11 Thread Dmitriy V. Ryaboy (JIRA)

[ 
https://issues.apache.org/jira/browse/GIRAPH-28?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13102349#comment-13102349
 ] 

Dmitriy V. Ryaboy commented on GIRAPH-28:
-

Jake, patch doesn't apply to current trunk. Can you rebase?





[jira] [Commented] (GIRAPH-25) NPE in BspServiceMaster when failing a job

2011-09-09 Thread Dmitriy V. Ryaboy (JIRA)

[ 
https://issues.apache.org/jira/browse/GIRAPH-25?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13101711#comment-13101711
 ] 

Dmitriy V. Ryaboy commented on GIRAPH-25:
-

I think usually the committer resolves the issue.

Thanks for taking the patch! I'm going to try and break Giraph in a few more 
ways this weekend :-)

> NPE in BspServiceMaster when failing a job
> --
>
> Key: GIRAPH-25
> URL: https://issues.apache.org/jira/browse/GIRAPH-25
> Project: Giraph
>  Issue Type: Bug
>Reporter: Dmitriy V. Ryaboy
>Assignee: Dmitriy V. Ryaboy
>Priority: Minor
> Attachments: GIRAPH-25.2.patch, GIRAPH-25.patch
>
>
> When BspServiceMaster times out waiting for all workers to check in, it dies 
> with a NullPointerException.
> This can perhaps be handled a bit more gracefully.





[jira] [Commented] (GIRAPH-27) Mutable static global state in Vertex.java should be refactored

2011-09-08 Thread Dmitriy V. Ryaboy (JIRA)

[ 
https://issues.apache.org/jira/browse/GIRAPH-27?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13100852#comment-13100852
 ] 

Dmitriy V. Ryaboy commented on GIRAPH-27:
-

Avery, looks like you forgot to add GraphState to svn before generating the 
patch. 

> Mutable static global state in Vertex.java should be refactored
> ---
>
> Key: GIRAPH-27
> URL: https://issues.apache.org/jira/browse/GIRAPH-27
> Project: Giraph
>  Issue Type: Improvement
>  Components: graph
>Affects Versions: 0.70.0
>Reporter: Jake Mannix
>Assignee: Jake Mannix
> Attachments: GIRAPH-27.patch, GIRAPH-27.patch
>
>
> Vertex.java has a bunch of static methods for getting/setting global graph 
> state (total number of vertices, edges, a reference to the GraphMapper, etc). 
>  Refactoring this into a GraphState object, which every Vertex can hold onto 
> a reference to (yes, a tiny bit more memory per Vertex, but in comparison to 
> what's already in there...)





[jira] [Commented] (GIRAPH-27) Mutable static global state in Vertex.java should be refactored

2011-09-08 Thread Dmitriy V. Ryaboy (JIRA)

[ 
https://issues.apache.org/jira/browse/GIRAPH-27?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13100666#comment-13100666
 ] 

Dmitriy V. Ryaboy commented on GIRAPH-27:
-

non-committer +1

> Mutable static global state in Vertex.java should be refactored
> ---
>
> Key: GIRAPH-27
> URL: https://issues.apache.org/jira/browse/GIRAPH-27
> Project: Giraph
>  Issue Type: Improvement
>  Components: graph
>Affects Versions: 0.70.0
>Reporter: Jake Mannix
> Attachments: GIRAPH-27.patch, GIRAPH-27.patch
>
>
> Vertex.java has a bunch of static methods for getting/setting global graph 
> state (total number of vertices, edges, a reference to the GraphMapper, etc). 
>  Refactoring this into a GraphState object, which every Vertex can hold onto 
> a reference to (yes, a tiny bit more memory per Vertex, but in comparison to 
> what's already in there...)





[jira] [Commented] (GIRAPH-27) Mutable static global state in Vertex.java should be refactored

2011-09-08 Thread Dmitriy V. Ryaboy (JIRA)

[ 
https://issues.apache.org/jira/browse/GIRAPH-27?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13100616#comment-13100616
 ] 

Dmitriy V. Ryaboy commented on GIRAPH-27:
-

I took a brief look; it'd be great if we agreed on import ordering so that 
everyone's IDEs didn't reorder every time.

You do have some two-space padding in places; I believe Giraph conventions are 
4 spaces.

public class GraphState needs a javadoc (what are I, V, E, and M? I know, but 
it'd be nice to have it in writing..) Said javadoc should probably include 
scary warnings about making sure that one doesn't wind up with multiple 
different states floating around in a job.


Can you make the GraphState setters chainable (return this instead of void)? 
That'll make creating them flow much nicer in GraphMapper.
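
As an illustration of what chainable setters buy, here is a minimal sketch. GraphStateSketch and its three fields are assumptions made for the example; the GraphState class in the patch may hold different state.

```java
// Hypothetical subset of fields; the real GraphState may differ.
final class GraphStateSketch {
    private long superstep;
    private long numVertices;
    private long numEdges;

    // Each setter returns this instead of void, so calls can be chained.
    GraphStateSketch setSuperstep(long superstep) {
        this.superstep = superstep;
        return this;
    }

    GraphStateSketch setNumVertices(long numVertices) {
        this.numVertices = numVertices;
        return this;
    }

    GraphStateSketch setNumEdges(long numEdges) {
        this.numEdges = numEdges;
        return this;
    }

    long getSuperstep() {
        return superstep;
    }

    long getNumVertices() {
        return numVertices;
    }

    long getNumEdges() {
        return numEdges;
    }
}

final class ChainableSettersDemo {
    public static void main(String[] args) {
        // One expression instead of a declaration plus three statements.
        GraphStateSketch state = new GraphStateSketch()
            .setSuperstep(0L)
            .setNumVertices(100L)
            .setNumEdges(250L);
        System.out.println(state.getNumVertices());
    }
}
```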


Why remove the explicit type arguments in calls to BspUtils methods in Vertex?



> Mutable static global state in Vertex.java should be refactored
> ---
>
> Key: GIRAPH-27
> URL: https://issues.apache.org/jira/browse/GIRAPH-27
> Project: Giraph
>  Issue Type: Improvement
>  Components: graph
>Affects Versions: 0.70.0
>Reporter: Jake Mannix
> Attachments: GIRAPH-27.patch
>
>
> Vertex.java has a bunch of static methods for getting/setting global graph 
> state (total number of vertices, edges, a reference to the GraphMapper, etc). 
>  Refactoring this into a GraphState object, which every Vertex can hold onto 
> a reference to (yes, a tiny bit more memory per Vertex, but in comparison to 
> what's already in there...)





[jira] [Commented] (GIRAPH-25) NPE in BspServiceMaster when failing a job

2011-09-08 Thread Dmitriy V. Ryaboy (JIRA)

[ 
https://issues.apache.org/jira/browse/GIRAPH-25?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13100590#comment-13100590
 ] 

Dmitriy V. Ryaboy commented on GIRAPH-25:
-

Thanks Avery!
Mind adding me to the contributors list on the project so I can post-factum 
"assign" this one to myself?

FYI the way we've done the attribution in Pig (and Hadoop, I think) in the 
commit message is the more succinct "JIRA-123: description. $patch_author via 
$committer."

> NPE in BspServiceMaster when failing a job
> --
>
> Key: GIRAPH-25
> URL: https://issues.apache.org/jira/browse/GIRAPH-25
> Project: Giraph
>  Issue Type: Bug
>Reporter: Dmitriy V. Ryaboy
>Priority: Minor
> Attachments: GIRAPH-25.2.patch, GIRAPH-25.patch
>
>
> When BspServiceMaster times out waiting for all workers to check in, it dies 
> with a NullPointerException.
> This can perhaps be handled a bit more gracefully.





[jira] [Commented] (GIRAPH-26) Improve PseudoRandomVertexInputFormat to create a more realistic synthetic graph (e.g. power-law distributed vertex-cardinality).

2011-09-06 Thread Dmitriy V. Ryaboy (JIRA)

[ 
https://issues.apache.org/jira/browse/GIRAPH-26?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13098614#comment-13098614
 ] 

Dmitriy V. Ryaboy commented on GIRAPH-26:
-

I am not so sure it's that easy in a parallel world, Jake :)
http://arxiv.org/pdf/1003.3684v1

> Improve PseudoRandomVertexInputFormat to create a more realistic synthetic 
> graph (e.g. power-law distributed vertex-cardinality).
> -
>
> Key: GIRAPH-26
> URL: https://issues.apache.org/jira/browse/GIRAPH-26
> Project: Giraph
>  Issue Type: Test
>  Components: benchmark
>Reporter: Jake Mannix
>Priority: Minor
>
> The PageRankBenchmark class, to be a proper benchmark, should run over graphs 
> which look more like data seen in the wild, and web link graphs, social 
> network graphs, and text corpora (represented as a bipartite graph) all have 
> power-law distributions, so benchmarking a synthetic graph which looks more 
> like this would be a nice test which would stress cases of uneven 
> split-distribution and bottlenecks of subclusters of the graph of heavily 
> connected vertices.





[jira] [Updated] (GIRAPH-25) NPE in BspServiceMaster when failing a job

2011-09-06 Thread Dmitriy V. Ryaboy (JIRA)

 [ 
https://issues.apache.org/jira/browse/GIRAPH-25?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dmitriy V. Ryaboy updated GIRAPH-25:


Attachment: GIRAPH-25.patch

Attached a basic fix.

The problem was that failing the job did everything correctly, but did not stop 
BspServiceMaster from proceeding. 

There are two choices here -- declare an exception, throw it in this case, and 
deal with that upstream; or, C-style, return -1. I chose the latter because it 
makes the code that deals with this more succinct and it doesn't change a 
public API. But I can rewrite if you prefer to throw an exception.
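
To make the trade-off concrete, here is a minimal sketch of the two options side by side. It is not the attached patch: `MasterSketch`, `JobFailedException`, the two `createInputSplits...` variants and the string worker names are hypothetical stand-ins, and the real BspServiceMaster code differs.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for the master; not Giraph's BspServiceMaster. */
class MasterSketch {
    /** Assumed exception type for option 1; it does not exist in Giraph. */
    static class JobFailedException extends Exception {
        JobFailedException(String message) {
            super(message);
        }
    }

    private final int found;
    private final int needed;

    MasterSketch(int found, int needed) {
        this.found = found;
        this.needed = needed;
    }

    /** Returns the workers that checked in, or null if there were too few. */
    private List<String> checkWorkers() {
        if (found < needed) {
            return null; // at this point the job has already been failed
        }
        List<String> workers = new ArrayList<String>();
        for (int i = 0; i < found; i++) {
            workers.add("worker-" + i);
        }
        return workers;
    }

    /** Option 1: declare an exception and let the caller handle it upstream. */
    int createInputSplitsThrowing() throws JobFailedException {
        List<String> workers = checkWorkers();
        if (workers == null) {
            throw new JobFailedException("not enough workers checked in");
        }
        return workers.size();
    }

    /** Option 2 (C-style): signal failure with -1 and keep the signature as is. */
    int createInputSplitsReturningCode() {
        List<String> workers = checkWorkers();
        if (workers == null) {
            return -1; // stop here instead of dereferencing the null list
        }
        return workers.size();
    }

    public static void main(String[] args) {
        MasterSketch master = new MasterSketch(182, 186);
        // Option 2 at the call site: a single comparison.
        if (master.createInputSplitsReturningCode() == -1) {
            System.out.println("return code: master stops");
        }
        // Option 1 at the call site: a try/catch around the call.
        try {
            master.createInputSplitsThrowing();
        } catch (JobFailedException e) {
            System.out.println("exception: " + e.getMessage());
        }
    }
}
```

The return-code variant leaves the method signature untouched, which is why it avoids changing a public API; the exception variant forces every caller to handle or redeclare the failure.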

No test as I wasn't sure how best to fit this into the way the tests are set up.

> NPE in BspServiceMaster when failing a job
> --
>
> Key: GIRAPH-25
> URL: https://issues.apache.org/jira/browse/GIRAPH-25
> Project: Giraph
>  Issue Type: Bug
>Reporter: Dmitriy V. Ryaboy
>Priority: Minor
> Attachments: GIRAPH-25.patch
>
>
> When BspServiceMaster times out waiting for all workers to check in, it dies 
> with a NullPointerException.
> This can perhaps be handled a bit more gracefully.





[jira] [Commented] (GIRAPH-25) NPE in BspServiceMaster when failing a job

2011-09-03 Thread Dmitriy V. Ryaboy (JIRA)

[ 
https://issues.apache.org/jira/browse/GIRAPH-25?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13096816#comment-13096816
 ] 

Dmitriy V. Ryaboy commented on GIRAPH-25:
-

Here's the log I saw on a timed-out master:

{code}
2011-09-04 05:22:11,115 INFO org.apache.giraph.graph.BspServiceMaster: checkWorkers: Only found 182 responses of 186 needed to start superstep -1.  Sleeping for 3 msecs and used 9 of 10 attempts.
2011-09-04 05:22:11,115 WARN org.apache.giraph.graph.BspServiceMaster: checkWorkers: Did not receive enough processes in time (only 182 of 186 required)
2011-09-04 05:22:11,120 INFO org.apache.giraph.graph.BspServiceMaster: setJobState: {"_stateKey":"FAILED","_applicationAttemptKey":-1,"_superstepKey":-1} on superstep -1
2011-09-04 05:22:11,129 FATAL org.apache.giraph.graph.BspServiceMaster: failJob: Killing job job_201109012213_17306
2011-09-04 05:22:11,159 ERROR org.apache.giraph.graph.MasterThread: masterThread: Master algorithm failed: 
java.lang.NullPointerException
    at org.apache.giraph.graph.BspServiceMaster.createInputSplits(BspServiceMaster.java:486)
    at org.apache.giraph.graph.MasterThread.run(MasterThread.java:94)
2011-09-04 05:22:11,160 FATAL org.apache.giraph.graph.GraphMapper: uncaughtException: OverrideExceptionHandler on thread org.apache.giraph.graph.MasterThread, msg = java.lang.NullPointerException, exiting...
java.lang.RuntimeException: java.lang.NullPointerException
    at org.apache.giraph.graph.MasterThread.run(MasterThread.java:177)
Caused by: java.lang.NullPointerException
    at org.apache.giraph.graph.BspServiceMaster.createInputSplits(BspServiceMaster.java:486)
    at org.apache.giraph.graph.MasterThread.run(MasterThread.java:94)
2011-09-04 05:22:11,161 WARN org.apache.giraph.zk.ZooKeeperManager: onlineZooKeeperServers: Forced a shutdown hook kill of the ZooKeeper process.
{code}

> NPE in BspServiceMaster when failing a job
> --
>
> Key: GIRAPH-25
> URL: https://issues.apache.org/jira/browse/GIRAPH-25
> Project: Giraph
>  Issue Type: Bug
>Reporter: Dmitriy V. Ryaboy
>Priority: Minor
>
> When BspServiceMaster times out waiting for all workers to check in, it dies 
> with a NullPointerException.
> This can perhaps be handled a bit more gracefully.





[jira] [Created] (GIRAPH-25) NPE in BspServiceMaster when failing a job

2011-09-03 Thread Dmitriy V. Ryaboy (JIRA)
NPE in BspServiceMaster when failing a job
--

 Key: GIRAPH-25
 URL: https://issues.apache.org/jira/browse/GIRAPH-25
 Project: Giraph
  Issue Type: Bug
Reporter: Dmitriy V. Ryaboy
Priority: Minor


When BspServiceMaster times out waiting for all workers to check in, it dies 
with a NullPointerException.
This can perhaps be handled a bit more gracefully.
