[jira] [Commented] (GIRAPH-37) Implement Netty-backed rpc solution

2011-09-17 Thread Jake Mannix (JIRA)

[ 
https://issues.apache.org/jira/browse/GIRAPH-37?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13107165#comment-13107165
 ] 

Jake Mannix commented on GIRAPH-37:
---

We should make sure we don't all work on the same thing (note the discussion at 
the end of GIRAPH-12) - two at a time might be fine, but half of the developers 
all on RPC might be excessive.  Do you want to take this one?  I was going to 
go in and try and implement a Finagle-based solution, as it's already an async 
RPC-system on top of Netty, but if you're already going to look at this, I can 
drop what I was doing and work on something else.

 Implement Netty-backed rpc solution
 ---

 Key: GIRAPH-37
 URL: https://issues.apache.org/jira/browse/GIRAPH-37
 Project: Giraph
  Issue Type: New Feature
Reporter: Jakob Homan
Assignee: Jakob Homan

 GIRAPH-12 considered replacing the current Hadoop-based RPC method with 
 Netty, but went in another direction. I think there is still value in 
 this approach, and will also look at Finagle.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Assigned] (GIRAPH-12) Investigate communication improvements

2011-09-17 Thread Jake Mannix (JIRA)

 [ 
https://issues.apache.org/jira/browse/GIRAPH-12?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jake Mannix reassigned GIRAPH-12:
-

Assignee: Avery Ching  (was: Hyunsik Choi)

 Investigate communication improvements
 --

 Key: GIRAPH-12
 URL: https://issues.apache.org/jira/browse/GIRAPH-12
 Project: Giraph
  Issue Type: Improvement
  Components: bsp
Reporter: Avery Ching
Assignee: Avery Ching
Priority: Minor
 Attachments: GIRAPH-12_1.patch


 Currently every worker will start up a thread to communicate with every other 
 worker.  Hadoop RPC is used for communication.  For instance, if there are 
 400 workers, each worker will create 400 threads.  This ends up using a lot 
 of memory, even with the option  
 -Dmapred.child.java.opts=-Xss64k.  
 It would be good to investigate using frameworks like Netty, or rolling 
 our own, to improve this situation.  By moving away from Hadoop RPC, we would 
 also make compatibility across different Hadoop versions easier.
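
To put rough numbers on the thread-per-peer cost described above, here is a small back-of-the-envelope sketch. It counts only thread stacks at the 64k size from the quoted option, so it is a lower bound that ignores buffers and any other per-connection state. The class and method names are made up for illustration.

```java
/** Back-of-the-envelope stack cost of keeping one thread per peer worker (illustrative only). */
final class ThreadStackCost {
    /** Bytes of thread stack reserved on one worker that keeps one thread per peer. */
    static long stackBytesPerWorker(int workers, long stackBytesPerThread) {
        return (long) workers * stackBytesPerThread;
    }

    /** Bytes of thread stack reserved across all workers in the job. */
    static long stackBytesPerJob(int workers, long stackBytesPerThread) {
        return (long) workers * stackBytesPerWorker(workers, stackBytesPerThread);
    }

    public static void main(String[] args) {
        int workers = 400;            // figure from the issue description
        long stack = 64L * 1024;      // -Xss64k
        System.out.println(stackBytesPerWorker(workers, stack)); // 26214400 bytes (25 MiB)
        System.out.println(stackBytesPerJob(workers, stack));    // 10485760000 bytes (10000 MiB)
    }
}
```

The per-worker figure grows linearly with the number of workers and the job-wide figure grows quadratically, which is why a shared, event-driven transport looks attractive here.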





[jira] [Assigned] (GIRAPH-12) Investigate communication improvements

2011-09-17 Thread Jake Mannix (JIRA)

 [ 
https://issues.apache.org/jira/browse/GIRAPH-12?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jake Mannix reassigned GIRAPH-12:
-

Assignee: Hyunsik Choi  (was: Avery Ching)

Sorry, my 4-year-old clicked when I was looking at this ticket.  Didn't notice 
that it managed to make an actual assignment; reverting!

 Investigate communication improvements
 --

 Key: GIRAPH-12
 URL: https://issues.apache.org/jira/browse/GIRAPH-12
 Project: Giraph
  Issue Type: Improvement
  Components: bsp
Reporter: Avery Ching
Assignee: Hyunsik Choi
Priority: Minor
 Attachments: GIRAPH-12_1.patch






[jira] [Commented] (GIRAPH-36) Ensure that subclassing BasicVertex is possible by user apps

2011-09-17 Thread Jake Mannix (JIRA)

[ 
https://issues.apache.org/jira/browse/GIRAPH-36?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13107281#comment-13107281
 ] 

Jake Mannix commented on GIRAPH-36:
---

Initial thoughts:

  VertexReader defines a next(MutableVertex vertex) method, which does the 
sensible thing of filling in the vertex from the HDFS block, and because it 
takes a vertex object and messes with it, it's natural that the vertex be 
required to be a MutableVertex.

  But of course this implies that *everything* must be a MutableVertex, because 
if you can't be read in by a VertexReader, where do you get instantiated at all?  
If BasicVertex implemented Writable, we could always readFields() the data in 
without allowing mutation, but this seems like it would interfere with the way 
VertexReader lets users read straight from Text, etc.  It would, at the same 
time, allow VertexList to extend ArrayList<BasicVertex> instead of 
ArrayList<Vertex>.

Anyone have any thoughts/ideas?  Are we wedded to making VertexReader 
implementations deal with MutableVertex, or can we swap them to handle Writable 
BasicVertex?

 Ensure that subclassing BasicVertex is possible by user apps
 

 Key: GIRAPH-36
 URL: https://issues.apache.org/jira/browse/GIRAPH-36
 Project: Giraph
  Issue Type: Improvement
  Components: graph
Affects Versions: 0.70.0
Reporter: Jake Mannix
Assignee: Jake Mannix
Priority: Blocker
 Fix For: 0.70.0


 Original assumptions in Giraph were that all users would subclass Vertex 
 (which extended MutableVertex, which extended BasicVertex).  Classes which 
 wish to have application-specific data structures (i.e. not a 
 TreeMap<I, Edge<I,E>>) may need to extend either MutableVertex or 
 BasicVertex.  Unfortunately VertexRange extends ArrayList<Vertex>, and there 
 are other places where the assumption is that vertex classes are either 
 Vertex, or at least MutableVertex.
 Let's make sure the internal APIs allow for BasicVertex to be the base class.





[jira] [Commented] (GIRAPH-36) Ensure that subclassing BasicVertex is possible by user apps

2011-09-17 Thread Jake Mannix (JIRA)

[ 
https://issues.apache.org/jira/browse/GIRAPH-36?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13107282#comment-13107282
 ] 

Jake Mannix commented on GIRAPH-36:
---

In fact, thinking about VertexReader further, it seems its entire API is a 
little backwards.  Why are we *passing in* instantiated Vertices, and filling 
them in?  Shouldn't they effectively be iterators over the InputSplit?
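
As a rough illustration of the iterator idea, the sketch below has the reader construct vertices and hand them out, instead of filling in an instance that was passed to it. Nothing here is Giraph's API: `ReadOnlyVertex`, `LineVertexIterator`, and the tab-separated record format are assumptions, and a plain list of strings stands in for the InputSplit.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

/** Illustrative only: none of these types are Giraph's. */
final class ReaderStyles {
    /** Stand-in for a read-only vertex. */
    interface ReadOnlyVertex {
        long id();
        double value();
    }

    /** Iterator-style reader over "id<TAB>value" records; the list stands in for an InputSplit. */
    static final class LineVertexIterator implements Iterator<ReadOnlyVertex> {
        private final Iterator<String> lines;

        LineVertexIterator(List<String> records) {
            this.lines = records.iterator();
        }

        @Override
        public boolean hasNext() {
            return lines.hasNext();
        }

        @Override
        public ReadOnlyVertex next() {
            // The reader builds the vertex itself, so the vertex type needs no mutators.
            String[] parts = lines.next().split("\t");
            final long id = Long.parseLong(parts[0]);
            final double value = Double.parseDouble(parts[1]);
            return new ReadOnlyVertex() {
                @Override public long id() { return id; }
                @Override public double value() { return value; }
            };
        }
    }

    /** Framework side: consume whatever the reader yields. */
    static double sumValues(Iterator<ReadOnlyVertex> vertices) {
        double total = 0.0;
        while (vertices.hasNext()) {
            total += vertices.next().value();
        }
        return total;
    }

    public static void main(String[] args) {
        Iterator<ReadOnlyVertex> it =
            new LineVertexIterator(Arrays.asList("1\t0.5", "2\t1.5"));
        System.out.println(sumValues(it)); // prints 2.0
    }
}
```

With this shape the framework never needs a mutable handle on the vertex, which is the property the comment above is asking about.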

 Ensure that subclassing BasicVertex is possible by user apps
 

 Key: GIRAPH-36
 URL: https://issues.apache.org/jira/browse/GIRAPH-36
 Project: Giraph
  Issue Type: Improvement
  Components: graph
Affects Versions: 0.70.0
Reporter: Jake Mannix
Assignee: Jake Mannix
Priority: Blocker
 Fix For: 0.70.0






[jira] [Commented] (GIRAPH-34) Failure of Vertex reflection for putVertexList from GIRAPH-27

2011-09-16 Thread Jake Mannix (JIRA)

[ 
https://issues.apache.org/jira/browse/GIRAPH-34?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13106717#comment-13106717
 ] 

Jake Mannix commented on GIRAPH-34:
---

Wait, why would the sending Vertex modify the message object it just sent?  
Why would it even have a reference to it anymore?  It's a message, right?   
Could we not simply document that messages should be treated as ephemeral and 
not retained?  It seems like doing a bunch of reflection and object copying for 
each message to be sent could get prohibitively expensive.

As I look through the VertexRangeBalance code, I notice also that VertexList 
extends ArrayListWritable<Vertex<I, V, E, M>>.  Yikes!  Not everything needs to 
be a Vertex anymore - if we let people extend BasicVertex (or MutableVertex) 
instead of always extending Vertex, they'll get killed with runtime 
ClassCastExceptions if they try to do any balancing.

 Failure of Vertex reflection for putVertexList from GIRAPH-27 
 --

 Key: GIRAPH-34
 URL: https://issues.apache.org/jira/browse/GIRAPH-34
 Project: Giraph
  Issue Type: Bug
Reporter: Christian Kunz
Assignee: Avery Ching
 Attachments: GIRAPH-34.patch


 Christian actually found this bug.  I am filing the JIRA on his behalf.  
 Here's my error when running TestVertexRangeBalancer.  
 java.lang.RuntimeException: java.io.IOException: Call to 
 returnwhose-lm/10.72.107.231:30002 failed on local exception: 
 java.io.EOFException
   at 
 org.apache.giraph.comm.BasicRPCCommunications.sendVertexListReq(BasicRPCCommunications.java:768)
   at 
 org.apache.giraph.graph.BspServiceWorker.exchangeVertexRanges(BspServiceWorker.java:1282)
   at org.apache.giraph.graph.GraphMapper.map(GraphMapper.java:589)
   at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
   at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:763)
   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:369)
   at org.apache.hadoop.mapred.Child$4.run(Child.java:259)
   at java.security.AccessController.doPrivileged(Native Method)
   at javax.security.auth.Subject.doAs(Subject.java:396)
   at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1059)
   at org.apache.hadoop.mapred.Child.main(Child.java:253)
 Caused by: java.io.IOException: Call to returnwhose-lm/10.72.107.231:30002 
 failed on local exception: java.io.EOFException
   at org.apache.hadoop.ipc.Client.wrapException(Client.java:1065)
   at org.apache.hadoop.ipc.Client.call(Client.java:1033)
   at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:224)
   at $Proxy3.putVertexList(Unknown Source)
   at 
 org.apache.giraph.comm.BasicRPCCommunications.sendVertexListReq(BasicRPCCommunications.java:766)
   ... 10 more
 Caused by: java.io.EOFException
   at java.io.DataInputStream.readInt(DataInputStream.java:375)
   at 
 org.apache.hadoop.ipc.Client$Connection.receiveResponse(Client.java:767)
   at org.apache.hadoop.ipc.Client$Connection.run(Client.java:712)
 I identified and fixed the issue by making BasicVertex implement Configurable 
 and making the graph state set in BasicRPCCommunications.  There is one more 
 error though that I'll try and solve before putting up a reviewboard.





[jira] [Commented] (GIRAPH-34) Failure of Vertex reflection for putVertexList from GIRAPH-27

2011-09-16 Thread Jake Mannix (JIRA)

[ 
https://issues.apache.org/jira/browse/GIRAPH-34?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13106748#comment-13106748
 ] 

Jake Mannix commented on GIRAPH-34:
---

I'll definitely open another JIRA for the Vertex subclasses, and dig into that 
a bit.

But on this current topic, I see how users could possibly do something like 
sendMsg(destVertex, getVertexValue()), yes.  But isn't this analogous to 
regular Hadoop-land, where you simply cannot expect to hang onto your Writable 
instances and use them later?  If you're in 
Mapper.map(SomethingWritableComparable key, SomethingWritable value, Context 
c), you should *never* just buffer up the key and value instances, as this is 
practically guaranteed to break - Hadoop will be re-using the key and value as 
container objects to read new bytes off of disk for the next invocation of 
map(), so that Java objects are rarely created; instead you're just constantly 
doing simple bit/byte operations on the disk stream, and setting values inside 
of Writable containers.  

It seems like one of the basic contracts of Writables (at least in Hadoop-land) 
is that they are always to be considered containers: call get() or 
getSomeKindOfThing() on them as soon as you have a handle on one, and use 
whatever *that* is, assuming that the framework can and will reuse your 
original Writable.
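
A minimal sketch of why buffering the container breaks, with no Hadoop dependency: `IntBox` is a made-up stand-in for a Writable such as IntWritable, and the loop plays the role of the framework refilling a single instance for every record.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative only: IntBox stands in for a reusable Writable container. */
final class ContainerReuse {
    static final class IntBox {
        private int value;
        void set(int v) { value = v; }
        int get() { return value; }
    }

    /** Wrong: keeps references to the one container the "framework" keeps refilling. */
    static List<Integer> bufferContainers(int[] records) {
        IntBox reused = new IntBox();
        List<IntBox> kept = new ArrayList<>();
        for (int r : records) {
            reused.set(r);     // framework overwrites the same object for each record
            kept.add(reused);  // user code hangs on to the container itself
        }
        List<Integer> seen = new ArrayList<>();
        for (IntBox box : kept) {
            seen.add(box.get());
        }
        return seen;
    }

    /** Right: call get() immediately and keep the extracted value instead. */
    static List<Integer> bufferValues(int[] records) {
        IntBox reused = new IntBox();
        List<Integer> seen = new ArrayList<>();
        for (int r : records) {
            reused.set(r);
            seen.add(reused.get());
        }
        return seen;
    }

    public static void main(String[] args) {
        int[] records = {1, 2, 3};
        System.out.println(bufferContainers(records)); // prints [3, 3, 3]
        System.out.println(bufferValues(records));     // prints [1, 2, 3]
    }
}
```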

 Failure of Vertex reflection for putVertexList from GIRAPH-27 
 --

 Key: GIRAPH-34
 URL: https://issues.apache.org/jira/browse/GIRAPH-34
 Project: Giraph
  Issue Type: Bug
Reporter: Christian Kunz
Assignee: Avery Ching
 Attachments: GIRAPH-34.patch






[jira] [Commented] (GIRAPH-34) Failure of Vertex reflection for putVertexList from GIRAPH-27

2011-09-16 Thread Jake Mannix (JIRA)

[ 
https://issues.apache.org/jira/browse/GIRAPH-34?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13106780#comment-13106780
 ] 

Jake Mannix commented on GIRAPH-34:
---

Yeah, how do you do that, Dmitriy?

 Failure of Vertex reflection for putVertexList from GIRAPH-27 
 --

 Key: GIRAPH-34
 URL: https://issues.apache.org/jira/browse/GIRAPH-34
 Project: Giraph
  Issue Type: Bug
Reporter: Christian Kunz
Assignee: Avery Ching
 Attachments: GIRAPH-34.patch






[jira] [Commented] (GIRAPH-34) Failure of Vertex reflection for putVertexList from GIRAPH-27

2011-09-16 Thread Jake Mannix (JIRA)

[ 
https://issues.apache.org/jira/browse/GIRAPH-34?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13106804#comment-13106804
 ] 

Jake Mannix commented on GIRAPH-34:
---

+1 from me - although I haven't run it on an actual cluster, so I'm going by my 
reading of the code.

Although we should think further about ways we can be safe:  it's possible that 
the right and efficient thing to do is analogous to your context.write() 
example: we take the Writable message, and we serialize the Writable to a 
byte[], and pass that byte[] to the local recipient if there is one.  That 
recipient should be able to inexpensively deserialize and rehydrate the 
messages on the fly when running the VertexCombiner (only using one container 
Writable at a time, doing the same thing that Hadoop does, essentially) and 
just before the call to compute().
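
The serialize-then-rehydrate idea might look roughly like the sketch below. `Msg` and `DoubleMsg` are hypothetical local stand-ins that mirror the write/readFields shape of a Writable (only `java.io` is used, not Hadoop), and `sum` plays the part of a combiner that reuses a single container.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Arrays;
import java.util.List;

/** Illustrative only: Msg mirrors the shape of a Writable but is local to this sketch. */
final class MessageBytes {
    interface Msg {
        void write(DataOutput out) throws IOException;
        void readFields(DataInput in) throws IOException;
    }

    static final class DoubleMsg implements Msg {
        double value;
        @Override public void write(DataOutput out) throws IOException { out.writeDouble(value); }
        @Override public void readFields(DataInput in) throws IOException { value = in.readDouble(); }
    }

    /** Sender side: snapshot the message into bytes so later mutation cannot leak through. */
    static byte[] serialize(Msg message) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            message.write(out);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    /** Receiver side: refill one container per byte[] and use its value right away. */
    static double sum(List<byte[]> inbox) {
        DoubleMsg container = new DoubleMsg();
        double total = 0.0;
        for (byte[] raw : inbox) {
            try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(raw))) {
                container.readFields(in);
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
            total += container.value;
        }
        return total;
    }

    public static void main(String[] args) {
        DoubleMsg m = new DoubleMsg();
        m.value = 1.5;
        byte[] first = serialize(m);
        m.value = 99.0;               // sender reuses its object after "sending"
        byte[] second = serialize(m);
        System.out.println(sum(Arrays.asList(first, second))); // prints 100.5
    }
}
```

The copy cost here is one small byte[] per message rather than a reflective object clone, and the receiver allocates only one container no matter how many messages arrive.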

 Failure of Vertex reflection for putVertexList from GIRAPH-27 
 --

 Key: GIRAPH-34
 URL: https://issues.apache.org/jira/browse/GIRAPH-34
 Project: Giraph
  Issue Type: Bug
Reporter: Christian Kunz
Assignee: Avery Ching
 Attachments: GIRAPH-34.patch






[jira] [Commented] (GIRAPH-12) Investigate communication improvements

2011-09-16 Thread Jake Mannix (JIRA)

[ 
https://issues.apache.org/jira/browse/GIRAPH-12?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13106842#comment-13106842
 ] 

Jake Mannix commented on GIRAPH-12:
---

Hey Hyunsik, if you're going to write a benchmark for the RPC stuff, that 
would be totally great.  I'd like to start playing around with trying Finagle 
in here, and we can compare notes on what kinds of techniques among both 
approaches work better, unless I'd be stepping on your toes by doing so...

 Investigate communication improvements
 --

 Key: GIRAPH-12
 URL: https://issues.apache.org/jira/browse/GIRAPH-12
 Project: Giraph
  Issue Type: Improvement
  Components: bsp
Reporter: Avery Ching
Assignee: Hyunsik Choi
Priority: Minor
 Attachments: GIRAPH-12_1.patch






[jira] [Created] (GIRAPH-36) Ensure that subclassing BasicVertex is possible by user apps

2011-09-16 Thread Jake Mannix (JIRA)
Ensure that subclassing BasicVertex is possible by user apps


 Key: GIRAPH-36
 URL: https://issues.apache.org/jira/browse/GIRAPH-36
 Project: Giraph
  Issue Type: Improvement
  Components: graph
Affects Versions: 0.70.0
Reporter: Jake Mannix
Assignee: Jake Mannix
Priority: Blocker
 Fix For: 0.70.0


Original assumptions in Giraph were that all users would subclass Vertex (which 
extended MutableVertex, which extended BasicVertex).  Classes which wish to have 
application-specific data structures (i.e. not a TreeMap<I, Edge<I,E>>) may need 
to extend either MutableVertex or BasicVertex.  Unfortunately VertexRange 
extends ArrayList<Vertex>, and there are other places where the assumption is 
that vertex classes are either Vertex, or at least MutableVertex.

Let's make sure the internal APIs allow for BasicVertex to be the base class.





[jira] [Commented] (GIRAPH-28) Introduce new primitive-specific MutableVertex subclasses

2011-09-15 Thread Jake Mannix (JIRA)

[ 
https://issues.apache.org/jira/browse/GIRAPH-28?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13105568#comment-13105568
 ] 

Jake Mannix commented on GIRAPH-28:
---

I don't know what it was, I just re-patched with current trunk, after the 
refactorings of the most recent few patches.  Memory use dropped to what it 
should be!

 Introduce new primitive-specific MutableVertex subclasses
 -

 Key: GIRAPH-28
 URL: https://issues.apache.org/jira/browse/GIRAPH-28
 Project: Giraph
  Issue Type: New Feature
  Components: graph
Affects Versions: 0.70.0
Reporter: Jake Mannix
Assignee: Jake Mannix
 Attachments: GIRAPH-28.diff, GIRAPH-28.diff


 As discussed on the list, 
 MutableVertex<LongWritable,DoubleWritable,FloatWritable,DoubleWritable> (for 
 example) could be highly optimized in its memory footprint if the vertex and 
 edge data were held in a form which minimized Java object usage.
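
One way to picture "a form which minimized Java object usage" is to hold edges in parallel primitive arrays rather than in a map of boxed objects. The class below is only a sketch under that assumption; it is not the code in the attached patch, and its name and methods are invented.

```java
import java.util.Arrays;

/** Illustrative only: long target ids and float edge values in parallel primitive arrays. */
final class PrimitiveEdgeVertex {
    private final long id;
    private double value;
    private long[] targets = new long[4];
    private float[] weights = new float[4];
    private int edgeCount;

    PrimitiveEdgeVertex(long id, double value) {
        this.id = id;
        this.value = value;
    }

    long id() { return id; }
    double value() { return value; }
    int edgeCount() { return edgeCount; }

    /** Appends an edge, doubling both arrays when they are full. */
    void addEdge(long target, float weight) {
        if (edgeCount == targets.length) {
            targets = Arrays.copyOf(targets, edgeCount * 2);
            weights = Arrays.copyOf(weights, edgeCount * 2);
        }
        targets[edgeCount] = target;
        weights[edgeCount] = weight;
        edgeCount++;
    }

    /** Sums edge values without allocating any per-edge objects. */
    double weightSum() {
        double total = 0.0;
        for (int i = 0; i < edgeCount; i++) {
            total += weights[i];
        }
        return total;
    }

    public static void main(String[] args) {
        PrimitiveEdgeVertex v = new PrimitiveEdgeVertex(7L, 1.0);
        for (int i = 0; i < 10; i++) {
            v.addEdge(i, 0.5f);
        }
        System.out.println(v.edgeCount() + " " + v.weightSum()); // prints 10 5.0
    }
}
```

Each edge costs 12 bytes of array payload (8 for the id, 4 for the value) plus whatever slack the doubling leaves, with no per-edge object headers.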





[jira] [Updated] (GIRAPH-28) Introduce new primitive-specific MutableVertex subclasses

2011-09-15 Thread Jake Mannix (JIRA)

 [ 
https://issues.apache.org/jira/browse/GIRAPH-28?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jake Mannix updated GIRAPH-28:
--

Attachment: GIRAPH-28.diff

Newly regenerated against trunk.

 Introduce new primitive-specific MutableVertex subclasses
 -

 Key: GIRAPH-28
 URL: https://issues.apache.org/jira/browse/GIRAPH-28
 Project: Giraph
  Issue Type: New Feature
  Components: graph
Affects Versions: 0.70.0
Reporter: Jake Mannix
Assignee: Jake Mannix
 Attachments: GIRAPH-28.diff, GIRAPH-28.diff, GIRAPH-28.diff






[jira] [Commented] (GIRAPH-28) Introduce new primitive-specific MutableVertex subclasses

2011-09-14 Thread Jake Mannix (JIRA)

[ 
https://issues.apache.org/jira/browse/GIRAPH-28?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13105004#comment-13105004
 ] 

Jake Mannix commented on GIRAPH-28:
---

Ok another patch coming soon for this, but good news:  this is the output of 
the object size calculator now:
(key: Primitive is what Dmitriy put in that test code, LDFD is a trivial class 
which extends the new LongDoubleFloatDoubleVertex class, and shows exactly the 
same memory as this)

Tiny:       0       840
Object:     0       872
Primitive:  0       4536
LDFD:       0       4536
Tiny:       1       840
Object:     1       976
Primitive:  1       4536
LDFD:       1       4536
Tiny:       10      840
Object:     10      1912
Primitive:  10      4536
LDFD:       10      4536
Tiny:       100     2640
Object:     100     11272
Primitive:  100     4536
LDFD:       100     4536
Tiny:       1000    16080
Object:     1000    104872
Primitive:  1000    46784
LDFD:       1000    46784
Tiny:       10000   123600
Object:     10000   1040872
Primitive:  10000   302000
LDFD:       10000   302000

 Introduce new primitive-specific MutableVertex subclasses
 -

 Key: GIRAPH-28
 URL: https://issues.apache.org/jira/browse/GIRAPH-28
 Project: Giraph
  Issue Type: New Feature
  Components: graph
Affects Versions: 0.70.0
Reporter: Jake Mannix
Assignee: Jake Mannix
 Attachments: GIRAPH-28.diff, GIRAPH-28.diff






[jira] [Commented] (GIRAPH-31) Hide the SortedMap<I, Edge<I,E>> in Vertex from client visibility (impl. detail), replace with appropriate accessor methods

2011-09-13 Thread Jake Mannix (JIRA)

[ 
https://issues.apache.org/jira/browse/GIRAPH-31?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13103745#comment-13103745
 ] 

Jake Mannix commented on GIRAPH-31:
---

And implementations which can provide a sorted iterator that isn't 
prohibitively expensive, but can also provide a much faster unsorted iterator, 
can choose whether to return true or false from the isSorted() method, and 
provide another method of the type you're suggesting. 


 Hide the SortedMap<I, Edge<I,E>> in Vertex from client visibility (impl. 
 detail), replace with appropriate accessor methods
 ---

 Key: GIRAPH-31
 URL: https://issues.apache.org/jira/browse/GIRAPH-31
 Project: Giraph
  Issue Type: Improvement
  Components: graph
Affects Versions: 0.70.0
Reporter: Jake Mannix
Assignee: Jake Mannix
 Attachments: GIRAPH-31.diff


 As discussed on the list, and on GIRAPH-28, the SortedMap<I, Edge<I,E>> is an 
 implementation detail which need not be exposed to application developers - 
 they need to iterate over the edges, and possibly access them one-by-one, and 
 remove them (in the Mutable case), but they don't need the SortedMap, and 
 creating primitive-optimized BasicVertex implementations is hampered by the 
 fact that clients expect this Map to exist.
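
A sketch of what hiding the map behind accessors could look like: callers can iterate, look up, and remove edges, but never see the backing SortedMap, so an implementation is free to swap it for something else. The class and method names are hypothetical and are not the ones in the attached patch.

```java
import java.util.Collections;
import java.util.Iterator;
import java.util.SortedMap;
import java.util.TreeMap;

/** Illustrative only: the edge map stays private and callers use accessor methods. */
final class HiddenEdgeMapVertex implements Iterable<Long> {
    private final SortedMap<Long, Float> edges = new TreeMap<>();

    void addEdge(long target, float value) { edges.put(target, value); }

    /** Returns true if an edge to the target existed and was removed. */
    boolean removeEdge(long target) { return edges.remove(target) != null; }

    boolean hasEdge(long target) { return edges.containsKey(target); }

    /** Returns the edge value, or null if there is no edge to the target. */
    Float edgeValue(long target) { return edges.get(target); }

    int edgeCount() { return edges.size(); }

    /** Iterates over target ids; sorted here only because this sketch uses a TreeMap. */
    @Override
    public Iterator<Long> iterator() {
        return Collections.unmodifiableSet(edges.keySet()).iterator();
    }

    public static void main(String[] args) {
        HiddenEdgeMapVertex v = new HiddenEdgeMapVertex();
        v.addEdge(3L, 0.5f);
        v.addEdge(1L, 2.0f);
        for (long target : v) {
            System.out.println(target + " -> " + v.edgeValue(target));
        }
    }
}
```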





[jira] [Commented] (GIRAPH-31) Hide the SortedMap<I, Edge<I,E>> in Vertex from client visibility (impl. detail), replace with appropriate accessor methods

2011-09-13 Thread Jake Mannix (JIRA)

[ 
https://issues.apache.org/jira/browse/GIRAPH-31?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13103798#comment-13103798
 ] 

Jake Mannix commented on GIRAPH-31:
---

+1 to that, given your argument on the current use of the class.  There may come 
a time when we have generic things going on in GraphMapper or BspServiceWorker 
which need to do special optimized things to sorted vertices, and at that time 
we can add an isSorted() or getSortedIterator() method.

 Hide the SortedMap<I, Edge<I,E>> in Vertex from client visibility (impl. 
 detail), replace with appropriate accessor methods
 ---

 Key: GIRAPH-31
 URL: https://issues.apache.org/jira/browse/GIRAPH-31
 Project: Giraph
  Issue Type: Improvement
  Components: graph
Affects Versions: 0.70.0
Reporter: Jake Mannix
Assignee: Jake Mannix
 Attachments: GIRAPH-31.diff






[jira] [Updated] (GIRAPH-31) Hide the SortedMap<I, Edge<I,E>> in Vertex from client visibility (impl. detail), replace with appropriate accessor methods

2011-09-13 Thread Jake Mannix (JIRA)

 [ 
https://issues.apache.org/jira/browse/GIRAPH-31?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jake Mannix updated GIRAPH-31:
--

Attachment: GIRAPH-31.diff

Updated patch - remove isSorted(), document the fact that the iterator may or 
may not be sorted (and in fact is, in Vertex), and that users may subclass 
either Vertex *or* MutableVertex.  

I have not tested subclassing BasicVertex, which I suspect would fail in 
various ways, as VertexReader, GraphMapper, and some other classes may expect 
to get a MutableVertex for some methods.

 Hide the SortedMap<I, Edge<I,E>> in Vertex from client visibility (impl. 
 detail), replace with appropriate accessor methods
 ---

 Key: GIRAPH-31
 URL: https://issues.apache.org/jira/browse/GIRAPH-31
 Project: Giraph
  Issue Type: Improvement
  Components: graph
Affects Versions: 0.70.0
Reporter: Jake Mannix
Assignee: Jake Mannix
 Attachments: GIRAPH-31.diff, GIRAPH-31.diff


 As discussed on the list, and on GIRAPH-28, the SortedMap<I, Edge<I,E>> is an 
 implementation detail which need not be exposed to application developers - 
 they need to iterate over the edges, and possibly access them one-by-one, and 
 remove them (in the Mutable case), but they don't need the SortedMap, and 
 creating primitive-optimized BasicVertex implementations is hampered by the 
 fact that clients expect this Map to exist.





[jira] [Commented] (GIRAPH-31) Hide the SortedMap<I, Edge<I,E>> in Vertex from client visibility (impl. detail), replace with appropriate accessor methods

2011-09-13 Thread Jake Mannix (JIRA)

[ 
https://issues.apache.org/jira/browse/GIRAPH-31?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13103948#comment-13103948
 ] 

Jake Mannix commented on GIRAPH-31:
---

Sounds good to me!  Lazy consensus is pretty common to The Apache Way ( 
http://www.apache.org/foundation/voting.html#LazyConsensus ).

 Hide the SortedMap<I, Edge<I,E>> in Vertex from client visibility (impl. 
 detail), replace with appropriate accessor methods
 ---

 Key: GIRAPH-31
 URL: https://issues.apache.org/jira/browse/GIRAPH-31
 Project: Giraph
  Issue Type: Improvement
  Components: graph
Affects Versions: 0.70.0
Reporter: Jake Mannix
Assignee: Jake Mannix
 Attachments: GIRAPH-31.diff, GIRAPH-31.diff


 As discussed on the list, and on GIRAPH-28, the SortedMap<I, Edge<I,E>> is an 
 implementation detail which need not be exposed to application developers - 
 they need to iterate over the edges, and possibly access them one-by-one, and 
 remove them (in the Mutable case), but they don't need the SortedMap, and 
 creating primitive-optimized BasicVertex implementations is hampered by the 
 fact that clients expect this Map to exist.





[jira] [Commented] (GIRAPH-28) Introduce new primitive-specific MutableVertex subclasses

2011-09-12 Thread Jake Mannix (JIRA)

[ 
https://issues.apache.org/jira/browse/GIRAPH-28?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=1310#comment-1310
 ] 

Jake Mannix commented on GIRAPH-28:
---

Ok, so I went ahead and made a 'straw-man' refactoring branch (on GitHub: 
https://github.com/jakemannix/giraph/tree/vertex_map_refactor ), removing the 
getDestEdgeMap() method, and having BasicVertex implement Iterable, as well as 
the random-access read method getEdgeValue(targetVertexId).

I got it passing tests, but ran into a few things we may want to consider:

First, testing for the existence of an edge to a target vertex is no longer 
possible: getEdgeValue(targetVertexId) returns the *value* associated with the 
edge.  Edges are allowed to have null values and still denote a connection 
between the source and target vertex, right?  Maybe we should just have an 
Edge<I, E> getEdge(I targetVertexId) method instead?

Secondly, and far less importantly, we need to have getNumOutEdges(), because 
clients often want to know the out-degree of the vertex, and they used to call 
getDestEdgeMap().size().

Thirdly: for the same reason that getEdgeValue() can return superfluous nulls, 
removeEdge(), used as a boolean, can trick the caller into thinking there was 
no connection to the target (because removeEdge() returned null), but really 
it's because I was trying to be clever and return the *value* which could be 
null.  Having removeEdge() return the actual Edge fixes this.

I'll open another ticket for this stuff, as patching this patch seems a bit 
silly.
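
To illustrate the ambiguity described above, here is a minimal sketch. It assumes a hypothetical EdgeStore class backed by a TreeMap (it is not Giraph's Vertex, and the names removeEdgeValue and removeEdge are chosen only for this illustration). Returning the value cannot distinguish a missing edge from an edge whose value is null, while returning an entry can:

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-in for a vertex's out-edges; not the Giraph API.
class EdgeStore {
    private final TreeMap<Long, Double> edges = new TreeMap<Long, Double>();

    void addEdge(long targetId, Double value) {
        edges.put(targetId, value); // TreeMap permits null values
    }

    // Value-returning variant: null means "no edge" OR "edge with a null value".
    Double removeEdgeValue(long targetId) {
        return edges.remove(targetId);
    }

    // Edge-returning variant: null only when there was no edge at all.
    Map.Entry<Long, Double> removeEdge(long targetId) {
        if (!edges.containsKey(targetId)) {
            return null;
        }
        Double value = edges.remove(targetId);
        return new SimpleImmutableEntry<Long, Double>(targetId, value);
    }
}
```

With an edge to vertex 7 carrying a null value, removeEdgeValue(7L) and removeEdgeValue(8L) both return null, whereas removeEdge(7L) returns a non-null entry and removeEdge(8L) returns null.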

 Introduce new primitive-specific MutableVertex subclasses
 -

 Key: GIRAPH-28
 URL: https://issues.apache.org/jira/browse/GIRAPH-28
 Project: Giraph
  Issue Type: New Feature
  Components: graph
Affects Versions: 0.70.0
Reporter: Jake Mannix
Assignee: Jake Mannix
 Attachments: GIRAPH-28.diff, GIRAPH-28.diff


 As discussed on the list, 
 MutableVertex<LongWritable,DoubleWritable,FloatWritable,DoubleWritable> (for 
 example) could be highly optimized in its memory footprint if the vertex and 
 edge data were held in a form which minimized Java object usage.





[jira] [Created] (GIRAPH-31) Hide the SortedMap<I, Edge<I,E>> in Vertex from client visibility (impl. detail), replace with appropriate accessor methods

2011-09-12 Thread Jake Mannix (JIRA)
Hide the SortedMap<I, Edge<I,E>> in Vertex from client visibility (impl. 
detail), replace with appropriate accessor methods
---

 Key: GIRAPH-31
 URL: https://issues.apache.org/jira/browse/GIRAPH-31
 Project: Giraph
  Issue Type: Improvement
  Components: graph
Affects Versions: 0.70.0
Reporter: Jake Mannix
Assignee: Jake Mannix


As discussed on the list, and on GIRAPH-28, the SortedMap<I, Edge<I,E>> is an 
implementation detail which need not be exposed to application developers - 
they need to iterate over the edges, and possibly access them one-by-one, and 
remove them (in the Mutable case), but they don't need the SortedMap, and 
creating primitive-optimized BasicVertex implementations is hampered by the 
fact that clients expect this Map to exist.





[jira] [Commented] (GIRAPH-28) Introduce new primitive-specific MutableVertex subclasses

2011-09-12 Thread Jake Mannix (JIRA)

[ 
https://issues.apache.org/jira/browse/GIRAPH-28?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13103339#comment-13103339
 ] 

Jake Mannix commented on GIRAPH-28:
---

I'm suggesting that iterator() be always sorted.  A SortedMap's collection views 
(keySet(), values(), entrySet()) are Iterable, and the iterators they return are 
always in sorted key order.  We'd have BasicVertex do the same thing.

 Introduce new primitive-specific MutableVertex subclasses
 -

 Key: GIRAPH-28
 URL: https://issues.apache.org/jira/browse/GIRAPH-28
 Project: Giraph
  Issue Type: New Feature
  Components: graph
Affects Versions: 0.70.0
Reporter: Jake Mannix
Assignee: Jake Mannix
 Attachments: GIRAPH-28.diff, GIRAPH-28.diff


 As discussed on the list, 
 MutableVertex<LongWritable,DoubleWritable,FloatWritable,DoubleWritable> (for 
 example) could be highly optimized in its memory footprint if the vertex and 
 edge data were held in a form which minimized Java object usage.





[jira] [Commented] (GIRAPH-28) Introduce new primitive-specific MutableVertex subclasses

2011-09-12 Thread Jake Mannix (JIRA)

[ 
https://issues.apache.org/jira/browse/GIRAPH-28?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13103341#comment-13103341
 ] 

Jake Mannix commented on GIRAPH-28:
---

Also, to contradict my 1st and 3rd points, Dmitriy pointed out (in an 
out-of-band chat) that if we don't want to expose Edge to the user, because a) 
we don't want to store it in memory (as his test showed that even switching 
TreeMap<I, Edge<I,E>> to TreeMap<I, E> reduced memory usage by a fair amount), 
and b) we don't want to have to instantiate tons of useless objects by lazily 
creating them, we could instead just keep getEdgeValue() and removeEdge() 
as they were, but also add a boolean hasEdge(I targetVertexId) to test for 
connection.  

Then you get everything you need without exposing the Edge class (which only 
gets used internally for its Writable nature):

if(vertex.hasEdge(targetVertexId)) { 
  E value = vertex.getEdgeValue(targetVertexId);
  vertex.removeEdge(targetVertexId);
}

etc...
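
A rough sketch of how hasEdge() could sit on top of a TreeMap<I, E>, with no Edge objects stored. The class name ValueOnlyVertex and the addEdge() helper are assumptions made for this example and are not taken from the patch:

```java
import java.util.TreeMap;

// Hypothetical sketch; method names mirror the discussion, not Giraph's code.
class ValueOnlyVertex<I extends Comparable<I>, E> {
    // Edge values are stored directly, keyed by target vertex id.
    private final TreeMap<I, E> edgeValues = new TreeMap<I, E>();

    void addEdge(I targetVertexId, E value) {
        edgeValues.put(targetVertexId, value);
    }

    // True even when the stored edge value is null.
    boolean hasEdge(I targetVertexId) {
        return edgeValues.containsKey(targetVertexId);
    }

    E getEdgeValue(I targetVertexId) {
        return edgeValues.get(targetVertexId);
    }

    E removeEdge(I targetVertexId) {
        return edgeValues.remove(targetVertexId);
    }

    int getNumOutEdges() {
        return edgeValues.size();
    }
}
```

Because hasEdge() relies on containsKey(), a null edge value no longer hides the connection from the caller.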

 Introduce new primitive-specific MutableVertex subclasses
 -

 Key: GIRAPH-28
 URL: https://issues.apache.org/jira/browse/GIRAPH-28
 Project: Giraph
  Issue Type: New Feature
  Components: graph
Affects Versions: 0.70.0
Reporter: Jake Mannix
Assignee: Jake Mannix
 Attachments: GIRAPH-28.diff, GIRAPH-28.diff


 As discussed on the list, 
 MutableVertex<LongWritable,DoubleWritable,FloatWritable,DoubleWritable> (for 
 example) could be highly optimized in its memory footprint if the vertex and 
 edge data were held in a form which minimized Java object usage.





[jira] [Commented] (GIRAPH-28) Introduce new primitive-specific MutableVertex subclasses

2011-09-12 Thread Jake Mannix (JIRA)

[ 
https://issues.apache.org/jira/browse/GIRAPH-28?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13103357#comment-13103357
 ] 

Jake Mannix commented on GIRAPH-28:
---

The alternative to Iterable<Edge<I, E>> is Iterable<I>, returning only the 
target vertices, and you can call getEdgeValue(targetVertexId) on any of these 
if you need it.  Again, many algorithms will simply do something like

for(I targetId : vertex) {
  sendMsg(targetId, someFunction(baseMsg, getEdgeValue(targetId)));
}

which is maybe a little nicer looking (or at least not uglier) than:

for(Edge<I, E> edge : vertex) {
  sendMsg(edge.getVertexId(), someFunction(baseMsg, edge.getValue()));
}

And then there are no Edge objects hanging around.

Alternatively, Edge could act just like a typical Writable, and the 
Iterator<Edge<I, E>> iterates over the *same* Edge object setting different 
values on it as next() is called.
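
The object-reuse idea can be sketched as follows. This is only an illustration under assumptions: ReusableEdge and ReusingEdgeIterable are made-up names, ids are longs and values are floats, and the backing store is a TreeMap rather than whatever a real vertex would use.

```java
import java.util.Iterator;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical mutable edge holder that is reused across next() calls.
class ReusableEdge {
    long targetId;
    float value;
}

// Hypothetical vertex-like container; not Giraph's BasicVertex.
class ReusingEdgeIterable implements Iterable<ReusableEdge> {
    private final TreeMap<Long, Float> edges = new TreeMap<Long, Float>();

    void addEdge(long targetId, float value) {
        edges.put(targetId, value);
    }

    @Override
    public Iterator<ReusableEdge> iterator() {
        final Iterator<Map.Entry<Long, Float>> it = edges.entrySet().iterator();
        final ReusableEdge shared = new ReusableEdge(); // one object per iteration pass
        return new Iterator<ReusableEdge>() {
            @Override
            public boolean hasNext() {
                return it.hasNext();
            }

            @Override
            public ReusableEdge next() {
                Map.Entry<Long, Float> entry = it.next();
                // Overwrite the shared holder instead of allocating a new edge.
                shared.targetId = entry.getKey();
                shared.value = entry.getValue();
                return shared;
            }

            @Override
            public void remove() {
                throw new UnsupportedOperationException();
            }
        };
    }
}
```

The trade-off is that callers must copy the fields out if they want to keep an edge past the next call to next(), since every call returns the same instance.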


 Introduce new primitive-specific MutableVertex subclasses
 -

 Key: GIRAPH-28
 URL: https://issues.apache.org/jira/browse/GIRAPH-28
 Project: Giraph
  Issue Type: New Feature
  Components: graph
Affects Versions: 0.70.0
Reporter: Jake Mannix
Assignee: Jake Mannix
 Attachments: GIRAPH-28.diff, GIRAPH-28.diff


 As discussed on the list, 
 MutableVertex<LongWritable,DoubleWritable,FloatWritable,DoubleWritable> (for 
 example) could be highly optimized in its memory footprint if the vertex and 
 edge data were held in a form which minimized Java object usage.





[jira] [Commented] (GIRAPH-28) Introduce new primitive-specific MutableVertex subclasses

2011-09-12 Thread Jake Mannix (JIRA)

[ 
https://issues.apache.org/jira/browse/GIRAPH-28?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13103363#comment-13103363
 ] 

Jake Mannix commented on GIRAPH-28:
---

As for sorting, I'd imagine that assuming it always returns a sorted iterator 
is fine, but yes, some implementations I can imagine might not want to do that. 
 I'd lean against having multiple iterators until it is known that they are 
needed, and maybe just document the implementations which return unsorted 
iterators so that things don't get messed up? 

Vertex subclasses are where the algorithms are implemented, right?  So a 
Vertex knows whether it has a sorted iterator or not... the only question would 
be: are there generic methods implemented in things like BspServiceWorker, or 
GraphMapper, which would be expected to need to do things to a sorted iterator? 
 Currently there are no such places that I can see.   Without such cases, we 
could easily leave Vertex implementations to decide whether they needed to 
return sorted iterators or not.

 Introduce new primitive-specific MutableVertex subclasses
 -

 Key: GIRAPH-28
 URL: https://issues.apache.org/jira/browse/GIRAPH-28
 Project: Giraph
  Issue Type: New Feature
  Components: graph
Affects Versions: 0.70.0
Reporter: Jake Mannix
Assignee: Jake Mannix
 Attachments: GIRAPH-28.diff, GIRAPH-28.diff


 As discussed on the list, 
 MutableVertex<LongWritable,DoubleWritable,FloatWritable,DoubleWritable> (for 
 example) could be highly optimized in its memory footprint if the vertex and 
 edge data were held in a form which minimized Java object usage.





[jira] [Commented] (GIRAPH-28) Introduce new primitive-specific MutableVertex subclasses

2011-09-11 Thread Jake Mannix (JIRA)

[ 
https://issues.apache.org/jira/browse/GIRAPH-28?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13102430#comment-13102430
 ] 

Jake Mannix commented on GIRAPH-28:
---

So Avery, the question I have for you is regarding the getOutEdgeMap() method - 
if we get rid of that, and instead maybe offer something like the other methods 
discussed on the list thread:  

  E getEdge(I targetVertexId); 
  ImmutableList<I> getSortedOutVertices();
  boolean removeEdge(I targetVertexId);

we could do away with being tied to this TreeMap (although for now, keep it 
around in Vertex.java, as there's not much else possible in the generic object 
case, most likely), in addition to allowing me to remove my insane pretend 
SortedMap wrapper class.

 Introduce new primitive-specific MutableVertex subclasses
 -

 Key: GIRAPH-28
 URL: https://issues.apache.org/jira/browse/GIRAPH-28
 Project: Giraph
  Issue Type: New Feature
  Components: graph
Affects Versions: 0.70.0
Reporter: Jake Mannix
Assignee: Jake Mannix
 Attachments: GIRAPH-28.diff


 As discussed on the list, 
 MutableVertex<LongWritable,DoubleWritable,FloatWritable,DoubleWritable> (for 
 example) could be highly optimized in its memory footprint if the vertex and 
 edge data were held in a form which minimized Java object usage.





[jira] [Commented] (GIRAPH-28) Introduce new primitive-specific MutableVertex subclasses

2011-09-11 Thread Jake Mannix (JIRA)

[ 
https://issues.apache.org/jira/browse/GIRAPH-28?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13102442#comment-13102442
 ] 

Jake Mannix commented on GIRAPH-28:
---

I like the Iterator more than ImmutableList, yeah, that's great.  I wonder if 
then just making BasicVertex implement Iterable<Edge<I,E>> would be called for: 
for(Edge<I,E> edge : vertex) { ... } ?  Not sure if that syntactic sugar is 
worth it.

 Introduce new primitive-specific MutableVertex subclasses
 -

 Key: GIRAPH-28
 URL: https://issues.apache.org/jira/browse/GIRAPH-28
 Project: Giraph
  Issue Type: New Feature
  Components: graph
Affects Versions: 0.70.0
Reporter: Jake Mannix
Assignee: Jake Mannix
 Attachments: GIRAPH-28.diff


 As discussed on the list, 
 MutableVertex<LongWritable,DoubleWritable,FloatWritable,DoubleWritable> (for 
 example) could be highly optimized in its memory footprint if the vertex and 
 edge data were held in a form which minimized Java object usage.





[jira] [Created] (GIRAPH-28) Introduce new primitive-specific MutableVertex subclasses

2011-09-09 Thread Jake Mannix (JIRA)
Introduce new primitive-specific MutableVertex subclasses
-

 Key: GIRAPH-28
 URL: https://issues.apache.org/jira/browse/GIRAPH-28
 Project: Giraph
  Issue Type: New Feature
  Components: graph
Affects Versions: 0.70.0
Reporter: Jake Mannix


As discussed on the list, 
MutableVertex<LongWritable,DoubleWritable,FloatWritable,DoubleWritable> (for 
example) could be highly optimized in its memory footprint if the vertex and 
edge data were held in a form which minimized Java object usage.





[jira] [Commented] (GIRAPH-28) Introduce new primitive-specific MutableVertex subclasses

2011-09-09 Thread Jake Mannix (JIRA)

[ 
https://issues.apache.org/jira/browse/GIRAPH-28?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13101021#comment-13101021
 ] 

Jake Mannix commented on GIRAPH-28:
---

This is a toy version of LongDoubleFloatDoubleVertex, a proof of concept that 
you can get SimplePageRankVertex extends LongDoubleFloatDoubleVertex to pass 
its current unit tests without subclassing Vertex (and only using primitives 
internally!)
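
For readers without the attachment, the general idea of primitive-backed edge storage can be sketched like this. It is not the code in GIRAPH-28.diff; the class name PrimitiveEdges, the parallel-array layout, and the method names are all assumptions made for illustration, using long target ids and float edge weights:

```java
import java.util.Arrays;
import java.util.NoSuchElementException;

// Hypothetical sketch of primitive-backed out-edges; not the attached patch.
class PrimitiveEdges {
    private long[] targetIds = new long[4]; // kept sorted in [0, size)
    private float[] weights = new float[4]; // weights[i] belongs to targetIds[i]
    private int size = 0;

    // Inserts a new edge or overwrites an existing one, keeping ids sorted.
    void addEdge(long targetId, float weight) {
        int pos = Arrays.binarySearch(targetIds, 0, size, targetId);
        if (pos >= 0) {
            weights[pos] = weight;
            return;
        }
        int insertAt = -(pos + 1);
        if (size == targetIds.length) {
            targetIds = Arrays.copyOf(targetIds, size * 2);
            weights = Arrays.copyOf(weights, size * 2);
        }
        // Shift the tail right by one slot to make room.
        System.arraycopy(targetIds, insertAt, targetIds, insertAt + 1, size - insertAt);
        System.arraycopy(weights, insertAt, weights, insertAt + 1, size - insertAt);
        targetIds[insertAt] = targetId;
        weights[insertAt] = weight;
        size++;
    }

    boolean hasEdge(long targetId) {
        return Arrays.binarySearch(targetIds, 0, size, targetId) >= 0;
    }

    float getEdgeValue(long targetId) {
        int pos = Arrays.binarySearch(targetIds, 0, size, targetId);
        if (pos < 0) {
            throw new NoSuchElementException("no edge to " + targetId);
        }
        return weights[pos];
    }

    int getNumOutEdges() {
        return size;
    }
}
```

No per-edge objects are allocated here: each edge costs one long and one float, at the price of O(n) insertion into the sorted arrays.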

 Introduce new primitive-specific MutableVertex subclasses
 -

 Key: GIRAPH-28
 URL: https://issues.apache.org/jira/browse/GIRAPH-28
 Project: Giraph
  Issue Type: New Feature
  Components: graph
Affects Versions: 0.70.0
Reporter: Jake Mannix
 Attachments: GIRAPH-28.diff


 As discussed on the list, 
 MutableVertex<LongWritable,DoubleWritable,FloatWritable,DoubleWritable> (for 
 example) could be highly optimized in its memory footprint if the vertex and 
 edge data were held in a form which minimized Java object usage.





[jira] [Commented] (GIRAPH-27) Mutable static global state in Vertex.java should be refactored

2011-09-08 Thread Jake Mannix (JIRA)

[ 
https://issues.apache.org/jira/browse/GIRAPH-27?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13100901#comment-13100901
 ] 

Jake Mannix commented on GIRAPH-27:
---

Awesome, thanks Avery.  Looks good to me.

In looking over the diff in more detail in reviewboard, I notice that there are 
still a bunch of places where Vertex is referred to, but really BasicVertex (or 
at most MutableVertex) is all that's needed.  But I'll open another ticket for 
those changes once this has been merged in.

 Mutable static global state in Vertex.java should be refactored
 ---

 Key: GIRAPH-27
 URL: https://issues.apache.org/jira/browse/GIRAPH-27
 Project: Giraph
  Issue Type: Improvement
  Components: graph
Affects Versions: 0.70.0
Reporter: Jake Mannix
Assignee: Jake Mannix
 Attachments: GIRAPH-27.patch, GIRAPH-27.patch


 Vertex.java has a bunch of static methods for getting/setting global graph 
 state (total number of vertices, edges, a reference to the GraphMapper, etc). 
  This should be refactored into a GraphState object, to which every Vertex can 
 hold a reference (yes, a tiny bit more memory per Vertex, but in comparison to 
 what's already in there...)
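
A minimal sketch of the shape such a refactoring might take. The field names, the update() method, and the StatefulVertex class are illustrative assumptions and are not taken from GIRAPH-27.patch:

```java
// Hypothetical holder for per-job graph state, replacing static fields.
final class GraphState {
    private long numVertices;
    private long numEdges;

    long getNumVertices() {
        return numVertices;
    }

    long getNumEdges() {
        return numEdges;
    }

    // Called by the framework (assumed) when global totals change.
    void update(long numVertices, long numEdges) {
        this.numVertices = numVertices;
        this.numEdges = numEdges;
    }
}

// Hypothetical vertex that reads global state through an instance reference.
class StatefulVertex {
    // One extra reference per vertex instead of mutable statics on the class.
    private final GraphState graphState;

    StatefulVertex(GraphState graphState) {
        this.graphState = graphState;
    }

    long getNumVertices() {
        return graphState.getNumVertices();
    }

    long getNumEdges() {
        return graphState.getNumEdges();
    }
}
```

Vertices that share one GraphState see the same totals, while two GraphState instances (for example, in two tests running in the same JVM) stay independent, which static fields cannot offer.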
