[jira] [Commented] (GIRAPH-12) Investigate communication improvements

2011-09-14 Thread Hyunsik Choi (JIRA)

[ 
https://issues.apache.org/jira/browse/GIRAPH-12?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13104340#comment-13104340
 ] 

Hyunsik Choi commented on GIRAPH-12:


Do you mean that we need benchmark programs to test the performance and 
scalability of the message passing methods? If so, I'll add two benchmark 
programs that send messages to peers following a random and a skewed 
distribution, respectively. I'll create a separate issue for this.

Let me know what you think :)
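
As a rough illustration of the two message distributions mentioned above, the following Java sketch picks destination workers either uniformly at random or with a skew. The class name, the Zipf-like weighting, and the exponent parameter are assumptions made for illustration; they are not taken from Giraph or from any attached patch.

```java
import java.util.Random;

/** Hypothetical helper that chooses destination workers for benchmark messages. */
public class PeerChooser {
    private final int numPeers;
    private final double[] cumulative; // cumulative probabilities of the skewed distribution
    private final Random random;

    public PeerChooser(int numPeers, double exponent, long seed) {
        if (numPeers <= 0) {
            throw new IllegalArgumentException("numPeers must be positive");
        }
        this.numPeers = numPeers;
        this.random = new Random(seed);
        this.cumulative = new double[numPeers];
        // Zipf-like weights (an assumption): peer i gets weight 1 / (i + 1)^exponent.
        double total = 0.0;
        for (int i = 0; i < numPeers; i++) {
            total += 1.0 / Math.pow(i + 1, exponent);
            cumulative[i] = total;
        }
        for (int i = 0; i < numPeers; i++) {
            cumulative[i] /= total;
        }
    }

    /** Every peer is equally likely. */
    public int nextUniform() {
        return random.nextInt(numPeers);
    }

    /** Low-numbered peers are chosen more often than high-numbered ones. */
    public int nextSkewed() {
        double u = random.nextDouble();
        int lo = 0;
        int hi = numPeers - 1;
        // Binary search for the first index whose cumulative probability is >= u.
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (cumulative[mid] < u) {
                lo = mid + 1;
            } else {
                hi = mid;
            }
        }
        return lo;
    }
}
```

A benchmark could call `nextUniform()` or `nextSkewed()` once per message to decide where to send it, and compare throughput between the two runs.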

> Investigate communication improvements
> --
>
> Key: GIRAPH-12
> URL: https://issues.apache.org/jira/browse/GIRAPH-12
> Project: Giraph
>  Issue Type: Improvement
>  Components: bsp
>Reporter: Avery Ching
>Assignee: Hyunsik Choi
>Priority: Minor
> Attachments: GIRAPH-12_1.patch
>
>
> Currently every worker starts a thread to communicate with every other 
> worker, using Hadoop RPC.  For instance, if there are 400 workers, each 
> worker creates 400 threads.  This ends up using a lot of memory, even with 
> the option -Dmapred.child.java.opts="-Xss64k".  
> It would be good to investigate using a framework like Netty, or rolling 
> our own, to improve this situation.  Moving away from Hadoop RPC would also 
> make compatibility across different Hadoop versions easier.
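
For a sense of scale: 400 workers with one thread per peer means about 400 x 400 = 160,000 threads across the job, and even with 64 KB stacks each worker can reserve up to 400 x 64 KB = 25,600 KB (about 25 MB) for thread stacks alone. One possible alternative is to multiplex sends to all peers over a small, fixed pool of threads. The sketch below shows that idea with `java.util.concurrent` only; `BoundedSender` and `sendTo` are hypothetical stand-ins and are not Giraph, Hadoop RPC, or Netty APIs.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical sender that uses a bounded pool instead of one thread per peer. */
public class BoundedSender {
    private final AtomicInteger sent = new AtomicInteger();

    /** Placeholder for a real network send; here it only counts the call. */
    private void sendTo(int peerId) {
        sent.incrementAndGet();
    }

    /** Sends one message to each peer using at most poolSize threads. */
    public int sendToAll(int numPeers, int poolSize) {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        try {
            for (int peer = 0; peer < numPeers; peer++) {
                final int peerId = peer;
                pool.submit(() -> sendTo(peerId));
            }
        } finally {
            pool.shutdown(); // accept no new tasks, let queued sends finish
        }
        try {
            if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
                throw new IllegalStateException("sends did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for sends", e);
        }
        return sent.get();
    }
}
```

With this shape the number of threads per worker depends on `poolSize`, not on the number of workers in the job.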

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Created] (GIRAPH-32) Implement benchmarks to evaluate the performance of message passing

2011-09-14 Thread Hyunsik Choi (JIRA)
Implement benchmarks to evaluate the performance of message passing 


 Key: GIRAPH-32
 URL: https://issues.apache.org/jira/browse/GIRAPH-32
 Project: Giraph
  Issue Type: Task
  Components: benchmark
Reporter: Hyunsik Choi
Assignee: Hyunsik Choi
 Fix For: 0.70.0


The message passing framework plays an important role in Giraph.
We need some benchmark programs to evaluate improvements to the message 
passing method.





[jira] [Commented] (GIRAPH-21) Revise CODE_CONVENTIONS

2011-09-14 Thread Sebastian Schelter (JIRA)

[ 
https://issues.apache.org/jira/browse/GIRAPH-21?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13104363#comment-13104363
 ] 

Sebastian Schelter commented on GIRAPH-21:
--

I'm currently reading a lot of Giraph code as I'm evaluating it for use in 
research, and I must admit that the 80-character line limit really makes the 
code hard to read. 

Although I'm not involved with your project, I'd suggest a 2-space indent and 
120 characters per line. Mahout uses the same.

> Revise CODE_CONVENTIONS
> ---
>
> Key: GIRAPH-21
> URL: https://issues.apache.org/jira/browse/GIRAPH-21
> Project: Giraph
>  Issue Type: Improvement
>Reporter: Avery Ching
>Assignee: Avery Ching
>Priority: Minor
> Attachments: GIRAPH-21.diff
>
>
> Currently there is a CODE_CONVENTIONS file in the base path of Giraph.  It's 
> fairly sparse and we have been assuming an 80 char limit per line.  It's good 
> to have common conventions so that the code doesn't get too messy.  Does 
> anyone have any opinions on this now?  Probably best to tackle early and then 
> have something to follow.





[jira] [Created] (GIRAPH-33) Missing license header of GraphState.java

2011-09-14 Thread Hyunsik Choi (JIRA)
Missing license header of GraphState.java
-

 Key: GIRAPH-33
 URL: https://issues.apache.org/jira/browse/GIRAPH-33
 Project: Giraph
  Issue Type: Task
  Components: graph
Reporter: Hyunsik Choi
Priority: Trivial
 Fix For: 0.70.0


GraphState.java doesn't contain the Apache license header.





[jira] [Updated] (GIRAPH-33) Missing license header of GraphState.java

2011-09-14 Thread Hyunsik Choi (JIRA)

 [ 
https://issues.apache.org/jira/browse/GIRAPH-33?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hyunsik Choi updated GIRAPH-33:
---

Attachment: GIRAPH-33.patch

This patch adds the Apache license header.

> Missing license header of GraphState.java
> -
>
> Key: GIRAPH-33
> URL: https://issues.apache.org/jira/browse/GIRAPH-33
> Project: Giraph
>  Issue Type: Task
>  Components: graph
>Reporter: Hyunsik Choi
>Priority: Trivial
> Fix For: 0.70.0
>
> Attachments: GIRAPH-33.patch
>
>
> GraphState.java doesn't contain the Apache license header.





[jira] [Resolved] (GIRAPH-33) Missing license header of GraphState.java

2011-09-14 Thread Hyunsik Choi (JIRA)

 [ 
https://issues.apache.org/jira/browse/GIRAPH-33?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hyunsik Choi resolved GIRAPH-33.


Resolution: Fixed

This is a trivial fix.
I just committed it.





[jira] [Commented] (GIRAPH-33) Missing license header of GraphState.java

2011-09-14 Thread Avery Ching (JIRA)

[ 
https://issues.apache.org/jira/browse/GIRAPH-33?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13104499#comment-13104499
 ] 

Avery Ching commented on GIRAPH-33:
---

Thanks Hyunsik, sorry about that.





Re: Port to YARN: GIRAPH and HAMA

2011-09-14 Thread Thomas Jungblut
>
>  We are also thinking about other underlying computing models (i.e.
> streaming (asynchronous) graph processing - see


That is a really cool idea. But I don't think we are going to focus solely
on graph computing. We want to enable an interface which can be used for it
(straightforward, as described in the Pregel paper), but I think you are
really the graph experts, so we don't want to compete with each other :D
Our asynchronous processing (in my opinion) will just enable the sending of
messages within the computation phase. So the BarrierSync is just a little
transition to make sure every task is ready and every message has been sent.
Your Vertex locking is a graph-only feature, so this won't affect us
anyway.
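
The role of the barrier described above can be sketched in plain Java: tasks send messages while they compute, then wait at a barrier, and only read their inboxes once every task has arrived. This is a single-process toy using `java.util.concurrent.CyclicBarrier`; the class `SuperstepDemo` and its message values are made up for illustration and do not reflect Hama's or Giraph's implementation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BrokenBarrierException;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.CyclicBarrier;

/** Toy model of one BSP superstep with in-memory inboxes (hypothetical). */
public class SuperstepDemo {

    /** Runs one superstep and returns the sum of messages each task received. */
    public static int[] runSuperstep(int numTasks) {
        List<ConcurrentLinkedQueue<Integer>> inboxes = new ArrayList<>();
        for (int i = 0; i < numTasks; i++) {
            inboxes.add(new ConcurrentLinkedQueue<>());
        }
        CyclicBarrier barrier = new CyclicBarrier(numTasks);
        int[] received = new int[numTasks];
        Thread[] threads = new Thread[numTasks];

        for (int t = 0; t < numTasks; t++) {
            final int id = t;
            threads[t] = new Thread(() -> {
                // Computation phase: messages are sent while computing.
                for (int peer = 0; peer < numTasks; peer++) {
                    if (peer != id) {
                        inboxes.get(peer).add(id + 1);
                    }
                }
                try {
                    // Barrier: returns only after every task has finished sending.
                    barrier.await();
                } catch (InterruptedException | BrokenBarrierException e) {
                    throw new IllegalStateException(e);
                }
                // After the barrier, the inbox is complete and safe to read.
                int sum = 0;
                for (int message : inboxes.get(id)) {
                    sum += message;
                }
                received[id] = sum;
            });
            threads[t].start();
        }
        for (Thread thread : threads) {
            try {
                thread.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return received;
    }
}
```

Without the `barrier.await()` call, a fast task could read its inbox before slower peers had finished sending, which is the situation the barrier exists to prevent.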

> Giraph runs completely as a MapReduce job on Hadoop today.

All right.

I think our result is the following:
We (Apache Hama) are focussing on the YARN implementation of the BSP
paradigm.
If you want to run Giraph on a real BSP engine later, feel free to put your
stuff on top of that.
As far as I have seen, YARN is 100% backward compatible, so
your current solution will run on YARN as well.

Best Regards,

Thomas


[jira] [Commented] (GIRAPH-30) NPE in ZooKeeperManager if base directory cannot be created

2011-09-14 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/GIRAPH-30?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13104526#comment-13104526
 ] 

Hudson commented on GIRAPH-30:
--

Integrated in Giraph-trunk-Commit #5 (See 
[https://builds.apache.org/job/Giraph-trunk-Commit/5/])
GIRAPH-30: NPE in ZooKeeperManager if base directory cannot be
created. apurtell via aching.

aching : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1170427
Files : 
* /incubator/giraph/trunk/CHANGELOG
* /incubator/giraph/trunk/src/main/java/org/apache/giraph/zk/ZooKeeperManager.java


> NPE in ZooKeeperManager if base directory cannot be created
> ---
>
> Key: GIRAPH-30
> URL: https://issues.apache.org/jira/browse/GIRAPH-30
> Project: Giraph
>  Issue Type: Bug
>Reporter: Andrew Purtell
>Assignee: Andrew Purtell
>Priority: Minor
> Attachments: GIRAPH-30.2.patch, GIRAPH-30.patch
>
>
> If the base directory cannot be created, for example if running on secure 
> Hadoop and the user home directory does not exist, ZooKeeperManager will 
> throw an NPE when trying to list it. It would be better to throw an 
> IOException with an informative message.
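
A minimal sketch of the behaviour requested above, assuming a local `java.io.File` directory rather than whatever file system API ZooKeeperManager actually uses (the committed patch is not shown here). `File.list()` returns `null` when the path is not a directory or cannot be read, so the result is checked before use. `BaseDirLister` and `listOrFail` are hypothetical names.

```java
import java.io.File;
import java.io.IOException;

/** Hypothetical helper: fail with a clear message instead of an NPE. */
public class BaseDirLister {

    /** Lists the entries of baseDir, or throws an informative IOException. */
    public static String[] listOrFail(File baseDir) throws IOException {
        // File.list() returns null if baseDir is not a directory or an I/O error occurs.
        String[] entries = baseDir.list();
        if (entries == null) {
            throw new IOException("Cannot list base directory "
                + baseDir.getAbsolutePath()
                + " (it may not exist or may not be readable)");
        }
        return entries;
    }
}
```

Callers then see the failing path in the exception message rather than a bare NullPointerException from a later loop over the listing.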





[jira] [Commented] (GIRAPH-33) Missing license header of GraphState.java

2011-09-14 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/GIRAPH-33?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13104528#comment-13104528
 ] 

Hudson commented on GIRAPH-33:
--

Integrated in Giraph-trunk-Commit #5 (See 
[https://builds.apache.org/job/Giraph-trunk-Commit/5/])
GIRAPH-33: Missing license header of GraphState.java

hyunsik : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1170531
Files : 
* /incubator/giraph/trunk/CHANGELOG
* /incubator/giraph/trunk/src/main/java/org/apache/giraph/graph/GraphState.java






[jira] [Commented] (GIRAPH-31) Hide the SortedMap> in Vertex from client visibility (impl. detail), replace with appropriate accessor methods

2011-09-14 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/GIRAPH-31?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13104527#comment-13104527
 ] 

Hudson commented on GIRAPH-31:
--

Integrated in Giraph-trunk-Commit #5 (See 
[https://builds.apache.org/job/Giraph-trunk-Commit/5/])
GIRAPH-31: Hide the SortedMap> in Vertex from client
visibility (impl. detail), replace with appropriate accessor
methods. jake.mannix via aching.

aching : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1170431
Files : 
* /incubator/giraph/trunk/CHANGELOG
* /incubator/giraph/trunk/src/main/java/org/apache/giraph/benchmark/PageRankBenchmark.java
* /incubator/giraph/trunk/src/main/java/org/apache/giraph/benchmark/PseudoRandomVertexInputFormat.java
* /incubator/giraph/trunk/src/main/java/org/apache/giraph/examples/SimpleCheckpointVertex.java
* /incubator/giraph/trunk/src/main/java/org/apache/giraph/examples/SimpleFailVertex.java
* /incubator/giraph/trunk/src/main/java/org/apache/giraph/examples/SimplePageRankVertex.java
* /incubator/giraph/trunk/src/main/java/org/apache/giraph/examples/SimpleShortestPathsVertex.java
* /incubator/giraph/trunk/src/main/java/org/apache/giraph/examples/SimpleSuperstepVertex.java
* /incubator/giraph/trunk/src/main/java/org/apache/giraph/graph/BasicVertex.java
* /incubator/giraph/trunk/src/main/java/org/apache/giraph/graph/BspServiceWorker.java
* /incubator/giraph/trunk/src/main/java/org/apache/giraph/graph/GraphMapper.java
* /incubator/giraph/trunk/src/main/java/org/apache/giraph/graph/MutableVertex.java
* /incubator/giraph/trunk/src/main/java/org/apache/giraph/graph/Vertex.java
* /incubator/giraph/trunk/src/main/java/org/apache/giraph/graph/VertexRange.java
* /incubator/giraph/trunk/src/main/java/org/apache/giraph/graph/VertexResolver.java
* /incubator/giraph/trunk/src/main/java/org/apache/giraph/lib/JsonBase64VertexInputFormat.java
* /incubator/giraph/trunk/src/main/java/org/apache/giraph/lib/JsonBase64VertexOutputFormat.java


> Hide the SortedMap> in Vertex from client visibility (impl. 
> detail), replace with appropriate accessor methods
> ---
>
> Key: GIRAPH-31
> URL: https://issues.apache.org/jira/browse/GIRAPH-31
> Project: Giraph
>  Issue Type: Improvement
>  Components: graph
>Affects Versions: 0.70.0
>Reporter: Jake Mannix
>Assignee: Jake Mannix
> Attachments: GIRAPH-31.diff, GIRAPH-31.diff
>
>
> As discussed on the list, and on GIRAPH-28, the SortedMap> is an 
> implementation detail which need not be exposed to application developers - 
> they need to iterate over the edges, and possibly access them one-by-one, and 
> remove them (in the Mutable case), but they don't need the SortedMap, and 
> creating primitive-optimized BasicVertex implementations is hampered by the 
> fact that clients expect this Map to exist.
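
The idea in the description above can be sketched as a vertex that keeps its edge map private and offers only iteration, lookup, and removal. The class below is a simplified, hypothetical stand-in using `Long` targets and `Float` edge values; its method names are illustrative and are not claimed to match the accessors added by the actual patch.

```java
import java.util.Collections;
import java.util.Iterator;
import java.util.SortedMap;
import java.util.TreeMap;

/** Hypothetical vertex that hides its edge map behind accessor methods. */
public class SimpleVertex implements Iterable<Long> {
    // Implementation detail: callers never see this map, so it could later be
    // replaced by primitive arrays without breaking client code.
    private final SortedMap<Long, Float> edges = new TreeMap<>();

    public void addEdge(long target, float value) {
        edges.put(target, value);
    }

    public boolean hasEdge(long target) {
        return edges.containsKey(target);
    }

    /** Returns the edge value, or null if there is no edge to target. */
    public Float getEdgeValue(long target) {
        return edges.get(target);
    }

    /** Removes the edge and returns its value, or null if it did not exist. */
    public Float removeEdge(long target) {
        return edges.remove(target);
    }

    public int getNumOutEdges() {
        return edges.size();
    }

    /** Iterates over target vertex ids in sorted order; the view is read-only. */
    @Override
    public Iterator<Long> iterator() {
        return Collections.unmodifiableSet(edges.keySet()).iterator();
    }
}
```

Client code can write `for (long target : vertex) { ... }` without ever depending on `SortedMap`.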





Re: Port to YARN: GIRAPH and HAMA

2011-09-14 Thread Vinod Kumar Vavilapalli
Avery,

Some replies inline to the issues you outlined.

> 1)  Giraph runs completely as a MapReduce job on Hadoop today.  This needs
> to be maintained to support our current users, who will not likely move to
> MRv2 for at least a year.

I think what you need is to support Giraph's graph API for your users, but
not the underlying implementation. (Or are you leaking MapReduce APIs to
your users?) Sure, you are restricted to the underlying implementation (Hadoop
MRV1 or MRV2, whichever is in use) at any point in time, but what we are
discussing is _that_ future when the underlying implementation itself also
moves to MRV2.

> 2)  The internals of Giraph are implemented differently than Hama.

Sure, but only at present. My original question is: given a BSP
implementation on a YARN cluster, can GiraphV2 (BSP based) simply be
implemented over that or not? If today GiraphV1 uses (its own) BSP
implementation over MapReduce APIs on a Hadoop MRV1 cluster, I can clearly see
how GiraphV2 could use (HAMA's) BSP implementation over YARN APIs.

> 3)  If we have various graph processing computing models (BSP based,
> streams or asynchronous, or a combination), then being on Hama brings little
> value for Giraph.

That future isn't there yet. In any case, I'd bet when you get there, a lot of
what you have now also wouldn't be an out-of-the-box fit.

From my perspective (a third-person POV), this is what I can conclude.
Giraph's velocity on Hadoop MapReduce may be the real impediment to thinking
about a possible sharing of the BSP-based implementation with HAMAV2. Sure,
Giraph has other ideas regarding the computation model itself, but that is a
future that isn't here yet.

I just hope the same velocity isn't an impediment to thinking about the
next-gen version on top of YARN :) The way I see it, porting Giraph to YARN
is also a revolution in itself; most, if not all, of the implementation will
change, yet with API-level compatibility. I am still eagerly looking
forward to the port of Giraph to YARN. Maybe more digging into Giraph
internals will help my cause too.

If nothing else, this discussion at least helped share some of the ideas
between the two communities.

Thanks all for putting down your thoughts.
+Vinod


On Wed, Sep 14, 2011 at 11:46 AM, Thomas Jungblut <
thomas.jungb...@googlemail.com> wrote:

>> We are also thinking about other underlying computing models (i.e.
>> streaming (asynchronous) graph processing - see
>
>
> That is a really cool idea. But I don't think we are going to focus solely
> on graph computing. We want to enable an interface which can be used for it
> (straightforward, as described in the Pregel paper), but I think you are
> really the graph experts, so we don't want to compete with each other :D
> Our asynchronous processing (in my opinion) will just enable the sending of
> messages within the computation phase. So the BarrierSync is just a little
> transition to make sure every task is ready and every message has been sent.
> Your Vertex locking is a graph-only feature, so this won't affect us
> anyway.
>
>
>> Giraph runs completely as a MapReduce job on Hadoop today.
>
>
> All right.
>
> I think our result is the following:
> We (Apache Hama) are focussing on the YARN implementation of the BSP
> paradigm.
> If you want to run Giraph on a real BSP engine later, feel free to put your
> stuff on top of that.
> As far as I have seen, YARN is 100% backward compatible, so
> your current solution will run on YARN as well.
>
> Best Regards,
>
> Thomas
>


Re: Port to YARN: GIRAPH and HAMA

2011-09-14 Thread Owen O'Malley
On Tue, Sep 13, 2011 at 10:47 AM, Avery Ching  wrote:
> 1)  Giraph runs completely as a MapReduce job on Hadoop today.  This needs
> to be maintained to support our current users, who will not likely move to
> MRv2 for at least a year.

Giraph already has ifdefs to deal with the 0.20 and 0.20.2xx API
changes, so it shouldn't be hard to deal with MRv2 the same way.

-- Owen


Re: Port to YARN: GIRAPH and HAMA

2011-09-14 Thread Avery Ching

Vinod, thanks for your comments.  I've replied inline.

Avery

On 9/14/11 11:09 AM, Vinod Kumar Vavilapalli wrote:

> Avery,
>
> Some replies inline to the issues you outlined.
>
>> 1)  Giraph runs completely as a MapReduce job on Hadoop today.  This needs
>> to be maintained to support our current users, who will not likely move to
>> MRv2 for at least a year.
>
> I think what you need is to support Giraph's graph API for your users, but
> not the underlying implementation. (Or are you leaking MapReduce APIs to
> your users?) Sure, you are restricted to the underlying implementation (Hadoop
> MRV1 or MRV2, whichever is in use) at any point in time, but what we are
> discussing is _that_ future when the underlying implementation itself also
> moves to MRV2.

I think the takeaway should be that our clients (at Yahoo! and 
elsewhere) are currently using Giraph on MRv1.  While the Giraph API does 
not expose the underlying infrastructure APIs (i.e. MRv1 and MRv2), we 
still need to support the MRv1 implementation even while we 
begin/complete the port to MRv2.  I imagine that we will need to support 
both MRv1 and MRv2 for a fairly long period of time, as the transition to 
MRv2 for a company (e.g. Yahoo!) could take a very long time (anywhere 
from 8 months to multiple years).  Some of our internal 
clusters at Yahoo! today are still running 0.20.1, for example.

>> 2)  The internals of Giraph are implemented differently than Hama.
>
> Sure, but only at present. My original question is: given a BSP
> implementation on a YARN cluster, can GiraphV2 (BSP based) simply be
> implemented over that or not? If today GiraphV1 uses (its own) BSP
> implementation over MapReduce APIs on a Hadoop MRV1 cluster, I can clearly see
> how GiraphV2 could use (HAMA's) BSP implementation over YARN APIs.

In theory this is true.  However, as mentioned previously, we still have 
users on MRv1 and will need to support it for a long time (at least 
a year, probably more).  Also, I'm fairly certain that during the next 
year we will have non-BSP based graph processing computing models in 
place as well.  For these reasons, it may not make sense to try to put 
Giraph on top of HAMA even when we are both on MRv2.  It's hard to say 
now as it is early.  Let's revisit this at a later time.

>> 3)  If we have various graph processing computing models (BSP based,
>> streams or asynchronous, or a combination), then being on Hama brings little
>> value for Giraph.
>
> That future isn't there yet. In any case, I'd bet when you get there, a lot of
> what you have now also wouldn't be an out-of-the-box fit.
>
> From my perspective (a third-person POV), this is what I can conclude.
> Giraph's velocity on Hadoop MapReduce may be the real impediment to thinking
> about a possible sharing of the BSP-based implementation with HAMAV2. Sure,
> Giraph has other ideas regarding the computation model itself, but that is a
> future that isn't here yet.
>
> I just hope the same velocity isn't an impediment to thinking about the
> next-gen version on top of YARN :) The way I see it, porting Giraph to YARN
> is also a revolution in itself; most, if not all, of the implementation will
> change, yet with API-level compatibility. I am still eagerly looking
> forward to the port of Giraph to YARN. Maybe more digging into Giraph
> internals will help my cause too.

Giraph does appear to be moving quickly at the moment, but we 
have a clear intention to run on top of MRv2.  Please see 
https://issues.apache.org/jira/browse/GIRAPH-13.  Obviously, the MRv2 
changes are much better suited for Giraph and we look forward to the day 
when nearly all Hadoop instances are running MRv2.

> If nothing else, this discussion at least helped share some of the ideas
> between the two communities.
>
> Thanks all for putting down your thoughts.
> +Vinod
>
>
> On Wed, Sep 14, 2011 at 11:46 AM, Thomas Jungblut<
> thomas.jungb...@googlemail.com>  wrote:
>
>>> We are also thinking about other underlying computing models (i.e.
>>> streaming (asynchronous) graph processing - see
>>
>> That is a really cool idea. But I don't think we are going to focus solely
>> on graph computing. We want to enable an interface which can be used for it
>> (straightforward, as described in the Pregel paper), but I think you are
>> really the graph experts, so we don't want to compete with each other :D
>> Our asynchronous processing (in my opinion) will just enable the sending of
>> messages within the computation phase. So the BarrierSync is just a little
>> transition to make sure every task is ready and every message has been sent.
>> Your Vertex locking is a graph-only feature, so this won't affect us
>> anyway.
>>
>>> Giraph runs completely as a MapReduce job on Hadoop today.
>>
>> All right.
>>
>> I think our result is the following:
>> We (Apache Hama) are focussing on the YARN implementation of the BSP
>> paradigm.
>> If you want to run Giraph on a real BSP engine later, feel free to put your
>> stuff on top of that.
>> As far as I have seen, there is a 100% backward compatibility of YARN, so
>> your current solution will run on YA

[jira] [Commented] (GIRAPH-28) Introduce new primitive-specific MutableVertex subclasses

2011-09-14 Thread Jake Mannix (JIRA)

[ 
https://issues.apache.org/jira/browse/GIRAPH-28?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13105004#comment-13105004
 ] 

Jake Mannix commented on GIRAPH-28:
---

Ok another patch coming soon for this, but good news:  this is the output of 
the object size calculator now:
(key: Primitive is what Dmitriy put in that test code, LDFD is a trivial class 
which extends the new LongDoubleFloatDoubleVertex class, and shows exactly the 
same memory as this)

Tiny:       0       840
Object:     0       872
Primitive:  0       4536
LDFD:       0       4536
Tiny:       1       840
Object:     1       976
Primitive:  1       4536
LDFD:       1       4536
Tiny:       10      840
Object:     10      1912
Primitive:  10      4536
LDFD:       10      4536
Tiny:       100     2640
Object:     100     11272
Primitive:  100     4536
LDFD:       100     4536
Tiny:       1000    16080
Object:     1000    104872
Primitive:  1000    46784
LDFD:       1000    46784
Tiny:       10000   123600
Object:     10000   1040872
Primitive:  10000   302000
LDFD:       10000   302000

> Introduce new primitive-specific MutableVertex subclasses
> -
>
> Key: GIRAPH-28
> URL: https://issues.apache.org/jira/browse/GIRAPH-28
> Project: Giraph
>  Issue Type: New Feature
>  Components: graph
>Affects Versions: 0.70.0
>Reporter: Jake Mannix
>Assignee: Jake Mannix
> Attachments: GIRAPH-28.diff, GIRAPH-28.diff
>
>
> As discussed on the list, 
> MutableVertex (for 
> example) could be highly optimized in its memory footprint if the vertex and 
> edge data were held in a form which minimized Java object usage.
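
To illustrate what "minimizing Java object usage" can look like, the sketch below stores edges in two parallel primitive arrays instead of a map of boxed objects, so each additional edge costs one `long` and one `float` slot rather than several heap objects. `PrimitiveEdgeStore` is a hypothetical class written for this example; it is not the LongDoubleFloatDoubleVertex implementation from the patch.

```java
import java.util.Arrays;

/** Hypothetical edge store backed by parallel primitive arrays. */
public class PrimitiveEdgeStore {
    private long[] targets = new long[4];  // destination vertex ids
    private float[] values = new float[4]; // edge values, same index as targets
    private int size;

    /** Appends an edge, doubling the arrays when they are full. */
    public void addEdge(long target, float value) {
        if (size == targets.length) {
            targets = Arrays.copyOf(targets, size * 2);
            values = Arrays.copyOf(values, size * 2);
        }
        targets[size] = target;
        values[size] = value;
        size++;
    }

    public int size() {
        return size;
    }

    public long targetAt(int index) {
        checkIndex(index);
        return targets[index];
    }

    public float valueAt(int index) {
        checkIndex(index);
        return values[index];
    }

    private void checkIndex(int index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("index " + index + ", size " + size);
        }
    }
}
```

The trade-off in this sketch is that lookups by target id are linear scans unless the arrays are kept sorted, which a map-based vertex gets for free.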





[jira] [Commented] (GIRAPH-28) Introduce new primitive-specific MutableVertex subclasses

2011-09-14 Thread Dmitriy V. Ryaboy (JIRA)

[ 
https://issues.apache.org/jira/browse/GIRAPH-28?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13105039#comment-13105039
 ] 

Dmitriy V. Ryaboy commented on GIRAPH-28:
-

Nice job, Jimmy. Looks like you managed to get rid of something that was 
sucking in memory in both Object and LDFD.. what was it?





[jira] [Commented] (GIRAPH-28) Introduce new primitive-specific MutableVertex subclasses

2011-09-14 Thread Dmitriy V. Ryaboy (JIRA)

[ 
https://issues.apache.org/jira/browse/GIRAPH-28?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13105040#comment-13105040
 ] 

Dmitriy V. Ryaboy commented on GIRAPH-28:
-

E Jake. Sorry. Clearly, I can't tell you remote search people apart :-P





[jira] [Created] (GIRAPH-34) Failure of Vertex reflection for putVertexList from GIRAPH-27

2011-09-14 Thread Avery Ching (JIRA)
Failure of Vertex reflection for putVertexList from GIRAPH-27 
--

 Key: GIRAPH-34
 URL: https://issues.apache.org/jira/browse/GIRAPH-34
 Project: Giraph
  Issue Type: Bug
Reporter: Christian Kunz
Assignee: Avery Ching


Christian actually found this bug.  I am filing the JIRA on his behalf.  Here's 
my error when running TestVertexRangeBalancer.  

java.lang.RuntimeException: java.io.IOException: Call to returnwhose-lm/10.72.107.231:30002 failed on local exception: java.io.EOFException
    at org.apache.giraph.comm.BasicRPCCommunications.sendVertexListReq(BasicRPCCommunications.java:768)
    at org.apache.giraph.graph.BspServiceWorker.exchangeVertexRanges(BspServiceWorker.java:1282)
    at org.apache.giraph.graph.GraphMapper.map(GraphMapper.java:589)
    at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:763)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:369)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:259)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1059)
    at org.apache.hadoop.mapred.Child.main(Child.java:253)
Caused by: java.io.IOException: Call to returnwhose-lm/10.72.107.231:30002 failed on local exception: java.io.EOFException
    at org.apache.hadoop.ipc.Client.wrapException(Client.java:1065)
    at org.apache.hadoop.ipc.Client.call(Client.java:1033)
    at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:224)
    at $Proxy3.putVertexList(Unknown Source)
    at org.apache.giraph.comm.BasicRPCCommunications.sendVertexListReq(BasicRPCCommunications.java:766)
    ... 10 more
Caused by: java.io.EOFException
    at java.io.DataInputStream.readInt(DataInputStream.java:375)
    at org.apache.hadoop.ipc.Client$Connection.receiveResponse(Client.java:767)
    at org.apache.hadoop.ipc.Client$Connection.run(Client.java:712)

I identified and fixed the issue by making BasicVertex implement Configurable 
and setting the graph state in BasicRPCCommunications.  There is one more 
error, though, that I'll try to solve before putting this up on Review Board.
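
As background for the fix described above: an object created by reflection starts out without any configuration, so whoever instantiates it has to inject that state afterwards. Hadoop's `ReflectionUtils.newInstance` does this for classes that implement `Configurable`. The sketch below imitates that pattern with stand-in types so that it runs without Hadoop; `SimpleConfigurable`, `DemoVertex`, and the `Map`-based configuration are hypothetical and are not Giraph's or Hadoop's classes.

```java
import java.util.Map;

/** Hypothetical factory showing configuration injection after reflection. */
public class ReflectiveFactory {

    /** Stand-in for a Configurable-style interface (assumption, not Hadoop's). */
    public interface SimpleConfigurable {
        void setConf(Map<String, String> conf);
        Map<String, String> getConf();
    }

    /** Hypothetical vertex type that needs configuration after construction. */
    public static class DemoVertex implements SimpleConfigurable {
        private Map<String, String> conf;

        public DemoVertex() {
            // Reflection requires a no-argument constructor; conf is still null here.
        }

        @Override
        public void setConf(Map<String, String> conf) {
            this.conf = conf;
        }

        @Override
        public Map<String, String> getConf() {
            return conf;
        }
    }

    /** Creates an instance and, if it is configurable, hands it the configuration. */
    public static <T> T newInstance(Class<T> clazz, Map<String, String> conf) {
        try {
            T instance = clazz.getDeclaredConstructor().newInstance();
            if (instance instanceof SimpleConfigurable) {
                ((SimpleConfigurable) instance).setConf(conf);
            }
            return instance;
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot instantiate " + clazz.getName(), e);
        }
    }
}
```

If the receiving side of an RPC creates vertices this way, each new vertex gets its configuration immediately instead of failing later when it first needs it.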

