[jira] [Assigned] (GIRAPH-30) NPE in ZooKeeperManager if base directory cannot be created
[ https://issues.apache.org/jira/browse/GIRAPH-30?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Avery Ching reassigned GIRAPH-30:

    Assignee: Andrew Purtell

NPE in ZooKeeperManager if base directory cannot be created

    Key: GIRAPH-30
    URL: https://issues.apache.org/jira/browse/GIRAPH-30
    Project: Giraph
    Issue Type: Bug
    Reporter: Andrew Purtell
    Assignee: Andrew Purtell
    Priority: Minor
    Attachments: GIRAPH-30.2.patch, GIRAPH-30.patch

If the base directory cannot be created, for example if running on secure Hadoop and the user home directory does not exist, ZooKeeperManager will throw an NPE when trying to list it. It would be better to throw an IOException with an informative message.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
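A minimal sketch of the behavior the report asks for, assuming a local java.io.File base directory for the sake of a self-contained example (the real ZooKeeperManager code and the attached patches are not shown in this message, and BaseDirectoryCheck and createAndList are hypothetical names). File.list() returns null when the path is not a listable directory, which is one plausible way a later dereference could end in an NPE.

```java
import java.io.File;
import java.io.IOException;

/** Hypothetical helper for illustration; not Giraph's ZooKeeperManager. */
final class BaseDirectoryCheck {
    private BaseDirectoryCheck() { }

    /**
     * Creates baseDir if it does not exist and returns its entries.
     * Both failure modes surface as an IOException that names the path,
     * instead of a null being dereferenced later.
     */
    static String[] createAndList(File baseDir) throws IOException {
        if (!baseDir.isDirectory() && !baseDir.mkdirs()) {
            throw new IOException(
                "Failed to create base directory " + baseDir.getAbsolutePath());
        }
        // File.list() returns null if the path cannot be listed.
        String[] entries = baseDir.list();
        if (entries == null) {
            throw new IOException(
                "Failed to list base directory " + baseDir.getAbsolutePath());
        }
        return entries;
    }
}
```

The caller then sees a checked exception with the offending path in the message, which is easier to diagnose than a stack trace ending in a null dereference.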
[jira] [Commented] (GIRAPH-28) Introduce new primitive-specific MutableVertex subclasses
[ https://issues.apache.org/jira/browse/GIRAPH-28?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1310#comment-1310 ]

Avery Ching commented on GIRAPH-28:

Jake, this is pretty cool that you got it to work. Any thoughts on what to do next? Perhaps add this to the examples directory? Maybe do some sort of memory test to see the gains of a primitive subclass?

Introduce new primitive-specific MutableVertex subclasses

    Key: GIRAPH-28
    URL: https://issues.apache.org/jira/browse/GIRAPH-28
    Project: Giraph
    Issue Type: New Feature
    Components: graph
    Affects Versions: 0.70.0
    Reporter: Jake Mannix
    Attachments: GIRAPH-28.diff

As discussed on the list, MutableVertex<LongWritable,DoubleWritable,FloatWritable,DoubleWritable> (for example) could be highly optimized in its memory footprint if the vertex and edge data were held in a form which minimized Java object usage.
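To illustrate what "minimizing Java object usage" might look like for long target ids and float edge values, here is a rough sketch that keeps edges in parallel primitive arrays rather than one object per id and per value. PrimitiveEdgeList and its methods are hypothetical names chosen for this example; they are not taken from GIRAPH-28.diff, whose contents are not shown here.

```java
import java.util.Arrays;

/** Hypothetical sketch; not the class from GIRAPH-28.diff. */
final class PrimitiveEdgeList {
    // Parallel arrays: targetIds[i] and edgeValues[i] describe edge i.
    private long[] targetIds = new long[4];
    private float[] edgeValues = new float[4];
    private int size;

    /** Appends an edge, doubling both arrays when they are full. */
    void addEdge(long targetId, float value) {
        if (size == targetIds.length) {
            targetIds = Arrays.copyOf(targetIds, size * 2);
            edgeValues = Arrays.copyOf(edgeValues, size * 2);
        }
        targetIds[size] = targetId;
        edgeValues[size] = value;
        size++;
    }

    int size() {
        return size;
    }

    long targetId(int i) {
        checkIndex(i);
        return targetIds[i];
    }

    float edgeValue(int i) {
        checkIndex(i);
        return edgeValues[i];
    }

    private void checkIndex(int i) {
        if (i < 0 || i >= size) {
            throw new IndexOutOfBoundsException("edge index " + i);
        }
    }
}
```

Each edge costs 12 bytes of array payload in this layout, with no per-edge object header; how much that saves in practice is exactly what the memory test suggested in the comment would have to measure.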
Re: Review Request: GIRAPH-27 Mutable static global state in Vertex.java should be refactored
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/1771/
-----------------------------------------------------------

(Updated 2011-09-09 06:26:15.766329)

Review request for giraph.

Changes
-------

Moved as much of the getGraphState() related method implementations from Vertex to BasicVertex and MutableVertex. Other changes for primitive implementations can be done in another JIRA.

Summary
-------

Based on Jake's submission https://issues.apache.org/jira/secure/attachment/12493654/GIRAPH-27.patch

Couple of small changes:
- Do not expose GraphState to application developers
- Fixing a few formatting issues

This addresses bug GIRAPH-27.
    https://issues.apache.org/jira/browse/GIRAPH-27

Diffs (updated)
-----

  http://svn.apache.org/repos/asf/incubator/giraph/trunk/src/main/java/org/apache/giraph/bsp/CentralizedService.java 1166925
  http://svn.apache.org/repos/asf/incubator/giraph/trunk/src/main/java/org/apache/giraph/comm/BasicRPCCommunications.java 1166925
  http://svn.apache.org/repos/asf/incubator/giraph/trunk/src/main/java/org/apache/giraph/graph/BasicVertex.java 1166925
  http://svn.apache.org/repos/asf/incubator/giraph/trunk/src/main/java/org/apache/giraph/graph/BspService.java 1166925
  http://svn.apache.org/repos/asf/incubator/giraph/trunk/src/main/java/org/apache/giraph/graph/BspServiceWorker.java 1166925
  http://svn.apache.org/repos/asf/incubator/giraph/trunk/src/main/java/org/apache/giraph/graph/BspUtils.java 1166925
  http://svn.apache.org/repos/asf/incubator/giraph/trunk/src/main/java/org/apache/giraph/graph/GraphMapper.java 1166925
  http://svn.apache.org/repos/asf/incubator/giraph/trunk/src/main/java/org/apache/giraph/graph/GraphState.java PRE-CREATION
  http://svn.apache.org/repos/asf/incubator/giraph/trunk/src/main/java/org/apache/giraph/graph/MutableVertex.java 1166925
  http://svn.apache.org/repos/asf/incubator/giraph/trunk/src/main/java/org/apache/giraph/graph/Vertex.java 1166925
  http://svn.apache.org/repos/asf/incubator/giraph/trunk/src/main/java/org/apache/giraph/graph/VertexRange.java 1166925
  http://svn.apache.org/repos/asf/incubator/giraph/trunk/src/main/java/org/apache/giraph/graph/VertexResolver.java 1166925
  http://svn.apache.org/repos/asf/incubator/giraph/trunk/src/test/java/org/apache/giraph/TestBspBasic.java 1166925

Diff: https://reviews.apache.org/r/1771/diff

Testing
-------

Unit tests and page rank benchmark on Yahoo! grid with 10 workers.

Thanks,
Avery
[jira] [Resolved] (GIRAPH-25) NPE in BspServiceMaster when failing a job
[ https://issues.apache.org/jira/browse/GIRAPH-25?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Avery Ching resolved GIRAPH-25.

    Resolution: Fixed

Not sure if I am supposed to close this issue, or the reporter should, but I'll close it since it's been committed. Please reopen if there is an issue.

NPE in BspServiceMaster when failing a job

    Key: GIRAPH-25
    URL: https://issues.apache.org/jira/browse/GIRAPH-25
    Project: Giraph
    Issue Type: Bug
    Reporter: Dmitriy V. Ryaboy
    Assignee: Dmitriy V. Ryaboy
    Priority: Minor
    Attachments: GIRAPH-25.2.patch, GIRAPH-25.patch

When BspServiceMaster times out waiting for all workers to check in, it dies with a NullPointerException. This can perhaps be handled a bit more gracefully.
[jira] [Commented] (GIRAPH-27) Mutable static global state in Vertex.java should be refactored
[ https://issues.apache.org/jira/browse/GIRAPH-27?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13100856#comment-13100856 ]

Avery Ching commented on GIRAPH-27:

Thanks. I just updated it.

Mutable static global state in Vertex.java should be refactored

    Key: GIRAPH-27
    URL: https://issues.apache.org/jira/browse/GIRAPH-27
    Project: Giraph
    Issue Type: Improvement
    Components: graph
    Affects Versions: 0.70.0
    Reporter: Jake Mannix
    Assignee: Jake Mannix
    Attachments: GIRAPH-27.patch, GIRAPH-27.patch

Vertex.java has a bunch of static methods for getting/setting global graph state (total number of vertices, edges, a reference to the GraphMapper, etc). Refactoring this into a GraphState object, which every Vertex can hold onto a reference to (yes, a tiny bit more memory per Vertex, but in comparison to what's already in there...)
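The shape of the refactoring described in the issue can be sketched roughly as follows. GraphStateSketch and VertexSketch are simplified, hypothetical stand-ins, not the classes from GIRAPH-27.patch; the point is only that the counters move from static fields into an object that each vertex references, at the cost of one reference per vertex.

```java
/** Hypothetical stand-in for a per-job graph state holder. */
final class GraphStateSketch {
    private long numVertices;
    private long numEdges;

    long getNumVertices() {
        return numVertices;
    }

    long getNumEdges() {
        return numEdges;
    }

    GraphStateSketch setNumVertices(long numVertices) {
        this.numVertices = numVertices;
        return this;
    }

    GraphStateSketch setNumEdges(long numEdges) {
        this.numEdges = numEdges;
        return this;
    }
}

/** Hypothetical vertex that reads global counts through its own reference. */
class VertexSketch {
    private GraphStateSketch graphState;

    // Package-private: framework code sets the state, user code does not.
    void setGraphState(GraphStateSketch graphState) {
        this.graphState = graphState;
    }

    public long getNumVertices() {
        return graphState.getNumVertices();
    }

    public long getNumEdges() {
        return graphState.getNumEdges();
    }
}
```

With state held per instance, two graphs in the same JVM (for example in unit tests) no longer overwrite each other's totals, which static fields cannot guarantee.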
[jira] [Commented] (GIRAPH-27) Mutable static global state in Vertex.java should be refactored
[ https://issues.apache.org/jira/browse/GIRAPH-27?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13100928#comment-13100928 ]

Avery Ching commented on GIRAPH-27:

That is actually intentional, since I need to have access to the get/setGraphState() internally and I removed the get/setGraphState() from BasicVertex. So rather than expose get/setGraphState() to the user (BasicVertex), I opted to do this. I suppose we could have another interface internally that extended BasicVertex to allow getting and setting the graph state if you're concerned about exposing the vertex to the internals. Let me know what you think.

Mutable static global state in Vertex.java should be refactored

    Key: GIRAPH-27
    URL: https://issues.apache.org/jira/browse/GIRAPH-27
    Project: Giraph
    Issue Type: Improvement
    Components: graph
    Affects Versions: 0.70.0
    Reporter: Jake Mannix
    Assignee: Jake Mannix
    Attachments: GIRAPH-27.patch, GIRAPH-27.patch
[jira] [Updated] (GIRAPH-21) Revise CODE_CONVENTIONS
[ https://issues.apache.org/jira/browse/GIRAPH-21?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Avery Ching updated GIRAPH-21:

    Attachment: GIRAPH-21.diff

First proposal of the developer suggested code conventions.

Revise CODE_CONVENTIONS

    Key: GIRAPH-21
    URL: https://issues.apache.org/jira/browse/GIRAPH-21
    Project: Giraph
    Issue Type: Improvement
    Reporter: Avery Ching
    Assignee: Avery Ching
    Priority: Minor
    Attachments: GIRAPH-21.diff

Currently there is a CODE_CONVENTIONS file in the base path of Giraph. It's fairly sparse and we have been assuming an 80 char limit per line. It's good to have common conventions so that the code doesn't get too messy. Does anyone have any opinions on this now? Probably best to tackle early and then have something to follow.
[jira] [Commented] (GIRAPH-22) Sort out examples from unit test helpers in examples package
[ https://issues.apache.org/jira/browse/GIRAPH-22?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13094901#comment-13094901 ]

Avery Ching commented on GIRAPH-22:

Good idea. I think a few of them are full programs though. Not sure about the best way to do this.

Sort out examples from unit test helpers in examples package

    Key: GIRAPH-22
    URL: https://issues.apache.org/jira/browse/GIRAPH-22
    Project: Giraph
    Issue Type: Improvement
    Reporter: Jakob Homan

Within src/examples there are quite a few files defined that are mainly used in unit or other tests:
* GeneratedVertexInputFormat
* GeneratedVertexInputFormat
* LongSumAggregator
* MaxAggregator
* MinAggregator
* SimpleCombinerVertex
* SimpleFailVertex
* SimpleMsgVertex
* SimpleMutateGraphVertex
* SimpleSumCombiner
* SumAggregator
* SuperstepBalancer

Several of these explicitly say they're designed to aid in unit testing. If these are indeed meant for testing, they should be moved to the test directory. If they're examples, it would be better to sort out the overly complicated ones and the ones that include lots of tests and asserts, so as to show only the essence of the example. Hopefully the examples directory will have a few, very heavily documented programs of the helloworld/word count/shortest path variety (with sample inputs) that can be quickly launched. Once new developers grok these, they can turn to the unit tests, which can of course be great sources to learn the code from.
[jira] [Commented] (GIRAPH-13) Port Giraph to YARN
[ https://issues.apache.org/jira/browse/GIRAPH-13?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13095043#comment-13095043 ]

Avery Ching commented on GIRAPH-13:

This is going to be a fun one. =) Thanks for taking it on.

Port Giraph to YARN

    Key: GIRAPH-13
    URL: https://issues.apache.org/jira/browse/GIRAPH-13
    Project: Giraph
    Issue Type: New Feature
    Reporter: Jakob Homan
    Assignee: Jakob Homan

Now that YARN (aka MR2 aka MAPREDUCE-279) has been merged into the Hadoop trunk, we should think about what it would take to separate out the graph processing bits of Giraph from the MR1-specific code so as to take advantage of the less-MR centric aspects of YARN, while still supporting both over the medium term.
[jira] [Commented] (GIRAPH-17) Giraph doesn't give up properly after the maximum connect attempts to ZooKeeper
[ https://issues.apache.org/jira/browse/GIRAPH-17?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13093808#comment-13093808 ]

Avery Ching commented on GIRAPH-17:

Sure. The main part of this fix is

    -if (connectAttempts == 5) {
    +if (connectAttempts == maxConnectAttempts) {

Basically this condition should be hit once the maximum number of connect attempts has been tried, but it never was, because maxConnectAttempts is now 10 and the two values became out of sync at some point (maxConnectAttempts probably used to be 5). The limit is still not configurable; we can address that in a later issue.

Giraph doesn't give up properly after the maximum connect attempts to ZooKeeper

    Key: GIRAPH-17
    URL: https://issues.apache.org/jira/browse/GIRAPH-17
    Project: Giraph
    Issue Type: Bug
    Reporter: Avery Ching
    Assignee: Avery Ching
    Priority: Minor
    Attachments: ZooKeeperManager.java.diff

This produces incorrect and strange behavior.
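The effect of the one-line fix can be illustrated with a small retry loop. This is a sketch under assumptions: ConnectRetry, connect, and the BooleanSupplier standing in for a connection attempt are hypothetical, and the surrounding loop in ZooKeeperManager is not shown in the comment. Only the final comparison mirrors the diff above.

```java
import java.util.function.BooleanSupplier;

/** Hypothetical retry helper; only the final check mirrors the GIRAPH-17 diff. */
final class ConnectRetry {
    private ConnectRetry() { }

    /**
     * Calls tryConnect until it succeeds or maxConnectAttempts is reached.
     * Returns the number of calls made on success.
     */
    static int connect(BooleanSupplier tryConnect, int maxConnectAttempts) {
        int connectAttempts = 0;
        while (connectAttempts < maxConnectAttempts) {
            if (tryConnect.getAsBoolean()) {
                break;
            }
            connectAttempts++;
        }
        // Comparing against a hard-coded 5 here would never match once the
        // loop bound is 10, so a total failure would pass unnoticed.
        if (connectAttempts == maxConnectAttempts) {
            throw new IllegalStateException(
                "Failed to connect after " + maxConnectAttempts + " attempts");
        }
        return connectAttempts + 1;
    }
}
```

Deriving the give-up check from the same variable as the loop bound keeps the two from drifting apart again, and passing the limit as a parameter is one way it could later be made configurable.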
Re: Reviewboard for code reviews
Okay, let's make it optional for now. For me, it definitely helps to visualize the changes better. Also, I think the feedback tool is pretty good.

Avery

On Aug 30, 2011, at 11:52 AM, Henry Saputra wrote:

Argh, I meant: it should just be an option to help review and should not be required for patches.

- Henry

On Tue, Aug 30, 2011 at 11:51 AM, Henry Saputra henry.sapu...@gmail.com wrote:

+1. It should just be optional, to help review, not required.

- Henry

On Tue, Aug 30, 2011 at 11:48 AM, Jakob Homan jgho...@gmail.com wrote:

We've just gone around on this one for Kafka and, if reviewboard is provided, it would be good to keep it as an optional part of the process. I've had very negative experiences with it, both in Hadoop and Hive. If one would like to do a reviewboard review, that's great - but for those who don't, standard bullet points should suffice.

-jakob

On Tue, Aug 30, 2011 at 11:38 AM, Avery Ching ach...@yahoo-inc.com wrote:

Thanks Henry. I have filed issue https://issues.apache.org/jira/browse/INFRA-3892 to get reviewboard access.

Avery

On Aug 30, 2011, at 11:35 AM, Henry Saputra wrote:

Hi Avery, yes, you should file an INFRA ticket to add Giraph as a group in reviewboard. I filed tickets to create one for Kafka and Gora.

- Henry

On Mon, Aug 29, 2011 at 10:13 PM, Avery Ching ach...@yahoo-inc.com wrote:

https://blogs.apache.org/infra/entry/reviewboard_instance_running_at_the

I'll file an INFRA ticket.

Thanks,
Avery

On Aug 29, 2011, at 10:07 PM, Hyunsik Choi wrote:

Looks possible. Some incubator projects (e.g., Kafka) already have a reviewboard group.

Best regards,
--
Hyunsik Choi

On Tue, Aug 30, 2011 at 1:48 PM, Avery Ching ach...@yahoo-inc.com wrote:

Anyone know if we have reviewboard access?

Thanks,
Avery
[jira] [Commented] (GIRAPH-4) New project logo
[ https://issues.apache.org/jira/browse/GIRAPH-4?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13094134#comment-13094134 ]

Avery Ching commented on GIRAPH-4:

Yes, it would certainly be nice to have a real logo. Do you want to give it a shot?

New project logo

    Key: GIRAPH-4
    URL: https://issues.apache.org/jira/browse/GIRAPH-4
    Project: Giraph
    Issue Type: New Feature
    Reporter: Jakob Homan

Now for the hard part: the project logo. We should create one and add it to the website once done.
[jira] [Commented] (GIRAPH-14) Support for the Facebook Hadoop branch
[ https://issues.apache.org/jira/browse/GIRAPH-14?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13094305#comment-13094305 ]

Avery Ching commented on GIRAPH-14:

In theory, I believe that Facebook's distro is online (https://github.com/facebook/hadoop-20-warehouse). The long term story is to factor out the parts into modules and then compile them based on the user profile. Then we don't have to munge anything anymore. At least that's what I've thought of for now. I'm open to better solutions. Pre-processing will get unmaintainable if we have to support every version of Hadoop. That being said, we should support the big customers of Giraph and that likely includes Facebook as well. I'll add instructions to the README and submit a new patch.

Support for the Facebook Hadoop branch

    Key: GIRAPH-14
    URL: https://issues.apache.org/jira/browse/GIRAPH-14
    Project: Giraph
    Issue Type: New Feature
    Reporter: Avery Ching
    Assignee: Avery Ching
    Attachments: facebook.txt, facebook2.txt

I've been working with Joe Xie on support to get Giraph running on the Facebook Hadoop branch. He verified today that the examples worked on their cluster. I need to clean up my changes a little, but otherwise, will submit a cleaned up diff. As a side note, does anyone know how we can get Hudson support for Giraph?
[jira] [Commented] (GIRAPH-14) Support for the Facebook Hadoop branch
[ https://issues.apache.org/jira/browse/GIRAPH-14?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13093348#comment-13093348 ]

Avery Ching commented on GIRAPH-14:

Thanks Hyunsik.

Support for the Facebook Hadoop branch

    Key: GIRAPH-14
    URL: https://issues.apache.org/jira/browse/GIRAPH-14
    Project: Giraph
    Issue Type: New Feature
    Reporter: Avery Ching
    Assignee: Avery Ching
[jira] [Created] (GIRAPH-17) Giraph doesn't give up properly after the maximum connect attempts to ZooKeeper
Giraph doesn't give up properly after the maximum connect attempts to ZooKeeper

    Key: GIRAPH-17
    URL: https://issues.apache.org/jira/browse/GIRAPH-17
    Project: Giraph
    Issue Type: Bug
    Reporter: Avery Ching
    Assignee: Avery Ching
    Priority: Minor

This produces incorrect and strange behavior.
[jira] [Created] (GIRAPH-11) Improve the graph distribution of Giraph
Improve the graph distribution of Giraph

    Key: GIRAPH-11
    URL: https://issues.apache.org/jira/browse/GIRAPH-11
    Project: Giraph
    Issue Type: Improvement
    Reporter: Avery Ching
    Assignee: Avery Ching

Currently, Giraph assumes that the data from the VertexInputFormat is sorted. If the user data is not sorted by the vertex id, they must first run a MapReduce or Pig job to generate a sorted dataset. This is often a bit inconvenient. Giraph graph partitioning is currently range based and there are some advantages and disadvantages of this approach. The proposal of this JIRA would be to allow for both range and hash based partitioning and provide more flexibility to the user.

Design goals for the graph distribution:
* Allow vertices to be ordered or unordered
* Ability to repartition
* Select the partitioning scheme based on user needs (i.e. hash or range based)
* Ability to provide user-specific hints about partitions

Hash-based partitioning
* Good vertex balancing across ranges for random data
* Bad at vertex id locality

Range-based partitioning
* Good at vertex id locality
* Ability to split ranges easily
* Can cause hotspots for hot ranges
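The trade-off listed above can be made concrete with a small sketch of the two schemes for long vertex ids. This is an illustration under assumptions, not Giraph's partitioning code: PartitionSketch, hashPartition, and rangePartition are hypothetical names, and ranges are modeled as a sorted map from the first id of each range to its partition number.

```java
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical illustration of hash vs. range partitioning of vertex ids. */
final class PartitionSketch {
    private PartitionSketch() { }

    /** Hash-based: spreads ids evenly, but neighboring ids scatter. */
    static int hashPartition(long vertexId, int numPartitions) {
        // floorMod keeps the result in [0, numPartitions) for negative hashes.
        return Math.floorMod(Long.hashCode(vertexId), numPartitions);
    }

    /**
     * Range-based: rangeStarts maps the first vertex id of each range to a
     * partition, so consecutive ids stay together and a range can be split
     * by inserting one more start key.
     */
    static int rangePartition(long vertexId, TreeMap<Long, Integer> rangeStarts) {
        Map.Entry<Long, Integer> range = rangeStarts.floorEntry(vertexId);
        if (range == null) {
            throw new IllegalArgumentException(
                "No range covers vertex id " + vertexId);
        }
        return range.getValue();
    }
}
```

Under this model, splitting a hot range amounts to a single put of a new start key, while hash partitioning needs no knowledge of the id distribution and no sorted input.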
[jira] [Commented] (GIRAPH-6) Remove Yahoo-specific code from pom.xml
[ https://issues.apache.org/jira/browse/GIRAPH-6?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13092117#comment-13092117 ]

Avery Ching commented on GIRAPH-6:

Thanks for doing this.

Remove Yahoo-specific code from pom.xml

    Key: GIRAPH-6
    URL: https://issues.apache.org/jira/browse/GIRAPH-6
    Project: Giraph
    Issue Type: Bug
    Reporter: Jakob Homan
    Assignee: Jakob Homan
    Priority: Blocker
    Attachments: GIRAPH-6.patch

There are remaining references to Y! infrastructure in the pom.xml, which prevents the build from succeeding.
[jira] [Created] (GIRAPH-5) Remove Yahoo directories
Remove Yahoo directories

    Key: GIRAPH-5
    URL: https://issues.apache.org/jira/browse/GIRAPH-5
    Project: Giraph
    Issue Type: Task
    Reporter: Avery Ching
    Assignee: Avery Ching
    Priority: Minor

As an artifact of pulling from the Yahoo! svn repository, we need to re-remove the Yahoo! specific build stuff. This was done already in GitHub, but of course, they are different places. I would like to remove the following directories:

    src/ci/
    src/main/pkg

Also, as Jakob has seen, our pom.xml needs cleanup.
[jira] [Resolved] (GIRAPH-5) Remove Yahoo directories
[ https://issues.apache.org/jira/browse/GIRAPH-5?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Avery Ching resolved GIRAPH-5.

    Resolution: Fixed

Committed after Jakob's +1.

Remove Yahoo directories

    Key: GIRAPH-5
    URL: https://issues.apache.org/jira/browse/GIRAPH-5
    Project: Giraph
    Issue Type: Task
    Reporter: Avery Ching
    Assignee: Avery Ching
    Priority: Minor
    Attachments: diff.txt