Please try building without the -Dmaven.test.skip=true option. That flag also skips compiling the test classes, so the hama-core tests jar that hama-graph depends on is never built.

> Problem 2:
> Compared to 0.6.0, you use VerticesInfo to store the vertices, and in the
> DiskVerticesInfo, they serialize and deserialize every vertices into the
> local file system.
> Is this used to provides fault tolerance (the checkpoint part)? Or is it
> designed for other purposes?

It was part of an effort to reduce memory usage [1]. In the TRUNK version, we've optimized memory consumption by serializing vertex objects in memory, without a big degradation in performance. As I mentioned before, the vertices don't occupy much memory.

> If it is designed for checkpoint part in fault tolerance, why it write to
> local disk but not HDFS?
> In my mind, if a machine crashed, if the fault tolerance mechanism depends
> on the manual reboot or repair of the crashed machine, the potential lengthy
> recovery time is intolerant.
> Do you agree with me?
> Or maybe you have other trade-off?

Users will be able to set the checkpointing interval. Then the contents of the memory buffers need to be written to HDFS only when a checkpoint occurs.

1. https://issues.apache.org/jira/browse/HAMA-704?focusedCommentId=13580454&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13580454

On Fri, Feb 28, 2014 at 11:55 AM, developer wang <[email protected]> wrote:
> Thank you very much for the previous useful reply.
> But I encounters with other problems.
> -----------------------------------------------------------------------------------------------------------------------------------------------
> 1 Problem
> I tried the trunk (with the commit-point in git:
> 1b3f1744a33a29686c2eafe7764bb3640938fcc8), but it can not pass the
> compilation phase (I use this command: mvn install package
> -Dmaven.test.skip=true) it complained this:
>
> [INFO]
> [INFO] ------------------------------------------------------------------------
> [INFO] Building graph 0.7.0-SNAPSHOT
> [INFO] ------------------------------------------------------------------------
> [INFO] ------------------------------------------------------------------------
> [INFO] Reactor Summary:
> [INFO]
> [INFO] Apache Hama parent POM ............................ SUCCESS [2.960s]
> [INFO] pipes ............................................. SUCCESS [10.033s]
> [INFO] commons ........................................... SUCCESS [6.664s]
> [INFO] core .............................................. SUCCESS [23.909s]
> [INFO] graph ............................................. FAILURE [0.048s]
> [INFO] machine learning .................................. SKIPPED
> [INFO] examples .......................................... SKIPPED
> [INFO] hama-dist ......................................... SKIPPED
> [INFO] ------------------------------------------------------------------------
> [INFO] BUILD FAILURE
> [INFO] ------------------------------------------------------------------------
> [INFO] Total time: 44.102s
> [INFO] Finished at: Fri Feb 28 10:06:23 HKT 2014
> [INFO] Final Memory: 50M/384M
> [INFO] ------------------------------------------------------------------------
> [ERROR] Failed to execute goal on project hama-graph: Could not resolve
> dependencies for project org.apache.hama:hama-graph:jar:0.7.0-SNAPSHOT:
> Failure to find org.apache.hama:hama-core:jar:tests:0.7.0-SNAPSHOT in
> https://repository.cloudera.com/artifactory/cloudera-repos was cached in the
> local repository, resolution will not be reattempted until the update
> interval of cloudera-repo has elapsed or updates are forced -> [Help 1]
>
> Does this mean you forget upload some jars to the maven remote repository?
> (I can the above command to compile 0.6.3)
>
> I surveyed about this question on the Internet, some says I should run maven
> with -U.
> So I tried to this command: mvn -U compile
> but it still fails. almost the same error:
> [ERROR] Failed to execute goal
> org.apache.maven.plugins:maven-remote-resources-plugin:1.1:process (default)
> on project hama-graph: Failed to resolve dependencies for one or more
> projects in the reactor. Reason: Missing:
> [ERROR] ----------
> [ERROR] 1) org.apache.hama:hama-core:test-jar:tests:0.7.0-SNAPSHOT
> [ERROR]
> [ERROR] Try downloading the file manually from the project website.
> [ERROR]
> [ERROR] Then, install it using the command:
> [ERROR] mvn install:install-file -DgroupId=org.apache.hama
> -DartifactId=hama-core -Dversion=0.7.0-SNAPSHOT -Dclassifier=tests
> -Dpackaging=test-jar -Dfile=/path/to/file
> [ERROR]
> [ERROR] Alternatively, if you host your own repository you can deploy the
> file there:
> [ERROR] mvn deploy:deploy-file -DgroupId=org.apache.hama
> -DartifactId=hama-core -Dversion=0.7.0-SNAPSHOT -Dclassifier=tests
> -Dpackaging=test-jar -Dfile=/path/to/file -Durl=[url] -DrepositoryId=[id]
> [ERROR]
> [ERROR] Path to dependency:
> [ERROR] 1) org.apache.hama:hama-graph:jar:0.7.0-SNAPSHOT
> [ERROR] 2) org.apache.hama:hama-core:test-jar:tests:0.7.0-SNAPSHOT
> [ERROR]
> [ERROR] ----------
> [ERROR] 1 required artifact is missing.
>
> -----------------------------------------------------------------------------------------------------------------------------------------------
>
> Problem 2:
> Compared to 0.6.0, you use VerticesInfo to store the vertices, and in the
> DiskVerticesInfo, they serialize and deserialize every vertices into the
> local file system.
> Is this used to provides fault tolerance (the checkpoint part)? Or is it
> designed for other purposes?
>
> If it is designed for checkpoint part in fault tolerance, why it write to
> local disk but not HDFS?
> In my mind, if a machine crashed, if the fault tolerance mechanism depends
> on the manual reboot or repair of the crashed machine, the potential lengthy
> recovery time is intolerant.
> Do you agree with me?
> Or maybe you have other trade-off?
>
> 2014-02-28 8:07 GMT+08:00 Edward J. Yoon <[email protected]>:
>
>> In 0.6.3, you can use only ListVerticesInfo. Please use the TRUNK if you
>> want.
>>
>> And, vertices doesn't occupy large memory. Please use ListVerticesInfo.
>>
>> FT is not supported yet.
>>
>> On Thu, Feb 27, 2014 at 10:06 PM, developer wang <[email protected]>
>> wrote:
>> > Hi, all.
>> > Thank you for your detailed reply.
>> >
>> > In the previous test, I used a incomplete graph to run PageRank, then I
>> > got this error:
>> > java.lang.IllegalArgumentException: Messages must never be behind the
>> > vertex in ID! Current Message ID:
>> >
>> > With your detailed reply, I knew it was because some vertices tried to
>> > send messages to dangling nodes (Actually 0.6.0 can handle this by adding
>> > a repair phase). Then I fixed this by adding dangling nodes explicitly
>> > with a line which only contains a vertex id.
>> >
>> > After this, I could run the PageRank (in the attachment) with the
>> > ListVerticesInfo.
>> >
>> > But if I use DiskVerticesInfo instead of ListVerticesInfo by this:
>> > pageJob.set("hama.graph.vertices.info",
>> > "org.apache.hama.graph.DiskVerticesInfo");
>> >
>> > I will still get the below error:
>> > java.lang.IllegalArgumentException: Messages must never be behind the
>> > vertex in ID! Current Message ID:
>> >
>> > What's the problem?
>> > Do I use DiskVerticesInfo correctly?
>> >
>> > Or if I want run my application with fault tolerance, what should I do?
>> >
>> > Thank you very much.
>> >
>> > 2014-02-26 18:27 GMT+08:00 Edward J. Yoon <[email protected]>:
>> >
>> >> > Could you answer this question:
>> >> > I found during the loading, peers would not exchange vertices with
>> >> > each other as hama 0.6.0 did.
>> >> > So how does hama 0.6.3 solve the problem below: a peer load a vertex
>> >> > which is belong to another peer? (for example, suppose 3 peers for
>> >> > this task and the partitoner is Hash, peer #1 loads vertex 2, in
>> >> > 0.6.3, peer #1 did not send vertex 2 to peer #2)
>> >>
>> >> Instead of network communication, 0.6.3 uses file communication for
>> >> input data partitioning. Please see
>> >>
>> >> http://svn.apache.org/repos/asf/hama/trunk/core/src/main/java/org/apache/hama/bsp/PartitioningRunner.java
>> >>
>> >> On Wed, Feb 26, 2014 at 6:03 PM, developer wang <[email protected]>
>> >> wrote:
>> >> > Actually I comment the set statement
>> >> > //pageJob.set("hama.graph.self.ref", "true");
>> >> >
>> >> > and In GraphJobRunner:
>> >> > final boolean selfReference = conf.getBoolean("hama.graph.self.ref",
>> >> > false);
>> >> >
>> >> > And I will explicitly set hama.graph.self.ref to false, and use a
>> >> > complete graph to have a try again.
>> >> >
>> >> > Could you answer this question:
>> >> > I found during the loading, peers would not exchange vertices with
>> >> > each other as hama 0.6.0 did.
>> >> > So how does hama 0.6.3 solve the problem below: a peer load a vertex
>> >> > which is belong to another peer? (for example, suppose 3 peers for
>> >> > this task and the partitoner is Hash, peer #1 loads vertex 2, in
>> >> > 0.6.3, peer #1 did not send vertex 2 to peer #2)
>> >> >
>> >> > or I have some misunderstanding about hama 0.6.3 or above? (In the
>> >> > last years, I used 0.6.0 to do the daily job)
>> >> >
>> >> > 2014-02-26 16:48 GMT+08:00 Edward J. Yoon <[email protected]>:
>> >> >
>> >> >> The background is described here:
>> >> >> https://issues.apache.org/jira/browse/HAMA-758
>> >> >>
>> >> >> On Wed, Feb 26, 2014 at 5:38 PM, Edward J. Yoon <[email protected]>
>> >> >> wrote:
>> >> >> > Oh, please try after set "hama.check.missing.vertex" to false in
>> >> >> > job configuration.
>> >> >> >
>> >> >> > On Wed, Feb 26, 2014 at 5:14 PM, developer wang <[email protected]>
>> >> >> > wrote:
>> >> >> >> Thank you very much.
>> >> >> >>
>> >> >> >> Since I think the framework should not decide whether the graph
>> >> >> >> should self-reference, so I disable this config. (Actually when I
>> >> >> >> used 0.6.0, I also disabled this config)
>> >> >> >>
>> >> >> >> Since I use my PC to test whether my application works, I use a
>> >> >> >> small graph. (It does have a lot of dangling node)
>> >> >> >>
>> >> >> >> The dataset and the PageRank is attached.
>> >> >> >>
>> >> >> >> Thank you very much.
>> >> >> >>
>> >> >> >> 2014-02-26 16:04 GMT+08:00 Edward J. Yoon <[email protected]>:
>> >> >> >>
>> >> >> >>> Hi Wang,
>> >> >> >>>
>> >> >> >>> Can you send me your input data so that I can debug?
>> >> >> >>>
>> >> >> >>> On Wed, Feb 26, 2014 at 4:55 PM, developer wang <[email protected]>
>> >> >> >>> wrote:
>> >> >> >>> > Firstly, thank you very much for reply.
>> >> >> >>> >
>> >> >> >>> > But in the log, I found "14/02/25 16:45:00 INFO
>> >> >> >>> > graph.GraphJobRunner: 2918 vertices are loaded into
>> >> >> >>> > localhost:60340 "
>> >> >> >>> > So it had finished the loading phase. is this true?
>> >> >> >>> >
>> >> >> >>> > Another problem is that:
>> >> >> >>> > I found during the loading, peers would not exchange vertices
>> >> >> >>> > with each other as hama 0.6.0 did.
>> >> >> >>> > So how does hama 0.6.3 solve the problem below: a peer load a
>> >> >> >>> > vertex which is belong to another peer? (for example, suppose 3
>> >> >> >>> > peers for this task and the partitoner is Hash, peer #1 loads
>> >> >> >>> > vertex 2, in 0.6.3, peer #1 did not send vertex 2 to peer #2)
>> >> >> >>> >
>> >> >> >>> > 2014-02-26 15:46 GMT+08:00 Edward J. Yoon <[email protected]>:
>> >> >> >>> >
>> >> >> >>> >> > I tried PageRank with a small input of my own.
>> >> >> >>> >>
>> >> >> >>> >> Hi Wang,
>> >> >> >>> >>
>> >> >> >>> >> This error often occurs when there is a record conversion
>> >> >> >>> >> error. So, you should check whether the vertex reader works
>> >> >> >>> >> correctly.
>> >> >> >>> >>
>> >> >> >>> >> And, I highly recommend you to use latest TRUNK version[1] as
>> >> >> >>> >> possible.
>> >> >> >>> >>
>> >> >> >>> >> 1.
>> >> >> >>> >> http://wiki.apache.org/hama/GettingStarted#Build_latest_version_from_source
>> >> >> >>> >>
>> >> >> >>> >> Thank you.
>> >> >> >>> >>
>> >> >> >>> >> On Wed, Feb 26, 2014 at 1:44 PM, developer wang
>> >> >> >>> >> <[email protected]> wrote:
>> >> >> >>> >> > Hi, all.
>> >> >> >>> >> > I am Peng Wang, a student trying to use and learn Hama.
>> >> >> >>> >> >
>> >> >> >>> >> > I cloned the develop git repository of Hama.
>> >> >> >>> >> >
>> >> >> >>> >> > I firstly tried the newest version in the tag, the tag:
>> >> >> >>> >> > 0.7.0-SNAPSHOT.
>> >> >> >>> >> > commit bef419747695d15de8a1087f44028ee40571b5f9
>> >> >> >>> >> > Author: Edward J. Yoon <[email protected]>
>> >> >> >>> >> > Date: Fri Mar 29 00:44:59 2013 +0000
>> >> >> >>> >> >
>> >> >> >>> >> > [maven-release-plugin] copy for tag 0.7.0-SNAPSHOT
>> >> >> >>> >> >
>> >> >> >>> >> > git-svn-id:
>> >> >> >>> >> > https://svn.apache.org/repos/asf/hama/tags/0.7.0-SNAPSHOT@1462366
>> >> >> >>> >> > 13f79535-47bb-0310-9956-ffa450edef68
>> >> >> >>> >> >
>> >> >> >>> >> > But the tag: 0.6.3-RC3
>> >> >> >>> >> > commit c9526b1272c83d641332667ce5d81d7ccc94be06
>> >> >> >>> >> > Author: Edward J. Yoon <[email protected]>
>> >> >> >>> >> > Date: Sun Oct 6 08:27:00 2013 +0000
>> >> >> >>> >> >
>> >> >> >>> >> > [maven-release-plugin] copy for tag 0.6.3-RC3
>> >> >> >>> >> >
>> >> >> >>> >> > git-svn-id:
>> >> >> >>> >> > https://svn.apache.org/repos/asf/hama/tags/0.6.3-RC3@1529594
>> >> >> >>> >> > 13f79535-47bb-0310-9956-ffa450edef68
>> >> >> >>> >> >
>> >> >> >>> >> > From the commit log, 0.7.0-SNAPSHOT is earlier than 0.6.3-RC3,
>> >> >> >>> >> > So I used 0.6.3-RC3 instead of 0.7.0-SNAPSHOT (but on the
>> >> >> >>> >> > website of hama, 0.7.0-SNAPSHOT is the newest version)
>> >> >> >>> >> >
>> >> >> >>> >> > Then I deployed Hama with the Pseudo Distributed Mode on my
>> >> >> >>> >> > desktop with 3 task runners.
>> >> >> >>> >> > I tried PageRank with a small input of my own.
>> >> >> >>> >> > But it failes. And its log is:
>> >> >> >>> >> > java.lang.IllegalArgumentException: Messages must never be
>> >> >> >>> >> > behind the vertex in ID! Current Message ID: 100128 vs. 1004
>> >> >> >>> >> > at org.apache.hama.graph.GraphJobRunner.iterate(GraphJobRunner.java:306)
>> >> >> >>> >> > at org.apache.hama.graph.GraphJobRunner.doSuperstep(GraphJobRunner.java:254)
>> >> >> >>> >> > at org.apache.hama.graph.GraphJobRunner.bsp(GraphJobRunner.java:145)
>> >> >> >>> >> > at org.apache.hama.bsp.BSPTask.runBSP(BSPTask.java:177)
>> >> >> >>> >> > at org.apache.hama.bsp.BSPTask.run(BSPTask.java:146)
>> >> >> >>> >> > at org.apache.hama.bsp.GroomServer$BSPPeerChild.main(GroomServer.java:1246)
>> >> >> >>> >> >
>> >> >> >>> >> > Could you tell me what is the problem in my situation?
>> >> >> >>> >> >
>> >> >> >>> >> > I check whether hama had finished the loading phase, and I
>> >> >> >>> >> > found "14/02/25 16:45:00 INFO graph.GraphJobRunner: 2918
>> >> >> >>> >> > vertices are loaded into localhost:60340 "in the log.
>> >> >> >>> >> > So it had finished the loading phase.
>> >> >> >>> >> >
>> >> >> >>> >> > After this, I read the source code, and I found during the
>> >> >> >>> >> > loading, peers would not exchange vertices with each other as
>> >> >> >>> >> > hama 0.5.0 did.
>> >> >> >>> >> > So how does hama 0.6.3 solve the problem below: a peer load a
>> >> >> >>> >> > vertex which is belong to another peer?
>> >> >> >>> >> >
>> >> >> >>> >> > Could you tell which branch or tag is a stable version?
>> >> >> >>> >> > And does it support fault tolerance for graph algorithms? and
>> >> >> >>> >> > how can I get it?
>> >> >> >>> >> >
>> >> >> >>> >>
>> >> >> >>> >> --
>> >> >> >>> >> Edward J. Yoon (@eddieyoon)
>> >> >> >>> >> Chief Executive Officer
>> >> >> >>> >> DataSayer, Inc.
>> >> >> >>> >
>> >> >> >>>
>> >> >> >>> --
>> >> >> >>> Edward J. Yoon (@eddieyoon)
>> >> >> >>> Chief Executive Officer
>> >> >> >>> DataSayer, Inc.
>> >> >> >>
>> >> >> >
>> >> >> > --
>> >> >> > Edward J. Yoon (@eddieyoon)
>> >> >> > Chief Executive Officer
>> >> >> > DataSayer, Inc.
>> >> >>
>> >> >> --
>> >> >> Edward J. Yoon (@eddieyoon)
>> >> >> Chief Executive Officer
>> >> >> DataSayer, Inc.
>> >> >
>> >>
>> >> --
>> >> Edward J. Yoon (@eddieyoon)
>> >> Chief Executive Officer
>> >> DataSayer, Inc.
>> >
>>
>> --
>> Edward J. Yoon (@eddieyoon)
>> Chief Executive Officer
>> DataSayer, Inc.
>

--
Edward J. Yoon (@eddieyoon)
Chief Executive Officer
DataSayer, Inc.
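
P.S. The workaround described earlier in this thread (adding a line that contains only a vertex id for every dangling node) can be automated before the job is submitted. The following is only a minimal sketch and not part of Hama: the class name DanglingVertexRepair and the tab-separated "vertexId<TAB>neighbor..." input format are assumptions made for illustration, so the parsing would need to match whatever your vertex reader expects.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper (not a Hama class): pads a text adjacency list so that
// every vertex that is the target of an edge also has an input line of its own.
final class DanglingVertexRepair {

  // Assumed line format: "vertexId<TAB>neighbor1<TAB>neighbor2...".
  static List<String> repair(List<String> lines) {
    Set<String> declared = new LinkedHashSet<>();   // ids that own a line
    Set<String> referenced = new LinkedHashSet<>(); // ids seen as neighbors
    List<String> out = new ArrayList<>();

    for (String raw : lines) {
      String line = raw.trim();
      if (line.isEmpty()) {
        continue; // ignore blank lines
      }
      String[] fields = line.split("\t");
      declared.add(fields[0]);
      for (int i = 1; i < fields.length; i++) {
        if (!fields[i].isEmpty()) {
          referenced.add(fields[i]);
        }
      }
      out.add(line);
    }

    // Whatever is referenced but never declared is a dangling vertex:
    // append one line per such vertex that contains only its id.
    referenced.removeAll(declared);
    out.addAll(referenced);
    return out;
  }

  public static void main(String[] args) {
    List<String> input = Arrays.asList("1\t2\t3", "2\t1");
    for (String line : repair(input)) {
      System.out.println(line);
    }
  }
}
```

For the two-line input in main, the sketch keeps both original lines and appends a third line holding just 3, because vertex 3 is referenced as a neighbor but never declared.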
