Re: java.net.ConnectException: Connection refused
Hi Puneet, I am not an expert, but I had the same error and solved it by changing the hostnames of the cluster PCs to lowercase, e.g. iHadoop3 → ihadoop3. -- Xenia

2014-11-02 14:08 GMT+02:00 Puneet Agarwal puagar...@yahoo.com:

I have set up a cluster of 4 computers for running my Pregel jobs. When running a job I often get the error given below. I followed another thread in the Giraph forums and learnt that this problem is caused by the firewall stopping network traffic. I have stopped the firewall service on all the machines. These machines have RHEL 5.5 and I stopped the service using the command: service iptables stop. But I still get the same error. Can someone tell me what could be causing this service to be blocked on port 30001 on this computer?

Regards, Puneet (IIT Delhi, India)

Re: Problem running the PageRank example in a cluster
http://mail-archives.apache.org/mod_mbox/giraph-user/201310.mbox/%3CCAAjjGef9QT6y_gobLzCFp=SERreJ9Rfv0zOnKpiUfED4S6=a...@mail.gmail.com%3E

This is the output of the command in all servers:

Chain INPUT (policy ACCEPT)
target prot opt source destination
ACCEPT tcp -- anywhere anywhere state NEW tcp dpts:3:30010
ACCEPT tcp -- anywhere anywhere ...

Error
===
Using Netty without authentication.
2014-11-02 14:26:24,458 WARN org.apache.giraph.comm.netty.NettyClient: connectAllAddresses: Future failed to connect with iHadoop3/172.21.208.178:30001 with 0 failures because of java.net.ConnectException: Connection refused
2014-11-02 14:26:24,458 INFO org.apache.giraph.comm.netty.NettyClient: Using Netty without authentication.
2014-11-02 14:26:24,459 INFO org.apache.giraph.comm.netty.NettyClient: connectAllAddresses: Successfully added 0 connections, (0 total connected) 1 failed, 1 failures total.
2014-11-02 14:26:24,499 WARN org.apache.giraph.comm.netty.handler.ResponseClientHandler: exceptionCaught: Channel failed with remote address null
java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
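For anyone debugging this, it can help to confirm whether the worker port is reachable at all, independent of Giraph/Netty. Below is a minimal sketch (not part of Giraph); the hostname is lowercased per Xenia's suggestion, and host/port are taken from the log above:

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

/** Quick TCP reachability probe, independent of Giraph/Netty. */
public class PortProbe {
    /** Returns true if a TCP connection to host:port succeeds within timeoutMs. */
    public static boolean isReachable(String host, int port, int timeoutMs) {
        try (Socket s = new Socket()) {
            s.connect(new InetSocketAddress(host, port), timeoutMs);
            return true;
        } catch (IOException e) {
            // A "Connection refused" here matches what NettyClient logs above:
            // either nothing is listening on the port, or a firewall rejects it
            return false;
        }
    }

    public static void main(String[] args) {
        // hostname lowercased per the suggested fix; port from the error log
        System.out.println(isReachable("ihadoop3", 30001, 3000)
                ? "port reachable" : "connection refused or timed out");
    }
}
```

If this probe fails from another cluster node while the worker process is running, the problem is network-level (firewall, hostname resolution) rather than a Giraph configuration issue.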
Re: [VOTE] Apache Giraph 1.1.0 RC1
We've been running code which is the same as the release candidate plus the fix on GIRAPH-961 in production for 5 days now, no problems. This is the hadoop_facebook profile, using only hive-io from all io modules.

On 11/1/14, 3:49 PM, Roman Shaposhnik ro...@shaposhnik.org wrote:

Ping! Any progress on testing the current RC? Thanks, Roman.

On Fri, Oct 31, 2014 at 9:00 AM, Claudio Martella claudio.marte...@gmail.com wrote:

Oh, thanks for the info!

On Fri, Oct 31, 2014 at 3:06 PM, Roman Shaposhnik ro...@shaposhnik.org wrote:

On Fri, Oct 31, 2014 at 3:26 AM, Claudio Martella claudio.marte...@gmail.com wrote:

Hi Roman, thanks again for this. I have had a look at the staging site so far (our cluster has been down the whole week... universities...), and I was wondering if you have any insight into why some of the docs are missing, e.g. the Gora and Rexster documentation.

None of them are missing. The links moved to a User Docs → Modules section, though:

http://people.apache.org/~rvs/giraph-1.1.0-RC1/site/gora.html
http://people.apache.org/~rvs/giraph-1.1.0-RC1/site/rexster.html

and so forth. Thanks, Roman.

-- Claudio Martella
Graph partitioning and data locality
Hi group, I got a question concerning the graph partitioning step. If I understood the code correctly, the graph is distributed to n partitions by using vertexID.hashCode() n. I got two questions concerning that step.

1) Is the whole graph loaded and partitioned only by the master? This would mean the whole data has to be moved to that master map job and then moved to the physical node the specific worker for the partition runs on. As this sounds like a huge overhead, I further inspected the code: I saw that there is also a WorkerGraphPartitioner, and I assume it calls the partitioning method on its local data (let's say its local HDFS blocks); if the resulting partition for a vertex is not itself, the data gets moved to that worker, which reduces the overhead. Is this assumption correct?

2) Let's say the graph is already partitioned in the file system, e.g. blocks on physical nodes contain logically connected graph nodes. Is it possible to just read the data as it is and skip the partitioning step? In that case I currently assume that the vertexID should contain the partitionID and the custom partitioning would be an identity function (instead of hashing or range).

Thanks for your time and help! Cheers, Martin
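As an aside, the default hash-based assignment Martin describes boils down to something like the sketch below. This is a simplified illustration, not the actual Giraph partitioner classes; the sign-bit mask is one way to avoid a negative result from hashCode:

```java
/** Simplified sketch of hash partitioning: partition = hash(vertexId) % n. */
public class HashPartitionSketch {
    /** Maps a vertex ID to one of numPartitions partitions. */
    public static int partitionFor(long vertexId, int numPartitions) {
        int h = Long.hashCode(vertexId);
        // Mask the sign bit so the modulo is never negative
        // (Math.abs alone is unsafe for Integer.MIN_VALUE)
        return (h & Integer.MAX_VALUE) % numPartitions;
    }

    public static void main(String[] args) {
        int n = 4;
        for (long id = 0; id < 6; id++) {
            System.out.println("vertex " + id + " -> partition " + partitionFor(id, n));
        }
    }
}
```

Note that this mapping depends only on the vertex ID, not on where the data physically lives, which is exactly why question 2 about locality-preserving partitioning arises.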
Re: Graph partitioning and data locality
sorry for the typo (no coffee yet): vertexID.hashCode() *%* n

On 04.11.2014 08:36, Martin Junghanns wrote:
> Hi group, I got a question concerning the graph partitioning step. [...]
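Regarding question 2, the "identity" scheme Martin describes could conceptually look like the sketch below, where the partition ID is packed into the high bits of the vertex ID so assignment is a bit shift instead of a hash. This is a hypothetical encoding for illustration, not a Giraph API:

```java
/**
 * Sketch of an "identity" partitioner: the partition ID is encoded in the
 * vertex ID itself, so no hashing (and no data movement) is needed.
 * Hypothetical encoding, not part of Giraph.
 */
public class IdentityPartitionSketch {
    static final int PARTITION_BITS = 16; // upper 16 bits carry the partition ID

    /** Packs a partition ID and a partition-local ID into one vertex ID. */
    static long encode(int partition, long localId) {
        return ((long) partition << (64 - PARTITION_BITS)) | localId;
    }

    /** The "identity" partitioning function: just read the ID's high bits. */
    static int partitionOf(long vertexId) {
        return (int) (vertexId >>> (64 - PARTITION_BITS));
    }

    public static void main(String[] args) {
        long v = encode(3, 42L);
        System.out.println("vertex " + v + " lives in partition " + partitionOf(v));
    }
}
```

With such an encoding, vertices that are already co-located in the file system keep their placement, since the partition function never reassigns them.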