Re: java.net.ConnectException: Connection refused

2014-11-03 Thread Xenia Demetriou
Hi Puneet,

I am not an expert, but I had the same error and solved it by changing the
hostnames of the cluster PCs to lowercase, e.g. change

iHadoop3 -> ihadoop3
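
A small check along these lines can flag hostnames that would need renaming. It is only a sketch: the class name HostnameCaseCheck and the host list are made up for illustration, and the thread does not say why mixed-case names trigger the error.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

public class HostnameCaseCheck {
    // Returns true if the hostname contains no uppercase characters.
    static boolean isLowercase(String hostname) {
        return hostname.equals(hostname.toLowerCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        // Hypothetical host list; replace it with the names of your own machines.
        List<String> hosts = Arrays.asList("iHadoop3", "ihadoop4");
        for (String host : hosts) {
            if (!isLowercase(host)) {
                // Print the suggested rename for every mixed-case hostname.
                System.out.println(host + " -> " + host.toLowerCase(Locale.ROOT));
            }
        }
    }
}
```

The rename itself still has to be done in the operating system and in whatever host lists the cluster uses; the snippet only reports candidates.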

--
Xenia


2014-11-02 14:08 GMT+02:00 Puneet Agarwal puagar...@yahoo.com:

 I have set up a cluster of 4 computers for running my Pregel jobs.

 When running a job I often get the following error (given below).
 I followed another thread in the Giraph forums and learnt that this problem
 is caused by the firewall blocking network traffic.
 I have stopped the firewall service on all the machines. These
 machines run RHEL 5.5, and I stopped the service using the command:
 service iptables stop

 But I still get the same error.

 Can someone tell me what could be causing this service to be blocked on
 port 30001 on this computer?

 Regards
 Puneet (IIT Delhi, India)

 Re: Problem running the PageRank example in a cluster
 http://mail-archives.apache.org/mod_mbox/giraph-user/201310.mbox/%3CCAAjjGef9QT6y_gobLzCFp=SERreJ9Rfv0zOnKpiUfED4S6=a...@mail.gmail.com%3E










 Error
 ===
 Using Netty without authentication.

 2014-11-02 14:26:24,458 WARN org.apache.giraph.comm.netty.NettyClient: 
 connectAllAddresses: Future failed to connect with 
 iHadoop3/172.21.208.178:30001 with 0 failures because of 
 java.net.ConnectException: Connection refused
 2014-11-02 14:26:24,458 INFO org.apache.giraph.comm.netty.NettyClient: Using 
 Netty without authentication.
 2014-11-02 14:26:24,459 INFO org.apache.giraph.comm.netty.NettyClient: 
 connectAllAddresses: Successfully added 0 connections, (0 total connected) 1 
 failed, 1 failures total.
 2014-11-02 14:26:24,499 WARN 
 org.apache.giraph.comm.netty.handler.ResponseClientHandler: exceptionCaught: 
 Channel failed with remote address null
 java.net.ConnectException: Connection refused

   at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
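
To narrow down an error like the one in the log above, it can help to probe the port directly from another node. "Connection refused" usually means the host was reachable but nothing accepted the connection on that port (a firewall that rejects rather than drops packets can look the same), while a dropped packet tends to show up as a timeout. The following is a minimal sketch using only java.net; the class name PortProbe and the 3-second timeout are arbitrary choices, not anything taken from Giraph.

```java
import java.io.IOException;
import java.net.ConnectException;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.net.SocketTimeoutException;

public class PortProbe {
    enum Result { OPEN, REFUSED, TIMED_OUT, OTHER_ERROR }

    // Tries one TCP connection and classifies the outcome.
    static Result probe(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return Result.OPEN;
        } catch (ConnectException e) {
            // Typically: the host answered, but nothing accepts connections on this port.
            return Result.REFUSED;
        } catch (SocketTimeoutException e) {
            // Typically: packets were dropped on the way, for example by a firewall.
            return Result.TIMED_OUT;
        } catch (IOException e) {
            // Everything else, including hostnames that cannot be resolved.
            return Result.OTHER_ERROR;
        }
    }

    public static void main(String[] args) {
        // Defaults are taken from the log above; pass your own host and port as arguments.
        String host = args.length > 0 ? args[0] : "iHadoop3";
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 30001;
        System.out.println(host + ":" + port + " -> " + probe(host, port, 3000));
    }
}
```

If the probe reports REFUSED while the job is running, the worker on that machine is probably not listening on the port (or is bound to a different address), which points away from a packet-dropping firewall.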






Re: [VOTE] Apache Giraph 1.1.0 RC1

2014-11-03 Thread Maja Kabiljo
We've been running code that is the same as the release candidate plus the
fix for GIRAPH-961 in production for 5 days now, with no problems. This is
the hadoop_facebook profile, using only hive-io of all the io modules.

On 11/1/14, 3:49 PM, Roman Shaposhnik ro...@shaposhnik.org wrote:

Ping! Any progress on testing the current RC?

Thanks,
Roman.

On Fri, Oct 31, 2014 at 9:00 AM, Claudio Martella
claudio.marte...@gmail.com wrote:
 Oh, thanks for the info!

 On Fri, Oct 31, 2014 at 3:06 PM, Roman Shaposhnik ro...@shaposhnik.org
 wrote:

 On Fri, Oct 31, 2014 at 3:26 AM, Claudio Martella
 claudio.marte...@gmail.com wrote:
  Hi Roman,
 
  thanks again for this. I have had a look at the staging site so far (our
  cluster has been down the whole week... universities...), and I was
  wondering if you have any insight into why some of the docs are missing,
  e.g. the gora and rexster documentation.

 None of them are missing. The links moved to User Docs -> Modules,
 though:

http://people.apache.org/~rvs/giraph-1.1.0-RC1/site/gora.html

http://people.apache.org/~rvs/giraph-1.1.0-RC1/site/rexster.html
 and so forth.

 Thanks,
 Roman.




 --
Claudio Martella




Graph partitioning and data locality

2014-11-03 Thread Martin Junghanns

Hi group,

I got a question concerning the graph partitioning step. If I understood 
the code correctly, the graph is distributed to n partitions by using 
vertexID.hashCode()  n. I got two questions concerning that step.
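
For readers following along, the scheme described here (hash code modulo n, as the follow-up message clarifies) can be sketched as below. This is an illustration, not Giraph's partitioner code: the class and method names are made up, and Math.floorMod is used instead of a plain % as an assumption, so that negative hash codes still map to a valid partition index.

```java
public class HashPartitionSketch {
    // Maps a vertex ID to a partition index in [0, partitionCount).
    static int partitionFor(Object vertexId, int partitionCount) {
        // floorMod keeps the result non-negative even when hashCode() is negative.
        return Math.floorMod(vertexId.hashCode(), partitionCount);
    }

    public static void main(String[] args) {
        int n = 4; // hypothetical number of partitions
        for (long id = 0; id < 8; id++) {
            System.out.println("vertex " + id + " -> partition "
                    + partitionFor(Long.valueOf(id), n));
        }
    }
}
```

Because the assignment depends only on the ID and n, any worker can compute the owner of a vertex locally without asking a central node.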


1) Is the whole graph loaded and partitioned only by the Master? This
would mean that all the data has to be moved to that Master map job and
then moved to the physical node that the worker for each partition
runs on. As this sounds like a huge overhead, I further inspected the code:
I saw that there is also a WorkerGraphPartitioner, and I assume it calls
the partitioning method on its local data (let's say its local HDFS
blocks); if the resulting partition for a vertex does not belong to that
worker, the data gets moved to the owning worker, which reduces the
overhead. Is this assumption correct?


2) Let's say the graph is already partitioned in the file system, e.g.
blocks on physical nodes contain logically connected graph nodes. Is it
possible to just read the data as it is and skip the partitioning step?
In that case I currently assume that the vertexID should contain the
partitionID, and the custom partitioning would be an identity function
(instead of hashing or range).
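
The idea in 2) can be made concrete with a sketch in which the partition ID is packed into the vertex ID, so that "partitioning" only reads it back. Everything here is hypothetical: the 16/48 bit layout, the class name and the method names are chosen for illustration and are not part of the Giraph API.

```java
public class PrePartitionedIdSketch {
    // Hypothetical layout: upper 16 bits hold the partition ID,
    // lower 48 bits hold the vertex ID local to that partition.
    static final int LOCAL_BITS = 48;
    static final long LOCAL_MASK = (1L << LOCAL_BITS) - 1;

    // Assumes 0 <= partitionId < 65536 and 0 <= localId < 2^48.
    static long encode(int partitionId, long localId) {
        return ((long) partitionId << LOCAL_BITS) | (localId & LOCAL_MASK);
    }

    // "Identity" partitioning: the partition is read from the ID, not computed by hashing.
    static int partitionFor(long vertexId) {
        return (int) (vertexId >>> LOCAL_BITS);
    }

    static long localIdOf(long vertexId) {
        return vertexId & LOCAL_MASK;
    }

    public static void main(String[] args) {
        long id = encode(3, 42L);
        System.out.println("partition=" + partitionFor(id) + " local=" + localIdOf(id));
    }
}
```

Whether the data can then stay on the node where it is stored also depends on which worker is assigned which partition, which this sketch does not address.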


Thanks for your time and help!

Cheers,
Martin


Re: Graph partitioning and data locality

2014-11-03 Thread Martin Junghanns

sorry for the typo (no coffee yet): vertexID.hashCode() *%* n

On 04.11.2014 08:36, Martin Junghanns wrote:

Hi group,

I got a question concerning the graph partitioning step. If I 
understood the code correctly, the graph is distributed to n 
partitions by using vertexID.hashCode()  n. I got two questions 
concerning that step.


1) Is the whole graph loaded and partitioned only by the Master? This
would mean that all the data has to be moved to that Master map job and
then moved to the physical node that the worker for each partition
runs on. As this sounds like a huge overhead, I further inspected the
code:
I saw that there is also a WorkerGraphPartitioner, and I assume it
calls the partitioning method on its local data (let's say its local
HDFS blocks); if the resulting partition for a vertex does not belong
to that worker, the data gets moved to the owning worker, which reduces
the overhead. Is this assumption correct?


2) Let's say the graph is already partitioned in the file system, e.g.
blocks on physical nodes contain logically connected graph nodes. Is it
possible to just read the data as it is and skip the partitioning
step? In that case I currently assume that the vertexID should
contain the partitionID, and the custom partitioning would be an
identity function (instead of hashing or range).


Thanks for your time and help!

Cheers,
Martin