Here are my configurations:

hama-site.xml:

  <property>
    <name>bsp.master.address</name>
    <value>cluster-0:40000</value>
  </property>

  <property>
    <name>fs.default.name</name>
    <value>hdfs://cluster-0:9000/</value>
  </property>

  <property>
    <name>hama.zookeeper.quorum</name>
    <value>cluster-0</value>
  </property>
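For reference, Hama combines hama.zookeeper.quorum with hama.zookeeper.property.clientPort (2181 by default) into ZooKeeper's usual "host:port,host:port" connect string. A minimal sketch of that combination, as I understand it; the class and method names here are mine, not Hama API:

```java
import java.util.Arrays;
import java.util.stream.Collectors;

public class ZkConnectString {

    // Join a comma-separated quorum list with a client port into the
    // "host:port,host:port" string that ZooKeeper clients expect.
    public static String connectString(String quorum, int clientPort) {
        return Arrays.stream(quorum.split(","))
                .map(String::trim)
                .filter(h -> !h.isEmpty())
                .map(h -> h + ":" + clientPort)
                .collect(Collectors.joining(","));
    }

    public static void main(String[] args) {
        // quorum value from the hama-site.xml above, assumed default client port
        System.out.println(connectString("cluster-0", 2181)); // cluster-0:2181
        System.out.println(connectString("172.17.0.3,172.17.0.7", 2181));
    }
}
```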


% bin/hama zookeeper
15/06/29 12:17:17 ERROR quorum.QuorumPeerConfig: Invalid
configuration, only one server specified (ignoring)
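As far as I know, that QuorumPeerConfig message is just a warning: with a single host in hama.zookeeper.quorum, ZooKeeper ignores the quorum configuration and runs in standalone mode. A real ensemble would list several hosts (the extra hostnames below are placeholders):

```xml
  <property>
    <name>hama.zookeeper.quorum</name>
    <!-- comma-separated list; each listed host runs a ZooKeeper peer -->
    <value>cluster-0,cluster-1,cluster-2</value>
  </property>
```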

Then, open a new terminal and run the master with the following command:

% bin/hama bspmaster
...
15/06/29 12:17:40 INFO sync.ZKSyncBSPMasterClient: Initialized ZK false
15/06/29 12:17:40 INFO sync.ZKSyncClient: Initializing ZK Sync Client
15/06/29 12:17:40 INFO ipc.Server: IPC Server Responder: starting
15/06/29 12:17:40 INFO ipc.Server: IPC Server listener on 40000: starting
15/06/29 12:17:40 INFO ipc.Server: IPC Server handler 0 on 40000: starting
15/06/29 12:17:40 INFO bsp.BSPMaster: Starting RUNNING
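Since BSPMaster only comes up cleanly when it can actually reach ZooKeeper, a quick sanity check before starting the master is ZooKeeper's four-letter "ruok" command, which a healthy server answers with "imok". A rough sketch using plain sockets (the class name, and the assumption that the client port is 2181, are mine):

```java
import java.io.InputStream;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

public class ZkHealthCheck {

    // Send ZooKeeper's four-letter "ruok" command; a healthy server replies "imok".
    public static boolean isZooKeeperOk(String host, int port, int timeoutMs) {
        try (Socket s = new Socket()) {
            s.connect(new InetSocketAddress(host, port), timeoutMs);
            s.setSoTimeout(timeoutMs);
            OutputStream out = s.getOutputStream();
            out.write("ruok".getBytes(StandardCharsets.US_ASCII));
            out.flush();
            InputStream in = s.getInputStream();
            byte[] buf = new byte[4];
            int n = 0;
            while (n < 4) {                       // read up to the 4-byte reply
                int r = in.read(buf, n, 4 - n);
                if (r < 0) break;                 // server closed the connection
                n += r;
            }
            return "imok".equals(new String(buf, 0, n, StandardCharsets.US_ASCII));
        } catch (Exception e) {
            return false;                         // unreachable or not responding
        }
    }

    public static void main(String[] args) {
        String host = args.length > 0 ? args[0] : "cluster-0";
        System.out.println(host + " ok? " + isZooKeeperOk(host, 2181, 3000));
    }
}
```

If this prints false for a quorum member, ConnectionLoss errors like the ones quoted below are what you would expect to see from the master.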



On Mon, Jun 29, 2015 at 12:17 PM, Edward J. Yoon <[email protected]> wrote:
> Hi,
>
> If you run zk server too, BSPmaster will be connected to zk and won't
> throw exceptions.
>
> On Mon, Jun 29, 2015 at 12:13 PM, Behroz Sikander <[email protected]> wrote:
>> Hi,
>> Thank you for the information. I moved to hama 0.7.0 and I still have the same
>> problem.
>> When I run % bin/hama bspmaster, I am getting the following exception
>>
>> INFO http.HttpServer: Port returned by
>> webServer.getConnectors()[0].getLocalPort() before open() is -1. Opening
>> the listener on 40013
>>  INFO http.HttpServer: listener.getLocalPort() returned 40013
>> webServer.getConnectors()[0].getLocalPort() returned 40013
>>  INFO http.HttpServer: Jetty bound to port 40013
>>  INFO mortbay.log: jetty-6.1.14
>>  INFO mortbay.log: Extract
>> jar:file:/home/behroz/Documents/Packages/hama-0.7.0/hama-core-0.7.0.jar!/webapp/bspmaster/
>> to /tmp/Jetty_b178b33b16cc_40013_bspmaster____.cof30w/webapp
>>  INFO mortbay.log: Started SelectChannelConnector@b178b33b16cc:40013
>>  INFO bsp.BSPMaster: Cleaning up the system directory
>>  INFO bsp.BSPMaster: hdfs://172.17.0.3:54310/tmp/hama-behroz/bsp/system
>>  INFO sync.ZKSyncBSPMasterClient: Initialized ZK false
>>  INFO sync.ZKSyncClient: Initializing ZK Sync Client
>>  ERROR sync.ZKSyncBSPMasterClient:
>> org.apache.zookeeper.KeeperException$ConnectionLossException:
>> KeeperErrorCode = ConnectionLoss for /bsp
>> at org.apache.zookeeper.KeeperException.create(KeeperException.java:99)
>> at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
>> at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1041)
>> at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1069)
>> at
>> org.apache.hama.bsp.sync.ZKSyncBSPMasterClient.init(ZKSyncBSPMasterClient.java:62)
>> at org.apache.hama.bsp.BSPMaster.initZK(BSPMaster.java:534)
>> at org.apache.hama.bsp.BSPMaster.startMaster(BSPMaster.java:517)
>> at org.apache.hama.bsp.BSPMaster.startMaster(BSPMaster.java:500)
>> at org.apache.hama.BSPMasterRunner.run(BSPMasterRunner.java:46)
>> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
>> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
>> at org.apache.hama.BSPMasterRunner.main(BSPMasterRunner.java:56)
>>  ERROR sync.ZKSyncBSPMasterClient:
>> org.apache.zookeeper.KeeperException$ConnectionLossException:
>> KeeperErrorCode = ConnectionLoss for /bsp
>>
>> *My zookeeper settings in hama-site.xml are as follows (right now, I am using
>> just two servers, 172.17.0.3 and 172.17.0.7):*
>> <property>
>>                  <name>hama.zookeeper.quorum</name>
>>                  <value>172.17.0.3,172.17.0.7</value>
>>                  <description>Comma separated list of servers in the
>> ZooKeeper quorum.
>>                  For example, "host1.mydomain.com,host2.mydomain.com,
>> host3.mydomain.com".
>>                  By default this is set to localhost for local and
>> pseudo-distributed modes
>>                  of operation. For a fully-distributed setup, this should
>> be set to a full
>>                  list of ZooKeeper quorum servers. If HAMA_MANAGES_ZK is
>> set in hama-env.sh
>>                  this is the list of servers which we will start/stop
>> ZooKeeper on.
>>                  </description>
>>         </property>
>>        ......
>>        <property>
>>                  <name>hama.zookeeper.property.clientPort</name>
>>                  <value>2181</value>
>>          </property>
>>
>> Is something wrong with my settings?
>>
>> Regards,
>> Behroz Sikander
>>
>> On Mon, Jun 29, 2015 at 1:44 AM, Edward J. Yoon <[email protected]>
>> wrote:
>>
>>> > (0.7.0) because I do not understand YARN yet. It adds extra
>>> configurations
>>>
>>> Hama classic mode works on both Hadoop 1.x and Hadoop 2.x HDFS. Yarn
>>> configuration is only needed when you want to submit a BSP job to a Yarn
>>> cluster without a Hama cluster, so you don't need to worry about it. :-)
>>>
>>> > distributed mode ? and is there any way to manage the server ? I mean
>>> right
>>> > now, I have 3 machines with alot of configurations files and log files.
>>> It
>>>
>>> You can use web UI at http://masterserver_address:40013/bspmaster.jsp
>>>
>>> To debug your program, please try like below:
>>>
>>> 1) Run a BSPMaster and Zookeeper at server1.
>>> % bin/hama bspmaster
>>> % bin/hama zookeeper
>>>
>>> 2) Run a Groom at server1 and server2.
>>>
>>> % bin/hama groom
>>>
>>> 3) Check whether the daemons are running well. Then, run your program using
>>> the jar command at server1.
>>>
>>> % bin/hama jar .....
>>>
>>> > In hama_[user]_bspmaster_.....log file I get the following exception. But
>>> > this occurs in both cases when I run my job with 3 tasks or with 4 tasks
>>>
>>> In fact, you should not see the above initZK error log.
>>>
>>> --
>>> Best Regards, Edward J. Yoon
>>>
>>>
>>> -----Original Message-----
>>> From: Behroz Sikander [mailto:[email protected]]
>>> Sent: Monday, June 29, 2015 8:18 AM
>>> To: [email protected]
>>> Subject: Re: Groomserer BSPPeerChild limit
>>>
>>> I will try the things that you mentioned. I am not using the latest version
>>> (0.7.0) because I do not understand YARN yet. It adds extra configuration,
>>> which makes it harder for me to understand when things go wrong. Any
>>> suggestions?
>>>
>>> Further, are there any tools that you use for debugging while in
>>> distributed mode? And is there any way to manage the servers? I mean, right
>>> now I have 3 machines with a lot of configuration files and log files. It
>>> takes a lot of time. This makes me wonder how people who have hundreds of
>>> machines debug and manage their clusters.
>>>
>>> Regards,
>>> Behroz
>>>
>>> On Mon, Jun 29, 2015 at 12:53 AM, Edward J. Yoon <[email protected]>
>>> wrote:
>>>
>>> > Hi,
>>> >
>>> > It looks like a zookeeper connection problem. Please check whether
>>> > zookeeper is running and every task can connect to zookeeper.
>>> >
>>> > I would recommend stopping the firewall during debugging, and please
>>> > use the latest 0.7.0 release.
>>> >
>>> >
>>> > --
>>> > Best Regards, Edward J. Yoon
>>> >
>>> > -----Original Message-----
>>> > From: Behroz Sikander [mailto:[email protected]]
>>> > Sent: Monday, June 29, 2015 7:34 AM
>>> > To: [email protected]
>>> > Subject: Re: Groomserer BSPPeerChild limit
>>> >
>>> > To figure out the issue, I was trying something else and found
>>> > another weird issue. It might be a bug in Hama, but I am not sure. Both of
>>> > the following lines throw an exception.
>>> >
>>> > System.out.println( peer.getPeerName(0)); //Exception
>>> >
>>> > System.out.println( peer.getNumPeers()); //Exception
>>> >
>>> >
>>> > [time] ERROR bsp.BSPTask: *Error running bsp setup and bsp function.*
>>> >
>>> > [time]java.lang.*RuntimeException: All peer names could not be
>>> retrieved!*
>>> >
>>> > at
>>> >
>>> >
>>> org.apache.hama.bsp.sync.ZooKeeperSyncClientImpl.getAllPeerNames(ZooKeeperSyncClientImpl.java:305)
>>> >
>>> > at org.apache.hama.bsp.BSPPeerImpl.initPeerNames(BSPPeerImpl.java:544)
>>> >
>>> > at org.apache.hama.bsp.BSPPeerImpl.getNumPeers(BSPPeerImpl.java:538)
>>> >
>>> > at testHDFS.EVADMMBsp.setup*(EVADMMBsp.java:58)*
>>> >
>>> > at org.apache.hama.bsp.BSPTask.runBSP(BSPTask.java:170)
>>> >
>>> > at org.apache.hama.bsp.BSPTask.run(BSPTask.java:144)
>>> >
>>> > at
>>> org.apache.hama.bsp.GroomServer$BSPPeerChild.main(GroomServer.java:1243)
>>> >
>>> > On Sun, Jun 28, 2015 at 6:45 PM, Behroz Sikander <[email protected]>
>>> > wrote:
>>> >
>>> > > I think I have more information on the issue. I did some debugging and
>>> > > found something quite strange.
>>> > >
>>> > > If I run my job with 6 tasks (3 tasks will run on MACHINE1 and 3 tasks
>>> > > will run on MACHINE2):
>>> > >
>>> > > - The 3 tasks on MACHINE1 are frozen, and the strange thing is that the
>>> > > processes do not even enter the SETUP function of the BSP class. I have
>>> > > print statements in the setup function of the BSP class and it doesn't
>>> > > print anything. I get empty files with zero size.
>>> > >
>>> > > drwxrwxr-x  2 behroz behroz 4096 Jun 28 16:29 .
>>> > > drwxrwxr-x 99 behroz behroz 4096 Jun 28 16:28 ..
>>> > > -rw-rw-r--  1 behroz behroz    0 Jun 28 16:24
>>> > > attempt_201506281624_0001_000000_0.err
>>> > > -rw-rw-r--  1 behroz behroz    0 Jun 28 16:24
>>> > > attempt_201506281624_0001_000000_0.log
>>> > > -rw-rw-r--  1 behroz behroz    0 Jun 28 16:24
>>> > > attempt_201506281624_0001_000001_0.err
>>> > > -rw-rw-r--  1 behroz behroz    0 Jun 28 16:24
>>> > > attempt_201506281624_0001_000001_0.log
>>> > > -rw-rw-r--  1 behroz behroz    0 Jun 28 16:24
>>> > > attempt_201506281624_0001_000002_0.err
>>> > > -rw-rw-r--  1 behroz behroz    0 Jun 28 16:24
>>> > > attempt_201506281624_0001_000002_0.log
>>> > >
>>> > > - On MACHINE2, the code enters the SETUP function of the BSP class and
>>> > > prints stuff. See the size of the files generated as output. How is it
>>> > > possible that 3 tasks can enter BSP and the others cannot?
>>> > >
>>> > > drwxrwxr-x  2 behroz behroz 4096 Jun 28 16:39 .
>>> > > drwxrwxr-x 82 behroz behroz 4096 Jun 28 16:39 ..
>>> > > -rw-rw-r--  1 behroz behroz  659 Jun 28 16:39
>>> > > attempt_201506281639_0001_000003_0.err
>>> > > -rw-rw-r--  1 behroz behroz 1441 Jun 28 16:39
>>> > > attempt_201506281639_0001_000003_0.log
>>> > > -rw-rw-r--  1 behroz behroz  659 Jun 28 16:39
>>> > > attempt_201506281639_0001_000004_0.err
>>> > > -rw-rw-r--  1 behroz behroz 1368 Jun 28 16:39
>>> > > attempt_201506281639_0001_000004_0.log
>>> > > -rw-rw-r--  1 behroz behroz  659 Jun 28 16:39
>>> > > attempt_201506281639_0001_000005_0.err
>>> > > -rw-rw-r--  1 behroz behroz 1441 Jun 28 16:39
>>> > > attempt_201506281639_0001_000005_0.log
>>> > >
>>> > > - Hama Groom log file on MACHINE1 (which is frozen) shows:
>>> > > [time] INFO org.apache.hama.bsp.GroomServer: Task
>>> > > 'attempt_201506281639_0001_000001_0' has started.
>>> > > [time] INFO org.apache.hama.bsp.GroomServer: Launch 3 tasks.
>>> > > [time] INFO org.apache.hama.bsp.GroomServer: Task
>>> > > 'attempt_201506281639_0001_000002_0' has started.
>>> > > [time] INFO org.apache.hama.bsp.GroomServer: Launch 3 tasks.
>>> > > [time] INFO org.apache.hama.bsp.GroomServer: Task
>>> > > 'attempt_201506281639_0001_000000_0' has started.
>>> > >
>>> > > - Hama Groom log file on MACHINE2 shows:
>>> > > [time] INFO org.apache.hama.bsp.GroomServer: Task
>>> > > 'attempt_201506281639_0001_000003_0' has started.
>>> > > [time] INFO org.apache.hama.bsp.GroomServer: Launch 3 tasks.
>>> > > [time] INFO org.apache.hama.bsp.GroomServer: Task
>>> > > 'attempt_201506281639_0001_000004_0' has started.
>>> > > [time] INFO org.apache.hama.bsp.GroomServer: Launch 3 tasks.
>>> > > [time] INFO org.apache.hama.bsp.GroomServer: Task
>>> > > 'attempt_201506281639_0001_000005_0' has started.
>>> > > [time] INFO org.apache.hama.bsp.GroomServer: Task
>>> > > attempt_201506281639_0001_000004_0 is *done*.
>>> > > [time] INFO org.apache.hama.bsp.GroomServer: Task
>>> > > attempt_201506281639_0001_000003_0 is *done*.
>>> > > [time] INFO org.apache.hama.bsp.GroomServer: Task
>>> > > attempt_201506281639_0001_000005_0 is *done*.
>>> > >
>>> > > Any clue what might be going wrong ?
>>> > >
>>> > > Regards,
>>> > > Behroz
>>> > >
>>> > >
>>> > >
>>> > > On Sat, Jun 27, 2015 at 1:13 PM, Behroz Sikander <[email protected]>
>>> > > wrote:
>>> > >
>>> > >> Here is the log file from that folder
>>> > >>
>>> > >> 15/06/27 11:10:34 INFO ipc.Server: Starting Socket Reader #1 for port
>>> > >> 61001
>>> > >> 15/06/27 11:10:34 INFO ipc.Server: IPC Server Responder: starting
>>> > >> 15/06/27 11:10:34 INFO ipc.Server: IPC Server listener on 61001:
>>> > starting
>>> > >> 15/06/27 11:10:34 INFO ipc.Server: IPC Server handler 0 on 61001:
>>> > starting
>>> > >> 15/06/27 11:10:34 INFO ipc.Server: IPC Server handler 1 on 61001:
>>> > starting
>>> > >> 15/06/27 11:10:34 INFO ipc.Server: IPC Server handler 2 on 61001:
>>> > starting
>>> > >> 15/06/27 11:10:34 INFO ipc.Server: IPC Server handler 3 on 61001:
>>> > starting
>>> > >> 15/06/27 11:10:34 INFO message.HamaMessageManagerImpl: BSPPeer
>>> > >> address:b178b33b16cc port:61001
>>> > >> 15/06/27 11:10:34 INFO ipc.Server: IPC Server handler 4 on 61001:
>>> > starting
>>> > >> 15/06/27 11:10:34 INFO sync.ZKSyncClient: Initializing ZK Sync Client
>>> > >> 15/06/27 11:10:34 INFO sync.ZooKeeperSyncClientImpl: Start connecting
>>> to
>>> > >> Zookeeper! At b178b33b16cc/172.17.0.7:61001
>>> > >> 15/06/27 11:10:37 INFO ipc.Server: Stopping server on 61001
>>> > >> 15/06/27 11:10:37 INFO ipc.Server: IPC Server handler 0 on 61001:
>>> > exiting
>>> > >> 15/06/27 11:10:37 INFO ipc.Server: Stopping IPC Server listener on
>>> 61001
>>> > >> 15/06/27 11:10:37 INFO ipc.Server: IPC Server handler 1 on 61001:
>>> > exiting
>>> > >> 15/06/27 11:10:37 INFO ipc.Server: IPC Server handler 2 on 61001:
>>> > exiting
>>> > >> 15/06/27 11:10:37 INFO ipc.Server: Stopping IPC Server Responder
>>> > >> 15/06/27 11:10:37 INFO ipc.Server: IPC Server handler 3 on 61001:
>>> > exiting
>>> > >> 15/06/27 11:10:37 INFO ipc.Server: IPC Server handler 4 on 61001:
>>> > exiting
>>> > >>
>>> > >>
>>> > >> And my console shows the following output. Hama is frozen right now.
>>> > >> 15/06/27 11:10:32 INFO bsp.BSPJobClient: Running job:
>>> > >> job_201506262331_0003
>>> > >> 15/06/27 11:10:35 INFO bsp.BSPJobClient: Current supersteps number: 0
>>> > >> 15/06/27 11:10:38 INFO bsp.BSPJobClient: Current supersteps number: 2
>>> > >>
>>> > >> On Sat, Jun 27, 2015 at 1:07 PM, Edward J. Yoon <
>>> [email protected]>
>>> > >> wrote:
>>> > >>
>>> > >>> Please check the task logs in $HAMA_HOME/logs/tasklogs folder.
>>> > >>>
>>> > >>> On Sat, Jun 27, 2015 at 8:03 PM, Behroz Sikander <[email protected]
>>> >
>>> > >>> wrote:
>>> > >>> > Yea. I also thought that. I ran the program through eclipse with 20
>>> > >>> tasks
>>> > >>> > and it works fine.
>>> > >>> >
>>> > >>> > On Sat, Jun 27, 2015 at 1:00 PM, Edward J. Yoon <
>>> > [email protected]
>>> > >>> >
>>> > >>> > wrote:
>>> > >>> >
>>> > >>> >> > When I run the PI example, it uses 9 tasks and runs fine. When I
>>> > >>> run my
>>> > >>> >> > program with 3 tasks, everything runs fine. But when I increase
>>> > the
>>> > >>> tasks
>>> > >>> >> > (to 4) by using "setNumBspTask". Hama freezes. I do not
>>> understand
>>> > >>> what
>>> > >>> >> can
>>> > >>> >> > go wrong.
>>> > >>> >>
>>> > >>> >> It looks like a program bug. Have you run your program in local
>>> > >>> >> mode?
>>> > >>> >>
>>> > >>> >> On Sat, Jun 27, 2015 at 8:03 AM, Behroz Sikander <
>>> > [email protected]>
>>> > >>> >> wrote:
>>> > >>> >> > Hi,
>>> > >>> >> > In the current thread, I mentioned 3 issues. Issues 1 and 3 are
>>> > >>> >> > resolved, but issue number 2 is still giving me headaches.
>>> > >>> >> >
>>> > >>> >> > My problem:
>>> > >>> >> > My cluster now consists of 3 machines, each of them properly
>>> > >>> >> > configured (apparently). From my master machine, when I start
>>> > >>> >> > Hadoop and Hama, I can see the processes started on the other 2
>>> > >>> >> > machines. If I check the maximum number of tasks that my cluster
>>> > >>> >> > can support, I get 9 (3 tasks on each machine).
>>> > >>> >> >
>>> > >>> >> > When I run the PI example, it uses 9 tasks and runs fine. When I
>>> > >>> >> > run my program with 3 tasks, everything runs fine. But when I
>>> > >>> >> > increase the tasks (to 4) by using "setNumBspTask", Hama freezes.
>>> > >>> >> > I do not understand what can go wrong.
>>> > >>> >> >
>>> > >>> >> > I checked the log files and things look fine. I just sometimes
>>> > >>> >> > get an exception that hama was not able to delete the system
>>> > >>> >> > directory (bsp.system.dir) defined in hama-site.xml.
>>> > >>> >> >
>>> > >>> >> > Any help or clue would be great.
>>> > >>> >> >
>>> > >>> >> > Regards,
>>> > >>> >> > Behroz Sikander
>>> > >>> >> >
>>> > >>> >> > On Thu, Jun 25, 2015 at 1:13 PM, Behroz Sikander <
>>> > >>> [email protected]>
>>> > >>> >> wrote:
>>> > >>> >> >
>>> > >>> >> >> Thank you :)
>>> > >>> >> >>
>>> > >>> >> >> On Thu, Jun 25, 2015 at 12:14 AM, Edward J. Yoon <
>>> > >>> [email protected]
>>> > >>> >> >
>>> > >>> >> >> wrote:
>>> > >>> >> >>
>>> > >>> >> >>> Hi,
>>> > >>> >> >>>
>>> > >>> >> >>> You can get the maximum number of available tasks with the
>>> > >>> >> >>> following code:
>>> > >>> >> >>>
>>> > >>> >> >>>     BSPJobClient jobClient = new BSPJobClient(conf);
>>> > >>> >> >>>     ClusterStatus cluster = jobClient.getClusterStatus(true);
>>> > >>> >> >>>
>>> > >>> >> >>>     // Set to maximum
>>> > >>> >> >>>     bsp.setNumBspTask(cluster.getMaxTasks());
>>> > >>> >> >>>
>>> > >>> >> >>>
>>> > >>> >> >>> On Wed, Jun 24, 2015 at 11:20 PM, Behroz Sikander <
>>> > >>> [email protected]>
>>> > >>> >> >>> wrote:
>>> > >>> >> >>> > Hi,
>>> > >>> >> >>> > 1) Thank you for this.
>>> > >>> >> >>> > 2) Here are the images. I will look into the log files of PI
>>> > >>> example
>>> > >>> >> >>> >
>>> > >>> >> >>> > *Result of JPS command on slave*
>>> > >>> >> >>> >
>>> > >>> >> >>>
>>> > >>> >>
>>> > >>>
>>> >
>>> http://s17.postimg.org/gpwe2bbfj/Screen_Shot_2015_06_22_at_7_23_31_PM.png
>>> > >>> >> >>> >
>>> > >>> >> >>> > *Result of JPS command on Master*
>>> > >>> >> >>> >
>>> > >>> >> >>>
>>> > >>> >>
>>> > >>>
>>> >
>>> http://s14.postimg.org/s9922em5p/Screen_Shot_2015_06_22_at_7_23_42_PM.png
>>> > >>> >> >>> >
>>> > >>> >> >>> > 3) In my current case, I do not have any input submitted to
>>> > the
>>> > >>> job.
>>> > >>> >> >>> During
>>> > >>> >> >>> > run time, I directly fetch data from HDFS. So, I am looking
>>> > for
>>> > >>> >> >>> something
>>> > >>> >> >>> > like BSPJob.set*Max*NumBspTask().
>>> > >>> >> >>> >
>>> > >>> >> >>> > Regards,
>>> > >>> >> >>> > Behroz
>>> > >>> >> >>> >
>>> > >>> >> >>> >
>>> > >>> >> >>> >
>>> > >>> >> >>> > On Tue, Jun 23, 2015 at 12:57 AM, Edward J. Yoon <
>>> > >>> >> [email protected]
>>> > >>> >> >>> >
>>> > >>> >> >>> > wrote:
>>> > >>> >> >>> >
>>> > >>> >> >>> >> Hello,
>>> > >>> >> >>> >>
>>> > >>> >> >>> >> 1) You can get the filesystem URI from a configuration
>>> using
>>> > >>> >> >>> >> "FileSystem fs = FileSystem.get(conf);". Of course, the
>>> > >>> fs.defaultFS
>>> > >>> >> >>> >> property should be in hama-site.xml
>>> > >>> >> >>> >>
>>> > >>> >> >>> >>   <property>
>>> > >>> >> >>> >>     <name>fs.defaultFS</name>
>>> > >>> >> >>> >>     <value>hdfs://host1.mydomain.com:9000/</value>
>>> > >>> >> >>> >>     <description>
>>> > >>> >> >>> >>       The name of the default file system. Either the
>>> literal
>>> > >>> string
>>> > >>> >> >>> >>       "local" or a host:port for HDFS.
>>> > >>> >> >>> >>     </description>
>>> > >>> >> >>> >>   </property>
>>> > >>> >> >>> >>
>>> > >>> >> >>> >> 2) The 'bsp.tasks.maximum' is the number of tasks per node.
>>> > >>> >> >>> >> It looks like a cluster configuration issue. Please run the
>>> > >>> >> >>> >> Pi example and look at the logs for more details. NOTE: you
>>> > >>> >> >>> >> cannot attach images to the mailing list, so I can't see them.
>>> > >>> >> >>> >>
>>> > >>> >> >>> >> 3) You can use the BSPJob.setNumBspTask(int) method. If
>>> input
>>> > >>> is
>>> > >>> >> >>> >> provided, the number of BSP tasks is basically driven by
>>> the
>>> > >>> number
>>> > >>> >> of
>>> > >>> >> >>> >> DFS blocks. I'll fix it to be more flexible on HAMA-956.
>>> > >>> >> >>> >>
>>> > >>> >> >>> >> Thanks!
>>> > >>> >> >>> >>
>>> > >>> >> >>> >>
>>> > >>> >> >>> >> On Tue, Jun 23, 2015 at 2:33 AM, Behroz Sikander <
>>> > >>> >> [email protected]>
>>> > >>> >> >>> >> wrote:
>>> > >>> >> >>> >> > Hi,
>>> > >>> >> >>> >> > Recently, I moved from a single-machine setup to a
>>> > >>> >> >>> >> > 2-machine setup. I was successfully able to run my job that
>>> > >>> >> >>> >> > uses HDFS to get data. I have 3 trivial questions:
>>> > >>> >> >>> >> >
>>> > >>> >> >>> >> > 1- To access HDFS, I have to manually give the IP address
>>> > >>> >> >>> >> > of the server running HDFS. I thought that Hama would
>>> > >>> >> >>> >> > automatically pick it up from the configuration, but it
>>> > >>> >> >>> >> > does not. I am probably doing something wrong. Right now my
>>> > >>> >> >>> >> > code works by using the following:
>>> > >>> >> >>> >> >
>>> > >>> >> >>> >> > FileSystem fs = FileSystem.get(new
>>> > >>> URI("hdfs://server_ip:port/"),
>>> > >>> >> >>> conf);
>>> > >>> >> >>> >> >
>>> > >>> >> >>> >> > 2- On my master server, when I start Hama it automatically
>>> > >>> >> >>> >> > starts Hama on the slave machine (all good). Both master
>>> > >>> >> >>> >> > and slave are set as groomservers. This means that I have 2
>>> > >>> >> >>> >> > servers to run my job, which means that I can open more
>>> > >>> >> >>> >> > BSPPeerChild processes. If I submit my jar with 3 bsp
>>> > >>> >> >>> >> > tasks, everything works fine. But when I move to 4 tasks,
>>> > >>> >> >>> >> > Hama freezes. Here is the result of the JPS command on the
>>> > >>> >> >>> >> > slave.
>>> > >>> >> >>> >> >
>>> > >>> >> >>> >> >
>>> > >>> >> >>> >> > Result of JPS command on Master
>>> > >>> >> >>> >> >
>>> > >>> >> >>> >> >
>>> > >>> >> >>> >> >
>>> > >>> >> >>> >> > You can see that it is only opening tasks on the slaves,
>>> > >>> >> >>> >> > but not on the master.
>>> > >>> >> >>> >> >
>>> > >>> >> >>> >> > Note: I tried to change the bsp.tasks.maximum property in
>>> > >>> >> >>> >> > hama-default.xml to 4, but still the same result.
>>> > >>> >> >>> >> >
>>> > >>> >> >>> >> > 3- I want my cluster to open as many BSPPeerChild
>>> > >>> >> >>> >> > processes as possible. Is there any setting I can use to
>>> > >>> >> >>> >> > achieve that? Or does Hama pick up the values from
>>> > >>> >> >>> >> > hama-default.xml to open tasks?
>>> > >>> >> >>> >> >
>>> > >>> >> >>> >> >
>>> > >>> >> >>> >> > Regards,
>>> > >>> >> >>> >> >
>>> > >>> >> >>> >> > Behroz Sikander
>>> > >>> >> >>> >>
>>> > >>> >> >>> >>
>>> > >>> >> >>> >>
>>> > >>> >> >>> >> --
>>> > >>> >> >>> >> Best Regards, Edward J. Yoon
>>> > >>> >> >>> >>
>>> > >>> >> >>>
>>> > >>> >> >>>
>>> > >>> >> >>>
>>> > >>> >> >>> --
>>> > >>> >> >>> Best Regards, Edward J. Yoon
>>> > >>> >> >>>
>>> > >>> >> >>
>>> > >>> >> >>
>>> > >>> >>
>>> > >>> >>
>>> > >>> >>
>>> > >>> >> --
>>> > >>> >> Best Regards, Edward J. Yoon
>>> > >>> >>
>>> > >>>
>>> > >>>
>>> > >>>
>>> > >>> --
>>> > >>> Best Regards, Edward J. Yoon
>>> > >>>
>>> > >>
>>> > >>
>>> > >
>>> >
>>> >
>>> >
>>>
>>>
>>>
>
>
>
> --
> Best Regards, Edward J. Yoon



-- 
Best Regards, Edward J. Yoon
