Re: The number of splits has exceeded the number of max tasks

Edward J. Yoon Sun, 09 Mar 2014 17:27:07 -0700

The GroomServer (slave node) spawns a new JVM for each BSP task. The
"bsp.tasks.maximum" property is the maximum number of tasks per slave node.


Unlike MapReduce tasks, all BSP tasks of a job must run at the same time,
so the number of tasks must not exceed the cluster's capacity. You should
either change the block size so that the input yields fewer splits, or
increase the cluster's capacity. If you have a lot of memory, you can
increase the number of tasks per node.
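
As a rough illustration of the sizing rule above, here is a minimal sketch
in Java. It assumes one BSP task per input split (as the error message in
this thread suggests) and that cluster capacity is simply the number of
groom servers multiplied by "bsp.tasks.maximum" (consistent with the web UI
figures quoted below: 4 groom servers, 30 tasks per node, capacity 120).
The class and method names are hypothetical and are not part of the Hama
API.

```java
// Hypothetical helper, not part of Hama: back-of-the-envelope sizing only.
public class BspCapacity {

    /** Total task slots, assumed to be groom servers times bsp.tasks.maximum. */
    public static int clusterCapacity(int groomServers, int tasksPerNode) {
        return groomServers * tasksPerNode;
    }

    /** Smallest bsp.tasks.maximum that lets all splits run at once. */
    public static int minTasksPerNode(int splits, int groomServers) {
        return (splits + groomServers - 1) / groomServers; // ceiling division
    }

    /** Smallest number of groom servers for a fixed bsp.tasks.maximum. */
    public static int minGroomServers(int splits, int tasksPerNode) {
        return (splits + tasksPerNode - 1) / tasksPerNode; // ceiling division
    }

    public static void main(String[] args) {
        int splits = 52;       // from the error message in this thread
        int groomServers = 4;  // from the web UI output in this thread
        int tasksPerNode = 4;  // the value that made the Pi example work

        int capacity = clusterCapacity(groomServers, tasksPerNode);
        System.out.println("capacity           = " + capacity);
        System.out.println("job fits           = " + (splits <= capacity));
        System.out.println("min tasks per node = "
                + minTasksPerNode(splits, groomServers));
        System.out.println("min groom servers  = "
                + minGroomServers(splits, tasksPerNode));
    }
}
```

With these numbers the job does not fit: 52 splits need at least 13 tasks
per node on 4 machines, or 13 machines at 4 tasks per node. Whether a node
can really host that many tasks depends on its memory, since each task is a
separate JVM.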


On Sun, Mar 9, 2014 at 9:41 PM, Ammar Sahib <[email protected]> wrote:
> Hi, I changed bsp.tasks.maximum in hama-default.xml to 4 as follows:
>
> nano conf/hama-default.xml
>
> <property>
>     <name>bsp.tasks.maximum</name>
>     <value>4</value>
> </property>
>
> Now the pi estimation example is working fine:
> hadoop@c1-master:/usr/local/hama$ bin/hama jar hama-examples-0.6.4.jar pi
> 14/03/09 13:37:07 INFO bsp.BSPJobClient: Running job: job_201403091336_0001
> 14/03/09 13:37:10 INFO bsp.BSPJobClient: Current supersteps number: 0
> 14/03/09 13:37:28 INFO bsp.BSPJobClient: Current supersteps number: 1
> 14/03/09 13:37:28 INFO bsp.BSPJobClient: The total number of supersteps: 1
> 14/03/09 13:37:28 INFO bsp.BSPJobClient: Counters: 7
> 14/03/09 13:37:28 INFO bsp.BSPJobClient:   org.apache.hama.bsp.JobInProgress$JobCounter
> 14/03/09 13:37:28 INFO bsp.BSPJobClient:     SUPERSTEPS=1
> 14/03/09 13:37:28 INFO bsp.BSPJobClient:     LAUNCHED_TASKS=16
> 14/03/09 13:37:28 INFO bsp.BSPJobClient:   org.apache.hama.bsp.BSPPeerImpl$PeerCounter
> 14/03/09 13:37:28 INFO bsp.BSPJobClient:     SUPERSTEP_SUM=16
> 14/03/09 13:37:28 INFO bsp.BSPJobClient:     MESSAGE_BYTES_TRANSFERED=128
> 14/03/09 13:37:28 INFO bsp.BSPJobClient:     TIME_IN_SYNC_MS=3963
> 14/03/09 13:37:28 INFO bsp.BSPJobClient:     TOTAL_MESSAGES_SENT=16
> 14/03/09 13:37:28 INFO bsp.BSPJobClient:     TOTAL_MESSAGES_RECEIVED=16
> Estimated value of PI is        3.141225
> Job Finished in 22.094 seconds
>
>
> Can you advise how to choose the value of bsp.tasks.maximum? How can I
> know how many machines are required to handle my data size?
>
>
>
>
> On Sunday, March 9, 2014 1:29 PM, Edward J. Yoon <[email protected]> 
> wrote:
>
> Reduce the max tasks per node from 30 to about 3-5.
>
> Before asking, please try to understand Java and the related basics.
>
> Sent from my iPhone
>
>> On 2014. 3. 9., at 9:24 PM, Ammar Sahib <[email protected]> wrote:
>>
>> Please find the attached snapshot. I ran the pi estimation example but it
>> failed:
>> 14/03/09 13:22:16 INFO bsp.BSPJobClient: Running job: job_201403091217_0004
>> 14/03/09 13:22:19 INFO bsp.BSPJobClient: Current supersteps number: 0
>> attempt_201403091217_0004_000085_0: 14/03/09 13:22:41 INFO ipc.Server: Starting Socket Reader #1 for port 61002
>> attempt_201403091217_0004_000085_0: 14/03/09 13:22:42 INFO ipc.Server: IPC Server Responder: starting
>> attempt_201403091217_0004_000085_0: 14/03/09 13:22:42 INFO ipc.Server: IPC Server handler 0 on 61002: starting
>> attempt_201403091217_0004_000085_0: 14/03/09 13:22:42 INFO ipc.Server: IPC Server listener on 61002: starting
>> attempt_201403091217_0004_000085_0: 14/03/09 13:22:42 INFO ipc.Server: IPC Server handler 1 on 61002: starting
>> attempt_201403091217_0004_000085_0: 14/03/09 13:22:42 INFO message.HamaMessageManagerImpl: BSPPeer address:slave3 port:61002
>> attempt_201403091217_0004_000085_0: 14/03/09 13:22:42 INFO ipc.Server: IPC Server handler 3 on 61002: starting
>> attempt_201403091217_0004_000085_0: 14/03/09 13:22:42 INFO ipc.Server: IPC Server handler 2 on 61002: starting
>> attempt_201403091217_0004_000085_0: 14/03/09 13:22:42 INFO ipc.Server: IPC Server handler 4 on 61002: starting
>> attempt_201403091217_0004_000085_0: 14/03/09 13:22:45 INFO sync.ZKSyncClient: Initializing ZK Sync Client
>> attempt_201403091217_0004_000085_0: 14/03/09 13:22:45 INFO sync.ZooKeeperSyncClientImpl: Start connecting to Zookeeper! At slave3/10.255.255.21:61002
>> 14/03/09 13:24:01 INFO bsp.BSPJobClient: Job failed.
>>
>>
>>
>> On Sunday, March 9, 2014 1:06 PM, Edward J. Yoon <[email protected]> 
>> wrote:
>> What's the value of "bsp.max.tasks.per.job"?
>>
>> Please try running the Pi example and check how many tasks are launched.
>>
>> On Sun, Mar 9, 2014 at 8:24 PM, Ammar Sahib <[email protected]> wrote:
>> > I made sure to restart Hama. According to the web UI, I have the following
>> > in my cluster:
>> >
>> >
>> > master:40000 Hama BSP Administration
>> > State: RUNNING
>> > Started: Sun Mar 09 12:17:01 CET 2014
>> > Version: 0.6.4
>> > Compiled By: edward
>> > Compiled At Time: Mon Mar  3 19:14:32 KST 2014
>> > Identifier: 201403091217
>> > ________________________________
>> >
>> > Groom Servers | BSP Task Capacity | Avg. Tasks/Node | Blacklisted Nodes
>> > 4             | 120               | 30.00           | 0
>> > ________________________________
>> >
>> > Running Jobs
>> > No jobs found!
>> > ________________________________
>> >
>> > All Jobs History
>> > No jobs found!
>> > ________________________________
>> >  Hama, 2014.
>> >
>> >
>> > But when running my job I see something different:
>> >
>> > 14/03/09 12:19:21 INFO bsp.FileInputFormat: Total input paths to process : 1
>> > 14/03/09 12:19:22 INFO util.NativeCodeLoader: Loaded the native-hadoop library
>> > 14/03/09 12:19:22 WARN snappy.LoadSnappy: Snappy native library not loaded
>> > 14/03/09 12:19:22 INFO bsp.FileInputFormat: Total input paths to process : 1
>> > Exception in thread "main" java.io.IOException: Job failed! The number of splits has exceeded the number of max tasks. The number of splits: 52, The number of max tasks: 20
>> >        at org.apache.hama.bsp.BSPJobClient.submitJobInternal(BSPJobClient.java:349)
>> >        at org.apache.hama.bsp.BSPJobClient.submitJob(BSPJobClient.java:296)
>> >        at org.apache.hama.bsp.BSPJob.submit(BSPJob.java:219)
>> >        at org.apache.hama.bsp.BSPJob.waitForCompletion(BSPJob.java:226)
>> >        at org.apache.hama.bsp.BSPJobClient.partition(BSPJobClient.java:460)
>> >        at org.apache.hama.bsp.BSPJobClient.submitJobInternal(BSPJobClient.java:341)
>> >        at org.apache.hama.bsp.BSPJobClient.submitJob(BSPJobClient.java:296)
>> >        at org.apache.hama.bsp.BSPJob.submit(BSPJob.java:219)
>> >        at org.apache.hama.graph.GraphJob.submit(GraphJob.java:208)
>> >        at org.apache.hama.bsp.BSPJob.waitForCompletion(BSPJob.java:226)
>> >        at de.rwthaachen.dbis.i5cloudmatch.controller.Matcher.main(Matcher.java:479)
>> >        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>> >        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>> >        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>> >        at java.lang.reflect.Method.invoke(Method.java:606)
>> >        at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
>> >
>> >
>> >
>> >
>> >
>> > On Sunday, March 9, 2014 12:23 AM, Edward J. Yoon <[email protected]> 
>> > wrote:
>> >
>> > Please use the web UI to check the cluster capacity.
>> >
>> > I think your cluster is not working correctly now, or you didn't
>> > restart your cluster.
>> >
>> > On Sun, Mar 9, 2014 at 2:00 AM, Ammar Sahib <[email protected]> wrote:
>> >> Hi
>> >>
>> >> I tried to increase the bsp.tasks.maximum in hama-default.xml from 3 to 30
>> >> but I still get the same error.
>> >> I am thinking of reducing the number of blocks of the input file by
>> >> controlling the parameter dfs.namenode.fs-limits.min-block-size in
>> >> hdfs-default.xml. Do you think this might be a good approach that might
>> >> solve the problem?
>> >>
>> >>
>> >>
>> >> On Friday, March 7, 2014 11:32 PM, Edward J. Yoon <[email protected]>
>> >> wrote:
>> >> If the number of blocks of the input file is 52 (see [1]), you should
>> >> increase the number of task slots by adding new machines or by increasing
>> >> the max number of tasks per node, "bsp.tasks.maximum".
>> >>
>> >> 1.
>> >> http://stackoverflow.com/questions/11168427/viewing-the-number-of-blocks-for-a-file-in-hadoop
>> >>
>> >> On Sat, Mar 8, 2014 at 12:22 AM, Ammar Sahib <[email protected]> 
>> >> wrote:
>> >>> Hi
>> >>>
>> >>>
>> >>> I am using HAMA 0.6.4 and I am running my custom program using a cluster
>> >>> of 4 machines. My input is a single file and I am setting the number of
>> >>> BSP tasks to the number of Groom servers by using
>> >>> JOB.setNumBspTask(cluster.getGroomServers()). I am using the
>> >>> HashPartitioner.class to partition the data.
>> >>>
>> >>>
>> >>> I have a problem when I load my data. When I run my custom program I get
>> >>> the following error messages:
>> >>>
>> >>> 14/03/07 16:02:34 INFO bsp.FileInputFormat: Total input paths to process : 1
>> >>> 14/03/07 16:02:34 INFO util.NativeCodeLoader: Loaded the native-hadoop library
>> >>> 14/03/07 16:02:34 WARN snappy.LoadSnappy: Snappy native library not loaded
>> >>> 14/03/07 16:02:34 INFO bsp.FileInputFormat: Total input paths to process : 1
>> >>> Exception in thread "main" java.io.IOException: Job failed! The number of splits has exceeded the number of max tasks. The number of splits: 52, The number of max tasks: 20
>> >>>        at org.apache.hama.bsp.BSPJobClient.submitJobInternal(BSPJobClient.java:349)
>> >>>        at org.apache.hama.bsp.BSPJobClient.submitJob(BSPJobClient.java:296)
>> >>>        at org.apache.hama.bsp.BSPJob.submit(BSPJob.java:219)
>> >>>        at org.apache.hama.bsp.BSPJob.waitForCompletion(BSPJob.java:226)
>> >>>        at org.apache.hama.bsp.BSPJobClient.partition(BSPJobClient.java:460)
>> >>>        at org.apache.hama.bsp.BSPJobClient.submitJobInternal(BSPJobClient.java:341)
>> >>>        at org.apache.hama.bsp.BSPJobClient.submitJob(BSPJobClient.java:296)
>> >>>        at org.apache.hama.bsp.BSPJob.submit(BSPJob.java:219)
>> >>>        at org.apache.hama.graph.GraphJob.submit(GraphJob.java:208)
>> >>>        at org.apache.hama.bsp.BSPJob.waitForCompletion(BSPJob.java:226)
>> >>>        at de.rwthaachen.dbis.i5cloudmatch.controller.Matcher.main(Matcher.java:479)
>> >>>        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>> >>>        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>> >>>        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>> >>>        at java.lang.reflect.Method.invoke(Method.java:606)
>> >>>        at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
>> >>>
>> >>> Any advice on how to solve this problem?
>> >>>
>> >>> Regards,
>> >>> Ammar
>> >>
>> >>
>> >>
>> >> --
>> >> Edward J. Yoon (@eddieyoon)
>> >> Chief Executive Officer
>> >> DataSayer, Inc.
>> >>
>> >>
>> >
>> >
>> >
>> > --
>> > Edward J. Yoon (@eddieyoon)
>> > Chief Executive Officer
>> > DataSayer, Inc.
>>
>>
>>
>> --
>> Edward J. Yoon (@eddieyoon)
>> Chief Executive Officer
>> DataSayer, Inc.
>>
>>



-- 
Edward J. Yoon (@eddieyoon)
Chief Executive Officer
DataSayer, Inc.
