Re: Benchmark Failure

2014-03-22 Thread Lixiang Ao
Checked the logs, and it turned out to be a configuration problem. Setting
dfs.namenode.fs-limits.min-block-size to 1 fixed it.
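For reference, a minimal sketch of that override in hdfs-site.xml (the 1-byte
minimum is only to let NNBench's tiny files through; see the warning later in
this thread about not leaving it in place):

<!-- hdfs-site.xml on the NameNode: benchmarking-only override -->
<property>
  <name>dfs.namenode.fs-limits.min-block-size</name>
  <value>1</value>
</property>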

Thanks.


On Wed, Mar 19, 2014 at 2:51 PM, Brahma Reddy Battula <
brahmareddy.batt...@huawei.com> wrote:

>  This seems to be a known issue that has already been logged. Please check the
> following JIRA; hopefully you are facing the same issue:
>
>
>
> https://issues.apache.org/jira/browse/HDFS-4929
>
>
>
>
>
>
>
> Thanks & Regards
>
>
>
> Brahma Reddy Battula
>
>
>   --
> *From:* Lixiang Ao [aolixi...@gmail.com]
> *Sent:* Tuesday, March 18, 2014 10:34 AM
> *To:* user@hadoop.apache.org
> *Subject:* Re: Benchmark Failure
>
>   the version is release 2.2.0
> On March 18, 2014, at 12:26 AM, "Lixiang Ao" wrote:
>
>>  Hi all,
>>
>>  I'm running the jobclient tests (on a single node); other tests like
>> TestDFSIO and mrbench succeed, but nnbench fails.
>>
>>  I got a lot of exceptions, but without any explanation (see below).
>>
>>  Could anyone tell me what might have gone wrong?
>>
>>  Thanks!
>>
>>
>>  14/03/17 23:54:22 INFO hdfs.NNBench: Waiting in barrier for: 112819 ms
>> 14/03/17 23:54:23 INFO mapreduce.Job: Job job_local2133868569_0001
>> running in uber mode : false
>> 14/03/17 23:54:23 INFO mapreduce.Job:  map 0% reduce 0%
>> 14/03/17 23:54:28 INFO mapred.LocalJobRunner: hdfs://
>> 0.0.0.0:9000/benchmarks/NNBench-aolx-PC/control/NNBench_Controlfile_10:0+125>
>>  map
>> 14/03/17 23:54:29 INFO mapreduce.Job:  map 6% reduce 0%
>> 14/03/17 23:56:15 INFO hdfs.NNBench: Exception recorded in op:
>> Create/Write/Close
>> 14/03/17 23:56:15 INFO hdfs.NNBench: Exception recorded in op:
>> Create/Write/Close
>> 14/03/17 23:56:15 INFO hdfs.NNBench: Exception recorded in op:
>> Create/Write/Close
>> 14/03/17 23:56:15 INFO hdfs.NNBench: Exception recorded in op:
>> Create/Write/Close
>> 14/03/17 23:56:15 INFO hdfs.NNBench: Exception recorded in op:
>> Create/Write/Close
>> 14/03/17 23:56:15 INFO hdfs.NNBench: Exception recorded in op:
>> Create/Write/Close
>> 14/03/17 23:56:15 INFO hdfs.NNBench: Exception recorded in op:
>> Create/Write/Close
>> 14/03/17 23:56:15 INFO hdfs.NNBench: Exception recorded in op:
>> Create/Write/Close
>> (1000 Exceptions)
>> .
>> .
>> .
>> results:
>>
>>  File System Counters
>> FILE: Number of bytes read=18769411
>> FILE: Number of bytes written=21398315
>> FILE: Number of read operations=0
>> FILE: Number of large read operations=0
>> FILE: Number of write operations=0
>> HDFS: Number of bytes read=11185
>> HDFS: Number of bytes written=19540
>> HDFS: Number of read operations=325
>> HDFS: Number of large read operations=0
>> HDFS: Number of write operations=13210
>> Map-Reduce Framework
>> Map input records=12
>> Map output records=95
>> Map output bytes=1829
>> Map output materialized bytes=2091
>> Input split bytes=1538
>> Combine input records=0
>> Combine output records=0
>> Reduce input groups=8
>> Reduce shuffle bytes=0
>> Reduce input records=95
>> Reduce output records=8
>> Spilled Records=214
>> Shuffled Maps =0
>> Failed Shuffles=0
>> Merged Map outputs=0
>> GC time elapsed (ms)=211
>> CPU time spent (ms)=0
>> Physical memory (bytes) snapshot=0
>> Virtual memory (bytes) snapshot=0
>> Total committed heap usage (bytes)=4401004544
>> File Input Format Counters
>> Bytes Read=1490
>> File Output Format Counters
>> Bytes Written=170
>>  14/03/17 23:56:18 INFO hdfs.NNBench: -- NNBench
>> -- :
>> 14/03/17 23:56:18 INFO hdfs.NNBench:
>>  Version: NameNode Benchmark 0.4
>> 14/03/17 23:56:18 INFO hdfs.NNBench:Date &
>> time: 2014-03-17 23:56:18,619
>> 14/03/17 23:56:18 INFO hdfs.NNBench:
>> 14/03/17 23:56:18 INFO hdfs.NNBench: Test
>> Operation: create_write
>> 14/03/17 23:56:18 INFO hdfs.NNBench: Start
>> time: 2014-03-17 23:56:15,521
>> 14/03/17 23:56:18 INFO hdfs.NNBench:Maps to
>> run: 12
>> 14/03/17 23:56:18 INFO hdfs.NNBench: Reduces to
>> run: 6
>> 14/03/17 23:56:18 INFO hdfs.NNBench: Block Size
>> (bytes): 1
>> 14/03/17 23:56:18 INFO hdfs.NNBench: Bytes to
>> write: 0
>> 14/03/17 23:56:18 INFO hdfs.NNBench: Bytes per
>> checksum: 1
>> 14/03/17 23:56:18 INFO hdfs.NNBench:Number of
>> files: 1000
>> 14/03/17 23:56:18 INFO hdfs.NNBench: Replication
>> factor: 3
>> 14/03/17 23:56:18 INFO hdfs.NNBench: Successful file
>> operations: 0
>> 14/03/17 23:56:18 INFO hdfs.NNBench:
>> 14/03/17 23:56:18 INFO hdfs.NNBench: # maps that missed the
>> barrier: 11
>> 14/03/17 23:56:18 INFO hdfs.NNBench:   #
>> exceptions: 1000
>> 14/03/17 23:56:18 INFO hdfs.NNBench:
>> 14/03/17 23:56:18 INFO hdfs.NNBench:TPS:
>> Create/Write/Close: 0
>> 14/03/17 23:56:18 INFO hdfs.NNBench: Avg exec time (ms):
>> Create/Write/Close: Infinity
>> 14/03/17 23:56:18 INFO hdfs.NNBench:   

Yarn MapReduce Job Issue - AM Container launch error in Hadoop 2.3.0

2014-03-22 Thread Tony Mullins
Hi,

I have set up a 2-node cluster of Hadoop 2.3.0. It's working fine and I can
successfully run the distributedshell-2.2.0.jar example. But when I try to run
any MapReduce job I get an error. I have set up MapRed.xml and the other configs
for running MapReduce jobs according to (
http://www.alexjf.net/blog/distributed-systems/hadoop-yarn-installation-definitive-guide),
but I am getting the following error:

14/03/22 20:31:17 INFO mapreduce.Job: Job job_1395502230567_0001 failed
with state FAILED due to: Application application_1395502230567_0001 failed
2 times due to AM Container for appattempt_1395502230567_0001_02 exited
with exitCode: 1 due to: Exception from container-launch:
org.apache.hadoop.util.Shell$ExitCodeException:
org.apache.hadoop.util.Shell$ExitCodeException: at
org.apache.hadoop.util.Shell.runCommand(Shell.java:505) at
org.apache.hadoop.util.Shell.run(Shell.java:418) at
org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:650)
at
org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:195)
at
org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:283)
at
org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:79)
at java.util.concurrent.FutureTask.run(FutureTask.java:262) at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:744)

Container exited with a non-zero exit code 1
.Failing this attempt.. Failing the application.
14/03/22 20:31:17 INFO mapreduce.Job: Counters: 0
Job ended: Sat Mar 22 20:31:17 PKT 2014
The job took 6 seconds.

And if I look at stderr (the job's log), there is only one line:

*"Could not find or load main class 614"*

Now I have googled it, and usually this issue comes up when you have different
Java versions or when the classpath in yarn-site.xml is not set properly. My
yarn-site.xml has this:



<property>
  <name>yarn.application.classpath</name>
  <value>/opt/yarn/hadoop-2.3.0/etc/hadoop,/opt/yarn/hadoop-2.3.0/*,/opt/yarn/hadoop-2.3.0/lib/*,/opt/yarn/hadoop-2.3.0/*,/opt/yarn/hadoop-2.3.0/lib/*,/opt/yarn/hadoop-2.3.0/*,/opt/yarn/hadoop-2.3.0/lib/*,/opt/yarn/hadoop-2.3.0/*,/opt/yarn/hadoop-2.3.0/lib/*</value>
</property>

So, any other ideas on what the issue could be here?

I am running my mapreduce job like this:

$HADOOP_PREFIX/bin/hadoop jar
$HADOOP_PREFIX/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.2.0.jar
randomwriter out

Thanks, Tony


Re: The reduce copier failed

2014-03-22 Thread Mahmood Naderan
I'm really stuck at this step. I have tested with a smaller data set and it works. Now
I am using the Wikipedia articles (46 GB) in 600 chunks (64 MB each).

I have set the number of mappers and reducers to 1 to ensure consistency, and I am
running on a local node. Why doesn't the reducer report anything within 600
seconds?
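The 600 seconds here is the MapReduce task timeout: a task that neither makes
progress nor reports status for that long is killed. If a task genuinely needs
more time between progress reports, one possible workaround is to raise the
timeout in mapred-site.xml; the sketch below assumes the MR1-style property
name (which matches the mapred.JobClient output in the log that follows) and a
value in milliseconds:

<!-- mapred-site.xml: illustrative only; raises the task timeout from the
     default 600000 ms (10 minutes) to 30 minutes -->
<property>
  <name>mapred.task.timeout</name>
  <value>1800000</value>
</property>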


14/03/22 15:00:51 INFO mapred.JobClient:  map 15% reduce 5%
14/03/22 15:18:43 INFO mapred.JobClient:  map 16% reduce 5%
14/03/22 15:46:38 INFO mapred.JobClient: Task Id : 
attempt_201403212248_0002_m_000118_0, Status : FAILED
Task attempt_201403212248_0002_m_000118_0 failed to report status for 600 
seconds. Killing!
14/03/22 15:48:54 INFO mapred.JobClient:  map 17% reduce 5%
14/03/22 16:06:32 INFO mapred.JobClient:  map 18% reduce 5%
14/03/22 16:07:08 INFO mapred.JobClient:  map 18% reduce 6%
14/03/22 16:24:09 INFO mapred.JobClient:  map 19% reduce 6%
14/03/22 16:41:58 INFO mapred.JobClient:  map 20% reduce 6%
14/03/22 16:55:13 INFO mapred.JobClient: Task Id : 
attempt_201403212248_0002_r_00_0, Status : FAILED
java.io.IOException: Task: attempt_201403212248_0002_r_00_0 - The reduce 
copier failed
    at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:390)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
    at org.apache.hadoop.mapred.Child.main(Child.java:249)
Caused by: org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not 
find any valid local directory for 
file:/tmp/hadoop-hadoop/mapred/local/taskTracker/hadoop/jobcache/job_201403212248_0002/attempt_201403212248_0002_r_00_0/output/map_107.out
    at 
org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathForWrite(LocalDirAllocator.java:381)
    at 
org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:146)
    at 
org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:127)
    at 
org.apache.hadoop.mapred.ReduceTask$ReduceCopier$LocalFSMerger.run(ReduceTask.java:2690)

attempt_201403212248_0002_r_00_0: log4j:WARN No appenders could be found 
for logger (org.apache.hadoop.mapred.Task).
attempt_201403212248_0002_r_00_0: log4j:WARN Please initialize the log4j 
system properly.
14/03/22 16:55:15 INFO mapred.JobClient:  map 20% reduce 0%
14/03/22 16:55:34 INFO mapred.JobClient:  map 20% reduce 1%





 
Regards,
Mahmood



On Saturday, March 22, 2014 10:27 AM, Mahmood Naderan  
wrote:
 
Again I got the same error and it says

The reducer copier failed
...
could not find any valid local directory for file 
/tmp/hadoop-hadoop/map_150.out

Searching the web suggests that I have to clean up the /tmp/hadoop-hadoop folder,
but the total size of this folder is only 800 KB, with 1100 files. Does that really
matter?
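For what it's worth, that error usually is not about the folder's current size:
it means none of the directories configured in mapred.local.dir had enough free
space left for the intermediate map output the reducer was trying to merge. A
sketch of moving the local dir off /tmp to a larger partition (the /data path
below is purely illustrative):

<!-- mapred-site.xml: illustrative path; point intermediate data at a
     partition with enough free space for the map outputs -->
<property>
  <name>mapred.local.dir</name>
  <value>/data/hadoop/mapred/local</value>
</property>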


 
Regards,
Mahmood



On Friday, March 21, 2014 3:52 PM, Mahmood Naderan  wrote:
 
OK, it seems that there was a free disk space issue.
I have freed up more space and am running it again.


 
Regards,
Mahmood



On Friday, March 21, 2014 11:43 AM, shashwat shriparv 
 wrote:
 
Check whether the tmp dir, remaining HDFS space, or the log directory is getting
filled up while this job runs.

On Fri, Mar 21, 2014 at 12:11 PM, Mahmood Naderan  wrote:

that imply a *retry* process? Or I have to be wo




Warm Regards_∞_
Shashwat Shriparv

Re: Yarn MapReduce Job Issue - AM Container launch error in Hadoop 2.3.0

2014-03-22 Thread Vinod Kumar Vavilapalli
What is "614" here?

The other relevant thing to check is the MapReduce specific config 
mapreduce.application.classpath.

+Vinod

On Mar 22, 2014, at 9:03 AM, Tony Mullins  wrote:

> Hi,
> 
> I have setup a 2 node cluster of Hadoop 2.3.0. Its working fine and I can 
> successfully run distributedshell-2.2.0.jar example. But when I try to run 
> any mapreduce job I get error. I have setup MapRed.xml and other configs for 
> running MapReduce job according to 
> (http://www.alexjf.net/blog/distributed-systems/hadoop-yarn-installation-definitive-guide)
>  but I am getting following error :
> 
> 14/03/22 20:31:17 INFO mapreduce.Job: Job job_1395502230567_0001 failed with 
> state FAILED due to: Application application_1395502230567_0001 failed 2 
> times due to AM Container for appattempt_1395502230567_0001_02 exited 
> with exitCode: 1 due to: Exception from container-launch: 
> org.apache.hadoop.util.Shell$ExitCodeException: 
> org.apache.hadoop.util.Shell$ExitCodeException: at 
> org.apache.hadoop.util.Shell.runCommand(Shell.java:505) at 
> org.apache.hadoop.util.Shell.run(Shell.java:418) at 
> org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:650) at 
> org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:195)
>  at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:283)
>  at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:79)
>  at java.util.concurrent.FutureTask.run(FutureTask.java:262) at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>  at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>  at java.lang.Thread.run(Thread.java:744)
> 
> Container exited with a non-zero exit code 1
> .Failing this attempt.. Failing the application.
> 14/03/22 20:31:17 INFO mapreduce.Job: Counters: 0
> Job ended: Sat Mar 22 20:31:17 PKT 2014
> The job took 6 seconds.
> And if look at stderr (log of job) there is only one line 
> 
> "Could not find or load main class 614"
> 
> Now I have googled it and usually this issues comes when you have different 
> JAVA versions or in yarn-site.xml classpath is not properly set , my 
> yarn-site.xml has this
> 
> 
> 
> yarn.application.classpath
> 
> /opt/yarn/hadoop-2.3.0/etc/hadoop,/opt/yarn/hadoop-2.3.0/*,/opt/yarn/hadoop-2.3.0/lib/*,/opt/yarn/hadoop-2.3.0/*,/opt/yarn/hadoop-2.3.0/lib/*,/opt/yarn/hadoop-2.3.0/*,/opt/yarn/hadoop-2.3.0/lib/*,/opt/yarn/hadoop-2.3.0/*,/opt/yarn/hadoop-2.3.0/lib/*
> 
>   
> So any other ideas what could be the issue here ?
> 
> I am running my mapreduce job like this:
> 
> $HADOOP_PREFIX/bin/hadoop jar 
> $HADOOP_PREFIX/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.2.0.jar 
> randomwriter out
> Thanks, Tony
> 
> 




Re: Yarn MapReduce Job Issue - AM Container launch error in Hadoop 2.3.0

2014-03-22 Thread Tony Mullins
I also don't know what the 614 is... it's the exact and only line in the stderr
of the job's logs.
And regarding the MapReduce classpath, the defaults should be fine, as there are only two
vars: $HADOOP_MAPRED_HOME/share/hadoop/mapreduce/* and
$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/lib/*.

Is there any other place to look for detailed and meaningful error info? Or any
hunch as to how to fix it?

Thanks,
Tony


On Sat, Mar 22, 2014 at 11:11 PM, Vinod Kumar Vavilapalli <
vino...@apache.org> wrote:

> What is "614" here?
>
> The other relevant thing to check is the MapReduce specific config
> mapreduce.application.classpath.
>
> +Vinod
>
> On Mar 22, 2014, at 9:03 AM, Tony Mullins 
> wrote:
>
> Hi,
>
> I have setup a 2 node cluster of Hadoop 2.3.0. Its working fine and I can
> successfully run distributedshell-2.2.0.jar example. But when I try to run
> any mapreduce job I get error. I have setup MapRed.xml and other configs
> for running MapReduce job according to (
> http://www.alexjf.net/blog/distributed-systems/hadoop-yarn-installation-definitive-guide)
> but I am getting following error :
>
> 14/03/22 20:31:17 INFO mapreduce.Job: Job job_1395502230567_0001 failed
> with state FAILED due to: Application application_1395502230567_0001 failed
> 2 times due to AM Container for appattempt_1395502230567_0001_02 exited
> with exitCode: 1 due to: Exception from container-launch:
> org.apache.hadoop.util.Shell$ExitCodeException:
> org.apache.hadoop.util.Shell$ExitCodeException: at
> org.apache.hadoop.util.Shell.runCommand(Shell.java:505) at
> org.apache.hadoop.util.Shell.run(Shell.java:418) at
> org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:650)
> at
> org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:195)
> at
> org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:283)
> at
> org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:79)
> at java.util.concurrent.FutureTask.run(FutureTask.java:262) at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:744)
>
> Container exited with a non-zero exit code 1
> .Failing this attempt.. Failing the application.
> 14/03/22 20:31:17 INFO mapreduce.Job: Counters: 0
> Job ended: Sat Mar 22 20:31:17 PKT 2014
> The job took 6 seconds.
>
> And if look at stderr (log of job) there is only one line
>
> *"Could not find or load main class 614"*
>
> Now I have googled it and usually this issues comes when you have
> different JAVA versions or in yarn-site.xml classpath is not properly set ,
> my yarn-site.xml has this
>
>
> 
> yarn.application.classpath
> 
> /opt/yarn/hadoop-2.3.0/etc/hadoop,/opt/yarn/hadoop-2.3.0/*,/opt/yarn/hadoop-2.3.0/lib/*,/opt/yarn/hadoop-2.3.0/*,/opt/yarn/hadoop-2.3.0/lib/*,/opt/yarn/hadoop-2.3.0/*,/opt/yarn/hadoop-2.3.0/lib/*,/opt/yarn/hadoop-2.3.0/*,/opt/yarn/hadoop-2.3.0/lib/*
>
>   
>
> So any other ideas what could be the issue here ?
>
> I am running my mapreduce job like this:
>
> $HADOOP_PREFIX/bin/hadoop jar 
> $HADOOP_PREFIX/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.2.0.jar 
> randomwriter out
>
> Thanks, Tony
>
>
>


Re: Yarn MapReduce Job Issue - AM Container launch error in Hadoop 2.3.0

2014-03-22 Thread Vinod Kumar Vavilapalli
Given your earlier mail about the paths in /opt, shouldn't the MapReduce classpath
also point to /opt/yarn/hadoop-2.3.0, etc.?
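For instance, a sketch of what such an entry in mapred-site.xml might look like,
with the $HADOOP_MAPRED_HOME defaults expanded against the /opt/yarn/hadoop-2.3.0
prefix from the yarn-site.xml quoted earlier (paths assumed; adjust to your layout):

<!-- mapred-site.xml: sketch only; install prefix taken from the
     yarn-site.xml shown earlier in this thread -->
<property>
  <name>mapreduce.application.classpath</name>
  <value>/opt/yarn/hadoop-2.3.0/share/hadoop/mapreduce/*,/opt/yarn/hadoop-2.3.0/share/hadoop/mapreduce/lib/*</value>
</property>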

+Vinod

On Mar 22, 2014, at 11:33 AM, Tony Mullins  wrote:

> That I also dont know what 614... Its the exact and single line in stderr of 
> Jobs logs.
> And regarding MapRed classpath , defaults are good as there are only two vars 
> $HADOOP_MAPRED_HOME/share/hadoop/mapreduce/*, 
> $HADOOP_MAPRED_HOME/share/hadoop/mapreduce/lib/*.
> 
> Is there any other place to look for detailed & meaningfull error info ? or 
> any huntch to how to fix it ?
> 
> Thanks,
> Tony
> 
> 
> On Sat, Mar 22, 2014 at 11:11 PM, Vinod Kumar Vavilapalli 
>  wrote:
> What is "614" here?
> 
> The other relevant thing to check is the MapReduce specific config 
> mapreduce.application.classpath.
> 
> +Vinod
> 
> On Mar 22, 2014, at 9:03 AM, Tony Mullins  wrote:
> 
>> Hi,
>> 
>> I have setup a 2 node cluster of Hadoop 2.3.0. Its working fine and I can 
>> successfully run distributedshell-2.2.0.jar example. But when I try to run 
>> any mapreduce job I get error. I have setup MapRed.xml and other configs for 
>> running MapReduce job according to 
>> (http://www.alexjf.net/blog/distributed-systems/hadoop-yarn-installation-definitive-guide)
>>  but I am getting following error :
>> 
>> 14/03/22 20:31:17 INFO mapreduce.Job: Job job_1395502230567_0001 failed with 
>> state FAILED due to: Application application_1395502230567_0001 failed 2 
>> times due to AM Container for appattempt_1395502230567_0001_02 exited 
>> with exitCode: 1 due to: Exception from container-launch: 
>> org.apache.hadoop.util.Shell$ExitCodeException: 
>> org.apache.hadoop.util.Shell$ExitCodeException: at 
>> org.apache.hadoop.util.Shell.runCommand(Shell.java:505) at 
>> org.apache.hadoop.util.Shell.run(Shell.java:418) at 
>> org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:650) at 
>> org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:195)
>>  at 
>> org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:283)
>>  at 
>> org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:79)
>>  at java.util.concurrent.FutureTask.run(FutureTask.java:262) at 
>> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>  at 
>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>  at java.lang.Thread.run(Thread.java:744)
>> 
>> Container exited with a non-zero exit code 1
>> .Failing this attempt.. Failing the application.
>> 14/03/22 20:31:17 INFO mapreduce.Job: Counters: 0
>> Job ended: Sat Mar 22 20:31:17 PKT 2014
>> The job took 6 seconds.
>> And if look at stderr (log of job) there is only one line 
>> 
>> "Could not find or load main class 614"
>> 
>> Now I have googled it and usually this issues comes when you have different 
>> JAVA versions or in yarn-site.xml classpath is not properly set , my 
>> yarn-site.xml has this
>> 
>> 
>> 
>> yarn.application.classpath
>> 
>> /opt/yarn/hadoop-2.3.0/etc/hadoop,/opt/yarn/hadoop-2.3.0/*,/opt/yarn/hadoop-2.3.0/lib/*,/opt/yarn/hadoop-2.3.0/*,/opt/yarn/hadoop-2.3.0/lib/*,/opt/yarn/hadoop-2.3.0/*,/opt/yarn/hadoop-2.3.0/lib/*,/opt/yarn/hadoop-2.3.0/*,/opt/yarn/hadoop-2.3.0/lib/*
>> 
>> 
>>   
>> So any other ideas what could be the issue here ?
>> 
>> I am running my mapreduce job like this:
>> 
>> $HADOOP_PREFIX/bin/hadoop jar 
>> $HADOOP_PREFIX/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.2.0.jar 
>> randomwriter out
>> Thanks, Tony
>> 
>> 
> 
> 
> 




Re: Benchmark Failure

2014-03-22 Thread Harsh J
Do not leave that configuration in after your tests are done. It would be very
harmful to allow such tiny block sizes from clients, as it would enable them to
flood your NameNode's metadata with a large number of blocks for small files.

If possible, tune NNBench's block size to be larger instead.

On Sat, Mar 22, 2014 at 2:26 PM, Lixiang Ao  wrote:
> Checked the logs, and turned out to be configuration problem. Just set
> dfs.namenode.fs-limits.min-block-size to 1 and it's fixed
>
> Thanks.
>
>
> On Wed, Mar 19, 2014 at 2:51 PM, Brahma Reddy Battula
>  wrote:
>>
>> This seems to be a known issue that has already been logged. Please check the
>> following JIRA; hopefully you are facing the same issue:
>>
>>
>>
>> https://issues.apache.org/jira/browse/HDFS-4929
>>
>>
>>
>>
>>
>>
>>
>> Thanks & Regards
>>
>>
>>
>> Brahma Reddy Battula
>>
>>
>>
>> 
>> From: Lixiang Ao [aolixi...@gmail.com]
>> Sent: Tuesday, March 18, 2014 10:34 AM
>> To: user@hadoop.apache.org
>> Subject: Re: Benchmark Failure
>>
>> the version is release 2.2.0
>>
>> On March 18, 2014, at 12:26 AM, "Lixiang Ao" wrote:
>>>
>>> Hi all,
>>>
>>> I'm running jobclient tests(on single node), other tests like TestDFSIO,
>>> mrbench succeed except nnbench.
>>>
>>> I got a lot of Exceptions but without any explanation(see below).
>>>
>>> Could anyone tell me what might went wrong?
>>>
>>> Thanks!
>>>
>>>
>>> 14/03/17 23:54:22 INFO hdfs.NNBench: Waiting in barrier for: 112819 ms
>>> 14/03/17 23:54:23 INFO mapreduce.Job: Job job_local2133868569_0001
>>> running in uber mode : false
>>> 14/03/17 23:54:23 INFO mapreduce.Job:  map 0% reduce 0%
>>> 14/03/17 23:54:28 INFO mapred.LocalJobRunner:
>>> hdfs://0.0.0.0:9000/benchmarks/NNBench-aolx-PC/control/NNBench_Controlfile_10:0+125
>>> > map
>>> 14/03/17 23:54:29 INFO mapreduce.Job:  map 6% reduce 0%
>>> 14/03/17 23:56:15 INFO hdfs.NNBench: Exception recorded in op:
>>> Create/Write/Close
>>> 14/03/17 23:56:15 INFO hdfs.NNBench: Exception recorded in op:
>>> Create/Write/Close
>>> 14/03/17 23:56:15 INFO hdfs.NNBench: Exception recorded in op:
>>> Create/Write/Close
>>> 14/03/17 23:56:15 INFO hdfs.NNBench: Exception recorded in op:
>>> Create/Write/Close
>>> 14/03/17 23:56:15 INFO hdfs.NNBench: Exception recorded in op:
>>> Create/Write/Close
>>> 14/03/17 23:56:15 INFO hdfs.NNBench: Exception recorded in op:
>>> Create/Write/Close
>>> 14/03/17 23:56:15 INFO hdfs.NNBench: Exception recorded in op:
>>> Create/Write/Close
>>> 14/03/17 23:56:15 INFO hdfs.NNBench: Exception recorded in op:
>>> Create/Write/Close
>>> (1000 Exceptions)
>>> .
>>> .
>>> .
>>> results:
>>>
>>> File System Counters
>>> FILE: Number of bytes read=18769411
>>> FILE: Number of bytes written=21398315
>>> FILE: Number of read operations=0
>>> FILE: Number of large read operations=0
>>> FILE: Number of write operations=0
>>> HDFS: Number of bytes read=11185
>>> HDFS: Number of bytes written=19540
>>> HDFS: Number of read operations=325
>>> HDFS: Number of large read operations=0
>>> HDFS: Number of write operations=13210
>>> Map-Reduce Framework
>>> Map input records=12
>>> Map output records=95
>>> Map output bytes=1829
>>> Map output materialized bytes=2091
>>> Input split bytes=1538
>>> Combine input records=0
>>> Combine output records=0
>>> Reduce input groups=8
>>> Reduce shuffle bytes=0
>>> Reduce input records=95
>>> Reduce output records=8
>>> Spilled Records=214
>>> Shuffled Maps =0
>>> Failed Shuffles=0
>>> Merged Map outputs=0
>>> GC time elapsed (ms)=211
>>> CPU time spent (ms)=0
>>> Physical memory (bytes) snapshot=0
>>> Virtual memory (bytes) snapshot=0
>>> Total committed heap usage (bytes)=4401004544
>>> File Input Format Counters
>>> Bytes Read=1490
>>> File Output Format Counters
>>> Bytes Written=170
>>> 14/03/17 23:56:18 INFO hdfs.NNBench: -- NNBench
>>> -- :
>>> 14/03/17 23:56:18 INFO hdfs.NNBench:
>>> Version: NameNode Benchmark 0.4
>>> 14/03/17 23:56:18 INFO hdfs.NNBench:Date &
>>> time: 2014-03-17 23:56:18,619
>>> 14/03/17 23:56:18 INFO hdfs.NNBench:
>>> 14/03/17 23:56:18 INFO hdfs.NNBench: Test
>>> Operation: create_write
>>> 14/03/17 23:56:18 INFO hdfs.NNBench: Start
>>> time: 2014-03-17 23:56:15,521
>>> 14/03/17 23:56:18 INFO hdfs.NNBench:Maps to
>>> run: 12
>>> 14/03/17 23:56:18 INFO hdfs.NNBench: Reduces to
>>> run: 6
>>> 14/03/17 23:56:18 INFO hdfs.NNBench: Block Size
>>> (bytes): 1
>>> 14/03/17 23:56:18 INFO hdfs.NNBench: Bytes to
>>> write: 0
>>> 14/03/17 23:56:18 INFO hdfs.NNBench: Bytes per
>>> checksum: 1
>>> 14/03/17 23:56:18 INFO hdfs.NNBench:Number of
>>> files: 1000
>>> 14/03/17 23:56:18 INFO hdfs.NNBench: Replication
>>> factor: 3
>>> 14/03/17 23:56:18 INFO hdfs.NNBench: Successful file
>>> operations: 0
>>> 14/03/17 23:56:18 INFO hdfs.N