Team,

I have written a MapReduce program. The scenario is to emit <userid, seqid> pairs.
Total number of users: 825
Total number of seqids: 6,583,100
Number of pairs the program will emit: 825 * 6,583,100
I have an HBase table called ObjectSequence, which consists of 6,583,100 rows.
I used TableMapper and TableReducer for my MapReduce program.
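For scale, here is a quick back-of-the-envelope calculation of the map output volume. The per-pair byte size is only a rough assumption on my part, since it depends on the actual Writable types used:

```java
// Rough estimate of the intermediate output the map phase emits.
public class OutputEstimate {
    public static void main(String[] args) {
        long users = 825L;
        long seqids = 6_583_100L;
        long pairs = users * seqids; // every <userid, seqid> combination

        // Assume ~12 bytes per serialized pair -- a guess; the real size
        // depends on the key/value Writable types and framework overhead.
        double approxGb = pairs * 12.0 / (1024.0 * 1024.0 * 1024.0);

        System.out.println(pairs + " pairs, roughly "
                + Math.round(approxGb) + " GB before compression");
    }
}
```

So the job emits over 5.4 billion pairs, i.e. tens of gigabytes of intermediate data, which each map task has to buffer, sort, and spill locally.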
Problem definition:
Processor : i7
Replication Factor : 1
Live Datanodes : 3
Node     | Last Contact | Admin State | Configured Capacity (GB) | Used (GB) | Non DFS Used (GB) | Remaining (GB) | Used (%) | Remaining (%) | Blocks
chethan  | 1            | In Service  | 28.59                    | 0.6       | 25.17             | 2.82           | 2.11     | 9.87          | 73
shashwat | 2            | In Service  | 28.98                    | 0.87      | 22.01             | 6.1            | 3        | 21.04         | 69
syed     | 0            | In Service  | 28.98                    | 4.29      | 18.37             | 6.32           | 14.8     | 21.82         | 129
When I ran the balancer in Hadoop, I saw that blocks are not equally distributed. Can you tell me what may be the reason for this?
Kind   | % Complete | Num Tasks | Pending | Running | Complete | Killed | Failed/Killed Task Attempts
map    | 85.71%     | 7         | 0       | 1       | 6        | 0      | 3 / 1
reduce | 28.57%     | 1         | 0       | 1       | 0        | 0      | 0 / 0
I saw that only 8 tasks were allocated in total. Is there any way to increase the number of map tasks?
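From what I have read (please correct me if I am wrong), with TableMapper/TableInputFormat the number of map tasks equals the number of regions in the input table, so pre-splitting ObjectSequence into more regions should produce more maps. Separately, the number of maps that run concurrently on each node is capped in mapred-site.xml; a sketch with an illustrative value:

```xml
<!-- mapred-site.xml: concurrent map slots per TaskTracker.
     The default is 2; the value below is only an illustrative guess
     for an i7 node, not a recommendation. -->
<property>
  <name>mapred.tasktracker.map.tasks.maximum</name>
  <value>4</value>
</property>
```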
Completed Tasks

Task                            | Complete | Status                   | Start Time           | Finish Time                          | Counters
task_201207121836_0007_m_000001 | 100.00%  | UserID: 777 SEQID:415794 | 12-Jul-2012 21:35:48 | 12-Jul-2012 21:36:12 (24sec)         | 16
task_201207121836_0007_m_000002 | 100.00%  | UserID: 777 SEQID:422256 | 12-Jul-2012 21:35:50 | 12-Jul-2012 21:36:47 (57sec)         | 16
task_201207121836_0007_m_000003 | 100.00%  | UserID: 777 SEQID:563544 | 12-Jul-2012 21:35:50 | 12-Jul-2012 22:00:08 (24mins, 17sec) | 16
task_201207121836_0007_m_000004 | 100.00%  | UserID: 777 SEQID:592918 | 12-Jul-2012 21:35:50 | 12-Jul-2012 21:42:09 (6mins, 18sec)  | 16
task_201207121836_0007_m_000005 | 100.00%  | UserID: 777 SEQID:618121 | 12-Jul-2012 21:35:50 | 12-Jul-2012 21:44:34 (8mins, 43sec)  | 16
task_201207121836_0007_m_000006 | 100.00%  | UserID: 777 SEQID:685810 | 12-Jul-2012 21:36:12 | 12-Jul-2012 21:44:18 (8mins, 6sec)   | 16
Why is the last map task taking nearly 2 hours? Please give me some suggestions on how to optimize this.
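To quantify the spread, here are the durations of the six completed map tasks from the table above, in seconds. The slowest finished task ran about 3x the mean, and the still-running one is far beyond even that, which makes me suspect data skew (rows unevenly distributed across the regions each task scans) rather than a uniformly slow cluster:

```java
// Durations of the six completed map tasks (m_000001..m_000006), in seconds,
// taken from the Completed Tasks table above.
public class SkewCheck {
    public static void main(String[] args) {
        int[] secs = {24, 57, 1457, 378, 523, 486};
        int max = 0, sum = 0;
        for (int s : secs) {
            max = Math.max(max, s);
            sum += s;
        }
        double mean = sum / (double) secs.length;
        System.out.println("max=" + max + "s mean=" + mean
                + "s ratio=" + String.format("%.1f", max / mean));
    }
}
```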
Task                            | Complete | Status                  | Start Time           | Errors
task_201207121836_0007_m_000000 | 0.00%    | UserID: 482 SEQID:99596 | 12-Jul-2012 21:35:48 | see the stack traces below
java.io.IOException: Spill failed
    at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:1029)
    at org.apache.hadoop.mapred.MapTask$NewOutputCollector.write(MapTask.java:691)
    at org.apache.hadoop.mapreduce.TaskInputOutputContext.write(TaskInputOutputContext.java:80)
    at org.pointcross.SearchPermission.MapReduce.NewObjectMapper.map(NewObjectMapper.java:205)
    at org.pointcross.SearchPermission.MapReduce.NewObjectMapper.map(NewObjectMapper.java:1)
    at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:416)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1093)
    at org.apache.hadoop.mapred.Child.main(Child.java:249)
Caused by: org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find any valid local directory for output/spill712.out
    at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathForWrite(LocalDirAllocator.java:381)
    at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:146)
    at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:127)
    at org.apache.hadoop.mapred.MapOutputFile.getSpillFileForWrite(MapOutputFile.java:121)
    at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.sortAndSpill(MapTask.java:1392)
    at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.access$1800(MapTask.java:853)
    at org.apache.hadoop.mapred.MapTask$MapOutputBuffer$SpillThread.run(MapTask.java:1344)
java.lang.RuntimeException: Error while running command to get file permissions : java.io.IOException: Cannot run program "/bin/ls": java.io.IOException: error=12, Cannot allocate memory
    at java.lang.ProcessBuilder.start(ProcessBuilder.java:475)
    at org.apache.hadoop.util.Shell.runCommand(Shell.java:200)
    at org.apache.hadoop.util.Shell.run(Shell.java:182)
    at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:375)
    at org.apache.hadoop.util.Shell.execCommand(Shell.java:461)
    at org.apache.hadoop.util.Shell.execCommand(Shell.java:444)
    at org.apache.hadoop.fs.FileUtil.execCommand(FileUtil.java:703)
    at org.apache.hadoop.fs.RawLocalFileSystem$RawLocalFileStatus.loadPermissionInfo(RawLocalFileSystem.java:443)
    at org.apache.hadoop.fs.RawLocalFileSystem$RawLocalFileStatus.getOwner(RawLocalFileSystem.java:426)
    at org.apache.hadoop.mapred.TaskLog.obtainLogDirOwner(TaskLog.java:251)
    at org.apache.hadoop.mapred.TaskLogsTruncater.truncateLogs(TaskLogsTruncater.java:124)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:260)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:416)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1093)
    at org.apache.hadoop.mapred.Child.main(Child.java:249)
Caused by: java.io.IOException: java.io.IOException: error=12, Cannot allocate memory
    at java.lang.UNIXProcess.<init>(UNIXProcess.java:164)
    at java.lang.ProcessImpl.start(ProcessImpl.java:81)
    at java.lang.ProcessBuilder.start(ProcessBuilder.java:468)
    ... 15 more
    at org.apache.hadoop.fs.RawLocalFileSystem$RawLocalFileStatus.loadPermissionInfo(RawLocalFileSystem.java:468)
    at org.apache.hadoop.fs.RawLocalFileSystem$RawLocalFileStatus.getOwner(RawLocalFileSystem.java:426)
    at org.apache.hadoop.mapred.TaskLog.obtainLogDirOwner(TaskLog.java:251)
    at org.apache.hadoop.mapred.TaskLogsTruncater.truncateLogs(TaskLogsTruncater.java:124)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:260)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:416)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1093)
    at org.apache.hadoop.mapred.Child.main(Child.java:249)
java.io.IOException: Spill failed
    at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:1029)
    at org.apache.hadoop.mapred.MapTask$NewOutputCollector.write(MapTask.java:691)
    at org.apache.hadoop.mapreduce.TaskInputOutputContext.write(TaskInputOutputContext.java:80)
    at org.pointcross.SearchPermission.MapReduce.NewObjectMapper.map(NewObjectMapper.java:205)
    at org.pointcross.SearchPermission.MapReduce.NewObjectMapper.map(NewObjectMapper.java:1)
    at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:416)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1093)
    at org.apache.hadoop.mapred.Child.main(Child.java:249)
Caused by: org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find any valid local directory for output/spill934.out
    at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathForWrite(LocalDirAllocator.java:381)
    at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:146)
    at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:127)
    at org.apache.hadoop.mapred.MapOutputFile.getSpillFileForWrite(MapOutputFile.java:121)
    at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.sortAndSpill(MapTask.java:1392)
    at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.access$1800(MapTask.java:853)
    at org.apache.hadoop.mapred.MapTask$MapOutputBuffer$SpillThread.run(MapTask.java:1344)
I saw these errors for the last task. What may be the reason for them?
NOTE: When I run an import of the HBase table, it takes 10 minutes.
Team, please give suggestions on what can be done to solve these issues.
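Regarding the spill failure above: "Could not find any valid local directory" usually means the task could not find free space under mapred.local.dir (and the datanode listing suggests chethan has very little space remaining), while error=12 looks like the node running out of memory when the task JVM forks /bin/ls. What I am considering trying, where the paths are only illustrative guesses for my nodes:

```xml
<!-- mapred-site.xml: spread map spill files over partitions that have
     free space. The paths below are illustrative; they should point at
     directories on disks that actually exist on each node. -->
<property>
  <name>mapred.local.dir</name>
  <value>/disk1/mapred/local,/disk2/mapred/local</value>
</property>
```

For the fork failure, I have also seen vm.overcommit_memory = 1 in /etc/sysctl.conf suggested as a workaround for Java fork/exec "Cannot allocate memory" errors, though I am not sure whether it is the right fix here.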
Thanks and Regards,
S SYED ABDUL KATHER