Re: Datanodes using public ip, why?

2013-07-17 Thread Ben Kim
Hi
Thank you all for the comments

Setting dfs.datanode.dns.interface and putting private IPs in the slaves and
masters files didn't work.
So, as Alex suggested, I changed all public IP-to-hostname mappings in the /etc/hosts
file, and all datanodes now communicate over the private network.

But I'm not fully satisfied, since in some situations I would want hostnames to
be mapped to public IPs while Hadoop still communicates over the private
network. I don't understand why dfs.datanode.dns.interface has no effect.
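
For reference, the kind of setup I'm describing looks roughly like this (a sketch only; the hostnames, addresses, and the eth1 interface name are placeholders, not my actual values):

# /etc/hosts on every node: resolve cluster hostnames to the *private* addresses
cat >> /etc/hosts <<'EOF'
10.0.0.11  dn1
10.0.0.12  dn2
EOF

# hdfs-site.xml: the property I expected to pin the datanodes to the private NIC
#   <property><name>dfs.datanode.dns.interface</name><value>eth1</value></property>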


One interesting thing I found is that if I change dfs.default.name from the
private IP to the public one, all datanodes report themselves with public
IPs.
So confusing. Why?

By the way, I'm using Hadoop 1.0.3, with no nameserver and no firewalls.

Thank you
Ben





On Fri, Jul 12, 2013 at 12:29 PM, Alex Levin ale...@gmail.com wrote:

 Make sure that your hostnames resolve (via DNS and/or hosts files) to
 private IPs.

 If you have records in the nodes' hosts files like
 public-IP hostname

 remove (or comment out) them

 Alex
 On Jul 11, 2013 2:21 AM, Ben Kim benkimkim...@gmail.com wrote:

 Hello Hadoop Community!

 I've set up the datanodes on a private network by adding private hostnames to
 the slaves file,
 but when I look at the web UI it looks like the datanodes are registered
 with public hostnames.

 Are they actually communicating over the public network?

 All datanodes have eth0 with a public address and eth1 with a private address.

 What am I missing?

 Thanks a whole lot

 *Benjamin Kim*
 *benkimkimben at gmail*




-- 

*Benjamin Kim*
*benkimkimben at gmail*


Datanodes using public ip, why?

2013-07-11 Thread Ben Kim
Hello Hadoop Community!

I've set up the datanodes on a private network by adding private hostnames to
the slaves file,
but when I look at the web UI it looks like the datanodes are registered with
public hostnames.

Are they actually communicating over the public network?

All datanodes have eth0 with a public address and eth1 with a private address.

What am I missing?

Thanks a whole lot

*Benjamin Kim*
*benkimkimben at gmail*


is time sync required among all nodes?

2013-06-04 Thread Ben Kim
Hi,
This is a very basic and fundamental question.

Does time need to be synced across all nodes?

I've never even thought about clock synchronization in a Hadoop cluster, but I
recently noticed my servers going out of sync. I know HBase requires clocks to
be synced because of how it uses timestamps, but I wonder whether any Hadoop
functionality requires time sync. Perhaps checkpointing, namenode HA, or
datanode reports, etc.? Hmm.
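
In case it helps, a quick way to see how far the nodes have drifted would be something like this (a sketch; it assumes passwordless SSH, the usual conf/slaves file, and ntpdate being installed, and pool.ntp.org is just an example server):

# print each slave's idea of the current time, to eyeball the skew
for h in $(cat $HADOOP_HOME/conf/slaves); do
  echo -n "$h: "; ssh "$h" date +%s
done

# or ask an NTP server how far off the local clock is
ntpdate -q pool.ntp.org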


-- 

*Benjamin Kim*
*benkimkimben at gmail*


Hadoop 2.0.4: Unable to load native-hadoop library for your platform

2013-05-23 Thread Ben Kim
Hi, I downloaded Hadoop 2.0.4 and keep getting this warning from the Hadoop CLI
and MapReduce task logs:

13/05/24 14:34:17 WARN util.NativeCodeLoader: Unable to load native-hadoop
library for your platform... using builtin-java classes where applicable

I tried adding $HADOOP_HOME/lib/native/* to CLASSPATH and LD_LIBRARY_PATH,
but neither worked.

Has anyone had a similar problem?

TY!
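
For what it's worth, the workaround I've seen suggested most often is to point the JVM's library path at the native directory in hadoop-env.sh instead of CLASSPATH/LD_LIBRARY_PATH. A sketch (paths follow the stock tarball layout, and the variable names are the ones I believe the 2.x scripts read, so treat them as assumptions):

export HADOOP_COMMON_LIB_NATIVE_DIR="$HADOOP_HOME/lib/native"
export HADOOP_OPTS="$HADOOP_OPTS -Djava.library.path=$HADOOP_HOME/lib/native"

# also worth checking whether the bundled libhadoop matches the platform (32- vs 64-bit)
file "$HADOOP_HOME"/lib/native/libhadoop.so*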

-- 

*Benjamin Kim*
*benkimkimben at gmail*


Re: issue with hadoop mapreduce using same job.jar

2013-03-12 Thread Ben Kim
Things I tried so far, without luck:

   - restart Hadoop
   - sync && echo 3 > /proc/sys/vm/drop_caches
   - clear the namenode Java cache using jcontrol
   - check permissions on the /user/hadoop/.staging folder
   - delete everything under the .staging folder
   - rename the test.jar file
   - run as a different user

What worked, though, was submitting the MR job remotely through the Hadoop API,
so it seems this only happens with the Hadoop CLI.
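
For anyone who wants to reproduce the comparison, this is roughly how the staged jar can be checked against the submitted one (a sketch; the job id below is a placeholder):

hadoop fs -ls /user/hadoop/.staging
hadoop fs -get /user/hadoop/.staging/job_201303121200_0001/job.jar /tmp/staged-job.jar
md5sum /tmp/staged-job.jar test.jar   # if the staging bug is happening, the checksums won't match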


On Wed, Mar 13, 2013 at 1:00 AM, Ben Kim benkimkim...@gmail.com wrote:

 Hi there

 It looks like the job.jar created in the /user/hadoop/.staging/ folder is
 always the same no matter which jar file I give.

 If I download the job.jar file, I recognize it as a jar file I used to
 run an MR job a few hours ago.

 I'm using Hadoop 1.0.3 on top of CentOS 6.2.

 Does anyone have any ideas?

 *Benjamin Kim*
 *benkimkimben at gmail*




-- 

*Benjamin Kim*
*benkimkimben at gmail*


issue with hadoop mapreduce using same job.jar

2013-03-12 Thread Ben Kim
Hi there

It looks like the job.jar created in the /user/hadoop/.staging/ folder is
always the same no matter which jar file I give.

If I download the job.jar file, I recognize it as a jar file I used to run
an MR job a few hours ago.

I'm using Hadoop 1.0.3 on top of CentOS 6.2.

Does anyone have any ideas?

*Benjamin Kim*
*benkimkimben at gmail*


Re: Multiple reduce task retries running at same time

2013-01-28 Thread Ben Kim
Attached a screenshot showing the retries

On Tue, Jan 29, 2013 at 4:35 PM, Ben Kim benkimkim...@gmail.com wrote:

 Hi!

 I have come across a situation where a single reduce task is executing
 with multiple attempts simultaneously,
 which has the potential to slow down the whole reduce phase for large data
 sets.

 Is this pretty normal to y'all for Hadoop 1.0.3?
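
 For what it's worth, multiple simultaneous attempts of one task sound like
 speculative execution to me; if that is what this is, turning it off for
 reducers would look roughly like this in 1.x (a sketch; property name as I
 understand it for the old mapred API, and my-job.jar/MyDriver are placeholders):

 # mapred-site.xml
 #   <property>
 #     <name>mapred.reduce.tasks.speculative.execution</name>
 #     <value>false</value>
 #   </property>
 # or per job, for a Tool-based driver:
 hadoop jar my-job.jar MyDriver -D mapred.reduce.tasks.speculative.execution=false ...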

 --

 *Benjamin Kim*
 *benkimkimben at gmail*




-- 

*Benjamin Kim*
*benkimkimben at gmail*
attachment: sc2.png

Re: Decommissioning a datanode takes forever

2013-01-22 Thread Ben Kim
UPDATE:

The WARN about the edit log had nothing to do with the current problem.

However, the replica placement warnings seem suspicious.
Please have a look at the following logs.

2013-01-22 09:12:10,885 WARN
org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Not able to place
enough replicas, still in need of 1
2013-01-22 00:02:17,541 INFO
org.apache.hadoop.hdfs.server.namenode.FSNamesystem:
Block: blk_4844131893883391179_3440513,
Expected Replicas: 10, live replicas: 9, corrupt replicas: 0,
decommissioned replicas: 1, excess replicas: 0, Is Open File: false,
Datanodes having this block: 203.235.211.155:50010
203.235.211.156:50010 203.235.211.145:50010 203.235.211.144:50010
203.235.211.146:50010
203.235.211.158:50010 203.235.211.159:50010 203.235.211.157:50010
203.235.211.160:50010 203.235.211.143:50010 ,
Current Datanode: 203.235.211.155:50010, Is current datanode
decommissioning: true

I have set my replication factor to 3. I don't understand why Hadoop is
trying to replicate it to 10 nodes. I have decommissioned one node, so
currently I have 9 nodes in operation; it will never be replicated to 10
nodes.

I also see that all the repeated warning messages like the one above are for
blk_4844131893883391179_3440513.

How would I delete the block? It's not showing as a corrupt block in fsck.
:(
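
In case it helps anyone later, this is roughly how I'd track the block back to its file and drop its replication target (a sketch; the path is whatever fsck reports):

hadoop fsck / -files -blocks -locations | grep -B1 blk_4844131893883391179
hadoop fs -setrep 3 /path/reported/by/fsck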

BEN




On Tue, Jan 22, 2013 at 9:28 AM, Ben Kim benkimkim...@gmail.com wrote:

 Hi Varun, thank you for the response.

 No, there don't seem to be any corrupt blocks in my cluster.
 I did hadoop fsck -blocks / and it didn't report any corrupt blocks.

 However, there are two WARNings in the namenode log, constantly repeating
 since the decommission:

- 2013-01-22 09:16:30,908 WARN
org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Cannot roll edit log,
edits.new files already exists in all healthy directories:
- 2013-01-22 09:12:10,885 WARN
org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Not able to place
enough replicas, still in need of 1

 There isn't any WARN or ERROR in the decommissioning datanode's log.

 Ben



 On Mon, Jan 21, 2013 at 3:05 PM, varun kumar varun@gmail.com wrote:

 Hi Ben,

 Are there any corrupt blocks in your Hadoop cluster?

 Regards,
 Varun Kumar


 On Mon, Jan 21, 2013 at 8:22 AM, Ben Kim benkimkim...@gmail.com wrote:

 Hi!

 I followed the decommissioning guide on the Hadoop HDFS wiki.

 The HDFS web UI shows that the decommissioning process has successfully
 begun.

 It started redeploying 80,000 blocks across the Hadoop cluster, but for
 some reason it stopped at 9,059 blocks. I've waited 30 hours and there is
 still no progress.

 Does anyone have any idea?
  --

 *Benjamin Kim*
 *benkimkimben at gmail*




 --
 Regards,
 Varun Kumar.P




 --

 *Benjamin Kim*
 *benkimkimben at gmail*




-- 

*Benjamin Kim*
*benkimkimben at gmail*


Re: Decommissioning a datanode takes forever

2013-01-22 Thread Ben Kim
Impatient as I am, I just shut down the cluster and restarted it with an empty
exclude file.

If I add the datanode hostname back to the exclude file and run hadoop
dfsadmin -refreshNodes, *the datanode goes straight to the dead nodes list*
without going through the decommissioning process.

I'm done for today. Maybe someone else can figure it out by the time I come back
tomorrow :)
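
For the record, the sequence I believe is supposed to work in 1.x looks like this (a sketch; the exclude-file path and hostname are placeholders, dfs.hosts.exclude in hdfs-site.xml must already point at the file, and the hostname has to match exactly what the datanode registered as):

echo "dn5.example.com" >> /home/hadoop/conf/excludes
hadoop dfsadmin -refreshNodes   # the node should switch to "Decommission In Progress"
hadoop dfsadmin -report         # watch its state until it reads "Decommissioned"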

Best regards,
Ben

On Tue, Jan 22, 2013 at 5:38 PM, Ben Kim benkimkim...@gmail.com wrote:

 UPDATE:

 The WARN about the edit log had nothing to do with the current problem.

 However, the replica placement warnings seem suspicious.
 Please have a look at the following logs.


 2013-01-22 09:12:10,885 WARN
 org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Not able to place
 enough replicas, still in need of 1
 2013-01-22 00:02:17,541 INFO
 org.apache.hadoop.hdfs.server.namenode.FSNamesystem:
 Block: blk_4844131893883391179_3440513,
 Expected Replicas: 10, live replicas: 9, corrupt replicas: 0,
 decommissioned replicas: 1, excess replicas: 0, Is Open File: false,
 Datanodes having this block: 203.235.211.155:50010 203.235.211.156:50010
 203.235.211.145:50010 203.235.211.144:50010 203.235.211.146:50010
 203.235.211.158:50010 203.235.211.159:50010 203.235.211.157:50010
 203.235.211.160:50010 203.235.211.143:50010 ,
 Current Datanode: 203.235.211.155:50010, Is current datanode
 decommissioning: true

 I have set my replication factor to 3. I don't understand why Hadoop is
 trying to replicate it to 10 nodes. I have decommissioned one node, so
 currently I have 9 nodes in operation; it will never be replicated to 10
 nodes.

 I also see that all the repeated warning messages like the one above are for
 blk_4844131893883391179_3440513.

 How would I delete the block? It's not showing as a corrupt block in fsck.
 :(

 BEN





 On Tue, Jan 22, 2013 at 9:28 AM, Ben Kim benkimkim...@gmail.com wrote:

 Hi Varun, thank you for the response.

 No, there don't seem to be any corrupt blocks in my cluster.
 I did hadoop fsck -blocks / and it didn't report any corrupt blocks.

 However, there are two WARNings in the namenode log, constantly repeating
 since the decommission:

- 2013-01-22 09:16:30,908 WARN
org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Cannot roll edit log,
edits.new files already exists in all healthy directories:
- 2013-01-22 09:12:10,885 WARN
org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Not able to place
enough replicas, still in need of 1

 There isn't any WARN or ERROR in the decommissioning datanode's log.

 Ben



 On Mon, Jan 21, 2013 at 3:05 PM, varun kumar varun@gmail.com wrote:

 Hi Ben,

 Are there any corrupt blocks in your Hadoop cluster?

 Regards,
 Varun Kumar


 On Mon, Jan 21, 2013 at 8:22 AM, Ben Kim benkimkim...@gmail.com wrote:

 Hi!

 I followed the decommissioning guide on the Hadoop HDFS wiki.

 The HDFS web UI shows that the decommissioning process has
 successfully begun.

 It started redeploying 80,000 blocks across the Hadoop cluster, but
 for some reason it stopped at 9,059 blocks. I've waited 30 hours and there
 is still no progress.

 Does anyone have any idea?
  --

 *Benjamin Kim*
 *benkimkimben at gmail*




 --
 Regards,
 Varun Kumar.P




 --

 *Benjamin Kim*
 *benkimkimben at gmail*




 --

 *Benjamin Kim*
 *benkimkimben at gmail*




-- 

*Benjamin Kim*
*benkimkimben at gmail*


Re: Decommissioning a datanode takes forever

2013-01-21 Thread Ben Kim
Hi Varun, thank you for the response.

No, there don't seem to be any corrupt blocks in my cluster.
I did hadoop fsck -blocks / and it didn't report any corrupt blocks.

However, there are two WARNings in the namenode log, constantly repeating
since the decommission:

   - 2013-01-22 09:16:30,908 WARN
   org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Cannot roll edit log,
   edits.new files already exists in all healthy directories:
   - 2013-01-22 09:12:10,885 WARN
   org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Not able to place
   enough replicas, still in need of 1

There isn't any WARN or ERROR in the decommissioning datanode's log.

Ben


On Mon, Jan 21, 2013 at 3:05 PM, varun kumar varun@gmail.com wrote:

 Hi Ben,

 Are there any corrupt blocks in your Hadoop cluster?

 Regards,
 Varun Kumar


 On Mon, Jan 21, 2013 at 8:22 AM, Ben Kim benkimkim...@gmail.com wrote:

 Hi!

 I followed the decommissioning guide on the Hadoop HDFS wiki.

 The HDFS web UI shows that the decommissioning process has successfully
 begun.

 It started redeploying 80,000 blocks across the Hadoop cluster, but for
 some reason it stopped at 9,059 blocks. I've waited 30 hours and there is
 still no progress.

 Does anyone have any idea?
  --

 *Benjamin Kim*
 *benkimkimben at gmail*




 --
 Regards,
 Varun Kumar.P




-- 

*Benjamin Kim*
*benkimkimben at gmail*


Streaming Job map/reduce not working with scripts on 1.0.3

2013-01-04 Thread Ben Kim
Hi !

I'm using hadoop-1.0.3 to run streaming jobs with map/reduce shell scripts,
such as this:

bin/hadoop jar ./contrib/streaming/hadoop-streaming-1.0.3.jar -input /input
-output /output/015 -mapper streaming-map.sh -reducer
streaming-reduce.sh -file /home/hadoop/streaming/streaming-map.sh -file
/home/hadoop/streaming-reduce.sh

but the job fails and the task attempt log shows this:

java.lang.RuntimeException: Error in configuring object
at 
org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:93)
at 
org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:64)
at 
org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:432)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:372)
at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
at org.apache.hadoop.mapred.Child.main(Child.java:249)
Caused by: java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at 
org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:88)
... 9 more
Caused by: java.lang.RuntimeException: Error in configuring object
at 
org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:93)
at 
org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:64)
at 
org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
at org.apache.hadoop.mapred.MapRunner.configure(MapRunner.java:34)
... 14 more
Caused by: java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at 
org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:88)
... 17 more
Caused by: java.lang.RuntimeException: configuration exception
at org.apache.hadoop.streaming.PipeMapRed.configure(PipeMapRed.java:230)
at org.apache.hadoop.streaming.PipeMapper.configure(PipeMapper.java:66)
... 22 more
Caused by: java.io.IOException: Cannot run program
/tmp/hadoop-hadoop/mapred/local/taskTracker/hadoop/jobcache/job_201301041944_0001/attempt_201301041944_0001_m_00_0/work/./streaming-map.sh:
java.io.IOException: error=2, No such file or directory
at java.lang.ProcessBuilder.start(ProcessBuilder.java:460)
at org.apache.hadoop.streaming.PipeMapRed.configure(PipeMapRed.java:214)
... 23 more
Caused by: java.io.IOException: java.io.IOException: error=2, No such
file or directory
at java.lang.UNIXProcess.<init>(UNIXProcess.java:148)
at java.lang.ProcessImpl.start(ProcessImpl.java:65)
at java.lang.ProcessBuilder.start(ProcessBuilder.java:453)
... 24 more


I tried to see what the problem is and found out that the missing file is a
symbolic link that Hadoop isn't able to create; in fact, the
/tmp/hadoop-hadoop/...0_0/work directory doesn't exist at all.


Here's an excerpt from the task attempt syslog (full text attached):

2013-01-04 19:44:43,304 INFO
org.apache.hadoop.filecache.TrackerDistributedCacheManager: Creating
symlink: 
/tmp/hadoop-hadoop/mapred/local/taskTracker/hadoop/jobcache/job_201301041944_0001/jars/streaming-map.sh
- 
/tmp/hadoop-hadoop/mapred/local/taskTracker/hadoop/jobcache/job_201301041944_0001/attempt_201301041944_0001_m_00_0/work/streaming-map.sh

Hadoop thinks it has successfully created the symbolic link
from .job_201301041944_0001/jars/streaming-map.sh to
job_201301041944_0001/attempt_201301041944_0001_m_00_0/work/streaming-map.sh,

but it actually hasn't, hence the error.


If you have had the same experience or know a workaround, please comment!
Otherwise I'll file a JIRA tomorrow, as it seems to be an obvious bug.

Best regards,

*Benjamin Kim*
*benkimkimben at gmail*


syslog
Description: Binary data


Re: Streaming Job map/reduce not working with scripts on 1.0.3

2013-01-04 Thread Ben Kim
Never mind, the problem has been fixed.

The problem was a trailing {control-v}{control-m} (carriage return) character on the
first line, #!/bin/bash
(which I blame on my teammate, who wrote the script in Windows Notepad!!)
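
In case anyone else hits this, the stray carriage returns can be spotted and stripped like so (dos2unix may not be installed everywhere; the sed form does the same thing):

head -1 streaming-map.sh | cat -A     # a trailing ^M$ betrays Windows line endings
sed -i 's/\r$//' streaming-map.sh streaming-reduce.sh
# or: dos2unix streaming-map.sh streaming-reduce.sh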





On Fri, Jan 4, 2013 at 8:09 PM, Ben Kim benkimkim...@gmail.com wrote:

 Hi !

 I'm using hadoop-1.0.3 to run streaming jobs with map/reduce shell scripts,
 such as this:

 bin/hadoop jar ./contrib/streaming/hadoop-streaming-1.0.3.jar -input
 /input -output /output/015 -mapper streaming-map.sh -reducer
 streaming-reduce.sh -file /home/hadoop/streaming/streaming-map.sh -file
 /home/hadoop/streaming-reduce.sh

 but the job fails and the task attempt log shows this:

 java.lang.RuntimeException: Error in configuring object
   at 
 org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:93)
   at 
 org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:64)
   at 
 org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
   at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:432)
   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:372)
   at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
   at java.security.AccessController.doPrivileged(Native Method)
   at javax.security.auth.Subject.doAs(Subject.java:396)
   at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
   at org.apache.hadoop.mapred.Child.main(Child.java:249)
 Caused by: java.lang.reflect.InvocationTargetException
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
   at java.lang.reflect.Method.invoke(Method.java:597)
   at 
 org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:88)
   ... 9 more
 Caused by: java.lang.RuntimeException: Error in configuring object
   at 
 org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:93)
   at 
 org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:64)
   at 
 org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
   at org.apache.hadoop.mapred.MapRunner.configure(MapRunner.java:34)
   ... 14 more
 Caused by: java.lang.reflect.InvocationTargetException
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
   at java.lang.reflect.Method.invoke(Method.java:597)
   at 
 org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:88)
   ... 17 more
 Caused by: java.lang.RuntimeException: configuration exception
   at org.apache.hadoop.streaming.PipeMapRed.configure(PipeMapRed.java:230)
   at org.apache.hadoop.streaming.PipeMapper.configure(PipeMapper.java:66)
   ... 22 more
 Caused by: java.io.IOException: Cannot run program 
 /tmp/hadoop-hadoop/mapred/local/taskTracker/hadoop/jobcache/job_201301041944_0001/attempt_201301041944_0001_m_00_0/work/./streaming-map.sh:
  java.io.IOException: error=2, No such file or directory
   at java.lang.ProcessBuilder.start(ProcessBuilder.java:460)
   at org.apache.hadoop.streaming.PipeMapRed.configure(PipeMapRed.java:214)
   ... 23 more
 Caused by: java.io.IOException: java.io.IOException: error=2, No such file or 
 directory
   at java.lang.UNIXProcess.<init>(UNIXProcess.java:148)
   at java.lang.ProcessImpl.start(ProcessImpl.java:65)
   at java.lang.ProcessBuilder.start(ProcessBuilder.java:453)
   ... 24 more


 I tried to see what the problem is and found out that the missing file is
 a symbolic link that Hadoop isn't able to create; in fact, the
 /tmp/hadoop-hadoop/...0_0/work directory doesn't exist at all.


 Here's an excerpt from the task attempt syslog (full text attached):

 2013-01-04 19:44:43,304 INFO 
 org.apache.hadoop.filecache.TrackerDistributedCacheManager: Creating symlink: 
 /tmp/hadoop-hadoop/mapred/local/taskTracker/hadoop/jobcache/job_201301041944_0001/jars/streaming-map.sh
  - 
 /tmp/hadoop-hadoop/mapred/local/taskTracker/hadoop/jobcache/job_201301041944_0001/attempt_201301041944_0001_m_00_0/work/streaming-map.sh


 Hadoop thinks it has successfully created the symbolic link from
 .job_201301041944_0001/jars/streaming-map.sh to
 job_201301041944_0001/attempt_201301041944_0001_m_00_0/work/streaming-map.sh,

 but it actually hasn't, hence the error.


 If you have had the same experience or know a workaround, please comment!
 Otherwise I'll file a JIRA tomorrow, as it seems to be an obvious bug.


 Best regards,

 *Benjamin Kim*
 *benkimkimben at gmail*




-- 

*Benjamin Kim*
*benkimkimben at gmail*


Re: use S3 as input to MR job

2012-10-02 Thread Ben Kim
I'm having a similar issue.

I'm running a wordcount MR job as follows:

hadoop jar WordCount.jar wordcount.WordCountDriver
 s3n://bucket/wordcount/input s3n://bucket/wordcount/output


s3n://bucket/wordcount/input is an S3 "folder" that contains the input files.

However, I get the following NPE error:

12/10/02 18:56:23 INFO mapred.JobClient:  map 0% reduce 0%
 12/10/02 18:56:54 INFO mapred.JobClient:  map 50% reduce 0%
 12/10/02 18:56:56 INFO mapred.JobClient: Task Id :
 attempt_201210021853_0001_m_01_0, Status : FAILED
 java.lang.NullPointerException
 at
 org.apache.hadoop.fs.s3native.NativeS3FileSystem$NativeS3FsInputStream.close(NativeS3FileSystem.java:106)
 at java.io.BufferedInputStream.close(BufferedInputStream.java:451)
 at java.io.FilterInputStream.close(FilterInputStream.java:155)
 at org.apache.hadoop.util.LineReader.close(LineReader.java:83)
 at
 org.apache.hadoop.mapreduce.lib.input.LineRecordReader.close(LineRecordReader.java:144)
 at
 org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.close(MapTask.java:497)
 at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:765)
 at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
 at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
 at java.security.AccessController.doPrivileged(Native Method)
 at javax.security.auth.Subject.doAs(Subject.java:396)
 at
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
 at org.apache.hadoop.mapred.Child.main(Child.java:249)


The MR job runs fine if I specify a more specific input path such as
s3n://bucket/wordcount/input/file.txt.
What I want is to be able to pass S3 folders as parameters.
Does anyone know how to do this?
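
One workaround I may try (an assumption on my part, not something I've verified): pass a glob instead of the bare folder, so the zero-byte folder-marker key that s3n keeps for the directory never reaches a record reader:

hadoop jar WordCount.jar wordcount.WordCountDriver \
  's3n://bucket/wordcount/input/*' s3n://bucket/wordcount/output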

Best regards,
Ben Kim


On Fri, Jul 20, 2012 at 10:33 AM, Harsh J ha...@cloudera.com wrote:

 Dan,

 Can you share your error? The plain .gz files (not .tar.gz) are natively
 supported by Hadoop via its GzipCodec, and if you are facing an error, I
 believe its cause is something other than compression.


 On Fri, Jul 20, 2012 at 6:14 AM, Dan Yi d...@mediosystems.com wrote:

 I have an MR job that reads files on Amazon S3 and processes the data on local
 HDFS. The files are gzipped text files (.gz). I tried to set up the job as
 below but it won't work; does anyone know what might be wrong? Do I need to add
 an extra step to unzip the files first? Thanks.

 String S3_LOCATION = "s3n://access_key:private_key@bucket_name";

 protected void prepareHadoopJob() throws Exception {

     this.getHadoopJob().setMapperClass(Mapper1.class);
     this.getHadoopJob().setInputFormatClass(TextInputFormat.class);

     FileInputFormat.addInputPath(this.getHadoopJob(), new Path(S3_LOCATION));

     this.getHadoopJob().setNumReduceTasks(0);
     this.getHadoopJob().setOutputFormatClass(TableOutputFormat.class);
     this.getHadoopJob().getConfiguration().set(TableOutputFormat.OUTPUT_TABLE,
         myTable.getTableName());
     this.getHadoopJob().setOutputKeyClass(ImmutableBytesWritable.class);
     this.getHadoopJob().setOutputValueClass(Put.class);
 }




 *

 Dan Yi | Software Engineer, Analytics Engineering
   Medio Systems Inc | 701 Pike St. #1500 Seattle, WA 98101
 Predictive Analytics for a Connected World
  *




 --
 Harsh J




-- 

*Benjamin Kim*
*benkimkimben at gmail*
medio.gif

Re: Error reading task output

2012-07-27 Thread Ben Kim
Hi,
I'm having a similar problem, so I'll continue on this thread to describe
my issue.

I ran an MR job that takes 70 GB of input and creates 1098 mappers and 100
reducers (on a 9-node Hadoop cluster),
but the job fails and 4 datanodes die after a few minutes (the processes are
still running, but the master recognizes them as dead).
When I investigate the job, it looks like 20 mappers fail with these errors:

ProcfsBasedProcessTree: java.io.IOException: Cannot run program getconf:
 java.io.IOException: error=11, Resource temporarily unavailable
 ..
 OutOfMemoryError: unable to create new native thread
 ..
 # There is insufficient memory for the Java Runtime Environment to
 continue.
 # Cannot create GC thread. Out of system resources.


Reducers also fail because they aren't able to retrieve the failed mappers'
outputs.
I'm guessing that somehow JVM memory reaches its max and the tasktrackers and
datanodes aren't able to create new threads, so they die.

But given my lack of experience with Hadoop, I don't know what's actually
causing it, and of course I don't have answers yet.

Here are some *configurations*:
HADOOP_HEAPSIZE=4096
HADOOP_NAMENODE_OPTS = .. -Xmx2g ..
HADOOP_DATANODE_OPTS = .. -Xmx4g ..
HADOOP_JOBTRACKER_OPTS = .. -Xmx4g ..

dfs.datanode.max.xcievers = 6
mapred.child.java.opts = -Xmx400m
mapred.tasktracker.map.tasks.maximum = 14
mapred.tasktracker.reduce.tasks.maximum = 14
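
A rough sanity check on those numbers (the 16 GB of RAM per worker is an assumption, not a measured value): 14 map slots + 14 reduce slots at -Xmx400m is about 11 GB of child heap alone, before the 4 GB datanode heap, the tasktracker heap (HADOOP_HEAPSIZE=4096 here), and per-thread stack overhead, so a 16 GB worker could plausibly run out. "Unable to create new native thread" can also mean the hadoop user's process/thread limit is exhausted rather than heap. A quick sketch for checking both on a worker node:

free -m                   # physical memory and swap actually available
ulimit -u                 # max processes/threads allowed for this user
ps -u hadoop -L | wc -l   # threads currently in use by the hadoop user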

Also attached the *logs*.

If anyone knows the answer, please please let me know.
I would appreciate any help on this.

Best regards,
Ben

On Fri, Jun 15, 2012 at 1:05 PM, Harsh J ha...@cloudera.com wrote:

 Do you ship a lot of dist-cache files or perhaps have a bad
 mapred.child.java.opts parameter?

 On Fri, Jun 15, 2012 at 1:39 AM, Shamshad Ansari sans...@apixio.com
 wrote:
  Hi All,
  When I run Hadoop jobs, I observe the following errors. Also, I notice
  that a datanode dies every time the job is initiated.
 
  Does anyone know what may be causing this and how to solve it?
 
  ==
 
  12/06/14 19:57:17 INFO input.FileInputFormat: Total input paths to
 process :
  1
  12/06/14 19:57:17 INFO mapred.JobClient: Running job:
 job_201206141136_0002
  12/06/14 19:57:18 INFO mapred.JobClient:  map 0% reduce 0%
  12/06/14 19:57:27 INFO mapred.JobClient: Task Id :
  attempt_201206141136_0002_m_01_0, Status : FAILED
  java.lang.Throwable: Child Error
  at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:271)
  Caused by: java.io.IOException: Task process exit with nonzero status of
 1.
  at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:258)
 
  12/06/14 19:57:27 WARN mapred.JobClient: Error reading task
 
 outputhttp://node1:50060/tasklog?plaintext=true&attemptid=attempt_201206141136_0002_m_01_0&filter=stdout
  12/06/14 19:57:27 WARN mapred.JobClient: Error reading task
 
 outputhttp://node1:50060/tasklog?plaintext=true&attemptid=attempt_201206141136_0002_m_01_0&filter=stderr
  12/06/14 19:57:33 INFO mapred.JobClient: Task Id :
  attempt_201206141136_0002_r_02_0, Status : FAILED
  java.lang.Throwable: Child Error
  at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:271)
  Caused by: java.io.IOException: Task process exit with nonzero status of
 1.
  at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:258)
 
  12/06/14 19:57:33 WARN mapred.JobClient: Error reading task
 
 outputhttp://node1:50060/tasklog?plaintext=true&attemptid=attempt_201206141136_0002_r_02_0&filter=stdout
  12/06/14 19:57:33 WARN mapred.JobClient: Error reading task
 
 outputhttp://node1:50060/tasklog?plaintext=true&attemptid=attempt_201206141136_0002_r_02_0&filter=stderr
  ^Chadoop@ip-10-174-87-251:~/apixio-pipeline/pipeline-trigger$ 12/06/14
  19:57:27 WARN mapred.JobClient: Error reading task
 
 outputhttp:/node1:50060/sklog?plaintext=trueattemptid=attempt_201206141136_0002_m_01_0filter=stdout
 
  Thank you,
  --Shamshad
 



 --
 Harsh J




-- 

*Benjamin Kim*
*benkimkimben at gmail*


datanode.log
Description: Binary data


mapper.log
Description: Binary data


reducer.log
Description: Binary data


tasktracker.log
Description: Binary data


Hadoop topology not working (all servers belongs to default rack)

2012-06-27 Thread Ben Kim
Hi,
I got my topology script from
http://wiki.apache.org/hadoop/topology_rack_awareness_scripts
and checked that the script works correctly.

But in the Hadoop cluster, all my servers get assigned to the default rack.
I'm using Hadoop 1.0.3, but I had the same problem with version 1.0.0.
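
For comparison, the wiring I believe 1.x expects looks like this (a sketch; the property name is the 1.x one as I understand it, and the script path and rack names are placeholders):

# core-site.xml on the NameNode/JobTracker:
#   <property>
#     <name>topology.script.file.name</name>
#     <value>/home/hadoop/conf/rack-topology.sh</value>
#   </property>

# the script must be executable and print one rack per argument, e.g.:
/home/hadoop/conf/rack-topology.sh 10.0.0.11 10.0.0.12
# expected output: /rack1 /rack1

# restart the NameNode (and JobTracker) afterwards; the datanode list in the
# web UI should then show the resolved rack instead of /default-rack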


Yunhong had the same problem in the past, without any resolution:
http://mail-archives.apache.org/mod_mbox/hadoop-common-user/200807.mbox/%3cpine.lnx.4.64.0807031453070.28...@bert.cs.uic.edu%3E

*Benjamin Kim*
*benkimkimben at gmail*