Manually creating the "system" directory gets me past the first error, but now I get this. I'm not sure it's actually a step forward, though, because the map task never shows up in the JobTracker.

[EMAIL PROTECTED] hadoop-0.17.1]$ bin/hadoop distcp "file:///home/mdidomenico/1gTestfile" "1gTestfile"
08/09/09 13:12:06 INFO util.CopyFiles: srcPaths=[file:/home/mdidomenico/1gTestfile]
08/09/09 13:12:06 INFO util.CopyFiles: destPath=1gTestfile
08/09/09 13:12:07 INFO dfs.DFSClient: Exception in createBlockOutputStream java.io.IOException: Could not read from stream
08/09/09 13:12:07 INFO dfs.DFSClient: Abandoning block blk_5758513071638050362
08/09/09 13:12:13 INFO dfs.DFSClient: Exception in createBlockOutputStream java.io.IOException: Could not read from stream
08/09/09 13:12:13 INFO dfs.DFSClient: Abandoning block blk_1691495306775808049
08/09/09 13:12:17 INFO dfs.DFSClient: Exception in createBlockOutputStream java.io.IOException: Could not read from stream
08/09/09 13:12:17 INFO dfs.DFSClient: Abandoning block blk_1027634596973755899
08/09/09 13:12:19 INFO dfs.DFSClient: Exception in createBlockOutputStream java.io.IOException: Could not read from stream
08/09/09 13:12:19 INFO dfs.DFSClient: Abandoning block blk_4535302510016050282
08/09/09 13:12:23 INFO dfs.DFSClient: Exception in createBlockOutputStream java.io.IOException: Could not read from stream
08/09/09 13:12:23 INFO dfs.DFSClient: Abandoning block blk_7022658012001626339
08/09/09 13:12:25 INFO dfs.DFSClient: Exception in createBlockOutputStream java.io.IOException: Could not read from stream
08/09/09 13:12:25 INFO dfs.DFSClient: Abandoning block blk_-4509681241839967328
08/09/09 13:12:29 INFO dfs.DFSClient: Exception in createBlockOutputStream java.io.IOException: Could not read from stream
08/09/09 13:12:29 INFO dfs.DFSClient: Abandoning block blk_8318033979013580420
08/09/09 13:12:31 WARN dfs.DFSClient: DataStreamer Exception: java.io.IOException: Unable to create new block.
08/09/09 13:12:31 WARN dfs.DFSClient: Error Recovery for block blk_-4509681241839967328 bad datanode[0]
08/09/09 13:12:35 INFO dfs.DFSClient: Exception in createBlockOutputStream java.io.IOException: Could not read from stream
08/09/09 13:12:35 INFO dfs.DFSClient: Abandoning block blk_2848354798649979411
08/09/09 13:12:41 WARN dfs.DFSClient: DataStreamer Exception: java.io.IOException: Unable to create new block.
08/09/09 13:12:41 WARN dfs.DFSClient: Error Recovery for block blk_2848354798649979411 bad datanode[0]
Exception in thread "Thread-0" java.util.ConcurrentModificationException
        at java.util.TreeMap$PrivateEntryIterator.nextEntry(Unknown Source)
        at java.util.TreeMap$KeyIterator.next(Unknown Source)
        at org.apache.hadoop.dfs.DFSClient.close(DFSClient.java:217)
        at org.apache.hadoop.dfs.DistributedFileSystem.close(DistributedFileSystem.java:214)
        at org.apache.hadoop.fs.FileSystem$Cache.closeAll(FileSystem.java:1324)
        at org.apache.hadoop.fs.FileSystem.closeAll(FileSystem.java:224)
        at org.apache.hadoop.fs.FileSystem$ClientFinalizer.run(FileSystem.java:209)
08/09/09 13:12:41 INFO dfs.DFSClient: Exception in createBlockOutputStream java.io.IOException: Could not read from stream
08/09/09 13:12:41 INFO dfs.DFSClient: Abandoning block blk_9189111926428577428
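The repeated createBlockOutputStream / "Abandoning block" failures generally mean the client can reach the namenode (the job gets created) but cannot open a data-transfer connection to any datanode. As a quick sanity check from the submitting machine, a sketch like the following probes each datanode's transfer port (50010 is the default in 0.17.x; the hostnames here are hypothetical placeholders for your actual datanodes):

```python
import socket

def port_open(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # covers refused connections, timeouts, and DNS failures
        return False

if __name__ == "__main__":
    # hypothetical datanode hostnames; substitute the hosts from your slaves file
    for host in ["datanode1", "datanode2"]:
        status = "reachable" if port_open(host, 50010) else "unreachable"
        print(host, status)
```

If the port shows unreachable while the namenode's RPC port works, a firewall or hostname-resolution mismatch between the client and the datanodes would be the first thing to rule out.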
On Tue, Sep 9, 2008 at 1:03 PM, Michael Di Domenico <[EMAIL PROTECTED]> wrote:

> A little more digging, and it appears I cannot run distcp as someone other
> than hadoop on the namenode:
> /tmp/hadoop-hadoop/mapred/system/job_200809091231_0005/job.xml
>
> Looking at this directory from the error, the "system" directory does
> not exist on the namenode; I only have a "local" directory.
>
> On Tue, Sep 9, 2008 at 12:41 PM, Michael Di Domenico <[EMAIL PROTECTED]> wrote:
>
>> I'm not sure that's the issue; I basically tarred up the hadoop directory
>> from the cluster and copied it over to the non-datanode machine.
>> But I do agree I've likely got a setting wrong, since I can run distcp
>> from the namenode and it works fine. The question is which one.
>>
>> On Mon, Sep 8, 2008 at 7:04 PM, Aaron Kimball <[EMAIL PROTECTED]> wrote:
>>
>>> It is likely that your mapred.system.dir and/or fs.default.name settings
>>> are incorrect on the non-datanode machine that you are launching the task
>>> from. These two settings (in your conf/hadoop-site.xml file) must match
>>> the settings on the cluster itself.
>>>
>>> - Aaron
>>>
>>> On Sun, Sep 7, 2008 at 8:58 PM, Michael Di Domenico <[EMAIL PROTECTED]> wrote:
>>>
>>> > I'm attempting to load data into Hadoop (version 0.17.1) from a
>>> > non-datanode machine in the cluster. I can run jobs, and copyFromLocal
>>> > works fine, but when I try to use distcp I get the error below. I don't
>>> > understand the error; can anyone help?
>>> > Thanks
>>> >
>>> > blue:hadoop-0.17.1 mdidomenico$ time bin/hadoop distcp -overwrite file:///Users/mdidomenico/hadoop/1gTestfile /user/mdidomenico/1gTestfile
>>> > 08/09/07 23:56:06 INFO util.CopyFiles: srcPaths=[file:/Users/mdidomenico/hadoop/1gTestfile]
>>> > 08/09/07 23:56:06 INFO util.CopyFiles: destPath=/user/mdidomenico/1gTestfile1
>>> > 08/09/07 23:56:07 INFO util.CopyFiles: srcCount=1
>>> > With failures, global counters are inaccurate; consider running with -i
>>> > Copy failed: org.apache.hadoop.ipc.RemoteException: java.io.IOException:
>>> > /tmp/hadoop-hadoop/mapred/system/job_200809072254_0005/job.xml: No such file or directory
>>> >         at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:215)
>>> >         at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:149)
>>> >         at org.apache.hadoop.fs.FileSystem.copyToLocalFile(FileSystem.java:1155)
>>> >         at org.apache.hadoop.fs.FileSystem.copyToLocalFile(FileSystem.java:1136)
>>> >         at org.apache.hadoop.mapred.JobInProgress.<init>(JobInProgress.java:175)
>>> >         at org.apache.hadoop.mapred.JobTracker.submitJob(JobTracker.java:1755)
>>> >         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>> >         at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
>>> >         at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
>>> >         at java.lang.reflect.Method.invoke(Unknown Source)
>>> >         at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:446)
>>> >         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:896)
>>> >
>>> >         at org.apache.hadoop.ipc.Client.call(Client.java:557)
>>> >         at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:212)
>>> >         at $Proxy1.submitJob(Unknown Source)
>>> >         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>> >         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>>> >         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>> >         at java.lang.reflect.Method.invoke(Method.java:585)
>>> >         at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
>>> >         at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
>>> >         at $Proxy1.submitJob(Unknown Source)
>>> >         at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:758)
>>> >         at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:973)
>>> >         at org.apache.hadoop.util.CopyFiles.copy(CopyFiles.java:604)
>>> >         at org.apache.hadoop.util.CopyFiles.run(CopyFiles.java:743)
>>> >         at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
>>> >         at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
>>> >         at org.apache.hadoop.util.CopyFiles.main(CopyFiles.java:763)
>>> >
>>
>
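To make Aaron's suggestion concrete: the client's conf/hadoop-site.xml has to carry the same fs.default.name and mapred.system.dir values as the cluster's. A minimal sketch of what that might look like, assuming a hypothetical namenode host and the 0.17.x default system dir (copy the actual values from a cluster node rather than trusting these placeholders):

```xml
<!-- conf/hadoop-site.xml on the submitting machine; values must match the cluster -->
<configuration>
  <property>
    <name>fs.default.name</name>
    <!-- hypothetical namenode host/port; use the cluster's actual value -->
    <value>hdfs://namenode.example.com:9000</value>
  </property>
  <property>
    <name>mapred.job.tracker</name>
    <!-- hypothetical jobtracker host/port -->
    <value>namenode.example.com:9001</value>
  </property>
  <property>
    <name>mapred.system.dir</name>
    <!-- this is the 0.17.x default; it must resolve the same on client and cluster -->
    <value>${hadoop.tmp.dir}/mapred/system</value>
  </property>
</configuration>
```

Since mapred.system.dir defaults to a path under hadoop.tmp.dir (which itself embeds the local user name), submitting as a user other than hadoop can resolve it to a different path than the one the JobTracker uses, which would be consistent with the job.xml "No such file or directory" error above.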