Re: RemoteException writing files
Thanks for the links. The behavior is as the links describe, but the bottom line is that it works fine if I copy these files on the Linux VMware instance via the command line. Using my Java program remotely, it simply doesn't work. All I can think of is that there is some property on the Java side (in Windows 7) that is telling Hadoop (in VMware Linux) to do the block replication differently than when the operation is run locally via the command line.

This is a frustrating problem. I'm sure it's a 10-second fix if I can find the right property to set in the Configuration class. This is what I have loaded into the Configuration class so far:

config.addResource(new Path("c:/_bigdata/client_libs/core-site.xml"));
config.addResource(new Path("c:/_bigdata/client_libs/hdfs-site.xml"));
config.addResource(new Path("c:/_bigdata/client_libs/mapred-site.xml"));
// config.set("dfs.replication", "1");
// config.set("dfs.datanode.address", "192.168.60.128:50010");

Setting dfs.replication=1 is the default from hdfs-site.xml, so that didn't do anything. I tried to override dfs.datanode.address in case 127.0.0.1:50010 was the issue, but it apparently gets overridden on the Linux end.

How do I override 127.0.0.1 to localhost? What config file?
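As for "what config file": the 127.0.0.1 address is usually not something the Windows-side Configuration object can fix; it is the address the DataNode registers with the NameNode, and that comes from name resolution on the VM. Below is only a sketch of the kind of change involved on the VM side, assuming the VM is reachable from Windows at 192.168.60.128 (the address in the commented-out override) and using a hypothetical hostname cloudera-vm; slave.host.name is the 0.20-era property for forcing the hostname a DataNode reports, and the DataNode needs a restart afterwards.

# /etc/hosts on the VM: the machine's hostname should resolve to the routable
# address, not 127.0.0.1, or the DataNode registers itself as 127.0.0.1
192.168.60.128   cloudera-vm

<!-- hdfs-site.xml on the VM (sketch): keep the DataNode listening on all
     interfaces and report the routable address to the NameNode -->
<property>
  <name>dfs.datanode.address</name>
  <value>0.0.0.0:50010</value>
</property>
<property>
  <name>slave.host.name</name>
  <value>192.168.60.128</value>
</property>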
Re: RemoteException writing files
Hi Todd,

It might be useful to try the CDH user mailing list too. I'm afraid I haven't used CDH, so I'm not entirely certain.

The fact that after you run your Java program the NN has created a directory and a 0-byte file means you were able to contact and interact with the NN just fine. I'm guessing the problem is in streaming data to the DN(s). Does the VM have its ports blocked, causing your client (presumably outside the VM) to be unable to talk to the DNs? What happens when you run the Java program from inside the VM?

After your Java program is unable to talk to the DN, it asks the NN for another DN. I'm guessing that since there are no more left, you see the message *could only be replicated to 0 nodes, instead of 1*. So it's kind of a red herring.

> Using my java program remotely, it simply doesn't work. All I can think of
> is that there is some property on the Java side (in Windows 7) that is
> telling Hadoop (in VMware Linux) to do the block replication differently
> than what it does when the operation is run locally via the command line.

I would be very surprised if this were the issue.

Hope this helps,
Ravi.
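A quick way to test the blocked-ports theory from the Windows side is a plain socket probe. This is a minimal sketch only, assuming the VM is at 192.168.60.128 and the CDH-default ports (8020 NameNode IPC, 50010/50020 DataNode); substitute whatever addresses and ports your configs actually use.

import java.net.InetSocketAddress;
import java.net.Socket;

public class PortProbe {
    public static void main(String[] args) throws Exception {
        // Hypothetical VM address; use the address your cluster actually advertises.
        String host = "192.168.60.128";
        int[] ports = {8020, 50010, 50020};
        for (int port : ports) {
            Socket s = new Socket();
            try {
                // 3-second connect timeout so a firewalled port fails quickly
                s.connect(new InetSocketAddress(host, port), 3000);
                System.out.println(host + ":" + port + " reachable");
            } catch (Exception e) {
                System.out.println(host + ":" + port + " NOT reachable: " + e.getMessage());
            } finally {
                s.close();
            }
        }
    }
}

If 50010 is unreachable from Windows but reachable from inside the VM, that matches the "Connection refused" from createBlockOutputStream.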
RemoteException writing files
Hi folks,

(Resending to this group; I sent it to common-dev before, but I'm pretty sure that's for Hadoop internal development - sorry for that.)

I'm pretty stuck here. I've been researching for hours and I haven't made any forward progress on this one.

I have a VMware installation of Cloudera Hadoop 0.20. The following commands to create a directory and copy a file from the shared folder *work fine*, so I'm confident everything is set up correctly:

[cloudera@localhost bin]$ hadoop fs -mkdir /user/cloudera/testdir
[cloudera@localhost bin]$ hadoop fs -put /mnt/hgfs/shared_folder/file1.txt /user/cloudera/testdir/file1.txt

The file shows up fine in HDFS doing it this way on the Linux VM. *However*, when I try the equivalent operation in Java, everything works great until I try to close() the FSDataOutputStream. I'm left with the new directory and a zero-byte file. One suspicious thing is that the user is admin instead of cloudera, which I haven't figured out.

Here is the error:

12/05/19 09:45:46 INFO hdfs.DFSClient: Exception in createBlockOutputStream 127.0.0.1:50010 java.net.ConnectException: Connection refused: no further information
12/05/19 09:45:46 INFO hdfs.DFSClient: Abandoning block blk_1931357292676354131_1068
12/05/19 09:45:46 INFO hdfs.DFSClient: Excluding datanode 127.0.0.1:50010
12/05/19 09:45:46 WARN hdfs.DFSClient: DataStreamer Exception: org.apache.hadoop.ipc.RemoteException: java.io.IOException: File /user/admin/testdir/file1.txt could only be replicated to 0 nodes, instead of 1
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1533)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:667)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)

There are certainly lots of search references to *could only be replicated to 0 nodes, instead of 1*, but chasing down those suggestions hasn't helped. I have run *jps* and *netstat* and that all looks good. All services are running, all ports seem to be good. The *health check* looks good: plenty of disk space, no failed nodes...

Here is the Java (it fails when it hits fs.close()):

import java.io.BufferedReader;
import java.io.FileInputStream;
import java.io.FileReader;
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class TestFileTrans {

    public static void main(String[] args) {
        Configuration config = new Configuration();
        config.addResource(new Path("c:/_bigdata/client_libs/core-site.xml"));
        config.addResource(new Path("c:/_bigdata/client_libs/hdfs-site.xml"));
        System.out.println("hadoop.tmp.dir: " + config.get("hadoop.tmp.dir"));
        try {
            FileSystem dfs = FileSystem.get(config);
            // this will default to admin unless the workingDirectory is explicitly set..
            System.out.println("HDFS Working Directory: " + dfs.getWorkingDirectory().toString());
            String dirName = "testdir";
            Path src = new Path(dfs.getWorkingDirectory() + "/" + dirName);
            dfs.mkdirs(src);
            System.out.println("HDFS Directory created: " + dfs.getWorkingDirectory().toString());
            loadFile(dfs, src);
        } catch (IOException e) {
            System.out.println("Error " + e.getMessage());
        }
    }

    private static void loadFile(FileSystem dfs, Path src) throws IOException {
        FileInputStream fis = new FileInputStream("c:/_bigdata/shared_folder/file1.txt");
        int len = fis.available();
        byte[] btr = new byte[len];
        fis.read(btr);
        FSDataOutputStream fs = dfs.create(new Path(src.toString() + "/file1.txt"));
        fs.write(btr);
        fs.flush();
        fs.close();
    }
}

Any help would be greatly appreciated!
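One side note on the copy loop, unrelated to the exception above: FileInputStream.available() is not guaranteed to return the file length, and a single read() may not fill the buffer. A hedged sketch of a drop-in replacement for loadFile using Hadoop's org.apache.hadoop.io.IOUtils (same hypothetical local path as in the program above):

import java.io.FileInputStream;
import java.io.IOException;

import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IOUtils;

// Sketch only: streams the local file in buffered chunks instead of relying on available().
private static void loadFile(FileSystem dfs, Path src) throws IOException {
    FileInputStream fis = new FileInputStream("c:/_bigdata/shared_folder/file1.txt");
    FSDataOutputStream out = dfs.create(new Path(src, "file1.txt"));
    try {
        // 4096-byte buffer; the final 'true' closes both streams when the copy finishes
        IOUtils.copyBytes(fis, out, 4096, true);
    } finally {
        IOUtils.closeStream(fis);
        IOUtils.closeStream(out);
    }
}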
Re: RemoteException writing files
Hi,

This could be due to the following reasons:
1) The NameNode (http://wiki.apache.org/hadoop/NameNode) does not have any available DataNodes
2) The NameNode is not able to start properly
3) Otherwise, some IP issue

Note: please use localhost instead of 127.0.0.1 (if it is local).

Follow this URL:
http://wiki.apache.org/hadoop/FAQ#What_does_.22file_could_only_be_replicated_to_0_nodes.2C_instead_of_1.22_mean.3F

Thanks,
samir
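Point (1) above, and the question of what address the NameNode is handing out, can both be checked from the client side. A minimal sketch, assuming the same hypothetical c:/_bigdata/client_libs config copies as in the original program; getDataNodeStats() is the DistributedFileSystem call in the 0.20 API that asks the NameNode for its DataNode list.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hdfs.DistributedFileSystem;
import org.apache.hadoop.hdfs.protocol.DatanodeInfo;

public class ListDataNodes {
    public static void main(String[] args) throws Exception {
        Configuration config = new Configuration();
        // Same hypothetical client-side copies of the cluster configs as above.
        config.addResource(new Path("c:/_bigdata/client_libs/core-site.xml"));
        config.addResource(new Path("c:/_bigdata/client_libs/hdfs-site.xml"));

        FileSystem fs = FileSystem.get(config);
        if (!(fs instanceof DistributedFileSystem)) {
            System.out.println("Not talking to HDFS: " + fs.getClass().getName());
            return;
        }
        DatanodeInfo[] nodes = ((DistributedFileSystem) fs).getDataNodeStats();
        System.out.println("DataNodes known to the NameNode: " + nodes.length);
        for (DatanodeInfo node : nodes) {
            // getName() is the host:port the NameNode hands to clients for block writes;
            // if it prints 127.0.0.1:50010, a remote client cannot reach it
            System.out.println("  " + node.getName());
        }
        fs.close();
    }
}

If this prints zero nodes, the cluster really has no live DataNodes; if it prints 127.0.0.1:50010, the problem is the address the DataNode registered with, not the client configuration.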