Re: RemoteException writing files

2012-05-20 Thread Todd McFarland
Thanks for the links.  The behavior is as the links describe, but the bottom
line is that it works fine when I copy these files on the Linux VMware instance
via the command line.

Using my Java program remotely, it simply doesn't work.  All I can think of
is that there is some property on the Java side (in Windows 7) that is
telling Hadoop (in VMware Linux) to do the block replication differently
than what it does when the operation is run locally via the command line.

This is a frustrating problem.  I'm sure it's a 10-second fix if I can find
the right property to set in the Configuration class.
This is what I have loaded into the Configuration class so far:

config.addResource(new Path("c:/_bigdata/client_libs/core-site.xml"));
config.addResource(new Path("c:/_bigdata/client_libs/hdfs-site.xml"));
config.addResource(new Path("c:/_bigdata/client_libs/mapred-site.xml"));
//config.set("dfs.replication", "1");
//config.set("dfs.datanode.address", "192.168.60.128:50010");

Setting dfs.replication=1 is the default setting from hdfs-site.xml, so
that didn't do anything.  I tried to override dfs.datanode.address in
case 127.0.0.1:50010 was the issue, but it apparently gets overridden on the
Linux end.

How do I override 127.0.0.1 to localhost?  Which config file?
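For what it's worth, my guess is that the change belongs in the config files on
the VM itself rather than in my client code.  Something like the following is
what I have in mind, though I'm not at all sure these are the right files or
properties (or that 8020 is the right port), so treat it as a sketch only:

<!-- core-site.xml on the Linux VM: point fs.default.name at an address
     that is routable from outside the VM, instead of localhost/127.0.0.1 -->
<property>
  <name>fs.default.name</name>
  <value>hdfs://192.168.60.128:8020</value>
</property>

<!-- hdfs-site.xml on the Linux VM: have the DataNode listen on the routable
     address so a remote client can stream blocks to it -->
<property>
  <name>dfs.datanode.address</name>
  <value>192.168.60.128:50010</value>
</property>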

-


On Sat, May 19, 2012 at 2:00 PM, samir das mohapatra 
samir.help...@gmail.com wrote:

 Hi,
  This could be due to one of the following reasons:

 1) The NameNode (http://wiki.apache.org/hadoop/NameNode) does not have
    any available DataNodes.
 2) The NameNode was not able to start properly.
 3) Otherwise, some IP issue.
 Note: Please mention localhost instead of 127.0.0.1 (if it is
 local).

   Follow this URL:

 http://wiki.apache.org/hadoop/FAQ#What_does_.22file_could_only_be_replicated_to_0_nodes.2C_instead_of_1.22_mean.3F


 Thanks
  samir


 On Sat, May 19, 2012 at 8:59 PM, Todd McFarland toddmcf2...@gmail.com
 wrote:

  Hi folks,
 
  (Resending to this group, sent to common-dev before, pretty sure that's
 for
  Hadoop internal development - sorry for that..)
 
  I'm pretty stuck here.  I've been researching for hours and I haven't
 made
  any forward progress on this one.
 
  I have a vmWare installation of Cloudera Hadoop 0.20.  The following
  commands to create a directory and copy a file from the shared folder
 *work
  fine*, so I'm confident everything is setup correctly:
 
  [cloudera@localhost bin]$ hadoop fs -mkdir /user/cloudera/testdir
  [cloudera@localhost bin]$ hadoop fs -put
 /mnt/hgfs/shared_folder/file1.txt
  /user/cloudera/testdir/file1.txt
 
  The file shows up fine in the HDFS doing it this way on the Linux VM.
 
  *However*, when I try doing the equivalent operation in Java everything
  works great until I try to close() FSDataOutputStream.
  I'm left with the new directory and a zero byte size file.  One
 suspicious
  thing is that the user is admin instead of cloudera which I haven't
  figured out why.  Here is the error:
 
  12/05/19 09:45:46 INFO hdfs.DFSClient: Exception in
 createBlockOutputStream
  127.0.0.1:50010 java.net.ConnectException: Connection refused: no
 further
  information
  12/05/19 09:45:46 INFO hdfs.DFSClient: Abandoning block
  blk_1931357292676354131_1068
  12/05/19 09:45:46 INFO hdfs.DFSClient: Excluding datanode
 127.0.0.1:50010
  12/05/19 09:45:46 WARN hdfs.DFSClient: DataStreamer Exception:
  org.apache.hadoop.ipc.RemoteException: java.io.IOException: File
  /user/admin/testdir/file1.txt could only be replicated to 0 nodes,
 instead
  of 1
 at
 
 
 org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1533)
 at
 
 org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:667)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at
 
 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
 at
 
 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
 
  There are certainly lots of search references to *could only be replicated
  to 0 nodes, instead of 1*, but chasing down those suggestions hasn't helped.
  I have run *jps* and *netstat* and that looks good.  All services are
  running, all ports seem to be good.  The *health check* looks good, plenty
  of disk space, no failed nodes...
 
  Here is the Java (it fails when it hits fs.close()):
 
  import java.io.BufferedReader;
  import java.io.FileInputStream;
  import java.io.FileReader;
  import java.io.IOException;
  import org.apache.hadoop.conf.Configuration;
  import org.apache.hadoop.fs.FSDataInputStream;
  import org.apache.hadoop.fs.FSDataOutputStream;
  import org.apache.hadoop.fs.FileSystem;
  import org.apache.hadoop.fs.Path;
 
  public class TestFileTrans {
 
 public static void main(String[] args) {
 
 Configuration config = new Configuration();
 
 config.addResource(new
  Path("c:/_bigdata/client_libs/core-site.xml"));
 

Re: RemoteException writing files

2012-05-20 Thread Ravi Prakash
Hi Todd,

It might be useful to try the CDH user mailing list too. I'm afraid I
haven't used CDH, so I'm not entirely certain.
The fact that after you run your Java program, the NN has created a
directory and a 0-byte file means you were able to contact and interact
with the NN just fine. I'm guessing the problem is in streaming data to the
DN(s). Does the VM have its ports blocked, causing your client (presumably
outside the VM) to be unable to talk to the DNs? What happens when you run the
Java program from inside the VM?
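One quick way to test reachability from the Windows side, independent of
Hadoop, is a bare TCP connect to the DataNode port. A rough sketch (I'm
borrowing the 192.168.60.128 address and port 50010 from your mails, so adjust
as needed):

import java.net.InetSocketAddress;
import java.net.Socket;

public class DataNodePortCheck {
    public static void main(String[] args) throws Exception {
        // Attempt a plain TCP connection to the DataNode's data-transfer port.
        // "Connection refused" or a timeout here matches the
        // createBlockOutputStream error in your log.
        Socket socket = new Socket();
        try {
            socket.connect(new InetSocketAddress("192.168.60.128", 50010), 5000);
            System.out.println("Port 50010 on the VM is reachable from this client.");
        } finally {
            socket.close();
        }
    }
}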

After your Java program is unable to talk to the DN, it asks the NN for
another DN. I'm guessing that since there are no more left, you see the message
*could only be replicated to 0 nodes, instead of 1*.  So it's kind of a red
herring.

 Using my Java program remotely, it simply doesn't work.  All I can think of
 is that there is some property on the Java side (in Windows 7) that is
 telling Hadoop (in VMware Linux) to do the block replication differently
 than what it does when the operation is run locally via the command line.

I would be very surprised if this were the issue.

Hope this helps,
Ravi.

On Sun, May 20, 2012 at 9:40 AM, Todd McFarland toddmcf2...@gmail.com wrote:

 Thanks for the links.  The behavior is as the links describe but bottom
 line it works fine if I'm copying these files on the Linux VMWare instance
 via the command line.

 Using my java program remotely, it simply doesn't work.  All I can think of
 is that there is some property on the Java side (in Windows 7) that is
 telling Hadoop (in VMware Linux) to do the block replication differently
 than what it does when the operation is run locally via the command line.

 This is a frustrating problem.  I'm sure it's a 10-second fix if I can find
 the right property to set in the Configuration class.
 This is what I have loaded into the Configuration class so far:

 config.addResource(new Path("c:/_bigdata/client_libs/core-site.xml"));
 config.addResource(new Path("c:/_bigdata/client_libs/hdfs-site.xml"));
 config.addResource(new Path("c:/_bigdata/client_libs/mapred-site.xml"));
 //config.set("dfs.replication", "1");
 //config.set("dfs.datanode.address", "192.168.60.128:50010");

 Setting dfs.replication=1 is the default setting from hdfs-site.xml, so
 that didn't do anything.  I tried to override dfs.datanode.address in
 case 127.0.0.1:50010 was the issue, but it apparently gets overridden on the
 Linux end.

 How do I override 127.0.0.1 to localhost?  What config file?

 -


 On Sat, May 19, 2012 at 2:00 PM, samir das mohapatra 
 samir.help...@gmail.com wrote:

  Hi,
   This could be due to one of the following reasons:
 
  1) The NameNode (http://wiki.apache.org/hadoop/NameNode) does not have
     any available DataNodes.
  2) The NameNode was not able to start properly.
  3) Otherwise, some IP issue.
  Note: Please mention localhost instead of 127.0.0.1 (if it is
  local).
 
    Follow this URL:
 
  http://wiki.apache.org/hadoop/FAQ#What_does_.22file_could_only_be_replicated_to_0_nodes.2C_instead_of_1.22_mean.3F
 
 
  Thanks
   samir
 
 
  On Sat, May 19, 2012 at 8:59 PM, Todd McFarland toddmcf2...@gmail.com
  wrote:
 
   Hi folks,
  
   (Resending to this group, sent to common-dev before, pretty sure that's
  for
   Hadoop internal development - sorry for that..)
  
   I'm pretty stuck here.  I've been researching for hours and I haven't
  made
   any forward progress on this one.
  
   I have a vmWare installation of Cloudera Hadoop 0.20.  The following
   commands to create a directory and copy a file from the shared folder
  *work
   fine*, so I'm confident everything is setup correctly:
  
   [cloudera@localhost bin]$ hadoop fs -mkdir /user/cloudera/testdir
   [cloudera@localhost bin]$ hadoop fs -put
  /mnt/hgfs/shared_folder/file1.txt
   /user/cloudera/testdir/file1.txt
  
   The file shows up fine in the HDFS doing it this way on the Linux VM.
  
   *However*, when I try doing the equivalent operation in Java everything
   works great until I try to close() FSDataOutputStream.
   I'm left with the new directory and a zero byte size file.  One
  suspicious
   thing is that the user is admin instead of cloudera which I haven't
   figured out why.  Here is the error:
  
   12/05/19 09:45:46 INFO hdfs.DFSClient: Exception in
  createBlockOutputStream
   127.0.0.1:50010 java.net.ConnectException: Connection refused: no
  further
   information
   12/05/19 09:45:46 INFO hdfs.DFSClient: Abandoning block
   blk_1931357292676354131_1068
   12/05/19 09:45:46 INFO hdfs.DFSClient: Excluding datanode
  127.0.0.1:50010
   12/05/19 09:45:46 WARN hdfs.DFSClient: DataStreamer Exception:
   org.apache.hadoop.ipc.RemoteException: java.io.IOException: File
   /user/admin/testdir/file1.txt could only be replicated to 0 nodes,
  instead
   of 1
  at
  
  
 
 org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1533)
  at
  
 
 

Re: RemoteException writing files

2012-05-19 Thread samir das mohapatra
Hi,
  This could be due to one of the following reasons:

1) The NameNode (http://wiki.apache.org/hadoop/NameNode) does not have
   any available DataNodes (a quick check for this is sketched below).
2) The NameNode was not able to start properly.
3) Otherwise, some IP issue.
Note: Please mention localhost instead of 127.0.0.1 (if it is
local).

   Follow this URL:
http://wiki.apache.org/hadoop/FAQ#What_does_.22file_could_only_be_replicated_to_0_nodes.2C_instead_of_1.22_mean.3F
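To check (1) and (3) on the VM itself, something along these lines should help
(just a sketch; jps and netstat are the tools you already ran, and the exact
dfsadmin output format varies by version):

# On the Linux VM: ask the NameNode how many live DataNodes it sees
hadoop dfsadmin -report

# Confirm the NameNode and DataNode processes are up
jps

# Check which address the DataNode is listening on for port 50010;
# if it is bound only to 127.0.0.1, a remote client cannot reach it
netstat -tln | grep 50010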


Thanks
 samir


On Sat, May 19, 2012 at 8:59 PM, Todd McFarland toddmcf2...@gmail.com wrote:

 Hi folks,

 (Resending to this group, sent to common-dev before, pretty sure that's for
 Hadoop internal development - sorry for that..)

 I'm pretty stuck here.  I've been researching for hours and I haven't made
 any forward progress on this one.

 I have a VMware installation of Cloudera Hadoop 0.20.  The following
 commands to create a directory and copy a file from the shared folder *work
 fine*, so I'm confident everything is set up correctly:

 [cloudera@localhost bin]$ hadoop fs -mkdir /user/cloudera/testdir
 [cloudera@localhost bin]$ hadoop fs -put /mnt/hgfs/shared_folder/file1.txt
 /user/cloudera/testdir/file1.txt

 The file shows up fine in the HDFS doing it this way on the Linux VM.

 *However*, when I try doing the equivalent operation in Java, everything
 works great until I try to close() the FSDataOutputStream.
 I'm left with the new directory and a zero-byte file.  One suspicious
 thing is that the user is "admin" instead of "cloudera", which I haven't
 figured out why.  Here is the error:

 12/05/19 09:45:46 INFO hdfs.DFSClient: Exception in createBlockOutputStream
 127.0.0.1:50010 java.net.ConnectException: Connection refused: no further
 information
 12/05/19 09:45:46 INFO hdfs.DFSClient: Abandoning block
 blk_1931357292676354131_1068
 12/05/19 09:45:46 INFO hdfs.DFSClient: Excluding datanode 127.0.0.1:50010
 12/05/19 09:45:46 WARN hdfs.DFSClient: DataStreamer Exception:
 org.apache.hadoop.ipc.RemoteException: java.io.IOException: File
 /user/admin/testdir/file1.txt could only be replicated to 0 nodes, instead
 of 1
at

 org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1533)
at
 org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:667)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at

 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at

 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)

 There are certainly lots of search references to *could only be replicated
 to 0 nodes, instead of 1*, but chasing down those suggestions hasn't helped.
 I have run *jps* and *netstat* and that looks good.  All services are
 running, all ports seem to be good.  The *health check* looks good, plenty
 of disk space, no failed nodes...

 Here is the Java (it fails when it hits fs.close()):

 import java.io.BufferedReader;
 import java.io.FileInputStream;
 import java.io.FileReader;
 import java.io.IOException;
 import org.apache.hadoop.conf.Configuration;
 import org.apache.hadoop.fs.FSDataInputStream;
 import org.apache.hadoop.fs.FSDataOutputStream;
 import org.apache.hadoop.fs.FileSystem;
 import org.apache.hadoop.fs.Path;

 public class TestFileTrans {

     public static void main(String[] args) {

         Configuration config = new Configuration();

         config.addResource(new Path("c:/_bigdata/client_libs/core-site.xml"));
         config.addResource(new Path("c:/_bigdata/client_libs/hdfs-site.xml"));

         System.out.println("hadoop.tmp.dir: " + config.get("hadoop.tmp.dir"));
         try {
             FileSystem dfs = FileSystem.get(config);

             // this will default to admin unless the workingDirectory is
             // explicitly set..
             System.out.println("HDFS Working Directory: " +
                 dfs.getWorkingDirectory().toString());

             String dirName = "testdir";
             Path src = new Path(dfs.getWorkingDirectory() + "/" + dirName);
             dfs.mkdirs(src);

             System.out.println("HDFS Directory created: " +
                 dfs.getWorkingDirectory().toString());

             loadFile(dfs, src);

         } catch (IOException e) {
             System.out.println("Error " + e.getMessage());
         }
     }

     private static void loadFile(FileSystem dfs, Path src) throws IOException {

         FileInputStream fis = new
             FileInputStream("c:/_bigdata/shared_folder/file1.txt");

         int len = fis.available();

         byte[] btr = new byte[len];

         fis.read(btr);

         FSDataOutputStream fs = dfs.create(new Path(src.toString() + "/file1.txt"));

         fs.write(btr);

         fs.flush();
         fs.close();
     }
 }

 Any help would be greatly appreciated!


