RE: Get Line Number from InputFormat

2010-04-06 Thread Michael Segel


OK, so getting your position into the file based on the offset and a known
fixed-length format (which is what you meant by structured) will give you a line number.
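
(For the fixed-length case the arithmetic is just integer division over the byte
offset; a tiny illustrative sketch, where RECORD_LENGTH is an assumed constant,
not anything from this thread:)

class LineNumberMath {
  // With fixed-length records, the 0-based line number of a record is simply
  // its byte offset divided by the record length.
  static final long RECORD_LENGTH = 80L; // bytes per record, delimiter included (assumed)

  static long lineNumber(long byteOffset) {
    return byteOffset / RECORD_LENGTH;
  }
}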

But let's look at the question from a more practical and wider angle. In most
applications where you have a single record per line, you will not have a
fixed-length record format, so you really don't have a good way to calculate
your line number from your position in the file.

Let's also look at how important a line number really is in practical use.
Much like a row_id in a partitioned table, a line number loses its meaning.

If the line number had specific meaning and the application ended its records
with a '\n' (or CR LF), then an alternative would be to add a field that
contains the line number.

HTH

-Mike

PS. Wouldn't you call a record in XML structured? Yet of an unknown length? ;-)

(Sorry, I haven't had my first cup of coffee yet. :-)   )
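
(For reference, a minimal sketch of what a mapper actually receives from
TextInputFormat: the key is the byte offset of the line within the file, not a
line number. The old 0.20 mapred API is assumed here, and the class name is
made up for illustration.)

import java.io.IOException;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.Mapper;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reporter;

public class OffsetAwareMapper extends MapReduceBase
    implements Mapper<LongWritable, Text, LongWritable, Text> {

  public void map(LongWritable byteOffset, Text line,
      OutputCollector<LongWritable, Text> output, Reporter reporter)
      throws IOException {
    // byteOffset is the line's position in the whole file, which is stable
    // across splits; a per-split line counter would not be.
    output.collect(byteOffset, line);
  }
}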
 From: am...@yahoo-inc.com
 To: common-user@hadoop.apache.org
 Date: Tue, 6 Apr 2010 12:14:56 +0530
 Subject: Re: Get Line Number from InputFormat
 
 Hi,
 If your records are structured / of equal size, then getting the line number 
 is straightforward.
 If not, you'll need to construct your own sequence of numbers; someone has been
 kind enough to publish one on his blog:
 
 http://www.data-miners.com/blog/2009/11/hadoop-and-mapreduce-parallel-program.html
 
 Amogh
 
 
 On 4/5/10 7:59 PM, Michael Segel michael_se...@hotmail.com wrote:
 
 
 
 
 
  Date: Mon, 5 Apr 2010 14:57:09 +0100
  From: lamfeeli...@gmail.com
  To: common-user@hadoop.apache.org
  Subject: Get Line Number from InputFormat
 
  Dear all,
 TextInputFormat sends the (offset, line) pair into the Mapper; however, the
  offset is sometimes meaningless and confusing. Is it possible to have an
  InputFormat which outputs (line number, line) into the mapper?
 
  Thanks a lot.
 
  Song
 
 Song,
 
 I'm not sure what you want is realistic or even worthwhile.
 
 You have a file and it's split into chunks of 64MB (default) or something
 larger based on your cloud settings.
 You have a map task that starts from a specific point in the file, but that
 does not mean that it's starting at a specific line, or that Hadoop will know
 which line of the file that is. (Your records are not always going to end at
 the end of a line, or be one line per record.)
 
 Does that make sense?
 Offset has more meaning than an arbitrary line number.
 
 -Mike
 
 
  

What means log DIR* NameSystem.completeFile: failed to complete... ?

2010-04-06 Thread Al Lias
Hi all,

this warning is written in FSNamesystem.java / completeFileInternal(). It
makes the calling code in NameNode.java throw an IOException.

FSNamesystem.java
...
if (fileBlocks == null ) {
  NameNode.stateChangeLog.warn("DIR* NameSystem.completeFile: "
                               + "failed to complete " + src
                               + " because dir.getFileBlocks() is null " +
                               " and pendingFile is " +
                               ((pendingFile == null) ? "null" :
                                 ("from " + pendingFile.getClientMachine()))
                               );
...

What is the meaning of this warning? Any idea what could have gone wrong
in such a case?

(This popped up through HBase, but as this code is in HDFS, I am asking
this list.)

Thx

Al


Re: losing network interfaces during long running map-reduce jobs

2010-04-06 Thread Steve Loughran

David Howell wrote:

But I haven't seen anything in the dmesg log. I'll have to try looking
at the tcpdump output on Monday, once I can get console access again.
My apologies that I'm so sketchy on details right now... so far, I
haven't been able to find any evidence of something going wrong
except for the Hadoop log entries when the IOExceptions start.

Thanks,
-David



I just lost my networking again. This time, I had switched my cluster
back to the build I was using before I switched to CDH2.

It's Hadoop 0.20.1 with these patches applied (for Dumbo):

HADOOP-1722-v0.20.1
HADOOP-5450
MAPREDUCE-764
HADOOP-5528

Now I'm wondering if something about my job is the culprit. I have 2
nodes, both 8 core machines.
mapred.tasktracker.map|reduce.tasks.maximum are both set to 7.

The job I'm running is combining lots of gzipped Apache log files into
sequence files for later analysis... I'm going from one file per
virtual host per server per day to one file per virtual host per day. The
last attempt had ~1400 maps/10 reduces.




Could just be file handles you are losing; have you upped the OS defaults?


Re: Logging info in hadoop

2010-04-06 Thread steven zhuang
Hi Nachiket,
I think if you output something to stderr, you should be able to
find it in the .out log.
Just make sure you are checking the right .out log file;
you can do that by checking which TaskTrackers are running your job from the
web UI.
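
(To make that concrete, a minimal sketch of the two output channels, assuming
the old mapred API and commons-logging as in Nachiket's snippet. With the stock
log4j setup, LOG.info() typically lands in the task attempt's syslog file and
System.err in its stderr file under the TaskTracker's userlogs directory. The
class name here is made up for illustration.)

import java.io.IOException;
import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.Mapper;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reporter;

public class LoggingMapper extends MapReduceBase
    implements Mapper<LongWritable, Text, LongWritable, Text> {

  private static final Log LOG = LogFactory.getLog(LoggingMapper.class);

  public void map(LongWritable offset, Text line,
      OutputCollector<LongWritable, Text> output, Reporter reporter)
      throws IOException {
    // Written via log4j; usually shows up in the task attempt's syslog log.
    LOG.info("processing record at offset " + offset);
    // Written directly; usually shows up in the task attempt's stderr log.
    System.err.println("stderr: offset " + offset);
    output.collect(offset, line);
  }
}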
On Tue, Apr 6, 2010 at 6:56 PM, Nachiket Vaidya nachik...@gmail.com wrote:

 I have the following doubts:
 1. How to print log information in Hadoop. In the documentation, I have
 read
 that hadoop-username-processname-machinename.log contains logs. I
 have
 used
 Log log = LogFactory.getLog(FBEMMapper.class);
 and
 log.info();

 for printing into the log, but I do not see any log information in the log file. I
 have also used System.out.println(), but these are also not getting printed
 in the .log or .out file.
 Do we need to change some log level in hadoop?
 Do we need to enable logging for some class?
 Which log4j.properties file do we need to change?

 Firstly, am I doing right things for logging?

 Actually the problem is I have written my custom FileInputFormat
 and WritableComparable for my purpose. My program runs fine, but I do not
 see any output. That is why I need to print some log statement to debug the
 problem.

 Thank you.


 - Nachiket



Hadoop, C API, and fork

2010-04-06 Thread Patrick Donnelly
Hi,

I have a distributed file server front end to Hadoop that uses the
libhdfs C API to talk to Hadoop. Normally the file server will fork on
a new client connection but this does not work with the libhdfs shared
library (it is loaded using dlopen). If the server is in single
process mode (no forking and can handle only one client at a time)
then everything works fine.

I have tried changing it so the server disconnects the Hadoop
connection before forking and having both processes re-connect post
fork. Essentially in the server:

hdfsDisconnect(...);
pid = fork();
hdfsConnect(...);
if (pid == 0)
  ...
else
  ...

This causes a hang in the child process on Connect with the following backtrace:

(gdb) bt
#0  0x0034d160ad09 in pthread_cond_wait@@GLIBC_2.3.2 ()
   from /lib64/libpthread.so.0
#1  0x2ace492559f7 in os::PlatformEvent::park ()
   from 
/afs/nd.edu/user37/ccl/software/external/java/jdk/jre/lib/amd64/server/libjvm.so
#2  0x2ace4930a5da in ObjectMonitor::wait ()
   from 
/afs/nd.edu/user37/ccl/software/external/java/jdk/jre/lib/amd64/server/libjvm.so
#3  0x2ace49307b13 in ObjectSynchronizer::wait ()
   from 
/afs/nd.edu/user37/ccl/software/external/java/jdk/jre/lib/amd64/server/libjvm.so
#4  0x2ace490cf5fb in JVM_MonitorWait ()
   from 
/afs/nd.edu/user37/ccl/software/external/java/jdk/jre/lib/amd64/server/libjvm.so
#5  0x2ace49c87f50 in ?? ()
#6  0x0001 in ?? ()
#7  0x2ace4cd84d10 in ?? ()
#8  0x3f80 in ?? ()
#9  0x2ace49c8841d in ?? ()
#10 0x7fff0b4d04c0 in ?? ()
#11 0x in ?? ()

Leaving the connection open in the server:

pid = fork();
if (pid == 0)
  ...
else
  ...

Also produces a hang in the child:

(gdb) bt
#0  0x0034d160ad09 in pthread_cond_wait@@GLIBC_2.3.2 ()
   from /lib64/libpthread.so.0
#1  0x2b3d7193d9f7 in os::PlatformEvent::park ()
   from 
/afs/nd.edu/user37/ccl/software/external/java/jdk/jre/lib/amd64/server/libjvm.so
#2  0x2b3d719f25da in ObjectMonitor::wait ()
   from 
/afs/nd.edu/user37/ccl/software/external/java/jdk/jre/lib/amd64/server/libjvm.so
#3  0x2b3d719efb13 in ObjectSynchronizer::wait ()
   from 
/afs/nd.edu/user37/ccl/software/external/java/jdk/jre/lib/amd64/server/libjvm.so
#4  0x2b3d717b75fb in JVM_MonitorWait ()
   from 
/afs/nd.edu/user37/ccl/software/external/java/jdk/jre/lib/amd64/server/libjvm.so
#5  0x2b3d7236ff50 in ?? ()
#6  0x in ?? ()


Does anyone have a suggestion on debugging/fixing this?

Thanks for any help,

-- 
- Patrick Donnelly


Re: Reducer-side join example

2010-04-06 Thread M B
Thanks, I appreciate the example - what happens if File A and B have many
more columns (all different data types)?  The logic doesn't seem to work in
that case - unless we set up the values in the Map function to include the
file name (maybe the output value is a HashMap or something, which might
work).

Also, I was asking to see a reduce-side join as we have other things going
on in the Mapper and I'm not sure if we can tweak its output (we send
output to multiple places).  Does anyone have an example using the
contrib/DataJoin or something similar?

thanks
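
(One way to do the tagging mentioned above, sketched only and assuming the old
mapred API: read the input file name off the job configuration in configure()
and prefix each emitted value with a side tag. The class name and the "fileA"
naming convention are made up for illustration; this is not the contrib/DataJoin
code.)

import java.io.IOException;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.Mapper;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reporter;

public class TaggingJoinMapper extends MapReduceBase
    implements Mapper<LongWritable, Text, Text, Text> {

  private String tag;  // "A" or "B", derived from the input file name

  @Override
  public void configure(JobConf job) {
    // map.input.file holds the path of the file this map task is reading.
    Path input = new Path(job.get("map.input.file"));
    tag = input.getName().startsWith("fileA") ? "A" : "B";
  }

  public void map(LongWritable offset, Text line,
      OutputCollector<Text, Text> out, Reporter reporter) throws IOException {
    String[] cols = line.toString().split("\t", 2);
    String rest = cols.length > 1 ? cols[1] : "";
    // Key on the join column (Field1); tag the rest of the row with its side
    // so the reducer can tell A rows from B rows.
    out.collect(new Text(cols[0]), new Text(tag + "\t" + rest));
  }
}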

On Mon, Apr 5, 2010 at 7:03 PM, He Chen airb...@gmail.com wrote:

 For the Map function:
 Input key: default
 input value: File A and File B lines

 output key: A, B, C,(first colomn of the final result)
 output value: 12, 24, Car, 13, Van, SUV...

 Reduce function:
 take the Map output and do:
 for each key
 {   if the value of a key is integer
   then save it to array1;
   else save it to array2
 }
 for ith element in array1
  for jth element in array2
   output(key, array1[i]+\t+array2[j]);
 done

 Hope this helps.


 On Mon, Apr 5, 2010 at 4:10 PM, M B machac...@gmail.com wrote:

  Hi, I need a good java example to get me started with some joining we
 need
  to do, any examples would be appreciated.
 
  File A:
  Field1  Field2
  A   12
  B   13
  C   22
  A   24
 
  File B:
   Field1  Field2   Field3
  A   Car   ...
  B   Truck ...
  B   SUV   ...
  B   Van   ...
 
  So, we need to first join File A and B on Field1 (say both are string
  fields).  The result would just be:
  A   12   Car   ...
  A   24   Car   ...
  B   13   Truck   ...
  B   13   SUV   ...
   B   13   Van   ...
  and so on - with all the fields from both files returning.
 
  Once we have that, we sometimes need to then transform it so we have a
  single record per key (Field1):
  A (12,Car) (24,Car)
  B (13,Truck) (13,SUV) (13,Van)
  --however it looks, basically tuples for each key (we'll modify this
 later
  to return a conatenated set of fields from B, etc)
 
  At other times, instead of transforming to a single row, we just need to
  modify rows based on values.  So if B.Field2 equals Van, we need to set
  Output.Field2 = whatever then output to file ...
 
  Are there any good examples of this in native java (we can't use
  pig/hive/etc)?
 
  thanks.
 



 --
 Best Wishes!


 --
 Chen He
  PhD. student of CSE Dept.
 Holland Computing Center
 University of Nebraska-Lincoln
 Lincoln NE 68588



Re: Hadoop, C API, and fork

2010-04-06 Thread Brian Bockelman
Hey Patrick,

Using fork() for a multi-threaded process (which anything that uses libhdfs is) 
is pretty shaky.  You might want to start off by reading the multi-threaded 
notes from the POSIX standard:

http://www.opengroup.org/onlinepubs/95399/functions/fork.html

You might have better luck playing around with pthread_atfork, or thinking 
about other possible designs :)

If you really, really want to do this, you can also try playing around with the 
internals of libhdfs.  Basically, use native JNI calls to shut down the JVM 
after you disconnect, then fork, then re-initialize everything.  No idea if 
this would work.

Brian

On Apr 6, 2010, at 9:51 AM, Patrick Donnelly wrote:

 Hi,
 
 I have a distributed file server front end to Hadoop that uses the
 libhdfs C API to talk to Hadoop. Normally the file server will fork on
 a new client connection but this does not work with the libhdfs shared
 library (it is loaded using dlopen). If the server is in single
 process mode (no forking and can handle only one client at a time)
 then everything works fine.
 
 I have tried changing it so the server disconnects the Hadoop
 connection before forking and having both processes re-connect post
 fork. Essentially in the server:
 
 hdfsDisconnect(...);
 pid = fork();
 hdfsConnect(...);
 if (pid == 0)
  ...
 else
  ...
 
 This causes a hang in the child process on Connect with the following 
 backtrace:
 
 (gdb) bt
 #0  0x0034d160ad09 in pthread_cond_wait@@GLIBC_2.3.2 ()
   from /lib64/libpthread.so.0
 #1  0x2ace492559f7 in os::PlatformEvent::park ()
   from 
 /afs/nd.edu/user37/ccl/software/external/java/jdk/jre/lib/amd64/server/libjvm.so
 #2  0x2ace4930a5da in ObjectMonitor::wait ()
   from 
 /afs/nd.edu/user37/ccl/software/external/java/jdk/jre/lib/amd64/server/libjvm.so
 #3  0x2ace49307b13 in ObjectSynchronizer::wait ()
   from 
 /afs/nd.edu/user37/ccl/software/external/java/jdk/jre/lib/amd64/server/libjvm.so
 #4  0x2ace490cf5fb in JVM_MonitorWait ()
   from 
 /afs/nd.edu/user37/ccl/software/external/java/jdk/jre/lib/amd64/server/libjvm.so
 #5  0x2ace49c87f50 in ?? ()
 #6  0x0001 in ?? ()
 #7  0x2ace4cd84d10 in ?? ()
 #8  0x3f80 in ?? ()
 #9  0x2ace49c8841d in ?? ()
 #10 0x7fff0b4d04c0 in ?? ()
 #11 0x in ?? ()
 
 Leaving the connection open in the server:
 
 pid = fork();
 if (pid == 0)
  ...
 else
  ...
 
 Also produces a hang in the child:
 
 (gdb) bt
 #0  0x0034d160ad09 in pthread_cond_wait@@GLIBC_2.3.2 ()
   from /lib64/libpthread.so.0
 #1  0x2b3d7193d9f7 in os::PlatformEvent::park ()
   from 
 /afs/nd.edu/user37/ccl/software/external/java/jdk/jre/lib/amd64/server/libjvm.so
 #2  0x2b3d719f25da in ObjectMonitor::wait ()
   from 
 /afs/nd.edu/user37/ccl/software/external/java/jdk/jre/lib/amd64/server/libjvm.so
 #3  0x2b3d719efb13 in ObjectSynchronizer::wait ()
   from 
 /afs/nd.edu/user37/ccl/software/external/java/jdk/jre/lib/amd64/server/libjvm.so
 #4  0x2b3d717b75fb in JVM_MonitorWait ()
   from 
 /afs/nd.edu/user37/ccl/software/external/java/jdk/jre/lib/amd64/server/libjvm.so
 #5  0x2b3d7236ff50 in ?? ()
 #6  0x in ?? ()
 
 
 Does anyone have a suggestion on debugging/fixing this?
 
 Thanks for any help,
 
 -- 
 - Patrick Donnelly





Re: What means log DIR* NameSystem.completeFile: failed to complete... ?

2010-04-06 Thread Todd Lipcon
Hi Al,

Usually this indicates that the file was renamed or deleted while it was
still being created by the client. Unfortunately it's not the most
descriptive :)

-Todd

On Tue, Apr 6, 2010 at 5:36 AM, Al Lias al.l...@gmx.de wrote:

 Hi all,

 this warning is written in FSNamesystem.java / completeFileInternal(). It
  makes the calling code in NameNode.java throw an IOException.

 FSNamesystem.java
 ...
 if (fileBlocks == null ) {
   NameNode.stateChangeLog.warn("DIR* NameSystem.completeFile: "
                                + "failed to complete " + src
                                + " because dir.getFileBlocks() is null " +
                                " and pendingFile is " +
                                ((pendingFile == null) ? "null" :
                                  ("from " + pendingFile.getClientMachine()))
                                );
 ...

 What is the meaning of this warning? Any Idea what could have gone wrong
 in such a case?

 (This popped up through hbase, but as this code is in HDFS, I am asking
 this list)

 Thx

Al




-- 
Todd Lipcon
Software Engineer, Cloudera


Re: losing network interfaces during long running map-reduce jobs

2010-04-06 Thread David Howell
 Could just be file handles you are losing; have you upped the OS defaults?


I have not, and that does seem like a likely culprit. Although, it's a
bit alarming that asking for one socket too many could take down the
networking stack...


Re: Errors reading lzo-compressed files from Hadoop

2010-04-06 Thread Alex Roetter

Todd Lipcon t...@... writes:

 
 Hey Dmitriy,
 
 This is very interesting (and worrisome in a way!) I'll try to take a look
 this afternoon.
 
 -Todd
 

Hi Todd,

I wanted to see if you made any progress on this front. I'm seeing a very
similar error, trying to run a MR (Hadoop 0.20.1) over a bunch of
LZOP compressed / indexed files (using Kevin Weil's package), and I have one
map task that always fails in what looks like the same place as described in 
the previous post. I haven't yet done the experimentation mentioned above 
(isolating the input file corresponding to the failed map task, decompressing
it / recompressing it, testing it out operating directly on local disk
instead of HDFS, etc).

However, since I am crashing in exactly the same place it seems likely this
is related, and thought I'd check on your work in the meantime.

FYI, my stack trace is below:

2010-04-05 18:15:16,895 FATAL org.apache.hadoop.mapred.TaskTracker: Error
running child : java.lang.InternalError: lzo1x_decompress_safe returned:
at com.hadoop.compression.lzo.LzoDecompressor.decompressBytesDirect
(Native Method)
at com.hadoop.compression.lzo.LzoDecompressor.decompress
(LzoDecompressor.java:303)
at
com.hadoop.compression.lzo.LzopDecompressor.decompress
(LzopDecompressor.java:104)
at com.hadoop.compression.lzo.LzopInputStream.decompress
(LzopInputStream.java:223)
at
org.apache.hadoop.io.compress.DecompressorStream.read
(DecompressorStream.java:74)
at java.io.InputStream.read(InputStream.java:85)
at org.apache.hadoop.util.LineReader.readLine(LineReader.java:134)
at org.apache.hadoop.util.LineReader.readLine(LineReader.java:187)
at
com.hadoop.mapreduce.LzoLineRecordReader.nextKeyValue
(LzoLineRecordReader.java:126)
at
org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue
(MapTask.java:423)
at 
org.apache.hadoop.mapreduce.MapContext.nextKeyValue(MapContext.java:67)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:143)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:583)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
at org.apache.hadoop.mapred.Child.main(Child.java:170)


Any update much appreciated,
Alex







Jetty can't start the SelectChannelConnector

2010-04-06 Thread Edson Ramiro
Hi all,

I configured Hadoop on a cluster and the NameNode and JobTracker are
running OK, but the DataNode and TaskTracker don't start; they stop and
keep waiting when they are about to start Jetty.

I observed that Jetty can't start the _SelectChannelConnector_

Is there any Jetty configuration that should be changed ?

There is no log message in the NN and JT when I try to start the DN and TT.

The kernel I'm using is:
Linux bl05 2.6.32.10 #2 SMP Tue Apr 6 12:33:42 BRT 2010 x86_64 GNU/Linux

This is the message when I start the DN. It happens with TT too.

ram...@bl05:~/hadoop-0.20.1+169.56$ ./bin/hadoop datanode
10/04/06 16:24:14 INFO datanode.DataNode: STARTUP_MSG:
/
STARTUP_MSG: Starting DataNode
STARTUP_MSG:   host = bl05.ctinfra.ufpr.br/192.168.1.115
STARTUP_MSG:   args = []
STARTUP_MSG:   version = 0.20.1+169.56
STARTUP_MSG:   build =  -r 8e662cb065be1c4bc61c55e6bff161e09c1d36f3;
compiled by 'chad' on Tue Feb  2 13:27:17 PST 2010
/
10/04/06 16:24:14 INFO datanode.DataNode: Registered FSDatasetStatusMBean
10/04/06 16:24:14 INFO datanode.DataNode: Opened info server at 50010
10/04/06 16:24:14 INFO datanode.DataNode: Balancing bandwith is 1048576
bytes/s
10/04/06 16:24:14 INFO mortbay.log: Logging to
org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via
org.mortbay.log.Slf4jLog
10/04/06 16:24:14 INFO http.HttpServer: Port returned by
webServer.getConnectors()[0].getLocalPort() before open() is -1. Opening the
listener on 50075
10/04/06 16:24:14 INFO http.HttpServer: listener.getLocalPort() returned
50075 webServer.getConnectors()[0].getLocalPort() returned 50075
10/04/06 16:24:14 INFO http.HttpServer: Jetty bound to port 50075
10/04/06 16:24:14 INFO mortbay.log: jetty-6.1.14

Thanks in Advance,

Edson Ramiro


Re: Jetty can't start the SelectChannelConnector

2010-04-06 Thread Edson Ramiro
Hi Todd,

I'm getting this behavior in another cluster too; the same thing happens there.
As I don't have jstack installed in the first cluster and I'm not the admin,
I'm sending the results from the second cluster.

These are the results:

[erl...@cohiba ~ ]$ jstack  -l 22510
22510: well-known file is not secure
[erl...@cohiba ~ ]$ jstack -l 3836
3836: well-known file is not secure

The jstack -F result is in thread_dump files and the jstack -m result is in
java_native_frames.

The files ending with nn are the namenode results and the files ending with
dn are the datanode results.

Thanks,

Edson Ramiro


On 6 April 2010 18:19, Todd Lipcon t...@cloudera.com wrote:

 Hi Edson,

 Can you please run jstack on the daemons in question and paste the
 output here?

 -Todd

 On Tue, Apr 6, 2010 at 12:44 PM, Edson Ramiro erlfi...@gmail.com wrote:
  Hi all,
 
  I configured the Hadoop in a cluster and the NameNode and JobTracker are
  running ok, but the DataNode and TaskTracker Doesn't start, they stop and
  keep waiting
  when they are going to start the Jetty
 
  I observed that Jetty can't start the _SelectChannelConnector_
 
  Is there any Jetty configuration that should be changed ?
 
  There is no log message in the NN and JT when I try to start the DN and
 TT.
 
  The kernel I'm using is:
  Linux bl05 2.6.32.10 #2 SMP Tue Apr 6 12:33:42 BRT 2010 x86_64 GNU/Linux
 
  This is the message when I start the DN. It happens with TT too.
 
  ram...@bl05:~/hadoop-0.20.1+169.56$ ./bin/hadoop datanode
  10/04/06 16:24:14 INFO datanode.DataNode: STARTUP_MSG:
  /
  STARTUP_MSG: Starting DataNode
  STARTUP_MSG:   host = bl05.ctinfra.ufpr.br/192.168.1.115
  STARTUP_MSG:   args = []
  STARTUP_MSG:   version = 0.20.1+169.56
  STARTUP_MSG:   build =  -r 8e662cb065be1c4bc61c55e6bff161e09c1d36f3;
  compiled by 'chad' on Tue Feb  2 13:27:17 PST 2010
  /
  10/04/06 16:24:14 INFO datanode.DataNode: Registered FSDatasetStatusMBean
  10/04/06 16:24:14 INFO datanode.DataNode: Opened info server at 50010
  10/04/06 16:24:14 INFO datanode.DataNode: Balancing bandwith is 1048576
  bytes/s
  10/04/06 16:24:14 INFO mortbay.log: Logging to
  org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via
  org.mortbay.log.Slf4jLog
  10/04/06 16:24:14 INFO http.HttpServer: Port returned by
  webServer.getConnectors()[0].getLocalPort() before open() is -1. Opening
 the
  listener on 50075
  10/04/06 16:24:14 INFO http.HttpServer: listener.getLocalPort() returned
  50075 webServer.getConnectors()[0].getLocalPort() returned 50075
  10/04/06 16:24:14 INFO http.HttpServer: Jetty bound to port 50075
  10/04/06 16:24:14 INFO mortbay.log: jetty-6.1.14
 
  Thanks in Advance,
 
  Edson Ramiro
 



 --
 Todd Lipcon
 Software Engineer, Cloudera



Re: Jetty can't start the SelectChannelConnector

2010-04-06 Thread Todd Lipcon
Hi Edson,

Your attachments did not come through - can you put them on pastebin?

-Todd

On Tue, Apr 6, 2010 at 3:37 PM, Edson Ramiro erlfi...@gmail.com wrote:
 Hi Todd,

 I'm getting this behavior in another cluster too, there the same thing
 happens.
 and as I don't have a jstack installed in the first cluster and
 I'm not the admin, I'm sending the results of the second cluster.

 These are the results:

 [erl...@cohiba ~ ]$ jstack  -l 22510
 22510: well-known file is not secure
 [erl...@cohiba ~ ]$ jstack -l 3836
 3836: well-known file is not secure

 The jstack -F result is in thread_dump files and the jstack -m result is in
 java_native_frames.

 The files ending with nn are the namenode results and the files ending with
 dn are the datanode results.

 Thanks,

 Edson Ramiro


 On 6 April 2010 18:19, Todd Lipcon t...@cloudera.com wrote:

 Hi Edson,

 Can you please run jstack on the daemons in question and paste the
 output here?

 -Todd

 On Tue, Apr 6, 2010 at 12:44 PM, Edson Ramiro erlfi...@gmail.com wrote:
  Hi all,
 
  I configured the Hadoop in a cluster and the NameNode and JobTracker are
  running ok, but the DataNode and TaskTracker Doesn't start, they stop
  and
  keep waiting
  when they are going to start the Jetty
 
  I observed that Jetty can't start the _SelectChannelConnector_
 
  Is there any Jetty configuration that should be changed ?
 
  There is no log message in the NN and JT when I try to start the DN and
  TT.
 
  The kernel I'm using is:
  Linux bl05 2.6.32.10 #2 SMP Tue Apr 6 12:33:42 BRT 2010 x86_64 GNU/Linux
 
  This is the message when I start the DN. It happens with TT too.
 
  ram...@bl05:~/hadoop-0.20.1+169.56$ ./bin/hadoop datanode
  10/04/06 16:24:14 INFO datanode.DataNode: STARTUP_MSG:
  /
  STARTUP_MSG: Starting DataNode
  STARTUP_MSG:   host = bl05.ctinfra.ufpr.br/192.168.1.115
  STARTUP_MSG:   args = []
  STARTUP_MSG:   version = 0.20.1+169.56
  STARTUP_MSG:   build =  -r 8e662cb065be1c4bc61c55e6bff161e09c1d36f3;
  compiled by 'chad' on Tue Feb  2 13:27:17 PST 2010
  /
  10/04/06 16:24:14 INFO datanode.DataNode: Registered
  FSDatasetStatusMBean
  10/04/06 16:24:14 INFO datanode.DataNode: Opened info server at 50010
  10/04/06 16:24:14 INFO datanode.DataNode: Balancing bandwith is 1048576
  bytes/s
  10/04/06 16:24:14 INFO mortbay.log: Logging to
  org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via
  org.mortbay.log.Slf4jLog
  10/04/06 16:24:14 INFO http.HttpServer: Port returned by
  webServer.getConnectors()[0].getLocalPort() before open() is -1. Opening
  the
  listener on 50075
  10/04/06 16:24:14 INFO http.HttpServer: listener.getLocalPort() returned
  50075 webServer.getConnectors()[0].getLocalPort() returned 50075
  10/04/06 16:24:14 INFO http.HttpServer: Jetty bound to port 50075
  10/04/06 16:24:14 INFO mortbay.log: jetty-6.1.14
 
  Thanks in Advance,
 
  Edson Ramiro
 



 --
 Todd Lipcon
 Software Engineer, Cloudera





-- 
Todd Lipcon
Software Engineer, Cloudera


Re: Jetty can't start the SelectChannelConnector

2010-04-06 Thread Edson Ramiro
Thanks Todd,

but could you please explain why entropy is important ?


Edson Ramiro


On 6 April 2010 20:09, Todd Lipcon t...@cloudera.com wrote:

 Not enough entropy on your system - you need to generate entropy or
 fake some using this technique:


 http://www.chrissearle.org/blog/technical/increase_entropy_26_kernel_linux_box

 -Todd


 On Tue, Apr 6, 2010 at 4:05 PM, Edson Ramiro erlfi...@gmail.com wrote:
  ok,
 
 
  [erl...@cohiba ~ ]$ cat java_native_frames_dn
  Attaching to process ID 3836, please wait...
  Debugger attached successfully.
  Server compiler detected.
  JVM version is 11.2-b01
  Deadlock Detection:
 
  No deadlocks found.
 
  - 3864 -
  0xf775a4f1  __libc_read + 0x41
  0xf702298c  readBytes + 0xdc
  0xf701e717  Java_java_io_FileInputStream_readBytes + 0x47
  0xf400b4aa* java.io.FileInputStream.readBytes(byte[], int, int) bci:0
  (Interpreted frame)
  0xf4003f69* java.io.FileInputStream.read(byte[], int, int) bci:4
  line:199 (Interpreted frame)
  0xf4003f69* java.io.BufferedInputStream.read1(byte[], int, int)
 bci:39
  line:256 (Interpreted frame)
  0xf4003f69* java.io.BufferedInputStream.read(byte[], int, int) bci:49
  line:317 (Interpreted frame)
  0xf4003f69* java.io.BufferedInputStream.fill() bci:175 line:218
  (Interpreted frame)
  0xf400408d* java.io.BufferedInputStream.read1(byte[], int, int)
 bci:44
  line:258 (Interpreted frame)
  0xf4003f69* java.io.BufferedInputStream.read(byte[], int, int) bci:49
  line:317 (Interpreted frame)
  0xf4003f69*
  sun.security.provider.SeedGenerator$URLSeedGenerator.getSeedByte() bci:12
  line:453 (Interpreted frame)
  0xf4003e61* sun.security.provider.SeedGenerator.getSeedBytes(byte[])
  bci:11 line:123 (Interpreted frame)
  0xf400408d* sun.security.provider.SeedGenerator.generateSeed(byte[])
  bci:4 line:118 (Interpreted frame)
  0xf400408d*
 sun.security.provider.SecureRandom.engineGenerateSeed(int)
  bci:5 line:114 (Interpreted frame)
  0xf4003f27*
 sun.security.provider.SecureRandom.engineNextBytes(byte[])
  bci:40 line:171 (Interpreted frame)
  0xf400408d* java.security.SecureRandom.nextBytes(byte[]) bci:5
 line:433
  (Interpreted frame)
  0xf400408d* java.security.SecureRandom.next(int) bci:17 line:455
  (Interpreted frame)
  0xf4003f69* java.util.Random.nextLong() bci:3 line:284 (Interpreted
  frame)
  0xf4003fab* org.mortbay.jetty.servlet.HashSessionIdManager.doStart()
  bci:73 line:139 (Interpreted frame)
  0xf400408d* org.mortbay.component.AbstractLifeCycle.start() bci:31
  line:50 (Interpreted frame)
  0xf4004569*
 org.mortbay.jetty.servlet.AbstractSessionManager.doStart()
  bci:96 line:168 (Interpreted frame)
  0xf400408d* org.mortbay.jetty.servlet.HashSessionManager.doStart()
  bci:12 line:67 (Interpreted frame)
  0xf400408d* org.mortbay.component.AbstractLifeCycle.start() bci:31
  line:50 (Interpreted frame)
  0xf4004569* org.mortbay.jetty.servlet.SessionHandler.doStart() bci:4
  line:115 (Interpreted frame)
  0xf400408d* org.mortbay.component.AbstractLifeCycle.start() bci:31
  line:50 (Interpreted frame)
  0xf4004569* org.mortbay.jetty.handler.HandlerWrapper.doStart() bci:11
  line:130 (Interpreted frame)
  0xf400408d* org.mortbay.jetty.handler.ContextHandler.startContext()
  bci:1 line:537 (Interpreted frame)
  0xf400408d* org.mortbay.jetty.servlet.Context.startContext() bci:1
  line:136 (Interpreted frame)
  0xf400408d* org.mortbay.jetty.webapp.WebAppContext.startContext()
  bci:123 line:1234 (Interpreted frame)
  0xf400408d* org.mortbay.jetty.handler.ContextHandler.doStart()
 bci:140
  line:517 (Interpreted frame)
  0xf400408d* org.mortbay.jetty.webapp.WebAppContext.doStart() bci:170
  line:460 (Interpreted frame)
  0xf400408d* org.mortbay.component.AbstractLifeCycle.start() bci:31
  line:50 (Interpreted frame)
  0xf4004569* org.mortbay.jetty.handler.HandlerCollection.doStart()
 bci:32
  line:152 (Interpreted frame)
  0xf400408d*
 org.mortbay.jetty.handler.ContextHandlerCollection.doStart()
  bci:5 line:156 (Interpreted frame)
  0xf400408d* org.mortbay.component.AbstractLifeCycle.start() bci:31
  line:50 (Interpreted frame)
  0xf4004569* org.mortbay.jetty.handler.HandlerWrapper.doStart() bci:11
  line:130 (Interpreted frame)
  0xf400408d* org.mortbay.jetty.Server.doStart() bci:201 line:222
  (Interpreted frame)
  0xf400408d* org.mortbay.component.AbstractLifeCycle.start() bci:31
  line:50 (Interpreted frame)
  0xf400408d* org.apache.hadoop.http.HttpServer.start() bci:383
 line:461
  (Interpreted frame)
  0xf400408d*
 
 org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(org.apache.hadoop.conf.Configuration,
  java.util.AbstractList) bci:916 line:375 (Interpreted frame)
  0xf400408d*
 
 org.apache.hadoop.hdfs.server.datanode.DataNode.init(org.apache.hadoop.conf.Configuration,
  java.util.AbstractList) bci:158 line:216 

Re: Reducer-side join example

2010-04-06 Thread Ed Kohlwey
Hi,
Your question has an academic sound, so I'll give it an academic answer ;).
Unfortunately, there are not really any good generalized (i.e., cross-joining a
large matrix with a large matrix) methods for doing joins in map-reduce. The
fundamental reason for this is that in the general case you're comparing
everything to everything, and so for each pair of possible rows, you must
actually generate each pair of rows. This means every node ships all its
data to every other node, no matter what (in the general case). I bring this
up not because you're looking to optimize cross joining, but because it
demonstrates the point that you will exploit the characteristics of your
data no matter what strategy you choose, and each will have domain-specific
flaws and advantages.

The typical strategy for a reduce side join is to use hadoop's sorting
functionality to group rows by their keys, such that the entire data set for
a particular key will be resident on a single reducer. The key insight is
that you're thinking about the join as a sorting problem. Yes, this means you
risk producing data sets that fill your reducers, but that's a trade-off that
you accept to reduce the complexity of the original problem.

If the existing join framework in hadoop (whose javadocs are quite thorough)
is inadequate, you shouldn't be afraid to invent, implement, and test join
strategies that are specific to your domain.
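
(To illustrate the sort-and-group strategy with the two files from earlier in
this thread: if the map side tags each value with its source file, as in the
mapper sketch added earlier in the thread, a reducer only has to separate the
tagged values and emit their cross product per key. A sketch under those
assumptions, old mapred API, made-up class name; it buffers both sides in
memory for simplicity, so it assumes the per-key data fits in the reducer's heap.)

import java.io.IOException;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reducer;
import org.apache.hadoop.mapred.Reporter;

public class TaggedJoinReducer extends MapReduceBase
    implements Reducer<Text, Text, Text, Text> {

  public void reduce(Text key, Iterator<Text> values,
      OutputCollector<Text, Text> out, Reporter reporter) throws IOException {
    List<String> aRows = new ArrayList<String>();
    List<String> bRows = new ArrayList<String>();
    // Split the incoming values by their "A\t" / "B\t" tag.
    while (values.hasNext()) {
      String v = values.next().toString();
      if (v.startsWith("A\t")) {
        aRows.add(v.substring(2));
      } else {
        bRows.add(v.substring(2));
      }
    }
    // Emit one joined row per (A row, B row) pair sharing this key.
    for (String a : aRows) {
      for (String b : bRows) {
        out.collect(key, new Text(a + "\t" + b));
      }
    }
  }
}

(With a secondary sort on the tag you could ensure one side arrives first and
stream the other, avoiding the in-memory buffering.)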


On Tue, Apr 6, 2010 at 11:01 AM, M B machac...@gmail.com wrote:

 Thanks, I appreciate the example - what happens if File A and B have many
 more columns (all different data types)?  The logic doesn't seem to work in
 that case - unless we set up the values in the Map function to include the
 file name (maybe the output value is a HashMap or something, which might
 work).

 Also, I was asking to see a reduce-side join as we have other things going
 on in the Mapper and I'm not sure if we can tweak it's output (we send
 output to multiple places).  Does anyone have an example using the
 contrib/DataJoin or something similar?

 thanks

 On Mon, Apr 5, 2010 at 7:03 PM, He Chen airb...@gmail.com wrote:

  For the Map function:
  Input key: default
  input value: File A and File B lines
 
  output key: A, B, C,(first colomn of the final result)
  output value: 12, 24, Car, 13, Van, SUV...
 
  Reduce function:
  take the Map output and do:
  for each key
  {   if the value of a key is integer
  then save it to array1;
else save it to array2
  }
  for ith element in array1
   for jth element in array2
output(key, array1[i]+\t+array2[j]);
  done
 
  Hope this helps.
 
 
  On Mon, Apr 5, 2010 at 4:10 PM, M B machac...@gmail.com wrote:
 
   Hi, I need a good java example to get me started with some joining we
  need
   to do, any examples would be appreciated.
  
   File A:
   Field1  Field2
   A   12
   B   13
   C   22
   A   24
  
   File B:
Field1  Field2   Field3
   A   Car   ...
   B   Truck ...
   B   SUV   ...
   B   Van   ...
  
   So, we need to first join File A and B on Field1 (say both are string
   fields).  The result would just be:
   A   12   Car   ...
   A   24   Car   ...
   B   13   Truck   ...
   B   13   SUV   ...
B   13   Van   ...
   and so on - with all the fields from both files returning.
  
   Once we have that, we sometimes need to then transform it so we have a
   single record per key (Field1):
   A (12,Car) (24,Car)
   B (13,Truck) (13,SUV) (13,Van)
   --however it looks, basically tuples for each key (we'll modify this
  later
   to return a conatenated set of fields from B, etc)
  
   At other times, instead of transforming to a single row, we just need
 to
   modify rows based on values.  So if B.Field2 equals Van, we need to
 set
   Output.Field2 = whatever then output to file ...
  
   Are there any good examples of this in native java (we can't use
   pig/hive/etc)?
  
   thanks.
  
 
 
 
  --
  Best Wishes!
 
 
  --
  Chen He
   PhD. student of CSE Dept.
  Holland Computing Center
  University of Nebraska-Lincoln
  Lincoln NE 68588
 



Cluster in Safe Mode

2010-04-06 Thread Manish N
Hey all,

I've a 2-node cluster which is now running in Safe Mode. It's been 15-16 hrs
now and it's yet to come out of Safe Mode. Does it normally take that long?

The DataNode logs on the node running the NameNode show the following, with
similar output on the slave node (running only a DataNode) as well.

2010-04-07 10:03:10,687 INFO org.apache.hadoop.dfs.DataBlockScanner:
Verification succeeded for blk_-310922324774702076_996024
2010-04-07 10:03:10,705 INFO org.apache.hadoop.dfs.DataBlockScanner:
Verification succeeded for blk_3302288729849061244_813694
2010-04-07 10:03:10,730 INFO org.apache.hadoop.dfs.DataBlockScanner:
Verification succeeded for blk_-7252548330326272479_1259723
2010-04-07 10:03:10,745 INFO org.apache.hadoop.dfs.DataBlockScanner:
Verification succeeded for blk_-5909954202848831867_1075933
2010-04-07 10:03:10,886 INFO org.apache.hadoop.dfs.DataBlockScanner:
Verification succeeded for blk_-3213723859645738103_1075939
2010-04-07 10:03:10,910 INFO org.apache.hadoop.dfs.DataBlockScanner:
Verification succeeded for blk_-2209269106581706132_676390
2010-04-07 10:03:10,923 INFO org.apache.hadoop.dfs.DataBlockScanner:
Verification succeeded for blk_-6007998488187910667_676379
2010-04-07 10:03:11,086 INFO org.apache.hadoop.dfs.DataBlockScanner:
Verification succeeded for blk_-1024215056075897357_676383
2010-04-07 10:03:11,127 INFO org.apache.hadoop.dfs.DataBlockScanner:
Verification succeeded for blk_3780597313184168671_1270304
2010-04-07 10:03:11,160 INFO org.apache.hadoop.dfs.DataBlockScanner:
Verification succeeded for blk_8891623760013835158_676336

One thing I wanted to point out is that some time back I had to do a setrep on
the entire cluster; are these verification messages related to that?

Also, while going through the NameNode logs I encountered the following.

2010-04-05 21:01:31,383 INFO org.apache.hadoop.dfs.StateChange: BLOCK*
NameSystem.heartbeatCheck: lost heartbeat from 192.168.100.21:50010
2010-04-05 21:01:49,240 INFO org.apache.hadoop.net.NetworkTopology: Removing
a node: /default-rack/192.168.100.21:50010
2010-04-05 21:01:49,243 INFO org.apache.hadoop.dfs.StateChange: BLOCK*
NameSystem.heartbeatCheck: lost heartbeat from 192.168.100.2:50010
2010-04-05 21:02:01,791 INFO org.apache.hadoop.net.NetworkTopology: Removing
a node: /default-rack/192.168.100.2:50010

then again @

2010-04-06 06:41:56,290 INFO org.apache.hadoop.dfs.StateChange: BLOCK*
NameSystem.heartbeatCheck: lost heartbeat from 192.168.100.21:50010
2010-04-06 06:41:56,290 INFO org.apache.hadoop.net.NetworkTopology: Removing
a node: /default-rack/192.168.100.21:50010
2010-04-06 06:41:56,290 INFO org.apache.hadoop.dfs.StateChange: BLOCK*
NameSystem.heartbeatCheck: lost heartbeat from 192.168.100.2:50010
2010-04-06 06:41:56,290 INFO org.apache.hadoop.net.NetworkTopology: Removing
a node: /default-rack/192.168.100.2:50010

I had to restart the cluster post which I got both the nodes back.

2010-04-06 10:11:24,325 INFO org.apache.hadoop.dfs.StateChange: BLOCK*
NameSystem.registerDatanode: node registration from
192.168.100.21:50010storage DS-455083797-192
.168.100.21-50010-1268220157729
2010-04-06 10:11:24,328 INFO org.apache.hadoop.net.NetworkTopology: Adding a
new node: /default-rack/192.168.100.21:50010
2010-04-06 10:11:25,245 INFO org.apache.hadoop.dfs.StateChange: BLOCK*
NameSystem.allocateBlock:
/data/listing/image/5/84025/35924c87e664a43893904effbd2be601_list.jpg.
blk_-1845977707636580795_1665561
2010-04-06 10:11:25,342 INFO org.apache.hadoop.dfs.StateChange: BLOCK*
NameSystem.addStoredBlock: blockMap updated: 192.168.100.21:50010 is added
to blk_-1845977707636580795_1665561 size 72753
2010-04-06 10:11:44,257 INFO org.apache.hadoop.fs.FSNamesystem: Number of
transactions: 64 Total time for transactions(ms): 4 Number of syncs: 45
SyncTimes(ms): 387
2010-04-06 10:11:51,485 INFO org.apache.hadoop.dfs.StateChange: BLOCK*
NameSystem.registerDatanode: node registration from
192.168.100.2:50010storage
DS-1237294752-192.168.100.2-50010-1252010614375
2010-04-06 10:11:51,488 INFO org.apache.hadoop.net.NetworkTopology: Adding a
new node: /default-rack/192.168.100.2:50010

Then again subsequently they were removed. No clue why this happened.

Ever since, I've been seeing the following in the logs:

2010-04-06 10:00:49,052 INFO org.apache.hadoop.ipc.Server: IPC Server
handler 2 on 54310, call
create(/data/listing/image/4/43734/5af88437f6c6a88d62c5f900b06ab8dd_high.jpg,
rwxr-xr-x, DFSClient_1226879860, true, 2, 67108864) from 192.168.100.5:40437:
error: org.apache.hadoop.dfs.SafeModeException: Cannot create
file/data/listing/image/4/43734/5af88437f6c6a88d62c5f900b06ab8dd_high.jpg.
Name node is in safe mode.
The ratio of reported blocks 0. has not reached the threshold 0.9990.
Safe mode will be turned off automatically.
org.apache.hadoop.dfs.SafeModeException: Cannot create
file/data/listing/image/4/43734/5af88437f6c6a88d62c5f900b06ab8dd_high.jpg.
Name node is in safe mode.
The ratio of reported blocks 0. has not reached the threshold 0.9990.
Safe 

Re: Cluster in Safe Mode

2010-04-06 Thread Ravi Phulari
Looks like all your data nodes are down. Please make sure your data nodes are
up and running (check from the NameNode web UI and with jps on the data nodes).
Fsck is showing that there are 0 minimally replicated files and the average
block replication is 0.
Also please verify whether your data nodes' data dir has any blocks.

-
Ravi


On 4/6/10 10:16 PM, Manish N m1n...@gmail.com wrote:

 CORRUPT FILES:        1601525
  MISSING BLOCKS:       1601927
  MISSING SIZE:         540525108291 B
  CORRUPT BLOCKS:       1601927

 Minimally replicated blocks:   0 (0.0 %)
 Over-replicated blocks:        0 (0.0 %)
 Under-replicated blocks:       0 (0.0 %)
 Mis-replicated blocks:         0 (0.0 %)
 Default replication factor:    2
 Average block replication:     0.0
 Corrupt blocks:                1601927

Ravi
--



RE: Cluster in Safe Mode

2010-04-06 Thread Sagar Shukla
Hi Manish,
  Do you see any errors in the DataNode log files? It is quite likely that
after the NameNode starts, the processes on the DataNodes are failing to start,
causing the NameNode to wait in safe mode for the DataNode services to come up.

Thanks,
Sagar

-Original Message-
From: Manish N [mailto:m1n...@gmail.com]
Sent: Wednesday, April 07, 2010 10:47 AM
To: common-user@hadoop.apache.org
Subject: Cluster in Safe Mode

Hey all,

I've a 2 Node cluster which is now running in Safe Mode. Its been 15-16 hrs
now  yet to come out of Safe Mode. Does it normally take that long ?

The DataNode logs on Node running NameNode indicates following  similar
output on the slave node ( running only Data Node ) as well.

2010-04-07 10:03:10,687 INFO org.apache.hadoop.dfs.DataBlockScanner:
Verification succeeded for blk_-310922324774702076_996024
2010-04-07 10:03:10,705 INFO org.apache.hadoop.dfs.DataBlockScanner:
Verification succeeded for blk_3302288729849061244_813694
2010-04-07 10:03:10,730 INFO org.apache.hadoop.dfs.DataBlockScanner:
Verification succeeded for blk_-7252548330326272479_1259723
2010-04-07 10:03:10,745 INFO org.apache.hadoop.dfs.DataBlockScanner:
Verification succeeded for blk_-5909954202848831867_1075933
2010-04-07 10:03:10,886 INFO org.apache.hadoop.dfs.DataBlockScanner:
Verification succeeded for blk_-3213723859645738103_1075939
2010-04-07 10:03:10,910 INFO org.apache.hadoop.dfs.DataBlockScanner:
Verification succeeded for blk_-2209269106581706132_676390
2010-04-07 10:03:10,923 INFO org.apache.hadoop.dfs.DataBlockScanner:
Verification succeeded for blk_-6007998488187910667_676379
2010-04-07 10:03:11,086 INFO org.apache.hadoop.dfs.DataBlockScanner:
Verification succeeded for blk_-1024215056075897357_676383
2010-04-07 10:03:11,127 INFO org.apache.hadoop.dfs.DataBlockScanner:
Verification succeeded for blk_3780597313184168671_1270304
2010-04-07 10:03:11,160 INFO org.apache.hadoop.dfs.DataBlockScanner:
Verification succeeded for blk_8891623760013835158_676336

One thing I wanted to point out is sometime back I'd to do setrep on the
entire Cluster, are these verifications messages related to that ?

Also while going through the NameNode logs i encountered following things.

2010-04-05 21:01:31,383 INFO org.apache.hadoop.dfs.StateChange: BLOCK*
NameSystem.heartbeatCheck: lost heartbeat from 192.168.100.21:50010
2010-04-05 21:01:49,240 INFO org.apache.hadoop.net.NetworkTopology: Removing
a node: /default-rack/192.168.100.21:50010
2010-04-05 21:01:49,243 INFO org.apache.hadoop.dfs.StateChange: BLOCK*
NameSystem.heartbeatCheck: lost heartbeat from 192.168.100.2:50010
2010-04-05 21:02:01,791 INFO org.apache.hadoop.net.NetworkTopology: Removing
a node: /default-rack/192.168.100.2:50010

then again @

2010-04-06 06:41:56,290 INFO org.apache.hadoop.dfs.StateChange: BLOCK*
NameSystem.heartbeatCheck: lost heartbeat from 192.168.100.21:50010
2010-04-06 06:41:56,290 INFO org.apache.hadoop.net.NetworkTopology: Removing
a node: /default-rack/192.168.100.21:50010
2010-04-06 06:41:56,290 INFO org.apache.hadoop.dfs.StateChange: BLOCK*
NameSystem.heartbeatCheck: lost heartbeat from 192.168.100.2:50010
2010-04-06 06:41:56,290 INFO org.apache.hadoop.net.NetworkTopology: Removing
a node: /default-rack/192.168.100.2:50010

I had to restart the cluster post which I got both the nodes back.

2010-04-06 10:11:24,325 INFO org.apache.hadoop.dfs.StateChange: BLOCK*
NameSystem.registerDatanode: node registration from
192.168.100.21:50010storage DS-455083797-192
.168.100.21-50010-1268220157729
2010-04-06 10:11:24,328 INFO org.apache.hadoop.net.NetworkTopology: Adding a
new node: /default-rack/192.168.100.21:50010
2010-04-06 10:11:25,245 INFO org.apache.hadoop.dfs.StateChange: BLOCK*
NameSystem.allocateBlock:
/data/listing/image/5/84025/35924c87e664a43893904effbd2be601_list.jpg.
blk_-1845977707636580795_1665561
2010-04-06 10:11:25,342 INFO org.apache.hadoop.dfs.StateChange: BLOCK*
NameSystem.addStoredBlock: blockMap updated: 192.168.100.21:50010 is added
to blk_-1845977707636580795_1665561 size 72753
2010-04-06 10:11:44,257 INFO org.apache.hadoop.fs.FSNamesystem: Number of
transactions: 64 Total time for transactions(ms): 4 Number of syncs: 45
SyncTimes(ms): 387
2010-04-06 10:11:51,485 INFO org.apache.hadoop.dfs.StateChange: BLOCK*
NameSystem.registerDatanode: node registration from
192.168.100.2:50010storage
DS-1237294752-192.168.100.2-50010-1252010614375
2010-04-06 10:11:51,488 INFO org.apache.hadoop.net.NetworkTopology: Adding a
new node: /default-rack/192.168.100.2:50010

Then again subsequently they were removed. No clue why this happened.

Ever since I'm seeing following things in logs..

2010-04-06 10:00:49,052 INFO org.apache.hadoop.ipc.Server: IPC Server
handler 2 on 54310, call
create(/data/listing/image/4/43734/5af88437f6c6a88d62c5f900b06ab8dd_high.jpg,
rwxr-xr-x, DFSClient_1226879860, true, 2, 67108864) from 192.168.100.5:40437:
error: org.apache.hadoop.dfs.SafeModeException: Cannot create