Upgrade from 0.19 to 0.20 issue

2011-08-21 Thread MilleBii
Hi

I just upgraded from 0.19 to 0.20. Everything seems fine; however, the web
monitoring tool doesn't work any more:

neither http://mydomain.com:50070/webapps/hdfs/dfshealth.jsp

nor http://mydomain.com:50070/dfshealth.jsp

Both give me a 404.

The same goes for the job tracking tool.

Any idea where to look?


-- 
-MilleBii-


Re: Upgrading namenode/secondary node hardware

2011-06-17 Thread MilleBii
I see it is not so obvious and is potentially dangerous, so I will be learning
and experimenting first.
Thx for the tip.

2011/6/17 Steve Loughran ste...@apache.org

 On 16/06/11 14:19, MilleBii wrote:

 But if my filesystem is up and running fine... do I have to worry at all, or
 will the copy (FTP transfer) of HDFS be enough?


 I'm not going to make any predictions there as to if/when things go wrong:

  -you do need to shut down the FS before the move
  -you ought to get the edit logs replayed before the move
  -you may want to try experimenting with copying the namenode data and
 bringing up the namenode (without any datanodes connected, so it comes up
 in safe mode), to make sure everything works (see the sketch below).
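 As a rough sketch of such a dry run, assuming Hadoop 0.20, a dfs.name.dir of
 /hadoop/dfs/name and a spare test box with the same Hadoop version and config
 (paths and hostnames here are assumptions, not the poster's actual setup):

   # on the current master, with the cluster cleanly stopped
   bin/stop-all.sh
   tar czf name-backup.tar.gz -C /hadoop/dfs name
   scp name-backup.tar.gz testbox:/tmp/

   # on the test box, with dfs.name.dir pointing at the copied directory
   mkdir -p /hadoop/dfs && tar xzf /tmp/name-backup.tar.gz -C /hadoop/dfs
   bin/hadoop-daemon.sh start namenode    # no datanodes, so it stays in safe mode
   tail logs/hadoop-*-namenode-*.log      # the image and edits should load cleanly
   # then browse http://testbox:50070/ and check the namespace looks right
   bin/hadoop-daemon.sh stop namenode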

 I'd also worry that if you aren't familiar with the edit log, you may need
 to spend some time learning the subtle details of namenode journalling,
 replaying, backup and restoration, and what the secondary namenode does.
 It's easy to bring up a cluster and get overconfident that it works, right
 up to the moment it stops working. Experiment with your cluster's and team's
 failure handling before you really need it.



 2011/6/16 Steve Loughranste...@apache.org

  On 15/06/11 15:54, MilleBii wrote:

  Thx.

 #1 don't understand the edit logs remark.


 well, that's something you need to work on, as it's the key to keeping your
 cluster working. The edit log is the journal of changes made to a
 namenode,
 which gets streamed to HDD and your secondary Namenode. After a NN
 restart,
 it has to replay all changes since the last checkpoint to get its
 directory
 structure up to date. Lose the edit log and you may as well reformat the
 disks.








-- 
-MilleBii-


Re: Upgrading namenode/secondary node hardware

2011-06-16 Thread MilleBii
But if my filesystem is up and running fine... do I have to worry at all, or
will the copy (FTP transfer) of HDFS be enough?



2011/6/16 Steve Loughran ste...@apache.org

 On 15/06/11 15:54, MilleBii wrote:

 Thx.

 #1 don't understand the edit logs remark.


 well, that's something you need to work on, as it's the key to keeping your
 cluster working. The edit log is the journal of changes made to a namenode,
 which gets streamed to HDD and your secondary Namenode. After a NN restart,
 it has to replay all changes since the last checkpoint to get its directory
 structure up to date. Lose the edit log and you may as well reformat the
 disks.
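 As a rough illustration of where that journal lives on disk (assuming the
 default 0.20 layout and a dfs.name.dir of /hadoop/dfs/name, which is an
 assumption):

   ls /hadoop/dfs/name/current/
   # edits    - journal of namespace changes since the last checkpoint
   # fsimage  - the last checkpointed namespace image
   # fstime   - timestamp of the last checkpoint
   # VERSION  - layout version, namespaceID, etc.

 The secondary namenode periodically merges edits into fsimage (a checkpoint),
 which is what keeps the replay time after a namenode restart bounded.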




-- 
-MilleBii-


Re: Upgrading namenode/secondary node hardware

2011-06-15 Thread MilleBii
Thx.

#1 I don't understand the edit logs remark.
#2 Good & nice.
#3 My provider will give me a server with a different IP, so I will have to
change /etc/hosts on all nodes to point to the new master (see the sketch
below). But indeed I don't need to change the masters/slaves files.
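A minimal sketch of that /etc/hosts change, assuming a master hostname of
master and a new IP of 192.0.2.10 (both hypothetical), applied on every node:

  # /etc/hosts - old master entry commented out, new IP for the same name
  # 192.0.2.1   master
  192.0.2.10    master
  192.0.2.20    slave1

After editing, restarting the daemons (stop-all.sh / start-all.sh from the
master) should pick up the new address.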

2011/6/15 Steve Loughran ste...@apache.org

 On 14/06/11 22:01, MilleBii wrote:

 I want/need to upgrade my namenode/secondary node hardware. Actually also
 acts as one of the datanodes.

 Could not find any how-to guides.
 So what is the process to switch from one hardware to the next.

 1. For HDFS data : is it just a matter of copying all the hdfs data from
 old
 server to new server.


 yes, put it in the same place on your HA storage and you may not even need
 to reconfigure it. If you didn't shut down the filesystem cleanly, you'll
 need to replay the edit logs.
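 A hedged sketch of that copy, assuming dfs.name.dir=/hadoop/dfs/name and
 dfs.data.dir=/hadoop/dfs/data on both servers, the same paths and config on
 the new box, and that the master hostname resolves to the new box afterwards
 (all of these are assumptions):

   # on the old server, after a clean shutdown
   bin/stop-all.sh
   tar czf dfs-backup.tar.gz -C /hadoop dfs

   # copy to the new server and unpack into the same location
   scp dfs-backup.tar.gz newmaster:/tmp/
   ssh newmaster 'mkdir -p /hadoop && tar xzf /tmp/dfs-backup.tar.gz -C /hadoop'

   # on the new server
   bin/start-all.sh
   bin/hadoop fsck / | tail -5   # expect HEALTHY once the datanodes report in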


  2. what about the decommissioning procedure of data node, is it necessary
 in
 that case ?


 You shouldn't need to. This is no different from handling failover of a
 namenode, which you ought to try from time to time anyway, with two common
 tactics
  -have ready-to-go replacement servers with the same hostname/IP and shared
 storage
  -have ready-to-go replacement servers with different hostnames, then with
 your cluster management tools bounce the workers into a new configuration.


  3.For MapRed:  need to change the master in cluster configuration files


 I'd give the new boxes the same hostnames and IPAddresses as before, and
 nothing else will notice. And I recommend having good cluster management
 tooling anyway, of course.




-- 
-MilleBii-


Re: Upgrading namenode/secondary node hardware

2011-06-15 Thread MilleBii
Do you have a recommendation for good cluster management tooling?

2011/6/15 MilleBii mille...@gmail.com

 Thx.

 #1 don't understand the edit logs remark.
 #2 good & nice
 #3 my provider will give me a server with a different IP, so I will have to
 change all /etc/hosts to point to the new master. But I don't need to change
 the master/slaves files indeed.


 2011/6/15 Steve Loughran ste...@apache.org

 On 14/06/11 22:01, MilleBii wrote:

 I want/need to upgrade my namenode/secondary node hardware. Actually also
 acts as one of the datanodes.

 Could not find any how-to guides.
 So what is the process to switch from one hardware to the next.

 1. For HDFS data : is it just a matter of copying all the hdfs data from
 old
 server to new server.


 yes, put it in the same place on your HA storage and you may not even need
 to reconfigure it. If you didn't shut down the filesystem cleanly, you'll
 need to replay the edit logs.


  2. what about the decommissioning procedure of data node, is it necessary
 in
 that case ?


 You shouldn't need to. This is no different from handling failover of a
 namenode, which you ought to try from time to time anyway, with two common
 tactics
  -have ready-to-go replacement servers with the same hostname/IP and
 shared storage
  -have ready-to-go replacement servers with different hostnames, then with
 your cluster management tools bounce the workers into a new configuration.


  3.For MapRed:  need to change the master in cluster configuration files


 I'd give the new boxes the same hostnames and IPAddresses as before, and
 nothing else will notice. And I recommend having good cluster management
 tooling anyway, of course.




 --
 -MilleBii-




-- 
-MilleBii-


Upgrading namenode/secondary node hardware

2011-06-14 Thread MilleBii
I want/need to upgrade my namenode/secondary node hardware. It actually also
acts as one of the datanodes.

I could not find any how-to guides.
So what is the process to switch from one piece of hardware to the next?

1. For HDFS data: is it just a matter of copying all the HDFS data from the old
server to the new server?
2. What about the decommissioning procedure for the datanode? Is it necessary in
that case?
3. For MapRed: do I need to change the master in the cluster configuration files?

Any help or pointers welcome!

-- 
-MilleBii-


Re: Job failing on same map twice no logs

2011-06-04 Thread MilleBii
Fixed: the slowness issue was in my Nutch configuration, which I had changed
in the meantime.
Can anyone help with where to look for the remaining potential issues? The
logs are desperately empty of errors.

2011/6/3 MilleBii mille...@gmail.com

 I have just upgraded my single-node conf to a new one. It seemed to work
 fine.
 I did run a balance operation.

 The first job failed on map63 after 4 attempts.
 A second job failed on the same map63.

 The logs are empty; what is strange is that the jobs became slower.
 In both cases map63 is executed by the master node, which was working before.
 Suspecting memory leaks, I stopped the cluster and started it again. Fine.
 Ran a hadoop fsck: fine.

 The 3rd attempt is in progress, but it looks even slower now.


 Any suggestion what to do ?



 --
 -MilleBii-




-- 
-MilleBii-


Re: Job failing on same map twice no logs

2011-06-04 Thread MilleBii
This is the best I found.

http://www.brics.dk/automaton/doc/index.html?dk/brics/automaton/RegExp.html

2011/6/4 MilleBii mille...@gmail.com

 Fixed: the slowness issue was in my Nutch configuration, which I had changed
 in the meantime.
 Can anyone help with where to look for potential issues? The logs are
 desperately empty of errors.


 2011/6/3 MilleBii mille...@gmail.com

 I have just upgraded my single-node conf to a new one. It seemed to work
 fine.
 I did run a balance operation.

 The first job failed on map63 after 4 attempts.
 A second job failed on the same map63.

 The logs are empty; what is strange is that the jobs became slower.
 In both cases map63 is executed by the master node, which was working
 before.
 Suspecting memory leaks, I stopped the cluster and started it again. Fine.
 Ran a hadoop fsck: fine.

 The 3rd attempt is in progress, but it looks even slower now.


 Any suggestion what to do ?



 --
 -MilleBii-




 --
 -MilleBii-




-- 
-MilleBii-


Job failing on same map twice no logs

2011-06-03 Thread MilleBii
I have just upgraded my single-node conf to a new one. It seemed to work
fine.
I did run a balance operation.

The first job failed on map63 after 4 attempts.
A second job failed on the same map63.

The logs are empty; what is strange is that the jobs became slower.
In both cases map63 is executed by the master node, which was working before.
Suspecting memory leaks, I stopped the cluster and started it again. Fine.
Ran a hadoop fsck: fine.

The 3rd attempt is in progress, but it looks even slower now.


Any suggestion what to do ?



-- 
-MilleBii-


Re: Adding first datanode isn't working

2011-06-02 Thread MilleBii
The firewall of the Ubuntu box.

2011/6/2 jagaran das jagaran_...@yahoo.co.in



 ufw 




 
 From: MilleBii mille...@gmail.com
 To: common-user@hadoop.apache.org
 Sent: Wed, 1 June, 2011 3:37:23 PM
 Subject: Re: Adding first datanode isn't working

 OK, found my issue: I turned off ufw and now it sees the datanode. So I need
 to fix my ufw setup.

 2011/6/1 MilleBii mille...@gmail.com

  Thx, I already did that,
  so I can ssh passphrase-less from master to master and from master to slave1.
  Same as before: the datanode & tasktracker are starting up / shutting down
  well on slave1.
 
 
 
 
 
  2011/6/1 jagaran das jagaran_...@yahoo.co.in
 
  Check whether passwordless SSH is working or not.
 
  Regards,
  Jagaran
 
 
 
  
  From: MilleBii mille...@gmail.com
  To: common-user@hadoop.apache.org
  Sent: Wed, 1 June, 2011 12:28:54 PM
  Subject: Adding first datanode isn't working
 
  Newbie on hadoop clusters.
  I have setup my two nodes conf as described by M. G. Noll
 
 
 http://www.michael-noll.com/tutorials/running-hadoop-on-ubuntu-linux-multi-node-cluster/
 
 
  The data node has datanode & tasktracker running (jps command shows
 them),
  which means start-dfs.sh and start-mapred.sh worked fine.
 
  I can also shut them down gracefully.
 
  However in the WEB UI I only see one node for the DFS
 
  Live Node : 1
  Dead Node : 0
 
  Same thing on the MapRed WEB interface.
 
  Datanode logs on slave are just empty.
  Did check the network settings both nodes have access to each other on
  relevant ports.
 
  Did make sure namespaceID are the same (
  https://issues.apache.org/jira/browse/HDFS-107)
  I did try to put data in the DFS worked but no data seemed to arrive in
  the
  slave datanode.
  Also tried a small MapRed only master node has been actually working,
 but
  that could be because there is only data in the master. Right ?
 
  --
  -MilleBii-
 
 
 
 
  --
  -MilleBii-
 



 --
 -MilleBii-




-- 
-MilleBii-


Adding first datanode isn't working

2011-06-01 Thread MilleBii
I'm a newbie on Hadoop clusters.
I have set up my two-node conf as described by M. G. Noll:
http://www.michael-noll.com/tutorials/running-hadoop-on-ubuntu-linux-multi-node-cluster/

The data node has datanode & tasktracker running (the jps command shows them),
which means start-dfs.sh and start-mapred.sh worked fine.

I can also shut them down gracefully.

However, in the web UI I only see one node for the DFS:

Live Node : 1
Dead Node : 0

Same thing on the MapRed web interface.

The datanode logs on the slave are just empty.
I did check the network settings; both nodes have access to each other on the
relevant ports.

I did make sure the namespaceIDs are the same
(https://issues.apache.org/jira/browse/HDFS-107).
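For reference, a quick way to compare them, assuming the dfs directories end up
under /app/hadoop/tmp as in the Noll tutorial (an assumption; adjust to your
hadoop.tmp.dir / dfs.* settings):

  # on the master
  grep namespaceID /app/hadoop/tmp/dfs/name/current/VERSION
  # on the slave
  grep namespaceID /app/hadoop/tmp/dfs/data/current/VERSION

The two namespaceID values must match, or the datanode will refuse to join.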
I did try to put data into the DFS; it worked, but no data seemed to arrive on
the slave datanode.
I also tried a small MapRed job; only the master node has actually been doing
work, but that could be because there is only data on the master. Right?

-- 
-MilleBii-


Re: Adding first datanode isn't working

2011-06-01 Thread MilleBii
Thx, I already did that,
so I can ssh passphrase-less from master to master and from master to slave1.
Same as before: the datanode & tasktracker are starting up / shutting down well
on slave1.
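For anyone setting this up from scratch, a minimal sketch (assuming a dedicated
hadoop user, which is a hypothetical account name; slave1 is the slave hostname
from above):

  # on the master, as the hadoop user
  ssh-keygen -t rsa -P ""                            # accept the default key path
  cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys    # master -> master
  ssh-copy-id hadoop@slave1                          # master -> slave1
  ssh slave1 'hostname'                              # should not prompt for a password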




2011/6/1 jagaran das jagaran_...@yahoo.co.in

 Check whether passwordless SSH is working or not.

 Regards,
 Jagaran



 
 From: MilleBii mille...@gmail.com
 To: common-user@hadoop.apache.org
 Sent: Wed, 1 June, 2011 12:28:54 PM
 Subject: Adding first datanode isn't working

 Newbie on hadoop clusters.
 I have setup my two nodes conf as described by M. G. Noll

 http://www.michael-noll.com/tutorials/running-hadoop-on-ubuntu-linux-multi-node-cluster/


 The data node has datanode & tasktracker running (jps command shows them),
 which means start-dfs.sh and start-mapred.sh worked fine.

 I can also shut them down gracefully.

 However in the WEB UI I only see one node for the DFS

 Live Node : 1
 Dead Node : 0

 Same thing on the MapRed WEB interface.

 Datanode logs on slave are just empty.
 Did check the network settings both nodes have access to each other on
 relevant ports.

 Did make sure namespaceID are the same (
 https://issues.apache.org/jira/browse/HDFS-107)
 I did try to put data in the DFS worked but no data seemed to arrive in the
 slave datanode.
 Also tried a small MapRed only master node has been actually working, but
 that could be because there is only data in the master. Right ?

 --
 -MilleBii-




-- 
-MilleBii-


Re: Adding first datanode isn't working

2011-06-01 Thread MilleBii
OK, found my issue: I turned off ufw and now it sees the datanode. So I need to
fix my ufw setup.
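For reference, a sketch of the kind of ufw rules that would do it, assuming the
cluster sits on 192.168.0.0/24, the Noll tutorial's ports 9000/9001, and the
default 0.20 web/data ports (all assumptions; check your core-site.xml and
mapred-site.xml):

  # on each node, allow the other cluster nodes in
  sudo ufw allow from 192.168.0.0/24 to any port 9000 proto tcp    # NameNode RPC
  sudo ufw allow from 192.168.0.0/24 to any port 9001 proto tcp    # JobTracker RPC
  sudo ufw allow from 192.168.0.0/24 to any port 50010 proto tcp   # DataNode data transfer
  sudo ufw allow from 192.168.0.0/24 to any port 50020 proto tcp   # DataNode IPC
  sudo ufw allow from 192.168.0.0/24 to any port 50030,50060,50070,50075 proto tcp  # web UIs
  sudo ufw status                                                  # verify the rules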

2011/6/1 MilleBii mille...@gmail.com

 Thx, I already did that,
 so I can ssh passphrase-less from master to master and from master to slave1.
 Same as before: the datanode & tasktracker are starting up / shutting down well
 on slave1.





 2011/6/1 jagaran das jagaran_...@yahoo.co.in

 Check whether passwordless SSH is working or not.

 Regards,
 Jagaran



 
 From: MilleBii mille...@gmail.com
 To: common-user@hadoop.apache.org
 Sent: Wed, 1 June, 2011 12:28:54 PM
 Subject: Adding first datanode isn't working

 Newbie on hadoop clusters.
 I have setup my two nodes conf as described by M. G. Noll

 http://www.michael-noll.com/tutorials/running-hadoop-on-ubuntu-linux-multi-node-cluster/


 The data node has datanode & tasktracker running (jps command shows them),
 which means start-dfs.sh and start-mapred.sh worked fine.

 I can also shut them down gracefully.

 However in the WEB UI I only see one node for the DFS

 Live Node : 1
 Dead Node : 0

 Same thing on the MapRed WEB interface.

 Datanode logs on slave are just empty.
 Did check the network settings both nodes have access to each other on
 relevant ports.

 Did make sure namespaceID are the same (
 https://issues.apache.org/jira/browse/HDFS-107)
 I did try to put data in the DFS worked but no data seemed to arrive in
 the
 slave datanode.
 Also tried a small MapRed only master node has been actually working, but
 that could be because there is only data in the master. Right ?

 --
 -MilleBii-




 --
 -MilleBii-




-- 
-MilleBii-


Re: Could not obtain block

2010-01-30 Thread MilleBii
I increased the ulimit to 64000 ... same problem.
stop/start-all ... same problem, but on a different block which of course is
present, so it looks like there is nothing wrong with the actual data in
HDFS.

I use the Nutch default Hadoop 0.19.x; anything related?

2010/1/30 Ken Goodhope kengoodh...@gmail.com

 "Could not obtain block" errors are often caused by running out of
 available
 file handles.  You can confirm this by going to the shell and entering
 ulimit -n.  If it says 1024, the default, then you will want to increase
 it to about 64,000.

 On Fri, Jan 29, 2010 at 4:06 PM, MilleBii mille...@gmail.com wrote:

  X-POST with Nutch mailing list.
 
  HEEELP !!!
 
  Kind of get stuck on this one.
  I backed up my HDFS data, reformatted the HDFS, put the data back, tried to
  merge my segments together, and it explodes again.

  Exception in thread "Lucene Merge Thread #0"
  org.apache.lucene.index.MergePolicy$MergeException: java.io.IOException:
  Could not obtain block: blk_4670839132945043210_1585
  file=/user/nutch/crawl/indexed-segments/20100113003609/part-0/_ym.frq
      at org.apache.lucene.index.ConcurrentMergeScheduler.handleMergeException(ConcurrentMergeScheduler.java:309)

  If I go into the hdfs/data directory I DO find the faulty block.
  Could it be a synchronization problem in the segment merger code?
 
  2010/1/29 MilleBii mille...@gmail.com
 
   I'm looking for some help. I'm a Nutch user; everything was working fine,
   but now I get the following error when indexing.
   I have a single-node pseudo-distributed setup.
   Some people on the Nutch list indicated to me that it could be full, so I
   removed many things, and HDFS is far from full.
   This file/directory was perfectly OK the day before.
   I did a hadoop fsck... the report says healthy.

   What can I do?

   Is it safe to do a Linux fsck, just in case?
  
   Caused by: java.io.IOException: Could not obtain block:
   blk_8851198258748412820_9031
   file=/user/nutch/crawl/indexed-segments/20100111233601/part-0/_103.frq
  
  
   --
   -MilleBii-
  
 
 
 
  --
  -MilleBii-
 



 --
 Ken Goodhope
 Cell: 425-750-5616

 362 Bellevue Way NE Apt N415
 Bellevue WA, 98004




-- 
-MilleBii-


Re: Could not obtain block

2010-01-30 Thread MilleBii
Ken,

FIXED !!! SO MUCH THANKS

Setting ulimit at the command prompt wasn't enough; one needs to hard-set it
and reboot, as explained here:
http://posidev.com/blog/2009/06/04/set-ulimit-parameters-on-ubuntu/
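For reference, the hard-set part boils down to something like this, assuming
the daemons run as a dedicated hadoop user (a hypothetical account name; the
exact steps are in the linked post):

  # /etc/security/limits.conf
  hadoop  soft  nofile  64000
  hadoop  hard  nofile  64000

  # /etc/pam.d/common-session (so the limit is actually applied at login)
  session required pam_limits.so

Then reboot (or at least log in again and restart the Hadoop daemons) and
verify with ulimit -n from the hadoop account.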




2010/1/30 MilleBii mille...@gmail.com

 I increased the ulimit to 64000 ... same problem.
 stop/start-all ... same problem, but on a different block which of course is
 present, so it looks like there is nothing wrong with the actual data in
 HDFS.

 I use the Nutch default Hadoop 0.19.x; anything related?

 2010/1/30 Ken Goodhope kengoodh...@gmail.com

 "Could not obtain block" errors are often caused by running out of
 available
 file handles.  You can confirm this by going to the shell and entering
 ulimit -n.  If it says 1024, the default, then you will want to increase
 it to about 64,000.

 On Fri, Jan 29, 2010 at 4:06 PM, MilleBii mille...@gmail.com wrote:

  X-POST with Nutch mailing list.
 
  HEEELP !!!
 
  Kind of get stuck on this one.
  I backed up my HDFS data, reformatted the HDFS, put the data back, tried to
  merge my segments together, and it explodes again.

  Exception in thread "Lucene Merge Thread #0"
  org.apache.lucene.index.MergePolicy$MergeException: java.io.IOException:
  Could not obtain block: blk_4670839132945043210_1585
  file=/user/nutch/crawl/indexed-segments/20100113003609/part-0/_ym.frq
      at org.apache.lucene.index.ConcurrentMergeScheduler.handleMergeException(ConcurrentMergeScheduler.java:309)

  If I go into the hdfs/data directory I DO find the faulty block.
  Could it be a synchronization problem in the segment merger code?
 
  2010/1/29 MilleBii mille...@gmail.com
 
   I'm looking for some help. I'm a Nutch user; everything was working fine,
   but now I get the following error when indexing.
   I have a single-node pseudo-distributed setup.
   Some people on the Nutch list indicated to me that it could be full, so I
   removed many things, and HDFS is far from full.
   This file/directory was perfectly OK the day before.
   I did a hadoop fsck... the report says healthy.

   What can I do?

   Is it safe to do a Linux fsck, just in case?
  
   Caused by: java.io.IOException: Could not obtain block:
   blk_8851198258748412820_9031
   file=/user/nutch/crawl/indexed-segments/20100111233601/part-0/_103.frq
  
  
   --
   -MilleBii-
  
 
 
 
  --
  -MilleBii-
 



 --
 Ken Goodhope
 Cell: 425-750-5616

 362 Bellevue Way NE Apt N415
 Bellevue WA, 98004




 --
 -MilleBii-




-- 
-MilleBii-


Re: Could not obtain block

2010-01-29 Thread MilleBii
X-POST with Nutch mailing list.

HEEELP !!!

I'm kind of stuck on this one.
I backed up my HDFS data, reformatted the HDFS, put the data back, tried to merge
my segments together, and it explodes again.

Exception in thread "Lucene Merge Thread #0"
org.apache.lucene.index.MergePolicy$MergeException: java.io.IOException:
Could not obtain block: blk_4670839132945043210_1585
file=/user/nutch/crawl/indexed-segments/20100113003609/part-0/_ym.frq
    at org.apache.lucene.index.ConcurrentMergeScheduler.handleMergeException(ConcurrentMergeScheduler.java:309)

If I go into the hdfs/data directory I DO find the faulty block.
Could it be a synchronization problem in the segment merger code?

2010/1/29 MilleBii mille...@gmail.com

 I'm looking for some help. I'm a Nutch user; everything was working fine, but
 now I get the following error when indexing.
 I have a single-node pseudo-distributed setup.
 Some people on the Nutch list indicated to me that it could be full, so I
 removed many things, and HDFS is far from full.
 This file/directory was perfectly OK the day before.
 I did a hadoop fsck... the report says healthy.

 What can I do?

 Is it safe to do a Linux fsck, just in case?

 Caused by: java.io.IOException: Could not obtain block:
 blk_8851198258748412820_9031
 file=/user/nutch/crawl/indexed-segments/20100111233601/part-0/_103.frq


 --
 -MilleBii-




-- 
-MilleBii-