MR Jobs managing on Hadoop cluster

2011-07-04 Thread Ophir Cohen
Hi,
We recently deployed a 20-node cluster in our organization. Shortly it will be
doubled (at least) and will start to handle billions of rows.

My question concerns the management options.

I would like to let users (i.e. internal developers) submit, schedule and
monitor their jobs.
Of course, I can give them command line access and let them submit jobs
directly.
The problem is that this is not a complete management system.

In my vision I see one web page that makes it possible to submit a job
(i.e. add a jar and set its schedule), monitor the jobs and even add alert
options (we work with HPOV).

Currently there are Ganglia, the command line and, say, Oozie - but
not all in one handy place.

Is there an app that does this?
Any other ideas?

Thanks,
Ophir


Re: Possible issue when creating/deleting HBase table multiple times

2011-07-04 Thread Florin P
Hello!

Thank you for your responses. I'm working on  0.90.1-cdh3u1-SNAPSHOT. Yes,
if I have the following scenario:
1. disable table
2. delete table
3. create table

If I don't put the guard not(HBaseAdmin.tableExists(hbTable)) before creating
the table (at step 3), then it will end up in
org.apache.hadoop.hbase.TableExistsException.
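
For illustration, a minimal sketch of that guarded drop/recreate sequence against
the 0.90-era HBaseAdmin API; the column family name and configuration handling
below are assumptions, not taken from the original code (the table name is the one
from the quoted exception):

import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.client.HBaseAdmin;

public class DropCreateSketch {
  public static void main(String[] args) throws Exception {
    HBaseAdmin admin = new HBaseAdmin(HBaseConfiguration.create());
    String hbTable = "use_case_drop";   // table name as in the quoted exception

    // steps 1 + 2: disable and delete the table if it is present
    if (admin.tableExists(hbTable)) {
      if (!admin.isTableDisabled(hbTable)) {
        admin.disableTable(hbTable);
      }
      admin.deleteTable(hbTable);
    }

    // step 3: recreate, guarded so a repeat of the cycle does not fail
    // with TableExistsException
    if (!admin.tableExists(hbTable)) {
      HTableDescriptor desc = new HTableDescriptor(hbTable);
      desc.addFamily(new HColumnDescriptor("cf"));  // illustrative column family
      admin.createTable(desc);
    }
  }
}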
Regards,
  Florin  

--- On Fri, 7/1/11, Ted Yu yuzhih...@gmail.com wrote:

 From: Ted Yu yuzhih...@gmail.com
 Subject: Re: Possible issue when creating/deleting HBase table multiple times
 To: cdh-...@cloudera.org
 Cc: user@hbase.apache.org
 Date: Friday, July 1, 2011, 6:10 PM
 Seems to be a CDH question.
 
 On Fri, Jul 1, 2011 at 3:07 PM, Stack st...@duboce.net
 wrote:
 
  And if you try to create it, does it say it's already there?
 
  We have a job that does this over and over every few
 minutes and it's
  been running for months on end.  I wonder what's
 different.  You are on
  0.90.3?
 
  St.Ack
 
  On Fri, Jul 1, 2011 at 12:59 AM, Florin P florinp...@yahoo.com
 wrote:
   Hello!
     I'm using HBase
 0.90.1-cdh3u1-SNAPSHOT. Running the attached
  code(adapted after sujee at sujee.net), after a while
   I was getting the below exception. The main
 scenario is like this:
  
   1. if table does not exist, create it
   2. populate the table with some data
   3. flush the data
   4. close the table
   5. disable table
   6. drop table
   7. repeat steps 1-6 for several times. After a
 while you'll get the
  mentioned error.
    Please help.
  
   Regards,
    Florin
  
   org.apache.hadoop.hbase.TableNotFoundException:
  org.apache.hadoop.hbase.TableNotFoundException:
 use_case_drop
          at
 sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native
  Method)
          at
 
 sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
          at
 
 sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
          at
 java.lang.reflect.Constructor.newInstance(Constructor.java:513)
          at
 
 org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:95)
          at
 
 org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:79)
          at
 
 org.apache.hadoop.hbase.client.HBaseAdmin.disableTableAsync(HBaseAdmin.java:531)
          at
 
 org.apache.hadoop.hbase.client.HBaseAdmin.disableTable(HBaseAdmin.java:550)
          at
 
 org.apache.hadoop.hbase.client.HBaseAdmin.disableTable(HBaseAdmin.java:538)
          at
 com.sample.hbase.HBaseDropCreate.main(HBaseDropCreate.java:80)
  
  
 



Re: MR Jobs managing on Hadoop cluster

2011-07-04 Thread Ted Dunning
There is also Azkaban (http://sna-projects.com/azkaban/) which provides the
scheduling and some historical statistics.  Azkaban is much simpler than
Oozie, but lacks some capabilities.


On Sun, Jul 3, 2011 at 11:19 PM, Ophir Cohen oph...@gmail.com wrote:

 Hi,
 Recently we deployed 20-nodes cluster in our organization. Shortly it would
 doubled (at least) and will start to handle billions of rows.

 My question concerns the managing option.

 I would like to let users (i.e. internal developers) to submit, schedule
 and
 monitor their jobs.
 Of course, I can give them command line access and make them submit jobs
 directly.
 The problem is that its not a whole management system.

 In my vision I can see one web page that holds possibility to submit job
 (i.e. add jar and set scheduling), monitor the jobs and even add alert
 options (we work with HPOV).

 In the current state there is Ganglia, command line and, say, Oozie - but
 not in one handy place

 Is there any app that do so?
 Any other ideas?

 Thanks,
 Ophir



Re: Errors after major compaction

2011-07-04 Thread Ted Yu
Thanks for the understanding. 

Can you log a JIRA and put your ideas below in it ?



On Jul 4, 2011, at 12:42 AM, Eran Kutner e...@gigya.com wrote:

 Thanks for the explanation Ted,
 
 I will try to apply HBASE-3789 and hope for the best but my understanding is
 that it doesn't really solve the problem, it only reduces the probability of
 it happening, at least in one particular scenario. I would hope for a more
 robust solution.
 My concern is that the region allocation process seems to rely too much on
 timing considerations and doesn't seem to take enough measures to guarantee
 conflicts do not occur. I understand that in a distributed environment, when
 you don't get a timely response from a remote machine you can't know for
 sure if it did or did not receive the request, however there are things that
 can be done to mitigate this and reduce the conflict time significantly. For
 example, when I run hbck it knows that some regions are multiply assigned;
 the master could do the same and try to resolve the conflict. Another
 approach would be to handle late responses, even if the response from the
 remote machine arrives after it was assumed to be dead the master should
 have enough information to know it had created a conflict by assigning the
 region to another server. An even better solution, I think, is for the RS to
 periodically test that it is indeed the rightful owner of every region it
 holds and relinquish control over the region if it's not.
 Obviously a state where two RSs hold the same region is pathological and can
 lead to data loss, as demonstrated in my case. The system should be able to
 actively protect itself against such a scenario. It probably doesn't need
 saying but there is really nothing worse for a data storage system than data
 loss.
 
 In my case the problem didn't happen in the initial phase but after
 disabling and enabling a table with about 12K regions.
 
 -eran
 
 
 
 On Sun, Jul 3, 2011 at 23:49, Ted Yu yuzhih...@gmail.com wrote:
 
 Let me try to answer some of your questions.
 The two paragraphs below were written along my reasoning which is in
 reverse
 order of the actual call sequence.
 
 For #4 below, the log indicates that the following was executed:
 private void assign(final RegionState state, final boolean setOfflineInZK,
 final boolean forceNewPlan) {
   for (int i = 0; i < this.maximumAssignmentAttempts; i++) {
 if (setOfflineInZK && !setOfflineInZooKeeper(state)) return;
 
 The above was due to the timeout which you noted in #2 which would have
 caused
 TimeoutMonitor.chore() to run this code (line 1787)
 
 for (Map.Entry<HRegionInfo, Boolean> e: assigns.entrySet()) {
   assign(e.getKey(), false, e.getValue());
 }
 
 This means there is a lack of coordination between
 assignmentManager.TimeoutMonitor and OpenedRegionHandler
 
 The reason I mention HBASE-3789 is that it is marked as Incompatible change
 and is in TRUNK already.
 The application of HBASE-3789 to 0.90 branch would change the behavior
 (timing) of region assignment.
 
 I think it makes sense to evaluate the effect of HBASE-3789 in 0.90.4
 
 BTW were the incorrect region assignments observed for a table with
 multiple
 initial regions ?
 If so, I have HBASE-4010 in TRUNK which speeds up initial region assignment
 by about 50%.
 
 Cheers
 
 On Sun, Jul 3, 2011 at 12:02 PM, Eran Kutner e...@gigya.com wrote:
 
 Ted,
 So if I understand correctly, the theory is that because of the issue
 fixed in HBASE-3789 the master took too long to detect that the region
 was
 successfully opened by the first server so it forced closed it and
 transitioned to a second server, but there are a few things about this
 scenario I don't understand, probably because I don't know enough about
 the
 inner workings of the region transition process and would appreciate it
 if
 you can help me understand:
 1. The RS opened the region at 16:37:49.
 2. The master started handling the opened event at 16:39:54 - this delay
 can
 probably be explained by HBASE-3789
 3. At 16:39:54 the master log says: Opened region gs_raw_events,. on
 hadoop1-s05.farm-ny.gigya.com
 4. Then at 16:40:00 the master log says: master:6-0x13004a31d7804c4
 Creating (or updating) unassigned node for 584dac5cc70d8682f71c4675a843c3
 09 with OFFLINE state - why did it decide to take the region offline
 after
 learning it was successfully opened?
 5. Then it tries to reopen the region on hadoop1-s05, which indicates in
 its
 log that the open request failed because the region was already open -
 why
 didn't the master use that information to learn that the region was
 already
 open?
 6. At 16:43:57 the master decides the region transition timed out and
 starts
 forcing the transition - HBASE-3789 again?
 7. Now the master forces the transition of the region to hadoop1-s02 but
 there is no sign of that on hadoop1-s05 - why doesn't the old RS
 (hadoop1-s05) detect that it is no longer the master and relinquishes
 control of the 

M/R scan problem

2011-07-04 Thread Lior Schachter
Hi all,
I'm running a scan using the M/R framework.
My table contains hundreds of millions of rows, and I'm scanning about 50
million of them using a start/stop key.

The problem is that some map tasks get stuck and the task tracker kills
these maps after 600 seconds. When the task is retried, everything works fine
(sometimes).

To verify that the problem is in hbase (and not in the map code) I removed
all the code from my map function, so it looks like this:
public void map(ImmutableBytesWritable key, Result value, Context context)
throws IOException, InterruptedException {
}

Also, when the map got stuck on a region, I tried to scan this region (using
a simple scan from a Java main) and it worked fine.

Any ideas ?

Thanks,
Lior


Re: M/R scan problem

2011-07-04 Thread Ted Yu
Do you use TableInputFormat ?
To scan a large number of rows, it would be better to produce one split per
region.

What HBase version do you use ?
Do you find any exception in master / region server logs around the moment
of timeout ?

Cheers

On Mon, Jul 4, 2011 at 4:48 AM, Lior Schachter li...@infolinks.com wrote:

 Hi all,
 I'm running a scan using the M/R framework.
 My table contains hundreds of millions of rows and I'm scanning using
 start/stop key about 50 million rows.

 The problem is that some map tasks get stuck and the task manager kills
 these maps after 600 seconds. When retrying the task everything works fine
 (sometimes).

 To verify that the problem is in hbase (and not in the map code) I removed
 all the code from my map function, so it looks like this:
 public void map(ImmutableBytesWritable key, Result value, Context context)
 throws IOException, InterruptedException {
 }

 Also, when the map got stuck on a region, I tried to scan this region
 (using
 simple scan from a Java main) and it worked fine.

 Any ideas ?

 Thanks,
 Lior



client-side caching

2011-07-04 Thread Claudio Martella
Hello list,

I'm using HBase 0.90.3 on a 5-node cluster. I'm using a table as a
string-to-long map. As I'm using this map a lot, I was thinking about
installing memcache on the client side, so as to avoid flooding HBase for
the same value over and over.

What is the best practice in these situations? Is there some client-side caching
already in HBase?
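
For reference, a rough sketch of a small in-process LRU cache in front of the
HBase gets, as an alternative to an external memcache; the column family,
qualifier and cache size are assumptions, and there is no invalidation, so cached
values can go stale:

import java.io.IOException;
import java.util.LinkedHashMap;
import java.util.Map;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.util.Bytes;

public class CachedStringLongMap {
  private static final byte[] FAMILY = Bytes.toBytes("f");     // assumed family
  private static final byte[] QUALIFIER = Bytes.toBytes("v");  // assumed qualifier

  private final HTable table;
  // access-ordered LinkedHashMap that evicts the eldest entry beyond maxEntries
  private final Map<String, Long> cache;

  public CachedStringLongMap(String tableName, final int maxEntries) throws IOException {
    this.table = new HTable(HBaseConfiguration.create(), tableName);
    this.cache = new LinkedHashMap<String, Long>(16, 0.75f, true) {
      @Override
      protected boolean removeEldestEntry(Map.Entry<String, Long> eldest) {
        return size() > maxEntries;
      }
    };
  }

  public synchronized Long get(String key) throws IOException {
    Long cached = cache.get(key);
    if (cached != null) {
      return cached;                                // served locally, no RPC
    }
    Result r = table.get(new Get(Bytes.toBytes(key)));
    byte[] raw = r.getValue(FAMILY, QUALIFIER);
    if (raw == null) {
      return null;                                  // misses are not cached here
    }
    Long value = Long.valueOf(Bytes.toLong(raw));
    cache.put(key, value);
    return value;
  }
}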

Best,

Claudio

-- 
Claudio Martella
Digital Technologies
Unit Research  Development - Analyst

TIS innovation park
Via Siemens 19 | Siemensstr. 19
39100 Bolzano | 39100 Bozen
Tel. +39 0471 068 123
Fax  +39 0471 068 129
claudio.marte...@tis.bz.it http://www.tis.bz.it







Re: M/R scan problem

2011-07-04 Thread Lior Schachter
1. yes - I configure my job using this line:
TableMapReduceUtil.initTableMapperJob(HBaseConsts.URLS_TABLE_NAME, scan,
ScanMapper.class, Text.class, MapWritable.class, job)

which internally uses TableInputFormat.class

2. One split per region ? What do you mean ? How do I do that ?

3. hbase version 0.90.2

4. no exceptions. the logs are very clean.



On Mon, Jul 4, 2011 at 5:22 PM, Ted Yu yuzhih...@gmail.com wrote:

 Do you use TableInputFormat ?
 To scan large number of rows, it would be better to produce one Split per
 region.

 What HBase version do you use ?
 Do you find any exception in master / region server logs around the moment
 of timeout ?

 Cheers

 On Mon, Jul 4, 2011 at 4:48 AM, Lior Schachter li...@infolinks.com
 wrote:

  Hi all,
  I'm running a scan using the M/R framework.
  My table contains hundreds of millions of rows and I'm scanning using
  start/stop key about 50 million rows.
 
  The problem is that some map tasks get stuck and the task manager kills
  these maps after 600 seconds. When retrying the task everything works
 fine
  (sometimes).
 
  To verify that the problem is in hbase (and not in the map code) I
 removed
  all the code from my map function, so it looks like this:
  public void map(ImmutableBytesWritable key, Result value, Context
 context)
  throws IOException, InterruptedException {
  }
 
  Also, when the map got stuck on a region, I tried to scan this region
  (using
  simple scan from a Java main) and it worked fine.
 
  Any ideas ?
 
  Thanks,
  Lior
 



Re: M/R scan problem

2011-07-04 Thread Ted Yu
For #2, see TableInputFormatBase.getSplits():
   * Calculates the splits that will serve as input for the map tasks. The
   * number of splits matches the number of regions in a table.


On Mon, Jul 4, 2011 at 7:37 AM, Lior Schachter li...@infolinks.com wrote:

 1. yes - I configure my job using this line:
 TableMapReduceUtil.initTableMapperJob(HBaseConsts.URLS_TABLE_NAME, scan,
 ScanMapper.class, Text.class, MapWritable.class, job)

 which internally uses TableInputFormat.class

 2. One split per region ? What do you mean ? How do I do that ?

 3. hbase version 0.90.2

 4. no exceptions. the logs are very clean.



 On Mon, Jul 4, 2011 at 5:22 PM, Ted Yu yuzhih...@gmail.com wrote:

  Do you use TableInputFormat ?
  To scan large number of rows, it would be better to produce one Split per
  region.
 
  What HBase version do you use ?
  Do you find any exception in master / region server logs around the
 moment
  of timeout ?
 
  Cheers
 
  On Mon, Jul 4, 2011 at 4:48 AM, Lior Schachter li...@infolinks.com
  wrote:
 
   Hi all,
   I'm running a scan using the M/R framework.
   My table contains hundreds of millions of rows and I'm scanning using
   start/stop key about 50 million rows.
  
   The problem is that some map tasks get stuck and the task manager kills
   these maps after 600 seconds. When retrying the task everything works
  fine
   (sometimes).
  
   To verify that the problem is in hbase (and not in the map code) I
  removed
   all the code from my map function, so it looks like this:
   public void map(ImmutableBytesWritable key, Result value, Context
  context)
   throws IOException, InterruptedException {
   }
  
   Also, when the map got stuck on a region, I tried to scan this region
   (using
   simple scan from a Java main) and it worked fine.
  
   Any ideas ?
  
   Thanks,
   Lior
  
 



Re: Errors after major compaction

2011-07-04 Thread Eran Kutner
Sure, I'll do that.

-eran



On Mon, Jul 4, 2011 at 12:30, Ted Yu yuzhih...@gmail.com wrote:

 Thanks for the understanding.

 Can you log a JIRA and put your ideas below in it ?



 On Jul 4, 2011, at 12:42 AM, Eran Kutner e...@gigya.com wrote:

  Thanks for the explanation Ted,
 
  I will try to apply HBASE-3789 and hope for the best but my understanding
 is
  that it doesn't really solve the problem, it only reduces the probability
 of
  it happening, at least in one particular scenario. I would hope for a
 more
  robust solution.
  My concern is that the region allocation process seems to rely too much
 on
  timing considerations and doesn't seem to take enough measures to
 guarantee
  conflicts do not occur. I understand that in a distributed environment,
 when
  you don't get a timely response from a remote machine you can't know for
  sure if it did or did not receive the request, however there are things
 that
  can be done to mitigate this and reduce the conflict time significantly.
 For
  example, when I run dbck it knows that some regions are multiply
 assigned,
  the master could do the same and try to resolve the conflict. Another
  approach would be to handle late responses, even if the response from the
  remote machine arrives after it was assumed to be dead the master should
  have enough information to know it had created a conflict by assigning
 the
  region to another server. An even better solution, I think, is for the RS
 to
  periodically test that it is indeed the rightful owner of every region it
  holds and relinquish control over the region if it's not.
  Obviously a state where two RSs hold the same region is pathological and
 can
  lead to data loss, as demonstrated in my case. The system should be able
 to
  actively protect itself against such a scenario. It probably doesn't need
  saying but there is really nothing worse for a data storage system than
 data
  loss.
 
  In my case the problem didn't happen in the initial phase but after
  disabling and enabling a table with about 12K regions.
 
  -eran
 
 
 
  On Sun, Jul 3, 2011 at 23:49, Ted Yu yuzhih...@gmail.com wrote:
 
  Let me try to answer some of your questions.
  The two paragraphs below were written along my reasoning which is in
  reverse
  order of the actual call sequence.
 
  For #4 below, the log indicates that the following was executed:
  private void assign(final RegionState state, final boolean
 setOfflineInZK,
  final boolean forceNewPlan) {
    for (int i = 0; i < this.maximumAssignmentAttempts; i++) {
  if (setOfflineInZK && !setOfflineInZooKeeper(state)) return;
 
  The above was due to the timeout which you noted in #2 which would have
  caused
  TimeoutMonitor.chore() to run this code (line 1787)
 
  for (Map.Entry<HRegionInfo, Boolean> e: assigns.entrySet()) {
assign(e.getKey(), false, e.getValue());
  }
 
  This means there is lack of coordination between
  assignmentManager.TimeoutMonitor and OpenedRegionHandler
 
  The reason I mention HBASE-3789 is that it is marked as Incompatible
 change
  and is in TRUNK already.
  The application of HBASE-3789 to 0.90 branch would change the behavior
  (timing) of region assignment.
 
  I think it makes sense to evaluate the effect of HBASE-3789 in 0.90.4
 
  BTW were the incorrect region assignments observed for a table with
  multiple
  initial regions ?
  If so, I have HBASE-4010 in TRUNK which speeds up initial region
 assignment
  by about 50%.
 
  Cheers
 
  On Sun, Jul 3, 2011 at 12:02 PM, Eran Kutner e...@gigya.com wrote:
 
  Ted,
  So if I understand correctly the the theory is that because of the
 issue
  fixed in HBASE-3789 the master took too long to detect that the region
  was
  successfully opened by the first server so it forced closed it and
  transitioned to a second server, but there are a few things about this
  scenario I don't understand, probably because I don't know enough about
  the
  inner workings of the region transition process and would appreciate it
  if
  you can help me understand:
  1. The RS opened the region at 16:37:49.
  2. The master started handling the opened event at 16:39:54 - this
 delay
  can
  probably be explained by HBASE-3789
  3. At 16:39:54 the master log says: Opened region gs_raw_events,.
 on
  hadoop1-s05.farm-ny.gigya.com
  4. Then at 16:40:00 the master log says: master:6-0x13004a31d7804c4
  Creating (or updating) unassigned node for
 584dac5cc70d8682f71c4675a843c3
  09 with OFFLINE state - why did it decide to take the region offline
  after
  learning it was successfully opened?
  5. Then it tries to reopen the region on hadoop1-s05, which indicates
 in
  its
  log that the open request failed because the region was already open -
  why
  didn't the master use that information to learn that the region was
  already
  open?
  6. At 16:43:57 the master decides the region transition timed out and
  starts
  forcing the transition - HBASE-3789 again?
  7. Now the master 

Re: M/R scan problem

2011-07-04 Thread Lior Schachter
1. Currently every map gets one region, so I don't understand what
difference using the splits will make.
2. How should I use the TableInputFormatBase.getSplits() ? Could not find
examples for that.

Thanks,
Lior


On Mon, Jul 4, 2011 at 5:55 PM, Ted Yu yuzhih...@gmail.com wrote:

 For #2, see TableInputFormatBase.getSplits():
   * Calculates the splits that will serve as input for the map tasks. The
   * number of splits matches the number of regions in a table.


 On Mon, Jul 4, 2011 at 7:37 AM, Lior Schachter li...@infolinks.com
 wrote:

  1. yes - I configure my job using this line:
  TableMapReduceUtil.initTableMapperJob(HBaseConsts.URLS_TABLE_NAME, scan,
  ScanMapper.class, Text.class, MapWritable.class, job)
 
  which internally uses TableInputFormat.class
 
  2. One split per region ? What do you mean ? How do I do that ?
 
  3. hbase version 0.90.2
 
  4. no exceptions. the logs are very clean.
 
 
 
  On Mon, Jul 4, 2011 at 5:22 PM, Ted Yu yuzhih...@gmail.com wrote:
 
   Do you use TableInputFormat ?
   To scan large number of rows, it would be better to produce one Split
 per
   region.
  
   What HBase version do you use ?
   Do you find any exception in master / region server logs around the
  moment
   of timeout ?
  
   Cheers
  
   On Mon, Jul 4, 2011 at 4:48 AM, Lior Schachter li...@infolinks.com
   wrote:
  
Hi all,
I'm running a scan using the M/R framework.
My table contains hundreds of millions of rows and I'm scanning using
start/stop key about 50 million rows.
   
The problem is that some map tasks get stuck and the task manager
 kills
these maps after 600 seconds. When retrying the task everything works
   fine
(sometimes).
   
To verify that the problem is in hbase (and not in the map code) I
   removed
all the code from my map function, so it looks like this:
public void map(ImmutableBytesWritable key, Result value, Context
   context)
throws IOException, InterruptedException {
}
   
Also, when the map got stuck on a region, I tried to scan this region
(using
simple scan from a Java main) and it worked fine.
   
Any ideas ?
   
Thanks,
Lior
   
  
 



Re: M/R scan problem

2011-07-04 Thread Ted Yu
I wasn't clear in my previous email.
That was not an answer to why the map tasks got stuck.
TableInputFormatBase.getSplits() is being called already.

Can you try getting a jstack of one of the map tasks before the task tracker kills
it ?

Thanks

On Mon, Jul 4, 2011 at 8:15 AM, Lior Schachter li...@infolinks.com wrote:

 1. Currently every map gets one region. So I don't understand what
 difference will it make using the splits.
 2. How should I use the TableInputFormatBase.getSplits() ? Could not find
 examples for that.

 Thanks,
 Lior


 On Mon, Jul 4, 2011 at 5:55 PM, Ted Yu yuzhih...@gmail.com wrote:

  For #2, see TableInputFormatBase.getSplits():
* Calculates the splits that will serve as input for the map tasks. The
* number of splits matches the number of regions in a table.
 
 
  On Mon, Jul 4, 2011 at 7:37 AM, Lior Schachter li...@infolinks.com
  wrote:
 
   1. yes - I configure my job using this line:
   TableMapReduceUtil.initTableMapperJob(HBaseConsts.URLS_TABLE_NAME,
 scan,
   ScanMapper.class, Text.class, MapWritable.class, job)
  
   which internally uses TableInputFormat.class
  
   2. One split per region ? What do you mean ? How do I do that ?
  
   3. hbase version 0.90.2
  
   4. no exceptions. the logs are very clean.
  
  
  
   On Mon, Jul 4, 2011 at 5:22 PM, Ted Yu yuzhih...@gmail.com wrote:
  
Do you use TableInputFormat ?
To scan large number of rows, it would be better to produce one Split
  per
region.
   
What HBase version do you use ?
Do you find any exception in master / region server logs around the
   moment
of timeout ?
   
Cheers
   
On Mon, Jul 4, 2011 at 4:48 AM, Lior Schachter li...@infolinks.com
wrote:
   
 Hi all,
 I'm running a scan using the M/R framework.
 My table contains hundreds of millions of rows and I'm scanning
 using
 start/stop key about 50 million rows.

 The problem is that some map tasks get stuck and the task manager
  kills
 these maps after 600 seconds. When retrying the task everything
 works
fine
 (sometimes).

 To verify that the problem is in hbase (and not in the map code) I
removed
 all the code from my map function, so it looks like this:
 public void map(ImmutableBytesWritable key, Result value, Context
context)
 throws IOException, InterruptedException {
 }

 Also, when the map got stuck on a region, I tried to scan this
 region
 (using
 simple scan from a Java main) and it worked fine.

 Any ideas ?

 Thanks,
 Lior

   
  
 



Re: M/R scan problem

2011-07-04 Thread Lior Schachter
I used kill -3, following the thread dump:

Full thread dump Java HotSpot(TM) 64-Bit Server VM (19.1-b02 mixed mode):

IPC Client (47) connection to /127.0.0.1:59759 from hadoop daemon
prio=10 tid=0x2aaab05ca800 nid=0x4eaf in Object.wait()
[0x403c1000]
   java.lang.Thread.State: TIMED_WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
- waiting on 0xf9dba860 (a 
org.apache.hadoop.ipc.Client$Connection)
at org.apache.hadoop.ipc.Client$Connection.waitForWork(Client.java:403)
- locked 0xf9dba860 (a 
org.apache.hadoop.ipc.Client$Connection)
at org.apache.hadoop.ipc.Client$Connection.run(Client.java:445)

SpillThread daemon prio=10 tid=0x2aaab0585000 nid=0x4c99 waiting
on condition [0x404c2000]
   java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for  0xf9af0c38 (a
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:158)
at 
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1987)
at 
org.apache.hadoop.mapred.MapTask$MapOutputBuffer$SpillThread.run(MapTask.java:1169)

main-EventThread daemon prio=10 tid=0x2aaab035d000 nid=0x4c95
waiting on condition [0x41207000]
   java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for  0xf9af5f58 (a
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:158)
at 
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1987)
at 
java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:399)
at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:502)

main-SendThread(hadoop09.infolinks.local:2181) daemon prio=10
tid=0x2aaab035c000 nid=0x4c94 runnable [0x40815000]
   java.lang.Thread.State: RUNNABLE
at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:210)
at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:65)
at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:69)
- locked 0xf9af61a8 (a sun.nio.ch.Util$2)
- locked 0xf9af61b8 (a java.util.Collections$UnmodifiableSet)
- locked 0xf9af6160 (a sun.nio.ch.EPollSelectorImpl)
at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:80)
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1107)

communication thread daemon prio=10 tid=0x4d02
nid=0x4c93 waiting on condition [0x42497000]
   java.lang.Thread.State: RUNNABLE
at java.util.Hashtable.put(Hashtable.java:420)
- locked 0xf9dbaa58 (a java.util.Hashtable)
at org.apache.hadoop.ipc.Client$Connection.addCall(Client.java:225)
- locked 0xf9dba860 (a 
org.apache.hadoop.ipc.Client$Connection)
at org.apache.hadoop.ipc.Client$Connection.access$1600(Client.java:176)
at org.apache.hadoop.ipc.Client.getConnection(Client.java:854)
at org.apache.hadoop.ipc.Client.call(Client.java:720)
at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:220)
at org.apache.hadoop.mapred.$Proxy0.ping(Unknown Source)
at org.apache.hadoop.mapred.Task$TaskReporter.run(Task.java:548)
at java.lang.Thread.run(Thread.java:662)

Thread for syncLogs daemon prio=10 tid=0x2aaab02e9800 nid=0x4c90
runnable [0x40714000]
   java.lang.Thread.State: RUNNABLE
at java.util.Arrays.copyOf(Arrays.java:2882)
at 
java.lang.AbstractStringBuilder.expandCapacity(AbstractStringBuilder.java:100)
at 
java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:390)
at java.lang.StringBuilder.append(StringBuilder.java:119)
at java.io.UnixFileSystem.resolve(UnixFileSystem.java:93)
at java.io.File.init(File.java:312)
at org.apache.hadoop.mapred.TaskLog.getTaskLogFile(TaskLog.java:72)
at org.apache.hadoop.mapred.TaskLog.writeToIndexFile(TaskLog.java:180)
at org.apache.hadoop.mapred.TaskLog.syncLogs(TaskLog.java:230)
- locked 0xeea92fc0 (a java.lang.Class for
org.apache.hadoop.mapred.TaskLog)
at org.apache.hadoop.mapred.Child$2.run(Child.java:89)

Low Memory Detector daemon prio=10 tid=0x2aaab0001800 nid=0x4c86
runnable [0x]
   java.lang.Thread.State: RUNNABLE

CompilerThread1 daemon prio=10 tid=0x4cb4e800 nid=0x4c85
waiting on condition [0x]
   java.lang.Thread.State: RUNNABLE

CompilerThread0 daemon prio=10 tid=0x4cb4b000 nid=0x4c84
waiting on condition 

Re: hbck -fix

2011-07-04 Thread Stack
Thank you Wayne.  I'll dig in Weds (I'm not by a computer till then).
St.Ack

On Sun, Jul 3, 2011 at 9:56 AM, Wayne wav...@gmail.com wrote:
 I have uploaded the logs. I do not have a snapshot of the .META. table in
 the messed up state. The root partition ran out of space at 2:30 am. Below
 are links to the various logs. It appears everything but the data nodes went
 south. The data nodes kept repeating the same errors shown in the log below.

 Master log

 http://pastebin.com/WmBAC0Xm

 Namenode log

 http://pastebin.com/tjRqfCaC

 Node 2 region server log

 http://pastebin.com/M3EH02bP

 Node 2 data node log

 http://pastebin.com/XKgUAMTK

 Thanks.


 On Sun, Jul 3, 2011 at 12:40 AM, Stack st...@duboce.net wrote:

 You have a snapshot of the state of .META. at time you noticed it
 messed up?  And the master log from around the time of the startup
 post-fs-fillup?
 St.Ack

 On Sat, Jul 2, 2011 at 7:27 PM, Wayne wav...@gmail.com wrote:
  Like most problems we brought it on ourselves. To me the bigger issue is
 how
  to get out. Since region definitions are the core of what hbase does, it
  would be great to have a bullet proof recovery process that we can invoke
 to
  get us out. Bugs and human error will bring on problems and nothing will
  ever change that, but not having tools to help recover out of the hole is
  where I think it is lacking. HDFS is very stable. The hbase .META. table
  (and -ROOT-?) are the core how HBase manages things. If this gets out of
  whack all is lost. I think it would be great to have automatic backup of
 the
  meta table and the ability to recover everything based on the HDFS data
 out
  there and the backup. Something like a recovery mode that goes through
 and
  sees what is out there and rebuilds the meta based on it. With corrupted
  data and lost regions etc. etc. like any relational database there should
 be
  one or more recovery modes that goes through everything and rebuilds it
  consistently. Data may be lost but at least the cluster will be left in a
  100% consistent/clean state. Manual editing of .META. is not something
  anyone should do (especially me). It is prone to human error...it should
 be
  easy to have well tested recover tools that can do the hard work for us.
 
  Below is an attempt at the play by play in case it helps. It all started
  with the root partition of the namenode/hmaster filling up due to a table
  export.
 
  When I restarted hadoop this error was in the namenode log;
  java.io.IOException: Incorrect data format. logVersion is -18 but
  writables.length is 0
 
  So I found this
 https://groups.google.com/a/cloudera.org/group/cdh-user/browse_thread/thread/e35ee876da1a3bbc
 ,
  which mentioned editing the namenode log files after verifying our
 namenode
  log files seem to have the same symptom. So I copied each namenode name
  file to root's home directory and followed their advice.
  That allowed the namenode to start, but then HDFS wouldn't come up. It
 kept
  hanging in safe-mode with the repeated error;
  The ratio of reported blocks 0.9925 has not reached the threshold
 0.9990.
  Safe mode will be turned off automatically.
  So I turned safe-mode off with hadoop dfsadmin -safemode leave and I
   tried to run hadoop fsck a few times and it still showed HDFS as
   corrupt, so I did hadoop fsck -move and this is the last part of the
  output;
 
 Status:
  CORRUPT
   Total size: 1423140871890 B (Total open files size: 668770828 B)
   Total dirs: 3172
   Total files: 2584 (Files currently being written: 11)
   Total blocks (validated): 23095 (avg. block size 61621167 B) (Total open
  file blocks (not validated): 10)
   
   CORRUPT FILES: 65
   MISSING BLOCKS: 173
   MISSING SIZE: 8560948988 B
   CORRUPT BLOCKS: 173
   
   Minimally replicated blocks: 22922 (99.25092 %)
   Over-replicated blocks: 0 (0.0 %)
   Under-replicated blocks: 0 (0.0 %)
   Mis-replicated blocks: 0 (0.0 %)
   Default replication factor: 3
   Average block replication: 2.9775276
   Corrupt blocks: 173
   Missing replicas: 0 (0.0 %)
   Number of data-nodes: 10
   Number of racks: 1
 
  I ran it again and got this;
  .Status: HEALTHY
   Total size: 1414579922902 B (Total open files size: 668770828 B)
   Total dirs: 3272
   Total files: 2519 (Files currently being written: 11)
   Total blocks (validated): 22922 (avg. block size 61712761 B) (Total open
  file blocks (not validated): 10)
   Minimally replicated blocks: 22922 (100.0 %)
   Over-replicated blocks: 0 (0.0 %)
   Under-replicated blocks: 0 (0.0 %)
   Mis-replicated blocks: 0 (0.0 %)
   Default replication factor: 3
   Average block replication: 3.0
   Corrupt blocks: 0
   Missing replicas: 0 (0.0 %)
   Number of data-nodes: 10
   Number of racks: 1
 
 
  The filesystem under path '/' is HEALTHY
 
   So I started everything and it 

Re: M/R scan problem

2011-07-04 Thread Ted Yu
In the future, please provide the full dump using pastebin.com
and write a snippet of the log in the email.

Can you tell us what the following lines are about ?
HBaseURLsDaysAggregator.java:124
HBaseURLsDaysAggregator.java:131

How many mappers were launched ?

What value is used for hbase.zookeeper.property.maxClientCnxns ?
You may need to increase the value of the above setting.

On Mon, Jul 4, 2011 at 9:26 AM, Lior Schachter li...@infolinks.com wrote:

 I used kill -3, following the thread dump:

 ...


 On Mon, Jul 4, 2011 at 6:22 PM, Ted Yu yuzhih...@gmail.com wrote:

  I wasn't clear in my previous email.
  It was not answer to why map tasks got stuck.
  TableInputFormatBase.getSplits() is being called already.
 
  Can you try getting jstack of one of the map tasks before task tracker
  kills
  it ?
 
  Thanks
 
  On Mon, Jul 4, 2011 at 8:15 AM, Lior Schachter li...@infolinks.com
  wrote:
 
   1. Currently every map gets one region. So I don't understand what
   difference will it make using the splits.
   2. How should I use the TableInputFormatBase.getSplits() ? Could not
 find
   examples for that.
  
   Thanks,
   Lior
  
  
   On Mon, Jul 4, 2011 at 5:55 PM, Ted Yu yuzhih...@gmail.com wrote:
  
For #2, see TableInputFormatBase.getSplits():
  * Calculates the splits that will serve as input for the map tasks.
  The
  * number of splits matches the number of regions in a table.
   
   
On Mon, Jul 4, 2011 at 7:37 AM, Lior Schachter li...@infolinks.com
wrote:
   
 1. yes - I configure my job using this line:
 TableMapReduceUtil.initTableMapperJob(HBaseConsts.URLS_TABLE_NAME,
   scan,
 ScanMapper.class, Text.class, MapWritable.class, job)

 which internally uses TableInputFormat.class

 2. One split per region ? What do you mean ? How do I do that ?

 3. hbase version 0.90.2

 4. no exceptions. the logs are very clean.



 On Mon, Jul 4, 2011 at 5:22 PM, Ted Yu yuzhih...@gmail.com
 wrote:

  Do you use TableInputFormat ?
  To scan large number of rows, it would be better to produce one
  Split
per
  region.
 
  What HBase version do you use ?
  Do you find any exception in master / region server logs around
 the
 moment
  of timeout ?
 
  Cheers
 
  On Mon, Jul 4, 2011 at 4:48 AM, Lior Schachter 
  li...@infolinks.com
  wrote:
 
   Hi all,
   I'm running a scan using the M/R framework.
   My table contains hundreds of millions of rows and I'm scanning
   using
   start/stop key about 50 million rows.
  
   The problem is that some map tasks get stuck and the task
 manager
kills
   these maps after 600 seconds. When retrying the task everything
   works
  fine
   (sometimes).
  
   To verify that the problem is in hbase (and not in the map
 code)
  I
  removed
   all the code from my map function, so it looks like this:
   public void map(ImmutableBytesWritable key, Result value,
 Context
  context)
   throws IOException, InterruptedException {
   }
  
   Also, when the map got stuck on a region, I tried to scan this
   region
   (using
   simple scan from a Java main) and it worked fine.
  
   Any ideas ?
  
   Thanks,
   Lior
  
 

   
  
 



Re: M/R scan problem

2011-07-04 Thread Lior Schachter
1. HBaseURLsDaysAggregator.java:124, HBaseURLsDaysAggregator.java:131 : are
not important since even when I removed all my map code the tasks got stuck
(but the thread dumps were generated after I revived the code). If you think
it's important I'll remove the map code again and re-generate the thread
dumps...

2. 82 maps were launched but only 36 ran simultaneously.

3. hbase.zookeeper.property.maxClientCnxns = 300. Should I increase it ?

Thanks,
Lior


On Mon, Jul 4, 2011 at 7:33 PM, Ted Yu yuzhih...@gmail.com wrote:

 In the future, provide full dump using pastebin.com
 Write snippet of log in email.

 Can you tell us what the following lines are about ?
 HBaseURLsDaysAggregator.java:124
 HBaseURLsDaysAggregator.java:131

 How many mappers were launched ?

 What value is used for hbase.zookeeper.property.maxClientCnxns ?
 You may need to increase the value for above setting.

 On Mon, Jul 4, 2011 at 9:26 AM, Lior Schachter li...@infolinks.com
 wrote:

  I used kill -3, following the thread dump:
 
  ...
 
 
  On Mon, Jul 4, 2011 at 6:22 PM, Ted Yu yuzhih...@gmail.com wrote:
 
   I wasn't clear in my previous email.
   It was not answer to why map tasks got stuck.
   TableInputFormatBase.getSplits() is being called already.
  
   Can you try getting jstack of one of the map tasks before task tracker
   kills
   it ?
  
   Thanks
  
   On Mon, Jul 4, 2011 at 8:15 AM, Lior Schachter li...@infolinks.com
   wrote:
  
1. Currently every map gets one region. So I don't understand what
difference will it make using the splits.
2. How should I use the TableInputFormatBase.getSplits() ? Could not
  find
examples for that.
   
Thanks,
Lior
   
   
On Mon, Jul 4, 2011 at 5:55 PM, Ted Yu yuzhih...@gmail.com wrote:
   
 For #2, see TableInputFormatBase.getSplits():
   * Calculates the splits that will serve as input for the map
 tasks.
   The
   * number of splits matches the number of regions in a table.


 On Mon, Jul 4, 2011 at 7:37 AM, Lior Schachter 
 li...@infolinks.com
 wrote:

  1. yes - I configure my job using this line:
 
 TableMapReduceUtil.initTableMapperJob(HBaseConsts.URLS_TABLE_NAME,
scan,
  ScanMapper.class, Text.class, MapWritable.class, job)
 
  which internally uses TableInputFormat.class
 
  2. One split per region ? What do you mean ? How do I do that ?
 
  3. hbase version 0.90.2
 
  4. no exceptions. the logs are very clean.
 
 
 
  On Mon, Jul 4, 2011 at 5:22 PM, Ted Yu yuzhih...@gmail.com
  wrote:
 
   Do you use TableInputFormat ?
   To scan large number of rows, it would be better to produce one
   Split
 per
   region.
  
   What HBase version do you use ?
   Do you find any exception in master / region server logs around
  the
  moment
   of timeout ?
  
   Cheers
  
   On Mon, Jul 4, 2011 at 4:48 AM, Lior Schachter 
   li...@infolinks.com
   wrote:
  
Hi all,
I'm running a scan using the M/R framework.
My table contains hundreds of millions of rows and I'm
 scanning
using
start/stop key about 50 million rows.
   
The problem is that some map tasks get stuck and the task
  manager
 kills
these maps after 600 seconds. When retrying the task
 everything
works
   fine
(sometimes).
   
To verify that the problem is in hbase (and not in the map
  code)
   I
   removed
all the code from my map function, so it looks like this:
public void map(ImmutableBytesWritable key, Result value,
  Context
   context)
throws IOException, InterruptedException {
}
   
Also, when the map got stuck on a region, I tried to scan
 this
region
(using
simple scan from a Java main) and it worked fine.
   
Any ideas ?
   
Thanks,
Lior
   
  
 

   
  
 



Re: client-side caching

2011-07-04 Thread Ted Yu
See HBASE-4018

On Mon, Jul 4, 2011 at 7:33 AM, Claudio Martella claudio.marte...@tis.bz.it
 wrote:

 Hello list,

 i'm using hbase 0.90.3 on a 5 nodes cluster. I'm using a table as a
 string-long map. As I'm using this map a lot, I was thinking about
 installing memcache on the client side, as to avoid flooding hbase for
 the same value over and over.

 What is the best practice in these situations? some client-side caching
 already in hbase?

 Best,

 Claudio

 --
 Claudio Martella
 Digital Technologies
 Unit Research  Development - Analyst

 TIS innovation park
 Via Siemens 19 | Siemensstr. 19
 39100 Bolzano | 39100 Bozen
 Tel. +39 0471 068 123
 Fax  +39 0471 068 129
 claudio.marte...@tis.bz.it http://www.tis.bz.it








Re: hbck -fix

2011-07-04 Thread Stack
On Sun, Jul 3, 2011 at 10:12 AM, Wayne wav...@gmail.com wrote:
 HBase needs to evolve a little more before organizations
 like ours can just use it without having to become experts.

I'd agree with this.  In its current state, at least a part-time,
seasoned operations engineer (per Andrew's description) is necessary
for a substantial production deploy.  I don't think that's an onerous
expectation for a critical piece of infrastructure.  It'd certainly
broaden our appeal though if we could get into the MySQL calibre of
ease-of-use.

That said, the issue you ran into where an 'incident' made it so a
'smart' fellow was unable to reconstitute his store needs addressing.
We'll work on this.

St.Ack


 I have to say the community behind HBase is fantastic and goes above and
 beyond to help greenies like ourselves be successful. With just a little
 more polish around the edges I think it can and will really
  become successful for a much wider audience. Thanks for everyone's help.


 On Sun, Jul 3, 2011 at 4:08 AM, Andrew Purtell apurt...@apache.org wrote:

 I shorthanded this a bit:

  Certainly a seasoned operations engineer would be a good investment for
 anyone.


 Let's try instead:

 Certainly a seasoned operations engineer [with Java experience] would be a
 good investment for anyone [running Hadoop based systems].

 I'm not sure what I wrote earlier adequately conveyed the thought.


   - Andy




  From: Andrew Purtell apurt...@apache.org
  To: user@hbase.apache.org user@hbase.apache.org
  Cc:
  Sent: Sunday, July 3, 2011 12:39 AM
  Subject: Re: hbck -fix
 
  Wayne,
 
  Did you by chance have your NameNode configured to write the edit log to
 only
  one disk, and in this case only the root volume of the NameNode host? As
 I'm
  sure you are now aware, the NameNode's edit log was corrupted, at least
 the
  tail of it anyway, when the volume upon which it was being written was
 filled by
  an errant process. The HDFS NameNode has a special critical role and it
 really
  must be treated with the utmost care. It can and should be configured to
 write
  the fsimage and edit log to multiple local dedicated disks. And, user
 processes
  should never run on it.
 
 
   Hope has long since flown out the window. I just changed my opinion of
 what
   it takes to manage hbase. A Java engineer is required on staff.
 
  Perhaps.
 
  Certainly a seasoned operations engineer would be a good investment for
 anyone.
 
   Having
   RF=3 in HDFS offers no insurance against HBase losing its shirt and
 having
   .META. getting corrupted.
 
  This is a valid point. If HDFS loses track of blocks containing META
 table data
  due to fsimage corruption on the NameNode, having those blocks on 3
 DataNodes is
  of no use.
 
 
  I've done exercises in the past like delete META on disk and recreate it
  with the earlier set of utilities (add_table.rb). This always worked for
  me when I've tried it.
 
 
  Results from torture tests that HBase was subjected to in the timeframe
 leading
  up to 0.90 also resulted in better handling of .META. table related
 errors. They
  are fortunately demonstrably now rare.
 
 
  Clearly however there is room for further improvement here.
  I will work on https://issues.apache.org/jira/browse/HBASE-4058 and
 hopefully
  produce a unit test that fully exercises the ability of HBCK to
 reconstitute
  META and gives
  reliable results that can be incorporated into the test suite. My concern
 here
  is getting repeatable results demonstrating HBCK weaknesses will be
 challenging.
 
 
  Best regards,
 
 
         - Andy
 
  Problems worthy of attack prove their worth by hitting back. - Piet Hein
 (via
  Tom White)
 
 
  - Original Message -
   From: Wayne wav...@gmail.com
   To: user@hbase.apache.org
   Cc:
   Sent: Saturday, July 2, 2011 9:55 AM
   Subject: Re: hbck -fix
 
   It just returns a ton of errors (import: command not found). Our
 cluster is
   hosed anyway. I am waiting to get it completely re-installed from
 scratch.
   Hope has long since flown out the window. I just changed my opinion of
 what
   it takes to manage hbase. A Java engineer is required on staff. I also
   realized now a backup strategy is more important than for a RDBMS.
 Having
   RF=3 in HDFS offers no insurance against hbase lossing its shirt and
 having
   .META. getting corrupted. I think I just found the achilles heel.
 
 
   On Sat, Jul 2, 2011 at 12:40 PM, Ted Yu yuzhih...@gmail.com wrote:
 
    Have you tried running check_meta.rb with --fix ?
 
    On Sat, Jul 2, 2011 at 9:19 AM, Wayne wav...@gmail.com wrote:
 
     We are running 0.90.3. We were testing the table export not
  realizing
   the
     data goes to the root drive and not HDFS. The export filled the
   master's
     root partition. The logger had issues and HDFS got corrupted
     (java.io.IOException:
     Incorrect data format. logVersion is -18 but writables.length is
   0). We
     had
     to run hadoop fsck -move to fix the corrupted hdfs 

Re: M/R scan problem

2011-07-04 Thread Ted Yu
The reason I asked about HBaseURLsDaysAggregator.java was that I see no
HBase (client) code in the call stack.
I have little clue about the problem you experienced.

There may be more than one connection to zookeeper from one map task.
So it doesn't hurt if you increase hbase.zookeeper.property.maxClientCnxns

Cheers

On Mon, Jul 4, 2011 at 9:47 AM, Lior Schachter li...@infolinks.com wrote:

 1. HBaseURLsDaysAggregator.java:124, HBaseURLsDaysAggregator.java:131 : are
 not important since even when I removed all my map code the tasks got stuck
 (but the thread dumps were generated after I revived the code). If you
 think
 its important I'll remove the map code again and re-generate the thread
 dumps...

 2. 82 maps were launched but only 36 ran simultaneously.

 3. hbase.zookeeper.property.maxClientCnxns = 300. Should I increase it ?

 Thanks,
 Lior


 On Mon, Jul 4, 2011 at 7:33 PM, Ted Yu yuzhih...@gmail.com wrote:

  In the future, provide full dump using pastebin.com
  Write snippet of log in email.
 
  Can you tell us what the following lines are about ?
  HBaseURLsDaysAggregator.java:124
  HBaseURLsDaysAggregator.java:131
 
  How many mappers were launched ?
 
  What value is used for hbase.zookeeper.property.maxClientCnxns ?
  You may need to increase the value for above setting.
 
  On Mon, Jul 4, 2011 at 9:26 AM, Lior Schachter li...@infolinks.com
  wrote:
 
   I used kill -3, following the thread dump:
  
   ...
  
  
   On Mon, Jul 4, 2011 at 6:22 PM, Ted Yu yuzhih...@gmail.com wrote:
  
I wasn't clear in my previous email.
It was not answer to why map tasks got stuck.
TableInputFormatBase.getSplits() is being called already.
   
Can you try getting jstack of one of the map tasks before task
 tracker
kills
it ?
   
Thanks
   
On Mon, Jul 4, 2011 at 8:15 AM, Lior Schachter li...@infolinks.com
wrote:
   
 1. Currently every map gets one region. So I don't understand what
 difference will it make using the splits.
 2. How should I use the TableInputFormatBase.getSplits() ? Could
 not
   find
 examples for that.

 Thanks,
 Lior


 On Mon, Jul 4, 2011 at 5:55 PM, Ted Yu yuzhih...@gmail.com
 wrote:

  For #2, see TableInputFormatBase.getSplits():
* Calculates the splits that will serve as input for the map
  tasks.
The
* number of splits matches the number of regions in a table.
 
 
  On Mon, Jul 4, 2011 at 7:37 AM, Lior Schachter 
  li...@infolinks.com
  wrote:
 
   1. yes - I configure my job using this line:
  
  TableMapReduceUtil.initTableMapperJob(HBaseConsts.URLS_TABLE_NAME,
 scan,
   ScanMapper.class, Text.class, MapWritable.class, job)
  
   which internally uses TableInputFormat.class
  
   2. One split per region ? What do you mean ? How do I do that ?
  
   3. hbase version 0.90.2
  
   4. no exceptions. the logs are very clean.
  
  
  
   On Mon, Jul 4, 2011 at 5:22 PM, Ted Yu yuzhih...@gmail.com
   wrote:
  
Do you use TableInputFormat ?
To scan large number of rows, it would be better to produce
 one
Split
  per
region.
   
What HBase version do you use ?
Do you find any exception in master / region server logs
 around
   the
   moment
of timeout ?
   
Cheers
   
On Mon, Jul 4, 2011 at 4:48 AM, Lior Schachter 
li...@infolinks.com
wrote:
   
 Hi all,
 I'm running a scan using the M/R framework.
 My table contains hundreds of millions of rows and I'm
  scanning
 using
 start/stop key about 50 million rows.

 The problem is that some map tasks get stuck and the task
   manager
  kills
 these maps after 600 seconds. When retrying the task
  everything
 works
fine
 (sometimes).

 To verify that the problem is in hbase (and not in the map
   code)
I
removed
 all the code from my map function, so it looks like this:
 public void map(ImmutableBytesWritable key, Result value,
   Context
context)
 throws IOException, InterruptedException {
 }

 Also, when the map got stuck on a region, I tried to scan
  this
 region
 (using
 simple scan from a Java main) and it worked fine.

 Any ideas ?

 Thanks,
 Lior

   
  
 

   
  
 



Re: M/R scan problem

2011-07-04 Thread Ted Yu
From the master UI, click 'zk dump'
:60010/zk.jsp would show you the active connections. See if the count
reaches 300 when map tasks run.

On Mon, Jul 4, 2011 at 10:12 AM, Ted Yu yuzhih...@gmail.com wrote:

 The reason I asked about HBaseURLsDaysAggregator.java was that I see no
 HBase (client) code in call stack.
 I have little clue for the problem you experienced.

 There may be more than one connection to zookeeper from one map task.
 So it doesn't hurt if you increase hbase.zookeeper.property.maxClientCnxns

 Cheers


 On Mon, Jul 4, 2011 at 9:47 AM, Lior Schachter li...@infolinks.comwrote:

 1. HBaseURLsDaysAggregator.java:124, HBaseURLsDaysAggregator.java:131 :
 are
 not important since even when I removed all my map code the tasks got
 stuck
 (but the thread dumps were generated after I revived the code). If you
 think
 its important I'll remove the map code again and re-generate the thread
 dumps...

 2. 82 maps were launched but only 36 ran simultaneously.

 3. hbase.zookeeper.property.maxClientCnxns = 300. Should I increase it ?

 Thanks,
 Lior


 On Mon, Jul 4, 2011 at 7:33 PM, Ted Yu yuzhih...@gmail.com wrote:

  In the future, provide full dump using pastebin.com
  Write snippet of log in email.
 
  Can you tell us what the following lines are about ?
  HBaseURLsDaysAggregator.java:124
  HBaseURLsDaysAggregator.java:131
 
  How many mappers were launched ?
 
  What value is used for hbase.zookeeper.property.maxClientCnxns ?
  You may need to increase the value for above setting.
 
  On Mon, Jul 4, 2011 at 9:26 AM, Lior Schachter li...@infolinks.com
  wrote:
 
   I used kill -3, following the thread dump:
  
   ...
  
  
   On Mon, Jul 4, 2011 at 6:22 PM, Ted Yu yuzhih...@gmail.com wrote:
  
I wasn't clear in my previous email.
It was not answer to why map tasks got stuck.
TableInputFormatBase.getSplits() is being called already.
   
Can you try getting jstack of one of the map tasks before task
 tracker
kills
it ?
   
Thanks
   
On Mon, Jul 4, 2011 at 8:15 AM, Lior Schachter li...@infolinks.com
 
wrote:
   
 1. Currently every map gets one region. So I don't understand what
 difference will it make using the splits.
 2. How should I use the TableInputFormatBase.getSplits() ? Could
 not
   find
 examples for that.

 Thanks,
 Lior


 On Mon, Jul 4, 2011 at 5:55 PM, Ted Yu yuzhih...@gmail.com
 wrote:

  For #2, see TableInputFormatBase.getSplits():
* Calculates the splits that will serve as input for the map
  tasks.
The
* number of splits matches the number of regions in a table.
 
 
  On Mon, Jul 4, 2011 at 7:37 AM, Lior Schachter 
  li...@infolinks.com
  wrote:
 
   1. yes - I configure my job using this line:
  
  TableMapReduceUtil.initTableMapperJob(HBaseConsts.URLS_TABLE_NAME,
 scan,
   ScanMapper.class, Text.class, MapWritable.class, job)
  
   which internally uses TableInputFormat.class
  
   2. One split per region ? What do you mean ? How do I do that
 ?
  
   3. hbase version 0.90.2
  
   4. no exceptions. the logs are very clean.
  
  
  
   On Mon, Jul 4, 2011 at 5:22 PM, Ted Yu yuzhih...@gmail.com
   wrote:
  
Do you use TableInputFormat ?
To scan large number of rows, it would be better to produce
 one
Split
  per
region.
   
What HBase version do you use ?
Do you find any exception in master / region server logs
 around
   the
   moment
of timeout ?
   
Cheers
   
On Mon, Jul 4, 2011 at 4:48 AM, Lior Schachter 
li...@infolinks.com
wrote:
   
 Hi all,
 I'm running a scan using the M/R framework.
 My table contains hundreds of millions of rows and I'm
  scanning
 using
 start/stop key about 50 million rows.

 The problem is that some map tasks get stuck and the task
   manager
  kills
 these maps after 600 seconds. When retrying the task
  everything
 works
fine
 (sometimes).

 To verify that the problem is in hbase (and not in the map
   code)
I
removed
 all the code from my map function, so it looks like this:
 public void map(ImmutableBytesWritable key, Result value,
   Context
context)
 throws IOException, InterruptedException {
 }

 Also, when the map got stuck on a region, I tried to scan
  this
 region
 (using
 simple scan from a Java main) and it worked fine.

 Any ideas ?

 Thanks,
 Lior

   
  
 

   
  
 





Re: M/R scan problem

2011-07-04 Thread Lior Schachter
I will increase the number of connections to 1000.

Thanks !

Lior




On Mon, Jul 4, 2011 at 8:12 PM, Ted Yu yuzhih...@gmail.com wrote:

 The reason I asked about HBaseURLsDaysAggregator.java was that I see no
 HBase (client) code in call stack.
 I have little clue for the problem you experienced.

 There may be more than one connection to zookeeper from one map task.
 So it doesn't hurt if you increase hbase.zookeeper.property.maxClientCnxns

 Cheers

 On Mon, Jul 4, 2011 at 9:47 AM, Lior Schachter li...@infolinks.com
 wrote:

  1. HBaseURLsDaysAggregator.java:124, HBaseURLsDaysAggregator.java:131 :
 are
  not important since even when I removed all my map code the tasks got
 stuck
  (but the thread dumps were generated after I revived the code). If you
  think
  its important I'll remove the map code again and re-generate the thread
  dumps...
 
  2. 82 maps were launched but only 36 ran simultaneously.
 
  3. hbase.zookeeper.property.maxClientCnxns = 300. Should I increase it ?
 
  Thanks,
  Lior
 
 
  On Mon, Jul 4, 2011 at 7:33 PM, Ted Yu yuzhih...@gmail.com wrote:
 
   In the future, provide full dump using pastebin.com
   Write snippet of log in email.
  
   Can you tell us what the following lines are about ?
   HBaseURLsDaysAggregator.java:124
   HBaseURLsDaysAggregator.java:131
  
   How many mappers were launched ?
  
   What value is used for hbase.zookeeper.property.maxClientCnxns ?
   You may need to increase the value for above setting.
  
   On Mon, Jul 4, 2011 at 9:26 AM, Lior Schachter li...@infolinks.com
   wrote:
  
I used kill -3, following the thread dump:
   
...
   
   
On Mon, Jul 4, 2011 at 6:22 PM, Ted Yu yuzhih...@gmail.com wrote:
   
 I wasn't clear in my previous email.
 It was not answer to why map tasks got stuck.
 TableInputFormatBase.getSplits() is being called already.

 Can you try getting jstack of one of the map tasks before task
  tracker
 kills
 it ?

 Thanks

 On Mon, Jul 4, 2011 at 8:15 AM, Lior Schachter 
 li...@infolinks.com
 wrote:

  1. Currently every map gets one region. So I don't understand
 what
  difference will it make using the splits.
  2. How should I use the TableInputFormatBase.getSplits() ? Could
  not
find
  examples for that.
 
  Thanks,
  Lior
 
 
  On Mon, Jul 4, 2011 at 5:55 PM, Ted Yu yuzhih...@gmail.com
  wrote:
 
   For #2, see TableInputFormatBase.getSplits():
 * Calculates the splits that will serve as input for the map
   tasks.
 The
 * number of splits matches the number of regions in a table.
  
  
   On Mon, Jul 4, 2011 at 7:37 AM, Lior Schachter 
   li...@infolinks.com
   wrote:
  
1. yes - I configure my job using this line:
   
   TableMapReduceUtil.initTableMapperJob(HBaseConsts.URLS_TABLE_NAME,
  scan,
ScanMapper.class, Text.class, MapWritable.class, job)
   
which internally uses TableInputFormat.class
   
2. One split per region ? What do you mean ? How do I do that
 ?
   
3. hbase version 0.90.2
   
4. no exceptions. the logs are very clean.
   
   
   
On Mon, Jul 4, 2011 at 5:22 PM, Ted Yu yuzhih...@gmail.com
wrote:
   
 Do you use TableInputFormat ?
 To scan large number of rows, it would be better to produce
  one
 Split
   per
 region.

 What HBase version do you use ?
 Do you find any exception in master / region server logs
  around
the
moment
 of timeout ?

 Cheers

 On Mon, Jul 4, 2011 at 4:48 AM, Lior Schachter 
 li...@infolinks.com
 wrote:

  Hi all,
  I'm running a scan using the M/R framework.
  My table contains hundreds of millions of rows and I'm
   scanning
  using
  start/stop key about 50 million rows.
 
  The problem is that some map tasks get stuck and the task
manager
   kills
  these maps after 600 seconds. When retrying the task
   everything
  works
 fine
  (sometimes).
 
  To verify that the problem is in hbase (and not in the
 map
code)
 I
 removed
  all the code from my map function, so it looks like this:
  public void map(ImmutableBytesWritable key, Result value,
Context
 context)
  throws IOException, InterruptedException {
  }
 
  Also, when the map got stuck on a region, I tried to scan
   this
  region
  (using
  simple scan from a Java main) and it worked fine.
 
  Any ideas ?
 
  Thanks,
  Lior
 

   
  
 

   
  
 



Re: hbck -fix

2011-07-04 Thread Stack
On Sun, Jul 3, 2011 at 12:39 AM, Andrew Purtell apurt...@apache.org wrote:
 I've done exercises in the past like delete META on disk and recreate it with 
 the earlier set of utilities (add_table.rb). This always worked for me when 
 I've tried it.


We need to update add_table.rb at least.  The onlining of regions was
done by the metascan, which no longer exists in 0.90.  Maybe a
disable/enable after an add_table.rb run would do, but it would probably
be better to revamp it and merge it with hbck?
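
For reference, the disable/enable cycle mentioned above looks like this from the
0.90 client API (the shell's disable and enable commands do the same); the table
name is a placeholder, and whether this actually suffices after an add_table.rb run
is exactly the open question here:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HBaseAdmin;

// Sketch only: bounce a table so the master re-runs assignment for its
// regions after .META. has been rebuilt (e.g. by add_table.rb).
public class BounceTable {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    HBaseAdmin admin = new HBaseAdmin(conf);
    String table = args.length > 0 ? args[0] : "mytable";  // placeholder name
    if (admin.isTableEnabled(table)) {
      admin.disableTable(table);  // unassign all regions
    }
    admin.enableTable(table);     // reassign the regions listed in .META.
  }
}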


 Results from torture tests that HBase was subjected to in the timeframe 
 leading up to 0.90 also resulted in better handling of .META. table related 
 errors. They are fortunately demonstrably now rare.


Agreed.


 My concern here is that getting repeatable results demonstrating HBCK weaknesses
 will be challenging.


Yes.  This is the tough one.  I was hoping Wayne had a snapshot of
.META. to help at least characterize the problem.

(This does sound like something our Dan Harvey ran into recently on a
0.20.x HBase.  Let me go back to him.  He might have some input
that will help here.)

St.Ack


Re: M/R scan problem

2011-07-04 Thread Ted Yu
Although connection count may not be the root cause, please read
http://zhihongyu.blogspot.com/2011/04/managing-connections-in-hbase-090-and.html
if you have time.
0.92.0 would do a much better job of managing connections.
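
The short version of that post, for readers who skip the link: the 0.90 client keys
connections off the Configuration instance, so every HTable built on top of its own
freshly created Configuration brings its own HConnection and ZooKeeper session. A
rough sketch of the pattern it recommends, with placeholder table names rather than
anything from this thread:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HConnectionManager;
import org.apache.hadoop.hbase.client.HTable;

public class SharedConnectionSketch {
  public static void main(String[] args) throws Exception {
    // One Configuration instance => one shared HConnection/ZooKeeper session.
    Configuration conf = HBaseConfiguration.create();
    HTable urls = new HTable(conf, "urls");    // placeholder table names
    HTable stats = new HTable(conf, "stats");
    try {
      // ... reads/writes against both tables ...
    } finally {
      urls.close();
      stats.close();
      // Release the connection keyed by this Configuration (0.90-era cleanup).
      HConnectionManager.deleteConnection(conf, true);
    }
  }
}

In a map/reduce task the same idea would apply: create the shared Configuration once
in setup() and release it in cleanup() rather than per record.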

On Mon, Jul 4, 2011 at 10:14 AM, Lior Schachter li...@infolinks.com wrote:

 I will increase the number of connections to 1000.

 Thanks !

 Lior




 On Mon, Jul 4, 2011 at 8:12 PM, Ted Yu yuzhih...@gmail.com wrote:

  The reason I asked about HBaseURLsDaysAggregator.java was that I see no
  HBase (client) code in call stack.
  I have little clue for the problem you experienced.
 
  There may be more than one connection to zookeeper from one map task.
  So it doesn't hurt if you increase
 hbase.zookeeper.property.maxClientCnxns
 
  Cheers
 
  On Mon, Jul 4, 2011 at 9:47 AM, Lior Schachter li...@infolinks.com
  wrote:
 
   1. HBaseURLsDaysAggregator.java:124, HBaseURLsDaysAggregator.java:131 :
  are
   not important since even when I removed all my map code the tasks got
  stuck
   (but the thread dumps were generated after I revived the code). If you
   think
   its important I'll remove the map code again and re-generate the thread
   dumps...
  
   2. 82 maps were launched but only 36 ran simultaneously.
  
   3. hbase.zookeeper.property.maxClientCnxns = 300. Should I increase it
 ?
  
   Thanks,
   Lior
  
  
   On Mon, Jul 4, 2011 at 7:33 PM, Ted Yu yuzhih...@gmail.com wrote:
  
In the future, provide full dump using pastebin.com
Write snippet of log in email.
   
Can you tell us what the following lines are about ?
HBaseURLsDaysAggregator.java:124
HBaseURLsDaysAggregator.java:131
   
How many mappers were launched ?
   
What value is used for hbase.zookeeper.property.maxClientCnxns ?
You may need to increase the value for above setting.
   
On Mon, Jul 4, 2011 at 9:26 AM, Lior Schachter li...@infolinks.com
wrote:
   
 I used kill -3, following the thread dump:

 ...


 On Mon, Jul 4, 2011 at 6:22 PM, Ted Yu yuzhih...@gmail.com
 wrote:

  I wasn't clear in my previous email.
  It was not answer to why map tasks got stuck.
  TableInputFormatBase.getSplits() is being called already.
 
  Can you try getting jstack of one of the map tasks before task
   tracker
  kills
  it ?
 
  Thanks
 
  On Mon, Jul 4, 2011 at 8:15 AM, Lior Schachter 
  li...@infolinks.com
  wrote:
 
   1. Currently every map gets one region. So I don't understand
  what
   difference will it make using the splits.
   2. How should I use the TableInputFormatBase.getSplits() ?
 Could
   not
 find
   examples for that.
  
   Thanks,
   Lior
  
  
   On Mon, Jul 4, 2011 at 5:55 PM, Ted Yu yuzhih...@gmail.com
   wrote:
  
For #2, see TableInputFormatBase.getSplits():
  * Calculates the splits that will serve as input for the
 map
tasks.
  The
  * number of splits matches the number of regions in a
 table.
   
   
On Mon, Jul 4, 2011 at 7:37 AM, Lior Schachter 
li...@infolinks.com
wrote:
   
 1. yes - I configure my job using this line:

TableMapReduceUtil.initTableMapperJob(HBaseConsts.URLS_TABLE_NAME,
   scan,
 ScanMapper.class, Text.class, MapWritable.class, job)

 which internally uses TableInputFormat.class

 2. One split per region ? What do you mean ? How do I do
 that
  ?

 3. hbase version 0.90.2

 4. no exceptions. the logs are very clean.



 On Mon, Jul 4, 2011 at 5:22 PM, Ted Yu 
 yuzhih...@gmail.com
 wrote:

  Do you use TableInputFormat ?
  To scan large number of rows, it would be better to
 produce
   one
  Split
per
  region.
 
  What HBase version do you use ?
  Do you find any exception in master / region server logs
   around
 the
 moment
  of timeout ?
 
  Cheers
 
  On Mon, Jul 4, 2011 at 4:48 AM, Lior Schachter 
  li...@infolinks.com
  wrote:
 
   Hi all,
   I'm running a scan using the M/R framework.
   My table contains hundreds of millions of rows and I'm
scanning
   using
   start/stop key about 50 million rows.
  
   The problem is that some map tasks get stuck and the
 task
 manager
kills
   these maps after 600 seconds. When retrying the task
everything
   works
  fine
   (sometimes).
  
   To verify that the problem is in hbase (and not in the
  map
 code)
  I
  removed
   all the code from my map function, so it looks like
 this:
   public void map(ImmutableBytesWritable key, Result
 value,
 Context
  context)
   throws 

Re: M/R scan problem

2011-07-04 Thread Michel Segel
Did a quick trim...

Sorry to jump in on the tail end of this...
Two things you may want to look at...

Are you timing out because you haven't updated your status within the task, or
are you taking 600 seconds to complete a single map() iteration?

You can test this by tracking how long you spend in each map iteration and
printing out the result if it is longer than two minutes...

Also try updating your status in each iteration by sending a unique status
update, such as the current system time...
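
A bare-bones way to act on both suggestions inside the mapper, using the
org.apache.hadoop.mapreduce API the thread is already on; the two-minute threshold
and the 1000-row status interval are arbitrary examples, not anything prescribed
here:

import java.io.IOException;

import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.TableMapper;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.hadoop.io.MapWritable;
import org.apache.hadoop.io.Text;

public class TimedScanMapper extends TableMapper<Text, MapWritable> {

  private long rows = 0;
  private long lastDone = 0;

  public void map(ImmutableBytesWritable key, Result value, Context context)
      throws IOException, InterruptedException {
    long now = System.currentTimeMillis();
    // A large gap here is time spent outside map(), i.e. in the framework
    // and the scanner fetching the next batch of rows.
    if (lastDone != 0 && now - lastDone > 2 * 60 * 1000L) {
      System.err.println("gap of " + (now - lastDone) + " ms before row "
          + Bytes.toStringBinary(key.get()));
    }

    // ... per-row work would go here ...

    rows++;
    if (rows % 1000 == 0) {
      // Unique status string so the TaskTracker sees fresh activity.
      context.setStatus("rows=" + rows + " at " + System.currentTimeMillis());
    }
    context.progress();  // explicit keep-alive on every call
    lastDone = System.currentTimeMillis();
  }
}

If the attempt still gets killed with this in place, the 600 seconds are being spent
between map() calls, i.e. in the scanner fetching the next batch, which is what the
gap message above would surface.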
...


Sent from a remote device. Please excuse any typos...

Mike Segel

On Jul 4, 2011, at 12:35 PM, Ted Yu yuzhih...@gmail.com wrote:

 Although connection count may not be the root cause, please read
 http://zhihongyu.blogspot.com/2011/04/managing-connections-in-hbase-090-and.html
 if you have time.
 0.92.0 would do a much better job of managing connections.
 
 On Mon, Jul 4, 2011 at 10:14 AM, Lior Schachter li...@infolinks.com wrote:
 


Re: Reg: The HLog that is created in region creation

2011-07-04 Thread Ted Yu
Not really used.
See HBASE-4010.



On Jul 4, 2011, at 8:55 PM, Ramkrishna S Vasudevan ramakrish...@huawei.com 
wrote:

 Hello
 
 
 
 Can anybody tell me what is the use of the HLog created per region when a
 region is created?
 
 
 
 Regards
 
 Ram
 
 
 
 