MR Jobs managing on Hadoop cluster
Hi, Recently we deployed a 20-node cluster in our organization. Shortly it will double (at least) and start to handle billions of rows. My question concerns the management options. I would like to let users (i.e. internal developers) submit, schedule and monitor their jobs. Of course, I can give them command-line access and have them submit jobs directly. The problem is that it's not a whole management system. In my vision I see one web page that offers the ability to submit a job (i.e. add a jar and set scheduling), monitor the jobs and even add alert options (we work with HPOV). In the current state there are Ganglia, the command line and, say, Oozie - but not in one handy place. Is there any app that does this? Any other ideas? Thanks, Ophir
Re: Possible issue when creating/deleting HBase table multiple times
Hello! Thank you for your responses. I'm working on 0.90.1-cdh3u1-SNAPSHOT. Yes, if I have the following scenario: 1. disable table 2. delete table 3. create table - if I don't put the guard not(HBaseAdmin.tableExists(hbTable)) before creating the table (at step 3), then it will end up in org.apache.hadoop.hbase.TableExistsException. Regards, Florin --- On Fri, 7/1/11, Ted Yu yuzhih...@gmail.com wrote: From: Ted Yu yuzhih...@gmail.com Subject: Re: Possible issue when creating/deleting HBase table multiple times To: cdh-...@cloudera.org Cc: user@hbase.apache.org Date: Friday, July 1, 2011, 6:10 PM Seems to be a CDH question. On Fri, Jul 1, 2011 at 3:07 PM, Stack st...@duboce.net wrote: And if you try to create it, it says it's already there? We have a job that does this over and over every few minutes and it's been running for months on end. I wonder what's different. You are on 0.90.3? St.Ack On Fri, Jul 1, 2011 at 12:59 AM, Florin P florinp...@yahoo.com wrote: Hello! I'm using HBase 0.90.1-cdh3u1-SNAPSHOT. Running the attached code (adapted after sujee at sujee.net), after a while I was getting the below exception. The main scenario is like this: 1. if the table does not exist, create it 2. populate the table with some data 3. flush the data 4. close the table 5. disable the table 6. drop the table 7. repeat steps 1-6 several times. After a while you'll get the mentioned error. Please help. 
Regards, Florin

org.apache.hadoop.hbase.TableNotFoundException: org.apache.hadoop.hbase.TableNotFoundException: use_case_drop
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
        at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:95)
        at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:79)
        at org.apache.hadoop.hbase.client.HBaseAdmin.disableTableAsync(HBaseAdmin.java:531)
        at org.apache.hadoop.hbase.client.HBaseAdmin.disableTable(HBaseAdmin.java:550)
        at org.apache.hadoop.hbase.client.HBaseAdmin.disableTable(HBaseAdmin.java:538)
        at com.sample.hbase.HBaseDropCreate.main(HBaseDropCreate.java:80)
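[Editor's note] A minimal sketch of the guarded drop/create cycle Florin describes, against the 0.90-era HBaseAdmin API. The table name is taken from the reported exception; the column family "cf" and the loop count are made up, and this of course needs a running cluster:

```java
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.client.HBaseAdmin;

public class GuardedDropCreate {
    public static void main(String[] args) throws Exception {
        HBaseAdmin admin = new HBaseAdmin(HBaseConfiguration.create());
        String table = "use_case_drop";  // name taken from the reported exception

        for (int i = 0; i < 10; i++) {
            // Guard every step on the table's current state, as Florin describes:
            if (admin.tableExists(table)) {
                if (admin.isTableEnabled(table)) {
                    admin.disableTable(table);   // step 1: disable
                }
                admin.deleteTable(table);        // step 2: delete
            }
            if (!admin.tableExists(table)) {     // step 3: guarded create
                HTableDescriptor desc = new HTableDescriptor(table);
                desc.addFamily(new HColumnDescriptor("cf"));  // hypothetical family
                admin.createTable(desc);
            }
            // ... populate, flush and close the table here ...
        }
    }
}
```

Without the tableExists guard before createTable, a lagging master can still report the old table and throw TableExistsException, which matches the behavior reported in this thread.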
Re: MR Jobs managing on Hadoop cluster
There is also Azkaban (http://sna-projects.com/azkaban/), which provides the scheduling and some historical statistics. Azkaban is much simpler than Oozie, but lacks some capabilities. On Sun, Jul 3, 2011 at 11:19 PM, Ophir Cohen oph...@gmail.com wrote: ...
Re: Errors after major compaction
Thanks for the understanding. Can you log a JIRA and put your ideas below in it? On Jul 4, 2011, at 12:42 AM, Eran Kutner e...@gigya.com wrote: Thanks for the explanation Ted. I will try to apply HBASE-3789 and hope for the best, but my understanding is that it doesn't really solve the problem; it only reduces the probability of it happening, at least in one particular scenario. I would hope for a more robust solution.

My concern is that the region allocation process seems to rely too much on timing considerations and doesn't seem to take enough measures to guarantee conflicts do not occur. I understand that in a distributed environment, when you don't get a timely response from a remote machine, you can't know for sure whether it did or did not receive the request; however, there are things that can be done to mitigate this and reduce the conflict time significantly. For example, when I run hbck it knows that some regions are multiply assigned; the master could do the same and try to resolve the conflict. Another approach would be to handle late responses: even if the response from the remote machine arrives after it was assumed to be dead, the master should have enough information to know it had created a conflict by assigning the region to another server. An even better solution, I think, is for the RS to periodically test that it is indeed the rightful owner of every region it holds, and relinquish control over the region if it's not.

Obviously a state where two RSs hold the same region is pathological and can lead to data loss, as demonstrated in my case. The system should be able to actively protect itself against such a scenario. It probably doesn't need saying, but there is really nothing worse for a data storage system than data loss. In my case the problem didn't happen in the initial phase but after disabling and enabling a table with about 12K regions. 
-eran

On Sun, Jul 3, 2011 at 23:49, Ted Yu yuzhih...@gmail.com wrote: Let me try to answer some of your questions. The two paragraphs below were written along my reasoning, which is in reverse order of the actual call sequence. For #4 below, the log indicates that the following was executed:

private void assign(final RegionState state, final boolean setOfflineInZK,
    final boolean forceNewPlan) {
  for (int i = 0; i < this.maximumAssignmentAttempts; i++) {
    if (setOfflineInZK && !setOfflineInZooKeeper(state)) return;

The above was due to the timeout which you noted in #2, which would have caused TimeoutMonitor.chore() to run this code (line 1787):

for (Map.Entry<HRegionInfo, Boolean> e : assigns.entrySet()) {
  assign(e.getKey(), false, e.getValue());
}

This means there is a lack of coordination between AssignmentManager.TimeoutMonitor and OpenedRegionHandler. The reason I mention HBASE-3789 is that it is marked as an Incompatible change and is in TRUNK already. The application of HBASE-3789 to the 0.90 branch would change the behavior (timing) of region assignment. I think it makes sense to evaluate the effect of HBASE-3789 in 0.90.4. BTW, were the incorrect region assignments observed for a table with multiple initial regions? If so, I have HBASE-4010 in TRUNK which speeds up initial region assignment by about 50%. Cheers

On Sun, Jul 3, 2011 at 12:02 PM, Eran Kutner e...@gigya.com wrote: Ted, So if I understand correctly, the theory is that because of the issue fixed in HBASE-3789 the master took too long to detect that the region was successfully opened by the first server, so it force-closed it and transitioned it to a second server. But there are a few things about this scenario I don't understand, probably because I don't know enough about the inner workings of the region transition process, and I would appreciate it if you can help me understand:
1. The RS opened the region at 16:37:49.
2. The master started handling the opened event at 16:39:54 - this delay can probably be explained by HBASE-3789.
3. At 16:39:54 the master log says: Opened region gs_raw_events,. on hadoop1-s05.farm-ny.gigya.com
4. Then at 16:40:00 the master log says: master:6-0x13004a31d7804c4 Creating (or updating) unassigned node for 584dac5cc70d8682f71c4675a843c309 with OFFLINE state - why did it decide to take the region offline after learning it was successfully opened?
5. Then it tries to reopen the region on hadoop1-s05, which indicates in its log that the open request failed because the region was already open - why didn't the master use that information to learn that the region was already open?
6. At 16:43:57 the master decides the region transition timed out and starts forcing the transition - HBASE-3789 again?
7. Now the master forces the transition of the region to hadoop1-s02, but there is no sign of that on hadoop1-s05 - why doesn't the old RS (hadoop1-s05) detect that it is no longer the region's owner and relinquish control of the
M/R scan problem
Hi all, I'm running a scan using the M/R framework. My table contains hundreds of millions of rows, and using a start/stop key I'm scanning about 50 million rows. The problem is that some map tasks get stuck, and the task tracker kills these maps after 600 seconds. When retrying the task everything works fine (sometimes). To verify that the problem is in HBase (and not in the map code) I removed all the code from my map function, so it looks like this:

public void map(ImmutableBytesWritable key, Result value, Context context)
    throws IOException, InterruptedException {
}

Also, when the map got stuck on a region, I tried to scan this region (using a simple scan from a Java main) and it worked fine. Any ideas? Thanks, Lior
Re: M/R scan problem
Do you use TableInputFormat? To scan a large number of rows, it would be better to produce one split per region. What HBase version do you use? Do you find any exceptions in the master / region server logs around the moment of the timeout? Cheers On Mon, Jul 4, 2011 at 4:48 AM, Lior Schachter li...@infolinks.com wrote: ...
client-side caching
Hello list, I'm using HBase 0.90.3 on a 5-node cluster. I'm using a table as a string-to-long map. As I'm using this map a lot, I was thinking about installing memcached on the client side, so as to avoid flooding HBase with requests for the same value over and over. What is the best practice in these situations? Is there some client-side caching already in HBase?

Best, Claudio

--
Claudio Martella
Digital Technologies Unit Research & Development - Analyst
TIS innovation park
Via Siemens 19 | Siemensstr. 19
39100 Bolzano | 39100 Bozen
Tel. +39 0471 068 123
Fax +39 0471 068 129
claudio.marte...@tis.bz.it http://www.tis.bz.it
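[Editor's note] Before reaching for memcached, a small in-process LRU cache in front of the table lookups may already absorb most of the repeated gets. A self-contained sketch in plain Java (in practice the caller would consult this map first and fall back to an HBase Get only on a miss; the size limit is illustrative):

```java
import java.util.LinkedHashMap;
import java.util.Map;

// A tiny fixed-size LRU map: a LinkedHashMap in access order evicts the
// least-recently-used entry once maxEntries is exceeded.
public class LruCache<K, V> extends LinkedHashMap<K, V> {
    private final int maxEntries;

    public LruCache(int maxEntries) {
        super(16, 0.75f, true);  // accessOrder = true -> get() refreshes recency
        this.maxEntries = maxEntries;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxEntries;
    }
}
```

A string-to-long client would wrap this with a null check: on a cache miss, do the HBase Get, put the result into the cache, and return it. Note this caches per JVM; a shared memcached tier is still the answer if many client processes ask for the same keys.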
Re: M/R scan problem
1. Yes - I configure my job using this line: TableMapReduceUtil.initTableMapperJob(HBaseConsts.URLS_TABLE_NAME, scan, ScanMapper.class, Text.class, MapWritable.class, job), which internally uses TableInputFormat.class.
2. One split per region? What do you mean? How do I do that?
3. HBase version 0.90.2.
4. No exceptions - the logs are very clean.
On Mon, Jul 4, 2011 at 5:22 PM, Ted Yu yuzhih...@gmail.com wrote: ...
Re: M/R scan problem
For #2, see TableInputFormatBase.getSplits():

 * Calculates the splits that will serve as input for the map tasks. The
 * number of splits matches the number of regions in a table.

On Mon, Jul 4, 2011 at 7:37 AM, Lior Schachter li...@infolinks.com wrote: ...
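[Editor's note] For reference, the split-per-region behavior comes for free when the job is wired up through TableMapReduceUtil, as Lior's is. A sketch against the 0.90 API (the table name, key range and tuning values are illustrative; this needs a cluster to run):

```java
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil;
import org.apache.hadoop.hbase.mapreduce.TableMapper;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.hadoop.io.MapWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;

public class ScanJobSetup {

    // Empty mapper, as in Lior's isolation test.
    static class ScanMapper extends TableMapper<Text, MapWritable> {
        @Override
        protected void map(ImmutableBytesWritable key, Result value, Context context)
                throws IOException, InterruptedException {
        }
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        Job job = new Job(conf, "url-scan");

        Scan scan = new Scan(Bytes.toBytes("start"), Bytes.toBytes("stop"));
        scan.setCaching(500);        // rows fetched per RPC; tune for row width
        scan.setCacheBlocks(false);  // don't churn the block cache with a full scan

        // initTableMapperJob sets TableInputFormat, whose getSplits() returns
        // one split -- hence one map task -- per region of the table.
        TableMapReduceUtil.initTableMapperJob("urls", scan,
            ScanMapper.class, Text.class, MapWritable.class, job);
        job.waitForCompletion(true);
    }
}
```

A too-small scanner caching value (or a very large one on wide rows) is a common cause of long gaps between RPCs on big scans, which is worth checking alongside the jstack suggested later in this thread.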
Re: Errors after major compaction
Sure, I'll do that.

-eran

On Mon, Jul 4, 2011 at 12:30, Ted Yu yuzhih...@gmail.com wrote: ...
Re: M/R scan problem
1. Currently every map gets one region, so I don't understand what difference using the splits will make.
2. How should I use TableInputFormatBase.getSplits()? I could not find examples for that.
Thanks, Lior
On Mon, Jul 4, 2011 at 5:55 PM, Ted Yu yuzhih...@gmail.com wrote: ...
Re: M/R scan problem
I wasn't clear in my previous email - it was not an answer to why the map tasks got stuck. TableInputFormatBase.getSplits() is being called already. Can you try getting a jstack of one of the map tasks before the task tracker kills it? Thanks On Mon, Jul 4, 2011 at 8:15 AM, Lior Schachter li...@infolinks.com wrote: ...
Re: M/R scan problem
I used kill -3; the thread dump follows:

Full thread dump Java HotSpot(TM) 64-Bit Server VM (19.1-b02 mixed mode):

"IPC Client (47) connection to /127.0.0.1:59759 from hadoop" daemon prio=10 tid=0x2aaab05ca800 nid=0x4eaf in Object.wait() [0x403c1000]
   java.lang.Thread.State: TIMED_WAITING (on object monitor)
        at java.lang.Object.wait(Native Method)
        - waiting on 0xf9dba860 (a org.apache.hadoop.ipc.Client$Connection)
        at org.apache.hadoop.ipc.Client$Connection.waitForWork(Client.java:403)
        - locked 0xf9dba860 (a org.apache.hadoop.ipc.Client$Connection)
        at org.apache.hadoop.ipc.Client$Connection.run(Client.java:445)

"SpillThread" daemon prio=10 tid=0x2aaab0585000 nid=0x4c99 waiting on condition [0x404c2000]
   java.lang.Thread.State: WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for 0xf9af0c38 (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
        at java.util.concurrent.locks.LockSupport.park(LockSupport.java:158)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1987)
        at org.apache.hadoop.mapred.MapTask$MapOutputBuffer$SpillThread.run(MapTask.java:1169)

"main-EventThread" daemon prio=10 tid=0x2aaab035d000 nid=0x4c95 waiting on condition [0x41207000]
   java.lang.Thread.State: WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for 0xf9af5f58 (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
        at java.util.concurrent.locks.LockSupport.park(LockSupport.java:158)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1987)
        at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:399)
        at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:502)

"main-SendThread(hadoop09.infolinks.local:2181)" daemon prio=10 tid=0x2aaab035c000 nid=0x4c94 runnable [0x40815000]
   java.lang.Thread.State: RUNNABLE
        at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
        at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:210)
        at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:65)
        at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:69)
        - locked 0xf9af61a8 (a sun.nio.ch.Util$2)
        - locked 0xf9af61b8 (a java.util.Collections$UnmodifiableSet)
        - locked 0xf9af6160 (a sun.nio.ch.EPollSelectorImpl)
        at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:80)
        at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1107)

"communication thread" daemon prio=10 tid=0x4d02 nid=0x4c93 waiting on condition [0x42497000]
   java.lang.Thread.State: RUNNABLE
        at java.util.Hashtable.put(Hashtable.java:420)
        - locked 0xf9dbaa58 (a java.util.Hashtable)
        at org.apache.hadoop.ipc.Client$Connection.addCall(Client.java:225)
        - locked 0xf9dba860 (a org.apache.hadoop.ipc.Client$Connection)
        at org.apache.hadoop.ipc.Client$Connection.access$1600(Client.java:176)
        at org.apache.hadoop.ipc.Client.getConnection(Client.java:854)
        at org.apache.hadoop.ipc.Client.call(Client.java:720)
        at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:220)
        at org.apache.hadoop.mapred.$Proxy0.ping(Unknown Source)
        at org.apache.hadoop.mapred.Task$TaskReporter.run(Task.java:548)
        at java.lang.Thread.run(Thread.java:662)

"Thread for syncLogs" daemon prio=10 tid=0x2aaab02e9800 nid=0x4c90 runnable [0x40714000]
   java.lang.Thread.State: RUNNABLE
        at java.util.Arrays.copyOf(Arrays.java:2882)
        at java.lang.AbstractStringBuilder.expandCapacity(AbstractStringBuilder.java:100)
        at java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:390)
        at java.lang.StringBuilder.append(StringBuilder.java:119)
        at java.io.UnixFileSystem.resolve(UnixFileSystem.java:93)
        at java.io.File.<init>(File.java:312)
        at org.apache.hadoop.mapred.TaskLog.getTaskLogFile(TaskLog.java:72)
        at org.apache.hadoop.mapred.TaskLog.writeToIndexFile(TaskLog.java:180)
        at org.apache.hadoop.mapred.TaskLog.syncLogs(TaskLog.java:230)
        - locked 0xeea92fc0 (a java.lang.Class for org.apache.hadoop.mapred.TaskLog)
        at org.apache.hadoop.mapred.Child$2.run(Child.java:89)

"Low Memory Detector" daemon prio=10 tid=0x2aaab0001800 nid=0x4c86 runnable [0x]
   java.lang.Thread.State: RUNNABLE

"CompilerThread1" daemon prio=10 tid=0x4cb4e800 nid=0x4c85 waiting on condition [0x]
   java.lang.Thread.State: RUNNABLE

"CompilerThread0" daemon prio=10 tid=0x4cb4b000 nid=0x4c84 waiting on condition
Re: hbck -fix
Thank you Wayne. I'll dig in Weds (I'm not by a computer till then). St.Ack

On Sun, Jul 3, 2011 at 9:56 AM, Wayne wav...@gmail.com wrote: I have uploaded the logs. I do not have a snapshot of the .META. table in the messed-up state. The root partition ran out of space at 2:30 am. Below are links to the various logs. It appears everything but the data nodes went south. The data nodes kept repeating the same errors shown in the log below.

Master log: http://pastebin.com/WmBAC0Xm
Namenode log: http://pastebin.com/tjRqfCaC
Node 2 region server log: http://pastebin.com/M3EH02bP
Node 2 data node log: http://pastebin.com/XKgUAMTK

Thanks.

On Sun, Jul 3, 2011 at 12:40 AM, Stack st...@duboce.net wrote: You have a snapshot of the state of .META. at the time you noticed it messed up? And the master log from around the time of the startup post-fs-fillup? St.Ack

On Sat, Jul 2, 2011 at 7:27 PM, Wayne wav...@gmail.com wrote: Like most problems, we brought it on ourselves. To me the bigger issue is how to get out. Since region definitions are the core of what HBase does, it would be great to have a bullet-proof recovery process that we can invoke to get us out. Bugs and human error will bring on problems and nothing will ever change that, but not having tools to help recover out of the hole is where I think it is lacking. HDFS is very stable. The HBase .META. table (and -ROOT-?) are the core of how HBase manages things. If this gets out of whack, all is lost. I think it would be great to have an automatic backup of the meta table and the ability to recover everything based on the HDFS data out there and the backup. Something like a recovery mode that goes through and sees what is out there and rebuilds the meta based on it. With corrupted data and lost regions etc., like any relational database, there should be one or more recovery modes that go through everything and rebuild it consistently. 
Data may be lost, but at least the cluster will be left in a 100% consistent/clean state. Manual editing of .META. is not something anyone should do (especially me). It is prone to human error... it should be easy to have well-tested recovery tools that can do the hard work for us.

Below is an attempt at the play-by-play in case it helps. It all started with the root partition of the namenode/hmaster filling up due to a table export. When I restarted hadoop this error was in the namenode log:

java.io.IOException: Incorrect data format. logVersion is -18 but writables.length is 0

So I found this https://groups.google.com/a/cloudera.org/group/cdh-user/browse_thread/thread/e35ee876da1a3bbc , which mentioned editing the namenode log files, after verifying our namenode log files seem to have the same symptom. So I copied each namenode name file to root's home directory and followed their advice. That allowed the namenode to start, but then HDFS wouldn't come up. It kept hanging in safe mode with the repeated error:

The ratio of reported blocks 0.9925 has not reached the threshold 0.9990. Safe mode will be turned off automatically.

So I turned safe mode off with:

hadoop dfsadmin -safemode leave

and I tried to run hadoop fsck a few times and it still showed HDFS as corrupt, so I did hadoop fsck -move and this is the last part of the output:

Status: CORRUPT
 Total size: 1423140871890 B (Total open files size: 668770828 B)
 Total dirs: 3172
 Total files: 2584 (Files currently being written: 11)
 Total blocks (validated): 23095 (avg. block size 61621167 B) (Total open file blocks (not validated): 10)
  CORRUPT FILES: 65
  MISSING BLOCKS: 173
  MISSING SIZE: 8560948988 B
  CORRUPT BLOCKS: 173
 Minimally replicated blocks: 22922 (99.25092 %)
 Over-replicated blocks: 0 (0.0 %)
 Under-replicated blocks: 0 (0.0 %)
 Mis-replicated blocks: 0 (0.0 %)
 Default replication factor: 3
 Average block replication: 2.9775276
 Corrupt blocks: 173
 Missing replicas: 0 (0.0 %)
 Number of data-nodes: 10
 Number of racks: 1

I ran it again and got this:

.Status: HEALTHY
 Total size: 1414579922902 B (Total open files size: 668770828 B)
 Total dirs: 3272
 Total files: 2519 (Files currently being written: 11)
 Total blocks (validated): 22922 (avg. block size 61712761 B) (Total open file blocks (not validated): 10)
 Minimally replicated blocks: 22922 (100.0 %)
 Over-replicated blocks: 0 (0.0 %)
 Under-replicated blocks: 0 (0.0 %)
 Mis-replicated blocks: 0 (0.0 %)
 Default replication factor: 3
 Average block replication: 3.0
 Corrupt blocks: 0
 Missing replicas: 0 (0.0 %)
 Number of data-nodes: 10
 Number of racks: 1
The filesystem under path '/' is HEALTHY

So I started everything and it
Re: M/R scan problem
In the future, please provide the full dump via pastebin.com and write a snippet of the log in the email. Can you tell us what the following lines are about?

HBaseURLsDaysAggregator.java:124
HBaseURLsDaysAggregator.java:131

How many mappers were launched? What value is used for hbase.zookeeper.property.maxClientCnxns? You may need to increase the value of that setting.

On Mon, Jul 4, 2011 at 9:26 AM, Lior Schachter li...@infolinks.com wrote:

I used kill -3; the thread dump follows: ...

On Mon, Jul 4, 2011 at 6:22 PM, Ted Yu yuzhih...@gmail.com wrote:

I wasn't clear in my previous email. That was not an answer to why the map tasks got stuck. TableInputFormatBase.getSplits() is being called already. Can you try getting a jstack of one of the map tasks before the task tracker kills it?
Thanks

On Mon, Jul 4, 2011 at 8:15 AM, Lior Schachter li...@infolinks.com wrote:

1. Currently every map gets one region, so I don't understand what difference using the splits will make.
2. How should I use TableInputFormatBase.getSplits()? I could not find examples for that.

Thanks,
Lior

On Mon, Jul 4, 2011 at 5:55 PM, Ted Yu yuzhih...@gmail.com wrote:

For #2, see TableInputFormatBase.getSplits():

* Calculates the splits that will serve as input for the map tasks. The
* number of splits matches the number of regions in a table.

On Mon, Jul 4, 2011 at 7:37 AM, Lior Schachter li...@infolinks.com wrote:

1. Yes - I configure my job using this line, which internally uses TableInputFormat.class:

TableMapReduceUtil.initTableMapperJob(HBaseConsts.URLS_TABLE_NAME, scan, ScanMapper.class, Text.class, MapWritable.class, job)

2. One split per region? What do you mean? How do I do that?
3. HBase version 0.90.2.
4. No exceptions; the logs are very clean.

On Mon, Jul 4, 2011 at 5:22 PM, Ted Yu yuzhih...@gmail.com wrote:

Do you use TableInputFormat? To scan a large number of rows, it is better to produce one split per region. What HBase version do you use? Do you find any exception in the master / region server logs around the moment of the timeout?
Cheers

On Mon, Jul 4, 2011 at 4:48 AM, Lior Schachter li...@infolinks.com wrote:

Hi all,
I'm running a scan using the M/R framework. My table contains hundreds of millions of rows, and I'm scanning about 50 million rows using a start/stop key. The problem is that some map tasks get stuck, and the task manager kills these maps after 600 seconds. When the task is retried, everything works fine (sometimes). To verify that the problem is in HBase (and not in the map code), I removed all the code from my map function, so it looks like this:

public void map(ImmutableBytesWritable key, Result value, Context context)
    throws IOException, InterruptedException {
}

Also, when a map got stuck on a region, I tried to scan that region (using a simple scan from a Java main) and it worked fine. Any ideas?

Thanks,
Lior
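Ted's point that the number of splits matches the number of regions can be sketched without the HBase API. The hypothetical class below (not HBase code; all names are illustrative) shows the idea: given sorted region start keys, only the regions whose key range overlaps the scan's [start, stop) interval become splits, so a scan bounded by start/stop keys yields one map task per overlapping region.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch of "one split per region": keep the start key of each
// region whose [regionStart, regionStop) range overlaps the scan range.
public class RegionSplitSketch {

    // regionStartKeys are sorted; region i covers [key[i], key[i+1]),
    // and the last region is unbounded above (regionStop == null).
    static List<String> splitsForScan(List<String> regionStartKeys,
                                      String scanStart, String scanStop) {
        List<String> splits = new ArrayList<>();
        for (int i = 0; i < regionStartKeys.size(); i++) {
            String regionStart = regionStartKeys.get(i);
            String regionStop =
                (i + 1 < regionStartKeys.size()) ? regionStartKeys.get(i + 1) : null;
            boolean startsBeforeScanEnds =
                scanStop == null || regionStart.compareTo(scanStop) < 0;
            boolean endsAfterScanStarts =
                regionStop == null || regionStop.compareTo(scanStart) > 0;
            if (startsBeforeScanEnds && endsAfterScanStarts) {
                splits.add(regionStart);
            }
        }
        return splits;
    }

    public static void main(String[] args) {
        // Four regions: ""-"g", "g"-"n", "n"-"t", "t"-end.
        List<String> regions = List.of("", "g", "n", "t");
        // A scan over ["h", "p") touches only the regions starting at "g" and "n".
        System.out.println(splitsForScan(regions, "h", "p")); // prints [g, n]
    }
}
```

With real byte[] row keys the comparison would use Bytes.compareTo, but the overlap logic is the same, which is why a bounded scan usually launches far fewer mappers than the table has regions.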
Re: M/R scan problem
1. HBaseURLsDaysAggregator.java:124 and HBaseURLsDaysAggregator.java:131 are not important, since the tasks got stuck even when I removed all my map code (though the thread dumps were generated after I restored the code). If you think it's important, I'll remove the map code again and regenerate the thread dumps.
2. 82 maps were launched, but only 36 ran simultaneously.
3. hbase.zookeeper.property.maxClientCnxns = 300. Should I increase it?

Thanks,
Lior

On Mon, Jul 4, 2011 at 7:33 PM, Ted Yu yuzhih...@gmail.com wrote:

In the future, please provide the full dump via pastebin.com and write a snippet of the log in the email. Can you tell us what the following lines are about?

HBaseURLsDaysAggregator.java:124
HBaseURLsDaysAggregator.java:131

How many mappers were launched? What value is used for hbase.zookeeper.property.maxClientCnxns? You may need to increase the value of that setting.
Re: client-side caching
See HBASE-4018.

On Mon, Jul 4, 2011 at 7:33 AM, Claudio Martella claudio.marte...@tis.bz.it wrote:

Hello list,
I'm using HBase 0.90.3 on a 5-node cluster, with a table as a string-to-long map. As I use this map a lot, I was thinking about installing memcache on the client side, to avoid flooding HBase with requests for the same value over and over. What is the best practice in these situations? Is there some client-side caching already in HBase?

Best,
Claudio

--
Claudio Martella
Digital Technologies Unit Research  Development - Analyst
TIS innovation park
Via Siemens 19 | Siemensstr. 19
39100 Bolzano | 39100 Bozen
Tel. +39 0471 068 123
claudio.marte...@tis.bz.it http://www.tis.bz.it
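Pending built-in support, the client-side cache Claudio describes can be a small LRU map in front of the table. The sketch below is purely illustrative (not HBase or memcache API); fetchFromTable stands in for whatever actually reads the string-to-long table, e.g. an HTable.get in a real client.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical client-side read cache for a string->long lookup table.
// Only fetchFromTable would touch HBase in a real client; here it is a stub.
public class ClientSideCache {
    private final int maxEntries;
    private final Map<String, Long> cache;

    public ClientSideCache(int maxEntries) {
        this.maxEntries = maxEntries;
        // access-order LinkedHashMap gives simple LRU eviction
        this.cache = new LinkedHashMap<String, Long>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, Long> eldest) {
                return size() > ClientSideCache.this.maxEntries;
            }
        };
    }

    // Serve from the local cache; fall through to the table only on a miss.
    public long get(String key) {
        return cache.computeIfAbsent(key, this::fetchFromTable);
    }

    // Stand-in for the real HBase lookup; returns a placeholder value.
    private long fetchFromTable(String key) {
        return key.length();
    }

    public static void main(String[] args) {
        ClientSideCache c = new ClientSideCache(100);
        System.out.println(c.get("example")); // prints 7 (stub returns key length)
    }
}
```

Note the usual caveat with any cache in front of the table: entries go stale if another writer updates the row, so this only fits read-mostly data or data where a bounded staleness is acceptable.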
Re: hbck -fix
On Sun, Jul 3, 2011 at 10:12 AM, Wayne wav...@gmail.com wrote:

HBase needs to evolve a little more before organizations like ours can just use it without having to become experts.

I'd agree with this. In its current state, at least a part-time, seasoned operations engineer (per Andrew's description) is necessary for a substantial production deploy. I don't think that's an onerous expectation for a critical piece of infrastructure. It'd certainly broaden our appeal, though, if we could get into the MySQL calibre of ease-of-use. That said, the issue you ran into, where an 'incident' made it so a 'smart' fellow was unable to reconstitute his store, needs addressing. We'll work on this.
St.Ack

I have to say the community behind HBase is fantastic and goes above and beyond to help greenies like ourselves be successful. With just a little more polish around the edges, I think it can and will really become successful for a much wider audience. Thanks for everyone's help.

On Sun, Jul 3, 2011 at 4:08 AM, Andrew Purtell apurt...@apache.org wrote:

I shorthanded this a bit: "Certainly a seasoned operations engineer would be a good investment for anyone." Let's try instead: "Certainly a seasoned operations engineer [with Java experience] would be a good investment for anyone [running Hadoop-based systems]." I'm not sure what I wrote earlier adequately conveyed the thought.
- Andy

From: Andrew Purtell apurt...@apache.org
To: user@hbase.apache.org
Sent: Sunday, July 3, 2011 12:39 AM
Subject: Re: hbck -fix

Wayne,

Did you by chance have your NameNode configured to write the edit log to only one disk, and in this case only the root volume of the NameNode host? As I'm sure you are now aware, the NameNode's edit log was corrupted, at least the tail of it, when the volume it was being written to was filled by an errant process. The HDFS NameNode has a special, critical role and really must be treated with the utmost care. It can and should be configured to write the fsimage and edit log to multiple local dedicated disks, and user processes should never run on it.

"Hope has long since flown out the window. I just changed my opinion of what it takes to manage HBase. A Java engineer is required on staff."

Perhaps. Certainly a seasoned operations engineer would be a good investment for anyone.

"Having RF=3 in HDFS offers no insurance against HBase losing its shirt and .META. getting corrupted."

This is a valid point. If HDFS loses track of the blocks containing .META. table data due to fsimage corruption on the NameNode, having those blocks on 3 DataNodes is of no use. I've done exercises in the past like deleting .META. on disk and recreating it with the earlier set of utilities (add_table.rb); this has always worked when I've tried it. Results from torture tests that HBase was subjected to in the timeframe leading up to 0.90 also led to better handling of .META. table related errors; they are fortunately now demonstrably rare. Clearly, however, there is room for further improvement here. I will work on https://issues.apache.org/jira/browse/HBASE-4058 and hopefully produce a unit test that fully exercises HBCK's ability to reconstitute .META. and gives reliable results that can be incorporated into the test suite. My concern is that getting repeatable results demonstrating HBCK weaknesses will be challenging.

Best regards,
- Andy

Problems worthy of attack prove their worth by hitting back. - Piet Hein (via Tom White)

----- Original Message -----
From: Wayne wav...@gmail.com
To: user@hbase.apache.org
Sent: Saturday, July 2, 2011 9:55 AM
Subject: Re: hbck -fix

It just returns a ton of errors (import: command not found). Our cluster is hosed anyway; I am waiting to get it completely reinstalled from scratch. Hope has long since flown out the window. I just changed my opinion of what it takes to manage HBase: a Java engineer is required on staff. I also now realize that a backup strategy is more important than for an RDBMS. Having RF=3 in HDFS offers no insurance against HBase losing its shirt and .META. getting corrupted. I think I just found the Achilles heel.

On Sat, Jul 2, 2011 at 12:40 PM, Ted Yu yuzhih...@gmail.com wrote:

Have you tried running check_meta.rb with --fix?

On Sat, Jul 2, 2011 at 9:19 AM, Wayne wav...@gmail.com wrote:

We are running 0.90.3. We were testing the table export, not realizing the data goes to the root drive and not HDFS. The export filled the master's root partition. The logger had issues and HDFS got corrupted (java.io.IOException: Incorrect data format. logVersion is -18 but writables.length is 0). We had to run hadoop fsck -move to fix the corrupted HDFS.
Re: M/R scan problem
The reason I asked about HBaseURLsDaysAggregator.java was that I see no HBase (client) code in the call stack. I have little clue about the problem you experienced. There may be more than one connection to ZooKeeper from one map task, so it doesn't hurt to increase hbase.zookeeper.property.maxClientCnxns.
Cheers

On Mon, Jul 4, 2011 at 9:47 AM, Lior Schachter li...@infolinks.com wrote:

1. HBaseURLsDaysAggregator.java:124 and HBaseURLsDaysAggregator.java:131 are not important, since the tasks got stuck even when I removed all my map code (though the thread dumps were generated after I restored the code).
2. 82 maps were launched, but only 36 ran simultaneously.
3. hbase.zookeeper.property.maxClientCnxns = 300. Should I increase it?

Thanks,
Lior
Re: M/R scan problem
From the master UI, click 'zk dump' (:60010/zk.jsp); it will show you the active connections. See if the count reaches 300 when the map tasks run.

On Mon, Jul 4, 2011 at 10:12 AM, Ted Yu yuzhih...@gmail.com wrote:

The reason I asked about HBaseURLsDaysAggregator.java was that I see no HBase (client) code in the call stack. I have little clue about the problem you experienced. There may be more than one connection to ZooKeeper from one map task, so it doesn't hurt to increase hbase.zookeeper.property.maxClientCnxns.
Cheers
Re: M/R scan problem
I will increase the number of connections to 1000.
Thanks!
Lior

On Mon, Jul 4, 2011 at 8:12 PM, Ted Yu yuzhih...@gmail.com wrote:

The reason I asked about HBaseURLsDaysAggregator.java was that I see no HBase (client) code in the call stack. I have little clue about the problem you experienced. There may be more than one connection to ZooKeeper from one map task, so it doesn't hurt to increase hbase.zookeeper.property.maxClientCnxns.
Cheers
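For reference, the setting being discussed is set in hbase-site.xml (it applies to the ZooKeeper ensemble HBase manages). The fragment below is a hedged sketch; 1000 is simply the value Lior chose here, not a general recommendation:

```xml
<!-- hbase-site.xml fragment: raise the per-client ZooKeeper connection cap.
     The value 1000 is the one chosen in this thread, not a recommendation. -->
<property>
  <name>hbase.zookeeper.property.maxClientCnxns</name>
  <value>1000</value>
</property>
```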
Re: hbck -fix
On Sun, Jul 3, 2011 at 12:39 AM, Andrew Purtell apurt...@apache.org wrote:

I've done exercises in the past like deleting .META. on disk and recreating it with the earlier set of utilities (add_table.rb). This has always worked when I've tried it.

We need to update add_table.rb at least. The onlining of regions was done by the metascan, which no longer exists in 0.90. Maybe a disable/enable after an add_table.rb would do, but it's probably better to revamp it and merge it with hbck?

Results from torture tests that HBase was subjected to in the timeframe leading up to 0.90 also resulted in better handling of .META. table related errors. They are fortunately now demonstrably rare.

Agreed.

My concern here is that getting repeatable results demonstrating HBCK weaknesses will be challenging.

Yes, this is the tough one. I was hoping Wayne had a snapshot of .META. to help at least characterize the problem. (This sounds like something our Dan Harvey ran into recently on an HBase 0.20.x cluster. Let me go back to him; he might have some input that will help here.)
St.Ack
Re: M/R scan problem
Although connection count may not be the root cause, please read http://zhihongyu.blogspot.com/2011/04/managing-connections-in-hbase-090-and.html if you have time. 0.92.0 will do a much better job of managing connections.

On Mon, Jul 4, 2011 at 10:14 AM, Lior Schachter li...@infolinks.com wrote:

I will increase the number of connections to 1000.
Thanks!
Lior
Re: M/R scan problem
Did a quick trim... Sorry to jump in on the tail end of this. Two things you may want to look at: are you timing out because you haven't updated your status within the task, or are you taking 600 seconds to complete a single map() iteration? You can test this by tracking how long you spend in each map iteration and printing out the result if it is longer than 2 minutes. Also try updating your status in each iteration by sending a unique status update, such as the current system time.

Sent from a remote device. Please excuse any typos.

Mike Segel

On Jul 4, 2011, at 12:35 PM, Ted Yu yuzhih...@gmail.com wrote:

Although connection count may not be the root cause, please read http://zhihongyu.blogspot.com/2011/04/managing-connections-in-hbase-090-and.html if you have time. 0.92.0 will do a much better job of managing connections.

On Mon, Jul 4, 2011 at 10:14 AM, Lior Schachter li...@infolinks.com wrote:
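Mike's timing check can be sketched in plain Java. This is a hypothetical illustration (names are made up, not Hadoop API): time each map iteration, flag the slow ones, and report a unique status each time. In a real mapper, reportStatus would call context.setStatus() or context.progress() so the task tracker sees liveness.

```java
// Illustrative sketch of timing each map iteration and flagging slow ones.
// reportStatus is a stub for Hadoop's context.setStatus()/progress().
public class SlowIterationGuard {
    static final long SLOW_THRESHOLD_MS = 2 * 60 * 1000; // 2 minutes

    // Runs one map-body, measures it, logs if slow, and reports liveness.
    public static long timeIteration(Runnable mapBody) {
        long start = System.currentTimeMillis();
        mapBody.run();
        long elapsed = System.currentTimeMillis() - start;
        if (elapsed > SLOW_THRESHOLD_MS) {
            System.err.println("slow map iteration: " + elapsed + " ms");
        }
        // Unique per-iteration status, as Mike suggests (e.g. current time).
        reportStatus("iteration finished at " + System.currentTimeMillis());
        return elapsed;
    }

    // Stand-in for context.setStatus(...) in a real Hadoop mapper.
    static void reportStatus(String status) {
        // no-op in this sketch
    }

    public static void main(String[] args) {
        long ms = timeIteration(() -> { /* pretend map body */ });
        System.out.println("iteration took " + ms + " ms");
    }
}
```

This distinguishes the two failure modes: if iterations are fast but the task still dies at 600 seconds, the task simply isn't reporting progress; if a single iteration exceeds the threshold, the scan itself is stalling.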
Re: Reg: The HLog that is created in region creation
Not really used. See HBASE-4010.

On Jul 4, 2011, at 8:55 PM, Ramkrishna S Vasudevan ramakrish...@huawei.com wrote:

Hello,
Can anybody tell me what the use is of the HLog created per region when a region is created?

Regards,
Ram