Re: Scan problem
Thanks to all of you; your answers helped me a lot.

2018-03-19 22:31 GMT+08:00 Saad Mufti :
> Another option if you have enough disk space/off-heap memory space is to
> enable bucket cache to cache even more of your data, and set the
> PREFETCH_ON_OPEN => true option on the column families you want always
> cached.
> ...
Re: Scan problem
Another option, if you have enough disk space/off-heap memory space, is to enable the bucket cache to cache even more of your data, and set the PREFETCH_ON_OPEN => true option on the column families you want always cached. That way HBase will prefetch your data into the bucket cache and your scan won't have that initial slowdown. Or if you want to do it globally for all column families, set the configuration flag "hbase.rs.prefetchblocksonopen" to "true". Keep in mind though that if you do this, you should have enough bucket cache space for all your data; otherwise there will be a lot of useless eviction activity at HBase startup and even later.

Also, where a region is located will be heavily impacted by which region balancer you have chosen and how you have tuned it, in terms of how often it runs and other parameters. A split region will initially stay on the same region server, but your balancer, if and when run, can move it (and indeed any region) elsewhere to satisfy its criteria.

Cheers.

Saad

On Mon, Mar 19, 2018 at 1:14 AM, ramkrishna vasudevan <
ramkrishna.s.vasude...@gmail.com> wrote:
> Hi
>
> First, regarding the scans,
> ...
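A minimal sketch of the per-column-family route through the Java Admin API, assuming an HBase 1.x client; the table and family names here are hypothetical, and the shell alteration or the global "hbase.rs.prefetchblocksonopen" flag mentioned above are alternatives:

import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Admin;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.util.Bytes;

public class EnablePrefetch {
    public static void main(String[] args) throws Exception {
        try (Connection conn = ConnectionFactory.createConnection(HBaseConfiguration.create());
             Admin admin = conn.getAdmin()) {
            TableName table = TableName.valueOf("my_table");            // hypothetical table
            HTableDescriptor desc = admin.getTableDescriptor(table);
            HColumnDescriptor cf = desc.getFamily(Bytes.toBytes("cf")); // hypothetical family
            // Ask region servers to load this family's blocks into the cache
            // when a region opens, so the first scan is not a cold read.
            cf.setPrefetchBlocksOnOpen(true);
            admin.modifyColumn(table, cf);
        }
    }
}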
Re: Scan problem
Hi

First, regarding the scans:

Generally the data resides in the store files, which are in HDFS. So the first scan that you are doing is probably reading from HDFS, which involves disk reads. Once the blocks are read, they are cached in the block cache of HBase. Your further reads go through that, and hence you see the speed-up in the later scans.

>> And another question about region split: I want to know which RegionServer
>> will load the new region after it is split.
>> Will it be the same one as the old region?

Yes, generally the same region server hosts it.

In master the code is here:
https://github.com/apache/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/SplitTableRegionProcedure.java

You may need to understand the entire flow to know how the regions are opened after a split.

Regards
Ram

On Sat, Mar 17, 2018 at 9:02 PM, Yang Zhang wrote:
> Hello everyone
>
> I try to do many Scan use RegionScanner in coprocessor ...
> ...
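To see the effect Ram describes, here is a rough sketch of timing one full pass over a region from inside a coprocessor, assuming the HBase 1.x Region/RegionScanner API. Called twice, the first call pays the HDFS disk reads and the second is served from the block cache, matching the numbers Yang reported:

import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import org.apache.hadoop.hbase.Cell;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.regionserver.Region;
import org.apache.hadoop.hbase.regionserver.RegionScanner;

public final class ScanTimer {
    // Scans the whole region once and returns the elapsed milliseconds.
    static long timedScan(Region region) throws IOException {
        Scan scan = new Scan();
        scan.setCacheBlocks(true); // keep read blocks in the block cache (the default)
        RegionScanner scanner = region.getScanner(scan);
        long start = System.currentTimeMillis();
        List<Cell> cells = new ArrayList<>();
        boolean more;
        do {
            cells.clear();
            more = scanner.next(cells); // cold: HDFS disk reads; warm: block cache hits
        } while (more);
        scanner.close();
        return System.currentTimeMillis() - start;
    }
}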
Scan problem
Hello everyone

I am doing many Scans using RegionScanner in a coprocessor, and every time the first Scan costs about 10 times more than the others. I don't know why this happens.

OneBucket Scan cost is : 8794 ms Num is : 710
OneBucket Scan cost is : 91 ms Num is : 776
OneBucket Scan cost is : 87 ms Num is : 808
OneBucket Scan cost is : 105 ms Num is : 748
OneBucket Scan cost is : 68 ms Num is : 200

And another question about region split: I want to know which RegionServer will load the new region after it is split. Will it be the same one as the old region? Does anyone know where I can find the code to learn about that?

Thanks for your help
Re: M/R scan problem
Did a quick trim... Sorry to jump in on the tail end of this...

Two things you may want to look at...

Are you timing out because you haven't updated your status within the task, or are you taking 600 seconds to complete a single map() iteration? You can test this by tracking how long you spend in each map iteration and printing out the result if it is longer than 2 mins...

Also try updating your status in each iteration by sending a unique status update, like the current system time...

...

Sent from a remote device. Please excuse any typos...

Mike Segel

On Jul 4, 2011, at 12:35 PM, Ted Yu wrote:
> Although connection count may not be the root cause, please read
> http://zhihongyu.blogspot.com/2011/04/managing-connections-in-hbase-090-and.html
> if you have time.
> ...
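A sketch of both of Mike's suggestions in mapper form, borrowing the ScanMapper name and output types from Lior's initTableMapperJob line later in the thread; the 2-minute threshold is only illustrative:

import java.io.IOException;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.TableMapper;
import org.apache.hadoop.io.MapWritable;
import org.apache.hadoop.io.Text;

public class ScanMapper extends TableMapper<Text, MapWritable> {
    private static final long SLOW_MS = 2 * 60 * 1000; // report iterations over 2 minutes

    @Override
    protected void map(ImmutableBytesWritable key, Result value, Context context)
            throws IOException, InterruptedException {
        long start = System.currentTimeMillis();

        // ... real per-row work would go here ...

        long elapsed = System.currentTimeMillis() - start;
        if (elapsed > SLOW_MS) {
            System.err.println("slow map() iteration: " + elapsed + " ms");
        }
        // A unique status string tells the task tracker the task is alive,
        // so it is not killed at the 600-second timeout.
        context.setStatus(String.valueOf(System.currentTimeMillis()));
        context.progress();
    }
}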
Re: M/R scan problem
Although connection count may not be the root cause, please read
http://zhihongyu.blogspot.com/2011/04/managing-connections-in-hbase-090-and.html
if you have time.
0.92.0 would do a much better job of managing connections.

On Mon, Jul 4, 2011 at 10:14 AM, Lior Schachter wrote:
> I will increase the number of connections to 1000.
>
> Thanks !
> ...
Re: M/R scan problem
I will increase the number of connections to 1000.

Thanks !

Lior

On Mon, Jul 4, 2011 at 8:12 PM, Ted Yu wrote:
> The reason I asked about HBaseURLsDaysAggregator.java was that I see no
> HBase (client) code in the call stack.
> ...
Re: M/R scan problem
From the master UI, click 'zk dump' (:60010/zk.jsp); it shows the active connections. See if the count reaches 300 when the map tasks run.

On Mon, Jul 4, 2011 at 10:12 AM, Ted Yu wrote:
> The reason I asked about HBaseURLsDaysAggregator.java was that I see no
> HBase (client) code in the call stack.
> ...
Re: M/R scan problem
The reason I asked about HBaseURLsDaysAggregator.java was that I see no HBase (client) code in the call stack. I have little clue about the problem you experienced.

There may be more than one connection to zookeeper from one map task, so it doesn't hurt to increase hbase.zookeeper.property.maxClientCnxns.

Cheers

On Mon, Jul 4, 2011 at 9:47 AM, Lior Schachter wrote:
> 1. HBaseURLsDaysAggregator.java:124, HBaseURLsDaysAggregator.java:131 :
> are not important since even when I removed all my map code the tasks
> got stuck ...
> ...
Re: M/R scan problem
1. HBaseURLsDaysAggregator.java:124 and HBaseURLsDaysAggregator.java:131 are not important, since even when I removed all my map code the tasks got stuck (but the thread dumps were generated after I revived the code). If you think it's important, I'll remove the map code again and re-generate the thread dumps...

2. 82 maps were launched but only 36 ran simultaneously.

3. hbase.zookeeper.property.maxClientCnxns = 300. Should I increase it ?

Thanks,
Lior

On Mon, Jul 4, 2011 at 7:33 PM, Ted Yu wrote:
> In the future, provide the full dump using pastebin.com
> ...
Re: M/R scan problem
In the future, provide the full dump using pastebin.com and write a snippet of the log in the email.

Can you tell us what the following lines are about ?
HBaseURLsDaysAggregator.java:124
HBaseURLsDaysAggregator.java:131

How many mappers were launched ?

What value is used for hbase.zookeeper.property.maxClientCnxns ?
You may need to increase the value for the above setting.

On Mon, Jul 4, 2011 at 9:26 AM, Lior Schachter wrote:
> I used kill -3; the thread dump follows:
>
> ...
Re: M/R scan problem
I used kill -3; the thread dump follows:

Full thread dump Java HotSpot(TM) 64-Bit Server VM (19.1-b02 mixed mode):

"IPC Client (47) connection to /127.0.0.1:59759 from hadoop" daemon prio=10 tid=0x2aaab05ca800 nid=0x4eaf in Object.wait() [0x403c1000]
   java.lang.Thread.State: TIMED_WAITING (on object monitor)
        at java.lang.Object.wait(Native Method)
        - waiting on <0xf9dba860> (a org.apache.hadoop.ipc.Client$Connection)
        at org.apache.hadoop.ipc.Client$Connection.waitForWork(Client.java:403)
        - locked <0xf9dba860> (a org.apache.hadoop.ipc.Client$Connection)
        at org.apache.hadoop.ipc.Client$Connection.run(Client.java:445)

"SpillThread" daemon prio=10 tid=0x2aaab0585000 nid=0x4c99 waiting on condition [0x404c2000]
   java.lang.Thread.State: WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for <0xf9af0c38> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
        at java.util.concurrent.locks.LockSupport.park(LockSupport.java:158)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1987)
        at org.apache.hadoop.mapred.MapTask$MapOutputBuffer$SpillThread.run(MapTask.java:1169)

"main-EventThread" daemon prio=10 tid=0x2aaab035d000 nid=0x4c95 waiting on condition [0x41207000]
   java.lang.Thread.State: WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for <0xf9af5f58> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
        at java.util.concurrent.locks.LockSupport.park(LockSupport.java:158)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1987)
        at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:399)
        at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:502)

"main-SendThread(hadoop09.infolinks.local:2181)" daemon prio=10 tid=0x2aaab035c000 nid=0x4c94 runnable [0x40815000]
   java.lang.Thread.State: RUNNABLE
        at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
        at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:210)
        at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:65)
        at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:69)
        - locked <0xf9af61a8> (a sun.nio.ch.Util$2)
        - locked <0xf9af61b8> (a java.util.Collections$UnmodifiableSet)
        - locked <0xf9af6160> (a sun.nio.ch.EPollSelectorImpl)
        at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:80)
        at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1107)

"communication thread" daemon prio=10 tid=0x4d02 nid=0x4c93 waiting on condition [0x42497000]
   java.lang.Thread.State: RUNNABLE
        at java.util.Hashtable.put(Hashtable.java:420)
        - locked <0xf9dbaa58> (a java.util.Hashtable)
        at org.apache.hadoop.ipc.Client$Connection.addCall(Client.java:225)
        - locked <0xf9dba860> (a org.apache.hadoop.ipc.Client$Connection)
        at org.apache.hadoop.ipc.Client$Connection.access$1600(Client.java:176)
        at org.apache.hadoop.ipc.Client.getConnection(Client.java:854)
        at org.apache.hadoop.ipc.Client.call(Client.java:720)
        at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:220)
        at org.apache.hadoop.mapred.$Proxy0.ping(Unknown Source)
        at org.apache.hadoop.mapred.Task$TaskReporter.run(Task.java:548)
        at java.lang.Thread.run(Thread.java:662)

"Thread for syncLogs" daemon prio=10 tid=0x2aaab02e9800 nid=0x4c90 runnable [0x40714000]
   java.lang.Thread.State: RUNNABLE
        at java.util.Arrays.copyOf(Arrays.java:2882)
        at java.lang.AbstractStringBuilder.expandCapacity(AbstractStringBuilder.java:100)
        at java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:390)
        at java.lang.StringBuilder.append(StringBuilder.java:119)
        at java.io.UnixFileSystem.resolve(UnixFileSystem.java:93)
        at java.io.File.<init>(File.java:312)
        at org.apache.hadoop.mapred.TaskLog.getTaskLogFile(TaskLog.java:72)
        at org.apache.hadoop.mapred.TaskLog.writeToIndexFile(TaskLog.java:180)
        at org.apache.hadoop.mapred.TaskLog.syncLogs(TaskLog.java:230)
        - locked <0xeea92fc0> (a java.lang.Class for org.apache.hadoop.mapred.TaskLog)
        at org.apache.hadoop.mapred.Child$2.run(Child.java:89)

"Low Memory Detector" daemon prio=10 tid=0x2aaab0001800 nid=0x4c86 runnable [0x]
   java.lang.Thread.State: RUNNABLE

"CompilerThread1" daemon prio=10 tid=0x4cb4e800 nid=0x4c85 waiting on condition [0x]
   java.lang.Thread.State: RUNNABLE

"CompilerThread0" daemon prio=10 tid=0x4cb4b000 nid=0x4c84 waiti
Re: M/R scan problem
I wasn't clear in my previous email. It was not an answer to why the map tasks got stuck. TableInputFormatBase.getSplits() is being called already.

Can you try getting a jstack of one of the map tasks before the task tracker kills it ?

Thanks

On Mon, Jul 4, 2011 at 8:15 AM, Lior Schachter wrote:
> 1. Currently every map gets one region. So I don't understand what
> difference using the splits will make.
> ...
Re: M/R scan problem
1. Currently every map gets one region, so I don't understand what difference using the splits will make.
2. How should I use TableInputFormatBase.getSplits() ? Could not find examples for that.

Thanks,
Lior

On Mon, Jul 4, 2011 at 5:55 PM, Ted Yu wrote:
> For #2, see TableInputFormatBase.getSplits():
> ...
Re: M/R scan problem
For #2, see TableInputFormatBase.getSplits():

 * Calculates the splits that will serve as input for the map tasks. The
 * number of splits matches the number of regions in a table.

On Mon, Jul 4, 2011 at 7:37 AM, Lior Schachter wrote:
> 1. yes - I configure my job using this line:
> ...
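As the javadoc implies, the framework invokes getSplits() itself, so there is normally nothing to call by hand. If you want to confirm the one-split-per-region behavior, a hypothetical subclass can log it (a sketch only; it would be wired in with job.setInputFormatClass(LoggingTableInputFormat.class) after initTableMapperJob):

import java.io.IOException;
import java.util.List;
import org.apache.hadoop.hbase.mapreduce.TableInputFormat;
import org.apache.hadoop.mapreduce.InputSplit;
import org.apache.hadoop.mapreduce.JobContext;

public class LoggingTableInputFormat extends TableInputFormat {
    @Override
    public List<InputSplit> getSplits(JobContext context) throws IOException {
        // The parent class computes one split per region of the table.
        List<InputSplit> splits = super.getSplits(context);
        System.err.println("one split per region: " + splits.size() + " splits");
        return splits;
    }
}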
Re: M/R scan problem
1. Yes - I configure my job using this line:

TableMapReduceUtil.initTableMapperJob(HBaseConsts.URLS_TABLE_NAME, scan,
    ScanMapper.class, Text.class, MapWritable.class, job)

which internally uses TableInputFormat.class

2. One split per region ? What do you mean ? How do I do that ?

3. hbase version 0.90.2

4. no exceptions. the logs are very clean.

On Mon, Jul 4, 2011 at 5:22 PM, Ted Yu wrote:
> Do you use TableInputFormat ?
> ...
Re: M/R scan problem
Do you use TableInputFormat ?
To scan a large number of rows, it would be better to produce one Split per region.

What HBase version do you use ?
Do you find any exception in the master / region server logs around the moment of the timeout ?

Cheers

On Mon, Jul 4, 2011 at 4:48 AM, Lior Schachter wrote:
> Hi all,
> I'm running a scan using the M/R framework.
> ...
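For context, a minimal job setup along the lines Ted suggests, written against the 0.90-era APIs from this thread. The table name, key range, and caching value are placeholders; a larger scanner caching value cuts RPC round trips, and disabling block caching avoids churning the region server cache during a one-off full scan:

import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.IdentityTableMapper;
import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.hadoop.mapreduce.Job;

public class ScanJobSetup {
    public static Job createJob(Configuration conf) throws IOException {
        Scan scan = new Scan();
        scan.setStartRow(Bytes.toBytes("startKey")); // placeholder key range
        scan.setStopRow(Bytes.toBytes("stopKey"));
        scan.setCaching(500);        // rows fetched per RPC; cuts round trips
        scan.setCacheBlocks(false);  // don't evict hot data for a one-off scan
        Job job = new Job(conf, "url-scan");
        // TableInputFormat underneath produces one map task per region.
        TableMapReduceUtil.initTableMapperJob("urls", scan,
                IdentityTableMapper.class, ImmutableBytesWritable.class, Result.class, job);
        return job;
    }
}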
M/R scan problem
Hi all,
I'm running a scan using the M/R framework. My table contains hundreds of millions of rows, and I'm scanning about 50 million rows using a start/stop key.

The problem is that some map tasks get stuck and the task tracker kills these maps after 600 seconds. When retrying the task, everything works fine (sometimes).

To verify that the problem is in HBase (and not in the map code) I removed all the code from my map function, so it looks like this:

public void map(ImmutableBytesWritable key, Result value, Context context)
    throws IOException, InterruptedException {
}

Also, when a map got stuck on a region, I tried to scan this region (using a simple scan from a Java main) and it worked fine.

Any ideas ?

Thanks,
Lior
Re: HBase filtered scan problem
Thank you very much St. Ack. It sounds like we have to create another filter.

Iulia

On 05/12/2011 08:07 PM, Stack wrote:
> ...
Re: HBase filtered scan problem
On Thu, May 12, 2011 at 6:42 AM, Iulia Zidaru wrote:
> Hi,
>
> Thank you for your answer St. Ack.
> Yes, both coordinates are the same. It is impossible for the filter to
> decide that a value is old. I still don't understand why the HBase server
> has both values or how long it keeps both.

Well, it's hard to 'overwrite' if one value is in the memstore and the other is out on the filesystem. It'll do the clean-up on major compaction.

The filter should be able to pick up ordering hints from its context; it's just not doing it.

> The same thing happens if
> puts have different timestamps.

With the filter you mean? I'd think the filter should distinguish these.

St.Ack
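Since the superseded cell only disappears at major compaction, one workaround in a test setup is to force a compaction before scanning with the filter. A rough sketch against the 0.90-era client API (the table name is hypothetical, and the call is asynchronous, so the servers compact in the background):

import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HBaseAdmin;

public class ForceMajorCompaction {
    public static void main(String[] args) throws Exception {
        HBaseAdmin admin = new HBaseAdmin(HBaseConfiguration.create());
        // Rewrites the store files, dropping cells superseded by newer puts,
        // after which a ValueFilter scan no longer sees the old value.
        admin.majorCompact("my_table");
    }
}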
Re: HBase filtered scan problem
Hi,

Thank you for your answer St. Ack. Yes, both coordinates are the same. It is impossible for the filter to decide that a value is old. I still don't understand why the HBase server has both values or how long it keeps both. The same thing happens if the puts have different timestamps.

Regards,
Iulia

On 05/11/2011 08:05 PM, Stack wrote:
> ...
Re: HBase filtered scan problem
On Wed, May 11, 2011 at 2:05 AM, Iulia Zidaru wrote:
> Hi,
> I'll try to rephrase the problem...
> We have a table where we add an empty value. (The same thing happens also
> if we have a value.)
> Afterward we put a value inside (same put, just another value). When
> scanning for empty values (the first values inserted), the result is wrong
> because the filter gets called for both values (the empty one, which
> matches, and the non-empty one, which doesn't match). The table has only
> one version. It looks like the heap object in StoreScanner has both
> objects. Do you have any idea if this is normal behavior and if we can
> avoid it somehow?

Both entries exist in the hbase server, yes. The coordinates for both are the same? If exactly the same row/cf/qualifier/timestamp, then it's going to be hard to distinguish between the two entries. The filter is probably not smart enough to take insertion order into account.

St.Ack
Re: HBase filtered scan problem
Hi,
I'll try to rephrase the problem...
We have a table where we add an empty value. (The same thing happens also if we have a value.)
Afterward we put a value inside (same put, just another value). When scanning for empty values (the first values inserted), the result is wrong because the filter gets called for both values (the empty one, which matches, and the non-empty one, which doesn't match). The table has only one version. It looks like the heap object in StoreScanner has both objects. Do you have any idea if this is normal behavior and if we can avoid it somehow?

Thank you,
Iulia

On 05/10/2011 03:56 PM, Stefan Comanita wrote:
> Hi all,
> I want to do a scan on a number of rows, each row having multiple
> columns, and I want to filter out some of these columns based on their
> values...
> ...
HBase filtered scan problem
Hi all,

I want to do a scan on a number of rows, each row having multiple columns, and I want to filter out some of these columns based on their values. For example, if I have the following rows:

plainRow:col:value1 column=T:19, timestamp=19, value=
plainRow:col:value1 column=T:2, timestamp=2, value=U
plainRow:col:value1 column=T:3, timestamp=3, value=U
plainRow:col:value1 column=T:4, timestamp=4, value=

and

secondRow:col:value1 column=T:1, timestamp=1, value=
secondRow:col:value1 column=T:2, timestamp=2, value=
secondRow:col:value1 column=T:3, timestamp=3, value=U
secondRow:col:value1 column=T:4, timestamp=4, value=

and I want to select all the rows, but just with the columns that don't have the value "U", something like:

plainRow:col:value1 column=T:19, timestamp=19, value=
plainRow:col:value1 column=T:4, timestamp=4, value=
secondRow:col:value1 column=T:1, timestamp=1, value=
secondRow:col:value1 column=T:2, timestamp=2, value=
secondRow:col:value1 column=T:4, timestamp=4, value=

To achieve this, I try the following:

Scan scan = new Scan();
scan.setStartRow(stringToBytes(rowIdentifier));
scan.setStopRow(stringToBytes(rowIdentifier + Constants.MAX_CHAR));
scan.addFamily(Constants.TERM_VECT_COLUMN_FAMILY);
if (includeFilter) {
    Filter filter = new ValueFilter(CompareOp.EQUAL,
        new BinaryComparator(stringToBytes("U")));
    scan.setFilter(filter);
}

If I execute this scan, I get the rows with the columns having the value "U", which is correct. But when I set CompareOp.NOT_EQUAL and expect to get the other columns, it doesn't work the way I want: it gives me back all the rows, including the ones which have the value "U". The same happens when I use:

Filter filter = new ValueFilter(CompareOp.EQUAL, new BinaryComparator(stringToBytes("")));

I mention that the columns have the values "U" and "" (empty string), and that I also saw the same behavior with the RegexComparator and SubstringComparator.

Any idea would be very much appreciated; sorry for the long mail, thank you.

Stefan Comanita
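For completeness, a self-contained version of the NOT_EQUAL variant of this scan; the helper stringToBytes is replaced by Bytes.toBytes, and the row prefix, stop-key sentinel, and family name are placeholders. Note that, per Stack's replies above, an overwritten cell can still be presented to the filter until a major compaction cleans it up, so this alone may not fix the result:

import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.filter.BinaryComparator;
import org.apache.hadoop.hbase.filter.CompareFilter.CompareOp;
import org.apache.hadoop.hbase.filter.Filter;
import org.apache.hadoop.hbase.filter.ValueFilter;
import org.apache.hadoop.hbase.util.Bytes;

public class NotEqualScan {
    public static Scan build(String rowIdentifier) {
        Scan scan = new Scan();
        scan.setStartRow(Bytes.toBytes(rowIdentifier));
        scan.setStopRow(Bytes.toBytes(rowIdentifier + '\uffff')); // stand-in for Constants.MAX_CHAR
        scan.addFamily(Bytes.toBytes("T"));
        scan.setMaxVersions(1); // explicitly request only the newest version of each cell
        // ValueFilter drops individual cells whose value matches the comparison,
        // so rows still appear, but only with their non-"U" columns.
        Filter filter = new ValueFilter(CompareOp.NOT_EQUAL,
                new BinaryComparator(Bytes.toBytes("U")));
        scan.setFilter(filter);
        return scan;
    }
}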