Re: Scan problem

2018-03-21 Thread Yang Zhang
Thanks to all of you; your answers helped me a lot.



Re: Scan problem

2018-03-19 Thread Saad Mufti
Another option, if you have enough disk space / off-heap memory space, is to
enable the bucket cache to hold even more of your data, and to set the
PREFETCH_BLOCKS_ON_OPEN => 'true' option on the column families you want
always cached. That way HBase will prefetch your data into the bucket cache
and your scan won't hit that initial slowdown. Or, if you want to do it
globally for all column families, set the configuration flag
"hbase.rs.prefetchblocksonopen" to "true". Keep in mind, though, that if you
do this you should have enough bucket cache space for all your data;
otherwise there will be a lot of useless eviction activity at HBase startup
and even later.
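
For concreteness, a minimal sketch of turning prefetch on for one column
family through the admin API (hedged: this assumes the 1.x-era client API,
and the table/family names are made up; the shell equivalent is an alter
with PREFETCH_BLOCKS_ON_OPEN => 'true'):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Admin;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.util.Bytes;

public class EnablePrefetch {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        try (Connection conn = ConnectionFactory.createConnection(conf);
             Admin admin = conn.getAdmin()) {
            TableName table = TableName.valueOf("mytable");              // hypothetical table
            HTableDescriptor desc = admin.getTableDescriptor(table);
            HColumnDescriptor cf = desc.getFamily(Bytes.toBytes("cf"));  // hypothetical family
            cf.setPrefetchBlocksOnOpen(true);  // blocks are prefetched when regions open
            admin.modifyColumn(table, cf);     // push the changed descriptor to the cluster
        }
    }
}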

Also, where a region is located is heavily impacted by which
region balancer you have chosen and how you have tuned it, in terms of how
often it runs and other parameters. A split region will initially stay at
least on the same region server, but your balancer, if and when it runs, can
move it (and indeed any region) elsewhere to satisfy its criteria.

Cheers.


Saad




Re: Scan problem

2018-03-18 Thread ramkrishna vasudevan
Hi

First regarding the scans,

Generally the data resides in the store files, which live in HDFS. So the
first scan that you are doing is probably reading from HDFS, which involves
disk reads. Once the blocks are read, they are cached in the block cache of
HBase. So your further reads go through that, and hence you see the
speed-up in subsequent scans.
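
To make the effect concrete, here is a hedged client-side sketch that runs
the same scan twice and prints the timings (the table name is made up, and
this uses the 1.x client API rather than the coprocessor RegionScanner from
the original question):

import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.client.Table;

public class WarmupCheck {
    public static void main(String[] args) throws Exception {
        try (Connection conn = ConnectionFactory.createConnection(HBaseConfiguration.create());
             Table table = conn.getTable(TableName.valueOf("mytable"))) { // hypothetical
            for (int pass = 1; pass <= 2; pass++) {
                long start = System.currentTimeMillis();
                Scan scan = new Scan();
                scan.setCacheBlocks(true); // let the read populate the block cache
                try (ResultScanner rs = table.getScanner(scan)) {
                    for (Result r : rs) {
                        // drain the scanner; pass 1 hits HDFS, pass 2 the block cache
                    }
                }
                System.out.println("pass " + pass + ": "
                        + (System.currentTimeMillis() - start) + " ms");
            }
        }
    }
}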

>> And another question about region split, I want to know which RegionServer
>> will load the new region after the split. Will it be the same one as the
>> old region?
Yes. Generally the same region server hosts it.

In master, the code is here:
https://github.com/apache/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/SplitTableRegionProcedure.java

You may need to understand the entire flow to know how the regions are
opened after a split.

Regards
Ram



Scan problem

2018-03-17 Thread Yang Zhang
Hello everyone

I try to do many scans using RegionScanner in a coprocessor, and every
time the first scan costs about 10 times as much as the others.
I don't know why this happens.

OneBucket Scan cost is : 8794 ms Num is : 710
OneBucket Scan cost is : 91 ms Num is : 776
OneBucket Scan cost is : 87 ms Num is : 808
OneBucket Scan cost is : 105 ms Num is : 748
OneBucket Scan cost is : 68 ms Num is : 200


And another question about region split: I want to know which RegionServer
will load the new region after the split.
Will it be the same one as the old region?  Does anyone know where I can
find the code to learn about that?


Thanks for your help


Re: M/R scan problem

2011-07-04 Thread Michel Segel
Did a quick trim...

Sorry to jump in on the tail end of this...
Two things you may want to look at...

Are you timing out because you haven't updated your status within the task, or
are you taking 600 seconds to complete a single map() iteration?

You can test this by tracking how long you are spending in each map
iteration and printing out the result if it is longer than 2 minutes...

Also try updating your status in each iteration by sending a unique status
update, like the current system time...
...
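
A minimal sketch of both ideas, written against the thread's ScanMapper
(this is an illustration assuming the 0.20-era org.apache.hadoop.mapreduce
API, not anyone's actual job code; the 2-minute threshold and status text
are arbitrary):

import java.io.IOException;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.TableMapper;
import org.apache.hadoop.io.MapWritable;
import org.apache.hadoop.io.Text;

public class ScanMapper extends TableMapper<Text, MapWritable> {
    @Override
    protected void map(ImmutableBytesWritable key, Result value, Context context)
            throws IOException, InterruptedException {
        long start = System.currentTimeMillis();
        // ... the real per-row work would go here ...
        long elapsed = System.currentTimeMillis() - start;
        if (elapsed > 2 * 60 * 1000L) {
            // print iterations that take longer than 2 minutes
            System.err.println("slow map() call: " + elapsed + " ms");
        }
        // a unique status string plus a progress ping tells the task tracker
        // the task is alive, so it won't be killed at the 600-second timeout
        context.setStatus("alive at " + System.currentTimeMillis());
        context.progress();
    }
}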


Sent from a remote device. Please excuse any typos...

Mike Segel


Re: M/R scan problem

2011-07-04 Thread Ted Yu
Although connection count may not be the root cause, please read
http://zhihongyu.blogspot.com/2011/04/managing-connections-in-hbase-090-and.html
if you have time.
0.92.0 would do a much better job of managing connections.


Re: M/R scan problem

2011-07-04 Thread Lior Schachter
I will increase the number of connections to 1000.

Thanks !

Lior





Re: M/R scan problem

2011-07-04 Thread Ted Yu
From the master UI, click 'zk dump' (the page at :60010/zk.jsp); it will
show you the active connections. See if the count reaches 300 when the map
tasks run.


Re: M/R scan problem

2011-07-04 Thread Ted Yu
The reason I asked about HBaseURLsDaysAggregator.java was that I see no
HBase (client) code in the call stack.
I have little clue about the problem you experienced.

There may be more than one connection to zookeeper from one map task.
So it doesn't hurt if you increase hbase.zookeeper.property.maxClientCnxns

Cheers


Re: M/R scan problem

2011-07-04 Thread Lior Schachter
1. HBaseURLsDaysAggregator.java:124 and HBaseURLsDaysAggregator.java:131 are
not important, since even when I removed all my map code the tasks still got
stuck (but the thread dumps were generated after I restored the code). If you
think it's important, I'll remove the map code again and regenerate the
thread dumps...

2. 82 maps were launched but only 36 ran simultaneously.

3. hbase.zookeeper.property.maxClientCnxns = 300. Should I increase it ?

Thanks,
Lior




Re: M/R scan problem

2011-07-04 Thread Ted Yu
In the future, provide the full dump using pastebin.com and write a
snippet of the log in the email.

Can you tell us what the following lines are about ?
HBaseURLsDaysAggregator.java:124
HBaseURLsDaysAggregator.java:131

How many mappers were launched ?

What value is used for hbase.zookeeper.property.maxClientCnxns ?
You may need to increase the value for above setting.



Re: M/R scan problem

2011-07-04 Thread Lior Schachter
I used kill -3; the thread dump follows:

Full thread dump Java HotSpot(TM) 64-Bit Server VM (19.1-b02 mixed mode):

"IPC Client (47) connection to /127.0.0.1:59759 from hadoop" daemon prio=10 tid=0x2aaab05ca800 nid=0x4eaf in Object.wait() [0x403c1000]
   java.lang.Thread.State: TIMED_WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
- waiting on <0xf9dba860> (a org.apache.hadoop.ipc.Client$Connection)
at org.apache.hadoop.ipc.Client$Connection.waitForWork(Client.java:403)
- locked <0xf9dba860> (a org.apache.hadoop.ipc.Client$Connection)
at org.apache.hadoop.ipc.Client$Connection.run(Client.java:445)

"SpillThread" daemon prio=10 tid=0x2aaab0585000 nid=0x4c99 waiting on condition [0x404c2000]
   java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0xf9af0c38> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:158)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1987)
at org.apache.hadoop.mapred.MapTask$MapOutputBuffer$SpillThread.run(MapTask.java:1169)

"main-EventThread" daemon prio=10 tid=0x2aaab035d000 nid=0x4c95 waiting on condition [0x41207000]
   java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0xf9af5f58> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:158)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1987)
at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:399)
at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:502)

"main-SendThread(hadoop09.infolinks.local:2181)" daemon prio=10 tid=0x2aaab035c000 nid=0x4c94 runnable [0x40815000]
   java.lang.Thread.State: RUNNABLE
at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:210)
at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:65)
at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:69)
- locked <0xf9af61a8> (a sun.nio.ch.Util$2)
- locked <0xf9af61b8> (a java.util.Collections$UnmodifiableSet)
- locked <0xf9af6160> (a sun.nio.ch.EPollSelectorImpl)
at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:80)
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1107)

"communication thread" daemon prio=10 tid=0x4d02 nid=0x4c93 waiting on condition [0x42497000]
   java.lang.Thread.State: RUNNABLE
at java.util.Hashtable.put(Hashtable.java:420)
- locked <0xf9dbaa58> (a java.util.Hashtable)
at org.apache.hadoop.ipc.Client$Connection.addCall(Client.java:225)
- locked <0xf9dba860> (a org.apache.hadoop.ipc.Client$Connection)
at org.apache.hadoop.ipc.Client$Connection.access$1600(Client.java:176)
at org.apache.hadoop.ipc.Client.getConnection(Client.java:854)
at org.apache.hadoop.ipc.Client.call(Client.java:720)
at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:220)
at org.apache.hadoop.mapred.$Proxy0.ping(Unknown Source)
at org.apache.hadoop.mapred.Task$TaskReporter.run(Task.java:548)
at java.lang.Thread.run(Thread.java:662)

"Thread for syncLogs" daemon prio=10 tid=0x2aaab02e9800 nid=0x4c90 runnable [0x40714000]
   java.lang.Thread.State: RUNNABLE
at java.util.Arrays.copyOf(Arrays.java:2882)
at java.lang.AbstractStringBuilder.expandCapacity(AbstractStringBuilder.java:100)
at java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:390)
at java.lang.StringBuilder.append(StringBuilder.java:119)
at java.io.UnixFileSystem.resolve(UnixFileSystem.java:93)
at java.io.File.<init>(File.java:312)
at org.apache.hadoop.mapred.TaskLog.getTaskLogFile(TaskLog.java:72)
at org.apache.hadoop.mapred.TaskLog.writeToIndexFile(TaskLog.java:180)
at org.apache.hadoop.mapred.TaskLog.syncLogs(TaskLog.java:230)
- locked <0xeea92fc0> (a java.lang.Class for org.apache.hadoop.mapred.TaskLog)
at org.apache.hadoop.mapred.Child$2.run(Child.java:89)

"Low Memory Detector" daemon prio=10 tid=0x2aaab0001800 nid=0x4c86 runnable [0x]
   java.lang.Thread.State: RUNNABLE

"CompilerThread1" daemon prio=10 tid=0x4cb4e800 nid=0x4c85 waiting on condition [0x]
   java.lang.Thread.State: RUNNABLE

"CompilerThread0" daemon prio=10 tid=0x4cb4b000 nid=0x4c84 waiti

Re: M/R scan problem

2011-07-04 Thread Ted Yu
I wasn't clear in my previous email.
It was not an answer to why the map tasks got stuck.
TableInputFormatBase.getSplits() is being called already.

Can you try getting a jstack of one of the map tasks before the task tracker
kills it?

Thanks



Re: M/R scan problem

2011-07-04 Thread Lior Schachter
1. Currently every map gets one region, so I don't understand what
difference using the splits will make.
2. How should I use TableInputFormatBase.getSplits()? I could not find
examples for that.

Thanks,
Lior



Re: M/R scan problem

2011-07-04 Thread Ted Yu
For #2, see TableInputFormatBase.getSplits():
   * Calculates the splits that will serve as input for the map tasks. The
   * number of splits matches the number of regions in a table.
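
In other words, nothing extra is needed to get one split per region; a
sketch of the usual job setup (hedged: this assumes the thread's ScanMapper
class, and the table name, caching value, and job name are illustrative):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil;
import org.apache.hadoop.io.MapWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;

public class ScanJob {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        Job job = new Job(conf, "url-scan");   // job name is made up
        job.setJarByClass(ScanJob.class);
        Scan scan = new Scan();
        scan.setCaching(500);        // rows fetched per RPC; illustrative value
        scan.setCacheBlocks(false);  // commonly recommended for full MR scans
        // initTableMapperJob wires in TableInputFormat, whose getSplits()
        // already produces one split per region -- nothing more to do.
        TableMapReduceUtil.initTableMapperJob("urls", scan,
                ScanMapper.class, Text.class, MapWritable.class, job);
        job.setNumReduceTasks(0);    // drop this line if the job has reducers
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}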




Re: M/R scan problem

2011-07-04 Thread Lior Schachter
1. yes - I configure my job using this line:
TableMapReduceUtil.initTableMapperJob(HBaseConsts.URLS_TABLE_NAME, scan,
ScanMapper.class, Text.class, MapWritable.class, job)

which internally uses TableInputFormat.class

2. One split per region ? What do you mean ? How do I do that ?

3. hbase version 0.90.2

4. no exceptions. the logs are very clean.





Re: M/R scan problem

2011-07-04 Thread Ted Yu
Do you use TableInputFormat ?
To scan a large number of rows, it would be better to produce one Split per
region.

What HBase version do you use ?
Do you find any exception in master / region server logs around the moment
of timeout ?

Cheers



M/R scan problem

2011-07-04 Thread Lior Schachter
Hi all,
I'm running a scan using the M/R framework.
My table contains hundreds of millions of rows, and I'm scanning about
50 million of them using a start/stop key.

The problem is that some map tasks get stuck, and the task tracker kills
these maps after 600 seconds. When the task is retried, everything works fine
(sometimes).

To verify that the problem is in hbase (and not in the map code) I removed
all the code from my map function, so it looks like this:
public void map(ImmutableBytesWritable key, Result value, Context context)
throws IOException, InterruptedException {
}

Also, when the map got stuck on a region, I tried to scan this region (using
a simple scan from a Java main) and it worked fine.

Any ideas ?

Thanks,
Lior


Re: HBase filtered scan problem

2011-05-23 Thread Iulia Zidaru

Thank you very much St. Ack.
It sounds like we have to create another filter.
Iulia
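
For what it's worth, a minimal sketch of such a custom filter against the
0.90-era filter API (the class name and the skip policy are illustrative,
not a drop-in solution):

import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;
import org.apache.hadoop.hbase.KeyValue;
import org.apache.hadoop.hbase.filter.FilterBase;

public class EmptyValueFilter extends FilterBase {
    @Override
    public ReturnCode filterKeyValue(KeyValue kv) {
        // keep cells whose value is empty; skip past non-empty versions
        return kv.getValueLength() == 0 ? ReturnCode.INCLUDE : ReturnCode.NEXT_COL;
    }

    // 0.90-era filters travel over the wire as Writables
    public void write(DataOutput out) throws IOException {}
    public void readFields(DataInput in) throws IOException {}
}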





Re: HBase filtered scan problem

2011-05-12 Thread Stack
On Thu, May 12, 2011 at 6:42 AM, Iulia Zidaru  wrote:
>  Hi,
>
> Thank you for your answer St. Ack.
> Yes, both coordinates are the same. It is impossible for the filter to
> decide that a value is old. I still don't understand why the HBase server
> has both values or how long does it keep both.

Well, it's hard to 'overwrite' if one value is in the memstore and the
other is out on the filesystem.

It'll do the cleanup on major compaction.
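
If you want to force that, a hedged sketch with the 0.90-era admin API (the
table name is made up; majorCompact is an asynchronous request to the
cluster):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HBaseAdmin;

public class ForceCompact {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        HBaseAdmin admin = new HBaseAdmin(conf);
        admin.flush("mytable");        // move memstore contents into store files
        admin.majorCompact("mytable"); // rewrites the files, dropping stale versions
    }
}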

The filter should be able to pick up ordering hints from its context;
it's just not doing it.

> The same thing happens if
> puts have different timestamps.
>

With the filter, you mean?  I'd think the filter should distinguish these.
St.Ack


Re: HBase filtered scan problem

2011-05-12 Thread Iulia Zidaru

 Hi,

Thank you for your answer St. Ack.
Yes, both coordinates are the same. It is impossible for the filter to 
decide that a value is old. I still don't understand why the HBase 
server has both values, or how long it keeps both. The same thing
happens if the puts have different timestamps.


Regards,
Iulia





Re: HBase filtered scan problem

2011-05-11 Thread Stack
On Wed, May 11, 2011 at 2:05 AM, Iulia Zidaru  wrote:

Both entries exist in the hbase server, yes.

The coordinates for both are the same?  If they are exactly the same
row/cf/qualifier/timestamp, then it's going to be hard to distinguish
between the two entries.  The filter is probably not smart enough to
take insertion order into account.

St.Ack


Re: HBase filtered scan problem

2011-05-11 Thread Iulia Zidaru

 Hi,
I'll try to rephrase the problem...
We have a table where we add an empty value. (The same thing happens also
if we have a value.)
Afterward we put a value inside (the same put, just another value). When
scanning for empty values (the first values inserted), the result is wrong
because the filter gets called for both values: the empty one, which matches,
and the non-empty one, which doesn't. The table has only one version.
It looks like the heap object in StoreScanner has both objects. Do you
have any idea whether this is normal behavior and whether we can avoid it somehow?


Thank you,
Iulia
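
For anyone who wants to reproduce this, a hedged sketch with the 0.90-era
client API (the table, family, and qualifier names are made up; the table is
assumed to have VERSIONS => 1):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.filter.BinaryComparator;
import org.apache.hadoop.hbase.filter.CompareFilter.CompareOp;
import org.apache.hadoop.hbase.filter.ValueFilter;
import org.apache.hadoop.hbase.util.Bytes;

public class EmptyValueRepro {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        HTable table = new HTable(conf, "t"); // hypothetical table
        byte[] row = Bytes.toBytes("r1");
        byte[] fam = Bytes.toBytes("f");
        byte[] qual = Bytes.toBytes("q");

        Put first = new Put(row);
        first.add(fam, qual, Bytes.toBytes(""));   // first put: empty value
        table.put(first);
        Put second = new Put(row);
        second.add(fam, qual, Bytes.toBytes("U")); // same cell, new value
        table.put(second);

        // scan for the empty value; until a flush + major compaction, both
        // the old and the new cell can reach the filter
        Scan scan = new Scan();
        scan.setFilter(new ValueFilter(CompareOp.EQUAL,
                new BinaryComparator(Bytes.toBytes(""))));
        for (Result r : table.getScanner(scan)) {
            System.out.println(r);
        }
        table.close();
    }
}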




--
Iulia Zidaru
Java Developer

1&1 Internet AG - Bucharest/Romania - Web Components Romania
18 Mircea Eliade St
Sect 1, Bucharest
RO Bucharest, 012015
iulia.zid...@1and1.ro
0040 31 223 9153

 



HBase filtered scan problem

2011-05-10 Thread Stefan Comanita
Hi all, 

I want to do a scan on a number of rows, each row having multiple columns, and
I want to filter out some of these columns based on their values. For example, if
I have the following rows:

plainRow:col:value1 column=T:19, timestamp=19, value=
plainRow:col:value1 column=T:2, timestamp=2, value=U
plainRow:col:value1 column=T:3, timestamp=3, value=U
plainRow:col:value1 column=T:4, timestamp=4, value=

and

secondRow:col:value1 column=T:1, timestamp=1, value=
secondRow:col:value1 column=T:2, timestamp=2, value=
secondRow:col:value1 column=T:3, timestamp=3, value=U
secondRow:col:value1 column=T:4, timestamp=4, value=


and I want to select all the rows but just with the columns that don't have the 
value "U", something like:

plainRow:col:value1 column=T:19, timestamp=19, value=
plainRow:col:value1 column=T:4, timestamp=4, value=
secondRow:col:value1 column=T:1, timestamp=1, value=
secondRow:col:value1 column=T:2, timestamp=2, value=
secondRow:col:value1 column=T:4, timestamp=4, value=

and to achieve this, I try the following:

Scan scan = new Scan();
scan.setStartRow(stringToBytes(rowIdentifier));
scan.setStopRow(stringToBytes(rowIdentifier + Constants.MAX_CHAR));
scan.addFamily(Constants.TERM_VECT_COLUMN_FAMILY);

if (includeFilter) {
    Filter filter = new ValueFilter(CompareOp.EQUAL,
            new BinaryComparator(stringToBytes("U")));
    scan.setFilter(filter);
}

and if I execute this scan I get the rows with the columns having the value
"U", which is correct. But when I set CompareOp.NOT_EQUAL and expect to get
the other columns, it doesn't work the way I want: it gives me back all the
rows, including the ones which have the value "U". The same happens when I use:
Filter filter = new ValueFilter(CompareOp.EQUAL, new
BinaryComparator(stringToBytes("")));

I mention that the columns have the values "U" and "" (empty string), and that
I also saw the same behavior with the RegexComparator and SubstringComparator.

Any idea would be very much appreciated, sorry for the long mail, thank you.

Stefan Comanita