Re: HBASE-138: Under load, regions become extremely large and eventually cause region servers to become unresponsive

2008-02-11 Thread stack
Mind doing a 'select * from .META.;' in the HQL screen on your master?  
Going by the below, the .META. is corrupt: i.e. restart didn't fix it 
and it's the same row that it's complaining about.  Stick the resulting 
page into the issue if you don't mind.


Thanks,
St.Ack


Marc Harris wrote:

The client does not try to upload the same row again and again. The
hbase client tries a few times internally, but then if the exception
gets out to the client application, it is logged and the application
moves on. The client application's log (store.log) actually shows some
successes in among the failures.
  
My reading of the log file is not the same as yours. It looks to me as

if each row is tried 5 times, throwing WREs each time, before moving on
to another row. All the errors do seem to be regarding the same region
though
( pagefetch,http://fun.twilightwap.com/rate.asp?joke_id=183&rating=0
wap2 20080102055026,1202660655358,
startKey='http://fun.twilightwap.com/rate.asp?joke_id=183&rating=0 wap2
20080102055026',
getEndKey()='http://fun.twilightwap.com/rate.asp?joke_id=183&rating=0
wap2 20080102055026).
  



I tried stopping the client application, and restarting it at the point
where it failed, with no success. I tried restarting the region server
and master server too, also without success.

- Marc

P.S. Should this discussion be happening in JIRA or here or both?

On Mon, 2008-02-11 at 11:27 -0800, stack wrote:

  

Marc Harris wrote:


Logs sent via yousendit.com.

  
  
Thanks for the logs.  I took a quick look.  Upload seems to be going 
along fine until we start getting the WrongRegionException.  In issue 
HBASE-428, you say your client is single-threaded.   Is it thick-headed 
too (smile) in that it unrelentingly keeps trying the same row over and 
over?  (The log seems to have a problem with the same row over and over again.)


Guessing as to what is up, either the client cache of regions is messed 
up or the .META. table has become corrupt somehow -- it doesn't have a 
list of all regions (perhaps it didn't get a split update or some such).


If the former, I wonder what would happen if you took your load off, 
killed the client, then resumed at the problematic row?  If things 
started to work again, that would seem to point at a client-side issue.




Maybe "re-architect" was not an accurate representation of what I am
doing. We currently do not have a solution that allows us to add rows to
our system in arbitrary order and then analyze them, either in order or
using map-reduce. A year or so ago we tried an RDBMS, and based on that
experience, and some comments from Doug Cutting, decided that an RDBMS
had no chance of being able to support this kind of functionality.

In terms of performance parameters, the 200 rows/sec that was achieved
for the first 500K rows was quite sufficient. I don't have a good answer
because after all these rows get loaded there will be numerous
map/reduce jobs that execute on them. I would guess that some vague
parameters are:

- In 3 days, load 100Gb of data representing 10M "units" split over 3
tables each of which is split over 3 column families. Some fraction of
these "units" will be replacements for existing ones (same key) some
will be new
- Several map-reduce jobs that mostly involve reading the data for each
"unit" and then writing a few small pieces of data (a few bytes) for
each "unit". Probably some more interesting maps too, but I don't know
yet.
- At least 2 map-reduce jobs that delete units.
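As a sanity check on the numbers above, the sustained rates the 3-day load implies are small. A quick sketch (reading "Gb" as gigabytes, and assuming the load is spread evenly over the 3 days -- both assumptions, not stated in the mail):

```java
public class LoadEstimate {
    public static void main(String[] args) {
        long units = 10_000_000L;            // 10M "units"
        double gigabytes = 100.0;            // ~100 GB total
        long seconds = 3L * 24 * 3600;       // 3 days
        // Sustained rates needed to finish in time.
        System.out.printf("rows/sec: %.1f%n", (double) units / seconds);   // ~38.6
        System.out.printf("MB/sec:   %.2f%n", gigabytes * 1000 / seconds); // ~0.39
    }
}
```

So the ~200 rows/sec observed for the first 500K rows would be comfortably above the required sustained rate, if it held.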
  
  

These numbers look reasonable to me.  Let's try and make it work.


Am I correct when I say that using 4 region servers will just delay the
problem by a factor of 4, or have I misunderstood the underlying cause?

  
  

Yes.

The factor might be > 4 but effectively, if there's an issue using a single 
server, then the same issue will arise with N nodes.


St.Ack



  




Re: HBASE-138: Under load, regions become extremely large and eventually cause region servers to become unresponsive

2008-02-11 Thread Marc Harris
The client does not try to upload the same row again and again. The
hbase client tries a few times internally, but then if the exception
gets out to the client application, it is logged and the application
moves on. The client application's log (store.log) actually shows some
successes in among the failures.

My reading of the log file is not the same as yours. It looks to me as
if each row is tried 5 times, throwing WREs each time, before moving on
to another row. All the errors do seem to be regarding the same region
though
( pagefetch,http://fun.twilightwap.com/rate.asp?joke_id=183&rating=0
wap2 20080102055026,1202660655358,
startKey='http://fun.twilightwap.com/rate.asp?joke_id=183&rating=0 wap2
20080102055026',
getEndKey()='http://fun.twilightwap.com/rate.asp?joke_id=183&rating=0
wap2 20080102055026).

I tried stopping the client application, and restarting it at the point
where it failed, with no success. I tried restarting the region server
and master server too, also without success.

- Marc

P.S. Should this discussion be happening in JIRA or here or both?

On Mon, 2008-02-11 at 11:27 -0800, stack wrote:

> Marc Harris wrote:
> > Logs sent via yousendit.com.
> >
> >   
> Thanks for the logs.  I took a quick look.  Upload seems to be going 
> along fine until we start getting the WrongRegionException.  In issue 
> HBASE-428, you say your client is single-threaded.   Is it thick-headed 
> too (smile) in that it unrelentingly keeps trying the same row over and 
> over?  (The log seems to have a problem with the same row over and over again.)
> 
> Guessing as to what is up, either the client cache of regions is messed 
> up or the .META. table has become corrupt somehow -- it doesn't have 
> list of all regions (Perhaps it didn't get a split update or some such).
> 
> If the former, I wonder what would happen if you took your load off, 
> killed the client, then resumed at the problematic row?  If things 
> started to work again, would seem to point at client-side issue.
> 
> > Maybe "re-architect" was not an accurate representation of what I am
> > doing. We currently do not have a solution that allows us to add rows to
> > our system in arbitrary order and then analyze them, either in order or
> > using map-reduce. A year or so ago we tried an RDBMS, and based on that
> > experience, and some comments from Doug Cutting, decided that an RDBMS
> > had no chance of being able to support this kind of functionality.
> >
> > In terms of performance parameters, the 200 rows/sec that was achieved
> > for the first 500K rows was quite sufficient. I don't have a good answer
> > because after all these rows get loaded there will be numerous
> > map/reduce jobs that execute on them. I would guess that some vague
> > parameters are:
> >
> > - In 3 days, load 100Gb of data representing 10M "units" split over 3
> > tables each of which is split over 3 column families. Some fraction of
> > these "units" will be replacements for existing ones (same key) some
> > will be new
> > - Several map-reduce jobs that mostly involve reading the data for each
> > "unit" and then writing a few small pieces of data (a few bytes) for
> > each "unit". Probably some more interesting maps too, but I don't know
> > yet.
> > - At least 2 map-reduce jobs that delete units.
> >   
> 
> These numbers look reasonable to me.  Let's try and make it work.
> > Am I correct when I say that using 4 region servers will just delay the
> > problem by a factor of 4, or have I misunderstood the underlying cause?
> >
> >   
> Yes.
> 
> The factor might be > 4 but effectively, if there's an issue using a single 
> server, then the same issue will arise with N nodes.
> 
> St.Ack


Re: Doubt in RegExpRowFilter and RowFilters in general

2008-02-11 Thread stack
Have you tried enabling DEBUG-level logging?  Filters have lots of 
logging around state changes.  It might help figure out this issue.  You might 
need to add extra logging around line #2401 in HStore.


(I just spent some time trying to bend my head around what's going on.  
Filters are run at the Store level.  It looks like, in 
RegExpRowFilter, a map is made on construction of column to value.  If 
the value matches, the filter returns false, so the cell should be added 
in each family.  I don't see anything obviously wrong in here.)
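The semantics described above can be sketched roughly as follows. This is illustrative only -- the class and method names are stand-ins, not the actual 2008 HBase filter API; the one idea it shows is that "filter returns false" means the cell is kept:

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

public class RowFilterSketch {
    // Built on construction, as stack describes: column -> expected value.
    private final Map<String, byte[]> expected = new HashMap<>();

    RowFilterSketch put(String column, byte[] value) {
        expected.put(column, value);
        return this;
    }

    /** Returns true when the cell should be filtered OUT (suppressed). */
    boolean filter(String column, byte[] value) {
        byte[] want = expected.get(column);
        // No condition registered for this column: let the cell through.
        // Otherwise suppress it only when the value does NOT match.
        return want != null && !Arrays.equals(want, value);
    }

    public static void main(String[] args) {
        RowFilterSketch f = new RowFilterSketch()
                .put("cf1:a", "myvalue1".getBytes());
        System.out.println(f.filter("cf1:a", "myvalue1".getBytes())); // false: matches, cell kept
        System.out.println(f.filter("cf1:a", "other".getBytes()));    // true: suppressed
    }
}
```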


St.Ack


David Alves wrote:

St.Ack

Thanks for your reply.

When I use RegExpRowFilter with only one (either one) of the conditions
it works (the rows are passed on to the Map/Reduce task), but there is
still a problem because only one of the columns is present in the
resulting MapWritable (I'm using my own TableInputFormat) from the
scanner.
So I still use the filter to check for more rows (build a scanner with
one of the conditions, the rarest one, and iterate through to try and find
the other) but not in the TableInputFormat itself (I just discard the
unwanted values in the Mapper), which is a performance hit (if it were
in the scanner the row simply wouldn't be sent to the master, right,
therefore less traffic in distributed mode?), but no big deal.
It seems to me that when the filter is applied only the column that
matches (or the one that doesn't match, I'm not sure at the moment) is
passed to the scanner result.

As to the second point I'm running HBase in local mode for development
and the DEBUG log for the HMaster shows nothing, my process simply hangs
indefinitely.

When I'll have some free time I'll try to look into the sources, and
pinpoint the problem more accurately.

David

On Mon, 2008-02-11 at 10:36 -0800, stack wrote:
  

David:

IMO, filters are a bit of sweet functionality but they are 
not easy to use.  They also have seen little exercise so you are 
probably tripping over bugs.  That said, I know they basically 
work.


I'd suggest you progress from basic filtering toward the filter you'd 
like to implement.   Does the RegExpRowFilter do the right thing when 
filtering one column only?


On the ClassNotFoundException, yeah, it should be coming out on the 
client.  Can you see it in the server logs?  Do you get any exceptions 
client-side?


St.Ack



David Alves wrote:


Hi Again

In my previous example I seem to have misplaced a "new" keyword (new
myvalue1.getBytes() where it should have been myvalue1.getBytes()).

On another note my program hangs when I supply my own filter to the
scanner (I suppose it's clear that the nodes don't know my class so
there should be a ClassNotFoundException right?).

Regards
David Alves 



On Mon, 2008-02-11 at 16:51 +, David Alves wrote: 
  
  

Hi Guys
In my previous email I might have misunderstood the roles of the
RowFilterInterfaces so I'll pose my question more clearly (since the
last one wasn't in question form :)).
I have a setup where a table has two columns belonging to different
column families (Table A: cf1:a, cf2:b).

I'm trying to build a filter so that a scanner only returns the rows
where cf1:a = myvalue1 and cf2:b = myvalue2.

I've built a RegExpRowFilter like this:
Map conditionalsMap = new HashMap();
conditionalsMap.put(new Text(cf1:a), new myvalue1.getBytes());
conditionalsMap.put(new Text(cf2:b), myvalue2.getBytes());
return new RegExpRowFilter(".*", conditionalsMap);

My problem is this filter always fails when I know for sure that there
are rows whose columns match my values.

I'm building the scanner like this (the purpose in this case is to
find if there are more values that match my filter):

final Text startKey = this.htable.getStartKeys()[0];
HScannerInterface scanner = htable.obtainScanner(new 
Text[] {new
Text(cf1:a), new Text(cf2:b)}, startKey, rowFilterInterface);
return scanner.iterator().hasNext();

Can anyone give me a hand please.

Thanks in advance
David Alves





  
  


  




Re: HBASE-138: Under load, regions become extremely large and eventually cause region servers to become unresponsive

2008-02-11 Thread stack

Marc Harris wrote:

Logs sent via yousendit.com.

  
Thanks for the logs.  I took a quick look.  Upload seems to be going 
along fine until we start getting the WrongRegionException.  In issue 
HBASE-428, you say your client is single-threaded.   Is it thick-headed 
too (smile) in that it unrelentingly keeps trying the same row over and 
over?  (The log seems to have a problem with the same row over and over again.)


Guessing as to what is up, either the client cache of regions is messed 
up or the .META. table has become corrupt somehow -- it doesn't have a 
list of all regions (perhaps it didn't get a split update or some such).


If the former, I wonder what would happen if you took your load off, 
killed the client, then resumed at the problematic row?  If things 
started to work again, that would seem to point at a client-side issue.



Maybe "re-architect" was not an accurate representation of what I am
doing. We currently do not have a solution that allows us to add rows to
our system in arbitrary order and then analyze them, either in order or
using map-reduce. A year or so ago we tried an RDBMS, and based on that
experience, and some comments from Doug Cutting, decided that an RDBMS
had no chance of being able to support this kind of functionality.

In terms of performance parameters, the 200 rows/sec that was achieved
for the first 500K rows was quite sufficient. I don't have a good answer
because after all these rows get loaded there will be numerous
map/reduce jobs that execute on them. I would guess that some vague
parameters are:

- In 3 days, load 100Gb of data representing 10M "units" split over 3
tables each of which is split over 3 column families. Some fraction of
these "units" will be replacements for existing ones (same key) some
will be new
- Several map-reduce jobs that mostly involve reading the data for each
"unit" and then writing a few small pieces of data (a few bytes) for
each "unit". Probably some more interesting maps too, but I don't know
yet.
- At least 2 map-reduce jobs that delete units.
  


These numbers look reasonable to me.  Let's try and make it work.

Am I correct when I say that using 4 region servers will just delay the
problem by a factor of 4, or have I misunderstood the underlying cause?

  

Yes.

The factor might be > 4 but effectively, if there's an issue using a single 
server, then the same issue will arise with N nodes.


St.Ack


Re: Doubt in RegExpRowFilter and RowFilters in general

2008-02-11 Thread David Alves
St.Ack

Thanks for your reply.

When I use RegExpRowFilter with only one (either one) of the conditions
it works (the rows are passed on to the Map/Reduce task), but there is
still a problem because only one of the columns is present in the
resulting MapWritable (I'm using my own TableInputFormat) from the
scanner.
So I still use the filter to check for more rows (build a scanner with
one of the conditions, the rarest one, and iterate through to try and find
the other) but not in the TableInputFormat itself (I just discard the
unwanted values in the Mapper), which is a performance hit (if it were
in the scanner the row simply wouldn't be sent to the master, right,
therefore less traffic in distributed mode?), but no big deal.
It seems to me that when the filter is applied only the column that
matches (or the one that doesn't match, I'm not sure at the moment) is
passed to the scanner result.

As to the second point I'm running HBase in local mode for development
and the DEBUG log for the HMaster shows nothing, my process simply hangs
indefinitely.

When I'll have some free time I'll try to look into the sources, and
pinpoint the problem more accurately.

David

On Mon, 2008-02-11 at 10:36 -0800, stack wrote:
> David:
> 
> IMO, filters are a bit of sweet functionality but they are 
> not easy to use.  They also have seen little exercise so you are 
> probably tripping over bugs.  That said, I know they basically 
> work.
> 
> I'd suggest you progress from basic filtering toward the filter you'd 
> like to implement.   Does the RegExpRowFilter do the right thing when 
> filtering one column only?
> 
> On the ClassNotFoundException, yeah, it should be coming out on the 
> client.  Can you see it in the server logs?  Do you get any exceptions 
> client-side?
> 
> St.Ack
> 
> 
> 
> David Alves wrote:
> > Hi Again
> >
> > In my previous example I seem to have misplaced a "new" keyword (new
> > myvalue1.getBytes() where it should have been myvalue1.getBytes()).
> >
> > On another note my program hangs when I supply my own filter to the
> > scanner (I suppose it's clear that the nodes don't know my class so
> > there should be a ClassNotFoundException right?).
> >
> > Regards
> > David Alves 
> >
> >
> > On Mon, 2008-02-11 at 16:51 +, David Alves wrote: 
> >   
> >> Hi Guys
> >>In my previous email I might have misunderstood the roles of the
> >> RowFilterInterfaces so I'll pose my question more clearly (since the
> >> last one wasn't in question form :)).
> >> I have a setup where a table has two columns belonging to different
> >> column families (Table A: cf1:a, cf2:b).
> >>
> >> I'm trying to build a filter so that a scanner only returns the rows
> >> where cf1:a = myvalue1 and cf2:b = myvalue2.
> >>
> >> I've built a RegExpRowFilter like this:
> >> Map conditionalsMap = new HashMap();
> >>conditionalsMap.put(new Text(cf1:a), new myvalue1.getBytes());
> >>conditionalsMap.put(new Text(cf2:b), myvalue2.getBytes());
> >>return new RegExpRowFilter(".*", conditionalsMap);
> >>
> >> My problem is this filter always fails when I know for sure that there
> >> are rows whose columns match my values.
> >>
> >> I'm building the scanner like this (the purpose in this case is to
> >> find if there are more values that match my filter):
> >>
> >> final Text startKey = this.htable.getStartKeys()[0];
> >>HScannerInterface scanner = htable.obtainScanner(new 
> >> Text[] {new
> >> Text(cf1:a), new Text(cf2:b)}, startKey, rowFilterInterface);
> >>return scanner.iterator().hasNext();
> >>
> >> Can anyone give me a hand please.
> >>
> >> Thanks in advance
> >> David Alves
> >>
> >>
> >>
> >> 
> >
> >   
> 



Re: Doubt in RegExpRowFilter and RowFilters in general

2008-02-11 Thread stack

David:

IMO, filters are a bit of sweet functionality but they are 
not easy to use.  They also have seen little exercise so you are 
probably tripping over bugs.  That said, I know they basically 
work.


I'd suggest you progress from basic filtering toward the filter you'd 
like to implement.   Does the RegExpRowFilter do the right thing when 
filtering one column only?


On the ClassNotFoundException, yeah, it should be coming out on the 
client.  Can you see it in the server logs?  Do you get any exceptions 
client-side?


St.Ack



David Alves wrote:

Hi Again

In my previous example I seem to have misplaced a "new" keyword (new
myvalue1.getBytes() where it should have been myvalue1.getBytes()).

On another note my program hangs when I supply my own filter to the
scanner (I suppose it's clear that the nodes don't know my class so
there should be a ClassNotFoundException right?).

Regards
David Alves 



On Mon, 2008-02-11 at 16:51 +, David Alves wrote: 
  

Hi Guys
In my previous email I might have misunderstood the roles of the
RowFilterInterfaces so I'll pose my question more clearly (since the
last one wasn't in question form :)).
I have a setup where a table has two columns belonging to different
column families (Table A: cf1:a, cf2:b).

I'm trying to build a filter so that a scanner only returns the rows
where cf1:a = myvalue1 and cf2:b = myvalue2.

I've built a RegExpRowFilter like this:
Map conditionalsMap = new HashMap();
conditionalsMap.put(new Text(cf1:a), new myvalue1.getBytes());
conditionalsMap.put(new Text(cf2:b), myvalue2.getBytes());
return new RegExpRowFilter(".*", conditionalsMap);

My problem is this filter always fails when I know for sure that there
are rows whose columns match my values.

I'm building the scanner like this (the purpose in this case is to
find if there are more values that match my filter):

final Text startKey = this.htable.getStartKeys()[0];
HScannerInterface scanner = htable.obtainScanner(new 
Text[] {new
Text(cf1:a), new Text(cf2:b)}, startKey, rowFilterInterface);
return scanner.iterator().hasNext();

Can anyone give me a hand please.

Thanks in advance
David Alves






  




Re: Doubt in RegExpRowFilter and RowFilters in general

2008-02-11 Thread David Alves
Hi Again

In my previous example I seem to have misplaced a "new" keyword (new
myvalue1.getBytes() where it should have been myvalue1.getBytes()).

On another note my program hangs when I supply my own filter to the
scanner (I suppose it's clear that the nodes don't know my class so
there should be a ClassNotFoundException right?).

Regards
David Alves 


On Mon, 2008-02-11 at 16:51 +, David Alves wrote: 
> Hi Guys
>   In my previous email I might have misunderstood the roles of the
> RowFilterInterfaces so I'll pose my question more clearly (since the
> last one wasn't in question form :)).
> I have a setup where a table has two columns belonging to different
> column families (Table A: cf1:a, cf2:b).
> 
> I'm trying to build a filter so that a scanner only returns the rows
> where cf1:a = myvalue1 and cf2:b = myvalue2.
> 
> I've built a RegExpRowFilter like this:
> Map conditionalsMap = new HashMap();
>   conditionalsMap.put(new Text(cf1:a), new myvalue1.getBytes());
>   conditionalsMap.put(new Text(cf2:b), myvalue2.getBytes());
>   return new RegExpRowFilter(".*", conditionalsMap);
> 
> My problem is this filter always fails when I know for sure that there
> are rows whose columns match my values.
> 
> I'm building the scanner like this (the purpose in this case is to
> find if there are more values that match my filter):
> 
> final Text startKey = this.htable.getStartKeys()[0];
>   HScannerInterface scanner = htable.obtainScanner(new 
> Text[] {new
> Text(cf1:a), new Text(cf2:b)}, startKey, rowFilterInterface);
>   return scanner.iterator().hasNext();
> 
> Can anyone give me a hand please.
> 
> Thanks in advance
> David Alves
> 
> 
> 



Doubt in RegExpRowFilter and RowFilters in general

2008-02-11 Thread David Alves
Hi Guys
In my previous email I might have misunderstood the roles of the
RowFilterInterfaces so I'll pose my question more clearly (since the
last one wasn't in question form :)).
I have a setup where a table has two columns belonging to different
column families (Table A: cf1:a, cf2:b).

I'm trying to build a filter so that a scanner only returns the rows
where cf1:a = myvalue1 and cf2:b = myvalue2.

I've built a RegExpRowFilter like this:
Map conditionalsMap = new HashMap();
conditionalsMap.put(new Text(cf1:a), new myvalue1.getBytes());
conditionalsMap.put(new Text(cf2:b), myvalue2.getBytes());
return new RegExpRowFilter(".*", conditionalsMap);

My problem is this filter always fails when I know for sure that there
are rows whose columns match my values.

I'm building the scanner like this (the purpose in this case is to
find if there are more values that match my filter):

final Text startKey = this.htable.getStartKeys()[0];
HScannerInterface scanner = htable.obtainScanner(new 
Text[] {new
Text(cf1:a), new Text(cf2:b)}, startKey, rowFilterInterface);
return scanner.iterator().hasNext();
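For what it's worth, the snippets above won't compile as written: the column names and values need to be string literals, and the stray `new` (which David corrects in a follow-up) is invalid Java. A self-contained sketch of just the conditionals map, using plain String keys in place of Hadoop's `Text` so it runs without the HBase jars:

```java
import java.util.HashMap;
import java.util.Map;

public class FilterConditionals {
    // Column name -> expected cell value, as in the thread's snippet.
    // Plain String keys stand in for Hadoop's Text so this runs standalone.
    static Map<String, byte[]> buildConditionals() {
        Map<String, byte[]> conditionalsMap = new HashMap<>();
        // Note: "myvalue1".getBytes() -- the original's `new myvalue1.getBytes()`
        // is not valid Java, as David notes in his follow-up.
        conditionalsMap.put("cf1:a", "myvalue1".getBytes());
        conditionalsMap.put("cf2:b", "myvalue2".getBytes());
        return conditionalsMap;
    }

    public static void main(String[] args) {
        Map<String, byte[]> m = buildConditionals();
        System.out.println(m.size());                   // 2
        System.out.println(new String(m.get("cf1:a"))); // myvalue1
    }
}
```

With the real API, the map (keyed by `Text`) would then be handed to `new RegExpRowFilter(".*", conditionalsMap)` as in the original mail.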

Can anyone give me a hand please.

Thanks in advance
David Alves





Re: Evaluating HBase 3

2008-02-11 Thread Charles Kaminski
St.Ack and Bryan,

Turns out it was inconsistent testing on our part. 
When we tested with HBase Shell on the server and got
similar results, we thought we were ruling out any
issues with machines connecting to the cluster. 

The posts questioning HBase Shell as a good test
prompted us to go back and take a more in-depth review.

Thanks again!

--- stack <[EMAIL PROTECTED]> wrote:

> Let's try and figure out what's going on, Charles.
> 
> The figures on the end of this page have us random
> reading bigger values out of a table of 1M rows at
> somewhere between 150 and 300 rows a second, depending
> on hbase version (What's your version?)
> 
> Want to send us the code your java apps are using to
> access hbase so we 
> can check it out?
> 
> Thanks,
> St.Ack
> 
> 
> Charles Kaminski wrote:
> > Hi St.Ack,
> >
> > Thanks for the response.  The performance changes
> > below are consistent with what we find in our java
> > app.  We used HBase Shell directly on the server to
> > rule out anything we might be doing wrong.
> >
> > --- stack <[EMAIL PROTECTED]> wrote:
> >
> >> You are using the shell to do your fetching?  Try
> >> writing a little java program.
> >> St.Ack
> >>
> >> Charles Kaminski wrote:
> >>> Hi All,
> >>>
> >>> We're running into severe performance issues.  I'm
> >>> hoping that there is something simple we can do to
> >>> resolve the issues.  Any help would be appreciated.
> >>>
> >>> Here's what we did:
> >>> 1. Loaded 1,000 records into a table with only two
> >>> columns - row and content:.  Row data is 12 bytes
> >>> and content: data is 23 bytes long.
> >>> 2. Using HBase, selected a single record based on
> >>> row in the where clause.  Did this for a few
> >>> different records.  Performance was consistently
> >>> 0.01 seconds as reported by HBase.
> >>> 3. Loaded 1,000,000 records into the same table.
> >>> This took 248 seconds using random row values.
> >>> 4. Ran the exact same select statements again as in
> >>> step 2.  These consistently took 2 to 3 seconds to
> >>> return a single record.
> >>>
> >>> 2 to 3 seconds to return a single record using a
> >>> key value suggests a major issue with our setup.
> >>> I'm hoping you agree and can point us to something
> >>> we're doing wrong.
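The benchmark quoted above boils down to timing single-row fetches before and after a bulk load. A generic harness for that measurement might look like the following sketch (`fetchByRow` is a stand-in for whatever client call is being timed, since the 2008 client API isn't shown here):

```java
import java.util.function.Function;

public class ReadLatency {
    // Average seconds per single-row fetch over the given keys.
    static double avgSeconds(Function<String, byte[]> fetchByRow, String[] rows) {
        long start = System.nanoTime();
        for (String row : rows) {
            fetchByRow.apply(row); // stand-in for the timed get/select
        }
        return (System.nanoTime() - start) / 1e9 / rows.length;
    }

    public static void main(String[] args) {
        // Dummy fetch so the sketch runs standalone; a real run would wrap
        // the client call whose latency is being measured.
        Function<String, byte[]> dummy = row -> row.getBytes();
        System.out.println(avgSeconds(dummy, new String[] {"a", "b", "c"}) >= 0);
    }
}
```

Running the same harness at 1K rows and again at 1M rows would make the 0.01s-to-2-3s degradation directly comparable across clients (shell vs. java app).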
> 



  
