Re: Suspected memory leak

2011-12-01 Thread bijieshan
Thank you all. 
I think it is the same problem as the one in the link provided by Stack, because the 
heap size is stable while the non-heap size keeps growing. So I don't think it is the 
CMS GC bug. 
We also know the content of the problematic memory section; all the records 
contain info like the following:
"|www.hostname02087075.comlhggmdjapwpfvkqvxgnskzzydiywoacjnpljkarlehrnzzbpbxc||460|||Agent"
"BBZHtable_UFDR_058,048342220093168-02570"


Jieshan.

-----Original Message-----
From: Kihwal Lee [mailto:kih...@yahoo-inc.com] 
Sent: December 2, 2011 4:20
To: d...@hbase.apache.org
Cc: Ramakrishna s vasudevan; user@hbase.apache.org
Subject: Re: Suspected memory leak

Adding to the excellent write-up by Jonathan:
Since a finalizer is involved, it takes two GC cycles to collect these objects.  Due to a 
bug/bugs in the CMS GC, collection may not happen and the heap can grow really 
big.  See http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=7112034 for 
details.

Koji tried "-XX:-CMSConcurrentMTEnabled" and confirmed that all the socket 
related objects were being collected properly. This option forces the 
concurrent marker to be one thread. This was for HDFS, but I think the same 
applies here.

Kihwal

On 12/1/11 1:26 PM, "Stack"  wrote:

Make sure it's not the issue that Jonathan Payne identified a while
back: 
https://groups.google.com/group/asynchbase/browse_thread/thread/c45bc7ba788b2357#
St.Ack



Re: Problem in configuring pseudo distributed mode

2011-12-01 Thread Mohammad Tariq
Hello Christopher,

I don't have 127.0.1.1 in my hosts file.

Regards,
    Mohammad Tariq



On Thu, Dec 1, 2011 at 6:42 PM, Christopher Dorner
 wrote:
> Your hosts file (/etc/hosts) should contain only something like
> 127.0.0.1 localhost
> Or
> 127.0.0.1 
>
> It should not contain something like
> 127.0.1.1 localhost.
>
> And I think you need to reboot after changing it. Hope that helps.
>
> Regards,
> Christopher
> Am 01.12.2011 13:24 schrieb "Mohammad Tariq" :
>
>> Hello list,
>>
>>    Even after following the directions provided by you guys, the HBase
>> book, and several other blogs and posts, I am not able to run HBase in
>> pseudo-distributed mode. I think there is some problem with the
>> hosts file. I would highly appreciate it if someone who has done it
>> properly could share his/her hosts and hbase-site.xml files.
>>
>> Regards,
>>     Mohammad Tariq
>>


OLAP-ish incremental BI capabilities for hbase, looking for collaborators.

2011-12-01 Thread Dmitriy Lyubimov
Hello,

We (Inadco) are looking for users and developers to engage in our open
project code named, for lack of a better name, "hbase-lattice", in
order to mutually benefit and eventually develop a mature HBase-based,
real-time, OLAP-ish BI solution.

The basic premise is to use a cuboid-lattice-like model (hence the
name) to precompute manually specified cuboids (as opposed to
statistically inferred ones) and then build query optimization based on
those, similar to how a DBA analyzes query use cases and decides
which indexes to build where. On top of that, we threw in both a
declarative API and query language support (our reporting system and
real-time analytics platform actually use the query language, to
make life easier for our developers there).

We aim to answer aggregate queries over a cube of facts with very short
HBase scans over the corresponding projection table, so an analytical reply can
be produced in under 1 ms plus network overhead.

Another goal is short latency for fact availability, so we employ
incremental MR fact compilation (on average ~3-5 minutes after the fact event,
depending on the number of cuboids and the cluster size).

At this point we have the project in production with the minimum
capabilities we need, still in a kind of pilot mode, but it is to
replace our previous manual incremental projection work on a permanent
basis very soon (as soon as we finish migrating our reporting).
There is a long to-do list, and any given third party
would likely find gaps it would have liked to see filled.

Right now we only have integration with the JasperReports tool, but
eventually it shouldn't be too difficult to wrap the client in a JDBC
contract as well and enable practically any use (it's just that, since we
tightly integrate both the reporting and RT platforms, we don't have a need
for JDBC per se at the moment).

The project readme document is here:
https://github.com/dlyubimov/HBase-Lattice/blob/dev-0.1.x/docs/readme.pdf?raw=true
or here 
https://github.com/inadco/HBase-Lattice/blob/dev-0.1.x/docs/readme.pdf?raw=true

Aside from the capabilities mentioned in this document, we also support
custom aggregate functions which we can plug directly into the model
definition (so, as it stands, we can develop a custom aggregate function set
rather quickly and easily).

The cube update is incremental, via a generated two-step (sequential)
Pig job (so the compiler cycle could be ~5 minutes after the
actual event). We process and aggregate our impression, click, and
other fact streams with it. The model can be updated under
backward-compatibility conventions similar to protobuf's (as
in, add stuff, but don't change types), and the changes can be pushed
into the production system to take effect immediately henceforth without
any need to redeploy any code.

Operationally, I tested it on rather wide event fact streams of 1.2 billion
events/day, packed as protobuf messages inside sequence files, on 6 nodes,
and the compilation was not even breaking a sweat. Obviously, my data
aggregates heavily over the time dimensions that correspond to the time of the
event, so the HBase update load is actually pretty light due to the high
degree of aggregation. But the biggest benefit is that one can scale
the number of facts handled per unit of time horizontally pretty
impressively.

This is optimized for time-series data for the most part, so one will
see very limited and time-oriented support for
dimension types and hierarchies at the moment.

Generally I think the need for BI solutions over big-data time series
such as impression or request logs is very common, but surprisingly I
did not find a well-maintained HBase solution for it (although I did
see either stale or less capable attempts out there -- I have certainly
missed stuff floating around), hence this project.

We are planning to maintain the project for a long time as a part of
our production system. Please email me if there's interest, as either a
user or a collaborator. I think I saw a couple of emails on this list
looking for a solution to a similar problem.

This is partly inspired by and intended as a complementary solution to
Tsuna's Open TSDB (so big thanks to StumbleUpon's people for an
example of how to handle time series data).

Thanks.
-Dmitriy


Re: Atomicity questions

2011-12-01 Thread lars hofhansl
ZK is mostly for orchestrating between the master and regionservers.



- Original Message -
From: Mohit Anchlia 
To: user@hbase.apache.org; lars hofhansl 
Cc: 
Sent: Thursday, December 1, 2011 3:57 PM
Subject: Re: Atomicity questions

Thanks, that makes it clearer. I also looked at the mvcc code as you pointed out.

So I am wondering where ZK is used specifically.

On Thu, Dec 1, 2011 at 3:37 PM, lars hofhansl  wrote:
> Nope, not using ZK, that would not scale down to the cell level.
> You'll probably have to stare at the code in
> MultiVersionConsistencyControl for a while (I know I had to).
>
> The basic flow of a write operation is this:
> 1. lock the row
>
> 2. persist change to the write ahead log
> 3. get a "writenumber" from mvcc (this is basically a timestamp)
>
> 4. apply change to the memstore (using that write number).
> 5. advance the readpoint (maximum timestamp of changes that reads will see) 
> -- this is the point where readers see the change
> 6. unlock the row
>
> (7. when memstore is full, flush it to a new disk file, but is done 
> asynchronously, and not really important, although it has some complicated 
> implications when the flush happens while there are readers reading from an 
> old read point)
>
>
> The above is relaxed sometimes for idempotent operations.
>
> -- Lars
>
>
> - Original Message -
> From: Mohit Anchlia 
> To: user@hbase.apache.org; lars hofhansl 
> Cc:
> Sent: Thursday, December 1, 2011 3:03 PM
> Subject: Re: Atomicity questions
>
> Thanks. I'll try and take a look, but I haven't worked with zookeeper
> before. Does it use zookeeper for any of ACID functionality?
>
> On Thu, Dec 1, 2011 at 2:55 PM, lars hofhansl  wrote:
>> Hi Mohit,
>>
>> the best way to study this is to look at MultiVersionConsistencyControl.java
>> (since you are asking how this is handled internally).
>>
>> In a nutshell this ensures that read operations don't see writes that are
>> not completed, by (1) defining a thread read point that is rolled forward
>> only after a completed operation and (2) assigning a special timestamp (not
>> the timestamp that you set from the client API) to all KeyValues.
>>
>> -- Lars
>>
>>
>> - Original Message -
>> From: Mohit Anchlia 
>> To: user@hbase.apache.org
>> Cc:
>> Sent: Thursday, December 1, 2011 2:22 PM
>> Subject: Atomicity questions
>>
>> I have some questions about ACID after reading this page,
>> http://hbase.apache.org/acid-semantics.html
>>
>> - Atomicity point 5 : row must either be "a=1,b=1,c=1" or
>> "a=2,b=2,c=2" and must not be something like "a=1,b=2,c=1".
>>
>> How is this internally handled in hbase such that above is possible?
>>
>>
>
>



Re: Atomicity questions

2011-12-01 Thread Mohit Anchlia
Thanks, that makes it clearer. I also looked at the mvcc code as you pointed out.

So I am wondering where ZK is used specifically.

On Thu, Dec 1, 2011 at 3:37 PM, lars hofhansl  wrote:
> Nope, not using ZK, that would not scale down to the cell level.
> You'll probably have to stare at the code in
> MultiVersionConsistencyControl for a while (I know I had to).
>
> The basic flow of a write operation is this:
> 1. lock the row
>
> 2. persist change to the write ahead log
> 3. get a "writenumber" from mvcc (this is basically a timestamp)
>
> 4. apply change to the memstore (using that write number).
> 5. advance the readpoint (maximum timestamp of changes that reads will see) 
> -- this is the point where readers see the change
> 6. unlock the row
>
> (7. when memstore is full, flush it to a new disk file, but is done 
> asynchronously, and not really important, although it has some complicated 
> implications when the flush happens while there are readers reading from an 
> old read point)
>
>
> The above is relaxed sometimes for idempotent operations.
>
> -- Lars
>
>
> - Original Message -
> From: Mohit Anchlia 
> To: user@hbase.apache.org; lars hofhansl 
> Cc:
> Sent: Thursday, December 1, 2011 3:03 PM
> Subject: Re: Atomicity questions
>
> Thanks. I'll try and take a look, but I haven't worked with zookeeper
> before. Does it use zookeeper for any of ACID functionality?
>
> On Thu, Dec 1, 2011 at 2:55 PM, lars hofhansl  wrote:
>> Hi Mohit,
>>
>> the best way to study this is to look at MultiVersionConsistencyControl.java
>> (since you are asking how this is handled internally).
>>
>> In a nutshell this ensures that read operations don't see writes that are
>> not completed, by (1) defining a thread read point that is rolled forward
>> only after a completed operation and (2) assigning a special timestamp (not
>> the timestamp that you set from the client API) to all KeyValues.
>>
>> -- Lars
>>
>>
>> - Original Message -
>> From: Mohit Anchlia 
>> To: user@hbase.apache.org
>> Cc:
>> Sent: Thursday, December 1, 2011 2:22 PM
>> Subject: Atomicity questions
>>
>> I have some questions about ACID after reading this page,
>> http://hbase.apache.org/acid-semantics.html
>>
>> - Atomicity point 5 : row must either be "a=1,b=1,c=1" or
>> "a=2,b=2,c=2" and must not be something like "a=1,b=2,c=1".
>>
>> How is this internally handled in hbase such that above is possible?
>>
>>
>
>


Re: Atomicity questions

2011-12-01 Thread lars hofhansl
Nope, not using ZK, that would not scale down to the cell level.
You'll probably have to stare at the code in MultiVersionConsistencyControl for
a while (I know I had to).

The basic flow of a write operation is this:
1. lock the row

2. persist change to the write ahead log
3. get a "writenumber" from mvcc (this is basically a timestamp)

4. apply change to the memstore (using that write number).
5. advance the readpoint (maximum timestamp of changes that reads will see) -- 
this is the point where readers see the change
6. unlock the row

(7. when memstore is full, flush it to a new disk file, but is done 
asynchronously, and not really important, although it has some complicated 
implications when the flush happens while there are readers reading from an old 
read point)


The above is relaxed sometimes for idempotent operations.
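
As an illustration of what this write path buys the client: because steps 1-6
happen under one row lock and one mvcc write number, all columns placed into a
single Put on a single row become visible together. A minimal sketch with the
0.90-era Java client follows; the table name "t1", family "f" and the values
are made-up assumptions, not something from this thread.

// Sketch: a single Put covering several columns of one row is applied
// atomically, so readers see either all of a=2,b=2,c=2 or none of it.
// Table 't1' and family 'f' are assumed to exist.
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.util.Bytes;

public class AtomicRowPut {
  public static void main(String[] args) throws IOException {
    Configuration conf = HBaseConfiguration.create();
    HTable table = new HTable(conf, "t1");
    Put put = new Put(Bytes.toBytes("row1"));
    put.add(Bytes.toBytes("f"), Bytes.toBytes("a"), Bytes.toBytes("2"));
    put.add(Bytes.toBytes("f"), Bytes.toBytes("b"), Bytes.toBytes("2"));
    put.add(Bytes.toBytes("f"), Bytes.toBytes("c"), Bytes.toBytes("2"));
    table.put(put);  // one row lock, one WAL entry, one mvcc write number
    table.close();
  }
}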

-- Lars


- Original Message -
From: Mohit Anchlia 
To: user@hbase.apache.org; lars hofhansl 
Cc: 
Sent: Thursday, December 1, 2011 3:03 PM
Subject: Re: Atomicity questions

Thanks. I'll try and take a look, but I haven't worked with zookeeper
before. Does it use zookeeper for any of ACID functionality?

On Thu, Dec 1, 2011 at 2:55 PM, lars hofhansl  wrote:
> Hi Mohit,
>
> the best way to study this is to look at MultiVersionConsistencyControl.java
> (since you are asking how this is handled internally).
>
> In a nutshell this ensures that read operations don't see writes that are not
> completed, by (1) defining a thread read point that is rolled forward only
> after a completed operation and (2) assigning a special timestamp (not the
> timestamp that you set from the client API) to all KeyValues.
>
> -- Lars
>
>
> - Original Message -
> From: Mohit Anchlia 
> To: user@hbase.apache.org
> Cc:
> Sent: Thursday, December 1, 2011 2:22 PM
> Subject: Atomicity questions
>
> I have some questions about ACID after reading this page,
> http://hbase.apache.org/acid-semantics.html
>
> - Atomicity point 5 : row must either be "a=1,b=1,c=1" or
> "a=2,b=2,c=2" and must not be something like "a=1,b=2,c=1".
>
> How is this internally handled in hbase such that above is possible?
>
>



Re: regions and tables

2011-12-01 Thread Jean-Daniel Cryans
Excellent question.

I would say that if you are planning to have thousands of tables with
the same schema then instead you should use one table with prefixed
rows.
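
A minimal sketch of that prefixing approach with the Java client; the table
name "data", family "t", and the tenant prefix are made-up assumptions for
illustration only.

// Sketch: emulate many same-schema tables with one shared table by putting
// a tenant/table prefix at the front of the row key. All names are assumed.
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.util.Bytes;

public class PrefixedRows {
  public static void main(String[] args) throws IOException {
    Configuration conf = HBaseConfiguration.create();
    HTable table = new HTable(conf, "data");
    // write: row key = <tenant>|<original key>
    Put put = new Put(Bytes.toBytes("tenant42|user-0001"));
    put.add(Bytes.toBytes("t"), Bytes.toBytes("v"), Bytes.toBytes("x"));
    table.put(put);
    // read one "virtual table" back with a bounded scan over its prefix
    // ('}' is the byte right after '|', so it closes the prefix range)
    Scan scan = new Scan(Bytes.toBytes("tenant42|"), Bytes.toBytes("tenant42}"));
    ResultScanner rs = table.getScanner(scan);
    for (Result r : rs) {
      System.out.println(Bytes.toString(r.getRow()));
    }
    rs.close();
    table.close();
  }
}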

The 20 regions / region server is a general guideline that works best
in the single tenant case, meaning that you have only 1 table and it's
perfectly distributed. My first answer brings you back to that form.

In the multi-tenant case where every table is different not only in
the nature of the data they contain but also in their usage patterns,
the answer basically is YMMV. There really is no universal answer at
the moment. At SU we have >250 tables and we have ~200 regions per
region server, and it works well for us.

J-D

On Thu, Dec 1, 2011 at 12:26 PM, Sam Seigal  wrote:
> So is it fair to say that the number of tables one can create is also
> bounded by the number of regions that the cluster can support ?
>
> For example, given 5 region servers and keeping 20 regions / region
> server - with 5 tables, I am restricted to only being able to scale a
> single table to 20 regions across the cluster - this might be fine.
> However, for 20 tables, I can only scale up to 5 regions / table across
> the cluster - which might not be a good idea.  Comments ?
>
>
> On Thu, Dec 1, 2011 at 5:31 AM, Doug Meil  
> wrote:
>> To expand on what Lars said, there is an example of how this is laid out
>> on disk...
>>
>> http://hbase.apache.org/book.html#trouble.namenode.disk
>>
>> ... regions distribute the table, so two different tables will be
>> distributed by separate sets of regions.
>>
>>
>>
>>
>> On 12/1/11 3:14 AM, "Lars George"  wrote:
>>
>>>Hi Sam,
>>>
>>>You need to handle them all separately. The note - I assume - was solely
>>>explaining the fact that the "load" of a region server is defined by the
>>>number of regions it hosts, not the number of tables. If you want to
>>>precreate the regions for one or more than one table is the same work:
>>>create the tables (one by one) with the list of split points.
>>>
>>>Lars
>>>
>>>On Dec 1, 2011, at 7:50 AM, Sam Seigal wrote:
>>>
 HI,

 I had a question about the relationship  between regions and tables.

 Is there a way to pre-create regions for multiple tables ? or each
 table has its own set of regions managed independently ?

 I read on one of the threads that there is really no limit on the
 number of tables, but that we need to be careful about is the number
 of regions. Does this mean that the regions can be pre created for
 multiple tables ?

 Thank you,

 Sam
>>>
>>>
>>
>>


Re: Atomicity questions

2011-12-01 Thread Stack
On Thu, Dec 1, 2011 at 3:03 PM, Mohit Anchlia  wrote:
> Thanks. I'll try and take a look, but I haven't worked with zookeeper
> before. Does it use zookeeper for any of ACID functionality?
>

No.
St.Ack


Re: hbase sandbox at ImageShack.

2011-12-01 Thread Stack
On Thu, Dec 1, 2011 at 2:34 PM, Jack Levin  wrote:
> Hello All.   I've setup an hbase (0.90.4) sandbox running on servers
> where we have some excess capacity.  Feel free to play with it, e.g.
> create tables, run load tests, benchmarks, essentially do whatever you
> want, just don't put your production services there, because while we
> do have it up due to excess capacity, we may have to reclaim the
> hardware at some point.
> (don't worry about slamming it hard, those servers are running on
> non-production zone of our network).
>


Nice one Jack.  Thanks.
St.Ack


Re: Atomicity questions

2011-12-01 Thread Mohit Anchlia
Thanks. I'll try and take a look, but I haven't worked with zookeeper
before. Does it use zookeeper for any of ACID functionality?

On Thu, Dec 1, 2011 at 2:55 PM, lars hofhansl  wrote:
> Hi Mohit,
>
> the best way to study this is to look at MultiVersionConsistencyControl.java
> (since you are asking how this is handled internally).
>
> In a nutshell this ensures that read operations don't see writes that are not
> completed, by (1) defining a thread read point that is rolled forward only
> after a completed operation and (2) assigning a special timestamp (not the
> timestamp that you set from the client API) to all KeyValues.
>
> -- Lars
>
>
> - Original Message -
> From: Mohit Anchlia 
> To: user@hbase.apache.org
> Cc:
> Sent: Thursday, December 1, 2011 2:22 PM
> Subject: Atomicity questions
>
> I have some questions about ACID after reading this page,
> http://hbase.apache.org/acid-semantics.html
>
> - Atomicity point 5 : row must either be "a=1,b=1,c=1" or
> "a=2,b=2,c=2" and must not be something like "a=1,b=2,c=1".
>
> How is this internally handled in hbase such that above is possible?
>
>


Re: Atomicity questions

2011-12-01 Thread lars hofhansl
Hi Mohit,

the best way to study this is to look at MultiVersionConsistencyControl.java
(since you are asking how this is handled internally).

In a nutshell this ensures that read operations don't see writes that are not
completed, by (1) defining a thread read point that is rolled forward only
after a completed operation and (2) assigning a special timestamp (not the
timestamp that you set from the client API) to all KeyValues.

-- Lars


- Original Message -
From: Mohit Anchlia 
To: user@hbase.apache.org
Cc: 
Sent: Thursday, December 1, 2011 2:22 PM
Subject: Atomicity questions

I have some questions about ACID after reading this page,
http://hbase.apache.org/acid-semantics.html

- Atomicity point 5 : row must either be "a=1,b=1,c=1" or
"a=2,b=2,c=2" and must not be something like "a=1,b=2,c=1".

How is this internally handled in hbase such that above is possible?



hbase sandbox at ImageShack.

2011-12-01 Thread Jack Levin
Hello All.   I've set up an HBase (0.90.4) sandbox running on servers
where we have some excess capacity.  Feel free to play with it, e.g.
create tables, run load tests, benchmarks, essentially do whatever you
want, just don't put your production services there, because while we
do have it up due to excess capacity, we may have to reclaim the
hardware at some point.
(Don't worry about slamming it hard; those servers are running in a
non-production zone of our network.)

To create a client that can interface with the cluster, just download
hbase 0.90.4, and compile your code against the jars that come with it
(this applies to the client below on Pastebin).

I've created a very simple benchmark (proof of concept client) and
installed it in micro EC2 instance (you can find the code here
http://pastebin.com/BxSs2daY):

[ec2-user@ip-10-160-135-246 ~]$ java Hello  2>> /dev/null
Enter the number of rows you want to Put and Get : 10000
Enter the row value payload : 12345
Writing 10000 rows took 41470 milliseconds (<--- 4.1 ms per row, not too bad!)
Reading 10000 rows took 43731 milliseconds

[ec2-user@ip-10-160-135-246 ~]$ hbase/bin/hbase shell
HBase Shell; enter 'help' for list of supported commands.
Type "exit" to leave the HBase Shell
Version 0.90.4, r1150278, Sun Jul 24 15:53:29 PDT 2011

hbase(main):001:0> list
TABLE
myTable
1 row(s) in 0.9190 seconds

hbase(main):002:0>

The HBase ZooKeeper quorum has only one address,
"img700.imageshack.us:2181"; see the code above for how to interface with
it.
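
A minimal client along those lines could look like the sketch below. The
quorum host, client port and the "myTable" name mirror what is quoted in this
message; the column family "f", qualifier "q" and row key are assumptions.

// Sketch of a 0.90.4-style client pointed at the sandbox's single-node
// ZooKeeper quorum. Assumes "myTable" has a column family "f".
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.util.Bytes;

public class SandboxHello {
  public static void main(String[] args) throws IOException {
    Configuration conf = HBaseConfiguration.create();
    conf.set("hbase.zookeeper.quorum", "img700.imageshack.us");
    conf.set("hbase.zookeeper.property.clientPort", "2181");
    HTable table = new HTable(conf, "myTable");
    Put put = new Put(Bytes.toBytes("row1"));
    put.add(Bytes.toBytes("f"), Bytes.toBytes("q"), Bytes.toBytes("12345"));
    table.put(put);
    Result r = table.get(new Get(Bytes.toBytes("row1")));
    System.out.println(Bytes.toString(
        r.getValue(Bytes.toBytes("f"), Bytes.toBytes("q"))));
    table.close();
  }
}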

If you find this setup interesting or useful, or have questions about it,
please email me; otherwise have fun!

-Jack

(PS.  Don't delete other people's tables and don't expose data you
don't want to be exposed, the cluster is read/write enabled for _all_,
we will carry no liabilities for anything whatsoever :)


Re: Suspected memory leak

2011-12-01 Thread Kihwal Lee
Adding to the excellent write-up by Jonathan:
Since a finalizer is involved, it takes two GC cycles to collect these objects.  Due to a 
bug/bugs in the CMS GC, collection may not happen and the heap can grow really 
big.  See http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=7112034 for 
details.

Koji tried "-XX:-CMSConcurrentMTEnabled" and confirmed that all the socket 
related objects were being collected properly. This option forces the 
concurrent marker to be one thread. This was for HDFS, but I think the same 
applies here.

Kihwal

On 12/1/11 1:26 PM, "Stack"  wrote:

Make sure it's not the issue that Jonathan Payne identified a while
back: 
https://groups.google.com/group/asynchbase/browse_thread/thread/c45bc7ba788b2357#
St.Ack



Atomicity questions

2011-12-01 Thread Mohit Anchlia
I have some questions about ACID after reading this page,
http://hbase.apache.org/acid-semantics.html

- Atomicity point 5 : row must either be "a=1,b=1,c=1" or
"a=2,b=2,c=2" and must not be something like "a=1,b=2,c=1".

How is this handled internally in HBase such that the above is possible?


Re: regions and tables

2011-12-01 Thread Sam Seigal
So is it fair to say that the number of tables one can create is also
bounded by the number of regions that the cluster can support ?

For example, given 5 region servers and keeping 20 regions / region
server - with 5 tables, I am restricted to only being able to scale a
single table to 20 regions across the cluster - this might be fine.
However, for 20 tables, I can only scale up to 5 regions / table across
the cluster - which might not be a good idea.  Comments ?


On Thu, Dec 1, 2011 at 5:31 AM, Doug Meil  wrote:
> To expand on what Lars said, there is an example of how this is laid out
> on disk...
>
> http://hbase.apache.org/book.html#trouble.namenode.disk
>
> ... regions distribute the table, so two different tables will be
> distributed by separate sets of regions.
>
>
>
>
> On 12/1/11 3:14 AM, "Lars George"  wrote:
>
>>Hi Sam,
>>
>>You need to handle them all separately. The note - I assume - was solely
>>explaining the fact that the "load" of a region server is defined by the
>>number of regions it hosts, not the number of tables. If you want to
>>precreate the regions for one or more than one table is the same work:
>>create the tables (one by one) with the list of split points.
>>
>>Lars
>>
>>On Dec 1, 2011, at 7:50 AM, Sam Seigal wrote:
>>
>>> HI,
>>>
>>> I had a question about the relationship  between regions and tables.
>>>
>>> Is there a way to pre-create regions for multiple tables ? or each
>>> table has its own set of regions managed independently ?
>>>
>>> I read on one of the threads that there is really no limit on the
>>> number of tables, but that we need to be careful about is the number
>>> of regions. Does this mean that the regions can be pre created for
>>> multiple tables ?
>>>
>>> Thank you,
>>>
>>> Sam
>>
>>
>
>


Re: Scan Metrics in Ganglia

2011-12-01 Thread Doug Meil

This can be a bit tricky because of the scan caching, for example...

http://hbase.apache.org/book.html#rs_metrics

12.4.2.14. hbase.regionserver.requests

Total number of read and write requests.  Requests correspond to RegionServer
RPC calls, thus a single Get will result in 1 request, but a Scan with
caching set to 1000 will result in 1 request for each 'next' call (i.e.,
not each row).  A bulk-load request will constitute 1 request per
HFile.
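
In other words, the request count depends on the scanner caching set on the
client side. Below is a hedged sketch of that client end; the table name
"myTable" and family "f" are assumptions.

// Sketch: with caching=1000, each batch of up to 1000 rows is fetched in a
// single 'next' RPC, so the whole batch counts as one regionserver request.
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.util.Bytes;

public class ScanCachingExample {
  public static void main(String[] args) throws IOException {
    Configuration conf = HBaseConfiguration.create();
    HTable table = new HTable(conf, "myTable");   // table name assumed
    Scan scan = new Scan();
    scan.addFamily(Bytes.toBytes("f"));           // family name assumed
    scan.setCaching(1000);  // 1000 rows per 'next' call -> 1 request per batch
    ResultScanner scanner = table.getScanner(scan);
    long rows = 0;
    for (Result r : scanner) {
      rows++;               // iteration is per row, RPCs happen per batch
    }
    scanner.close();
    table.close();
    System.out.println("rows scanned: " + rows);
  }
}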






On 12/1/11 2:23 PM, "sagar naik"  wrote:

>Hi,
>I can see metrics for get calls (number of get , avg time for get)
>However, I could not do so for scan calls
>
>Please let me know how can I measure
>
>Thanks
>-Sagar
>




Re: Suspected memory leak

2011-12-01 Thread Stack
Make sure it's not the issue that Jonathan Payne identified a while
back: 
https://groups.google.com/group/asynchbase/browse_thread/thread/c45bc7ba788b2357#
St.Ack


Scan Metrics in Ganglia

2011-12-01 Thread sagar naik
Hi,
I can see metrics for get calls (number of gets, avg time per get).
However, I could not do so for scan calls.

Please let me know how I can measure them.

Thanks
-Sagar


Re: Constant error when putting large data into HBase

2011-12-01 Thread Jean-Daniel Cryans
Here's my take on the issue.

> I monitored the
> process and when any node fails, it has not used all the heaps yet.
> So it is not a heap space problem.

I disagree. Unless you load a region server heap with more data than
there's heap available (loading batches of humongous rows, for
example), it will not fill up. That doesn't mean you have enough heap,
because HBase will take precautions in order to not run out of memory.
In your case, you have a lot of block cache thrashing:

2011-12-01 17:05:49,084 DEBUG
org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU
eviction started; Attempting to free 79.68 MB of total=677.18 MB
2011-12-01 17:05:49,087 DEBUG
org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU
eviction completed; freed=79.72 MB, total=597.78 MB, single=372.13 MB,
multi=298.71 MB, memory=0 KB
2011-12-01 17:05:50,069 DEBUG
org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU
eviction started; Attempting to free 79.67 MB of total=677.17 MB
2011-12-01 17:05:50,084 DEBUG
org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU
eviction completed; freed=79.67 MB, total=597.75 MB, single=372.05 MB,
multi=298.71 MB, memory=0 KB
etc

These are the kind of precautions I'm talking about. BTW, in MR jobs you
should always disable the block cache, as shown in this example:
http://hbase.apache.org/book/mapreduce.example.html#mapreduce.example.read

scan.setCacheBlocks(false);  // don't set to true for MR jobs

I don't know if this is related to your current job; it's not clear from
your description whether the mapping is done over HBase.
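
For reference, a minimal sketch of the job-setup side along the lines of the
book example linked above; the table name, mapper and job name here are
assumptions, not taken from the original job.

// Sketch: a table-input MR job whose scan has block caching disabled,
// as recommended above. "myTable" and MyMapper are assumed placeholders.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil;
import org.apache.hadoop.hbase.mapreduce.TableMapper;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.output.NullOutputFormat;

public class ReadTableJob {
  // Identity mapper; real row processing would go in an overridden map().
  static class MyMapper extends TableMapper<ImmutableBytesWritable, Result> {
  }

  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    Job job = new Job(conf, "read-table");
    job.setJarByClass(ReadTableJob.class);
    Scan scan = new Scan();
    scan.setCaching(500);          // rows fetched per RPC
    scan.setCacheBlocks(false);    // don't set to true for MR jobs
    TableMapReduceUtil.initTableMapperJob("myTable", scan, MyMapper.class,
        ImmutableBytesWritable.class, Result.class, job);
    job.setOutputFormatClass(NullOutputFormat.class);
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}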

> And finally, according to the logs I pasted, I see other lines with DEBUG
> or INFO. So I thought this was okay.
> Is there a way to change WARN level log to some other level log? If you'd
> let me know, I will paste another set of logs.

The connection reset stuff is interesting, and this warning indeed
points to something weird. It would be interesting to see some
task logs (not the TaskTracker's, nor the JobTracker's; they are usually
of little use while debugging this type of problem). In any case, what it
means is that the client (the map or reduce task, or even some other
client you have) reset the connection, so the region server just drops
it.

> The regionserver that contains that specific region fails. That is the
> point. If I move that region to another regionserver using hbase shell,
> then that regionserver fails.
> With the same log output.

You haven't shown us a log output of a dying region server yet.
Actually from those logs I don't even see a lot of importing going on,
just a lot of reading. Look for ERROR level logging, then grab
everything that's around that and post it here please (go up in the
log to the point where it looks like normal logging; usually the ERROR
will get logged after some important lines).

It would also be interesting to see the full reducer task log.

J-D

On Thu, Dec 1, 2011 at 12:48 AM, edward choi  wrote:
> Hi,
> I've had a problem that has been killing me for some days now.
> I am using CDH3 update2 version of Hadoop and HBase.
> When I do a large amount of bulk loading into HBase, some node always dies.
> It's not just one particular node,
> but one of the many nodes eventually fails to serve.
>
> I set 4 gigs of heap space for master, and regionservers. I monitored the
> process and when any node fails, it has not used all the heaps yet.
> So it is not a heap space problem.
>
> Below is what I get when I perform bulk put using MapReduce.
>


Re: Strategies for aggregating data in a HBase table

2011-12-01 Thread Jean-Daniel Cryans
Or you could just prefix the row keys. Not sure if this is needed
natively, or as a tool on top of HBase. Hive for example could do
exactly that for you when Hive partitions are implemented for HBase.

J-D

On Wed, Nov 30, 2011 at 1:34 PM, Sam Seigal  wrote:
> What about "partitioning" at a table level. For example, create 12
> tables for the given year. Design the row keys however you like, let's
> say using SHA/MD hashes. Place transactions in the appropriate table
> and then do aggregations based on that table alone (this is assuming
> you won't get transactions with timestamps in the past going back a
> month). The idea is to archive the tables for a given year and start
> fresh the next. This is acceptable in my use case. I am in the process
> of trying this out, so do not have any performance numbers, issues yet
> ... Experts can comment.
>
> On a further note, having HBase support this natively i.e. one more
> level of partitioning above the row key , but below a table can be
> beneficial for use cases like these ones. Comments ... ?
>
> On Wed, Nov 30, 2011 at 11:53 AM, Jean-Daniel Cryans
>  wrote:
>> Inline.
>>
>> J-D
>>
>> On Mon, Nov 28, 2011 at 1:55 AM, Steinmaurer Thomas
>>  wrote:
>>> Hello,
>>> ...
>>>
>>> While it is an option processing the entire HBase table e.g. every night
>>> when we go live, it probably isn't an option when data volume grows over
>>> the years. So, what options are there for some kind of incremental
>>> aggregating only new data?
>>
>> Yeah you don't want to go there.
>>
>>>
>>> - Perhaps using versioning (internal timestamp) might be an option?
>>
>> I guess you could do rollups and ditch the raw data, if you don't need it.
>>
>>>
>>> - Perhaps having some kind of HBase (daily) staging table which is
>>> truncated after aggregating data is an option?
>>
>> If you do the aggregations nightly then you won't have "access to
>> aggregated data very quickly".
>>
>>>
>>> - How could Co-processors help here (at the time of the Go-Live, they
>>> might be available in e.g. Cloudera)?
>>
>> Coprocessors are more like an internal HBase tool, so don't put all
>> your eggs there until you play with them. What you could do is get the
>> 0.92.0 RC0 tarball and try them out :)
>>
>>> Any ideas/comments are appreciated.
>>
>> Normally data is stored in a way that's not easy to query in a batch
>> or analytics mode, so an ETL step is introduced. You'll probably need
>> to do the same, as in you could asynchronously stream your data to
>> other HBase tables or Hive or Pig via logs or replication and then
>> directly insert it into the format it needs to be or stage it for
>> later aggregations. If you explore those avenues I'm sure you'll find
>> concepts that are very very similar to those you listed regarding
>> RDBMS.
>>
>> You could also keep live counts using atomic increments, you'd issue
>> those at write time or async.
>>
>> Hope this helps,
>>
>> J-D
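
The live counts via atomic increments mentioned above map directly onto the
client API; a minimal sketch, where the "metrics" table, the "c" family and
the row/qualifier layout are made-up assumptions.

// Sketch: keep a rollup counter current with an atomic increment at write
// time. Table "metrics", family "c", and the row/qualifier are assumed.
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.util.Bytes;

public class LiveCounts {
  public static void main(String[] args) throws IOException {
    Configuration conf = HBaseConfiguration.create();
    HTable table = new HTable(conf, "metrics");
    // e.g. one counter row per day, one qualifier per metric
    long newValue = table.incrementColumnValue(
        Bytes.toBytes("2011-12-01"),   // row: the day being aggregated
        Bytes.toBytes("c"),            // column family
        Bytes.toBytes("events"),       // counter qualifier
        1L);                           // amount, applied atomically server-side
    System.out.println("events so far today: " + newValue);
    table.close();
  }
}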


Re: hbase-regionserver1: bash: {HBASE_HOME}/bin/hbase-daemon.sh: No such file or directory

2011-12-01 Thread Jean-Daniel Cryans
So since I don't see the rest of the log I'll have to assume that the
region server was never able to connect to the master. Connection
refused could be a firewall; start the master and then try to telnet
from the other machines to master:60000.

J-D

On Thu, Dec 1, 2011 at 6:45 AM, Vamshi Krishna  wrote:
> I found in the logs of region server machines, i found this error (on both
> regionserver machines)
>
> 2011-11-30 14:43:42,447 INFO org.apache.hadoop.ipc.HbaseRPC: Server at
> hbase-master/10.0.1.54:60020 could not be reached after 1 tries, giving up.
> *2011-11-30 14:44:37,762* WARN
> org.apache.hadoop.hbase.regionserver.HRegionServer: Unable to connect to
> master. Retrying. Error was:
> java.net.ConnectException: Connection refused
>    at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>    at
> sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:567)
>    at
> org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
>    at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:404)
>    at
> org.apache.hadoop.hbase.ipc.HBaseClient$Connection.setupIOstreams(HBaseClient.java:328)
>    at
> org.apache.hadoop.hbase.ipc.HBaseClient.getConnection(HBaseClient.java:883)
>    at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:750)
>    at
> org.apache.hadoop.hbase.ipc.HBaseRPC$Invoker.invoke(HBaseRPC.java:257)
>    at $Proxy5.getProtocolVersion(Unknown Source)
>    at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:419)
>    at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:393)
>    at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:444)
>    at org.apache.hadoop.hbase.ipc.HBaseRPC.waitForProxy(HBaseRPC.java:349)
>    at
> org.apache.hadoop.hbase.regionserver.HRegionServer.getMaster(HRegionServer.java:1462)
>    at
> org.apache.hadoop.hbase.regionserver.HRegionServer.reportForDuty(HRegionServer.java:1515)
>    at
> org.apache.hadoop.hbase.regionserver.HRegionServer.tryReportForDuty(HRegionServer.java:1499)
>    at
> org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:572)
>    at java.lang.Thread.run(Thread.java:662)
>  2011-11-30 14:44:40,768 INFO
> org.apache.hadoop.hbase.regionserver.HRegionServer: Attempting connect to
> Master server at hbase-master:6
> *2011-11-30 14:45:40,847* WARN
> org.apache.hadoop.hbase.regionserver.HRegionServer: Unable to connect to
> master. Retrying. Error was:
> java.net.ConnectException: Connection refused
>    at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>    at
> sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:567)
>    at
> org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
>    at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:404)
>    at
> org.apache.hadoop.hbase.ipc.HBaseClient$Connection.setupIOstreams(HBaseClient.java:328)
>    at
> org.apache.hadoop.hbase.ipc.HBaseClient.getConnection(HBaseClient.java:883)
>    at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:750)
>    at
> org.apache.hadoop.hbase.ipc.HBaseRPC$Invoker.invoke(HBaseRPC.java:257)
>    at $Proxy5.getProtocolVersion(Unknown Source)
>    at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:419)
>    at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:393)
>    at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:444)
>    at org.apache.hadoop.hbase.ipc.HBaseRPC.waitForProxy(HBaseRPC.java:349)
>    at
> org.apache.hadoop.hbase.regionserver.HRegionServer.getMaster(HRegionServer.java:1462)
>    at
> org.apache.hadoop.hbase.regionserver.HRegionServer.reportForDuty(HRegionServer.java:1515)
>    at
> org.apache.hadoop.hbase.regionserver.HRegionServer.tryReportForDuty(HRegionServer.java:1499)
>    at
> org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:572)
>    at java.lang.Thread.run(Thread.java:662)
>
>
> and the same error is observed in the whole log repeatedly. After seeing it
> what  i understand is that some how master started HRegionServer daemons on
> the machines but from then onwards the RegionServer machines are not able
> to communicate with master. If we observe it is trying to communicate with
> master for evry one minute.
>
> But i am not understanding where to check and modify the things.. please
> help. i feel all connections are OK.
>
> On Thu, Dec 1, 2011 at 12:28 AM, Jean-Daniel Cryans 
> wrote:
>
>> stop-hbase.sh only tells the master to stop, which in turn will tell
>> the region servers to stop. If they are still running, it might be
>> because of an error. Look at their logs to figure what's going on.
>>
>> J-D
>>
>> On Tue, Nov 29, 2011 at 10:46 PM, Vamshi Krishna 
>> wrote:
>> > hey soryy for posting multiple times.
>> > J-D, As you said, i refered to my regionserver log, there i found
>> >              Could not resolve the DNS name of vamshikrishna-desktop
>> > so i added an alias ' vamshikrishna-desktop ' to its correspond

RE: Suspected memory leak

2011-12-01 Thread Vladimir Rodionov
You can create several heap dumps of the JVM process in question and compare heap 
allocations.
To create a heap dump:

jmap -dump:format=b,file=heap.hprof <pid>

To analyze:
1. jhat
2. VisualVM
3. any commercial profiler

One note: -Xmn12G ??? How long are your minor collection GC pauses?

Best regards,
Vladimir Rodionov
Principal Platform Engineer
Carrier IQ, www.carrieriq.com
e-mail: vrodio...@carrieriq.com


From: Ramkrishna S Vasudevan [ramkrishna.vasude...@huawei.com]
Sent: Wednesday, November 30, 2011 6:51 PM
To: user@hbase.apache.org; d...@hbase.apache.org
Subject: RE: Suspected memory leak

Adding dev list to get some suggestions.

Regards
Ram


-Original Message-
From: Shrijeet Paliwal [mailto:shrij...@rocketfuel.com]
Sent: Thursday, December 01, 2011 8:08 AM
To: user@hbase.apache.org
Cc: Gaojinchao; Chenjian
Subject: Re: Suspected memory leak

Jieshan,
We backported https://issues.apache.org/jira/browse/HBASE-2937 to 0.90.3

-Shrijeet


2011/11/30 bijieshan 

> Hi Shrijeet,
>
> I think that jira is relevant to trunk, but not to 0.90.x, since there's no
> timeout mechanism in 0.90.x. Right?
> We found this problem in 0.90.x.
>
> Thanks,
>
> Jieshan.
>
> -----Original Message-----
> From: Shrijeet Paliwal [mailto:shrij...@rocketfuel.com]
> Sent: December 1, 2011 10:26
> To: user@hbase.apache.org
> Cc: Gaojinchao; Chenjian
> Subject: Re: Suspected memory leak
>
> Gaojinchao,
>
> I had filed this some time ago,
> https://issues.apache.org/jira/browse/HBASE-4633
> But after some recent insights into our application code, I am inclined to
> think the leak (or memory 'hold') is in our application. But it will be good
> to check it out either way.
> I need to update the jira with my saga. See if the description of the issue I
> posted there matches yours. If not, maybe you can update it with your story
> in detail.
>
> -Shrijeet
>
> 2011/11/30 Gaojinchao 
>
> > I have noticed some memory leak problems in my HBase client.
> > RES has increased to 27g
> > PID USER  PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
> > 12676 root  20   0 30.8g  27g 5092 S2 57.5 587:57.76
> > /opt/java/jre/bin/java -Djava.library.path=lib/.
> >
> > But I am not sure whether the leak comes from the HBase client jar itself
> > or just our client code.
> >
> > This is some parameters of jvm.
> > :-Xms15g -Xmn12g -Xmx15g -XX:PermSize=64m -XX:+UseParNewGC
> > -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=65
> > -XX:+UseCMSCompactAtFullCollection -XX:CMSFullGCsBeforeCompaction=1
> > -XX:+CMSParallelRemarkEnabled
> >
> > Who has experience with this case? I need to continue digging :)
> >
> >
> >
> > From: Gaojinchao
> > Sent: November 30, 2011 11:02
> > To: user@hbase.apache.org
> > Subject: Suspected memory leak
> >
> > In the HBaseClient process, I found the heap has increased.
> > I used the command 'cat smaps' to get the heap size.
> > It seems that when the thread pool in HTable has released the unused
> > threads, if you use the putlist API to put data again, the memory is
> > increased.
> >
> > Who has experience in this case?
> >
> > Below is the heap of Hbase client:
> > C3S31:/proc/18769 # cat smaps
> > 4010a000-4709d000 rwxp  00:00 0
> >  [heap]
> > Size: 114252 kB
> > Rss:  114044 kB
> > Pss:  114044 kB
> >
> > 4010a000-4709d000 rwxp  00:00 0
> >  [heap]
> > Size: 114252 kB
> > Rss:  114044 kB
> > Pss:  114044 kB
> >
> > 4010a000-48374000 rwxp  00:00 0
> >  [heap]
> > Size: 133544 kB
> > Rss:  16 kB
> > Pss:  16 kB
> >
> > 4010a000-49f2 rwxp  00:00 0
> >  [heap]
> > Size: 161880 kB
> > Rss:  161672 kB
> > Pss:  161672 kB
> >
> > 4010a000-4c5de000 rwxp  00:00 0
> >  [heap]
> > Size: 201552 kB
> > Rss:  201344 kB
> > Pss:  201344 kB
> >
>
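
For reference, the "putlist" API mentioned in the quoted report is the list
form of HTable.put; the usage pattern is roughly the sketch below, where the
table name, family, and batch size are made-up assumptions.

// Sketch of the multi-put ("putlist") client pattern referred to above.
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.util.Bytes;

public class PutListExample {
  public static void main(String[] args) throws IOException {
    Configuration conf = HBaseConfiguration.create();
    HTable table = new HTable(conf, "t1");   // table and family are assumed
    table.setAutoFlush(false);               // buffer mutations client-side
    List<Put> puts = new ArrayList<Put>();
    for (int i = 0; i < 10000; i++) {
      Put put = new Put(Bytes.toBytes("row-" + i));
      put.add(Bytes.toBytes("f"), Bytes.toBytes("q"), Bytes.toBytes("v" + i));
      puts.add(put);
    }
    table.put(puts);                         // the "putlist" call
    table.flushCommits();                    // push anything still buffered
    table.close();
  }
}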




Re: Performance characteristics of scans using timestamp as the filter

2011-12-01 Thread Doug Meil

Scans work on startRow/stopRow...

http://hbase.apache.org/book.html#scan

... you can also select by timestamp *within the startRow/stopRow
selection*, but this isn't intended to quickly select rows by timestamp
irrespective of their keys.
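
A sketch of combining the two, i.e. bounding the scan by a row range and then
narrowing it further by a time range; the table name, family, row bounds and
the one-hour window are assumptions for illustration.

// Sketch: a start/stop-row scan that additionally filters by timestamp.
// The time range narrows results within the row selection; it does not
// replace choosing good start/stop rows. All names and bounds are assumed.
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.util.Bytes;

public class TimeRangeScan {
  public static void main(String[] args) throws IOException {
    Configuration conf = HBaseConfiguration.create();
    HTable table = new HTable(conf, "events");
    long end = System.currentTimeMillis();
    long start = end - 60 * 60 * 1000L;            // last hour
    Scan scan = new Scan(Bytes.toBytes("rowA"),    // startRow (inclusive)
                         Bytes.toBytes("rowZ"));   // stopRow (exclusive)
    scan.setTimeRange(start, end);                 // [start, end)
    ResultScanner scanner = table.getScanner(scan);
    for (Result r : scanner) {
      System.out.println(Bytes.toString(r.getRow()));
    }
    scanner.close();
    table.close();
  }
}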




On 12/1/11 9:03 AM, "Srikanth P. Shreenivas"
 wrote:

>So, will it be safe to assume that Scan queries with TimeRange will
>perform well and will read only necessary portions of the tables instead
>of doing full table scan?
>
>I have run into a situation, wherein I would like to find out all rows
>that got create/updated on during a time range.
>I was hoping that I could to time range scan.
>
>Regards,
>Srikanth
>
>
>
>-Original Message-
>From: Stuti Awasthi [mailto:stutiawas...@hcl.com]
>Sent: Monday, October 10, 2011 3:44 PM
>To: user@hbase.apache.org
>Subject: RE: Performance characteristics of scans using timestamp as the
>filter
>
>Yes its true.
>Your cluster time should be in sync for reliable functioning.
>
>-Original Message-
>From: Steinmaurer Thomas [mailto:thomas.steinmau...@scch.at]
>Sent: Monday, October 10, 2011 3:04 PM
>To: user@hbase.apache.org
>Subject: RE: Performance characteristics of scans using timestamp as the
>filter
>
>Isn't a synchronized time along all nodes a general requirement for
>running the cluster reliably?
>
>Regards,
>Thomas
>
>-Original Message-
>From: Stuti Awasthi [mailto:stutiawas...@hcl.com]
>Sent: Montag, 10. Oktober 2011 11:18
>To: user@hbase.apache.org
>Subject: RE: Performance characteristics of scans using timestamp as the
>filter
>
>Steinmaurer,
>
>I have done a little POC with Timerange scan and it worked fine for me.
>Another thing to note is time should be same on all machines of your
>cluster of Hbase.
>
>-Original Message-
>From: Steinmaurer Thomas [mailto:thomas.steinmau...@scch.at]
>Sent: Monday, October 10, 2011 2:32 PM
>To: user@hbase.apache.org
>Subject: RE: Performance characteristics of scans using timestamp as the
>filter
>
>Hello,
>
>others have stated that one shouldn't try to use timestamps, although I
>haven't figured out why? If it's reliability, which means, rows are
>omitted, even if they should be included in a timerange-based scan, then
>this might be a good argument. ;-)
>
>One thing is that the timestamp AFAIK changes when you update a row even
>cell values didn't change.
>
>Regards,
>Thomas
>
>-Original Message-
>From: Stuti Awasthi [mailto:stutiawas...@hcl.com]
>Sent: Montag, 10. Oktober 2011 10:07
>To: user@hbase.apache.org
>Subject: RE: Performance characteristics of scans using timestamp as the
>filter
>
>Hi Saurabh,
>
>AFAIK you can also scan on the basis of Timestamp Range. This can provide
>you data update in that timestamp range. You do not need to keep
>timestamp in you row key.
>
>-Original Message-
>From: saurabh@gmail.com [mailto:saurabh@gmail.com] On Behalf Of
>Sam Seigal
>Sent: Monday, October 10, 2011 1:20 PM
>To: user@hbase.apache.org
>Subject: Re: Performance characteristics of scans using timestamp as the
>filter
>
>Is it possible to do incremental processing without putting the timestamp
>in the leading part of the row key in a more efficient manner i.e.
>process data that came within the last hour/ 2 hour etc ? I can't seem to
>find a good answer to this question myself.
>
>On Mon, Oct 10, 2011 at 12:09 AM, Steinmaurer Thomas <
>thomas.steinmau...@scch.at> wrote:
>
>> Leif,
>>
>> we are pretty much in the same boat with a custom timestamp at the end
>> of a three-part rowkey, so basically we end up with reading all data
>> when processing daily batches. Beside performance aspects, have you
>> seen that using internals timestamps for scans etc... work reliable?
>>
>> Or did you come up with another solution to your problem?
>>
>> Thanks,
>> Thomas
>>
>> -Original Message-
>> From: Leif Wickland [mailto:leifwickl...@gmail.com]
>> Sent: Freitag, 09. September 2011 20:33
>> To: user@hbase.apache.org
>> Subject: Performance characteristics of scans using timestamp as the
>> filter
>>
>> (Apologies if this has been answered before.  I couldn't find anything
>> in the archives quite along these lines.)
>>
>> I have a process which writes to HBase as new data arrives.  I'd like
>> to run a map-reduce periodically, say daily, that takes the new items
>> as input.
>>  A naive approach would use a scan which grabs all of the rows that
>> have a timestamp in a specified interval as the input to a MapReduce.
>> I tested a scenario like that with 10s of GB of data and it seemed to
>> perform OK.
>>  Should I expect that approach to continue to perform reasonably
>> well when I have TBs of data?
>>
>> From what I understand of the HBase architecture, I don't see a reason
>> that the scan approach would continue to perform well as the data
>> grows.  It seems like I may have to keep a log of modified keys and
>> use that as the map-reduce input, instead.
>>
>> Thanks,
>>
>> Leif Wickland
>>
>

Re: hbase-regionserver1: bash: {HBASE_HOME}/bin/hbase-daemon.sh: No such file or directory

2011-12-01 Thread Vamshi Krishna
In the logs of the region server machines, I found this error (on both
regionserver machines):

2011-11-30 14:43:42,447 INFO org.apache.hadoop.ipc.HbaseRPC: Server at
hbase-master/10.0.1.54:60020 could not be reached after 1 tries, giving up.
*2011-11-30 14:44:37,762* WARN
org.apache.hadoop.hbase.regionserver.HRegionServer: Unable to connect to
master. Retrying. Error was:
java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at
sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:567)
at
org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:404)
at
org.apache.hadoop.hbase.ipc.HBaseClient$Connection.setupIOstreams(HBaseClient.java:328)
at
org.apache.hadoop.hbase.ipc.HBaseClient.getConnection(HBaseClient.java:883)
at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:750)
at
org.apache.hadoop.hbase.ipc.HBaseRPC$Invoker.invoke(HBaseRPC.java:257)
at $Proxy5.getProtocolVersion(Unknown Source)
at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:419)
at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:393)
at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:444)
at org.apache.hadoop.hbase.ipc.HBaseRPC.waitForProxy(HBaseRPC.java:349)
at
org.apache.hadoop.hbase.regionserver.HRegionServer.getMaster(HRegionServer.java:1462)
at
org.apache.hadoop.hbase.regionserver.HRegionServer.reportForDuty(HRegionServer.java:1515)
at
org.apache.hadoop.hbase.regionserver.HRegionServer.tryReportForDuty(HRegionServer.java:1499)
at
org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:572)
at java.lang.Thread.run(Thread.java:662)
 2011-11-30 14:44:40,768 INFO
org.apache.hadoop.hbase.regionserver.HRegionServer: Attempting connect to
Master server at hbase-master:6
*2011-11-30 14:45:40,847* WARN
org.apache.hadoop.hbase.regionserver.HRegionServer: Unable to connect to
master. Retrying. Error was:
java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at
sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:567)
at
org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:404)
at
org.apache.hadoop.hbase.ipc.HBaseClient$Connection.setupIOstreams(HBaseClient.java:328)
at
org.apache.hadoop.hbase.ipc.HBaseClient.getConnection(HBaseClient.java:883)
at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:750)
at
org.apache.hadoop.hbase.ipc.HBaseRPC$Invoker.invoke(HBaseRPC.java:257)
at $Proxy5.getProtocolVersion(Unknown Source)
at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:419)
at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:393)
at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:444)
at org.apache.hadoop.hbase.ipc.HBaseRPC.waitForProxy(HBaseRPC.java:349)
at
org.apache.hadoop.hbase.regionserver.HRegionServer.getMaster(HRegionServer.java:1462)
at
org.apache.hadoop.hbase.regionserver.HRegionServer.reportForDuty(HRegionServer.java:1515)
at
org.apache.hadoop.hbase.regionserver.HRegionServer.tryReportForDuty(HRegionServer.java:1499)
at
org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:572)
at java.lang.Thread.run(Thread.java:662)


and the same error is observed repeatedly throughout the log. From this, what
I understand is that somehow the master started the HRegionServer daemons on
the machines, but from then onwards the RegionServer machines are not able
to communicate with the master. As you can see, they retry the connection to the
master every minute.

But I do not understand where to check and what to modify.. please
help. I feel all connections are OK.

On Thu, Dec 1, 2011 at 12:28 AM, Jean-Daniel Cryans wrote:

> stop-hbase.sh only tells the master to stop, which in turn will tell
> the region servers to stop. If they are still running, it might be
> because of an error. Look at their logs to figure what's going on.
>
> J-D
>
> On Tue, Nov 29, 2011 at 10:46 PM, Vamshi Krishna 
> wrote:
> > hey sorry for posting multiple times.
> > J-D, as you said, I referred to my regionserver log, where I found
> >  Could not resolve the DNS name of vamshikrishna-desktop
> > so i added an alias ' vamshikrishna-desktop ' to its corresponding IP
> > address in /etc/hosts.  So, from then master is able to run HRegionServer
> > daemon in the regionserver machines also.
> >
> > But the ONLY problem now is when i stop hbase on my master node by
> running
> > bin/stop-hbase.sh, all hbase daemons are stopping on matser node but NOT
> on
> > regionserver nodes.The HRegionServer daemon is still running on the other
> > regionserver machines.
> > I think the HRegionServer daemons on al

RE: Performance characteristics of scans using timestamp as the filter

2011-12-01 Thread Srikanth P. Shreenivas
So, will it be safe to assume that Scan queries with TimeRange will perform 
well and will read only the necessary portions of the tables instead of doing a full 
table scan?

I have run into a situation wherein I would like to find out all rows that got 
created/updated during a time range.
I was hoping that I could do a time range scan.

Regards,
Srikanth



-Original Message-
From: Stuti Awasthi [mailto:stutiawas...@hcl.com]
Sent: Monday, October 10, 2011 3:44 PM
To: user@hbase.apache.org
Subject: RE: Performance characteristics of scans using timestamp as the filter

Yes its true.
Your cluster time should be in sync for reliable functioning.

-Original Message-
From: Steinmaurer Thomas [mailto:thomas.steinmau...@scch.at]
Sent: Monday, October 10, 2011 3:04 PM
To: user@hbase.apache.org
Subject: RE: Performance characteristics of scans using timestamp as the filter

Isn't a synchronized time along all nodes a general requirement for running the 
cluster reliably?

Regards,
Thomas

-Original Message-
From: Stuti Awasthi [mailto:stutiawas...@hcl.com]
Sent: Montag, 10. Oktober 2011 11:18
To: user@hbase.apache.org
Subject: RE: Performance characteristics of scans using timestamp as the filter

Steinmaurer,

I have done a little POC with Timerange scan and it worked fine for me.
Another thing to note is time should be same on all machines of your cluster of 
Hbase.

-Original Message-
From: Steinmaurer Thomas [mailto:thomas.steinmau...@scch.at]
Sent: Monday, October 10, 2011 2:32 PM
To: user@hbase.apache.org
Subject: RE: Performance characteristics of scans using timestamp as the filter

Hello,

others have stated that one shouldn't try to use timestamps, although I haven't 
figured out why. If it's reliability, meaning rows are omitted even though 
they should be included in a timerange-based scan, then this might be a good 
argument. ;-)

One thing is that the timestamp AFAIK changes when you update a row even if the cell 
values didn't change.

Regards,
Thomas

-Original Message-
From: Stuti Awasthi [mailto:stutiawas...@hcl.com]
Sent: Montag, 10. Oktober 2011 10:07
To: user@hbase.apache.org
Subject: RE: Performance characteristics of scans using timestamp as the filter

Hi Saurabh,

AFAIK you can also scan on the basis of a timestamp range. This can give you the 
data updated in that timestamp range. You do not need to keep the timestamp in your 
row key.

-Original Message-
From: saurabh@gmail.com [mailto:saurabh@gmail.com] On Behalf Of Sam 
Seigal
Sent: Monday, October 10, 2011 1:20 PM
To: user@hbase.apache.org
Subject: Re: Performance characteristics of scans using timestamp as the filter

Is it possible to do incremental processing without putting the timestamp in 
the leading part of the row key in a more efficient manner i.e. process data 
that came within the last hour/ 2 hour etc ? I can't seem to find a good answer 
to this question myself.

On Mon, Oct 10, 2011 at 12:09 AM, Steinmaurer Thomas < 
thomas.steinmau...@scch.at> wrote:

> Leif,
>
> we are pretty much in the same boat with a custom timestamp at the end
> of a three-part rowkey, so basically we end up with reading all data
> when processing daily batches. Beside performance aspects, have you
> seen that using internals timestamps for scans etc... work reliable?
>
> Or did you come up with another solution to your problem?
>
> Thanks,
> Thomas
>
> -Original Message-
> From: Leif Wickland [mailto:leifwickl...@gmail.com]
> Sent: Freitag, 09. September 2011 20:33
> To: user@hbase.apache.org
> Subject: Performance characteristics of scans using timestamp as the
> filter
>
> (Apologies if this has been answered before.  I couldn't find anything
> in the archives quite along these lines.)
>
> I have a process which writes to HBase as new data arrives.  I'd like
> to run a map-reduce periodically, say daily, that takes the new items
> as input.
>  A naive approach would use a scan which grabs all of the rows that
> have a timestamp in a specified interval as the input to a MapReduce.
> I tested a scenario like that with 10s of GB of data and it seemed to
> perform OK.
>  Should I expect that approach to continue to perform reasonably
> well when I have TBs of data?
>
> From what I understand of the HBase architecture, I don't see a reason
> that the scan approach would continue to perform well as the data
> grows.  It seems like I may have to keep a log of modified keys and
> use that as the map-reduce input, instead.
>
> Thanks,
>
> Leif Wickland
>


Re: regions and tables

2011-12-01 Thread Doug Meil
To expand on what Lars said, there is an example of how this is laid out
on disk...

http://hbase.apache.org/book.html#trouble.namenode.disk

... regions distribute the table, so two different tables will be
distributed by separate sets of regions.




On 12/1/11 3:14 AM, "Lars George"  wrote:

>Hi Sam,
>
>You need to handle them all separately. The note - I assume - was solely
>explaining the fact that the "load" of a region server is defined by the
>number of regions it hosts, not the number of tables. Whether you want to
>precreate the regions for one table or for several, it is the same work:
>create the tables (one by one) with the list of split points.
>
>Lars
>
>On Dec 1, 2011, at 7:50 AM, Sam Seigal wrote:
>
>> HI,
>> 
>> I had a question about the relationship  between regions and tables.
>> 
>> Is there a way to pre-create regions for multiple tables ? or each
>> table has its own set of regions managed independently ?
>> 
>> I read on one of the threads that there is really no limit on the
>> number of tables, but that we need to be careful about is the number
>> of regions. Does this mean that the regions can be pre created for
>> multiple tables ?
>> 
>> Thank you,
>> 
>> Sam
>
>




Re: Unable to create version file

2011-12-01 Thread Lars George
Could you please pastebin your Hadoop, HBase and ZooKeeper config files? 

Lars

On Dec 1, 2011, at 11:23 AM, Mohammad Tariq wrote:

> Today when I issued bin/start-hbase.sh I ran into the following error -
> 
> Thu Dec  1 15:47:30 IST 2011 Starting master on ubuntu
> ulimit -n 1024
> 2011-12-01 15:47:31,158 INFO
> org.apache.zookeeper.server.ZooKeeperServer: Server
> environment:zookeeper.version=3.3.2-1031432, built on 11/05/2010 05:32
> GMT
> 2011-12-01 15:47:31,158 INFO
> org.apache.zookeeper.server.ZooKeeperServer: Server
> environment:host.name=ubuntu.ubuntu-domain
> 2011-12-01 15:47:31,158 INFO
> org.apache.zookeeper.server.ZooKeeperServer: Server
> environment:java.version=1.6.0_26
> 2011-12-01 15:47:31,158 INFO
> org.apache.zookeeper.server.ZooKeeperServer: Server
> environment:java.vendor=Sun Microsystems Inc.
> 2011-12-01 15:47:31,158 INFO
> org.apache.zookeeper.server.ZooKeeperServer: Server
> environment:java.home=/usr/lib/jvm/java-6-sun-1.6.0.26/jre
> 2011-12-01 15:47:31,158 INFO
> org.apache.zookeeper.server.ZooKeeperServer: Server
> environment:java.class.path=/home/solr/hbase-0.90.4/bin/../conf:/usr/lib/jvm/java-6-sun/lib/tools.jar:/home/solr/hbase-0.90.4/bin/..:/home/solr/hbase-0.90.4/bin/../hbase-0.90.4.jar:/home/solr/hbase-0.90.4/bin/../hbase-0.90.4-tests.jar:/home/solr/hbase-0.90.4/bin/../lib/activation-1.1.jar:/home/solr/hbase-0.90.4/bin/../lib/asm-3.1.jar:/home/solr/hbase-0.90.4/bin/../lib/avro-1.3.3.jar:/home/solr/hbase-0.90.4/bin/../lib/commons-cli-1.2.jar:/home/solr/hbase-0.90.4/bin/../lib/commons-codec-1.4.jar:/home/solr/hbase-0.90.4/bin/../lib/commons-el-1.0.jar:/home/solr/hbase-0.90.4/bin/../lib/commons-httpclient-3.1.jar:/home/solr/hbase-0.90.4/bin/../lib/commons-lang-2.5.jar:/home/solr/hbase-0.90.4/bin/../lib/commons-logging-1.1.1.jar:/home/solr/hbase-0.90.4/bin/../lib/commons-net-1.4.1.jar:/home/solr/hbase-0.90.4/bin/../lib/core-3.1.1.jar:/home/solr/hbase-0.90.4/bin/../lib/guava-r06.jar:/home/solr/hbase-0.90.4/bin/../lib/hadoop-core-0.20-append-r1056497.jar:/home/solr/hbase-0.90.4/bin/../lib/jackson-core-asl-1.5.5.jar:/home/solr/hbase-0.90.4/bin/../lib/jackson-jaxrs-1.5.5.jar:/home/solr/hbase-0.90.4/bin/../lib/jackson-mapper-asl-1.4.2.jar:/home/solr/hbase-0.90.4/bin/../lib/jackson-xc-1.5.5.jar:/home/solr/hbase-0.90.4/bin/../lib/jasper-compiler-5.5.23.jar:/home/solr/hbase-0.90.4/bin/../lib/jasper-runtime-5.5.23.jar:/home/solr/hbase-0.90.4/bin/../lib/jaxb-api-2.1.jar:/home/solr/hbase-0.90.4/bin/../lib/jaxb-impl-2.1.12.jar:/home/solr/hbase-0.90.4/bin/../lib/jersey-core-1.4.jar:/home/solr/hbase-0.90.4/bin/../lib/jersey-json-1.4.jar:/home/solr/hbase-0.90.4/bin/../lib/jersey-server-1.4.jar:/home/solr/hbase-0.90.4/bin/../lib/jettison-1.1.jar:/home/solr/hbase-0.90.4/bin/../lib/jetty-6.1.26.jar:/home/solr/hbase-0.90.4/bin/../lib/jetty-util-6.1.26.jar:/home/solr/hbase-0.90.4/bin/../lib/jruby-complete-1.6.0.jar:/home/solr/hbase-0.90.4/bin/../lib/jsp-2.1-6.1.14.jar:/home/solr/hbase-0.90.4/bin/../lib/jsp-api-2.1-6.1.14.jar:/home/solr/hbase-0.90.4/bin/../lib/jsr311-api-1.1.1.jar:/home/solr/hbase-0.90.4/bin/../lib/log4j-1.2.16.jar:/home/solr/hbase-0.90.4/bin/../lib/protobuf-java-2.3.0.jar:/home/solr/hbase-0.90.4/bin/../lib/servlet-api-2.5-6.1.14.jar:/home/solr/hbase-0.90.4/bin/../lib/slf4j-api-1.5.8.jar:/home/solr/hbase-0.90.4/bin/../lib/slf4j-log4j12-1.5.8.jar:/home/solr/hbase-0.90.4/bin/../lib/stax-api-1.0.1.jar:/home/solr/hbase-0.90.4/bin/../lib/thrift-0.2.0.jar:/home/solr/hbase-0.90.4/bin/../lib/xmlenc-0.52.jar:/home/solr/hbase-0.90.4/bin/../lib/zookeeper-3.3.2.jar
> 2011-12-01 15:47:31,158 INFO
> org.apache.zookeeper.server.ZooKeeperServer: Server
> environment:java.library.path=/usr/lib/jvm/java-6-sun-1.6.0.26/jre/lib/amd64/server:/usr/lib/jvm/java-6-sun-1.6.0.26/jre/lib/amd64:/usr/lib/jvm/java-6-sun-1.6.0.26/jre/../lib/amd64:/usr/java/packages/lib/amd64:/usr/lib64:/lib64:/lib:/usr/lib
> 2011-12-01 15:47:31,158 INFO
> org.apache.zookeeper.server.ZooKeeperServer: Server
> environment:java.io.tmpdir=/tmp
> 2011-12-01 15:47:31,158 INFO
> org.apache.zookeeper.server.ZooKeeperServer: Server
> environment:java.compiler=
> 2011-12-01 15:47:31,158 INFO
> org.apache.zookeeper.server.ZooKeeperServer: Server
> environment:os.name=Linux
> 2011-12-01 15:47:31,158 INFO
> org.apache.zookeeper.server.ZooKeeperServer: Server
> environment:os.arch=amd64
> 2011-12-01 15:47:31,158 INFO
> org.apache.zookeeper.server.ZooKeeperServer: Server
> environment:os.version=3.0.0-13-generic
> 2011-12-01 15:47:31,159 INFO
> org.apache.zookeeper.server.ZooKeeperServer: Server
> environment:user.name=solr
> 2011-12-01 15:47:31,159 INFO
> org.apache.zookeeper.server.ZooKeeperServer: Server
> environment:user.home=/home/solr
> 2011-12-01 15:47:31,159 INFO
> org.apache.zookeeper.server.ZooKeeperServer: Server
> environment:user.dir=/home/solr/hbase-0.90.4
> 2011-12-01 15:47:31,169 INFO
> org.apache.zookeeper.server.ZooKeeperServer: Created server with
> tickTime 2000 minSessionTimeout 4000 maxSessionTimeout 4 datadi

Re: Problem in configuring pseudo distributed mode

2011-12-01 Thread Christopher Dorner
Your hosts file (/etc/hosts) should contain only something like
127.0.0.1 localhost
Or
127.0.0.1 

It should not contain something like
127.0.1.1 localhost.

And I think you need to reboot after changing it. Hope that helps.
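
Since the hbase-site.xml was asked for as well, a pseudo-distributed setup
usually boils down to something like the snippet below. This is only a sketch:
the host and port in hbase.rootdir are assumptions and have to match
fs.default.name in your Hadoop core-site.xml.

<configuration>
  <!-- Must point at the running HDFS; host/port here are assumptions. -->
  <property>
    <name>hbase.rootdir</name>
    <value>hdfs://localhost:9000/hbase</value>
  </property>
  <!-- Run the cluster in (pseudo-)distributed mode. -->
  <property>
    <name>hbase.cluster.distributed</name>
    <value>true</value>
  </property>
</configuration>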

Regards,
Christopher
On 01.12.2011 13:24, "Mohammad Tariq"  wrote:

> Hello list,
>
>    Even after following the directions provided by you guys, the HBase
> book, and several other blogs and posts, I am not able to run HBase in
> pseudo-distributed mode. And I think there is some problem with the
> hosts file. I would highly appreciate it if someone who has done it
> properly could share his/her hosts and hbase-site.xml files.
>
> Regards,
> Mohammad Tariq
>


Problem in configuring pseudo distributed mode

2011-12-01 Thread Mohammad Tariq
Hello list,

    Even after following the directions provided by you guys, the HBase
book, and several other blogs and posts, I am not able to run HBase in
pseudo-distributed mode. And I think there is some problem with the
hosts file. I would highly appreciate it if someone who has done it
properly could share his/her hosts and hbase-site.xml files.

Regards,
    Mohammad Tariq


Unable to create version file

2011-12-01 Thread Mohammad Tariq
Today when I issued bin/start-hbase.sh I ran into the following error -

Thu Dec  1 15:47:30 IST 2011 Starting master on ubuntu
ulimit -n 1024
2011-12-01 15:47:31,158 INFO
org.apache.zookeeper.server.ZooKeeperServer: Server
environment:zookeeper.version=3.3.2-1031432, built on 11/05/2010 05:32
GMT
2011-12-01 15:47:31,158 INFO
org.apache.zookeeper.server.ZooKeeperServer: Server
environment:host.name=ubuntu.ubuntu-domain
2011-12-01 15:47:31,158 INFO
org.apache.zookeeper.server.ZooKeeperServer: Server
environment:java.version=1.6.0_26
2011-12-01 15:47:31,158 INFO
org.apache.zookeeper.server.ZooKeeperServer: Server
environment:java.vendor=Sun Microsystems Inc.
2011-12-01 15:47:31,158 INFO
org.apache.zookeeper.server.ZooKeeperServer: Server
environment:java.home=/usr/lib/jvm/java-6-sun-1.6.0.26/jre
2011-12-01 15:47:31,158 INFO
org.apache.zookeeper.server.ZooKeeperServer: Server
environment:java.class.path=/home/solr/hbase-0.90.4/bin/../conf:/usr/lib/jvm/java-6-sun/lib/tools.jar:/home/solr/hbase-0.90.4/bin/..:/home/solr/hbase-0.90.4/bin/../hbase-0.90.4.jar:/home/solr/hbase-0.90.4/bin/../hbase-0.90.4-tests.jar:/home/solr/hbase-0.90.4/bin/../lib/activation-1.1.jar:/home/solr/hbase-0.90.4/bin/../lib/asm-3.1.jar:/home/solr/hbase-0.90.4/bin/../lib/avro-1.3.3.jar:/home/solr/hbase-0.90.4/bin/../lib/commons-cli-1.2.jar:/home/solr/hbase-0.90.4/bin/../lib/commons-codec-1.4.jar:/home/solr/hbase-0.90.4/bin/../lib/commons-el-1.0.jar:/home/solr/hbase-0.90.4/bin/../lib/commons-httpclient-3.1.jar:/home/solr/hbase-0.90.4/bin/../lib/commons-lang-2.5.jar:/home/solr/hbase-0.90.4/bin/../lib/commons-logging-1.1.1.jar:/home/solr/hbase-0.90.4/bin/../lib/commons-net-1.4.1.jar:/home/solr/hbase-0.90.4/bin/../lib/core-3.1.1.jar:/home/solr/hbase-0.90.4/bin/../lib/guava-r06.jar:/home/solr/hbase-0.90.4/bin/../lib/hadoop-core-0.20-append-r1056497.jar:/home/solr/hbase-0.90.4/bin/../lib/jackson-core-asl-1.5.5.jar:/home/solr/hbase-0.90.4/bin/../lib/jackson-jaxrs-1.5.5.jar:/home/solr/hbase-0.90.4/bin/../lib/jackson-mapper-asl-1.4.2.jar:/home/solr/hbase-0.90.4/bin/../lib/jackson-xc-1.5.5.jar:/home/solr/hbase-0.90.4/bin/../lib/jasper-compiler-5.5.23.jar:/home/solr/hbase-0.90.4/bin/../lib/jasper-runtime-5.5.23.jar:/home/solr/hbase-0.90.4/bin/../lib/jaxb-api-2.1.jar:/home/solr/hbase-0.90.4/bin/../lib/jaxb-impl-2.1.12.jar:/home/solr/hbase-0.90.4/bin/../lib/jersey-core-1.4.jar:/home/solr/hbase-0.90.4/bin/../lib/jersey-json-1.4.jar:/home/solr/hbase-0.90.4/bin/../lib/jersey-server-1.4.jar:/home/solr/hbase-0.90.4/bin/../lib/jettison-1.1.jar:/home/solr/hbase-0.90.4/bin/../lib/jetty-6.1.26.jar:/home/solr/hbase-0.90.4/bin/../lib/jetty-util-6.1.26.jar:/home/solr/hbase-0.90.4/bin/../lib/jruby-complete-1.6.0.jar:/home/solr/hbase-0.90.4/bin/../lib/jsp-2.1-6.1.14.jar:/home/solr/hbase-0.90.4/bin/../lib/jsp-api-2.1-6.1.14.jar:/home/solr/hbase-0.90.4/bin/../lib/jsr311-api-1.1.1.jar:/home/solr/hbase-0.90.4/bin/../lib/log4j-1.2.16.jar:/home/solr/hbase-0.90.4/bin/../lib/protobuf-java-2.3.0.jar:/home/solr/hbase-0.90.4/bin/../lib/servlet-api-2.5-6.1.14.jar:/home/solr/hbase-0.90.4/bin/../lib/slf4j-api-1.5.8.jar:/home/solr/hbase-0.90.4/bin/../lib/slf4j-log4j12-1.5.8.jar:/home/solr/hbase-0.90.4/bin/../lib/stax-api-1.0.1.jar:/home/solr/hbase-0.90.4/bin/../lib/thrift-0.2.0.jar:/home/solr/hbase-0.90.4/bin/../lib/xmlenc-0.52.jar:/home/solr/hbase-0.90.4/bin/../lib/zookeeper-3.3.2.jar
2011-12-01 15:47:31,158 INFO
org.apache.zookeeper.server.ZooKeeperServer: Server
environment:java.library.path=/usr/lib/jvm/java-6-sun-1.6.0.26/jre/lib/amd64/server:/usr/lib/jvm/java-6-sun-1.6.0.26/jre/lib/amd64:/usr/lib/jvm/java-6-sun-1.6.0.26/jre/../lib/amd64:/usr/java/packages/lib/amd64:/usr/lib64:/lib64:/lib:/usr/lib
2011-12-01 15:47:31,158 INFO
org.apache.zookeeper.server.ZooKeeperServer: Server
environment:java.io.tmpdir=/tmp
2011-12-01 15:47:31,158 INFO
org.apache.zookeeper.server.ZooKeeperServer: Server
environment:java.compiler=
2011-12-01 15:47:31,158 INFO
org.apache.zookeeper.server.ZooKeeperServer: Server
environment:os.name=Linux
2011-12-01 15:47:31,158 INFO
org.apache.zookeeper.server.ZooKeeperServer: Server
environment:os.arch=amd64
2011-12-01 15:47:31,158 INFO
org.apache.zookeeper.server.ZooKeeperServer: Server
environment:os.version=3.0.0-13-generic
2011-12-01 15:47:31,159 INFO
org.apache.zookeeper.server.ZooKeeperServer: Server
environment:user.name=solr
2011-12-01 15:47:31,159 INFO
org.apache.zookeeper.server.ZooKeeperServer: Server
environment:user.home=/home/solr
2011-12-01 15:47:31,159 INFO
org.apache.zookeeper.server.ZooKeeperServer: Server
environment:user.dir=/home/solr/hbase-0.90.4
2011-12-01 15:47:31,169 INFO
org.apache.zookeeper.server.ZooKeeperServer: Created server with
tickTime 2000 minSessionTimeout 4000 maxSessionTimeout 4 datadir
/tmp/hbase-solr/zookeeper/zookeeper/version-2 snapdir
/tmp/hbase-solr/zookeeper/zookeeper/version-2
2011-12-01 15:47:31,185 INFO
org.apache.zookeeper.server.NIOServerCnxn: binding to port
0.0.0.0/0.0.0.0:2181
2011-12-01 15:47:31,189 INFO

Re: Constant error when putting large data into HBase

2011-12-01 Thread Lars George
Hi Ed,

You need to be more precise, I am afraid. First of all, what does "some node 
always dies" mean? Is the process gone? Which process is gone?
And the "error" you pasted is a WARN-level log that *might* indicate some 
trouble, but it is *not* the reason the node has died. Please elaborate.

Also consider posting the last few hundred lines of the process logs to 
pastebin so that someone can look at it.

Thanks,
Lars


On Dec 1, 2011, at 9:48 AM, edward choi wrote:

> Hi,
> I've had a problem that has been killing me for some days now.
> I am using CDH3 update2 version of Hadoop and HBase.
> When I do a large amount of bulk loading into HBase, some node always dies.
> It's not just one particular node.
> But one of the many nodes eventually fails to serve.
> 
> I set 4 gigs of heap space for the master and the regionservers. I monitored the
> processes, and when a node fails, it has not used all of its heap yet.
> So it is not a heap space problem.
> 
> Below is what I get when I perform bulk put using MapReduce.
> 
> 
> 11/12/01 17:17:20 INFO mapred.JobClient:  map 100% reduce 100%
> 11/12/01 17:18:31 INFO mapred.JobClient: Task Id :
> attempt_20302113_0034_r_13_0, Status : FAILED
> org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: Failed
> 1 action: servers with issues: lp171.etri.re.kr:60020,
>at
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatch(HConnectionManager.java:1239)
>at
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatchOfPuts(HConnectionManager.java:1253)
>at
> org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:828)
>at org.apache.hadoop.hbase.client.HTable.doPut(HTable.java:684)
>at org.apache.hadoop.hbase.client.HTable.put(HTable.java:669)
>at
> org.apache.hadoop.hbase.mapreduce.TableOutputFormat$TableRecordWriter.write(TableOutputFormat.java:127)
>at
> org.apache.hadoop.hbase.mapreduce.TableOutputFormat$TableRecordWriter.write(TableOutputFormat.java:82)
>at
> org.apache.hadoop.mapred.ReduceTask$NewTrackingRecordWriter.write(ReduceTask.java:514)
>at
> org.apache.hadoop.mapreduce.TaskInputOutputContext.write(TaskInputOutputContext.java:80)
>at etri.qa.mapreduce.PostProcess$PostPro
> attempt_20302113_0034_r_13_0: 2022
> 11/12/01 17:18:36 INFO mapred.JobClient: Task Id :
> attempt_20302113_0034_r_13_1, Status : FAILED
> org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: Failed
> 1 action: servers with issues: lp171.etri.re.kr:60020,
>at
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatch(HConnectionManager.java:1239)
>at
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatchOfPuts(HConnectionManager.java:1253)
>at
> org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:828)
>at org.apache.hadoop.hbase.client.HTable.doPut(HTable.java:684)
>at org.apache.hadoop.hbase.client.HTable.put(HTable.java:669)
>at
> org.apache.hadoop.hbase.mapreduce.TableOutputFormat$TableRecordWriter.write(TableOutputFormat.java:127)
>at
> org.apache.hadoop.hbase.mapreduce.TableOutputFormat$TableRecordWriter.write(TableOutputFormat.java:82)
>at
> org.apache.hadoop.mapred.ReduceTask$NewTrackingRecordWriter.write(ReduceTask.java:514)
>at
> org.apache.hadoop.mapreduce.TaskInputOutputContext.write(TaskInputOutputContext.java:80)
>at etri.qa.mapreduce.PostProcess$PostPro
> attempt_20302113_0034_r_13_1: 2022
> 11/12/01 17:18:37 INFO mapred.JobClient:  map 100% reduce 95%
> 11/12/01 17:18:44 INFO mapred.JobClient:  map 100% reduce 96%
> 11/12/01 17:18:47 INFO mapred.JobClient:  map 100% reduce 98%
> 11/12/01 17:18:50 INFO mapred.JobClient:  map 100% reduce 99%
> 11/12/01 17:18:53 INFO mapred.JobClient:  map 100% reduce 100%
> 11/12/01 17:20:07 INFO mapred.JobClient: Task Id :
> attempt_20302113_0034_r_13_3, Status : FAILED
> org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: Failed
> 1 action: servers with issues: lp171.etri.re.kr:60020,
>at
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatch(HConnectionManager.java:1239)
>at
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatchOfPuts(HConnectionManager.java:1253)
>at
> org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:828)
>at org.apache.hadoop.hbase.client.HTable.doPut(HTable.java:684)
>at org.apache.hadoop.hbase.client.HTable.put(HTable.java:669)
>

Constant error when putting large data into HBase

2011-12-01 Thread edward choi
Hi,
I've had a problem that has been killing me for some days now.
I am using CDH3 update2 version of Hadoop and HBase.
When I do a large amount of bulk loading into HBase, some node always dies.
It's not just one particular node.
But one of the many nodes eventually fails to serve.

I set 4 gigs of heap space for the master and the regionservers. I monitored the
processes, and when a node fails, it has not used all of its heap yet.
So it is not a heap space problem.

Below is what I get when I perform bulk put using MapReduce.


11/12/01 17:17:20 INFO mapred.JobClient:  map 100% reduce 100%
11/12/01 17:18:31 INFO mapred.JobClient: Task Id :
attempt_20302113_0034_r_13_0, Status : FAILED
org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: Failed
1 action: servers with issues: lp171.etri.re.kr:60020,
at
org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatch(HConnectionManager.java:1239)
at
org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatchOfPuts(HConnectionManager.java:1253)
at
org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:828)
at org.apache.hadoop.hbase.client.HTable.doPut(HTable.java:684)
at org.apache.hadoop.hbase.client.HTable.put(HTable.java:669)
at
org.apache.hadoop.hbase.mapreduce.TableOutputFormat$TableRecordWriter.write(TableOutputFormat.java:127)
at
org.apache.hadoop.hbase.mapreduce.TableOutputFormat$TableRecordWriter.write(TableOutputFormat.java:82)
at
org.apache.hadoop.mapred.ReduceTask$NewTrackingRecordWriter.write(ReduceTask.java:514)
at
org.apache.hadoop.mapreduce.TaskInputOutputContext.write(TaskInputOutputContext.java:80)
at etri.qa.mapreduce.PostProcess$PostPro
attempt_20302113_0034_r_13_0: 2022
11/12/01 17:18:36 INFO mapred.JobClient: Task Id :
attempt_20302113_0034_r_13_1, Status : FAILED
org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: Failed
1 action: servers with issues: lp171.etri.re.kr:60020,
at
org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatch(HConnectionManager.java:1239)
at
org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatchOfPuts(HConnectionManager.java:1253)
at
org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:828)
at org.apache.hadoop.hbase.client.HTable.doPut(HTable.java:684)
at org.apache.hadoop.hbase.client.HTable.put(HTable.java:669)
at
org.apache.hadoop.hbase.mapreduce.TableOutputFormat$TableRecordWriter.write(TableOutputFormat.java:127)
at
org.apache.hadoop.hbase.mapreduce.TableOutputFormat$TableRecordWriter.write(TableOutputFormat.java:82)
at
org.apache.hadoop.mapred.ReduceTask$NewTrackingRecordWriter.write(ReduceTask.java:514)
at
org.apache.hadoop.mapreduce.TaskInputOutputContext.write(TaskInputOutputContext.java:80)
at etri.qa.mapreduce.PostProcess$PostPro
attempt_20302113_0034_r_13_1: 2022
11/12/01 17:18:37 INFO mapred.JobClient:  map 100% reduce 95%
11/12/01 17:18:44 INFO mapred.JobClient:  map 100% reduce 96%
11/12/01 17:18:47 INFO mapred.JobClient:  map 100% reduce 98%
11/12/01 17:18:50 INFO mapred.JobClient:  map 100% reduce 99%
11/12/01 17:18:53 INFO mapred.JobClient:  map 100% reduce 100%
11/12/01 17:20:07 INFO mapred.JobClient: Task Id :
attempt_20302113_0034_r_13_3, Status : FAILED
org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: Failed
1 action: servers with issues: lp171.etri.re.kr:60020,
at
org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatch(HConnectionManager.java:1239)
at
org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatchOfPuts(HConnectionManager.java:1253)
at
org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:828)
at org.apache.hadoop.hbase.client.HTable.doPut(HTable.java:684)
at org.apache.hadoop.hbase.client.HTable.put(HTable.java:669)
at
org.apache.hadoop.hbase.mapreduce.TableOutputFormat$TableRecordWriter.write(TableOutputFormat.java:127)
at
org.apache.hadoop.hbase.mapreduce.TableOutputFormat$TableRecordWriter.write(TableOutputFormat.java:82)
at
org.apache.hadoop.mapred.ReduceTask$NewTrackingRecordWriter.write(ReduceTask.java:514)
at
org.apache.hadoop.mapreduce.TaskInputOutputContext.write(TaskInputOutputContext.java:80)
at etri.qa.mapreduce.PostProcess$PostPro
attempt_20302113_0034_r_13_3: 2022
11/12/01 17:20:09 INFO mapred.JobClient:  map 100% reduce 95%
11/12/01 17:20:09 IN
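
For context on where this happens: the stack trace points at TableOutputFormat,
i.e. a reducer that emits Puts through context.write(), which the client
buffers and flushes via flushCommits(). A rough, untested sketch of that kind
of job wiring (class, table and column names are made up; the mapper/input side
is omitted):

import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil;
import org.apache.hadoop.hbase.mapreduce.TableReducer;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;

public class BulkPutJobSketch {

  // Hypothetical reducer: turns each key and its values into a single Put.
  // context.write() hands the Put to TableOutputFormat, which buffers it in
  // an HTable and pushes it out via flushCommits() -- the call visible in
  // the stack trace above.
  public static class PutReducer
      extends TableReducer<Text, Text, ImmutableBytesWritable> {
    @Override
    protected void reduce(Text key, Iterable<Text> values, Context context)
        throws IOException, InterruptedException {
      Put put = new Put(Bytes.toBytes(key.toString()));
      for (Text v : values) {
        put.add(Bytes.toBytes("cf"), Bytes.toBytes("q"), Bytes.toBytes(v.toString()));
      }
      context.write(new ImmutableBytesWritable(put.getRow()), put);
    }
  }

  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    Job job = new Job(conf, "bulk put sketch");
    job.setJarByClass(BulkPutJobSketch.class);
    // Mapper, input format and input paths omitted -- they depend on the source data.
    TableMapReduceUtil.initTableReducerJob("mytable", PutReducer.class, job);
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}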

Re: regions and tables

2011-12-01 Thread Lars George
Hi Sam,

You need to handle them all separately. The note - I assume - was solely 
explaining the fact that the "load" of a region server is defined by the number 
of regions it hosts, not the number of tables. Whether you want to precreate the
regions for one table or for several, it is the same work: create the tables (one
by one) with the list of split points.
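
As a minimal, untested sketch of what that looks like in code (table, family
and split keys are made up; repeat per table):

import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.client.HBaseAdmin;
import org.apache.hadoop.hbase.util.Bytes;

public class PresplitTable {
  public static void main(String[] args) throws IOException {
    Configuration conf = HBaseConfiguration.create();
    HBaseAdmin admin = new HBaseAdmin(conf);
    HTableDescriptor desc = new HTableDescriptor("mytable");   // hypothetical name
    desc.addFamily(new HColumnDescriptor("cf"));
    // Explicit split points: the table starts with splitKeys.length + 1 regions.
    byte[][] splitKeys = new byte[][] {
        Bytes.toBytes("1"), Bytes.toBytes("2"), Bytes.toBytes("3")
    };
    admin.createTable(desc, splitKeys);
  }
}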

Lars

On Dec 1, 2011, at 7:50 AM, Sam Seigal wrote:

> HI,
> 
> I had a question about the relationship  between regions and tables.
> 
> Is there a way to pre-create regions for multiple tables ? or each
> table has its own set of regions managed independently ?
> 
> I read on one of the threads that there is really no limit on the
> number of tables, but that we need to be careful about is the number
> of regions. Does this mean that the regions can be pre created for
> multiple tables ?
> 
> Thank you,
> 
> Sam