Re: Suspected memory leak
Thank you all. I think it's the same problem as the one in the link Stack provided, because the heap size has stabilized but the non-heap size keeps growing. So I don't think it is the CMS GC bug. And we know the content of the problematic memory section; all the records contain info like the following: "|www.hostname02087075.comlhggmdjapwpfvkqvxgnskzzydiywoacjnpljkarlehrnzzbpbxc||460|||Agent" "BBZHtable_UFDR_058,048342220093168-02570" Jieshan. -----Original Message----- From: Kihwal Lee [mailto:kih...@yahoo-inc.com] Sent: December 2, 2011 4:20 To: d...@hbase.apache.org Cc: Ramakrishna s vasudevan; user@hbase.apache.org Subject: Re: Suspected memory leak Adding to the excellent write-up by Jonathan: Since a finalizer is involved, it takes two GC cycles to collect these objects. Due to a bug (or bugs) in the CMS GC, collection may not happen and the heap can grow really big. See http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=7112034 for details. Koji tried "-XX:-CMSConcurrentMTEnabled" and confirmed that all the socket-related objects were being collected properly. This option forces the concurrent marker to be a single thread. This was for HDFS, but I think the same applies here. Kihwal On 12/1/11 1:26 PM, "Stack" wrote: Make sure it's not the issue that Jonathan Payne identified a while back: https://groups.google.com/group/asynchbase/browse_thread/thread/c45bc7ba788b2357# St.Ack
Re: Problem in configuring pseudo distributed mode
Hello Christopher, I don't have 127.0.1.1 in my hosts file. Regards, Mohammad Tariq On Thu, Dec 1, 2011 at 6:42 PM, Christopher Dorner wrote: > Your hosts file (/etc/hosts) should contain only something like > 127.0.0.1 localhost > Or > 127.0.0.1 <hostname> > > It should not contain something like > 127.0.1.1 localhost. > > And I think you need to reboot after changing it. Hope that helps. > > Regards, > Christopher > On 01.12.2011 13:24, "Mohammad Tariq" wrote: > >> Hello list, >> >> Even after following the directions provided by you guys, the HBase >> book, and several other blogs and posts, I am not able to run HBase in >> pseudo-distributed mode, and I think there is some problem with the >> hosts file. I would highly appreciate it if someone who has done it >> properly could share his/her hosts and hbase-site.xml files. >> >> Regards, >> Mohammad Tariq >>
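For reference, a minimal /etc/hosts along the lines Christopher describes tends to look like the fragment below (the hostname "ubuntu" is only a placeholder; substitute the machine's actual hostname, and note there is no 127.0.1.1 line):

```
127.0.0.1   localhost
127.0.0.1   ubuntu
```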
OLAP-ish incremental BI capabilities for hbase, looking for collaborators.
Hello, We (Inadco) are looking for users and developers to engage with our open project, code-named (for lack of a better name) "hbase-lattice", in order to mutually benefit and eventually develop a mature HBase-based, real-time, OLAP-ish BI solution. The basic premise is to use a cuboid-lattice-like model (hence the name) to precompute manually specified cuboids (as opposed to statistically inferred ones) and then build query optimization on top of those (similar to how a DBA analyzes query use cases and decides which indexes to build where). On top of that, we threw in both a declarative API and query language support (our reporting system and real-time analytics platform actually use the query language, to make life easier for our developers there). We aim to answer aggregate queries over a cube of facts with very short HBase scans over the same projection table, so an analytical reply can be produced in under 1 ms plus network overhead. Another goal is short latency for fact availability, so we employ incremental MR fact compilation (~3-5 minutes from the fact event on average, depending on the number of cuboids and cluster size). At this point we have this project in production with the minimum capabilities we need, still in a kind of pilot mode, but it is to replace our previous manual incremental projection work on a permanent basis very soon (as soon as we finish migrating our reporting). There's a long todo list, and certainly any given third party would likely find gaps it would have liked to see filled. Right now we have integration only with the Jasper reports tool, but eventually it shouldn't be too difficult to wrap the client into a JDBC contract as well and enable practically any use (it's just that since we integrate both reporting and RT platforms tightly, we don't have a need for JDBC per se at the moment). 
The project readme document is here: https://github.com/dlyubimov/HBase-Lattice/blob/dev-0.1.x/docs/readme.pdf?raw=true or here https://github.com/inadco/HBase-Lattice/blob/dev-0.1.x/docs/readme.pdf?raw=true Aside from the capabilities mentioned in this document, we also support custom aggregate functions, which we can plug directly into the model definition (so, as it stands, we can develop a custom aggregate function set rather quickly and easily). The cube update is incremental, with a sequentially generated two-step Pig job (so perhaps the compiler cycle could be ~5 minutes from the actual event). We process and aggregate our impression, click, and other fact streams with it. The model can be updated under backward-compatibility conventions similar to protobuf's (as in: add stuff, but don't change types), and the changes can be pushed into a production system to take effect immediately henceforth, without any need to redeploy code. Operationally, I tested it on 1.2 bln/day of rather wide event fact streams packed as protobuf messages inside sequence files on 6 nodes, and the compilation was not even breaking a sweat. Obviously, my data aggregates highly over the time dimensions that correspond to the time of the event, so the HBase update load is actually pretty light due to the high degree of aggregation. But the biggest benefit is that one can scale the number of facts handled per unit of time horizontally, and pretty impressively. This is optimized for time-series data for the most part, so consequently one will see very limited and time-oriented support for dimension types and hierarchies at the moment. Generally I think the need is very common for BI solutions over big-data time series such as impression or request logs, but surprisingly I did not find a well-maintained HBase solution for that (although I did see either stale or less capable attempts out there -- I certainly have missed stuff floating around), hence this project. 
We are planning to maintain the project for a long time as a part of our production system. Please email me if there's interest as either a user or a collaborator. I think I saw a couple of emails on this list looking for a solution to a similar problem. This is partly inspired by, and intended as a complementary solution to, Tsuna's OpenTSDB (so big thanks to StumbleUpon's people for an example of how to handle time-series data). Thanks. -Dmitriy
Re: Atomicity questions
ZK is mostly for orchestrating between the master and regionservers. - Original Message - From: Mohit Anchlia To: user@hbase.apache.org; lars hofhansl Cc: Sent: Thursday, December 1, 2011 3:57 PM Subject: Re: Atomicity questions Thanks that makes it more clear. I also looked at mvcc code as you pointed out. So I am wondering where ZK is used specifically. On Thu, Dec 1, 2011 at 3:37 PM, lars hofhansl wrote: > Nope, not using ZK, that would not scale down to the cell level. > You'll probably have to stare at the code in > MultiVersionConsistencyControlfor a while (I know I had to). > > The basic flow of a write operation is this: > 1. lock the row > > 2. persist change to the write ahead log > 3. get a "writenumber" from mvcc (this is basically a timestamp) > > 4. apply change to the memstore (using that write number). > 5. advance the readpoint (maximum timestamp of changes that reads will see) > -- this is the point where readers see the change > 6. unlock the row > > (7. when memstore is full, flush it to a new disk file, but is done > asynchronously, and not really important, although it has some complicated > implications when the flush happens while there are readers reading from an > old read point) > > > The above is relaxed sometimes for idempotent operations. > > -- Lars > > > - Original Message - > From: Mohit Anchlia > To: user@hbase.apache.org; lars hofhansl > Cc: > Sent: Thursday, December 1, 2011 3:03 PM > Subject: Re: Atomicity questions > > Thanks. I'll try and take a look, but I haven't worked with zookeeper > before. Does it use zookeeper for any of ACID functionality? > > On Thu, Dec 1, 2011 at 2:55 PM, lars hofhansl wrote: >> Hi Mohit, >> >> the best way to study this is to look at MultiVersionConsistencyControl.java >> (since you are asking how this handled internally). 
>> >> In a nutshell this ensures that read operations don't see writes that are >> not completed, by (1) defining a thread read point that is rolled forward >> only after a completed operations and (2) assigning a special timestamp (not >> the timestamp that you set from the client API) to all KeyValues. >> >> -- Lars >> >> >> - Original Message - >> From: Mohit Anchlia >> To: user@hbase.apache.org >> Cc: >> Sent: Thursday, December 1, 2011 2:22 PM >> Subject: Atomicity questions >> >> I have some questions about ACID after reading this page, >> http://hbase.apache.org/acid-semantics.html >> >> - Atomicity point 5 : row must either be "a=1,b=1,c=1" or >> "a=2,b=2,c=2" and must not be something like "a=1,b=2,c=1". >> >> How is this internally handled in hbase such that above is possible? >> >> > >
Re: Atomicity questions
Thanks that makes it more clear. I also looked at mvcc code as you pointed out. So I am wondering where ZK is used specifically. On Thu, Dec 1, 2011 at 3:37 PM, lars hofhansl wrote: > Nope, not using ZK, that would not scale down to the cell level. > You'll probably have to stare at the code in > MultiVersionConsistencyControlfor a while (I know I had to). > > The basic flow of a write operation is this: > 1. lock the row > > 2. persist change to the write ahead log > 3. get a "writenumber" from mvcc (this is basically a timestamp) > > 4. apply change to the memstore (using that write number). > 5. advance the readpoint (maximum timestamp of changes that reads will see) > -- this is the point where readers see the change > 6. unlock the row > > (7. when memstore is full, flush it to a new disk file, but is done > asynchronously, and not really important, although it has some complicated > implications when the flush happens while there are readers reading from an > old read point) > > > The above is relaxed sometimes for idempotent operations. > > -- Lars > > > - Original Message - > From: Mohit Anchlia > To: user@hbase.apache.org; lars hofhansl > Cc: > Sent: Thursday, December 1, 2011 3:03 PM > Subject: Re: Atomicity questions > > Thanks. I'll try and take a look, but I haven't worked with zookeeper > before. Does it use zookeeper for any of ACID functionality? > > On Thu, Dec 1, 2011 at 2:55 PM, lars hofhansl wrote: >> Hi Mohit, >> >> the best way to study this is to look at MultiVersionConsistencyControl.java >> (since you are asking how this handled internally). >> >> In a nutshell this ensures that read operations don't see writes that are >> not completed, by (1) defining a thread read point that is rolled forward >> only after a completed operations and (2) assigning a special timestamp (not >> the timestamp that you set from the client API) to all KeyValues. 
>> >> -- Lars >> >> >> - Original Message - >> From: Mohit Anchlia >> To: user@hbase.apache.org >> Cc: >> Sent: Thursday, December 1, 2011 2:22 PM >> Subject: Atomicity questions >> >> I have some questions about ACID after reading this page, >> http://hbase.apache.org/acid-semantics.html >> >> - Atomicity point 5 : row must either be "a=1,b=1,c=1" or >> "a=2,b=2,c=2" and must not be something like "a=1,b=2,c=1". >> >> How is this internally handled in hbase such that above is possible? >> >> > >
Re: Atomicity questions
Nope, not using ZK, that would not scale down to the cell level. You'll probably have to stare at the code in MultiVersionConsistencyControl for a while (I know I had to). The basic flow of a write operation is this: 1. lock the row 2. persist change to the write ahead log 3. get a "writenumber" from mvcc (this is basically a timestamp) 4. apply change to the memstore (using that write number). 5. advance the readpoint (maximum timestamp of changes that reads will see) -- this is the point where readers see the change 6. unlock the row (7. when memstore is full, flush it to a new disk file, but is done asynchronously, and not really important, although it has some complicated implications when the flush happens while there are readers reading from an old read point) The above is relaxed sometimes for idempotent operations. -- Lars - Original Message - From: Mohit Anchlia To: user@hbase.apache.org; lars hofhansl Cc: Sent: Thursday, December 1, 2011 3:03 PM Subject: Re: Atomicity questions Thanks. I'll try and take a look, but I haven't worked with zookeeper before. Does it use zookeeper for any of ACID functionality? On Thu, Dec 1, 2011 at 2:55 PM, lars hofhansl wrote: > Hi Mohit, > > the best way to study this is to look at MultiVersionConsistencyControl.java > (since you are asking how this is handled internally). > > In a nutshell this ensures that read operations don't see writes that are not > completed, by (1) defining a thread read point that is rolled forward only > after a completed operation and (2) assigning a special timestamp (not the > timestamp that you set from the client API) to all KeyValues. 
> > -- Lars > > > - Original Message - > From: Mohit Anchlia > To: user@hbase.apache.org > Cc: > Sent: Thursday, December 1, 2011 2:22 PM > Subject: Atomicity questions > > I have some questions about ACID after reading this page, > http://hbase.apache.org/acid-semantics.html > > - Atomicity point 5 : row must either be "a=1,b=1,c=1" or > "a=2,b=2,c=2" and must not be something like "a=1,b=2,c=1". > > How is this internally handled in hbase such that above is possible? > >
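To make the write-number / read-point interplay concrete, here is a toy sketch of steps 3-5 above in plain Java. This is not HBase's actual MultiVersionConsistencyControl code; the class and method names are invented for illustration. The idea: readers only see memstore entries whose write number is at or below the read point, so a multi-column mutation becomes visible all at once.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of the MVCC idea (names invented, not the real HBase API).
// Step 3: beginWrite() hands out a write number.
// Step 4: apply() tags each cell with that write number.
// Step 5: completeWrite() advances the read point, making the whole
//         mutation visible to readers atomically.
class MvccSketch {
    static final class Entry {
        final String cell;
        final long writeNumber;
        Entry(String cell, long writeNumber) {
            this.cell = cell;
            this.writeNumber = writeNumber;
        }
    }

    private long nextWriteNumber = 1;
    private long readPoint = 0;  // highest write number readers may see
    private final List<Entry> memstore = new ArrayList<>();

    synchronized long beginWrite() {
        return nextWriteNumber++;
    }

    synchronized void apply(String cell, long writeNumber) {
        memstore.add(new Entry(cell, writeNumber));  // not yet visible to readers
    }

    synchronized void completeWrite(long writeNumber) {
        // Simplification: the real code also waits for earlier in-flight
        // writes before rolling the read point forward.
        readPoint = Math.max(readPoint, writeNumber);
    }

    synchronized List<String> read() {
        List<String> visible = new ArrayList<>();
        for (Entry e : memstore) {
            if (e.writeNumber <= readPoint) visible.add(e.cell);
        }
        return visible;
    }
}
```

In this model a reader that runs between apply() and completeWrite() sees either none or all of a=2, b=2, c=2 — which is atomicity point 5 from the ACID semantics page.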
Re: regions and tables
Excellent question. I would say that if you are planning to have thousands of tables with the same schema then instead you should use one table with prefixed rows. The 20 regions / region server is a general guideline that works best in the single tenant case, meaning that you have only 1 table and it's perfectly distributed. My first answer brings you back to that form. In the multi-tenant case where every table is different not only in the nature of the data they contain but also in their usage patterns, the answer basically is YMMV. There really is no universal answer at the moment. At SU we have >250 tables and we have ~200 regions per region server, works well for us. J-D On Thu, Dec 1, 2011 at 12:26 PM, Sam Seigal wrote: > So is it fair to say that the number of tables one can create is also > bounded by the number of regions that the cluster can support ? > > For example, given 5 region servers and keeping 20 regions / region > server - with 5 tables, I am restricted to only being able to scale a > single table to 20 regions across the cluster - this might be fine. > However, for 20 tables, I can only scale upto 5 regions / table across > the cluster - which might not be a good idea. Comments ? > > > On Thu, Dec 1, 2011 at 5:31 AM, Doug Meil > wrote: >> To expand on what Lars said, there is an example of how this is layed out >> on disk... >> >> http://hbase.apache.org/book.html#trouble.namenode.disk >> >> ... regions distribute the table, so two different tables will be >> distributed by separate sets of regions. >> >> >> >> >> On 12/1/11 3:14 AM, "Lars George" wrote: >> >>>Hi Sam, >>> >>>You need to handle them all separately. The note - I assume - was solely >>>explaining the fact that the "load" of a region server is defined by the >>>number of regions it hosts, not the number of tables. If you want to >>>precreate the regions for one or more than one table is the same work: >>>create the tables (one by one) with the list of split points. 
>>> >>>Lars >>> >>>On Dec 1, 2011, at 7:50 AM, Sam Seigal wrote: >>> HI, I had a question about the relationship between regions and tables. Is there a way to pre-create regions for multiple tables ? or each table has its own set of regions managed independently ? I read on one of the threads that there is really no limit on the number of tables, but that we need to be careful about is the number of regions. Does this mean that the regions can be pre created for multiple tables ? Thank you, Sam >>> >>> >> >>
Re: Atomicity questions
On Thu, Dec 1, 2011 at 3:03 PM, Mohit Anchlia wrote: > Thanks. I'll try and take a look, but I haven't worked with zookeeper > before. Does it use zookeeper for any of ACID functionality? > No. St.Ack
Re: hbase sandbox at ImageShack.
On Thu, Dec 1, 2011 at 2:34 PM, Jack Levin wrote: > Hello All. I've setup an hbase (0.90.4) sandbox running on servers > where we have some excess capacity. Feel free to play with it, e.g. > create tables, run load tests, benchmarks, essentially do whatever you > want, just don't put your production services there, because while we > do have it up due to excess capacity, we may have to reclaim the > hardware at some point. > (don't worry about slamming it hard, those servers are running on > non-production zone of our network). > Nice one Jack. Thanks. St.Ack
Re: Atomicity questions
Thanks. I'll try and take a look, but I haven't worked with zookeeper before. Does it use zookeeper for any of ACID functionality? On Thu, Dec 1, 2011 at 2:55 PM, lars hofhansl wrote: > Hi Mohit, > > the best way to study this is to look at MultiVersionConsistencyControl.java > (since you are asking how this handled internally). > > In a nutshell this ensures that read operations don't see writes that are not > completed, by (1) defining a thread read point that is rolled forward only > after a completed operations and (2) assigning a special timestamp (not the > timestamp that you set from the client API) to all KeyValues. > > -- Lars > > > - Original Message - > From: Mohit Anchlia > To: user@hbase.apache.org > Cc: > Sent: Thursday, December 1, 2011 2:22 PM > Subject: Atomicity questions > > I have some questions about ACID after reading this page, > http://hbase.apache.org/acid-semantics.html > > - Atomicity point 5 : row must either be "a=1,b=1,c=1" or > "a=2,b=2,c=2" and must not be something like "a=1,b=2,c=1". > > How is this internally handled in hbase such that above is possible? > >
Re: Atomicity questions
Hi Mohit, the best way to study this is to look at MultiVersionConsistencyControl.java (since you are asking how this is handled internally). In a nutshell this ensures that read operations don't see writes that are not completed, by (1) defining a thread read point that is rolled forward only after a completed operation and (2) assigning a special timestamp (not the timestamp that you set from the client API) to all KeyValues. -- Lars - Original Message - From: Mohit Anchlia To: user@hbase.apache.org Cc: Sent: Thursday, December 1, 2011 2:22 PM Subject: Atomicity questions I have some questions about ACID after reading this page, http://hbase.apache.org/acid-semantics.html - Atomicity point 5 : row must either be "a=1,b=1,c=1" or "a=2,b=2,c=2" and must not be something like "a=1,b=2,c=1". How is this internally handled in hbase such that above is possible?
hbase sandbox at ImageShack.
Hello All. I've set up an HBase (0.90.4) sandbox running on servers where we have some excess capacity. Feel free to play with it, e.g. create tables, run load tests, benchmarks, essentially do whatever you want, just don't put your production services there, because while we do have it up due to excess capacity, we may have to reclaim the hardware at some point. (Don't worry about slamming it hard; those servers are running in a non-production zone of our network.) To create a client that can interface with the cluster, just download hbase 0.90.4, and compile your code against the jars that come with it (applies to the client below on pastebin). I've created a very simple benchmark (proof-of-concept client) and installed it in a micro EC2 instance (you can find the code here http://pastebin.com/BxSs2daY): [ec2-user@ip-10-160-135-246 ~]$ java Hello 2>> /dev/null Enter the number of rows you want to Put and Get : 1 Enter the row value payload : 12345 Writing 1 rows took 41470 milliseconds (<--- 4.1 ms per row not too bad!) Reading 1 rows took 43731 milliseconds [ec2-user@ip-10-160-135-246 ~]$ hbase/bin/hbase shell HBase Shell; enter 'help' for list of supported commands. Type "exit" to leave the HBase Shell Version 0.90.4, r1150278, Sun Jul 24 15:53:29 PDT 2011 hbase(main):001:0> list TABLE myTable 1 row(s) in 0.9190 seconds hbase(main):002:0> The HBase ZooKeeper quorum has only one address, "img700.imageshack.us:2181"; see the code above on how to interface with it. If you find this setup interesting or useful, or have questions about it, please email me; otherwise have fun! -Jack (PS. Don't delete other people's tables and don't expose data you don't want to be exposed; the cluster is read/write enabled for _all_, and we will carry no liabilities for anything whatsoever :)
Re: Suspected memory leak
Adding to the excellent write-up by Jonathan: Since finalizer is involved, it takes two GC cycles to collect them. Due to a bug/bugs in the CMS GC, collection may not happen and the heap can grow really big. See http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=7112034 for details. Koji tried "-XX:-CMSConcurrentMTEnabled" and confirmed that all the socket related objects were being collected properly. This option forces the concurrent marker to be one thread. This was for HDFS, but I think the same applies here. Kihwal On 12/1/11 1:26 PM, "Stack" wrote: Make sure its not the issue that Jonathan Payne identifiied a while back: https://groups.google.com/group/asynchbase/browse_thread/thread/c45bc7ba788b2357# St.Ack
Atomicity questions
I have some questions about ACID after reading this page, http://hbase.apache.org/acid-semantics.html - Atomicity point 5 : row must either be "a=1,b=1,c=1" or "a=2,b=2,c=2" and must not be something like "a=1,b=2,c=1". How is this handled internally in HBase such that the above is possible?
Re: regions and tables
So is it fair to say that the number of tables one can create is also bounded by the number of regions that the cluster can support ? For example, given 5 region servers and keeping 20 regions / region server - with 5 tables, I am restricted to only being able to scale a single table to 20 regions across the cluster - this might be fine. However, for 20 tables, I can only scale up to 5 regions / table across the cluster - which might not be a good idea. Comments ? On Thu, Dec 1, 2011 at 5:31 AM, Doug Meil wrote: > To expand on what Lars said, there is an example of how this is laid out > on disk... > > http://hbase.apache.org/book.html#trouble.namenode.disk > > ... regions distribute the table, so two different tables will be > distributed by separate sets of regions. > > > > > On 12/1/11 3:14 AM, "Lars George" wrote: > >>Hi Sam, >> >>You need to handle them all separately. The note - I assume - was solely >>explaining the fact that the "load" of a region server is defined by the >>number of regions it hosts, not the number of tables. If you want to >>precreate the regions for one or more than one table is the same work: >>create the tables (one by one) with the list of split points. >> >>Lars >> >>On Dec 1, 2011, at 7:50 AM, Sam Seigal wrote: >> >>> HI, >>> >>> I had a question about the relationship between regions and tables. >>> >>> Is there a way to pre-create regions for multiple tables ? or each >>> table has its own set of regions managed independently ? >>> >>> I read on one of the threads that there is really no limit on the >>> number of tables, but that we need to be careful about is the number >>> of regions. Does this mean that the regions can be pre created for >>> multiple tables ? >>> >>> Thank you, >>> >>> Sam >> >> > >
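Sam's arithmetic can be written down as a back-of-the-envelope helper (nothing HBase-specific here; the 20-regions-per-server figure is only the guideline quoted in this thread, and the even split across tables is an idealization):

```java
// If a cluster comfortably hosts servers * regionsPerServer regions in
// total, splitting that budget evenly across tables bounds how many
// regions each table can grow to.
class RegionBudget {
    static int regionsPerTable(int servers, int regionsPerServer, int tables) {
        return (servers * regionsPerServer) / tables;
    }
}
```

For 5 region servers at 20 regions each: 5 tables leave 20 regions per table, while 20 tables leave only 5 — the numbers in the question above.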
Re: Scan Metrics in Ganglia
This can be a bit tricky because of the scan caching, for example... http://hbase.apache.org/book.html#rs_metrics 12.4.2.14. hbase.regionserver.requests Total number of read and write requests. Requests correspond to RegionServer RPC calls, thus a single Get will result in 1 request, but a Scan with caching set to 1000 will result in 1 request for each 'next' call (i.e., not each row). A bulk-load request will constitute 1 request per HFile. On 12/1/11 2:23 PM, "sagar naik" wrote: >Hi, >I can see metrics for get calls (number of get , avg time for get) >However, I could not do so for scan calls > >Please let me know how can I measure > >Thanks >-Sagar >
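The "request per 'next' call" behaviour means the request metric for a scan depends on the caching setting, not the row count. A hypothetical helper (not an HBase API) makes the relationship explicit:

```java
// Approximate number of RegionServer RPCs ("requests") a scan generates:
// one 'next' call per batch of `caching` rows, rounded up.
class ScanRpcEstimate {
    static long requests(long rowsScanned, int caching) {
        return (rowsScanned + caching - 1) / caching;  // ceiling division
    }
}
```

So a scan over 10,000 rows with caching=1000 shows up as roughly 10 requests, while the same scan with caching=1 shows up as 10,000 — which is why raw request counts are a tricky proxy for scan activity.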
Re: Suspected memory leak
Make sure it's not the issue that Jonathan Payne identified a while back: https://groups.google.com/group/asynchbase/browse_thread/thread/c45bc7ba788b2357# St.Ack
Scan Metrics in Ganglia
Hi, I can see metrics for get calls (number of gets, avg time per get). However, I could not do so for scan calls. Please let me know how I can measure these. Thanks -Sagar
Re: Constant error when putting large data into HBase
Here's my take on the issue. > I monitored the > process and when any node fails, it has not used all the heaps yet. > So it is not a heap space problem. I disagree. Unless you load a region server heap with more data than there's heap available (loading batches of humongous rows for example), it will not fill it. It doesn't mean you have enough heap, because HBase will take precautions in order to not run out of memory. In your case, you have a lot of block cache thrashing: 2011-12-01 17:05:49,084 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction started; Attempting to free 79.68 MB of total=677.18 MB 2011-12-01 17:05:49,087 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction completed; freed=79.72 MB, total=597.78 MB, single=372.13 MB, multi=298.71 MB, memory=0 KB 2011-12-01 17:05:50,069 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction started; Attempting to free 79.67 MB of total=677.17 MB 2011-12-01 17:05:50,084 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction completed; freed=79.67 MB, total=597.75 MB, single=372.05 MB, multi=298.71 MB, memory=0 KB etc This is the kind of precaution I'm talking about. BTW in MR jobs you should always disable the block cache as shown in this example: http://hbase.apache.org/book/mapreduce.example.html#mapreduce.example.read scan.setCacheBlocks(false); // don't set to true for MR jobs I don't know if this is related to your current job; it's not clear from your description of the job whether the mapping is done on HBase. > And finally, according to the logs I pasted, I see other lines with DEBUG > or INFO. So I thought this was okay. > Is there a way to change WARN level log to some other level log? If you'd > let me know, I will paste another set of logs. The connection reset stuff is interesting, and this warning indeed indicates that something's weird. 
It would be interesting to see some task logs (not the TaskTracker, nor the JobTracker, they are usually of little use while debugging this type of problem). In any case what it means is that the client (the map or reduce task, or even some other client you have) reset the connection, so the region server just drops it. > The regionserver that contains that specific region fails. That is the > point. If I move that region to another regionserver using hbase shell, > then that regionserver fails. > With the same log output. You haven't shown us a log output of a dying region server yet. Actually from those logs I don't even see a lot of importing going on, just a lot of reading. Look for ERROR level logging, then grab everything that's around that and post it here please (go up in the log to the point where it looks like normal logging; usually the ERROR will get logged after some important lines). It would also be interesting to see the full reducer task log. J-D On Thu, Dec 1, 2011 at 12:48 AM, edward choi wrote: > Hi, > I've had a problem that has been killing me for some days now. > I am using CDH3 update2 version of Hadoop and Hbase. > When I do a large amount of bulk loading into Hbase, some node always dies. > It's not just one particular node. > But one of many nodes fails to serve eventually. > > I set 4 gigs of heap space for master, and regionservers. I monitored the > process and when any node fails, it has not used all the heaps yet. > So it is not a heap space problem. > > Below is what I get when I perform bulk put using MapReduce. >
Re: Strategies for aggregating data in a HBase table
Or you could just prefix the row keys. Not sure if this is needed natively, or as a tool on top of HBase. Hive for example could do exactly that for you when Hive partitions are implemented for HBase. J-D On Wed, Nov 30, 2011 at 1:34 PM, Sam Seigal wrote: > What about "partitioning" at a table level. For example, create 12 > tables for the given year. Design the row keys however you like, let's > say using SHA/MD hashes. Place transactions in the appropriate table > and then do aggregations based on that table alone (this is assuming > you won't get transactions with timestamps in the past going back a > month). The idea is to archive the tables for a given year and start > fresh the next. This is acceptable in my use case. I am in the process > of trying this out, so do not have any performance numbers, issues yet > ... Experts can comment. > > On a further note, having HBase support this natively i.e. one more > level of partitioning above the row key , but below a table can be > beneficial for use cases like these ones. Comments ... ? > > On Wed, Nov 30, 2011 at 11:53 AM, Jean-Daniel Cryans > wrote: >> Inline. >> >> J-D >> >> On Mon, Nov 28, 2011 at 1:55 AM, Steinmaurer Thomas >> wrote: >>> Hello, >>> ... >>> >>> While it is an option processing the entire HBase table e.g. every night >>> when we go live, it probably isn't an option when data volume grows over >>> the years. So, what options are there for some kind of incremental >>> aggregating only new data? >> >> Yeah you don't want to go there. >> >>> >>> - Perhaps using versioning (internal timestamp) might be an option? >> >> I guess you could do rollups and ditch the raw data, if you don't need it. >> >>> >>> - Perhaps having some kind of HBase (daily) staging table which is >>> truncated after aggregating data is an option? >> >> If you do the aggregations nightly then you won't have "access to >> aggregated data very quickly". 
>> >>> >>> - How could Co-processors help here (at the time of the Go-Live, they >>> might be available in e.g. Cloudera)? >> >> Coprocessors are more like an internal HBase tool, so don't put all >> your eggs there until you play with them. What you could do is get the >> 0.92.0 RC0 tarball and try them out :) >> >>> Any ideas/comments are appreciated. >> >> Normally data is stored in a way that's not easy to query in a batch >> or analytics mode, so an ETL step is introduced. You'll probably need >> to do the same, as in you could asynchronously stream your data to >> other HBase tables or Hive or Pig via logs or replication and then >> directly insert it into the format it needs to be or stage it for >> later aggregations. If you explore those avenues I'm sure you'll find >> concepts that are very very similar to those you listed regarding >> RDBMS. >> >> You could also keep live counts using atomic increments, you'd issue >> those at write time or async. >> >> Hope this helps, >> >> J-D
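The "just prefix the row keys" idea from the top of this thread can be sketched as follows. This is a hypothetical helper, assuming the SHA/MD-style hashing Sam mentioned (MD5 via the JDK here): the coarse time bucket goes in front, so one table holds what would otherwise be twelve monthly tables, and a "partition" is simply a contiguous range of row keys.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Builds "2011-11|<md5-of-id>" style row keys: the month bucket acts as
// the partition prefix, the hash spreads rows within the bucket.
class PrefixedKeys {
    static String rowKey(String monthBucket, String id) {
        try {
            byte[] h = MessageDigest.getInstance("MD5")
                    .digest(id.getBytes(StandardCharsets.UTF_8));
            StringBuilder sb = new StringBuilder(monthBucket).append('|');
            for (byte b : h) sb.append(String.format("%02x", b));
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new AssertionError("MD5 is a mandatory JDK algorithm", e);
        }
    }
}
```

Aggregating one month then becomes a scan bounded by the "2011-11|" prefix rather than a pass over a dedicated monthly table, which is the partitioning effect without the table-per-month bookkeeping.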
Re: hbase-regionserver1: bash: {HBASE_HOME}/bin/hbase-daemon.sh: No such file or directory
So since I don't see the rest of the log I'll have to assume that the region server was never able to connect to the master. Connection refused could be a firewall, start the master and then try to telnet from the other machines to master:6. J-D On Thu, Dec 1, 2011 at 6:45 AM, Vamshi Krishna wrote: > I found in the logs of region server machines, i found this error (on both > regionserver machines) > > 2011-11-30 14:43:42,447 INFO org.apache.hadoop.ipc.HbaseRPC: Server at > hbase-master/10.0.1.54:60020 could not be reached after 1 tries, giving up. > *2011-11-30 14:44:37,762* WARN > org.apache.hadoop.hbase.regionserver.HRegionServer: Unable to connect to > master. Retrying. Error was: > java.net.ConnectException: Connection refused > at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) > at > sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:567) > at > org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206) > at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:404) > at > org.apache.hadoop.hbase.ipc.HBaseClient$Connection.setupIOstreams(HBaseClient.java:328) > at > org.apache.hadoop.hbase.ipc.HBaseClient.getConnection(HBaseClient.java:883) > at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:750) > at > org.apache.hadoop.hbase.ipc.HBaseRPC$Invoker.invoke(HBaseRPC.java:257) > at $Proxy5.getProtocolVersion(Unknown Source) > at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:419) > at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:393) > at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:444) > at org.apache.hadoop.hbase.ipc.HBaseRPC.waitForProxy(HBaseRPC.java:349) > at > org.apache.hadoop.hbase.regionserver.HRegionServer.getMaster(HRegionServer.java:1462) > at > org.apache.hadoop.hbase.regionserver.HRegionServer.reportForDuty(HRegionServer.java:1515) > at > org.apache.hadoop.hbase.regionserver.HRegionServer.tryReportForDuty(HRegionServer.java:1499) > at > 
org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:572) > at java.lang.Thread.run(Thread.java:662) > 2011-11-30 14:44:40,768 INFO > org.apache.hadoop.hbase.regionserver.HRegionServer: Attempting connect to > Master server at hbase-master:6 > *2011-11-30 14:45:40,847* WARN > org.apache.hadoop.hbase.regionserver.HRegionServer: Unable to connect to > master. Retrying. Error was: > java.net.ConnectException: Connection refused > at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) > at > sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:567) > at > org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206) > at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:404) > at > org.apache.hadoop.hbase.ipc.HBaseClient$Connection.setupIOstreams(HBaseClient.java:328) > at > org.apache.hadoop.hbase.ipc.HBaseClient.getConnection(HBaseClient.java:883) > at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:750) > at > org.apache.hadoop.hbase.ipc.HBaseRPC$Invoker.invoke(HBaseRPC.java:257) > at $Proxy5.getProtocolVersion(Unknown Source) > at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:419) > at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:393) > at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:444) > at org.apache.hadoop.hbase.ipc.HBaseRPC.waitForProxy(HBaseRPC.java:349) > at > org.apache.hadoop.hbase.regionserver.HRegionServer.getMaster(HRegionServer.java:1462) > at > org.apache.hadoop.hbase.regionserver.HRegionServer.reportForDuty(HRegionServer.java:1515) > at > org.apache.hadoop.hbase.regionserver.HRegionServer.tryReportForDuty(HRegionServer.java:1499) > at > org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:572) > at java.lang.Thread.run(Thread.java:662) > > > and the same error is observed in the whole log repeatedly. 
After seeing it > what I understand is that somehow the master started HRegionServer daemons on > the machines, but from then onwards the RegionServer machines are not able > to communicate with the master. If we observe, it is trying to communicate with > the master every minute. > > But I don't understand where to check and modify things.. please > help. I feel all connections are OK. > > On Thu, Dec 1, 2011 at 12:28 AM, Jean-Daniel Cryans > wrote: > >> stop-hbase.sh only tells the master to stop, which in turn will tell >> the region servers to stop. If they are still running, it might be >> because of an error. Look at their logs to figure out what's going on. >> >> J-D >> >> On Tue, Nov 29, 2011 at 10:46 PM, Vamshi Krishna >> wrote: >> > hey sorry for posting multiple times. >> > J-D, As you said, I referred to my regionserver log, where I found >> > Could not resolve the DNS name of vamshikrishna-desktop >> > so I added an alias ' vamshikrishna-desktop ' to its correspond
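The connectivity check J-D suggests can be sketched as a small shell probe. The host name `hbase-master` and the port are assumptions; 60000 is the default master RPC port, so substitute whatever your cluster actually uses:

```shell
# Probe a TCP port; prints "<host>:<port> open" or "<host>:<port> closed".
# hbase-master / 60000 (the default master RPC port) are assumed values;
# run this from each region server machine against the master.
check_port() {
  host=$1
  port=$2
  if (exec 3<>"/dev/tcp/${host}/${port}") 2>/dev/null; then
    echo "${host}:${port} open"
  else
    echo "${host}:${port} closed"
  fi
}

check_port localhost 65000   # demo against a port that is almost surely unused
```

If the probe prints "closed" from a region server while the master process is up, look at firewall rules or at the address the master actually bound to (its /etc/hosts entries and DNS settings).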
RE: Suspected memory leak
You can create several heap dumps of the JVM process in question and compare heap allocations. To create a heap dump: jmap pid To analyze: 1. jhat 2. visualvm 3. any commercial profiler One note: -Xmn12G ??? How long are your minor collection GC pauses? Best regards, Vladimir Rodionov Principal Platform Engineer Carrier IQ, www.carrieriq.com e-mail: vrodio...@carrieriq.com From: Ramkrishna S Vasudevan [ramkrishna.vasude...@huawei.com] Sent: Wednesday, November 30, 2011 6:51 PM To: user@hbase.apache.org; d...@hbase.apache.org Subject: RE: Suspected memory leak Adding dev list to get some suggestions. Regards Ram -Original Message- From: Shrijeet Paliwal [mailto:shrij...@rocketfuel.com] Sent: Thursday, December 01, 2011 8:08 AM To: user@hbase.apache.org Cc: Gaojinchao; Chenjian Subject: Re: Suspected memory leak Jieshan, We backported https://issues.apache.org/jira/browse/HBASE-2937 to 0.90.3 -Shrijeet 2011/11/30 bijieshan > Hi Shrijeet, > > I think that jira is relevant to trunk, but not to 90.X, as there's no > timeout mechanism in 90.X. Right? > We found this problem in 90.x. > > Thanks, > > Jieshan. > > -Original Message- > From: Shrijeet Paliwal [mailto:shrij...@rocketfuel.com] > Sent: December 1, 2011 10:26 > To: user@hbase.apache.org > Cc: Gaojinchao; Chenjian > Subject: Re: Suspected memory leak > > Gaojinchao, > > I had filed this some time ago: > https://issues.apache.org/jira/browse/HBASE-4633 > But after some recent insights on our application code, I am inclined to > think the leak (or memory 'hold') is in our application. But it will be good to > check out either way. > I need to update the jira with my saga. See if the description of the issue I > posted there matches yours. If not, maybe you can update it with your story > in detail. > > -Shrijeet > > 2011/11/30 Gaojinchao > > > I have noticed some memory leak problems in my HBase client. 
> > RES has increased to 27g > > PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND > > 12676 root 20 0 30.8g 27g 5092 S 2 57.5 587:57.76 > > /opt/java/jre/bin/java -Djava.library.path=lib/. > > > > But I am not sure whether the leak comes from the HBase client jar itself or just our > > client code. > > > > These are some parameters of the JVM: > > -Xms15g -Xmn12g -Xmx15g -XX:PermSize=64m -XX:+UseParNewGC > > -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=65 > > -XX:+UseCMSCompactAtFullCollection -XX:CMSFullGCsBeforeCompaction=1 > > -XX:+CMSParallelRemarkEnabled > > > > Who has experience with this case? I need to continue digging :) > > > > > > > > From: Gaojinchao > > Sent: November 30, 2011 11:02 > > To: user@hbase.apache.org > > Subject: Suspected memory leak > > > > In the HBaseClient process, I found the heap has been increasing. > > I used the command 'cat smaps' to get the heap size. > > It seems that when the thread pool in HTable has released its unused > > threads, if you use the put-list API to put data again, the memory is > increased. > > > > Who has experience with this case? > > > > Below is the heap of the HBase client: > > C3S31:/proc/18769 # cat smaps > > 4010a000-4709d000 rwxp 00:00 0 > > [heap] > > Size: 114252 kB > > Rss: 114044 kB > > Pss: 114044 kB > > > > 4010a000-4709d000 rwxp 00:00 0 > > [heap] > > Size: 114252 kB > > Rss: 114044 kB > > Pss: 114044 kB > > > > 4010a000-48374000 rwxp 00:00 0 > > [heap] > > Size: 133544 kB > > Rss: 16 kB > > Pss: 16 kB > > > > 4010a000-49f2 rwxp 00:00 0 > > [heap] > > Size: 161880 kB > > Rss: 161672 kB > > Pss: 161672 kB > > > > 4010a000-4c5de000 rwxp 00:00 0 > > [heap] > > Size: 201552 kB > > Rss: 201344 kB > > Pss: 201344 kB
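A hedged sketch of Vladimir's dump-and-compare workflow (the PID and file names below are placeholders; exact flags can differ between JDK builds):

```shell
# Take two binary heap dumps some time apart; <pid> is a placeholder
# for the HBase client JVM's process id (find it with jps).
jmap -dump:format=b,file=heap-before.hprof <pid>
# ... let the workload run for a while, then:
jmap -dump:format=b,file=heap-after.hprof <pid>

# Browse one dump with jhat (serves HTML on port 7000 by default),
# or open both in VisualVM / a commercial profiler and diff allocations.
jhat heap-after.hprof
```

Since the report here is that the non-heap portion grows while the heap is stable, comparing the two dumps mainly helps rule the Java heap out; native allocations will not show up in them.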
Re: Performance characteristics of scans using timestamp as the filter
Scans work on startRow/stopRow... http://hbase.apache.org/book.html#scan ... you can also select by timestamp *within the startRow/stopRow selection*, but this isn't intended to quickly select rows by timestamp irrespective of their keys. On 12/1/11 9:03 AM, "Srikanth P. Shreenivas" wrote: >So, will it be safe to assume that Scan queries with TimeRange will >perform well and will read only necessary portions of the tables instead >of doing a full table scan? > >I have run into a situation wherein I would like to find out all rows >that got created/updated during a time range. >I was hoping that I could do a time range scan. > >Regards, >Srikanth > > > >-Original Message- >From: Stuti Awasthi [mailto:stutiawas...@hcl.com] >Sent: Monday, October 10, 2011 3:44 PM >To: user@hbase.apache.org >Subject: RE: Performance characteristics of scans using timestamp as the >filter > >Yes, it's true. >Your cluster time should be in sync for reliable functioning. > >-Original Message- >From: Steinmaurer Thomas [mailto:thomas.steinmau...@scch.at] >Sent: Monday, October 10, 2011 3:04 PM >To: user@hbase.apache.org >Subject: RE: Performance characteristics of scans using timestamp as the >filter > >Isn't a synchronized time across all nodes a general requirement for >running the cluster reliably? > >Regards, >Thomas > >-Original Message- >From: Stuti Awasthi [mailto:stutiawas...@hcl.com] >Sent: Monday, 10 October 2011 11:18 >To: user@hbase.apache.org >Subject: RE: Performance characteristics of scans using timestamp as the >filter > >Steinmaurer, > >I have done a little POC with a TimeRange scan and it worked fine for me. >Another thing to note is that the time should be the same on all machines >of your HBase cluster. 
> >-Original Message- >From: Steinmaurer Thomas [mailto:thomas.steinmau...@scch.at] >Sent: Monday, October 10, 2011 2:32 PM >To: user@hbase.apache.org >Subject: RE: Performance characteristics of scans using timestamp as the >filter > >Hello, > >others have stated that one shouldn't try to use timestamps, although I >haven't figured out why. If it's reliability, meaning rows are >omitted even if they should be included in a timerange-based scan, then >this might be a good argument. ;-) > >One thing is that the timestamp AFAIK changes when you update a row even >if cell values didn't change. > >Regards, >Thomas > >-Original Message- >From: Stuti Awasthi [mailto:stutiawas...@hcl.com] >Sent: Monday, 10 October 2011 10:07 >To: user@hbase.apache.org >Subject: RE: Performance characteristics of scans using timestamp as the >filter > >Hi Saurabh, > >AFAIK you can also scan on the basis of a timestamp range. This can provide >you the data updated in that timestamp range. You do not need to keep the >timestamp in your row key. > >-Original Message- >From: saurabh@gmail.com [mailto:saurabh@gmail.com] On Behalf Of >Sam Seigal >Sent: Monday, October 10, 2011 1:20 PM >To: user@hbase.apache.org >Subject: Re: Performance characteristics of scans using timestamp as the >filter > >Is it possible to do incremental processing without putting the timestamp >in the leading part of the row key in a more efficient manner, i.e. >process data that came within the last hour / 2 hours etc.? I can't seem to >find a good answer to this question myself. > >On Mon, Oct 10, 2011 at 12:09 AM, Steinmaurer Thomas < >thomas.steinmau...@scch.at> wrote: > >> Leif, >> >> we are pretty much in the same boat with a custom timestamp at the end > >> of a three-part rowkey, so basically we end up with reading all data >> when processing daily batches. Besides performance aspects, have you >> seen that using internal timestamps for scans etc... works reliably? 
>> >> Or did you come up with another solution to your problem? >> >> Thanks, >> Thomas >> >> -Original Message- >> From: Leif Wickland [mailto:leifwickl...@gmail.com] >> Sent: Friday, 9 September 2011 20:33 >> To: user@hbase.apache.org >> Subject: Performance characteristics of scans using timestamp as the >> filter >> >> (Apologies if this has been answered before. I couldn't find anything > >> in the archives quite along these lines.) >> >> I have a process which writes to HBase as new data arrives. I'd like >> to run a map-reduce periodically, say daily, that takes the new items >as input. >> A naive approach would use a scan which grabs all of the rows that >> have a timestamp in a specified interval as the input to a MapReduce. >> I tested a scenario like that with 10s of GB of data and it seemed to >perform OK. >> Should I expect that approach to continue to perform reasonably >> well when I have TBs of data? >> >> From what I understand of the HBase architecture, I don't see a reason > >> that the scan approach would continue to perform well as the data >> grows. It seems like I may have to keep a log of modified keys and >> use that as the map-reduce input, instead. >> >> Thanks, >> >> Leif Wickland >>
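For reference, Doug's point that the time range filters *within* a row-bounded scan rather than replacing row bounds looks like this in the 0.90-era Java client. The table name, column family, row bounds, and time window below are made-up placeholders, and this is a sketch, not a recommended pattern:

```java
// Sketch of a bounded scan with a time range (HBase 0.90-era client API).
// Table/family names and the row/time bounds here are illustrative only.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.util.Bytes;

public class TimeRangeScanSketch {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        HTable table = new HTable(conf, "events");

        // Bound the scan by row keys first; the time range then filters
        // cells *within* that selection -- it does not replace row bounds.
        Scan scan = new Scan(Bytes.toBytes("row-aaa"), Bytes.toBytes("row-zzz"));
        long oneHourAgo = System.currentTimeMillis() - 3600 * 1000L;
        scan.setTimeRange(oneHourAgo, Long.MAX_VALUE);

        ResultScanner scanner = table.getScanner(scan);
        try {
            for (Result r : scanner) {
                System.out.println(Bytes.toString(r.getRow()));
            }
        } finally {
            scanner.close();
            table.close();
        }
    }
}
```

Without meaningful startRow/stopRow bounds this still walks every row in the table; the time range only prunes what comes back, which is why it is not a substitute for a time-leading key design or a log of modified keys.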
Re: hbase-regionserver1: bash: {HBASE_HOME}/bin/hbase-daemon.sh: No such file or directory
I found in the logs of region server machines, i found this error (on both regionserver machines) 2011-11-30 14:43:42,447 INFO org.apache.hadoop.ipc.HbaseRPC: Server at hbase-master/10.0.1.54:60020 could not be reached after 1 tries, giving up. *2011-11-30 14:44:37,762* WARN org.apache.hadoop.hbase.regionserver.HRegionServer: Unable to connect to master. Retrying. Error was: java.net.ConnectException: Connection refused at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:567) at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206) at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:404) at org.apache.hadoop.hbase.ipc.HBaseClient$Connection.setupIOstreams(HBaseClient.java:328) at org.apache.hadoop.hbase.ipc.HBaseClient.getConnection(HBaseClient.java:883) at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:750) at org.apache.hadoop.hbase.ipc.HBaseRPC$Invoker.invoke(HBaseRPC.java:257) at $Proxy5.getProtocolVersion(Unknown Source) at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:419) at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:393) at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:444) at org.apache.hadoop.hbase.ipc.HBaseRPC.waitForProxy(HBaseRPC.java:349) at org.apache.hadoop.hbase.regionserver.HRegionServer.getMaster(HRegionServer.java:1462) at org.apache.hadoop.hbase.regionserver.HRegionServer.reportForDuty(HRegionServer.java:1515) at org.apache.hadoop.hbase.regionserver.HRegionServer.tryReportForDuty(HRegionServer.java:1499) at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:572) at java.lang.Thread.run(Thread.java:662) 2011-11-30 14:44:40,768 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Attempting connect to Master server at hbase-master:6 *2011-11-30 14:45:40,847* WARN org.apache.hadoop.hbase.regionserver.HRegionServer: Unable to connect to master. Retrying. 
Error was: java.net.ConnectException: Connection refused at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:567) at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206) at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:404) at org.apache.hadoop.hbase.ipc.HBaseClient$Connection.setupIOstreams(HBaseClient.java:328) at org.apache.hadoop.hbase.ipc.HBaseClient.getConnection(HBaseClient.java:883) at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:750) at org.apache.hadoop.hbase.ipc.HBaseRPC$Invoker.invoke(HBaseRPC.java:257) at $Proxy5.getProtocolVersion(Unknown Source) at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:419) at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:393) at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:444) at org.apache.hadoop.hbase.ipc.HBaseRPC.waitForProxy(HBaseRPC.java:349) at org.apache.hadoop.hbase.regionserver.HRegionServer.getMaster(HRegionServer.java:1462) at org.apache.hadoop.hbase.regionserver.HRegionServer.reportForDuty(HRegionServer.java:1515) at org.apache.hadoop.hbase.regionserver.HRegionServer.tryReportForDuty(HRegionServer.java:1499) at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:572) at java.lang.Thread.run(Thread.java:662) and the same error is observed in the whole log repeatedly. After seeing it what i understand is that some how master started HRegionServer daemons on the machines but from then onwards the RegionServer machines are not able to communicate with master. If we observe it is trying to communicate with master for evry one minute. But i am not understanding where to check and modify the things.. please help. i feel all connections are OK. On Thu, Dec 1, 2011 at 12:28 AM, Jean-Daniel Cryans wrote: > stop-hbase.sh only tells the master to stop, which in turn will tell > the region servers to stop. 
If they are still running, it might be > because of an error. Look at their logs to figure out what's going on. > > J-D > > On Tue, Nov 29, 2011 at 10:46 PM, Vamshi Krishna > wrote: > > hey sorry for posting multiple times. > > J-D, As you said, I referred to my regionserver log, where I found > > Could not resolve the DNS name of vamshikrishna-desktop > > so I added an alias ' vamshikrishna-desktop ' to its corresponding IP > > address in /etc/hosts. So, from then on the master is able to run the HRegionServer > > daemon on the regionserver machines also. > > > > But the ONLY problem now is that when I stop hbase on my master node by running > > bin/stop-hbase.sh, all hbase daemons are stopping on the master node but NOT on > > the regionserver nodes. The HRegionServer daemon is still running on the other > > regionserver machines. > > I think the HRegionServer daemons on al
RE: Performance characteristics of scans using timestamp as the filter
So, will it be safe to assume that Scan queries with TimeRange will perform well and will read only necessary portions of the tables instead of doing a full table scan? I have run into a situation wherein I would like to find out all rows that got created/updated during a time range. I was hoping that I could do a time range scan. Regards, Srikanth -Original Message- From: Stuti Awasthi [mailto:stutiawas...@hcl.com] Sent: Monday, October 10, 2011 3:44 PM To: user@hbase.apache.org Subject: RE: Performance characteristics of scans using timestamp as the filter Yes, it's true. Your cluster time should be in sync for reliable functioning. -Original Message- From: Steinmaurer Thomas [mailto:thomas.steinmau...@scch.at] Sent: Monday, October 10, 2011 3:04 PM To: user@hbase.apache.org Subject: RE: Performance characteristics of scans using timestamp as the filter Isn't a synchronized time across all nodes a general requirement for running the cluster reliably? Regards, Thomas -Original Message- From: Stuti Awasthi [mailto:stutiawas...@hcl.com] Sent: Monday, 10 October 2011 11:18 To: user@hbase.apache.org Subject: RE: Performance characteristics of scans using timestamp as the filter Steinmaurer, I have done a little POC with a TimeRange scan and it worked fine for me. Another thing to note is that the time should be the same on all machines of your HBase cluster. -Original Message- From: Steinmaurer Thomas [mailto:thomas.steinmau...@scch.at] Sent: Monday, October 10, 2011 2:32 PM To: user@hbase.apache.org Subject: RE: Performance characteristics of scans using timestamp as the filter Hello, others have stated that one shouldn't try to use timestamps, although I haven't figured out why. If it's reliability, meaning rows are omitted even if they should be included in a timerange-based scan, then this might be a good argument. ;-) One thing is that the timestamp AFAIK changes when you update a row even if cell values didn't change. 
Regards, Thomas -Original Message- From: Stuti Awasthi [mailto:stutiawas...@hcl.com] Sent: Monday, 10 October 2011 10:07 To: user@hbase.apache.org Subject: RE: Performance characteristics of scans using timestamp as the filter Hi Saurabh, AFAIK you can also scan on the basis of a timestamp range. This can provide you the data updated in that timestamp range. You do not need to keep the timestamp in your row key. -Original Message- From: saurabh@gmail.com [mailto:saurabh@gmail.com] On Behalf Of Sam Seigal Sent: Monday, October 10, 2011 1:20 PM To: user@hbase.apache.org Subject: Re: Performance characteristics of scans using timestamp as the filter Is it possible to do incremental processing without putting the timestamp in the leading part of the row key in a more efficient manner, i.e. process data that came within the last hour / 2 hours etc.? I can't seem to find a good answer to this question myself. On Mon, Oct 10, 2011 at 12:09 AM, Steinmaurer Thomas < thomas.steinmau...@scch.at> wrote: > Leif, > > we are pretty much in the same boat with a custom timestamp at the end > of a three-part rowkey, so basically we end up with reading all data > when processing daily batches. Besides performance aspects, have you > seen that using internal timestamps for scans etc... works reliably? > > Or did you come up with another solution to your problem? > > Thanks, > Thomas > > -Original Message- > From: Leif Wickland [mailto:leifwickl...@gmail.com] > Sent: Friday, 9 September 2011 20:33 > To: user@hbase.apache.org > Subject: Performance characteristics of scans using timestamp as the > filter > > (Apologies if this has been answered before. I couldn't find anything > in the archives quite along these lines.) > > I have a process which writes to HBase as new data arrives. I'd like > to run a map-reduce periodically, say daily, that takes the new items as input. 
> A naive approach would use a scan which grabs all of the rows that > have a timestamp in a specified interval as the input to a MapReduce. > I tested a scenario like that with 10s of GB of data and it seemed to perform OK. > Should I expect that approach to continue to perform reasonably > well when I have TBs of data? > > From what I understand of the HBase architecture, I don't see a reason > that the scan approach would continue to perform well as the data > grows. It seems like I may have to keep a log of modified keys and > use that as the map-reduce input, instead. > > Thanks, > > Leif Wickland >
Re: regions and tables
To expand on what Lars said, there is an example of how this is laid out on disk... http://hbase.apache.org/book.html#trouble.namenode.disk ... regions distribute the table, so two different tables will be distributed by separate sets of regions. On 12/1/11 3:14 AM, "Lars George" wrote: >Hi Sam, > >You need to handle them all separately. The note - I assume - was solely >explaining the fact that the "load" of a region server is defined by the >number of regions it hosts, not the number of tables. Precreating the >regions for one table or for more than one table is the same work: >create the tables (one by one) with the list of split points. > >Lars > >On Dec 1, 2011, at 7:50 AM, Sam Seigal wrote: > >> Hi, >> >> I had a question about the relationship between regions and tables. >> >> Is there a way to pre-create regions for multiple tables? Or does each >> table have its own set of regions managed independently? >> >> I read on one of the threads that there is really no limit on the >> number of tables, but that what we need to be careful about is the number >> of regions. Does this mean that the regions can be pre-created for >> multiple tables? >> >> Thank you, >> >> Sam > >
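A sketch of what "create the tables (one by one) with the list of split points" looks like with the 0.90-era client API; the table and family names and the split keys below are illustrative only:

```java
// Sketch: pre-creating regions for each table by supplying split keys
// (HBase 0.90-era API; names and split points are illustrative).
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.client.HBaseAdmin;
import org.apache.hadoop.hbase.util.Bytes;

public class PresplitSketch {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        HBaseAdmin admin = new HBaseAdmin(conf);

        byte[][] splits = {
            Bytes.toBytes("1000"), Bytes.toBytes("2000"), Bytes.toBytes("3000")
        };

        // Each table gets its own set of regions; repeat per table.
        for (String name : new String[] { "table_a", "table_b" }) {
            HTableDescriptor desc = new HTableDescriptor(name);
            desc.addFamily(new HColumnDescriptor("f"));
            admin.createTable(desc, splits); // N split keys -> N+1 regions
        }
    }
}
```

Each createTable call with N split keys yields N+1 regions for that table only; regions are never shared between tables.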
Re: Unable to create version file
Could you please pastebin your Hadoop, HBase and ZooKeeper config files? Lars On Dec 1, 2011, at 11:23 AM, Mohammad Tariq wrote: > Today when I issued bin/start-hbase.sh I ran into the following error - > > Thu Dec 1 15:47:30 IST 2011 Starting master on ubuntu > ulimit -n 1024 > 2011-12-01 15:47:31,158 INFO > org.apache.zookeeper.server.ZooKeeperServer: Server > environment:zookeeper.version=3.3.2-1031432, built on 11/05/2010 05:32 > GMT > 2011-12-01 15:47:31,158 INFO > org.apache.zookeeper.server.ZooKeeperServer: Server > environment:host.name=ubuntu.ubuntu-domain > 2011-12-01 15:47:31,158 INFO > org.apache.zookeeper.server.ZooKeeperServer: Server > environment:java.version=1.6.0_26 > 2011-12-01 15:47:31,158 INFO > org.apache.zookeeper.server.ZooKeeperServer: Server > environment:java.vendor=Sun Microsystems Inc. > 2011-12-01 15:47:31,158 INFO > org.apache.zookeeper.server.ZooKeeperServer: Server > environment:java.home=/usr/lib/jvm/java-6-sun-1.6.0.26/jre > 2011-12-01 15:47:31,158 INFO > org.apache.zookeeper.server.ZooKeeperServer: Server > 
environment:java.class.path=/home/solr/hbase-0.90.4/bin/../conf:/usr/lib/jvm/java-6-sun/lib/tools.jar:/home/solr/hbase-0.90.4/bin/..:/home/solr/hbase-0.90.4/bin/../hbase-0.90.4.jar:/home/solr/hbase-0.90.4/bin/../hbase-0.90.4-tests.jar:/home/solr/hbase-0.90.4/bin/../lib/activation-1.1.jar:/home/solr/hbase-0.90.4/bin/../lib/asm-3.1.jar:/home/solr/hbase-0.90.4/bin/../lib/avro-1.3.3.jar:/home/solr/hbase-0.90.4/bin/../lib/commons-cli-1.2.jar:/home/solr/hbase-0.90.4/bin/../lib/commons-codec-1.4.jar:/home/solr/hbase-0.90.4/bin/../lib/commons-el-1.0.jar:/home/solr/hbase-0.90.4/bin/../lib/commons-httpclient-3.1.jar:/home/solr/hbase-0.90.4/bin/../lib/commons-lang-2.5.jar:/home/solr/hbase-0.90.4/bin/../lib/commons-logging-1.1.1.jar:/home/solr/hbase-0.90.4/bin/../lib/commons-net-1.4.1.jar:/home/solr/hbase-0.90.4/bin/../lib/core-3.1.1.jar:/home/solr/hbase-0.90.4/bin/../lib/guava-r06.jar:/home/solr/hbase-0.90.4/bin/../lib/hadoop-core-0.20-append-r1056497.jar:/home/solr/hbase-0.90.4/bin/../lib/jackson-core-asl-1.5.5.jar:/home/solr/hbase-0.90.4/bin/../lib/jackson-jaxrs-1.5.5.jar:/home/solr/hbase-0.90.4/bin/../lib/jackson-mapper-asl-1.4.2.jar:/home/solr/hbase-0.90.4/bin/../lib/jackson-xc-1.5.5.jar:/home/solr/hbase-0.90.4/bin/../lib/jasper-compiler-5.5.23.jar:/home/solr/hbase-0.90.4/bin/../lib/jasper-runtime-5.5.23.jar:/home/solr/hbase-0.90.4/bin/../lib/jaxb-api-2.1.jar:/home/solr/hbase-0.90.4/bin/../lib/jaxb-impl-2.1.12.jar:/home/solr/hbase-0.90.4/bin/../lib/jersey-core-1.4.jar:/home/solr/hbase-0.90.4/bin/../lib/jersey-json-1.4.jar:/home/solr/hbase-0.90.4/bin/../lib/jersey-server-1.4.jar:/home/solr/hbase-0.90.4/bin/../lib/jettison-1.1.jar:/home/solr/hbase-0.90.4/bin/../lib/jetty-6.1.26.jar:/home/solr/hbase-0.90.4/bin/../lib/jetty-util-6.1.26.jar:/home/solr/hbase-0.90.4/bin/../lib/jruby-complete-1.6.0.jar:/home/solr/hbase-0.90.4/bin/../lib/jsp-2.1-6.1.14.jar:/home/solr/hbase-0.90.4/bin/../lib/jsp-api-2.1-6.1.14.jar:/home/solr/hbase-0.90.4/bin/../lib/jsr311-api-1.1.1.jar:/home/solr/hb
ase-0.90.4/bin/../lib/log4j-1.2.16.jar:/home/solr/hbase-0.90.4/bin/../lib/protobuf-java-2.3.0.jar:/home/solr/hbase-0.90.4/bin/../lib/servlet-api-2.5-6.1.14.jar:/home/solr/hbase-0.90.4/bin/../lib/slf4j-api-1.5.8.jar:/home/solr/hbase-0.90.4/bin/../lib/slf4j-log4j12-1.5.8.jar:/home/solr/hbase-0.90.4/bin/../lib/stax-api-1.0.1.jar:/home/solr/hbase-0.90.4/bin/../lib/thrift-0.2.0.jar:/home/solr/hbase-0.90.4/bin/../lib/xmlenc-0.52.jar:/home/solr/hbase-0.90.4/bin/../lib/zookeeper-3.3.2.jar > 2011-12-01 15:47:31,158 INFO > org.apache.zookeeper.server.ZooKeeperServer: Server > environment:java.library.path=/usr/lib/jvm/java-6-sun-1.6.0.26/jre/lib/amd64/server:/usr/lib/jvm/java-6-sun-1.6.0.26/jre/lib/amd64:/usr/lib/jvm/java-6-sun-1.6.0.26/jre/../lib/amd64:/usr/java/packages/lib/amd64:/usr/lib64:/lib64:/lib:/usr/lib > 2011-12-01 15:47:31,158 INFO > org.apache.zookeeper.server.ZooKeeperServer: Server > environment:java.io.tmpdir=/tmp > 2011-12-01 15:47:31,158 INFO > org.apache.zookeeper.server.ZooKeeperServer: Server > environment:java.compiler= > 2011-12-01 15:47:31,158 INFO > org.apache.zookeeper.server.ZooKeeperServer: Server > environment:os.name=Linux > 2011-12-01 15:47:31,158 INFO > org.apache.zookeeper.server.ZooKeeperServer: Server > environment:os.arch=amd64 > 2011-12-01 15:47:31,158 INFO > org.apache.zookeeper.server.ZooKeeperServer: Server > environment:os.version=3.0.0-13-generic > 2011-12-01 15:47:31,159 INFO > org.apache.zookeeper.server.ZooKeeperServer: Server > environment:user.name=solr > 2011-12-01 15:47:31,159 INFO > org.apache.zookeeper.server.ZooKeeperServer: Server > environment:user.home=/home/solr > 2011-12-01 15:47:31,159 INFO > org.apache.zookeeper.server.ZooKeeperServer: Server > environment:user.dir=/home/solr/hbase-0.90.4 > 2011-12-01 15:47:31,169 INFO > org.apache.zookeeper.server.ZooKeeperServer: Created server with > tickTime 2000 minSessionTimeout 4000 maxSessionTimeout 4 datadi
Re: Problem in configuring pseudo distributed mode
Your hosts file (/etc/hosts) should contain only something like 127.0.0.1 localhost Or 127.0.0.1 It should not contain something like 127.0.1.1 localhost. And I think you need to reboot after changing it. Hope that helps. Regards, Christopher On 01.12.2011 13:24, "Mohammad Tariq" wrote: > Hello list, > >Even after following the directions provided by you guys, the HBase > book, and several other blogs and posts, I am not able to run HBase in > pseudo-distributed mode. And I think there is some problem with the > hosts file. I would highly appreciate it if someone who has done it > properly could share his/her hosts and hbase-site.xml files? > > Regards, > Mohammad Tariq >
Problem in configuring pseudo distributed mode
Hello list, Even after following the directions provided by you guys, the HBase book, and several other blogs and posts, I am not able to run HBase in pseudo-distributed mode. And I think there is some problem with the hosts file. I would highly appreciate it if someone who has done it properly could share his/her hosts and hbase-site.xml files? Regards, Mohammad Tariq
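For what it's worth, a minimal working pair often looks like the following. The hostname `ubuntu` and the HDFS port 9000 are assumptions that must match your actual hostname and fs.default.name. /etc/hosts:

```
127.0.0.1   localhost
127.0.0.1   ubuntu
```

(no 127.0.1.1 line), and a pseudo-distributed hbase-site.xml:

```xml
<configuration>
  <property>
    <name>hbase.rootdir</name>
    <!-- must match fs.default.name in Hadoop's core-site.xml;
         the host/port here are assumptions -->
    <value>hdfs://localhost:9000/hbase</value>
  </property>
  <property>
    <name>hbase.cluster.distributed</name>
    <value>true</value>
  </property>
</configuration>
```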
Unable to create version file
Today when I issued bin/start-hbase.sh I ran into the following error -

Thu Dec 1 15:47:30 IST 2011 Starting master on ubuntu
ulimit -n 1024
2011-12-01 15:47:31,158 INFO org.apache.zookeeper.server.ZooKeeperServer: Server environment:zookeeper.version=3.3.2-1031432, built on 11/05/2010 05:32 GMT
2011-12-01 15:47:31,158 INFO org.apache.zookeeper.server.ZooKeeperServer: Server environment:host.name=ubuntu.ubuntu-domain
2011-12-01 15:47:31,158 INFO org.apache.zookeeper.server.ZooKeeperServer: Server environment:java.version=1.6.0_26
2011-12-01 15:47:31,158 INFO org.apache.zookeeper.server.ZooKeeperServer: Server environment:java.vendor=Sun Microsystems Inc.
2011-12-01 15:47:31,158 INFO org.apache.zookeeper.server.ZooKeeperServer: Server environment:java.home=/usr/lib/jvm/java-6-sun-1.6.0.26/jre
2011-12-01 15:47:31,158 INFO org.apache.zookeeper.server.ZooKeeperServer: Server environment:java.class.path=/home/solr/hbase-0.90.4/bin/../conf:/usr/lib/jvm/java-6-sun/lib/tools.jar:/home/solr/hbase-0.90.4/bin/..:/home/solr/hbase-0.90.4/bin/../hbase-0.90.4.jar:/home/solr/hbase-0.90.4/bin/../hbase-0.90.4-tests.jar:/home/solr/hbase-0.90.4/bin/../lib/activation-1.1.jar:/home/solr/hbase-0.90.4/bin/../lib/asm-3.1.jar:/home/solr/hbase-0.90.4/bin/../lib/avro-1.3.3.jar:/home/solr/hbase-0.90.4/bin/../lib/commons-cli-1.2.jar:/home/solr/hbase-0.90.4/bin/../lib/commons-codec-1.4.jar:/home/solr/hbase-0.90.4/bin/../lib/commons-el-1.0.jar:/home/solr/hbase-0.90.4/bin/../lib/commons-httpclient-3.1.jar:/home/solr/hbase-0.90.4/bin/../lib/commons-lang-2.5.jar:/home/solr/hbase-0.90.4/bin/../lib/commons-logging-1.1.1.jar:/home/solr/hbase-0.90.4/bin/../lib/commons-net-1.4.1.jar:/home/solr/hbase-0.90.4/bin/../lib/core-3.1.1.jar:/home/solr/hbase-0.90.4/bin/../lib/guava-r06.jar:/home/solr/hbase-0.90.4/bin/../lib/hadoop-core-0.20-append-r1056497.jar:/home/solr/hbase-0.90.4/bin/../lib/jackson-core-asl-1.5.5.jar:/home/solr/hbase-0.90.4/bin/../lib/jackson-jaxrs-1.5.5.jar:/home/solr/hbase-0.90.4/bin/../lib/jackson-mapper-asl-1.4.2.jar:/home/solr/hbase-0.90.4/bin/../lib/jackson-xc-1.5.5.jar:/home/solr/hbase-0.90.4/bin/../lib/jasper-compiler-5.5.23.jar:/home/solr/hbase-0.90.4/bin/../lib/jasper-runtime-5.5.23.jar:/home/solr/hbase-0.90.4/bin/../lib/jaxb-api-2.1.jar:/home/solr/hbase-0.90.4/bin/../lib/jaxb-impl-2.1.12.jar:/home/solr/hbase-0.90.4/bin/../lib/jersey-core-1.4.jar:/home/solr/hbase-0.90.4/bin/../lib/jersey-json-1.4.jar:/home/solr/hbase-0.90.4/bin/../lib/jersey-server-1.4.jar:/home/solr/hbase-0.90.4/bin/../lib/jettison-1.1.jar:/home/solr/hbase-0.90.4/bin/../lib/jetty-6.1.26.jar:/home/solr/hbase-0.90.4/bin/../lib/jetty-util-6.1.26.jar:/home/solr/hbase-0.90.4/bin/../lib/jruby-complete-1.6.0.jar:/home/solr/hbase-0.90.4/bin/../lib/jsp-2.1-6.1.14.jar:/home/solr/hbase-0.90.4/bin/../lib/jsp-api-2.1-6.1.14.jar:/home/solr/hbase-0.90.4/bin/../lib/jsr311-api-1.1.1.jar:/home/solr/hbase-0.90.4/bin/../lib/log4j-1.2.16.jar:/home/solr/hbase-0.90.4/bin/../lib/protobuf-java-2.3.0.jar:/home/solr/hbase-0.90.4/bin/../lib/servlet-api-2.5-6.1.14.jar:/home/solr/hbase-0.90.4/bin/../lib/slf4j-api-1.5.8.jar:/home/solr/hbase-0.90.4/bin/../lib/slf4j-log4j12-1.5.8.jar:/home/solr/hbase-0.90.4/bin/../lib/stax-api-1.0.1.jar:/home/solr/hbase-0.90.4/bin/../lib/thrift-0.2.0.jar:/home/solr/hbase-0.90.4/bin/../lib/xmlenc-0.52.jar:/home/solr/hbase-0.90.4/bin/../lib/zookeeper-3.3.2.jar
2011-12-01 15:47:31,158 INFO org.apache.zookeeper.server.ZooKeeperServer: Server environment:java.library.path=/usr/lib/jvm/java-6-sun-1.6.0.26/jre/lib/amd64/server:/usr/lib/jvm/java-6-sun-1.6.0.26/jre/lib/amd64:/usr/lib/jvm/java-6-sun-1.6.0.26/jre/../lib/amd64:/usr/java/packages/lib/amd64:/usr/lib64:/lib64:/lib:/usr/lib
2011-12-01 15:47:31,158 INFO org.apache.zookeeper.server.ZooKeeperServer: Server environment:java.io.tmpdir=/tmp
2011-12-01 15:47:31,158 INFO org.apache.zookeeper.server.ZooKeeperServer: Server environment:java.compiler=
2011-12-01 15:47:31,158 INFO org.apache.zookeeper.server.ZooKeeperServer: Server environment:os.name=Linux
2011-12-01 15:47:31,158 INFO org.apache.zookeeper.server.ZooKeeperServer: Server environment:os.arch=amd64
2011-12-01 15:47:31,158 INFO org.apache.zookeeper.server.ZooKeeperServer: Server environment:os.version=3.0.0-13-generic
2011-12-01 15:47:31,159 INFO org.apache.zookeeper.server.ZooKeeperServer: Server environment:user.name=solr
2011-12-01 15:47:31,159 INFO org.apache.zookeeper.server.ZooKeeperServer: Server environment:user.home=/home/solr
2011-12-01 15:47:31,159 INFO org.apache.zookeeper.server.ZooKeeperServer: Server environment:user.dir=/home/solr/hbase-0.90.4
2011-12-01 15:47:31,169 INFO org.apache.zookeeper.server.ZooKeeperServer: Created server with tickTime 2000 minSessionTimeout 4000 maxSessionTimeout 4 datadir /tmp/hbase-solr/zookeeper/zookeeper/version-2 snapdir /tmp/hbase-solr/zookeeper/zookeeper/version-2
2011-12-01 15:47:31,185 INFO org.apache.zookeeper.server.NIOServerCnxn: binding to port 0.0.0.0/0.0.0.0:2181
2011-12-01 15:47:31,189 INFO
Re: Constant error when putting large data into HBase
Hi Ed,

I am afraid you need to be more precise. First of all, what does "some node always dies" mean? Is the process gone? Which process is gone? And the "error" you pasted is a WARN-level log that *might* indicate some trouble, but is *not* the reason the "node has died". Please elaborate.

Also consider posting the last few hundred lines of the process logs to pastebin so that someone can look at them.

Thanks,
Lars

On Dec 1, 2011, at 9:48 AM, edward choi wrote:

> Hi,
> I've had a problem that has been killing me for some days now.
> I am using the CDH3 update 2 version of Hadoop and HBase.
> When I do a large amount of bulk loading into HBase, some node always dies.
> It's not just one particular node; one of the many nodes eventually fails to serve.
>
> I set 4 GB of heap space for the master and the regionservers. I monitored the
> processes, and when any node fails, it has not used all of its heap yet.
> So it is not a heap space problem.
>
> Below is what I get when I perform a bulk put using MapReduce.
>
> 11/12/01 17:17:20 INFO mapred.JobClient: map 100% reduce 100%
> 11/12/01 17:18:31 INFO mapred.JobClient: Task Id : attempt_20302113_0034_r_13_0, Status : FAILED
> org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: Failed 1 action: servers with issues: lp171.etri.re.kr:60020,
>         at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatch(HConnectionManager.java:1239)
>         at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatchOfPuts(HConnectionManager.java:1253)
>         at org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:828)
>         at org.apache.hadoop.hbase.client.HTable.doPut(HTable.java:684)
>         at org.apache.hadoop.hbase.client.HTable.put(HTable.java:669)
>         at org.apache.hadoop.hbase.mapreduce.TableOutputFormat$TableRecordWriter.write(TableOutputFormat.java:127)
>         at org.apache.hadoop.hbase.mapreduce.TableOutputFormat$TableRecordWriter.write(TableOutputFormat.java:82)
>         at org.apache.hadoop.mapred.ReduceTask$NewTrackingRecordWriter.write(ReduceTask.java:514)
>         at org.apache.hadoop.mapreduce.TaskInputOutputContext.write(TaskInputOutputContext.java:80)
>         at etri.qa.mapreduce.PostProcess$PostPro
> attempt_20302113_0034_r_13_0: 2022
> 11/12/01 17:18:36 INFO mapred.JobClient: Task Id : attempt_20302113_0034_r_13_1, Status : FAILED
> org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: Failed 1 action: servers with issues: lp171.etri.re.kr:60020,
>         at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatch(HConnectionManager.java:1239)
>         at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatchOfPuts(HConnectionManager.java:1253)
>         at org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:828)
>         at org.apache.hadoop.hbase.client.HTable.doPut(HTable.java:684)
>         at org.apache.hadoop.hbase.client.HTable.put(HTable.java:669)
>         at org.apache.hadoop.hbase.mapreduce.TableOutputFormat$TableRecordWriter.write(TableOutputFormat.java:127)
>         at org.apache.hadoop.hbase.mapreduce.TableOutputFormat$TableRecordWriter.write(TableOutputFormat.java:82)
>         at org.apache.hadoop.mapred.ReduceTask$NewTrackingRecordWriter.write(ReduceTask.java:514)
>         at org.apache.hadoop.mapreduce.TaskInputOutputContext.write(TaskInputOutputContext.java:80)
>         at etri.qa.mapreduce.PostProcess$PostPro
> attempt_20302113_0034_r_13_1: 2022
> 11/12/01 17:18:37 INFO mapred.JobClient: map 100% reduce 95%
> 11/12/01 17:18:44 INFO mapred.JobClient: map 100% reduce 96%
> 11/12/01 17:18:47 INFO mapred.JobClient: map 100% reduce 98%
> 11/12/01 17:18:50 INFO mapred.JobClient: map 100% reduce 99%
> 11/12/01 17:18:53 INFO mapred.JobClient: map 100% reduce 100%
> 11/12/01 17:20:07 INFO mapred.JobClient: Task Id : attempt_20302113_0034_r_13_3, Status : FAILED
> org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: Failed 1 action: servers with issues: lp171.etri.re.kr:60020,
>         at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatch(HConnectionManager.java:1239)
>         at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatchOfPuts(HConnectionManager.java:1253)
>         at org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:828)
>         at org.apache.hadoop.hbase.client.HTable.doPut(HTable.java:684)
>         at org.apache.hadoop.hbase.client.HTable.put(HTable.java:669)
>
Constant error when putting large data into HBase
Hi,

I've had a problem that has been killing me for some days now. I am using the CDH3 update 2 version of Hadoop and HBase. When I do a large amount of bulk loading into HBase, some node always dies. It's not just one particular node; one of the many nodes eventually fails to serve.

I set 4 GB of heap space for the master and the regionservers. I monitored the processes, and when any node fails, it has not used all of its heap yet. So it is not a heap space problem.

Below is what I get when I perform a bulk put using MapReduce.

11/12/01 17:17:20 INFO mapred.JobClient: map 100% reduce 100%
11/12/01 17:18:31 INFO mapred.JobClient: Task Id : attempt_20302113_0034_r_13_0, Status : FAILED
org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: Failed 1 action: servers with issues: lp171.etri.re.kr:60020,
        at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatch(HConnectionManager.java:1239)
        at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatchOfPuts(HConnectionManager.java:1253)
        at org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:828)
        at org.apache.hadoop.hbase.client.HTable.doPut(HTable.java:684)
        at org.apache.hadoop.hbase.client.HTable.put(HTable.java:669)
        at org.apache.hadoop.hbase.mapreduce.TableOutputFormat$TableRecordWriter.write(TableOutputFormat.java:127)
        at org.apache.hadoop.hbase.mapreduce.TableOutputFormat$TableRecordWriter.write(TableOutputFormat.java:82)
        at org.apache.hadoop.mapred.ReduceTask$NewTrackingRecordWriter.write(ReduceTask.java:514)
        at org.apache.hadoop.mapreduce.TaskInputOutputContext.write(TaskInputOutputContext.java:80)
        at etri.qa.mapreduce.PostProcess$PostPro
attempt_20302113_0034_r_13_0: 2022
11/12/01 17:18:36 INFO mapred.JobClient: Task Id : attempt_20302113_0034_r_13_1, Status : FAILED
org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: Failed 1 action: servers with issues: lp171.etri.re.kr:60020,
        at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatch(HConnectionManager.java:1239)
        at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatchOfPuts(HConnectionManager.java:1253)
        at org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:828)
        at org.apache.hadoop.hbase.client.HTable.doPut(HTable.java:684)
        at org.apache.hadoop.hbase.client.HTable.put(HTable.java:669)
        at org.apache.hadoop.hbase.mapreduce.TableOutputFormat$TableRecordWriter.write(TableOutputFormat.java:127)
        at org.apache.hadoop.hbase.mapreduce.TableOutputFormat$TableRecordWriter.write(TableOutputFormat.java:82)
        at org.apache.hadoop.mapred.ReduceTask$NewTrackingRecordWriter.write(ReduceTask.java:514)
        at org.apache.hadoop.mapreduce.TaskInputOutputContext.write(TaskInputOutputContext.java:80)
        at etri.qa.mapreduce.PostProcess$PostPro
attempt_20302113_0034_r_13_1: 2022
11/12/01 17:18:37 INFO mapred.JobClient: map 100% reduce 95%
11/12/01 17:18:44 INFO mapred.JobClient: map 100% reduce 96%
11/12/01 17:18:47 INFO mapred.JobClient: map 100% reduce 98%
11/12/01 17:18:50 INFO mapred.JobClient: map 100% reduce 99%
11/12/01 17:18:53 INFO mapred.JobClient: map 100% reduce 100%
11/12/01 17:20:07 INFO mapred.JobClient: Task Id : attempt_20302113_0034_r_13_3, Status : FAILED
org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: Failed 1 action: servers with issues: lp171.etri.re.kr:60020,
        at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatch(HConnectionManager.java:1239)
        at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatchOfPuts(HConnectionManager.java:1253)
        at org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:828)
        at org.apache.hadoop.hbase.client.HTable.doPut(HTable.java:684)
        at org.apache.hadoop.hbase.client.HTable.put(HTable.java:669)
        at org.apache.hadoop.hbase.mapreduce.TableOutputFormat$TableRecordWriter.write(TableOutputFormat.java:127)
        at org.apache.hadoop.hbase.mapreduce.TableOutputFormat$TableRecordWriter.write(TableOutputFormat.java:82)
        at org.apache.hadoop.mapred.ReduceTask$NewTrackingRecordWriter.write(ReduceTask.java:514)
        at org.apache.hadoop.mapreduce.TaskInputOutputContext.write(TaskInputOutputContext.java:80)
        at etri.qa.mapreduce.PostProcess$PostPro
attempt_20302113_0034_r_13_3: 2022
11/12/01 17:20:09 INFO mapred.JobClient: map 100% reduce 95%
11/12/01 17:20:09 IN
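[Editor's note, not part of the original thread] For context on the exception above: before throwing RetriesExhaustedWithDetailsException, the HBase client silently retries the failed batch against the regionserver with increasing pauses, so the task appears to hang for a long time first. The sketch below models that schedule; the 1 s base pause and the multiplier table are assumptions patterned on typical client defaults (see hbase.client.pause and hbase.client.retries.number in your configuration), not the exact 0.90 implementation.

```java
// Sketch of the client-side retry/backoff that precedes
// RetriesExhaustedWithDetailsException. The multipliers and the 1000 ms
// base pause are illustrative assumptions, not copied from HBase source.
public class RetrySchedule {
    // Hypothetical backoff table: retry i sleeps pauseMs * BACKOFF[i].
    static final int[] BACKOFF = {1, 1, 1, 2, 2, 4, 4, 8, 16, 32};

    // Total milliseconds spent sleeping across all retries before the
    // client gives up and surfaces the exception to the reduce task.
    static long totalPauseMs(long pauseMs, int retries) {
        long total = 0;
        for (int i = 0; i < retries && i < BACKOFF.length; i++) {
            total += pauseMs * BACKOFF[i];
        }
        return total;
    }

    public static void main(String[] args) {
        // Under these assumed defaults the client sleeps ~71 s in total,
        // which matches the minute-plus gaps between job-client log lines.
        System.out.println(totalPauseMs(1000, 10)); // prints 71000
    }
}
```

This is why the failures show up minutes after "reduce 100%": the tasks are stuck in the retry loop, not doing work, until the retry budget is exhausted.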
Re: regions and tables
Hi Sam,

You need to handle them all separately. The note - I assume - was solely explaining the fact that the "load" of a region server is defined by the number of regions it hosts, not the number of tables. Precreating regions for one table or for several tables is the same amount of work: create the tables (one by one) with the list of split points.

Lars

On Dec 1, 2011, at 7:50 AM, Sam Seigal wrote:

> Hi,
>
> I had a question about the relationship between regions and tables.
>
> Is there a way to pre-create regions for multiple tables, or does each
> table have its own set of regions managed independently?
>
> I read on one of the threads that there is really no limit on the
> number of tables, but that what we need to be careful about is the
> number of regions. Does this mean that regions can be pre-created for
> multiple tables?
>
> Thank you,
>
> Sam
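[Editor's note, not part of the original thread] A minimal sketch of what Lars describes: compute a list of split points yourself, then create each table with that list. The two-digit decimal key prefixes here are purely an illustrative assumption; use whatever matches your actual row-key layout. The commented HBaseAdmin call is a sketch of the 0.90-era API and needs a live cluster.

```java
import java.util.Arrays;

// Precreating regions: one table created with N-1 split points starts
// with N regions instead of one, spreading load from the first write.
public class PreSplit {
    // Evenly spaced split keys "01", "02", ... for numRegions regions.
    // Assumes row keys begin with a two-digit decimal prefix (hypothetical).
    static byte[][] splitKeys(int numRegions) {
        byte[][] splits = new byte[numRegions - 1][];
        for (int i = 1; i < numRegions; i++) {
            splits[i - 1] = String.format("%02d", i).getBytes();
        }
        return splits;
    }

    public static void main(String[] args) {
        // 10 regions need 9 split points: "01" through "09".
        byte[][] splits = splitKeys(10);
        System.out.println(splits.length);           // prints 9
        System.out.println(new String(splits[0]));   // prints 01
        // Against a running cluster you would then, per table (sketch):
        //   HBaseAdmin admin = new HBaseAdmin(conf);
        //   admin.createTable(new HTableDescriptor("mytable"), splits);
        // and repeat for each table that should be precreated.
    }
}
```

Each table carries its own regions, so "precreating for multiple tables" is simply running this creation step once per table, possibly with different split lists.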