Re: Where to place logs

2012-12-28 Thread Varun Sharma
Thanks for the tip. I was just wondering what people have done in the past
- do people typically reserve a separate disk for logging activity ?

Thanks
Varun

On Wed, Dec 26, 2012 at 1:13 PM, Stack st...@duboce.net wrote:

 On Mon, Dec 24, 2012 at 9:27 AM, Varun Sharma va...@pinterest.com wrote:

  Hi,
 
  I am wondering where people usually place hbase + hadoop logs. I have 4
  disks and 1 very tiny disk with barely 500 megs (thats the typical setup
 on
  amazon ec2). The 4 disks shall be used for hbase data. Since 500M is too
  small, should I place logs on one of the 4 disks. Could it potentially
  steal IOP(s) from hbase ? Does anyone have an idea how much of an
 overhead
  logging really is ?
 

 Looks like you have no choice but to put your logging on a data disk.
  Logging can be expensive, yes.  It should be easy enough to measure in
 your setup.  You can disable all logging for a short period while the
 cluster is under load; do it on a single node even (you can do this from
 the UI).  See how your io changes when logging is off.

 St.Ack



Re: how can thrift connect to hbase?

2012-12-28 Thread 周梦想
Hi hua,
ZooKeeper is used by HBase for two main purposes: one is managing the
state of every region server, the other is tracking the location of the
-ROOT- table, which is updated by the HMaster. So most HBase operations
keep in touch with ZooKeeper, and the thrift server is no exception.
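In other words, the thrift server embeds the same Java client as any other
HBase application: it is only told where the ZooKeeper quorum lives, and it
discovers -ROOT-/.META. and the region server locations (the hadoop1/hadoop3
entries in your log) from there. A minimal sketch of that bootstrap, with
hypothetical quorum hosts and row key:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.util.Bytes;

public class ZkBootstrapSketch {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    // The client only needs the ZooKeeper quorum; the hosts here are assumptions.
    conf.set("hbase.zookeeper.quorum", "hadoop1,hadoop2,hadoop3");
    conf.set("hbase.zookeeper.property.clientPort", "2181");

    // Region locations are looked up through ZooKeeper and .META., then cached.
    HTable table = new HTable(conf, "BT_NET_LOG_000");
    Result r = table.get(new Get(Bytes.toBytes("some-row"))); // hypothetical key
    System.out.println(r);
    table.close();
  }
}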



2012/12/27 hua beatls bea...@gmail.com

 Hi Andy,
  I have a question about thrift: does the thrift server connect to hbase
 through zookeeper?
  I read the thrift log and found that thrift is assigning requests to
 different regionservers.
  below is the log:
  2012-12-27 15:39:27,924 DEBUG

 org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation:
 Cached location for BT_NET_LOG_000,189094114602012122620
 20010731857,1356593966233.d20458ecf526a932f602af63002b290e. is
 hadoop1:60020
 2012-12-27 15:39:27,924 DEBUG

 org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation:
 Cached location for BT_NET_LOG_000,189098696122012122921
 00010731857,1356593966233.c1f127dc9de117605ea332f27b0b3775. is
 hadoop1:60020
 2012-12-27 15:39:28,563 DEBUG org.apache.hadoop.hbase.client.MetaScanner:
 Scanning .META. starting at
 row=BT_NET_LOG_000,18909411460201212262020010731857,0
 0 for max=10 rows using

 org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@52313a4e
 2012-12-27 15:45:14,850 DEBUG

 org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation:
 Removed BT_NET_LOG_000,1539100201212011513400100
 272,1356589842295.5c84298f6889734514903fffc9582689. for
 tableName=BT_NET_LOG_000 from cache because of
 1890020879120121201010731857
 2012-12-27 15:45:14,853 DEBUG

 org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation:
 Retry 1, sleep for 1000ms!
 2012-12-27 15:45:15,379 DEBUG org.apache.hadoop.hbase.client.MetaScanner:
 Scanning .META. starting at
 row=BT_NET_LOG_000,1539100201212011513400100272,0
 0 for max=10 rows using

 org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@52313a4e
 2012-12-27 15:45:15,384 DEBUG

 org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation:
 locateRegionInMeta parentTable=.META., metaLocation={region=
 .META.,,1.1028785192, hostname=hadoop1, port=60020}, attempt=0 of 10
 failed; retrying after sleep of 1000 because: the only available region for
 the required row is a
 split parent, the daughters should be online soon:

 BT_NET_LOG_000,1539100201212011513400100272,1356589842295.5c84298f6889734514903fffc9582689.
 2012-12-27 15:45:15,856 DEBUG org.apache.hadoop.hbase.client.MetaScanner:
 Scanning .META. starting at
 row=BT_NET_LOG_000,18900196224201212232220010600559,0
 0 for max=10 rows using

 org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@52313a4e
 2012-12-27 15:45:15,859 DEBUG

 org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation:
 Cached location for BT_NET_LOG_000,189001962242012122322
 20010600559,1356594314689.23d2fce6f5b6912e39eb7bdf22a069b3. is
 hadoop3:60020
 2012-12-27 15:45:16,387 DEBUG org.apache.hadoop.hbase.client.MetaScanner:
 Scanning .META. starting at
 row=BT_NET_LOG_000,18900196224201212232220010600559,0
 0 for max=10 rows using

 org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@52313a4e
 2012-12-27 15:45:17,638 DEBUG org.apache.hadoop.hbase.client.MetaScanner:
 Scanning .META. starting at
 row=BT_NET_LOG_000,1539100201212011513400100272,0
 0 for max=10 rows using

 org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@52313a4e
 2012-12-27 15:45:17,642 DEBUG

 org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation:
 Cached location for BT_NET_LOG_000,15391002012120115
 13400100272,1356594314689.91a440fc38a1d4d9b0afbf0ebf33e7a8. is
 hadoop3:60020

 It seems thrift assigns requests to hadoop1 and hadoop3, but I still have a
 hadoop2 which does not seem to be included in this rotation.
 Any suggestion?


Maybe your region data is just assigned to hadoop1 and hadoop3? You could
check hadoop2's state, then add more data and load to test.
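A quick way to confirm the distribution is to ask the master for the
per-server region counts (the same information the master web UI shows).
A rough sketch against the 0.92-era Java admin API:

import java.util.Collection;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.ClusterStatus;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.ServerName;
import org.apache.hadoop.hbase.client.HBaseAdmin;

public class RegionDistributionSketch {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    HBaseAdmin admin = new HBaseAdmin(conf);

    // If hadoop2 reports zero regions (or is missing entirely), the table's
    // regions simply have not been assigned there yet.
    ClusterStatus status = admin.getClusterStatus();
    Collection<ServerName> servers = status.getServers();
    for (ServerName server : servers) {
      System.out.println(server.getHostname() + ": "
          + status.getLoad(server).getNumberOfRegions() + " regions");
    }
    admin.close();
  }
}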
Good luck!

Andy



 Thanks!

beatls

 On Wed, Dec 26, 2012 at 2:52 PM, 周梦想 abloz...@gmail.com wrote:

  Yes, the first article just covers installing and testing thrift; it has
  nothing to do with HBase.
  And the second one shows how to use thrift on HBase.
 
  2012/12/25 hua beatls bea...@gmail.com
 
    Hi Andy,
    Should I first install as per
    http://abloz.com/2012/05/31/example-thrift-installation.html, then
    configure thrift as per
    http://abloz.com/2012/06/01/python-operating-hbase-thrift-to.html?
    That is: first install thrift using the tarball from thrift.apache.org,
    then configure and start it with './hbase-daemon.sh start thrift'.
  
  
   Best R.
  
   beatls
  
   On Tue, Dec 25, 2012 at 9:29 PM, hua beatls 

Re: HBase - Secondary Index

2012-12-28 Thread Shengjie Min
Yes, as you say, when the number of rows to be returned grows, the latency
grows as well. Seeks within an HFile block are a somewhat expensive op right
now (not much, but still). The new prefix-trie encoding will be a huge bonus
here; with it the seeks will be flying. [Ted also presented this at Hadoop
China.] Thanks to Matt... :) I am trying to measure the scan performance with
this new encoding, and trying to back-port a simple patch to the 0.94 branch
just for testing... Yes, when the number of results to be returned keeps
growing, any index will perform worse, as per my study :)
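For anyone who wants to experiment with an on-disk encoding while the
prefix-trie work is still in progress, a rough sketch of switching a column
family to one of the encodings that already ships with 0.94 (FAST_DIFF here;
the table and family names are made up):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.client.HBaseAdmin;
import org.apache.hadoop.hbase.io.encoding.DataBlockEncoding;

public class EncodingSketch {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    HBaseAdmin admin = new HBaseAdmin(conf);

    HTableDescriptor desc = new HTableDescriptor("index_table"); // hypothetical
    HColumnDescriptor cf = new HColumnDescriptor("d");           // hypothetical
    // FAST_DIFF ships with 0.94; a build carrying the prefix-trie patch
    // discussed above would expose its own encoding value instead.
    cf.setDataBlockEncoding(DataBlockEncoding.FAST_DIFF);
    desc.addFamily(cf);
    admin.createTable(desc);
    admin.close();
  }
}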

yes, you are right, I guess it's just a drawback of any index approach.
Thanks for the explanation.

Shengjie

On 28 December 2012 04:14, Anoop Sam John anoo...@huawei.com wrote:

  Do you have link to that presentation?

 http://hbtc2012.hadooper.cn/subject/track4TedYu4.pdf

 -Anoop-

 
 From: Mohit Anchlia [mohitanch...@gmail.com]
 Sent: Friday, December 28, 2012 9:12 AM
 To: user@hbase.apache.org
 Subject: Re: HBase - Secondary Index

 On Thu, Dec 27, 2012 at 7:33 PM, Anoop Sam John anoo...@huawei.com
 wrote:

  Yes as you say when the no of rows to be returned is becoming more and
  more the latency will be becoming more.  seeks within an HFile block is
  some what expensive op now. (Not much but still)  The new encoding prefix
  trie will be a huge bonus here. There the seeks will be flying.. [Ted
 also
  presented this in the Hadoop China]  Thanks to Matt... :)  I am trying to
  measure the scan performance with this new encoding . Trying to back
 port a
  simple patch for 94 version just for testing...   Yes when the no of
  results to be returned is more and more any index will become less
  performing as per my study  :)
 
  Do you have link to that presentation?


  btw, quick question- in your presentation, the scale there is seconds or
  mill-seconds:)
 
  It is seconds.  Dont consider the exact values. What is the % of increase
  in latency is important :) Those were not high end machines.
 
  -Anoop-
  
  From: Shengjie Min [kelvin@gmail.com]
  Sent: Thursday, December 27, 2012 9:59 PM
  To: user@hbase.apache.org
  Subject: Re: HBase - Secondary Index
 
   Didnt follow u completely here. There wont be any get() happening.. As
  the
  exact rowkey in a region we get from the index table, we can seek to the
  exact position and return that row.
 
  Sorry, When I misused get() here, I meant seeking. Yes, if it's just
  small number of rows returned, this works perfect. As you said you will
 get
  the exact rowkey positions per region, and simply seek them. I was trying
  to work out the case that when the number of result rows increases
  massively. Like in Anil's case, he wants to do a scan query against the
  2ndary index(timestamp): select all rows from timestamp1 to timestamp2
  given no customerId provided. During that time period, he might have a
 big
  chunk of rows from different customerIds. The index table returns a lot
 of
  rowkey positions for different customerIds (I believe they are scattered
 in
  different regions), then you end up seeking all different positions in
  different regions and return all the rows needed. According to your
  presentation page14 - Performance Test Results (Scan), without index,
 it's
  a linear increase as result rows # increases. on the other hand, with
  index, time spent climbs up way quicker than the case without index.
 
  btw, quick question- in your presentation, the scale there is seconds or
  mill-seconds:)
 
  - Shengjie
 
 
  On 27 December 2012 15:54, Anoop John anoop.hb...@gmail.com wrote:
 
   how the massive number of get() is going to
   perform againt the main table
  
   Didnt follow u completely here. There wont be any get() happening.. As
  the
   exact rowkey in a region we get from the index table, we can seek to
 the
   exact position and return that row.
  
   -Anoop-
  
   On Thu, Dec 27, 2012 at 6:37 PM, Shengjie Min kelvin@gmail.com
   wrote:
  
how the massive number of get() is going to
perform againt the main table
   
  
 
 
 
  --
  All the best,
  Shengjie Min
 




-- 
All the best,
Shengjie Min


Re: Fw: Regarding Rowkey and Column Family

2012-12-28 Thread Jean-Marc Spaggiari
I have not looked at your JSON document but the rest sounds good to me.

You might want to implement a test class which creates a JSON document, puts
it in the table, retrieves it and compares it with the original one...
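Something along these lines (a minimal sketch against the Java client; the row
key, column family and sample JSON below are just placeholders):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.util.Bytes;

public class JsonRoundTripTest {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    HTable table = new HTable(conf, "Customer");

    String json = "{\"Customer\":{\"CustomerNumber\":\"101\"}}"; // trimmed sample
    byte[] row = Bytes.toBytes("rowkey");

    // Write the JSON blob into cf:json ...
    Put put = new Put(row);
    put.add(Bytes.toBytes("cf"), Bytes.toBytes("json"), Bytes.toBytes(json));
    table.put(put);

    // ... read it back and compare with what was written.
    Result result = table.get(new Get(row));
    String stored = Bytes.toString(
        result.getValue(Bytes.toBytes("cf"), Bytes.toBytes("json")));
    System.out.println(json.equals(stored) ? "round trip OK" : "MISMATCH");
    table.close();
  }
}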

JM
On Dec 27, 2012, at 01:07, varaprasad.bh...@polarisft.com wrote:

 Hi,

 For the below, may I insert into 'Customer' hbase table in the following
 way:

 put 'Customer', 'rowkey', 'cf:json',
 '{Customer: {Customer Detail:[{CustomerNumber: 101,DOB:
 01/01/01,Fname: Fname1,Mname:Mname1,Lname: Lname1,address:
 {AddressType: Home,AddressLine1 :1.1.Address Line1,
 AddressLine2 :1.1.Address Line2,AddressLine3 :1.1.Address
 Line3,AddressLine4 :1.1.Address Line4,State :1.1.State,City
 :1.1.City,Country :1.1.Country}},{ CustomerNumber:
 102,DOB: 01/02/01,Fname: Fname2,Mname:
 Mname2,Lname: Lname2,address: [{AddressType:
 Home,AddressLine1 :2.1.Address Line1,AddressLine2 :2.1.Address
 Line2,AddressLine3 :2.1.Address Line3,AddressLine4 :2.1.Address
 Line4,State :2.1.State,City :2.1.City,Country
 :2.1.Country},{AddressType: Office,AddressLine1 :2.2.Address
 Line1,AddressLine2 :2.2.Address Line2,AddressLine3 :2.2.Address
 Line3,
 AddressLine4 :2.2.Address Line4,State :2.2.State,City
 :2.2.City,Country :2.2.Country}]},{CustomerNumber:
 103,DOB: 01/03/01,Fname: Fname3,Mname:
 Mname3,Lname: Lname3,address: [{AddressType:
 Home,AddressLine1 :3.1.Address Line1,AddressLine2 :3.1.Address
 Line2,AddressLine3 :3.1.Address Line3,AddressLine4 :3.1.Address
 Line4,State :3.1.State,City :3.1.City,Country
 :3.1.Country},{AddressType: Office,AddressLine1 :3.2.Address
 Line1,AddressLine2 :3.2.Address Line2,AddressLine3 :3.2.Address
 Line3,AddressLine4 :3.2.Address Line4,State :3.2.State,City
 :3.2.City,Country :3.2.Country},{AddressType:
 Others,AddressLine1 :3.3.Address Line1,AddressLine2 :3.3.Address
 Line2,AddressLine3 :3.3.Address Line3,AddressLine4 :3.3.Address
 Line4,State :3.3.State,City :3.3.City,Country
 :3.3.Country}]},{CustomerNumber: 104,DOB:
 01/04/01,Fname: Fname4,Mname: Mname4,Lname: Lname4,address:
 [{AddressType: Home,AddressLine1 :4.1.Address Line1,AddressLine2
 :4.1.Address Line2,AddressLine3 :4.1.Address Line3,AddressLine4
 :4.1.Address Line4,State :4.1.State,City :4.1.City,Country
 :4.1.Country},{AddressType: Office,AddressLine1 :4.2.Address
 Line1,AddressLine2 :4.2.Address Line2,AddressLine3 :4.2.Address
 Line3,AddressLine4 :4.2.Address Line4,State :4.2.State,City
 :4.2.City,Country :4.2.Country},{AddressType:
 Office2,AddressLine1 :4.3.Address Line1,AddressLine2 :4.3.Address
 Line2,AddressLine3 :4.3.Address Line3,AddressLine4 :4.3.Address
 Line4,State :4.3.State,City :4.3.City,Country
 :4.3.Country},{AddressType: Others,AddressLine1 :4.4.Address
 Line1,AddressLine2 :4.4.Address Line2,AddressLine3 :4.4.Address
 Line3,AddressLine4 :4.4.Address Line4,State :4.4.State,City
 :4.4.City,Country :4.4.Country}]}]}}'

 The scan of Customer table gives:

 hbase(main):009:0 scan 'Customer'
 ROW  COLUMN+CELL
  rowkey  column=cf:json, timestamp=1356583865944,
 value={Customer: {Customer Detail: [{CustomerNumber:
 101,DOB: 01/01/01,Fname: Fname1,Mname:Mname1,Lname:
 Lname1
  ,address: {AddressType:
 Home,AddressLine1 :1.1.Address Line1, \x0AAddressLine2 :
  1.1.Address Line2,AddressLine3
 :1.1.Address Line3,AddressLine4 :1.1.Address Line4,
  State :1.1.State,City
 :1.1.City,Country :1.1.Country}},{ CustomerNumber: 1000
 002,DOB: 01/02/01,Fname:
 Fname2,Mname: Mname2,Lname: Lname2,address:[{AddressType:
 Home,AddressLine1 :2.1.Address Line1,AddressLine2 :2.1.Address Lin
  e2,AddressLine3 :2.1.Address
 Line3,AddressLine4 :2.1.Address Line4,State :2.1.St
  ate,City :2.1.City,Country
 :2.1.Country},{AddressType: Office,AddressLine1 :
  2.2.Address Line1,AddressLine2
 :2.2.Address Line2,AddressLine3 :2.2.Address Line3,\
  x0AAddressLine4 :2.2.Address
 Line4,State :2.2.State,City :2.2.City,Country :2
  .2.Country}]},{CustomerNumber:
 103,DOB: 01/03/01,Fname: Fname3,Mname:
   Mname3,Lname: Lname3,address:
 [{AddressType: Home,AddressLine1 :3.1.Address
  Line1,AddressLine2 :3.1.Address
 Line2,AddressLine3 :3.1.Address Line3,AddressLine4
   :3.1.Address Line4,State
 :3.1.State,City :3.1.City,Country :3.1.Country},{A
  ddressType: Office,AddressLine1
 :3.2.Address Line1,AddressLine2 :3.2.Address Line2
  ,AddressLine3 :3.2.Address
 Line3,AddressLine4 :3.2.Address Line4,State :3.2.Stat
  e,City :3.2.City,Country
 :3.2.Country},{AddressType: 

thrift client api supports filters and coprocessor

2012-12-28 Thread Shengjie Min
Hi guys,

Sadly, my HBase client language is Python; I am using happybase for now,
which is based on thrift AFAIK. I understand thrift still does not support
filters or coprocessors (correct me if I am wrong here). Can someone point
me to any JIRA items where I can track the plan/progress, if there is one?
The only ones I can find are from HBase in Action:

“Thrift server to match the new Java API”:
https://issues.apache.org/jira/browse/HBASE-1744

“Make Endpoint Coprocessors Available from Thrift”:
https://issues.apache.org/jira/browse/HBASE-5600

The first one doesn't seem to cover filters, and the second one hasn't been
updated in a long while.

-- 
Shengjie Min


Re: Where to place logs

2012-12-28 Thread Leonid Fedotov
Varun,
this really depends on your log rotation and retention policy.
Logs are usually pretty big, but if you rotate them once a day (for example)
and remove old logs after, say, 1 week, you probably will not need a huge
amount of space for them…
You should estimate the log size before making a decision.

Thank you!

Sincerely,
Leonid Fedotov

On Dec 28, 2012, at 12:51 AM, Varun Sharma wrote:

 Thanks for the tip. I was just wondering what people have done in the past
 - do people typically reserve a separate disk for logging activity ?
 
 Thanks
 Varun
 
 On Wed, Dec 26, 2012 at 1:13 PM, Stack st...@duboce.net wrote:
 
 On Mon, Dec 24, 2012 at 9:27 AM, Varun Sharma va...@pinterest.com wrote:
 
 Hi,
 
 I am wondering where people usually place hbase + hadoop logs. I have 4
 disks and 1 very tiny disk with barely 500 megs (thats the typical setup
 on
 amazon ec2). The 4 disks shall be used for hbase data. Since 500M is too
 small, should I place logs on one of the 4 disks. Could it potentially
 steal IOP(s) from hbase ? Does anyone have an idea how much of an
 overhead
 logging really is ?
 
 
 Looks like you have no choice but to put your logging on a data disk.
 Logging can be expensive, yes.  It should be easy enough to measure in
 your setup.  You can disable all logging for a short period while the
 cluster is under load; do it on a single node even (you can do this from
 the UI).  See how your io changes when logging is off.
 
 St.Ack
 



Re: HBase client locks application during major compactions

2012-12-28 Thread Ted Yu
Looks like there was a socket timeout:

java.net.SocketTimeoutException: 6 millis timeout while waiting for
channel to be ready for read. ch :
java.nio.channels.SocketChannel[connected local=/***:39752
remote=***/***:60020]

Have you collected / checked GC log on the server referenced above ?

BTW Have you considered deploying 0.92.2 in your cluster ?

Thanks, glad to see Cerner using HBase.

On Fri, Dec 28, 2012 at 9:40 AM, Baugher,Bryan bryan.baug...@cerner.comwrote:

 Hi everyone,

 For the past month or so we have noticed that some of our applications
 become frozen about once a day and need to be restarted in order to bring
 them back. We eventually figured out that it was caused by/happening during
 major compactions.

 We have automated major compactions disabled and are running them manually
 on each table sequentially each day starting at 4am. We are running on
 CDH4.1.1 (Hbase Version : 0.92.1-cdh4.1.1). Interestingly enough this is
 only happening in our dev environment with each region server serving ~650
 regions.

 Looking at the logs in HBase show that the compactions are occurring and
 this warning repeatedly while the compactions are occurring,

 WARN org.apache.hadoop.ipc.HBaseServer: IPC Server Responder, call
 getHTableDescriptors(), rpc version=1, client version=29,
 methodsFingerPrint=400804878 from ***: output error

 Looking at our application logs we often see this error or a variation[1].

 I took a thread dump of our application while it was locked and saw that
 nearly all of the threads in the application were blocked by a single
 thread that was waiting on HBaseClient$Call[2].

 [1] - http://pastebin.com/P4skndEg
 [2] - http://pastebin.com/YLZn3SRK


 CONFIDENTIALITY NOTICE This message and any included attachments are from
 Cerner Corporation and are intended only for the addressee. The information
 contained in this message is confidential and may constitute inside or
 non-public information under international, federal, or state securities
 laws. Unauthorized forwarding, printing, copying, distribution, or use of
 such information is strictly prohibited and may be unlawful. If you are not
 the addressee, please promptly delete this message and notify the sender of
 the delivery error by e-mail or you may call Cerner's corporate offices in
 Kansas City, Missouri, U.S.A at (+1) (816)221-1024.



Re: HBase client locks application during major compactions

2012-12-28 Thread Baugher,Bryan


On 12/28/12 12:14 PM, Ted Yu yuzhih...@gmail.com wrote:

Looks like there was socket timeout :

java.net.SocketTimeoutException: 6 millis timeout while waiting for
channel to be ready for read. ch :
java.nio.channels.SocketChannel[connected local=/***:39752
remote=***/***:60020]

Have you collected / checked GC log on the server referenced above ?

I am not sure exactly which server you are referring to. For the
application server we don't currently collect gc logs. For hbase we do but
the gc logs were truncated recently and won't help.


BTW Have you considered deploying 0.92.2 in your cluster ?

Not really. We have stuck with cloudera's distribution for a couple years
now and I don't really see us going down that track.


Thanks, glad to see Cerner using HBase.

On Fri, Dec 28, 2012 at 9:40 AM, Baugher,Bryan
bryan.baug...@cerner.comwrote:

 Hi everyone,

 For the past month or so we have noticed that some of our applications
 become frozen about once a day and need to be restarted in order to
bring
 them back. We eventually figured out that it was caused by/happening
during
 major compactions.

 We have automated major compactions disabled and are running them
manually
 on each table sequentially each day starting at 4am. We are running on
 CDH4.1.1 (Hbase Version : 0.92.1-cdh4.1.1). Interestingly enough this is
 only happening in our dev environment with each region server serving
~650
 regions.

 Looking at the logs in HBase show that the compactions are occurring and
 this warning repeatedly while the compactions are occurring,

 WARN org.apache.hadoop.ipc.HBaseServer: IPC Server Responder, call
 getHTableDescriptors(), rpc version=1, client version=29,
 methodsFingerPrint=400804878 from ***: output error

 Looking at our application logs we often see this error or a
variation[1].

 I took a thread dump of our application while it was locked and saw that
 nearly all of the threads in the application were blocked by a single
 thread that was waiting on HBaseClient$Call[2].

 [1] - http://pastebin.com/P4skndEg
 [2] - http://pastebin.com/YLZn3SRK


 CONFIDENTIALITY NOTICE This message and any included attachments are
from
 Cerner Corporation and are intended only for the addressee. The
information
 contained in this message is confidential and may constitute inside or
 non-public information under international, federal, or state securities
 laws. Unauthorized forwarding, printing, copying, distribution, or use
of
 such information is strictly prohibited and may be unlawful. If you are
not
 the addressee, please promptly delete this message and notify the
sender of
 the delivery error by e-mail or you may call Cerner's corporate offices
in
 Kansas City, Missouri, U.S.A at (+1) (816)221-1024.




Re: HBase client locks application during major compactions

2012-12-28 Thread Ted Yu
I was talking about the server which was anonymized:
***/***:60020

Cheers

On Fri, Dec 28, 2012 at 10:41 AM, Baugher,Bryan bryan.baug...@cerner.comwrote:



 On 12/28/12 12:14 PM, Ted Yu yuzhih...@gmail.com wrote:

 Looks like there was socket timeout :
 
 java.net.SocketTimeoutException: 6 millis timeout while waiting for
 channel to be ready for read. ch :
 java.nio.channels.SocketChannel[connected local=/***:39752
 remote=***/***:60020]
 
 Have you collected / checked GC log on the server referenced above ?

 I am not sure exactly which server you are referring to. For the
 application server we don't currently collect gc logs. For hbase we do but
 the gc logs were truncated recently and won't help.

 
 BTW Have you considered deploying 0.92.2 in your cluster ?

 Not really. We have stuck with cloudera's distribution for a couple years
 now and I don't really see us going down that track.

 
 Thanks, glad to see Cerner using HBase.
 
 On Fri, Dec 28, 2012 at 9:40 AM, Baugher,Bryan
 bryan.baug...@cerner.comwrote:
 
  Hi everyone,
 
  For the past month or so we have noticed that some of our applications
  become frozen about once a day and need to be restarted in order to
 bring
  them back. We eventually figured out that it was caused by/happening
 during
  major compactions.
 
  We have automated major compactions disabled and are running them
 manually
  on each table sequentially each day starting at 4am. We are running on
  CDH4.1.1 (Hbase Version : 0.92.1-cdh4.1.1). Interestingly enough this is
  only happening in our dev environment with each region server serving
 ~650
  regions.
 
  Looking at the logs in HBase show that the compactions are occurring and
  this warning repeatedly while the compactions are occurring,
 
  WARN org.apache.hadoop.ipc.HBaseServer: IPC Server Responder, call
  getHTableDescriptors(), rpc version=1, client version=29,
  methodsFingerPrint=400804878 from ***: output error
 
  Looking at our application logs we often see this error or a
 variation[1].
 
  I took a thread dump of our application while it was locked and saw that
  nearly all of the threads in the application were blocked by a single
  thread that was waiting on HBaseClient$Call[2].
 
  [1] - http://pastebin.com/P4skndEg
  [2] - http://pastebin.com/YLZn3SRK
 
 
  CONFIDENTIALITY NOTICE This message and any included attachments are
 from
  Cerner Corporation and are intended only for the addressee. The
 information
  contained in this message is confidential and may constitute inside or
  non-public information under international, federal, or state securities
  laws. Unauthorized forwarding, printing, copying, distribution, or use
 of
  such information is strictly prohibited and may be unlawful. If you are
 not
  the addressee, please promptly delete this message and notify the
 sender of
  the delivery error by e-mail or you may call Cerner's corporate offices
 in
  Kansas City, Missouri, U.S.A at (+1) (816)221-1024.
 




Re: Error: java.io.IOException: Could not seek StoreFileScanner[HFileScanner for reade

2012-12-28 Thread Bryan Beaudreault
We've seen this at times too and would be interested to know what causes
it.  Until you get a better answer, one thing we found that helps is to move
the region via the hbase shell.  For us, clients were able to scan the
region again once it had moved.
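The shell's move 'ENCODED_REGION_NAME' command calls HBaseAdmin.move() under
the hood; if you would rather do it programmatically, a rough sketch (the
encoded region name below is made up; take it from the master UI or .META.
for the region that fails):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HBaseAdmin;
import org.apache.hadoop.hbase.util.Bytes;

public class MoveRegionSketch {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    HBaseAdmin admin = new HBaseAdmin(conf);

    // Encoded region name: the hash suffix at the end of the full region name.
    String encodedRegionName = "0123456789abcdef0123456789abcdef"; // hypothetical
    // A null destination lets the master pick a new server, like the shell does.
    admin.move(Bytes.toBytes(encodedRegionName), null);
    admin.close();
  }
}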


On Fri, Dec 28, 2012 at 2:00 AM, satish verma satish.bigd...@gmail.comwrote:

 Hi

  I am running a MR job on HBase. I am getting this sort of error on a few
 region servers. Can someone explain what this means and what I should do?

  Thanks.

 12/12/26 03:03:18 INFO mapred.JobClient: Task Id :
 attempt_201210121702_699686_m_
 46_0, Status : FAILED
 org.apache.hadoop.hbase.client.RetriesExhaustedException: Trying to contact
 region server b-app-80.ssprod:65520 for region
 webpage_production,9c3583e2e97d7845288f6792ed641e60,1343757929164, row
 '9c3583e2e97d7845288f6792ed641e60', but failed after 10 attempts.
 Exceptions:
 java.io.IOException: java.io.IOException: Could not seek
 StoreFileScanner[HFileScanner for reader

 reader=hdfs://b-hadoop-master.ssprod:54310/hbase/webpage_production/1629342332/tm/7630213467863536608,
 compression=gz, inMemory=false,
 firstKey=9c3583e2e97d7845288f6792ed641e60/tm:eid/1337397152960/Put,
 lastKey=a05d22c20c779e39dffc28fe86053408/tm:url/1352931877105/Put,
 avgKeyLen=49, avgValueLen=38, entries=4091448, length=75747961, cur=null]
 at

 org.apache.hadoop.hbase.regionserver.StoreFileScanner.seek(StoreFileScanner.java:104)
 at

 org.apache.hadoop.hbase.regionserver.StoreScanner.init(StoreScanner.java:77)
 at
 org.apache.hadoop.hbase.regionserver.Store.getScanner(Store.java:1408)
 at org.apache.hadoop.hbase.regionserver.HRegion$RegionScanner



Re: Hbase Question

2012-12-28 Thread yonghu
I think you should take a look at your row-key design and distribute your
data evenly across the cluster; as you mentioned, even when you added more
nodes there was no improvement in performance. Maybe you have one node that
is a hot spot while the other nodes have no work to do.
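One common way to spread sequential or skewed keys is to prefix them with a
small hash-derived bucket ("salting"). A rough sketch (the bucket count, key
and column values below are made up, not taken from your schema):

import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.util.Bytes;

public class SaltedKeySketch {
  // Hypothetical bucket count; a small multiple of the region-server count
  // is a common starting point.
  private static final int BUCKETS = 16;

  // Prefix the natural key with a stable bucket so that adjacent keys land
  // in different regions instead of all hitting one "hot" region server.
  static byte[] saltedKey(String naturalKey) {
    int bucket = (naturalKey.hashCode() & 0x7fffffff) % BUCKETS;
    return Bytes.toBytes(String.format("%02d-%s", bucket, naturalKey));
  }

  public static void main(String[] args) {
    Put put = new Put(saltedKey("20121228-patient-000123")); // hypothetical key
    put.add(Bytes.toBytes("info"), Bytes.toBytes("diagnosis"),
        Bytes.toBytes("cardiac"));
    System.out.println(Bytes.toString(put.getRow()));
  }
}

The trade-off is that a range scan over the natural key then has to fan out
across all the buckets.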

regards!

Yong

On Tue, Dec 25, 2012 at 3:31 AM, 周梦想 abloz...@gmail.com wrote:
 Hi Dalia,

 I think you can make a small sample of the table to run the test; then
 you'll see what the difference between scan and count is,
 because you can count it by hand.

 Best regards,
 Andy

 2012/12/24 Dalia Sobhy dalia.mohso...@hotmail.com


 Dear all,

 I have 50,000 rows with diagnosis qualifier = cardiac, and another 50,000
 rows with renal.

 When I type this in Hbase shell,

 import org.apache.hadoop.hbase.filter.CompareFilter
 import org.apache.hadoop.hbase.filter.SingleColumnValueFilter
 import org.apache.hadoop.hbase.filter.SubstringComparator
 import org.apache.hadoop.hbase.util.Bytes

 scan 'patient', { COLUMNS => "info:diagnosis", FILTER =>
  SingleColumnValueFilter.new(Bytes.toBytes('info'),
   Bytes.toBytes('diagnosis'),
   CompareFilter::CompareOp.valueOf('EQUAL'),
   SubstringComparator.new('cardiac'))}

 Output = 50,000 row

 import org.apache.hadoop.hbase.filter.CompareFilter
 import org.apache.hadoop.hbase.filter.SingleColumnValueFilter
 import org.apache.hadoop.hbase.filter.SubstringComparator
 import org.apache.hadoop.hbase.util.Bytes

 count 'patient', { COLUMNS => "info:diagnosis", FILTER =>
  SingleColumnValueFilter.new(Bytes.toBytes('info'),
   Bytes.toBytes('diagnosis'),
   CompareFilter::CompareOp.valueOf('EQUAL'),
   SubstringComparator.new('cardiac'))}
 Output = 100,000 row

 I also tried it using the HBase Java API with an AggregationClient instance,
 after enabling the aggregation coprocessor for the table:
 rowCount = aggregationClient.rowCount(TABLE_NAME, null, scan)

 Also, when measuring performance after adding more nodes, the operation takes
 the same time.

 So any advice please?

 I have been throughout all this mess from a couple of weeks

 Thanks,






Re: HBase client locks application during major compactions

2012-12-28 Thread Baugher,Bryan
I believe that is one of our region servers; I will have to wait till
tomorrow to check its GC logs.

On 12/28/12 12:45 PM, Ted Yu yuzhih...@gmail.com wrote:

I was talking about the server which was anonymized:
***/***:60020

Cheers

On Fri, Dec 28, 2012 at 10:41 AM, Baugher,Bryan
bryan.baug...@cerner.comwrote:



 On 12/28/12 12:14 PM, Ted Yu yuzhih...@gmail.com wrote:

 Looks like there was socket timeout :
 
 java.net.SocketTimeoutException: 6 millis timeout while waiting for
 channel to be ready for read. ch :
 java.nio.channels.SocketChannel[connected local=/***:39752
 remote=***/***:60020]
 
 Have you collected / checked GC log on the server referenced above ?

 I am not sure exactly which server you are referring to. For the
 application server we don't currently collect gc logs. For hbase we do
but
 the gc logs were truncated recently and won't help.

 
 BTW Have you considered deploying 0.92.2 in your cluster ?

 Not really. We have stuck with cloudera's distribution for a couple
years
 now and I don't really see us going down that track.

 
 Thanks, glad to see Cerner using HBase.
 
 On Fri, Dec 28, 2012 at 9:40 AM, Baugher,Bryan
 bryan.baug...@cerner.comwrote:
 
  Hi everyone,
 
  For the past month or so we have noticed that some of our
applications
  become frozen about once a day and need to be restarted in order to
 bring
  them back. We eventually figured out that it was caused by/happening
 during
  major compactions.
 
  We have automated major compactions disabled and are running them
 manually
  on each table sequentially each day starting at 4am. We are running
on
  CDH4.1.1 (Hbase Version : 0.92.1-cdh4.1.1). Interestingly enough
this is
  only happening in our dev environment with each region server serving
 ~650
  regions.
 
  Looking at the logs in HBase show that the compactions are occurring
and
  this warning repeatedly while the compactions are occurring,
 
  WARN org.apache.hadoop.ipc.HBaseServer: IPC Server Responder, call
  getHTableDescriptors(), rpc version=1, client version=29,
  methodsFingerPrint=400804878 from ***: output error
 
  Looking at our application logs we often see this error or a
 variation[1].
 
  I took a thread dump of our application while it was locked and saw
that
  nearly all of the threads in the application were blocked by a single
  thread that was waiting on HBaseClient$Call[2].
 
  [1] - http://pastebin.com/P4skndEg
  [2] - http://pastebin.com/YLZn3SRK
 
 
  CONFIDENTIALITY NOTICE This message and any included attachments are
 from
  Cerner Corporation and are intended only for the addressee. The
 information
  contained in this message is confidential and may constitute inside
or
  non-public information under international, federal, or state
securities
  laws. Unauthorized forwarding, printing, copying, distribution, or
use
 of
  such information is strictly prohibited and may be unlawful. If you
are
 not
  the addressee, please promptly delete this message and notify the
 sender of
  the delivery error by e-mail or you may call Cerner's corporate
offices
 in
  Kansas City, Missouri, U.S.A at (+1) (816)221-1024.