[jira] [Commented] (HBASE-7525) A canary monitoring program specifically for regionserver

2013-10-27 Thread takeshi.miao (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13806331#comment-13806331
 ] 

takeshi.miao commented on HBASE-7525:
-

[~eclark] Thanks for reviewing this JIRA :)

 A canary monitoring program specifically for regionserver
 -

 Key: HBASE-7525
 URL: https://issues.apache.org/jira/browse/HBASE-7525
 Project: HBase
  Issue Type: New Feature
  Components: monitoring
Affects Versions: 0.94.0
Reporter: takeshi.miao
Assignee: takeshi.miao
Priority: Critical
 Fix For: 0.98.0, 0.96.1

 Attachments: HBASE-7525-0.95-v0.patch, HBASE-7525-0.95-v1.patch, 
 HBASE-7525-0.95-v3.patch, HBASE-7525-0.95-v4.patch, HBASE-7525-0.95-v6.patch, 
 HBASE-7525-0.95-v7.patch, HBASE-7525-trunk-v2.patch, 
 HBASE-7525-trunk-v3.patch, HBASE-7525-trunk-v4.patch, HBASE-7525-v0.patch, 
 RegionServerCanary.java


 *Motivation*
 This ticket provides a canary monitoring tool specifically for 
 HRegionServer. Details are as follows:
 1. Our operations team asked for this tool because they found the existing 
 per-region canary too fine-grained for their needs, so I implemented this 
 coarse-grained one for them, based on the original o.a.h.h.tool.Canary.
 2. The tool is multi-threaded: each Get request is sent by its own thread. 
 I chose this design because we have been suffering from hung region servers 
 and the root cause is still unclear, so this tool can help the operations 
 team detect a hung region server, if any.
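 The threading rationale in item 2 can be illustrated with a minimal Java sketch. It is not the code from the attached patch: the class name, the Probe interface (a stand-in for the real Get request), and the result strings are all hypothetical. It only shows why one task per region server plus a bounded wait lets a hung server be reported instead of blocking the whole check.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Hypothetical sketch: one probe task per region server, each wait bounded by a timeout. */
class RegionServerProbeSketch {

    /** Stand-in for the real Get request; this interface is NOT part of the HBase API. */
    interface Probe {
        void run(String serverName) throws Exception;
    }

    /** Returns serverName -> "OK", "TIMEOUT", or "ERROR", in the order the servers were given. */
    static Map<String, String> checkAll(List<String> servers, Probe probe, long timeoutMs)
            throws InterruptedException {
        // One thread per server, so a hung server cannot delay probes to the others.
        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, servers.size()));
        Map<String, Future<?>> pending = new LinkedHashMap<>();
        try {
            for (String server : servers) {
                Callable<Void> task = () -> { probe.run(server); return null; };
                pending.put(server, pool.submit(task));
            }
            Map<String, String> results = new LinkedHashMap<>();
            for (Map.Entry<String, Future<?>> entry : pending.entrySet()) {
                try {
                    // The wait starts when get() is called, so servers later in the
                    // list may effectively be given slightly more than timeoutMs.
                    entry.getValue().get(timeoutMs, TimeUnit.MILLISECONDS);
                    results.put(entry.getKey(), "OK");
                } catch (TimeoutException e) {
                    entry.getValue().cancel(true); // interrupt the stuck probe
                    results.put(entry.getKey(), "TIMEOUT");
                } catch (ExecutionException e) {
                    results.put(entry.getKey(), "ERROR");
                }
            }
            return results;
        } finally {
            pool.shutdownNow(); // do not leave probe threads running
        }
    }

    public static void main(String[] args) throws Exception {
        // Fake probe: servers whose name starts with "hung" never answer in time.
        Probe fake = server -> {
            if (server.startsWith("hung")) {
                Thread.sleep(5_000);
            }
        };
        System.out.println(checkAll(Arrays.asList("rs1.example", "hung.example"), fake, 200));
    }
}
```

 A single-threaded loop would stall on the first hung server; here every probe is already in flight before the first wait begins, and a stuck one is cancelled once its wait expires.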
 *Example*
 1. Show the tool's help
 ./bin/hbase org.apache.hadoop.hbase.tool.RegionServerCanary -help
 Usage: [opts] [regionServerName 1 [regionServrName 2...]]
  regionServerName - FQDN serverName, can use linux command:hostname -f to check your serverName
  where [-opts] are:
    -help        Show this help and exit.
    -e           Use regionServerName as regular expression
                 which means the regionServerName is regular expression pattern
    -f B         stop whole program if first error occurs, default is true
    -t N         timeout for a check, default is 60 (milisecs)
    -daemon      Continuous check at defined intervals.
    -interval N  Interval between checks (sec)
 2. Send a request to every regionserver in an HBase cluster
 ./bin/hbase org.apache.hadoop.hbase.tool.RegionServerCanary
 3. Send a request to a single regionserver, selected by name
 ./bin/hbase org.apache.hadoop.hbase.tool.RegionServerCanary rs1.domainname
 4. Send a request to the regionserver(s) matching a regular expression
 /opt/trend/circus-opstool/bin/hbase-canary-monitor-each-regionserver.sh -e rs1.domainname.pattern
 // another example
 ./bin/hbase org.apache.hadoop.hbase.tool.RegionServerCanary -e tw-poc-tm-puppet-hdn[0-9]\{1,2\}.client.tw.trendnet.org
 5. Send a request to a regionserver and set a timeout limit for this test
 // query regionserver:rs1.domainname with timeout limit 10sec
 // -f false means the program will not exit even if a test fails
 ./bin/hbase org.apache.hadoop.hbase.tool.RegionServerCanary -f false -t 1 rs1.domainname
 // prints 1 if the check timed out
 echo $?
 6. Run in daemon mode, which sends a request to each regionserver periodically
 ./bin/hbase org.apache.hadoop.hbase.tool.RegionServerCanary -daemon
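
 How the name arguments in examples 2 to 4 could select servers is sketched below in Java. This is an assumption-laden illustration, not the tool's code: the class and method names are hypothetical, the host names are made up, and whole-name matching for -e is a guess (the ticket does not say whether the pattern must match the full server name). The backslashes before the braces in example 4 appear to be JIRA markup escapes, so the sketch writes the quantifier as plain {1,2}.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

/** Hypothetical sketch of choosing which region servers to probe from the command-line names. */
class ServerNameFilterSketch {

    /**
     * @param liveServers names of all region servers in the cluster
     * @param targets     names (or patterns, if useRegex) given on the command line
     * @param useRegex    true when the -e option is set
     */
    static List<String> select(List<String> liveServers, List<String> targets, boolean useRegex) {
        if (targets.isEmpty()) {
            return new ArrayList<>(liveServers); // no names given: probe every server
        }
        List<String> selected = new ArrayList<>();
        for (String server : liveServers) {
            for (String target : targets) {
                // Assumption: a pattern must match the whole server name.
                boolean hit = useRegex ? Pattern.matches(target, server) : target.equals(server);
                if (hit) {
                    selected.add(server);
                    break; // avoid adding the same server twice
                }
            }
        }
        return selected;
    }

    public static void main(String[] args) {
        List<String> live = Arrays.asList(
                "hdn1.example.org", "hdn12.example.org", "hdn123.example.org");
        // Exact name, as in example 3.
        System.out.println(select(live, Arrays.asList("hdn1.example.org"), false));
        // Regular expression, as in example 4: one or two digits after "hdn".
        System.out.println(select(live, Arrays.asList("hdn[0-9]{1,2}\\.example\\.org"), true));
    }
}
```

 With whole-name matching, hdn123.example.org is not selected by the pattern because it has three digits. A tool that used a partial (find-style) match instead would behave differently, so anchoring patterns explicitly is the safer habit.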



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HBASE-7525) A canary monitoring program specifically for regionserver

2013-10-25 Thread Elliott Clark (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13805632#comment-13805632
 ] 

Elliott Clark commented on HBASE-7525:
--

+1. There are some docs that should be added, but I can add those in a new JIRA.



[jira] [Commented] (HBASE-7525) A canary monitoring program specifically for regionserver

2013-10-25 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13805848#comment-13805848
 ] 

Hudson commented on HBASE-7525:
---

SUCCESS: Integrated in hbase-0.96-hadoop2 #102 (See [https://builds.apache.org/job/hbase-0.96-hadoop2/102/])
HBASE-7525 A canary monitoring program specifically for regionserver (takeshi.miao) (eclark: rev 1535847)
* /hbase/branches/0.96/hbase-server/src/main/java/org/apache/hadoop/hbase/tool/Canary.java




[jira] [Commented] (HBASE-7525) A canary monitoring program specifically for regionserver

2013-10-25 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13805903#comment-13805903
 ] 

Hudson commented on HBASE-7525:
---

SUCCESS: Integrated in hbase-0.96 #162 (See [https://builds.apache.org/job/hbase-0.96/162/])
HBASE-7525 A canary monitoring program specifically for regionserver (takeshi.miao) (eclark: rev 1535847)
* /hbase/branches/0.96/hbase-server/src/main/java/org/apache/hadoop/hbase/tool/Canary.java




[jira] [Commented] (HBASE-7525) A canary monitoring program specifically for regionserver

2013-10-25 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13805914#comment-13805914
 ] 

Hudson commented on HBASE-7525:
---

SUCCESS: Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #811 (See [https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/811/])
HBASE-7525 A canary monitoring program specifically for regionserver (takeshi.miao) (eclark: rev 1535846)
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/tool/Canary.java




[jira] [Commented] (HBASE-7525) A canary monitoring program specifically for regionserver

2013-10-25 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13805946#comment-13805946
 ] 

Hudson commented on HBASE-7525:
---

SUCCESS: Integrated in HBase-TRUNK #4648 (See [https://builds.apache.org/job/HBase-TRUNK/4648/])
HBASE-7525 A canary monitoring program specifically for regionserver (takeshi.miao) (eclark: rev 1535846)
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/tool/Canary.java




[jira] [Commented] (HBASE-7525) A canary monitoring program specifically for regionserver

2013-09-24 Thread takeshi.miao (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13776161#comment-13776161
 ] 

takeshi.miao commented on HBASE-7525:
-

Dear [~stack]

I have already uploaded the patch 
[HBASE-7525-trunk-v4.patch|https://issues.apache.org/jira/secure/attachment/12604509/HBASE-7525-trunk-v4.patch],
 but I am not sure why the CI job did not run it.


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-7525) A canary monitoring program specifically for regionserver

2013-09-24 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13776225#comment-13776225
 ] 

Hadoop QA commented on HBASE-7525:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12604509/HBASE-7525-trunk-v4.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include any new or modified tests.
Please justify why no new tests are needed for this patch.
Also please list what manual steps were performed to verify this patch.

{color:green}+1 hadoop1.0{color}.  The patch compiles against the hadoop 1.0 profile.

{color:green}+1 hadoop2.0{color}.  The patch compiles against the hadoop 2.0 profile.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any warning messages.

{color:green}+1 javac{color}.  The applied patch does not increase the total number of javac compiler warnings.

{color:red}-1 findbugs{color}.  The patch appears to introduce 2 new Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase the total number of release audit warnings.

{color:green}+1 lineLengths{color}.  The patch does not introduce lines longer than 100

{color:green}+1 site{color}.  The mvn site goal succeeds with this patch.

{color:red}-1 core tests{color}.  The patch failed these unit tests:
   org.apache.hadoop.hbase.regionserver.TestAtomicOperation

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/7354//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7354//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7354//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-thrift.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7354//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7354//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7354//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7354//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7354//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7354//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7354//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/7354//console

This message is automatically generated.


[jira] [Commented] (HBASE-7525) A canary monitoring program specifically for regionserver

2013-09-22 Thread takeshi.miao (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13774256#comment-13774256
 ] 

takeshi.miao commented on HBASE-7525:
-

Dear [~stack]

Sorry about this. I think I tested the patch on the 0.95 branch but 
forgot to test it on trunk. It is fixed now.

 A canary monitoring program specifically for regionserver
 -

 Key: HBASE-7525
 URL: https://issues.apache.org/jira/browse/HBASE-7525
 Project: HBase
  Issue Type: New Feature
  Components: monitoring
Affects Versions: 0.94.0
Reporter: takeshi.miao
Priority: Critical
 Fix For: 0.98.0

 Attachments: HBASE-7525-0.95-v0.patch, HBASE-7525-0.95-v1.patch, 
 HBASE-7525-0.95-v3.patch, HBASE-7525-0.95-v4.patch, HBASE-7525-0.95-v6.patch, 
 HBASE-7525-0.95-v7.patch, HBASE-7525-trunk-v2.patch, 
 HBASE-7525-trunk-v3.patch, HBASE-7525-trunk-v4.patch, HBASE-7525-v0.patch, 
 RegionServerCanary.java


 *Motivation*
 This ticket is to provide a canary monitoring tool specifically for 
 HRegionserver, details as follows
 1. This tool is required by operation team due to they thought that the 
 canary for each region of a HBase is too many for them, so I implemented this 
 coarse-granular one based on the original o.a.h.h.tool.Canary for them
 2. And this tool is implemented by multi-threading, which means the each Get 
 request sent by a thread. the reason I use this way is due to we suffered the 
 region server hung issue by now the root cause is still not clear. so this 
 tool can help operation team to detect hung region server if any.
 *Example*
 1. The tool's help output
 ./bin/hbase org.apache.hadoop.hbase.tool.RegionServerCanary -help
 Usage: [opts] [regionServerName 1 [regionServrName 2...]]
  regionServerName - FQDN serverName, can use linux command:hostname -f to 
  check your serverName
  where [-opts] are:
    -help        Show this help and exit.
    -e           Use regionServerName as regular expression
                 which means the regionServerName is regular expression pattern
    -f B         stop whole program if first error occurs, default is true
    -t N         timeout for a check, default is 60 (milisecs)
    -daemon      Continuous check at defined intervals.
    -interval N  Interval between checks (sec)
 2. Send a request to each regionserver in an HBase cluster
 ./bin/hbase org.apache.hadoop.hbase.tool.RegionServerCanary
 3. Send a request to a regionserver with the given name
 ./bin/hbase org.apache.hadoop.hbase.tool.RegionServerCanary rs1.domainname
 4. Send a request to the regionserver(s) matching the given regular expression
 /opt/trend/circus-opstool/bin/hbase-canary-monitor-each-regionserver.sh -e rs1.domainname.pattern
 // another example
 ./bin/hbase org.apache.hadoop.hbase.tool.RegionServerCanary -e tw-poc-tm-puppet-hdn[0-9]\{1,2\}.client.tw.trendnet.org
 5. Send a request to a regionserver and also set a timeout limit for 
 this test
 // query regionserver:rs1.domainname with timeout limit 10sec
 // -f false means the program will not exit even if the test fails
 ./bin/hbase org.apache.hadoop.hbase.tool.RegionServerCanary -f false -t 1 rs1.domainname
 // prints 1 if the check timed out
 echo $?
 6. Run in daemon mode, which means the tool sends a request to each 
 regionserver periodically
 ./bin/hbase org.apache.hadoop.hbase.tool.RegionServerCanary -daemon
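
The -e option in example 4 treats each regionServerName argument as a regular expression. The selection step might look like the Java sketch below. It assumes full-string matching with java.util.regex, which the description does not confirm, and the names ServerNameFilterSketch and select, as well as the example.org host names, are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical sketch: not taken from the attached patches.
class ServerNameFilterSketch {

    /**
     * Picks the servers to check. With no arguments every live server is
     * selected; otherwise a server is selected when any argument equals it
     * (plain mode) or fully matches it as a regular expression (-e mode).
     */
    static List<String> select(List<String> liveServers, List<String> args, boolean useRegex) {
        if (args.isEmpty()) {
            return new ArrayList<>(liveServers);
        }
        List<String> selected = new ArrayList<>();
        for (String server : liveServers) {
            for (String arg : args) {
                boolean hit = useRegex ? Pattern.matches(arg, server) : arg.equals(server);
                if (hit) {
                    selected.add(server);
                    break; // do not add the same server twice
                }
            }
        }
        return selected;
    }
}
```

With full-string matching, a pattern such as rs[0-9]{1,2}\.example\.org selects rs1.example.org and rs12.example.org but not rs123.example.org. A substring search (Matcher.find) would behave differently, so the choice between the two is worth stating in the usage text.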

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-7525) A canary monitoring program specifically for regionserver

2013-09-19 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13772618#comment-13772618
 ] 

stack commented on HBASE-7525:
--

That should be good [~takeshi.miao].  I went to commit but trunk patch has this 
issue:

[INFO] 
[ERROR] Failed to execute goal org.apache.maven.plugins:maven-compiler-plugin:2.5.1:compile (default-compile) on project hbase-server: Compilation failure
[ERROR] /Users/stack/checkouts/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/tool/Canary.java:[610,43] cannot find symbol
[ERROR] symbol  : method getNameAsString()
[ERROR] location: class byte[]
[ERROR] - [Help 1]


Does it work for you?  Thanks.

 A canary monitoring program specifically for regionserver
 -

 Key: HBASE-7525
 URL: https://issues.apache.org/jira/browse/HBASE-7525
 Project: HBase
  Issue Type: New Feature
  Components: monitoring
Affects Versions: 0.94.0
Reporter: takeshi.miao
Priority: Critical
 Fix For: 0.98.0

 Attachments: HBASE-7525-0.95-v0.patch, HBASE-7525-0.95-v1.patch, 
 HBASE-7525-0.95-v3.patch, HBASE-7525-0.95-v4.patch, HBASE-7525-0.95-v6.patch, 
 HBASE-7525-0.95-v7.patch, HBASE-7525-trunk-v2.patch, 
 HBASE-7525-trunk-v3.patch, HBASE-7525-v0.patch, RegionServerCanary.java




[jira] [Commented] (HBASE-7525) A canary monitoring program specifically for regionserver

2013-09-15 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13767709#comment-13767709
 ] 

Hadoop QA commented on HBASE-7525:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12603227/HBASE-7525-0.95-v7.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:red}-1 hadoop1.0{color}.  The patch failed to compile against the 
hadoop 1.0 profile.

Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/7232//console

This message is automatically generated.

 A canary monitoring program specifically for regionserver
 -

 Key: HBASE-7525
 URL: https://issues.apache.org/jira/browse/HBASE-7525
 Project: HBase
  Issue Type: New Feature
  Components: monitoring
Affects Versions: 0.94.0
Reporter: takeshi.miao
Priority: Critical
 Fix For: 0.98.0

 Attachments: HBASE-7525-0.95-v0.patch, HBASE-7525-0.95-v1.patch, 
 HBASE-7525-0.95-v3.patch, HBASE-7525-0.95-v4.patch, HBASE-7525-0.95-v6.patch, 
 HBASE-7525-0.95-v7.patch, HBASE-7525-trunk-v2.patch, 
 HBASE-7525-trunk-v3.patch, HBASE-7525-v0.patch, RegionServerCanary.java




[jira] [Commented] (HBASE-7525) A canary monitoring program specifically for regionserver

2013-09-15 Thread takeshi.miao (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13767713#comment-13767713
 ] 

takeshi.miao commented on HBASE-7525:
-

Dear [~stack]

I uploaded new patches for both 0.95 and trunk with the following changes.

1. Added a check for whether the user passes table names together with the 
-regionserver option.
{code}
# the user passes table names 't1' and 't2' with the '-regionserver' option
bin/hbase org.apache.hadoop.hbase.tool.Canary -regionserver t1 t2
...
# the following error message is printed to stderr
Cannot pass a tablename when using the -regionserver option, tablenames:[t1, t2]
{code}

2. Changed the usage output.
{code}
bin/hbase org.apache.hadoop.hbase.tool.Canary -help
Usage: bin/hbase org.apache.hadoop.hbase.tool.Canary [opts] [table1 
[table2]...] | [regionserver1 [regionserver2]..]
...
{code}

3. Removed the 'DEBUG [main] tool.Canary: runCount=...' line from the log output.

Please let me know if you have any questions. Thanks!
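
The check in change 1 could be sketched in Java roughly as follows. This is an illustration only and not the code in the uploaded patches: the class and method names are hypothetical, and the set of existing table names is passed in as a plain Set (how the patch obtains that list is not shown in this comment). The error text is the one quoted above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;
import java.util.Set;

// Hypothetical sketch: names are illustrative, not from Canary.java.
class RegionServerModeArgsSketch {

    /**
     * In regionserver mode, rejects positional arguments that are names of
     * existing tables. Returns the error message to print, or an empty
     * Optional when the arguments are acceptable.
     */
    static Optional<String> validate(boolean regionServerMode,
                                     List<String> args,
                                     Set<String> existingTables) {
        if (!regionServerMode) {
            return Optional.empty(); // table names are valid in the default mode
        }
        List<String> offending = new ArrayList<>();
        for (String arg : args) {
            if (existingTables.contains(arg)) {
                offending.add(arg);
            }
        }
        if (offending.isEmpty()) {
            return Optional.empty();
        }
        return Optional.of(
            "Cannot pass a tablename when using the -regionserver option, tablenames:" + offending);
    }
}
```

A caller would print the returned message to stderr and exit with a non-zero status before any request is sent.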

 A canary monitoring program specifically for regionserver
 -

 Key: HBASE-7525
 URL: https://issues.apache.org/jira/browse/HBASE-7525
 Project: HBase
  Issue Type: New Feature
  Components: monitoring
Affects Versions: 0.94.0
Reporter: takeshi.miao
Priority: Critical
 Fix For: 0.98.0

 Attachments: HBASE-7525-0.95-v0.patch, HBASE-7525-0.95-v1.patch, 
 HBASE-7525-0.95-v3.patch, HBASE-7525-0.95-v4.patch, HBASE-7525-0.95-v6.patch, 
 HBASE-7525-0.95-v7.patch, HBASE-7525-trunk-v2.patch, 
 HBASE-7525-trunk-v3.patch, HBASE-7525-v0.patch, RegionServerCanary.java




[jira] [Commented] (HBASE-7525) A canary monitoring program specifically for regionserver

2013-09-14 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13767688#comment-13767688
 ] 

stack commented on HBASE-7525:
--

bq. Yes, it's default behavior is just align with the old one, does the all 
regions monitoring

Ok.  The original behavior is a little 'surprising' but if it has been this way 
up to this, it is fair-enough changing it.

bq. It is the internal DEBUG msg, for counting how many loop of this monitor 
instance did; It can help user to observe the monitor instance's behavior 
whether as expected

I did not understand this log message.  I did not seem to ask for more than one 
loop so seeing more than one w/o asking for it is unexpected.

bq. The option '-regionserver' (regionserver mode) is exclusive with the 
default mode (region mode), which means user can only choose to use default 
mode or regionserver mode either

Understood.  We should fix the usage to make it plainer that it is exclusive 
with table ops:

Usage: ./bin/hbase Canary [opts] [table1 [table2]...] | [regionserver1 
[regionserver2]..]

... or something like that.  As is, it would seem to mix the exclusive args.

Your suggestion would allow:

Canary table1 regionserver2 ,etc.

Suggest that in the usage you are more clear that it is table OR regionserver 
ops.

 A canary monitoring program specifically for regionserver
 -

 Key: HBASE-7525
 URL: https://issues.apache.org/jira/browse/HBASE-7525
 Project: HBase
  Issue Type: New Feature
  Components: monitoring
Affects Versions: 0.94.0
Reporter: takeshi.miao
Priority: Critical
 Fix For: 0.98.0

 Attachments: HBASE-7525-0.95-v0.patch, HBASE-7525-0.95-v1.patch, 
 HBASE-7525-0.95-v3.patch, HBASE-7525-0.95-v4.patch, HBASE-7525-0.95-v6.patch, 
 HBASE-7525-trunk-v2.patch, HBASE-7525-v0.patch, RegionServerCanary.java




[jira] [Commented] (HBASE-7525) A canary monitoring program specifically for regionserver

2013-08-30 Thread takeshi.miao (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13754458#comment-13754458
 ] 

takeshi.miao commented on HBASE-7525:
-

Dear [~stack]

Here are the answers to your questions.

{quote}
 ./hbase-0.95.3-SNAPSHOT/bin/hbase --config /home/stack/conf_hbase 
org.apache.hadoop.hbase.tool.Canary
... it goes off and does something; default looks to go and get from all 
regions.
{quote}
Yes, its default behavior is aligned with the old one: it monitors all 
regions.

bq. You add 2013-08-29 09:32:16,463 DEBUG [main] tool.Canary: runCount=2. What 
does it mean ?
It is an internal DEBUG message that counts how many loops this monitor 
instance has run. It can help the user observe whether the monitor instance 
behaves as expected.

The following are the questions you asked about the _'-regionserver'_ option.
{quote}
{code}
Usage: bin/hbase org.apache.hadoop.hbase.tool.Canary [opts] [table/regionserver 
1 [table/regionserver 2...]]
...
{code}
{quote}

{quote}
Would it be clearer if the -regionserver option took arguments as in 
-regionserver=rs1,rs2,rs3 etc.?
How to interpret this then:
Usage: bin/hbase org.apache.hadoop.hbase.tool.Canary -regionserver=rs1 table1
Would above only get regions from table1 on rs1? If no regions from table1 then 
it would print out there are none?
{quote}
The _'-regionserver'_ option (regionserver mode) is mutually exclusive with the 
default mode (region mode), which means the user can choose either the default 
mode or regionserver mode, but not both.

bq. I do not know how to read 'table/regionserver 1'. What is the '1'?
It seems the usage output is confusing. I would like to change it to the 
following. What do you think?
{code}
Usage: bin/hbase org.apache.hadoop.hbase.tool.Canary [opts] [table|regionserver 
[table|regionserver ...]]
...
{code}

{quote}
 Or if you pass a table1 when you have a -regionserver option specified, you 
could just fail with Cannot pass a tablename when using the -regionserver 
option – that'd probably be simplest.
{quote}
Yes, this is a good suggestion, but currently I would rather not check whether 
the passed arguments are table names in HBase, because I would first need to 
create an HBaseAdmin instance to get the table list and then compare it with 
the passed arguments.
What do you think about making the usage output for the -regionserver option 
more precise instead? For example:
{code}
...
-regionserver  replace the table argument to regionserver,
  which means to enable regionserver mode, instead of region mode (default)
...
{code}
Either way is OK for me.

I will upload the new patches after we confirm which way to go. Thanks for 
your questions and suggestions :)


 A canary monitoring program specifically for regionserver
 -

 Key: HBASE-7525
 URL: https://issues.apache.org/jira/browse/HBASE-7525
 Project: HBase
  Issue Type: New Feature
  Components: monitoring
Affects Versions: 0.94.0
Reporter: takeshi.miao
Priority: Critical
 Fix For: 0.98.0

 Attachments: HBASE-7525-0.95-v0.patch, HBASE-7525-0.95-v1.patch, 
 HBASE-7525-0.95-v3.patch, HBASE-7525-0.95-v4.patch, HBASE-7525-0.95-v6.patch, 
 HBASE-7525-trunk-v2.patch, HBASE-7525-v0.patch, RegionServerCanary.java



[jira] [Commented] (HBASE-7525) A canary monitoring program specifically for regionserver

2013-08-29 Thread takeshi.miao (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13753446#comment-13753446
 ] 

takeshi.miao commented on HBASE-7525:
-

Dear [~stack]
I rebased and attached two patches, one for trunk and one for the 0.95 branch. 
The old patch stopped working because the 
_'region.getTableNameAsString()'_ method was removed.

{code:title=Old code}
...
  public static void sniff(final HBaseAdmin admin, String tableName) throws Exception {
    sniff(admin, new StdOutSink(), tableName);
  }
...
  tableName = region.getTableNameAsString();
{code}

{code:title=New code}
...
  public static void sniff(final HBaseAdmin admin, TableName tableName) throws Exception {
    sniff(admin, new StdOutSink(), tableName.getNameAsString());
  }
...
  tableName = region.getTableName().getNameAsString();
{code}


 A canary monitoring program specifically for regionserver
 -

 Key: HBASE-7525
 URL: https://issues.apache.org/jira/browse/HBASE-7525
 Project: HBase
  Issue Type: New Feature
  Components: monitoring
Affects Versions: 0.94.0
Reporter: takeshi.miao
Priority: Critical
 Fix For: 0.98.0

 Attachments: HBASE-7525-0.95-v0.patch, HBASE-7525-0.95-v1.patch, 
 HBASE-7525-0.95-v3.patch, HBASE-7525-0.95-v4.patch, HBASE-7525-0.95-v6.patch, 
 HBASE-7525-trunk-v2.patch, HBASE-7525-v0.patch, RegionServerCanary.java




[jira] [Commented] (HBASE-7525) A canary monitoring program specifically for regionserver

2013-08-29 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13753486#comment-13753486
 ] 

Hadoop QA commented on HBASE-7525:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12600557/HBASE-7525-0.95-v6.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 hadoop1.0{color}.  The patch compiles against the hadoop 
1.0 profile.

{color:green}+1 hadoop2.0{color}.  The patch compiles against the hadoop 
2.0 profile.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:red}-1 findbugs{color}.  The patch appears to introduce 2 new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 lineLengths{color}.  The patch does not introduce lines 
longer than 100

{color:red}-1 site{color}.  The patch appears to cause mvn site goal to 
fail.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/6962//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/6962//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/6962//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/6962//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/6962//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/6962//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/6962//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/6962//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/6962//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/6962//console

This message is automatically generated.

 A canary monitoring program specifically for regionserver
 -

 Key: HBASE-7525
 URL: https://issues.apache.org/jira/browse/HBASE-7525
 Project: HBase
  Issue Type: New Feature
  Components: monitoring
Affects Versions: 0.94.0
Reporter: takeshi.miao
Priority: Critical
 Fix For: 0.98.0

 Attachments: HBASE-7525-0.95-v0.patch, HBASE-7525-0.95-v1.patch, 
 HBASE-7525-0.95-v3.patch, HBASE-7525-0.95-v4.patch, HBASE-7525-0.95-v6.patch, 
 HBASE-7525-trunk-v2.patch, HBASE-7525-v0.patch, RegionServerCanary.java



[jira] [Commented] (HBASE-7525) A canary monitoring program specifically for regionserver

2013-08-29 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13753807#comment-13753807
 ] 

stack commented on HBASE-7525:
--

I tried it.  It is great.  Nice utility.  But usage needs cleanup because 
otherwise users will be confused on how to use it.

Did it always just run if you did not provide an argument: i.e. if I do this:

 ./hbase-0.95.3-SNAPSHOT/bin/hbase --config /home/stack/conf_hbase 
org.apache.hadoop.hbase.tool.Canary

... it goes off and does something; default looks to go and get from all 
regions.

You added this log line: 2013-08-29 09:32:16,463 DEBUG [main] tool.Canary: 
runCount=2.  What does it mean?

Here is current usage:


{code}
Usage: bin/hbase org.apache.hadoop.hbase.tool.Canary [opts] [table/regionserver 
1 [table/regionserver 2...]]
 where [opts] are:
   -help  Show this help and exit.
   -regionserver  replace the table argument to regionserver,
  which means to enable regionserver mode
   -daemonContinuous check at defined intervals.
   -interval N  Interval between checks (sec)
   -e Use region/regionserver as regular expression
  which means the region/regionserver is regular expression pattern
   -f B stop whole program if first error occurs, default is true
   -t N timeout for a check, default is 60 (milisecs)
{code}

First, the formatting is off in the above.

So, if I supply -regionserver, then the last argument is a regionserver 
hostname rather than a table it seems? (I tried it and it seems so)

I do not know how to read 'table/regionserver 1'.  What is the '1'?

Would it be clearer if the -regionserver option took arguments as in 
-regionserver=rs1,rs2,rs3 etc.?

How to interpret this then:

Usage: bin/hbase org.apache.hadoop.hbase.tool.Canary -regionserver=rs1 table1

Would the above only get regions from table1 on rs1?  If there are no regions 
from table1, would it print out that there are none?  Or, if you pass a table 
name when the -regionserver option is specified, it could just fail with 
'Cannot pass a tablename when using the -regionserver option'; that'd probably 
be simplest.
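
For illustration only, the "just fail" option could look roughly like the sketch below. The class name `CanaryArgsSketch`, the `-regionserver=rs1,rs2` syntax and the parsing approach are assumptions made for this example; this is not the Canary code from the patch, and all other options are left out.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical argument check; not the actual Canary parsing code. */
final class CanaryArgsSketch {
    final List<String> regionServers = new ArrayList<>();
    final List<String> tables = new ArrayList<>();

    /** Parses "-regionserver=rs1,rs2" plus positional table names (other options omitted). */
    static CanaryArgsSketch parse(String[] args) {
        CanaryArgsSketch parsed = new CanaryArgsSketch();
        String prefix = "-regionserver=";
        for (String arg : args) {
            if (arg.startsWith(prefix)) {
                // Split the comma-separated host list, ignoring empty entries.
                for (String rs : arg.substring(prefix.length()).split(",")) {
                    if (!rs.isEmpty()) {
                        parsed.regionServers.add(rs);
                    }
                }
            } else if (arg.startsWith("-")) {
                throw new IllegalArgumentException("Unknown option: " + arg);
            } else {
                parsed.tables.add(arg);
            }
        }
        // Mixing the two modes is rejected up front instead of being interpreted.
        if (!parsed.regionServers.isEmpty() && !parsed.tables.isEmpty()) {
            throw new IllegalArgumentException(
                "Cannot pass a tablename when using the -regionserver option");
        }
        return parsed;
    }

    public static void main(String[] args) {
        try {
            CanaryArgsSketch parsed = parse(args);
            System.out.println("regionservers=" + parsed.regionServers
                + " tables=" + parsed.tables);
        } catch (IllegalArgumentException e) {
            System.err.println(e.getMessage());
            System.exit(1);
        }
    }
}
```

Rejecting the mixed case keeps the usage text short: an argument is either a table or a regionserver, never a filter of one by the other.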

 A canary monitoring program specifically for regionserver
 -

 Key: HBASE-7525
 URL: https://issues.apache.org/jira/browse/HBASE-7525
 Project: HBase
  Issue Type: New Feature
  Components: monitoring
Affects Versions: 0.94.0
Reporter: takeshi.miao
Priority: Critical
 Fix For: 0.98.0

 Attachments: HBASE-7525-0.95-v0.patch, HBASE-7525-0.95-v1.patch, 
 HBASE-7525-0.95-v3.patch, HBASE-7525-0.95-v4.patch, HBASE-7525-0.95-v6.patch, 
 HBASE-7525-trunk-v2.patch, HBASE-7525-v0.patch, RegionServerCanary.java



[jira] [Commented] (HBASE-7525) A canary monitoring program specifically for regionserver

2013-08-28 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13752928#comment-13752928
 ] 

stack commented on HBASE-7525:
--

[~takeshi.miao] for 0.95 branch and for trunk.  Thank you.

 A canary monitoring program specifically for regionserver
 -

 Key: HBASE-7525
 URL: https://issues.apache.org/jira/browse/HBASE-7525
 Project: HBase
  Issue Type: New Feature
  Components: monitoring
Affects Versions: 0.94.0
Reporter: takeshi.miao
Priority: Critical
 Fix For: 0.98.0, 0.96.0

 Attachments: HBASE-7525-0.95-v0.patch, HBASE-7525-0.95-v1.patch, 
 HBASE-7525-0.95-v3.patch, HBASE-7525-0.95-v4.patch, HBASE-7525-v0.patch, 
 RegionServerCanary.java


 *Motivation*
 This ticket provides a canary monitoring tool specifically for 
 HRegionServer. Details as follows:
 1. Our operations team asked for this tool because they found that a canary 
 check for every region of an HBase cluster is too much for them, so I 
 implemented this coarse-grained one based on the original 
 o.a.h.h.tool.Canary.
 2. The tool is multi-threaded, which means each Get request is sent by its 
 own thread. I chose this approach because we have suffered from hung region 
 servers and the root cause is still not clear, so this tool can help the 
 operations team detect a hung region server, if any.
 *Example*
 1. The tool docs
 ./bin/hbase org.apache.hadoop.hbase.tool.RegionServerCanary -help
 Usage: [opts] [regionServerName 1 [regionServrName 2...]]
  regionServerName - FQDN serverName, can use linux command:hostname -f to check your serverName
  where [-opts] are:
    -help         Show this help and exit.
    -e            Use regionServerName as regular expression
                  which means the regionServerName is regular expression pattern
    -f B          stop whole program if first error occurs, default is true
    -t N          timeout for a check, default is 60 (milisecs)
    -daemon       Continuous check at defined intervals.
    -interval N   Interval between checks (sec)
 2. Send a request to each regionserver in an HBase cluster
 ./bin/hbase org.apache.hadoop.hbase.tool.RegionServerCanary
 3. Send a request to a regionserver by the given name
 ./bin/hbase org.apache.hadoop.hbase.tool.RegionServerCanary rs1.domainname
 4. Send a request to the regionserver(s) matching a given regular expression
 /opt/trend/circus-opstool/bin/hbase-canary-monitor-each-regionserver.sh -e rs1.domainname.pattern
 // another example
 ./bin/hbase org.apache.hadoop.hbase.tool.RegionServerCanary -e tw-poc-tm-puppet-hdn[0-9]\{1,2\}.client.tw.trendnet.org
 5. Send a request to a regionserver and also set a timeout limit for this test
 // query regionserver rs1.domainname with a timeout limit of 10 sec
 // -f false means the program will not exit even if the test fails
 ./bin/hbase org.apache.hadoop.hbase.tool.RegionServerCanary -f false -t 1 rs1.domainname
 // prints 1 if the check timed out
 echo $?
 6. Run in daemon mode, which means a request is sent to each regionserver periodically
 ./bin/hbase org.apache.hadoop.hbase.tool.RegionServerCanary -daemon
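
The two ideas in the description (selecting regionservers by regular expression, and sending each request from its own thread so that a hung server shows up as a timeout) can be sketched roughly as follows. This is a minimal, self-contained illustration under stated assumptions: the class `RegionServerProbeSketch`, its method names, and the use of a plain `Callable` in place of a real HBase Get are all hypothetical and are not taken from the attached patches.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.regex.Pattern;

/** Hypothetical sketch; class and method names are not from the patch. */
final class RegionServerProbeSketch {

    /** Keeps the server names that fully match the regular expression (the -e case). */
    static List<String> selectServers(List<String> allServers, String regex) {
        Pattern pattern = Pattern.compile(regex);
        List<String> selected = new ArrayList<>();
        for (String server : allServers) {
            if (pattern.matcher(server).matches()) {
                selected.add(server);
            }
        }
        return selected;
    }

    /**
     * Runs one probe per server, each on its own thread, and returns the servers
     * whose probe threw an exception or did not finish within timeoutMillis.
     */
    static List<String> findUnresponsive(Map<String, Callable<Void>> probes, long timeoutMillis)
            throws InterruptedException {
        List<String> failed = new ArrayList<>();
        if (probes.isEmpty()) {
            return failed;
        }
        ExecutorService pool = Executors.newFixedThreadPool(probes.size());
        try {
            Map<String, Future<Void>> pending = new LinkedHashMap<>();
            for (Map.Entry<String, Callable<Void>> e : probes.entrySet()) {
                pending.put(e.getKey(), pool.submit(e.getValue()));
            }
            // One shared deadline: the whole round takes at most timeoutMillis.
            long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
            for (Map.Entry<String, Future<Void>> e : pending.entrySet()) {
                long remaining = Math.max(0L, deadline - System.nanoTime());
                try {
                    e.getValue().get(remaining, TimeUnit.NANOSECONDS);
                } catch (TimeoutException | ExecutionException ex) {
                    failed.add(e.getKey());
                }
            }
        } finally {
            pool.shutdownNow(); // interrupt probes that are still stuck
        }
        return failed;
    }

    public static void main(String[] args) throws InterruptedException {
        Map<String, Callable<Void>> probes = new LinkedHashMap<>();
        probes.put("rs1.example.org", () -> null);   // answers immediately
        probes.put("rs2.example.org", () -> {        // simulates a hung server
            Thread.sleep(60_000);
            return null;
        });
        System.out.println(findUnresponsive(probes, 200));
    }
}
```

Because the caller only waits on the `Future`, a probe that never returns cannot block the monitor itself; a wrapper script could then map a non-empty result to a non-zero exit code.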

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-7525) A canary monitoring program specifically for regionserver

2013-08-24 Thread takeshi.miao (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13749337#comment-13749337
 ] 

takeshi.miao commented on HBASE-7525:
-

Dear [~stack]

I am not sure which branch I should rebase against: 0.95 or trunk? I ask 
because I do not see any branch for 0.98 or 0.96.

 A canary monitoring program specifically for regionserver
 -

 Key: HBASE-7525
 URL: https://issues.apache.org/jira/browse/HBASE-7525
 Project: HBase
  Issue Type: New Feature
  Components: monitoring
Affects Versions: 0.94.0
Reporter: takeshi.miao
Priority: Critical
 Fix For: 0.98.0, 0.96.0

 Attachments: HBASE-7525-0.95-v0.patch, HBASE-7525-0.95-v1.patch, 
 HBASE-7525-0.95-v3.patch, HBASE-7525-0.95-v4.patch, HBASE-7525-v0.patch, 
 RegionServerCanary.java




[jira] [Commented] (HBASE-7525) A canary monitoring program specifically for regionserver

2013-08-22 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13748316#comment-13748316
 ] 

stack commented on HBASE-7525:
--

[~takeshi.miao] I went to test this patch this evening but it has rotted in a 
pretty bad way (seems strange since no changes to Canary.java in a while).  Any 
chance of a rebase?  Thank you.

 A canary monitoring program specifically for regionserver
 -

 Key: HBASE-7525
 URL: https://issues.apache.org/jira/browse/HBASE-7525
 Project: HBase
  Issue Type: New Feature
  Components: monitoring
Affects Versions: 0.94.0
Reporter: takeshi.miao
Priority: Critical
 Fix For: 0.98.0, 0.96.0

 Attachments: HBASE-7525-0.95-v0.patch, HBASE-7525-0.95-v1.patch, 
 HBASE-7525-0.95-v3.patch, HBASE-7525-0.95-v4.patch, HBASE-7525-v0.patch, 
 RegionServerCanary.java




[jira] [Commented] (HBASE-7525) A canary monitoring program specifically for regionserver

2013-08-11 Thread takeshi.miao (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13736350#comment-13736350
 ] 

takeshi.miao commented on HBASE-7525:
-

Dear [~stack]
Please tell me if there is anything I can do for this ticket. Thanks a lot.

 A canary monitoring program specifically for regionserver
 -

 Key: HBASE-7525
 URL: https://issues.apache.org/jira/browse/HBASE-7525
 Project: HBase
  Issue Type: New Feature
  Components: monitoring
Affects Versions: 0.94.0
Reporter: takeshi.miao
Priority: Minor
 Fix For: 0.95.0

 Attachments: HBASE-7525-0.95-v0.patch, HBASE-7525-0.95-v1.patch, 
 HBASE-7525-0.95-v3.patch, HBASE-7525-0.95-v4.patch, HBASE-7525-v0.patch, 
 RegionServerCanary.java




[jira] [Commented] (HBASE-7525) A canary monitoring program specifically for regionserver

2013-08-07 Thread takeshi.miao (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13731848#comment-13731848
 ] 

takeshi.miao commented on HBASE-7525:
-

Dear [~stack]

I have added the rebased patch. Please note that I am still using a Scan for 
the empty startRowKey case, because I still hit the following exception while 
testing in my environment.

{code}
...
13/08/07 10:05:11 INFO zookeeper.RecoverableZooKeeper: Process identifier=hconnection-0x6a3b8b49 connecting to ZooKeeper ensemble=localhost:2181
13/08/07 10:05:11 INFO zookeeper.ClientCnxn: Session establishment complete on server localhost.localdomain/127.0.0.1:2181, sessionid = 0x140583a69570010, negotiated timeout = 9
Exception in thread "Thread-1" java.lang.IllegalArgumentException: Row length is 0
at org.apache.hadoop.hbase.client.Mutation.checkRow(Mutation.java:364)
at org.apache.hadoop.hbase.client.Mutation.checkRow(Mutation.java:348)
at org.apache.hadoop.hbase.client.Get.<init>(Get.java:86)
at org.apache.hadoop.hbase.tool.Canary$RegionServerMonitor.monitorRegionServers(Canary.java:563)
at org.apache.hadoop.hbase.tool.Canary$RegionServerMonitor.run(Canary.java:540)
at java.lang.Thread.run(Thread.java:662)
{code}
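
The trace shows the Get constructor rejecting a zero-length row, which is what happens when the probe uses the start key of a table's first region (that key is empty). A rough sketch of the guard described above is given here; `StartKeyProbeSketch`, `ProbeKind` and `chooseProbe` are illustrative names only, and plain byte arrays stand in for the HBase client classes so that the example is self-contained.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical sketch; ProbeKind and chooseProbe are illustrative names, not HBase API. */
final class StartKeyProbeSketch {
    enum ProbeKind { GET, SCAN }

    /**
     * A Get needs a non-empty row key, so a region whose start key is empty
     * (the first region of a table) is probed with a Scan instead.
     */
    static ProbeKind chooseProbe(byte[] regionStartKey) {
        if (regionStartKey == null || regionStartKey.length == 0) {
            return ProbeKind.SCAN;
        }
        return ProbeKind.GET;
    }

    public static void main(String[] args) {
        System.out.println(chooseProbe(new byte[0]));
        System.out.println(chooseProbe("row-0001".getBytes(StandardCharsets.UTF_8)));
    }
}
```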

 A canary monitoring program specifically for regionserver
 -

 Key: HBASE-7525
 URL: https://issues.apache.org/jira/browse/HBASE-7525
 Project: HBase
  Issue Type: New Feature
  Components: monitoring
Affects Versions: 0.94.0
Reporter: takeshi.miao
Priority: Minor
 Fix For: 0.95.0

 Attachments: HBASE-7525-0.95-v0.patch, HBASE-7525-0.95-v1.patch, 
 HBASE-7525-0.95-v3.patch, HBASE-7525-0.95-v4.patch, HBASE-7525-v0.patch, 
 RegionServerCanary.java




[jira] [Commented] (HBASE-7525) A canary monitoring program specifically for regionserver

2013-08-07 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13731889#comment-13731889
 ] 

Hadoop QA commented on HBASE-7525:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12596576/HBASE-7525-0.95-v4.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this patch.
Also please list what manual steps were performed to verify this patch.

{color:green}+1 hadoop1.0{color}.  The patch compiles against the hadoop 
1.0 profile.

{color:green}+1 hadoop2.0{color}.  The patch compiles against the hadoop 
2.0 profile.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 lineLengths{color}.  The patch does not introduce lines 
longer than 100

  {color:green}+1 site{color}.  The mvn site goal succeeds with this patch.

 {color:red}-1 core tests{color}.  The patch failed these unit tests:
   org.apache.hadoop.hbase.backup.TestHFileArchiving

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/6633//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/6633//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/6633//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/6633//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/6633//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/6633//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/6633//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/6633//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/6633//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/6633//console

This message is automatically generated.

 A canary monitoring program specifically for regionserver
 -

 Key: HBASE-7525
 URL: https://issues.apache.org/jira/browse/HBASE-7525
 Project: HBase
  Issue Type: New Feature
  Components: monitoring
Affects Versions: 0.94.0
Reporter: takeshi.miao
Priority: Minor
 Fix For: 0.95.0

 Attachments: HBASE-7525-0.95-v0.patch, HBASE-7525-0.95-v1.patch, 
 HBASE-7525-0.95-v3.patch, HBASE-7525-0.95-v4.patch, HBASE-7525-v0.patch, 
 RegionServerCanary.java



[jira] [Commented] (HBASE-7525) A canary monitoring program specifically for regionserver

2013-08-05 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13729743#comment-13729743
 ] 

stack commented on HBASE-7525:
--

[~takeshi.miao] Pardon us for overlooking this addition.  Please rebase (I 
think the scan instead of get for the first key in the table has been 
addressed) and let's get it in.

 A canary monitoring program specifically for regionserver
 -

 Key: HBASE-7525
 URL: https://issues.apache.org/jira/browse/HBASE-7525
 Project: HBase
  Issue Type: New Feature
  Components: monitoring
Affects Versions: 0.94.0
Reporter: takeshi.miao
Priority: Minor
 Fix For: 0.95.0

 Attachments: HBASE-7525-0.95-v0.patch, HBASE-7525-0.95-v1.patch, 
 HBASE-7525-0.95-v3.patch, HBASE-7525-v0.patch, RegionServerCanary.java




[jira] [Commented] (HBASE-7525) A canary monitoring program specifically for regionserver

2013-08-04 Thread takeshi.miao (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13729100#comment-13729100
 ] 

takeshi.miao commented on HBASE-7525:
-

Dear [~stack]  [~mbertozzi]

I am wondering what you think about this ticket.

 A canary monitoring program specifically for regionserver
 -

 Key: HBASE-7525
 URL: https://issues.apache.org/jira/browse/HBASE-7525
 Project: HBase
  Issue Type: New Feature
  Components: monitoring
Affects Versions: 0.94.0
Reporter: takeshi.miao
Priority: Minor
 Fix For: 0.95.0

 Attachments: HBASE-7525-0.95-v0.patch, HBASE-7525-0.95-v1.patch, 
 HBASE-7525-0.95-v3.patch, HBASE-7525-v0.patch, RegionServerCanary.java




[jira] [Commented] (HBASE-7525) A canary monitoring program specifically for regionserver

2013-05-01 Thread takeshi.miao (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13646538#comment-13646538
 ] 

takeshi.miao commented on HBASE-7525:
-

[~mbertozzi] the new patch is attached. Could you please take a look at it?

 A canary monitoring program specifically for regionserver
 -

 Key: HBASE-7525
 URL: https://issues.apache.org/jira/browse/HBASE-7525
 Project: HBase
  Issue Type: New Feature
  Components: monitoring
Affects Versions: 0.94.0
Reporter: takeshi.miao
Priority: Minor
 Fix For: 0.95.0

 Attachments: HBASE-7525-0.95-v0.patch, HBASE-7525-0.95-v1.patch, 
 HBASE-7525-v0.patch, RegionServerCanary.java




[jira] [Commented] (HBASE-7525) A canary monitoring program specifically for regionserver

2013-05-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13646539#comment-13646539
 ] 

Hadoop QA commented on HBASE-7525:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12581354/HBASE-7525-0.95-v1.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include any new or modified tests.
Please justify why no new tests are needed for this patch.
Also please list what manual steps were performed to verify this patch.

{color:red}-1 patch{color}.  The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/5519//console

This message is automatically generated.

 A canary monitoring program specifically for regionserver
 -

 Key: HBASE-7525
 URL: https://issues.apache.org/jira/browse/HBASE-7525
 Project: HBase
  Issue Type: New Feature
  Components: monitoring
Affects Versions: 0.94.0
Reporter: takeshi.miao
Priority: Minor
 Fix For: 0.95.0

 Attachments: HBASE-7525-0.95-v0.patch, HBASE-7525-0.95-v1.patch, 
 HBASE-7525-v0.patch, RegionServerCanary.java




[jira] [Commented] (HBASE-7525) A canary monitoring program specifically for regionserver

2013-05-01 Thread Matteo Bertozzi (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13646553#comment-13646553
 ] 

Matteo Bertozzi commented on HBASE-7525:


Thanks for the quick follow-up.

To make Hadoop QA happy, use git diff > HBASE-XYZ.patch
(see the "unable to apply patch" error above).

For getStartKey() you're right, it's not null, but even an empty byte array 
doesn't pass the test: Mutation.checkRow() throws an exception on length == 0.
{code}
java.lang.IllegalArgumentException: Row length is 0
    at org.apache.hadoop.hbase.client.Mutation.checkRow(Mutation.java:335)
    at org.apache.hadoop.hbase.client.Mutation.checkRow(Mutation.java:319)
    at org.apache.hadoop.hbase.client.Get.<init>(Get.java:86)
    at org.apache.hadoop.hbase.tool.Canary$RegionMonitor.sniffRegion(Canary.java:483)
    at org.apache.hadoop.hbase.tool.Canary$RegionMonitor.sniff(Canary.java:463)
    at org.apache.hadoop.hbase.tool.Canary$RegionMonitor.sniff(Canary.java:433)
    at org.apache.hadoop.hbase.tool.Canary$RegionMonitor.run(Canary.java:380)
{code}

My simple test is below; you get a misleading error on the first region 
because of the empty start key.
{code}
$ hbase shell
> create 'testtb', 'cf'
> put 'testtb', 'row0', 'cf:q', '0'
> put 'testtb', 'row1', 'cf:q', '1'
> put 'testtb', 'row2', 'cf:q', '2'
$ hbase org.apache.hadoop.hbase.tool.Canary
...
2013-05-01 13:58:50,310 ERROR [Thread-0] tool.Canary: read from region testtb,,1367412865960.99b4d7e3b71c1f5292bc96fad28bb67e. failed
{code}

Also, it may be useful to add at least a LOG.debug() with the exception inside 
every catch block, just to have an idea of what's going wrong (like the Get failure 
above).
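
A rough sketch of the logging suggestion, assuming nothing about the actual patch: the class SniffLogging and its sniff method are hypothetical, and java.util.logging is swapped in for the logging facade HBase uses so the example stays self-contained (Level.FINE plays the role of debug).

```java
import java.util.concurrent.Callable;
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical sketch: keep the short error line, but also log the exception itself.
class SniffLogging {
  private static final Logger LOG = Logger.getLogger(SniffLogging.class.getName());

  /** Returns true if the read succeeded; on failure logs the cause and returns false. */
  static boolean sniff(String regionName, Callable<Object> read) {
    try {
      read.call();
      return true;
    } catch (Exception e) {
      // Short message for operators, as in the Canary output quoted above.
      LOG.severe("read from region " + regionName + " failed");
      // Debug-level entry carrying the exception, so the root cause is not lost.
      LOG.log(Level.FINE, "read from region " + regionName + " failed", e);
      return false;
    }
  }

  public static void main(String[] args) {
    boolean ok = sniff("testtb,,1367412865960", () -> {
      throw new IllegalArgumentException("Row length is 0");
    });
    System.out.println("ok=" + ok);
  }
}
```

With the debug entry enabled, the "Row length is 0" cause would be visible next to the otherwise misleading "read from region ... failed" line.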

 A canary monitoring program specifically for regionserver
 -

 Key: HBASE-7525
 URL: https://issues.apache.org/jira/browse/HBASE-7525
 Project: HBase
  Issue Type: New Feature
  Components: monitoring
Affects Versions: 0.94.0
Reporter: takeshi.miao
Priority: Minor
 Fix For: 0.95.0

 Attachments: HBASE-7525-0.95-v0.patch, HBASE-7525-0.95-v1.patch, 
 HBASE-7525-v0.patch, RegionServerCanary.java




[jira] [Commented] (HBASE-7525) A canary monitoring program specifically for regionserver

2013-05-01 Thread takeshi.miao (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13646626#comment-13646626
 ] 

takeshi.miao commented on HBASE-7525:
-

[~mbertozzi], I'd like to use a Scan to solve the region.getStartKey() issue. 
What do you think?
{code}
startKey = region.getStartKey();
if (startKey.length > 0) {
  // a non-empty start key is a valid row for a Get
  get = new Get(startKey);
  get.addFamily(column.getName());
} else {
  // the first region has an empty start key, which Get rejects
  scan = new Scan();
  scan.setCaching(1);
  scan.addFamily(column.getName());
}
...
if (startKey.length > 0) {
  table.get(get);
} else {
  table.getScanner(scan);
}
{code}

 A canary monitoring program specifically for regionserver
 -

 Key: HBASE-7525
 URL: https://issues.apache.org/jira/browse/HBASE-7525
 Project: HBase
  Issue Type: New Feature
  Components: monitoring
Affects Versions: 0.94.0
Reporter: takeshi.miao
Priority: Minor
 Fix For: 0.95.0

 Attachments: HBASE-7525-0.95-v0.patch, HBASE-7525-0.95-v1.patch, 
 HBASE-7525-v0.patch, RegionServerCanary.java




[jira] [Commented] (HBASE-7525) A canary monitoring program specifically for regionserver

2013-05-01 Thread Matteo Bertozzi (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13646632#comment-13646632
 ] 

Matteo Bertozzi commented on HBASE-7525:


Yeah, it seems a good solution. 
Also add scan.setMaxResultSize(1).
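
The Get-versus-Scan decision discussed here can be illustrated without any HBase dependency. The sketch below is a simplified stand-in: ProbeChooser, ProbeKind, and chooseProbe are hypothetical names, and checkRow only mimics the zero-length check reported in the stack trace, not the real Mutation.checkRow().

```java
import java.nio.charset.StandardCharsets;

// Hypothetical sketch: pick the probe type from the region's start key.
class ProbeChooser {
  enum ProbeKind { GET, SCAN }

  /** Simplified stand-in for the row check: rejects a zero-length row. */
  static void checkRow(byte[] row) {
    if (row.length == 0) {
      throw new IllegalArgumentException("Row length is 0");
    }
  }

  /** A Get needs a non-empty row; the first region's empty start key falls back to a Scan. */
  static ProbeKind chooseProbe(byte[] startKey) {
    return startKey.length > 0 ? ProbeKind.GET : ProbeKind.SCAN;
  }

  public static void main(String[] args) {
    byte[][] startKeys = { new byte[0], "row1".getBytes(StandardCharsets.UTF_8) };
    for (byte[] key : startKeys) {
      ProbeKind kind = chooseProbe(key);
      if (kind == ProbeKind.GET) {
        checkRow(key); // never throws here, because GET is only chosen for non-empty keys
      }
      System.out.println("start key length " + key.length + " -> " + kind);
    }
  }
}
```

In the real tool the SCAN branch would presumably carry the setCaching(1) and setMaxResultSize(1) limits mentioned in this thread, so that the probe reads as little as possible.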

 A canary monitoring program specifically for regionserver
 -

 Key: HBASE-7525
 URL: https://issues.apache.org/jira/browse/HBASE-7525
 Project: HBase
  Issue Type: New Feature
  Components: monitoring
Affects Versions: 0.94.0
Reporter: takeshi.miao
Priority: Minor
 Fix For: 0.95.0

 Attachments: HBASE-7525-0.95-v0.patch, HBASE-7525-0.95-v1.patch, 
 HBASE-7525-v0.patch, RegionServerCanary.java




[jira] [Commented] (HBASE-7525) A canary monitoring program specifically for regionserver

2013-05-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13646773#comment-13646773
 ] 

Hadoop QA commented on HBASE-7525:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12581374/HBASE-7525-0.95-v3.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include any new or modified tests.
Please justify why no new tests are needed for this patch.
Also please list what manual steps were performed to verify this patch.

{color:green}+1 hadoop1.0{color}.  The patch compiles against the hadoop 1.0 profile.

{color:green}+1 hadoop2.0{color}.  The patch compiles against the hadoop 2.0 profile.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any warning messages.

{color:green}+1 javac{color}.  The applied patch does not increase the total number of javac compiler warnings.

{color:green}+1 findbugs{color}.  The patch does not introduce any new Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase the total number of release audit warnings.

{color:green}+1 lineLengths{color}.  The patch does not introduce lines longer than 100

{color:green}+1 site{color}.  The mvn site goal succeeds with this patch.

{color:red}-1 core tests{color}.  The patch failed these unit tests:
   org.apache.hadoop.hbase.util.TestHBaseFsck

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/5520//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/5520//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/5520//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/5520//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/5520//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/5520//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/5520//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/5520//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/5520//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/5520//console

This message is automatically generated.

 A canary monitoring program specifically for regionserver
 -

 Key: HBASE-7525
 URL: https://issues.apache.org/jira/browse/HBASE-7525
 Project: HBase
  Issue Type: New Feature
  Components: monitoring
Affects Versions: 0.94.0
Reporter: takeshi.miao
Priority: Minor
 Fix For: 0.95.0

 Attachments: HBASE-7525-0.95-v0.patch, HBASE-7525-0.95-v1.patch, 
 HBASE-7525-0.95-v3.patch, HBASE-7525-v0.patch, RegionServerCanary.java



[jira] [Commented] (HBASE-7525) A canary monitoring program specifically for regionserver

2013-04-30 Thread Matteo Bertozzi (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13645465#comment-13645465
 ] 

Matteo Bertozzi commented on HBASE-7525:


At first look it seems OK.

As Hadoop QA has reported, there's one line over 100 characters in printUsageAndExit() 
(the -f B line).

Some names could be changed, and a one-line doc on the long ones 
may be useful:
Monitor.isError(): maybe rename to hasError()
Monitor.initialAdmin(): maybe initAdmin()
Monitor.doPrepareFilteredRegionServerAndRegionsMap(): is it supposed to be 
something like filterRegionServerByName?
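
As an illustration of what a method with the suggested name might do, here is a minimal sketch. It is an assumption about the intent, not the patch's implementation: RegionServerFilter, its parameters, and the host names are hypothetical, and it only covers the exact-name and -e (regular expression) cases from the usage text.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical sketch: select the regionservers to probe by exact name or by regex.
class RegionServerFilter {
  /**
   * Returns the servers matching any target. With no targets, every server is kept,
   * mirroring the "send a request to each regionserver" default in the usage text.
   */
  static List<String> filterRegionServerByName(
      List<String> serverNames, List<String> targets, boolean useRegex) {
    if (targets.isEmpty()) {
      return new ArrayList<>(serverNames);
    }
    List<String> matched = new ArrayList<>();
    for (String server : serverNames) {
      for (String target : targets) {
        // Pattern.matches requires the whole server name to match the pattern.
        boolean hit = useRegex ? Pattern.matches(target, server) : target.equals(server);
        if (hit) {
          matched.add(server);
          break; // avoid adding the same server twice
        }
      }
    }
    return matched;
  }

  public static void main(String[] args) {
    List<String> servers = Arrays.asList(
        "rs1.example.org", "rs12.example.org", "rs123.example.org", "master.example.org");
    List<String> targets = Arrays.asList("rs[0-9]{1,2}\\.example\\.org");
    System.out.println(filterRegionServerByName(servers, targets, true));
  }
}
```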

Give me some more time to try it and I'll get back with other feedback.

 A canary monitoring program specifically for regionserver
 -

 Key: HBASE-7525
 URL: https://issues.apache.org/jira/browse/HBASE-7525
 Project: HBase
  Issue Type: New Feature
  Components: monitoring
Affects Versions: 0.94.0
Reporter: takeshi.miao
Priority: Minor
 Fix For: 0.95.0

 Attachments: HBASE-7525-0.95-v0.patch, HBASE-7525-v0.patch, 
 RegionServerCanary.java




[jira] [Commented] (HBASE-7525) A canary monitoring program specifically for regionserver

2013-04-30 Thread Matteo Bertozzi (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13645593#comment-13645593
 ] 

Matteo Bertozzi commented on HBASE-7525:


There's a problem with the first region: Get(region.getStartKey()) 
throws an exception since the start key may be null, so you get a wrong 
"read from region xyz failed" report.

Could you extract the list of names in Canary.run(), where you do all the 
other parsing, instead of passing args + index around? Or at least add a comment 
explaining what args + index is... it's a bit anonymous if you don't read the rest 
of the code.
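
One way to read this suggestion is to parse all options in one place and hand the remaining names over as a list. The sketch below is hypothetical and is not taken from the patch: CanaryArgs and its field names are invented for illustration, and the option set and the timeout default of 60 simply follow the usage text quoted in this thread.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: parse options once, then expose the server names as a list.
class CanaryArgs {
  boolean help = false;
  boolean useRegex = false;
  boolean failOnError = true;
  boolean daemon = false;
  long timeout = 60;       // default as printed in the usage text
  long intervalSec = -1;   // assumption: -1 means "not set"
  List<String> regionServerNames = new ArrayList<>();

  static CanaryArgs parse(String[] args) {
    CanaryArgs parsed = new CanaryArgs();
    int i = 0;
    for (; i < args.length; i++) {
      String arg = args[i];
      if (!arg.startsWith("-")) {
        break; // first non-option: everything from here on is a server name
      }
      switch (arg) {
        case "-help": parsed.help = true; break;
        case "-e": parsed.useRegex = true; break;
        case "-daemon": parsed.daemon = true; break;
        case "-f": parsed.failOnError = Boolean.parseBoolean(value(args, ++i, arg)); break;
        case "-t": parsed.timeout = Long.parseLong(value(args, ++i, arg)); break;
        case "-interval": parsed.intervalSec = Long.parseLong(value(args, ++i, arg)); break;
        default: throw new IllegalArgumentException("Unknown option: " + arg);
      }
    }
    // The caller receives a plain list instead of the raw args plus an index.
    parsed.regionServerNames = new ArrayList<>(Arrays.asList(args).subList(i, args.length));
    return parsed;
  }

  private static String value(String[] args, int index, String option) {
    if (index >= args.length) {
      throw new IllegalArgumentException("Missing value for " + option);
    }
    return args[index];
  }

  public static void main(String[] args) {
    CanaryArgs parsed = parse(new String[] {"-e", "-f", "false", "-t", "10", "rs1", "rs2"});
    System.out.println(parsed.regionServerNames + " regex=" + parsed.useRegex);
  }
}
```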

It's not documented in the coding style, but I don't like Yoda conditions 
(null != obj). We already have some of them, but most of the code uses (obj != 
null), so my guess is that it's better to stay in line with what we already have.


 A canary monitoring program specifically for regionserver
 -

 Key: HBASE-7525
 URL: https://issues.apache.org/jira/browse/HBASE-7525
 Project: HBase
  Issue Type: New Feature
  Components: monitoring
Affects Versions: 0.94.0
Reporter: takeshi.miao
Priority: Minor
 Fix For: 0.95.0

 Attachments: HBASE-7525-0.95-v0.patch, HBASE-7525-v0.patch, 
 RegionServerCanary.java


 *Motivation*
 This ticket provides a canary monitoring tool specifically for 
 HRegionServer. Details as follows:
 1. Our operations team requires this tool because they found that running a 
 canary against every region of an HBase cluster is too much for them, so I 
 implemented this coarse-grained one based on the original 
 o.a.h.h.tool.Canary.
 2. The tool is multi-threaded, which means each Get request is sent by its 
 own thread. I did it this way because we have suffered from region servers 
 hanging and the root cause is still not clear, so this tool can help the 
 operations team detect a hung region server, if any.
 *example*
 1. the tool docs
 ./bin/hbase org.apache.hadoop.hbase.tool.RegionServerCanary -help
 Usage: [opts] [regionServerName 1 [regionServrName 2...]]
  regionServerName - FQDN serverName, can use linux command:hostname -f to 
 check your serverName
  where [-opts] are:
    -help        Show this help and exit.
    -e           Use regionServerName as regular expression
                 which means the regionServerName is regular expression pattern
    -f B         stop whole program if first error occurs, default is true
    -t N         timeout for a check, default is 60 (milisecs)
    -daemon      Continuous check at defined intervals.
    -interval N  Interval between checks (sec)
 2. Send a request to each regionserver in an HBase cluster
 ./bin/hbase org.apache.hadoop.hbase.tool.RegionServerCanary
 3. Send a request to a regionserver by the given name
 ./bin/hbase org.apache.hadoop.hbase.tool.RegionServerCanary rs1.domainname
 4. Send a request to the regionserver(s) matching a given regular expression
 /opt/trend/circus-opstool/bin/hbase-canary-monitor-each-regionserver.sh -e 
 rs1.domainname.pattern
 // another example
 ./bin/hbase org.apache.hadoop.hbase.tool.RegionServerCanary -e 
 tw-poc-tm-puppet-hdn[0-9]\{1,2\}.client.tw.trendnet.org
 5. Send a request to a regionserver and also set a timeout limit for 
 this test
 // query regionserver rs1.domainname with a timeout limit of 10 sec
 // -f false means the program will not exit even if the test fails
 ./bin/hbase org.apache.hadoop.hbase.tool.RegionServerCanary -f false -t 1 
 rs1.domainname
 // prints 1 on timeout
 echo $?
 6. Run in daemon mode, which means it sends a request to each 
 regionserver periodically
 ./bin/hbase org.apache.hadoop.hbase.tool.RegionServerCanary -daemon



[jira] [Commented] (HBASE-7525) A canary monitoring program specifically for regionserver

2013-04-30 Thread takeshi.miao (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13645604#comment-13645604
 ] 

takeshi.miao commented on HBASE-7525:
-

Do I need to do any follow-up? This is my first time contributing to the 
community, so please remind me if I miss anything. Thanks a lot.

 A canary monitoring program specifically for regionserver
 -

 Key: HBASE-7525
 URL: https://issues.apache.org/jira/browse/HBASE-7525
 Project: HBase
  Issue Type: New Feature
  Components: monitoring
Affects Versions: 0.94.0
Reporter: takeshi.miao
Priority: Minor
 Fix For: 0.95.0

 Attachments: HBASE-7525-0.95-v0.patch, HBASE-7525-v0.patch, 
 RegionServerCanary.java




[jira] [Commented] (HBASE-7525) A canary monitoring program specifically for regionserver

2013-04-30 Thread takeshi.miao (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13645622#comment-13645622
 ] 

takeshi.miao commented on HBASE-7525:
-

[~mbertozzi], got it. I will modify the code to address the issues you 
raised. I may ask more questions if anything needs to be confirmed with you. 
Thanks a lot.

 A canary monitoring program specifically for regionserver
 -

 Key: HBASE-7525
 URL: https://issues.apache.org/jira/browse/HBASE-7525
 Project: HBase
  Issue Type: New Feature
  Components: monitoring
Affects Versions: 0.94.0
Reporter: takeshi.miao
Priority: Minor
 Fix For: 0.95.0

 Attachments: HBASE-7525-0.95-v0.patch, HBASE-7525-v0.patch, 
 RegionServerCanary.java




[jira] [Commented] (HBASE-7525) A canary monitoring program specifically for regionserver

2013-04-29 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13644319#comment-13644319
 ] 

Hadoop QA commented on HBASE-7525:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12577525/HBASE-7525-0.95-v0.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this patch.
Also please list what manual steps were performed to verify this patch.

{color:green}+1 hadoop1.0{color}.  The patch compiles against the hadoop 
1.0 profile.

{color:green}+1 hadoop2.0{color}.  The patch compiles against the hadoop 
2.0 profile.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:red}-1 release audit{color}.  The applied patch generated 1 release 
audit warnings (more than the trunk's current 0 warnings).

{color:red}-1 lineLengths{color}.  The patch introduces lines longer than 100

{color:green}+1 site{color}.  The mvn site goal succeeds with this patch.

{color:red}-1 core tests{color}.  The patch failed these unit tests:
   org.apache.hadoop.hbase.backup.TestHFileArchiving

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/5481//testReport/
Release audit warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/5481//artifact/trunk/patchprocess/patchReleaseAuditProblems.txt
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/5481//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/5481//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/5481//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/5481//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/5481//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/5481//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/5481//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/5481//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/5481//console

This message is automatically generated.

 A canary monitoring program specifically for regionserver
 -

 Key: HBASE-7525
 URL: https://issues.apache.org/jira/browse/HBASE-7525
 Project: HBase
  Issue Type: New Feature
  Components: monitoring
Affects Versions: 0.94.0
Reporter: takeshi.miao
Priority: Minor
 Fix For: 0.95.0

 Attachments: HBASE-7525-0.95-v0.patch, HBASE-7525-v0.patch, 
 RegionServerCanary.java



[jira] [Commented] (HBASE-7525) A canary monitoring program specifically for regionserver

2013-04-29 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13644705#comment-13644705
 ] 

stack commented on HBASE-7525:
--

[~mbertozzi] You mind taking a looksee, boss?  It mods your Canary program.

 A canary monitoring program specifically for regionserver
 -

 Key: HBASE-7525
 URL: https://issues.apache.org/jira/browse/HBASE-7525
 Project: HBase
  Issue Type: New Feature
  Components: monitoring
Affects Versions: 0.94.0
Reporter: takeshi.miao
Priority: Minor
 Fix For: 0.95.0

 Attachments: HBASE-7525-0.95-v0.patch, HBASE-7525-v0.patch, 
 RegionServerCanary.java




[jira] [Commented] (HBASE-7525) A canary monitoring program specifically for regionserver

2013-04-08 Thread takeshi.miao (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13625336#comment-13625336
 ] 

takeshi.miao commented on HBASE-7525:
-

I also tested it with the hbase-0.95 branch.

 A canary monitoring program specifically for regionserver
 -

 Key: HBASE-7525
 URL: https://issues.apache.org/jira/browse/HBASE-7525
 Project: HBase
  Issue Type: New Feature
  Components: monitoring
Affects Versions: 0.94.0
Reporter: takeshi.miao
Priority: Minor
 Attachments: HBASE-7525-0.95-v0.patch, HBASE-7525-v0.patch, 
 RegionServerCanary.java




[jira] [Commented] (HBASE-7525) A canary monitoring program specifically for regionserver

2013-04-08 Thread takeshi.miao (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13625339#comment-13625339
 ] 

takeshi.miao commented on HBASE-7525:
-

The new usage output will look like this:
{Usage: bin/hbase org.apache.hadoop.hbase.tool.Canary [opts] 
[table/regionserver 1 [table/regionserver 2...]]
 where [opts] are:
   -help          Show this help and exit.
   -regionserver  replace the table argument to regionserver,
                  which means to enable regionserver mode
   -daemon        Continuous check at defined intervals.
   -interval N    Interval between checks (sec)
   -e             Use region/regionserver as regular expression
                  which means the region/regionserver is regular expression pattern
   -f B           stop whole program if first error occurs, default is true
   -t N           timeout for a check, default is 60 (milisecs)}

 A canary monitoring program specifically for regionserver
 -

 Key: HBASE-7525
 URL: https://issues.apache.org/jira/browse/HBASE-7525
 Project: HBase
  Issue Type: New Feature
  Components: monitoring
Affects Versions: 0.94.0
Reporter: takeshi.miao
Priority: Minor
 Attachments: HBASE-7525-0.95-v0.patch, HBASE-7525-v0.patch, 
 RegionServerCanary.java




[jira] [Commented] (HBASE-7525) A canary monitoring program specifically for regionserver

2013-01-09 Thread Jonathan Hsieh (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13548542#comment-13548542
 ] 

Jonathan Hsieh commented on HBASE-7525:
---

Can you explain how this relates to HBASE-4393?

 A canary monitoring program specifically for regionserver
 -

 Key: HBASE-7525
 URL: https://issues.apache.org/jira/browse/HBASE-7525
 Project: HBase
  Issue Type: New Feature
  Components: monitoring
Affects Versions: 0.94.0
Reporter: takeshi.miao
Priority: Minor
 Fix For: 0.94.0

 Attachments: HBASE-7525-v0.patch, RegionServerCanary.java




[jira] [Commented] (HBASE-7525) A canary monitoring program specifically for regionserver

2013-01-09 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13548894#comment-13548894
 ] 

Andrew Purtell commented on HBASE-7525:
---

The ideas are good. I'm curious whether it's possible to submit this as an 
incremental change to the existing utility?

 A canary monitoring program specifically for regionserver
 -

 Key: HBASE-7525
 URL: https://issues.apache.org/jira/browse/HBASE-7525
 Project: HBase
  Issue Type: New Feature
  Components: monitoring
Affects Versions: 0.94.0
Reporter: takeshi.miao
Priority: Minor
 Fix For: 0.94.0

 Attachments: HBASE-7525-v0.patch, RegionServerCanary.java




[jira] [Commented] (HBASE-7525) A canary monitoring program specifically for regionserver

2013-01-09 Thread takeshi.miao (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13549310#comment-13549310
 ] 

takeshi.miao commented on HBASE-7525:
-

This answers Jonathan Hsieh's question.
There are 4 differences compared with HBASE-4393:
1. This tool picks any one region from each region server to monitor, rather 
than every region in the whole HBase cluster.
2. This tool is multi-threaded, so it is not blocked when a region server 
hangs.
3. This tool takes one or more region server FQDNs as arguments and monitors 
the given region servers.
3.1 It monitors all region servers if no argument is given.
4. This tool can also take one or more regular expression patterns for region 
server FQDNs, for ease of use.

I use this tool in our internal HBase operations, so I think other people 
may have the same requirements.
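
Point 2 above can be sketched in plain Java as follows. This is only an illustration of the idea and not the code from the attached patch: ParallelProbe, probeAll, and Status are hypothetical names, and each probe is an arbitrary Callable standing in for the Get request sent to one region of a region server. The single shared deadline for the whole round is also an assumption.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Hypothetical sketch: probe every server on its own thread under one shared deadline. */
final class ParallelProbe {
    enum Status { OK, FAILED, TIMED_OUT }

    private ParallelProbe() {}

    static Map<String, Status> probeAll(Map<String, Callable<Void>> probes, long timeoutMillis) {
        // Daemon threads: a probe that ignores interruption cannot keep the JVM alive.
        ExecutorService pool = Executors.newCachedThreadPool(r -> {
            Thread t = new Thread(r, "canary-probe");
            t.setDaemon(true);
            return t;
        });
        try {
            // Start every probe before waiting on any, so a hung server delays nobody else.
            Map<String, Future<Void>> futures = new LinkedHashMap<>();
            for (Map.Entry<String, Callable<Void>> e : probes.entrySet()) {
                futures.put(e.getKey(), pool.submit(e.getValue()));
            }
            long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
            Map<String, Status> results = new LinkedHashMap<>();
            for (Map.Entry<String, Future<Void>> e : futures.entrySet()) {
                long remaining = Math.max(0L, deadline - System.nanoTime());
                try {
                    e.getValue().get(remaining, TimeUnit.NANOSECONDS);
                    results.put(e.getKey(), Status.OK);
                } catch (TimeoutException ex) {
                    e.getValue().cancel(true); // interrupt the probe that is still running
                    results.put(e.getKey(), Status.TIMED_OUT);
                } catch (ExecutionException ex) {
                    results.put(e.getKey(), Status.FAILED); // the probe itself threw
                } catch (InterruptedException ex) {
                    // The caller was interrupted: keep the flag set and report a failure.
                    Thread.currentThread().interrupt();
                    results.put(e.getKey(), Status.FAILED);
                }
            }
            return results;
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) {
        Map<String, Callable<Void>> probes = new LinkedHashMap<>();
        probes.put("rs1", () -> null);                                   // healthy
        probes.put("rs2", () -> { Thread.sleep(5_000); return null; });  // simulated hang
        probes.put("rs3", () -> { throw new IllegalStateException("refused"); });
        System.out.println(probeAll(probes, 200));
    }
}
```

With a sequential loop, the simulated hang on rs2 would hold up the check of rs3. Here every probe is already running before the first wait begins, so the whole round takes roughly the timeout at most.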


 A canary monitoring program specifically for regionserver
 -

 Key: HBASE-7525
 URL: https://issues.apache.org/jira/browse/HBASE-7525
 Project: HBase
  Issue Type: New Feature
  Components: monitoring
Affects Versions: 0.94.0
Reporter: takeshi.miao
Priority: Minor
 Fix For: 0.94.0

 Attachments: HBASE-7525-v0.patch, RegionServerCanary.java


 *Motivation*
 This ticket provides a canary monitoring tool specifically for 
 HRegionServer. Details follow:
 1. Our operations team requires this tool because they found that running the 
 canary against every region of an HBase cluster is too much for them, so I 
 implemented this coarse-grained one, based on the original o.a.h.h.tool.Canary.
 2. The tool is multi-threaded, which means each Get request is sent by its own 
 thread. I chose this design because we have suffered from hung region servers 
 whose root cause is still unclear, so this tool can help the operations team 
 detect a hung region server, if any.
 *Example*
 1. The tool's help output
 ./bin/hbase org.apache.hadoop.hbase.tool.RegionServerCanary -help
 Usage: [opts] [regionServerName 1 [regionServerName 2...]]
  regionServerName - FQDN serverName; use the Linux command "hostname -f" to 
 check your serverName
  where [-opts] are:
-help        Show this help and exit.
-e           Use regionServerName as regular expression,
             which means the regionServerName is a regular expression pattern
-f B         Stop the whole program when the first error occurs, default is true
-t N         Timeout for a check, default is 60 (millisecs)
-daemon      Continuous check at defined intervals.
-interval N  Interval between checks (sec)
 2. Send a request to every regionserver in the HBase cluster
 ./bin/hbase org.apache.hadoop.hbase.tool.RegionServerCanary
 3. Send a request to one regionserver, selected by name
 ./bin/hbase org.apache.hadoop.hbase.tool.RegionServerCanary rs1.domainname
 4. Send a request to the regionserver(s) matching a regular expression
 /opt/trend/circus-opstool/bin/hbase-canary-monitor-each-regionserver.sh -e rs1.domainname.pattern
 // another example
 ./bin/hbase org.apache.hadoop.hbase.tool.RegionServerCanary -e tw-poc-tm-puppet-hdn[0-9]\{1,2\}.client.tw.trendnet.org
 5. Send a request to a regionserver and set a timeout limit for 
 the check
 // query regionserver rs1.domainname with a timeout limit of 10sec
 // -f false means the program does not exit even if the check fails
 ./bin/hbase org.apache.hadoop.hbase.tool.RegionServerCanary -f false -t 1 rs1.domainname
 // prints 1 if the check timed out
 echo $?
 6. Run in daemon mode, which means the tool sends a request to each 
 regionserver periodically
 ./bin/hbase org.apache.hadoop.hbase.tool.RegionServerCanary -daemon
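
The -f flag and the exit code in example 5 can be pictured with the following Java sketch. FailFastRunner, Outcome, and run are hypothetical names, and the health check is a plain Predicate standing in for the real request. The checks run one after another here to keep the sketch short, whereas the tool itself is multi-threaded. That the exit code is still 1 when -f is false is an assumption drawn from example 5.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

/** Hypothetical sketch of the -f (stop on first error) flag and the resulting exit code. */
final class FailFastRunner {
    /** Outcome of one run: the exit code and the servers that were actually checked. */
    static final class Outcome {
        final int exitCode;
        final List<String> checked;

        Outcome(int exitCode, List<String> checked) {
            this.exitCode = exitCode;
            this.checked = checked;
        }
    }

    private FailFastRunner() {}

    /**
     * Checks the servers in order. The exit code is 0 if every check passed and 1 otherwise.
     * With failFast=true the run stops right after the first failed check.
     */
    static Outcome run(List<String> servers, Predicate<String> healthy, boolean failFast) {
        List<String> checked = new ArrayList<>();
        int exitCode = 0;
        for (String server : servers) {
            checked.add(server);
            if (!healthy.test(server)) {
                exitCode = 1;
                if (failFast) {
                    break; // -f true: skip the remaining servers
                }
            }
        }
        return new Outcome(exitCode, checked);
    }

    public static void main(String[] args) {
        List<String> servers = List.of("rs1", "rs2", "rs3");
        Outcome o = run(servers, s -> !s.equals("rs2"), true);
        // prints 1 [rs1, rs2]
        System.out.println(o.exitCode + " " + o.checked);
    }
}
```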



[jira] [Commented] (HBASE-7525) A canary monitoring program specifically for regionserver

2013-01-09 Thread takeshi.miao (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13549314#comment-13549314
 ] 

takeshi.miao commented on HBASE-7525:
-

Andrew Purtell, yes, I can merge this tool with o.a.h.h.tool.Canary in the next 
couple of days.
Do I need to open a new ticket for this merge work?



[jira] [Commented] (HBASE-7525) A canary monitoring program specifically for regionserver

2013-01-09 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13549319#comment-13549319
 ] 

Andrew Purtell commented on HBASE-7525:
---

bq. Do I need to issue a new ticket for this merge work ?

This ticket is ok. 
