[jira] [Commented] (HBASE-6418) Minor bug in delete flow.
[ https://issues.apache.org/jira/browse/HBASE-6418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13416893#comment-13416893 ]

Lars Hofhansl commented on HBASE-6418:
--------------------------------------

Are you sure, Laxman?
This logic only needs to be done for version delete markers (not column or family markers).
kv.isDeleteType() is only true for version delete markers. KeyValue.isDelete(type) would be true for all delete markers.
I think it is correct the way it is.

Minor bug in delete flow.
-------------------------

Key: HBASE-6418
URL: https://issues.apache.org/jira/browse/HBASE-6418
Project: HBase
Issue Type: Bug
Affects Versions: 0.94.0, 0.96.0, 0.94.1, 0.94.2
Reporter: Laxman
Assignee: Laxman

The timestamp update in the Delete flow does not consider all flavors (Delete record, Delete Family, Delete Column) of the Delete API. Currently it considers Delete record only.

org.apache.hadoop.hbase.regionserver.HRegion.prepareDeleteTimestamps(Delete, byte[])
{code}
for (KeyValue kv: kvs) {
  // Check if time is LATEST, change to time of most recent addition if so
  // This is expensive.
  if (kv.isLatestTimestamp() && kv.isDeleteType()) {
{code}
Basically, the wrong API is used.
kv.isDeleteType() should be KeyValue.isDelete(type);

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
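
The distinction Lars draws between the two checks can be illustrated with a minimal, self-contained sketch. It does not use the HBase classes: `DeleteTypeSketch`, its constants and the numeric type codes are hypothetical stand-ins, chosen only so that the three delete markers form a contiguous range. The two static methods mirror the behaviour described in the comment above, not the actual HBase source.

```java
// Hypothetical model of key-value type codes; the numeric values are
// assumptions picked so that the delete markers form one contiguous range.
final class DeleteTypeSketch {
    static final byte PUT = 4;
    static final byte DELETE = 8;          // version delete marker
    static final byte DELETE_COLUMN = 12;  // column delete marker
    static final byte DELETE_FAMILY = 14;  // family delete marker

    /** Analogue of kv.isDeleteType(): true only for a version delete marker. */
    static boolean isDeleteType(byte type) {
        return type == DELETE;
    }

    /** Analogue of KeyValue.isDelete(type): true for every kind of delete marker. */
    static boolean isDelete(byte type) {
        return DELETE <= type && type <= DELETE_FAMILY;
    }

    public static void main(String[] args) {
        byte[] types = {PUT, DELETE, DELETE_COLUMN, DELETE_FAMILY};
        String[] names = {"Put", "Delete", "DeleteColumn", "DeleteFamily"};
        for (int i = 0; i < types.length; i++) {
            System.out.println(names[i]
                + ": isDeleteType=" + isDeleteType(types[i])
                + ", isDelete=" + isDelete(types[i]));
        }
    }
}
```

Under these assumptions, swapping the narrow check for the broad one would also send column and family markers down the "find the most recent version" path, which is the behaviour Lars argues is not wanted.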
[jira] [Commented] (HBASE-6418) Minor bug in delete flow.
[ https://issues.apache.org/jira/browse/HBASE-6418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13416939#comment-13416939 ]

Anoop Sam John commented on HBASE-6418:
---------------------------------------

bq. This logic only needs to be done for version delete markers (not column or family markers).
Do you mean that with a delete of a column family we cannot say that only some older versions need to be deleted?
In Delete
{code}
/**
 * Delete all columns of the specified family with a timestamp less than
 * or equal to the specified timestamp.
 * <p>
 * Overrides previous calls to deleteColumn and deleteColumns for the
 * specified family.
 * @param family family name
 * @param timestamp maximum version timestamp
 * @return this for invocation chaining
 */
public Delete deleteFamily(byte [] family, long timestamp) {
{code}
This is the Delete family call, which also takes a timestamp. So all KVs in that family older than this time should get deleted, right?
As per the code in prepareDeleteTimestamps
{code}
if (kv.isLatestTimestamp() && kv.isDeleteType()) {
} else {
  kv.updateLatestStamp(byteNow);
}
{code}
Here the KV is of type DeleteFamily but I have given a specific timestamp. Still we will change the timestamp to the current time! Is this wrong?
Please correct me if I am wrong, Lars. I am not familiar with the delete code.
[jira] [Commented] (HBASE-6418) Minor bug in delete flow.
[ https://issues.apache.org/jira/browse/HBASE-6418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13416956#comment-13416956 ]

Anoop Sam John commented on HBASE-6418:
---------------------------------------

I am doing some testing on this delete API with different types of Deletes.
KeyValue#updateLatestStamp() has a check that avoids setting the passed TS when the KV's TS is not the latest TS.
[jira] [Commented] (HBASE-6418) Minor bug in delete flow.
[ https://issues.apache.org/jira/browse/HBASE-6418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13416959#comment-13416959 ]

Anoop Sam John commented on HBASE-6418:
---------------------------------------

@Laxman
Seems fine. I am in line with Lars after more code study and testing.
The Delete type is used only when deleteColumn(byte [] family, byte [] qualifier, long timestamp) is used [deleteColumn(byte [] family, byte [] qualifier) also]. This deletes just one version. Only there do we need to get the cell value and find the TS.
In all other types of deletes, we need to set either the TS specified by the user or the current time at the RS.
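
The rule summarized in the comment above can be written down as a small decision function. This is only a sketch of the behaviour as described in this thread, not the HRegion code: `DeleteTimestampSketch`, `Marker`, `resolve` and `LATEST` are hypothetical names, and falling back to the current time when a version delete finds no stored cell is an assumption of the sketch.

```java
// Hypothetical model of how a delete marker's timestamp is chosen.
final class DeleteTimestampSketch {
    // Assumed sentinel meaning "the client gave no timestamp".
    static final long LATEST = Long.MAX_VALUE;

    enum Marker { VERSION, COLUMN, FAMILY }

    /**
     * @param marker       flavor of the delete marker
     * @param requestedTs  timestamp carried by the marker, or LATEST if none was given
     * @param newestCellTs timestamp of the newest stored cell for the column,
     *                     or null if no cell exists
     * @param now          current time at the region server
     * @return the timestamp the marker ends up with
     */
    static long resolve(Marker marker, long requestedTs, Long newestCellTs, long now) {
        if (requestedTs != LATEST) {
            return requestedTs;      // a timestamp specified by the user is kept
        }
        if (marker == Marker.VERSION && newestCellTs != null) {
            return newestCellTs;     // version delete: target the most recent version
        }
        return now;                  // column/family delete, or nothing to target (assumption)
    }

    public static void main(String[] args) {
        System.out.println(resolve(Marker.VERSION, LATEST, 100L, 500L));
        System.out.println(resolve(Marker.COLUMN, LATEST, 100L, 500L));
        System.out.println(resolve(Marker.FAMILY, 42L, 100L, 500L));
    }
}
```

Only the version marker needs the lookup of the newest cell, which is why the original code comment calls that branch expensive.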
[jira] [Assigned] (HBASE-4255) Expose CatalogJanitor controls
[ https://issues.apache.org/jira/browse/HBASE-4255?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zhihong Ted Yu reassigned HBASE-4255:
-------------------------------------

Assignee: Devaraj Das

Expose CatalogJanitor controls
------------------------------

Key: HBASE-4255
URL: https://issues.apache.org/jira/browse/HBASE-4255
Project: HBase
Issue Type: Improvement
Affects Versions: 0.90.4
Reporter: Jean-Daniel Cryans
Assignee: Devaraj Das
Fix For: 0.96.0
Attachments: 4255-4.2.patch

When doing surgery or other operational tasks, it's nice to be able to have the .META. table quickly cleaned of split parents. The CatalogJanitor already has controls baked in (currently used in unit tests); I think we should expose them the same way we do with the balancer, that is:
- start
- stop
- request a run
A client would need to go through HBaseAdmin, and shell commands need to be created.
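
The three controls listed in the description (start, stop, request a run) could look roughly like the sketch below. It is not taken from the attached patch: `JanitorSwitchSketch` and all of its methods are hypothetical, the scan is reduced to a counter, and letting an explicit request run even while the periodic chore is switched off is a design assumption of this sketch.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical on/off switch plus on-demand run for a periodic cleanup chore.
final class JanitorSwitchSketch {
    private final AtomicBoolean enabled = new AtomicBoolean(true);
    private final AtomicInteger scans = new AtomicInteger();

    /** Start (true) or stop (false) the periodic chore; returns the previous state. */
    boolean setEnabled(boolean on) {
        return enabled.getAndSet(on);
    }

    boolean isEnabled() {
        return enabled.get();
    }

    /** Called by the periodic timer; does nothing while the switch is off. */
    void chore() {
        if (enabled.get()) {
            scan();
        }
    }

    /** Request a run right now; returns the total number of scans so far. */
    int runNow() {
        return scan();
    }

    int scanCount() {
        return scans.get();
    }

    // Stand-in for cleaning split parents out of the catalog table.
    private int scan() {
        return scans.incrementAndGet();
    }

    public static void main(String[] args) {
        JanitorSwitchSketch janitor = new JanitorSwitchSketch();
        janitor.setEnabled(false);   // stop
        janitor.chore();             // skipped while stopped
        janitor.runNow();            // request a run
        janitor.setEnabled(true);    // start
        janitor.chore();             // runs again
        System.out.println("scans=" + janitor.scanCount());
    }
}
```

Returning the previous state from the switch lets a caller restore it after maintenance, which is the same shape of call a balancer-style switch would offer.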
[jira] [Updated] (HBASE-4255) Expose CatalogJanitor controls
[ https://issues.apache.org/jira/browse/HBASE-4255?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zhihong Ted Yu updated HBASE-4255:
----------------------------------

Attachment: 4255-4.2.patch

Patch from review board.
[jira] [Updated] (HBASE-4255) Expose CatalogJanitor controls
[ https://issues.apache.org/jira/browse/HBASE-4255?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zhihong Ted Yu updated HBASE-4255:
----------------------------------

Hadoop Flags: Reviewed
Status: Patch Available (was: Open)
[jira] [Commented] (HBASE-4255) Expose CatalogJanitor controls
[ https://issues.apache.org/jira/browse/HBASE-4255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13417057#comment-13417057 ]

Zhihong Ted Yu commented on HBASE-4255:
---------------------------------------

@J-D:
Please take a look at Devaraj's patch.
[jira] [Commented] (HBASE-4255) Expose CatalogJanitor controls
[ https://issues.apache.org/jira/browse/HBASE-4255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13417114#comment-13417114 ]

Hadoop QA commented on HBASE-4255:
----------------------------------

-1 overall. Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12536980/4255-4.2.patch
  against trunk revision .

+1 @author. The patch does not contain any @author tags.

-1 tests included. The patch doesn't appear to include any new or modified tests.
  Please justify why no new tests are needed for this patch.
  Also please list what manual steps were performed to verify this patch.

+1 hadoop2.0. The patch compiles against the hadoop 2.0 profile.

+1 javadoc. The javadoc tool did not generate any warning messages.

-1 javac. The applied patch generated 5 javac compiler warnings (more than the trunk's current 4 warnings).

-1 findbugs. The patch appears to introduce 11 new Findbugs (version 1.3.9) warnings.

+1 release audit. The applied patch does not increase the total number of release audit warnings.

-1 core tests. The patch failed these unit tests:
  org.apache.hadoop.hbase.regionserver.TestSplitTransactionOnCluster

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/2397//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2397//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2397//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2397//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2397//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2397//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/2397//console

This message is automatically generated.
[jira] [Commented] (HBASE-4255) Expose CatalogJanitor controls
[ https://issues.apache.org/jira/browse/HBASE-4255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13417123#comment-13417123 ]

stack commented on HBASE-4255:
------------------------------

I added feedback up on review board.
[jira] [Commented] (HBASE-6418) Minor bug in delete flow.
[ https://issues.apache.org/jira/browse/HBASE-6418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13417130#comment-13417130 ]

stack commented on HBASE-6418:
------------------------------

Assuming the mighty @Laxman comes over to Anoop and Lars' way of thinking, is there anything we can do so that others don't think the same? Can we add a code comment, a unit test, or update the doc somewhere so we avoid others having the same (mis)understanding?
Anoop, could any of your testing go in as a unit test? Attach it here and I can commit.
[jira] [Created] (HBASE-6419) PersistentMetricsTimeVaryingRate gets used for non-time-based metrics (part2 of HBASE-6220)
stack created HBASE-6419:
-------------------------

Summary: PersistentMetricsTimeVaryingRate gets used for non-time-based metrics (part2 of HBASE-6220)
Key: HBASE-6419
URL: https://issues.apache.org/jira/browse/HBASE-6419
Project: HBase
Issue Type: Improvement
Reporter: stack
[jira] [Updated] (HBASE-6419) PersistentMetricsTimeVaryingRate gets used for non-time-based metrics (part2 of HBASE-6220)
[ https://issues.apache.org/jira/browse/HBASE-6419?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HBASE-6419:
------------------------

Attachment: ServerMetrics_HBASE_6220_Flush_Metrics.patch

Paul's patch copied over from HBASE-6220.
[jira] [Assigned] (HBASE-6419) PersistentMetricsTimeVaryingRate gets used for non-time-based metrics (part2 of HBASE-6220)
[ https://issues.apache.org/jira/browse/HBASE-6419?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack reassigned HBASE-6419:
---------------------------

Assignee: Paul Cavallaro
[jira] [Updated] (HBASE-6419) PersistentMetricsTimeVaryingRate gets used for non-time-based metrics (part2 of HBASE-6220)
[ https://issues.apache.org/jira/browse/HBASE-6419?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HBASE-6419:
------------------------

Status: Patch Available (was: Open)
[jira] [Commented] (HBASE-6220) PersistentMetricsTimeVaryingRate gets used for non-time-based metrics
[ https://issues.apache.org/jira/browse/HBASE-6220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13417135#comment-13417135 ]

stack commented on HBASE-6220:
------------------------------

I made HBASE-6419 to apply Paul's amendment.

PersistentMetricsTimeVaryingRate gets used for non-time-based metrics
---------------------------------------------------------------------

Key: HBASE-6220
URL: https://issues.apache.org/jira/browse/HBASE-6220
Project: HBase
Issue Type: Bug
Components: metrics
Affects Versions: 0.96.0
Reporter: David S. Wang
Assignee: Paul Cavallaro
Priority: Minor
Labels: noob
Attachments: ServerMetrics_HBASE_6220.patch, ServerMetrics_HBASE_6220_Flush_Metrics.patch

PersistentMetricsTimeVaryingRate gets used for metrics that are not time-based, leading to confusing names such as avg_time for compaction size, etc. You have to read the code in order to understand that this is actually referring to bytes, not seconds.
[jira] [Commented] (HBASE-6399) MetricsContext should be different between RegionServerMetrics and RegionServerDynamicMetrics
[ https://issues.apache.org/jira/browse/HBASE-6399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13417142#comment-13417142 ]

stack commented on HBASE-6399:
------------------------------

@Chunhui Shouldn't the comments in the properties file explain why there are two hbase contexts that can be loaded? i.e. one does basic metrics and the hbase-dynamic one adds a bunch more, but beware, enabling it could overwhelm your monitoring as... etc.

MetricsContext should be different between RegionServerMetrics and RegionServerDynamicMetrics
---------------------------------------------------------------------------------------------

Key: HBASE-6399
URL: https://issues.apache.org/jira/browse/HBASE-6399
Project: HBase
Issue Type: Bug
Components: metrics
Affects Versions: 0.94.0
Reporter: chunhui shen
Assignee: chunhui shen
Priority: Critical
Fix For: 0.96.0, 0.94.2
Attachments: HBASE-6399.patch, HBASE-6399v2.patch

In hadoop-metrics.properties, GangliaContext is an optional metrics context. I think we will generally use Ganglia to monitor an HBase cluster.
However, I find a serious problem:
RegionServerDynamicMetrics will generate lots of rrd files whenever we move a region or create/delete a table.
Especially if a table is created every day, as in some applications, more and more rrd files accumulate on the Gmetad server. This will corrupt the Gmetad server.
IMO, MetricsContext should be different between RegionServerMetrics and RegionServerDynamicMetrics.
[jira] [Commented] (HBASE-6418) Minor bug in delete flow.
[ https://issues.apache.org/jira/browse/HBASE-6418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13417146#comment-13417146 ]

Lars Hofhansl commented on HBASE-6418:
--------------------------------------

It was late yesterday so I was terse.
This is (partially) documented in Delete.java.

Version delete markers are a bit strange to begin with. They let you target a specific version of a specific column of a specific column family (of a specific row). It makes little sense to talk about the current time for those. That is why there is this special logic to either target a version of your choosing or the latest version.

Column or Family markers should be using the current time unless a time is specified... as is done right now.

A comment to that extent in the server code would not hurt. :)
I'm happy to make a simple patch.
[jira] [Commented] (HBASE-6401) HBase may lose edits after a crash if used with HDFS 1.0.3 or older
[ https://issues.apache.org/jira/browse/HBASE-6401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13417149#comment-13417149 ]

Lars Hofhansl commented on HBASE-6401:
--------------------------------------

This needs to be fixed in HDFS. If it is fixed in Hadoop-2 there should be a jira to backport the change.
When I did HDFS-744 I found Hadoop-1 and Hadoop-2 quite different in the way they handled packet shipping from the DFSClient, so the work might be non-trivial.

HBase may lose edits after a crash if used with HDFS 1.0.3 or older
-------------------------------------------------------------------

Key: HBASE-6401
URL: https://issues.apache.org/jira/browse/HBASE-6401
Project: HBase
Issue Type: Bug
Components: regionserver
Affects Versions: 0.96.0
Environment: all
Reporter: nkeywal
Priority: Critical
Attachments: TestReadAppendWithDeadDN.java

This comes from an hdfs bug, fixed in some hdfs versions. I haven't found the hdfs jira for this.
Context: HBase Write Ahead Log features. This uses hdfs append. If the node crashes, the file that was written is read by other processes to replay the actions.
- So we have in hdfs one (dead) process writing with another process reading.
- But, despite the call to syncFs, we don't always see the data when we have a dead node. It seems to be because the call in DFSClient#updateBlockInfo ignores the ipc errors and sets the length to 0.
- So we may miss all the writes to the last block if we try to connect to the dead DN.

hdfs 1.0.3, branch-1 or branch-1-win: we have the issue
http://svn.apache.org/viewvc/hadoop/common/branches/branch-1/src/hdfs/org/apache/hadoop/hdfs/DFSClient.java?revision=1359853&view=markup
hdfs branch-2 or trunk: we should not have the issue (but not tested)
http://svn.apache.org/viewvc/hadoop/common/branches/branch-2/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSInputStream.java?view=markup

The attached test will fail ~50% of the time.
[jira] [Commented] (HBASE-4470) ServerNotRunningException coming out of assignRootAndMeta kills the Master
[ https://issues.apache.org/jira/browse/HBASE-4470?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13417154#comment-13417154 ]

stack commented on HBASE-4470:
------------------------------

Patch lgtm. +1 (I thought we'd plugged all the places these could come up but looks like no). Good stuff G.

ServerNotRunningException coming out of assignRootAndMeta kills the Master
--------------------------------------------------------------------------

Key: HBASE-4470
URL: https://issues.apache.org/jira/browse/HBASE-4470
Project: HBase
Issue Type: Bug
Affects Versions: 0.90.4
Reporter: Jean-Daniel Cryans
Assignee: Gregory Chanan
Priority: Critical
Fix For: 0.90.7
Attachments: HBASE-4470-90.patch

I'm surprised we still have issues like that and I didn't get a hit while googling so forgive me if there's already a jira about it.
When the master starts it verifies the locations of root and meta before assigning them, if the server is started but not running you'll get this:
{quote}
2011-09-23 04:47:44,859 WARN org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation: RemoteException connecting to RS
org.apache.hadoop.ipc.RemoteException: org.apache.hadoop.hbase.ipc.ServerNotRunningException: Server is not running yet
  at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1038)
  at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:771)
  at org.apache.hadoop.hbase.ipc.HBaseRPC$Invoker.invoke(HBaseRPC.java:257)
  at $Proxy6.getProtocolVersion(Unknown Source)
  at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:419)
  at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:393)
  at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:444)
  at org.apache.hadoop.hbase.ipc.HBaseRPC.waitForProxy(HBaseRPC.java:349)
  at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getHRegionConnection(HConnectionManager.java:969)
  at org.apache.hadoop.hbase.catalog.CatalogTracker.getCachedConnection(CatalogTracker.java:388)
  at org.apache.hadoop.hbase.catalog.CatalogTracker.getMetaServerConnection(CatalogTracker.java:287)
  at org.apache.hadoop.hbase.catalog.CatalogTracker.verifyMetaRegionLocation(CatalogTracker.java:484)
  at org.apache.hadoop.hbase.master.HMaster.assignRootAndMeta(HMaster.java:441)
  at org.apache.hadoop.hbase.master.HMaster.finishInitialization(HMaster.java:388)
  at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:282)
{quote}
I hit that 3-4 times this week while debugging something else. The worst is that when you restart the master it sees that as a failover, but none of the regions are assigned so it takes an eternity to get back fully online.
[jira] [Commented] (HBASE-6418) Minor bug in delete flow.
[ https://issues.apache.org/jira/browse/HBASE-6418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13417156#comment-13417156 ]

stack commented on HBASE-6418:
------------------------------

It worries me if the likes of Laxman get confused over this, so I think the comment would be great, Mr Hofhansl.
[jira] [Commented] (HBASE-4470) ServerNotRunningException coming out of assignRootAndMeta kills the Master
[ https://issues.apache.org/jira/browse/HBASE-4470?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13417157#comment-13417157 ] Zhihong Ted Yu commented on HBASE-4470: --- Indentation seems to be off in testVerifyMetaRegionLocationWithException(): {code} + Mockito.when(implementation.get((byte [])Mockito.any(), (Get)Mockito.any())). {code} ServerNotRunningException coming out of assignRootAndMeta kills the Master -- Key: HBASE-4470 URL: https://issues.apache.org/jira/browse/HBASE-4470 Project: HBase Issue Type: Bug Affects Versions: 0.90.4 Reporter: Jean-Daniel Cryans Assignee: Gregory Chanan Priority: Critical Fix For: 0.90.7 Attachments: HBASE-4470-90.patch I'm surprised we still have issues like that and I didn't get a hit while googling so forgive me if there's already a jira about it. When the master starts it verifies the locations of root and meta before assigning them, if the server is started but not running you'll get this: {quote} 2011-09-23 04:47:44,859 WARN org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation: RemoteException connecting to RS org.apache.hadoop.ipc.RemoteException: org.apache.hadoop.hbase.ipc.ServerNotRunningException: Server is not running yet at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1038) at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:771) at org.apache.hadoop.hbase.ipc.HBaseRPC$Invoker.invoke(HBaseRPC.java:257) at $Proxy6.getProtocolVersion(Unknown Source) at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:419) at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:393) at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:444) at org.apache.hadoop.hbase.ipc.HBaseRPC.waitForProxy(HBaseRPC.java:349) at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getHRegionConnection(HConnectionManager.java:969) at 
org.apache.hadoop.hbase.catalog.CatalogTracker.getCachedConnection(CatalogTracker.java:388) at org.apache.hadoop.hbase.catalog.CatalogTracker.getMetaServerConnection(CatalogTracker.java:287) at org.apache.hadoop.hbase.catalog.CatalogTracker.verifyMetaRegionLocation(CatalogTracker.java:484) at org.apache.hadoop.hbase.master.HMaster.assignRootAndMeta(HMaster.java:441) at org.apache.hadoop.hbase.master.HMaster.finishInitialization(HMaster.java:388) at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:282) {quote} I hit that 3-4 times this week while debugging something else. The worst is that when you restart the master it sees that as a failover, but none of the regions are assigned so it takes an eternity to get back fully online. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6381) AssignmentManager should use the same logic for clean startup and failover
[ https://issues.apache.org/jira/browse/HBASE-6381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13417160#comment-13417160 ] Jimmy Xiang commented on HBASE-6381: If RS that was online dies during rebuildUserRegions, SSH will take care of it. In the scenario I described, it is like this: master dies, then rs dies, then rs starts up, then master fails over/starts up. Now during rebuildUserRegions, the rs is online but gets no region assigned. I think we should check if a rs is online, and the regions are opened on the rs too. AssignmentManager should use the same logic for clean startup and failover -- Key: HBASE-6381 URL: https://issues.apache.org/jira/browse/HBASE-6381 Project: HBase Issue Type: Bug Components: master Reporter: Jimmy Xiang Assignee: Jimmy Xiang Currently AssignmentManager handles clean startup and failover very differently. Different logic is mingled together so it is hard to find out which is for which. We should clean it up and share the same logic so that AssignmentManager handles both cases the same way. This way, the code will much easier to understand and maintain. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6401) HBase may lose edits after a crash if used with HDFS 1.0.3 or older
[ https://issues.apache.org/jira/browse/HBASE-6401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13417161#comment-13417161 ]
stack commented on HBASE-6401:
bq. I haven't found the hdfs jira for this.
Does svn blame/git bisecting not turn up the issue that fixed this?
HBase may lose edits after a crash if used with HDFS 1.0.3 or older
Key: HBASE-6401
URL: https://issues.apache.org/jira/browse/HBASE-6401
Project: HBase
Issue Type: Bug
Components: regionserver
Affects Versions: 0.96.0
Environment: all
Reporter: nkeywal
Priority: Critical
Attachments: TestReadAppendWithDeadDN.java
This comes from a hdfs bug, fixed in some hdfs versions. I haven't found the hdfs jira for this.
Context: HBase Write Ahead Log features. This is using hdfs append. If the node crashes, the file that was written is read by other processes to replay the action.
- So we have in hdfs one (dead) process writing with another process reading.
- But, despite the call to syncFs, we don't always see the data when we have a dead node. It seems to be because the call in DFSClient#updateBlockInfo ignores the ipc errors and sets the length to 0.
- So we may miss all the writes to the last block if we try to connect to the dead DN.
hdfs 1.0.3, branch-1 or branch-1-win: we have the issue
http://svn.apache.org/viewvc/hadoop/common/branches/branch-1/src/hdfs/org/apache/hadoop/hdfs/DFSClient.java?revision=1359853&view=markup
hdfs branch-2 or trunk: we should not have the issue (but not tested)
http://svn.apache.org/viewvc/hadoop/common/branches/branch-2/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSInputStream.java?view=markup
The attached test will fail ~50% of the time.
[jira] [Commented] (HBASE-4470) ServerNotRunningException coming out of assignRootAndMeta kills the Master
[ https://issues.apache.org/jira/browse/HBASE-4470?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13417165#comment-13417165 ] stack commented on HBASE-4470: -- Yeah, G, you have tabs in your test code... can you purge 'em? ServerNotRunningException coming out of assignRootAndMeta kills the Master -- Key: HBASE-4470 URL: https://issues.apache.org/jira/browse/HBASE-4470 Project: HBase Issue Type: Bug Affects Versions: 0.90.4 Reporter: Jean-Daniel Cryans Assignee: Gregory Chanan Priority: Critical Fix For: 0.90.7 Attachments: HBASE-4470-90.patch I'm surprised we still have issues like that and I didn't get a hit while googling so forgive me if there's already a jira about it. When the master starts it verifies the locations of root and meta before assigning them, if the server is started but not running you'll get this: {quote} 2011-09-23 04:47:44,859 WARN org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation: RemoteException connecting to RS org.apache.hadoop.ipc.RemoteException: org.apache.hadoop.hbase.ipc.ServerNotRunningException: Server is not running yet at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1038) at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:771) at org.apache.hadoop.hbase.ipc.HBaseRPC$Invoker.invoke(HBaseRPC.java:257) at $Proxy6.getProtocolVersion(Unknown Source) at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:419) at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:393) at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:444) at org.apache.hadoop.hbase.ipc.HBaseRPC.waitForProxy(HBaseRPC.java:349) at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getHRegionConnection(HConnectionManager.java:969) at org.apache.hadoop.hbase.catalog.CatalogTracker.getCachedConnection(CatalogTracker.java:388) at 
org.apache.hadoop.hbase.catalog.CatalogTracker.getMetaServerConnection(CatalogTracker.java:287) at org.apache.hadoop.hbase.catalog.CatalogTracker.verifyMetaRegionLocation(CatalogTracker.java:484) at org.apache.hadoop.hbase.master.HMaster.assignRootAndMeta(HMaster.java:441) at org.apache.hadoop.hbase.master.HMaster.finishInitialization(HMaster.java:388) at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:282) {quote} I hit that 3-4 times this week while debugging something else. The worst is that when you restart the master it sees that as a failover, but none of the regions are assigned so it takes an eternity to get back fully online. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6381) AssignmentManager should use the same logic for clean startup and failover
[ https://issues.apache.org/jira/browse/HBASE-6381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13417168#comment-13417168 ] stack commented on HBASE-6381: -- bq. Now during rebuildUserRegions, the rs is online but gets no region assigned. Is this so? The restarted RS will have a different startcode/ServerName? bq. I think we should check if a rs is online, and the regions are opened on the rs too. Probably no harm. What if a 1k cluster w/ each server carrying hundreds of regions. What you think? It might take a little while doing this step. Would that be a prob? AssignmentManager should use the same logic for clean startup and failover -- Key: HBASE-6381 URL: https://issues.apache.org/jira/browse/HBASE-6381 Project: HBase Issue Type: Bug Components: master Reporter: Jimmy Xiang Assignee: Jimmy Xiang Currently AssignmentManager handles clean startup and failover very differently. Different logic is mingled together so it is hard to find out which is for which. We should clean it up and share the same logic so that AssignmentManager handles both cases the same way. This way, the code will much easier to understand and maintain. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6381) AssignmentManager should use the same logic for clean startup and failover
[ https://issues.apache.org/jira/browse/HBASE-6381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13417177#comment-13417177 ] Jimmy Xiang commented on HBASE-6381: Good catch. I see. That's right. Thanks! AssignmentManager should use the same logic for clean startup and failover -- Key: HBASE-6381 URL: https://issues.apache.org/jira/browse/HBASE-6381 Project: HBase Issue Type: Bug Components: master Reporter: Jimmy Xiang Assignee: Jimmy Xiang Currently AssignmentManager handles clean startup and failover very differently. Different logic is mingled together so it is hard to find out which is for which. We should clean it up and share the same logic so that AssignmentManager handles both cases the same way. This way, the code will much easier to understand and maintain. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6261) Better approximate high-percentile percentile latency metrics
[ https://issues.apache.org/jira/browse/HBASE-6261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13417178#comment-13417178 ] stack commented on HBASE-6261: -- bq. Stack indicated way back on the mailing list that he was okay waiting for a hadoop-common version bump, which is kind of a long timescale. Yeah. Code copied in tends to never go away (For example: see MurmurHash that started out in hbase and has been in hadoop now w/ a good few years). bq. If people really urgently want this, we could just copy the code over and then refactor it away when it's released in hadoop-common. Sounds like a nice to have. How much code would you have to copy in? What would it be? Thanks Andrew. Better approximate high-percentile percentile latency metrics - Key: HBASE-6261 URL: https://issues.apache.org/jira/browse/HBASE-6261 Project: HBase Issue Type: New Feature Reporter: Andrew Wang Assignee: Andrew Wang Labels: metrics Attachments: Latencyestimation.pdf The existing reservoir-sampling based latency metrics in HBase are not well-suited for providing accurate estimates of high-percentile (e.g. 90th, 95th, or 99th) latency. This is a well-studied problem in the literature (see [1] and [2]), the question is determining which methods best suit our needs and then implementing it. Ideally, we should be able to estimate these high percentiles with minimal memory and CPU usage as well as minimal error (e.g. 1% error on 90th, or .1% on 99th). It's also desirable to provide this over different time-based sliding windows, e.g. last 1 min, 5 mins, 15 mins, and 1 hour. I'll note that this would also be useful in HDFS, or really anywhere latency metrics are kept. [1] http://www.cs.rutgers.edu/~muthu/bquant.pdf [2] http://infolab.stanford.edu/~manku/papers/04pods-sliding.pdf -- This message is automatically generated by JIRA. 
[jira] [Commented] (HBASE-6412) Move external servers to metrics2 (thrift,thrift2,rest)
[ https://issues.apache.org/jira/browse/HBASE-6412?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13417181#comment-13417181 ] stack commented on HBASE-6412: -- AVRO is deprecated in 0.94 and will be removed in 0.96 (you did it over in HBASE-5948). Move external servers to metrics2 (thrift,thrift2,rest) --- Key: HBASE-6412 URL: https://issues.apache.org/jira/browse/HBASE-6412 Project: HBase Issue Type: Sub-task Reporter: Elliott Clark Implement metrics2 for all the external servers: * Thrift * Thrift2 * Rest * Avro ? (Not sure if we should do this as it's deprecated.) -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6404) Collect p50, p75 and p95 stats
[ https://issues.apache.org/jira/browse/HBASE-6404?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13417184#comment-13417184 ] stack commented on HBASE-6404: -- This is related to HBASE-6261? Collect p50, p75 and p95 stats -- Key: HBASE-6404 URL: https://issues.apache.org/jira/browse/HBASE-6404 Project: HBase Issue Type: Improvement Components: monitoring Reporter: Arjen Roodselaar Assignee: Liyin Tang Stats in current versions of HBase are currently exposed as avg, min and max. This gives a skewed view of performance as the outliers are usually the indicators of problems. Please revise the stats collection framework to use true buckets and expose the p50, p75 and p95 values of these buckets through JMX. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
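To make the p50/p75/p95 request above concrete, here is a minimal sketch of an exact nearest-rank percentile over a small in-memory sample. It is an illustration only: PercentileSketch and its method are hypothetical names, it is not the stats framework the issue asks to revise, and it sorts every sample rather than using the buckets or streaming estimators discussed in HBASE-6404 and HBASE-6261.

```java
import java.util.Arrays;

final class PercentileSketch {
    /**
     * Nearest-rank percentile: the smallest sample such that at least
     * `percent` percent of all samples are less than or equal to it.
     */
    static long percentile(long[] samples, int percent) {
        if (samples.length == 0) {
            throw new IllegalArgumentException("no samples");
        }
        if (percent < 1 || percent > 100) {
            throw new IllegalArgumentException("percent must be in 1..100");
        }
        long[] sorted = samples.clone();   // leave the caller's array untouched
        Arrays.sort(sorted);
        // Integer form of ceil(percent / 100 * n), avoiding floating-point rounding.
        int rank = (percent * sorted.length + 99) / 100;
        return sorted[rank - 1];
    }

    public static void main(String[] args) {
        // Made-up latencies in milliseconds: mostly fast, with one slow outlier.
        long[] latenciesMs = {10, 12, 11, 9, 10, 13, 10, 11, 12, 950};
        long sum = 0;
        for (long v : latenciesMs) {
            sum += v;
        }
        System.out.println("avg=" + (sum / (double) latenciesMs.length));
        System.out.println("p50=" + percentile(latenciesMs, 50));
        System.out.println("p75=" + percentile(latenciesMs, 75));
        System.out.println("p95=" + percentile(latenciesMs, 95));
    }
}
```

The made-up sample in main shows the effect the reporter describes: a single outlier pulls the average far from the typical value, while the percentiles report the typical case and the tail separately.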
[jira] [Commented] (HBASE-4470) ServerNotRunningException coming out of assignRootAndMeta kills the Master
[ https://issues.apache.org/jira/browse/HBASE-4470?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13417186#comment-13417186 ] stack commented on HBASE-4470: -- Or nvm the tabs, can fix on commit. Jon Hsieh, you want this in 0.90 (Or G do you know if he wants it in 0.90?) ServerNotRunningException coming out of assignRootAndMeta kills the Master -- Key: HBASE-4470 URL: https://issues.apache.org/jira/browse/HBASE-4470 Project: HBase Issue Type: Bug Affects Versions: 0.90.4 Reporter: Jean-Daniel Cryans Assignee: Gregory Chanan Priority: Critical Fix For: 0.90.7 Attachments: HBASE-4470-90.patch I'm surprised we still have issues like that and I didn't get a hit while googling so forgive me if there's already a jira about it. When the master starts it verifies the locations of root and meta before assigning them, if the server is started but not running you'll get this: {quote} 2011-09-23 04:47:44,859 WARN org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation: RemoteException connecting to RS org.apache.hadoop.ipc.RemoteException: org.apache.hadoop.hbase.ipc.ServerNotRunningException: Server is not running yet at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1038) at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:771) at org.apache.hadoop.hbase.ipc.HBaseRPC$Invoker.invoke(HBaseRPC.java:257) at $Proxy6.getProtocolVersion(Unknown Source) at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:419) at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:393) at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:444) at org.apache.hadoop.hbase.ipc.HBaseRPC.waitForProxy(HBaseRPC.java:349) at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getHRegionConnection(HConnectionManager.java:969) at org.apache.hadoop.hbase.catalog.CatalogTracker.getCachedConnection(CatalogTracker.java:388) at 
org.apache.hadoop.hbase.catalog.CatalogTracker.getMetaServerConnection(CatalogTracker.java:287) at org.apache.hadoop.hbase.catalog.CatalogTracker.verifyMetaRegionLocation(CatalogTracker.java:484) at org.apache.hadoop.hbase.master.HMaster.assignRootAndMeta(HMaster.java:441) at org.apache.hadoop.hbase.master.HMaster.finishInitialization(HMaster.java:388) at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:282) {quote} I hit that 3-4 times this week while debugging something else. The worst is that when you restart the master it sees that as a failover, but none of the regions are assigned so it takes an eternity to get back fully online. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6400) Add getMasterAdmin() and getMasterMonitor() to HConnection
[ https://issues.apache.org/jira/browse/HBASE-6400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13417188#comment-13417188 ] stack commented on HBASE-6400: -- +1 on v2 Add getMasterAdmin() and getMasterMonitor() to HConnection -- Key: HBASE-6400 URL: https://issues.apache.org/jira/browse/HBASE-6400 Project: HBase Issue Type: Improvement Reporter: Enis Soztutar Assignee: Enis Soztutar Fix For: 0.96.0 Attachments: 6400-v2.patch, HBASE-6400_v1.patch HConnection used to have getMaster() which returns HMasterInterface, but after HBASE-6039 it has been removed. I think we need to expose HConnection.getMasterAdmin() and getMasterMonitor() a la HConnection.getAdmin(), and getClient(). HConnectionImplementation has getKeepAliveMasterAdmin() but, I see no reason to leak keep alive classes to upper layers. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6419) PersistentMetricsTimeVaryingRate gets used for non-time-based metrics (part2 of HBASE-6220)
[ https://issues.apache.org/jira/browse/HBASE-6419?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13417191#comment-13417191 ] Hadoop QA commented on HBASE-6419: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12536995/ServerMetrics_HBASE_6220_Flush_Metrics.patch against trunk revision . +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. +1 hadoop2.0. The patch compiles against the hadoop 2.0 profile. +1 javadoc. The javadoc tool did not generate any warning messages. -1 javac. The applied patch generated 5 javac compiler warnings (more than the trunk's current 4 warnings). -1 findbugs. The patch appears to introduce 10 new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. 
The patch failed these unit tests: org.apache.hadoop.hbase.regionserver.TestAtomicOperation org.apache.hadoop.hbase.regionserver.TestSplitLogWorker Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/2398//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2398//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2398//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2398//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2398//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2398//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/2398//console This message is automatically generated. PersistentMetricsTimeVaryingRate gets used for non-time-based metrics (part2 of HBASE-6220) --- Key: HBASE-6419 URL: https://issues.apache.org/jira/browse/HBASE-6419 Project: HBase Issue Type: Improvement Reporter: stack Assignee: Paul Cavallaro Attachments: ServerMetrics_HBASE_6220_Flush_Metrics.patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-6398) Print a warning if there is no local datanode
[ https://issues.apache.org/jira/browse/HBASE-6398?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-6398: - Tags: noob Labels: noob (was: ) Print a warning if there is no local datanode - Key: HBASE-6398 URL: https://issues.apache.org/jira/browse/HBASE-6398 Project: HBase Issue Type: Improvement Reporter: Elliott Clark Labels: noob When starting up a RS HBase should print out a warning if there is no datanode locally. Lots of optimizations are only available if the data is machine local. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-6336) Split point should not be equal to start row or end row
[ https://issues.apache.org/jira/browse/HBASE-6336?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
stack updated HBASE-6336:
Resolution: Fixed
Status: Resolved (was: Patch Available)
Resolving. Was committed to trunk. Could be backported. Let's open a new issue to do that if wanted.
Split point should not be equal to start row or end row
Key: HBASE-6336
URL: https://issues.apache.org/jira/browse/HBASE-6336
Project: HBase
Issue Type: Bug
Components: regionserver
Reporter: chunhui shen
Assignee: chunhui shen
Fix For: 0.96.0
Attachments: HBASE-6336.patch
Should we allow a split point equal to the region's start row or end row?
{code}
// if the midkey is the same as the first and last keys, then we cannot
// (ever) split this region.
if (this.comparator.compareRows(mk, firstKey) == 0 &&
    this.comparator.compareRows(mk, lastKey) == 0) {
  if (LOG.isDebugEnabled()) {
    LOG.debug("cannot split because midkey is the same as first or " +
      "last row");
  }
{code}
Here, I think it is a mistake.
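The concern raised in HBASE-6336 can be illustrated with a small sketch. It assumes the operator joining the two comparisons in the quoted condition is a logical AND, and it contrasts that with an OR, which is what the issue title and the log message ("first or last row") suggest. The class and method names are hypothetical, Arrays.equals stands in for compareRows(...) == 0, and the contents of HBASE-6336.patch are not shown in this thread, so this is not a claim about the committed fix.

```java
import java.util.Arrays;

final class SplitPointSketch {
    /** Refuses to split only when the midkey equals BOTH the first and the last row. */
    static boolean refusesWithAnd(byte[] mid, byte[] first, byte[] last) {
        return Arrays.equals(mid, first) && Arrays.equals(mid, last);
    }

    /** Refuses to split when the midkey equals EITHER the first or the last row. */
    static boolean refusesWithOr(byte[] mid, byte[] first, byte[] last) {
        return Arrays.equals(mid, first) || Arrays.equals(mid, last);
    }

    public static void main(String[] args) {
        byte[] first = {'a'};
        byte[] mid = {'a'};   // midkey coincides with the start row only
        byte[] last = {'z'};
        // With AND the check lets this split point through; with OR it is rejected.
        System.out.println("AND refuses: " + refusesWithAnd(mid, first, last));
        System.out.println("OR refuses: " + refusesWithOr(mid, first, last));
    }
}
```

A split point equal to the start row would leave one daughter region with an empty key range, which is the case the issue title says should be ruled out.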
[jira] [Commented] (HBASE-4470) ServerNotRunningException coming out of assignRootAndMeta kills the Master
[ https://issues.apache.org/jira/browse/HBASE-4470?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13417200#comment-13417200 ] Jonathan Hsieh commented on HBASE-4470: --- I do. We ran into this and some of its friends last week and would like to get it taken care of. ServerNotRunningException coming out of assignRootAndMeta kills the Master -- Key: HBASE-4470 URL: https://issues.apache.org/jira/browse/HBASE-4470 Project: HBase Issue Type: Bug Affects Versions: 0.90.4 Reporter: Jean-Daniel Cryans Assignee: Gregory Chanan Priority: Critical Fix For: 0.90.7 Attachments: HBASE-4470-90.patch I'm surprised we still have issues like that and I didn't get a hit while googling so forgive me if there's already a jira about it. When the master starts it verifies the locations of root and meta before assigning them, if the server is started but not running you'll get this: {quote} 2011-09-23 04:47:44,859 WARN org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation: RemoteException connecting to RS org.apache.hadoop.ipc.RemoteException: org.apache.hadoop.hbase.ipc.ServerNotRunningException: Server is not running yet at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1038) at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:771) at org.apache.hadoop.hbase.ipc.HBaseRPC$Invoker.invoke(HBaseRPC.java:257) at $Proxy6.getProtocolVersion(Unknown Source) at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:419) at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:393) at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:444) at org.apache.hadoop.hbase.ipc.HBaseRPC.waitForProxy(HBaseRPC.java:349) at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getHRegionConnection(HConnectionManager.java:969) at org.apache.hadoop.hbase.catalog.CatalogTracker.getCachedConnection(CatalogTracker.java:388) at 
org.apache.hadoop.hbase.catalog.CatalogTracker.getMetaServerConnection(CatalogTracker.java:287) at org.apache.hadoop.hbase.catalog.CatalogTracker.verifyMetaRegionLocation(CatalogTracker.java:484) at org.apache.hadoop.hbase.master.HMaster.assignRootAndMeta(HMaster.java:441) at org.apache.hadoop.hbase.master.HMaster.finishInitialization(HMaster.java:388) at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:282) {quote} I hit that 3-4 times this week while debugging something else. The worst is that when you restart the master it sees that as a failover, but none of the regions are assigned so it takes an eternity to get back fully online. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6396) Fix NoSuchMethodError running against hadoop 2.0
[ https://issues.apache.org/jira/browse/HBASE-6396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13417223#comment-13417223 ]
Jonathan Hsieh commented on HBASE-6396:
I ran into this on HBase 0.94.1rc0. We didn't run into it on the 0.94.0 version because I tested against the latest hadoop 0.23 instead of hadoop 2.0.
Fix NoSuchMethodError running against hadoop 2.0
Key: HBASE-6396
URL: https://issues.apache.org/jira/browse/HBASE-6396
Project: HBase
Issue Type: Bug
Affects Versions: 0.94.1
Reporter: Zhihong Ted Yu
Assignee: Zhihong Ted Yu
Fix For: 0.96.0
Attachments: 6396-v2.txt
HADOOP-8350 changed the signature of NetUtils.getInputStream(). This leads to NoSuchMethodError in HBaseClient$Connection.setupIOstreams().
See https://issues.apache.org/jira/browse/HADOOP-8350?focusedCommentId=13414276&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13414276
[jira] [Updated] (HBASE-6396) Fix NoSuchMethodError running against hadoop 2.0
[ https://issues.apache.org/jira/browse/HBASE-6396?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jonathan Hsieh updated HBASE-6396:
Affects Version/s: 0.94.1
Fix NoSuchMethodError running against hadoop 2.0
Key: HBASE-6396
URL: https://issues.apache.org/jira/browse/HBASE-6396
Project: HBase
Issue Type: Bug
Affects Versions: 0.94.1
Reporter: Zhihong Ted Yu
Assignee: Zhihong Ted Yu
Fix For: 0.96.0
Attachments: 6396-v2.txt
HADOOP-8350 changed the signature of NetUtils.getInputStream(). This leads to NoSuchMethodError in HBaseClient$Connection.setupIOstreams().
See https://issues.apache.org/jira/browse/HADOOP-8350?focusedCommentId=13414276&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13414276
[jira] [Commented] (HBASE-6400) Add getMasterAdmin() and getMasterMonitor() to HConnection
[ https://issues.apache.org/jira/browse/HBASE-6400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13417230#comment-13417230 ] Zhihong Ted Yu commented on HBASE-6400: --- Integrated to trunk. Thanks for the patch, Enis. Thanks for the review, Stack. Add getMasterAdmin() and getMasterMonitor() to HConnection -- Key: HBASE-6400 URL: https://issues.apache.org/jira/browse/HBASE-6400 Project: HBase Issue Type: Improvement Reporter: Enis Soztutar Assignee: Enis Soztutar Fix For: 0.96.0 Attachments: 6400-v2.patch, HBASE-6400_v1.patch HConnection used to have getMaster() which returns HMasterInterface, but after HBASE-6039 it has been removed. I think we need to expose HConnection.getMasterAdmin() and getMasterMonitor() a la HConnection.getAdmin(), and getClient(). HConnectionImplementation has getKeepAliveMasterAdmin() but, I see no reason to leak keep alive classes to upper layers. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6412) Move external servers to metrics2 (thrift,thrift2,rest)
[ https://issues.apache.org/jira/browse/HBASE-6412?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13417240#comment-13417240 ] Elliott Clark commented on HBASE-6412: -- Thanks stack. I totally forgot the timing on that. Move external servers to metrics2 (thrift,thrift2,rest) --- Key: HBASE-6412 URL: https://issues.apache.org/jira/browse/HBASE-6412 Project: HBase Issue Type: Sub-task Reporter: Elliott Clark Implement metrics2 for all the external servers: * Thrift * Thrift2 * Rest * Avro ? (Not sure if we should do this as it's deprecated.) -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-6412) Move external servers to metrics2 (thrift,thrift2,rest)
[ https://issues.apache.org/jira/browse/HBASE-6412?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Elliott Clark updated HBASE-6412: - Description: Implement metrics2 for all the external servers: * Thrift * Thrift2 * Rest was: Implement metrics2 for all the external servers: * Thrift * Thrift2 * Rest * Avro ? (Not sure if we should do this as it's deprecated.) Move external servers to metrics2 (thrift,thrift2,rest) --- Key: HBASE-6412 URL: https://issues.apache.org/jira/browse/HBASE-6412 Project: HBase Issue Type: Sub-task Reporter: Elliott Clark Implement metrics2 for all the external servers: * Thrift * Thrift2 * Rest -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6396) Fix NoSuchMethodError running against hadoop 2.0
[ https://issues.apache.org/jira/browse/HBASE-6396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13417247#comment-13417247 ] Jonathan Hsieh commented on HBASE-6396: --- This problem existed against HBase 0.94.0 on top of hadoop-2.0.0-cdh4.0.1. Fix NoSuchMethodError running against hadoop 2.0 Key: HBASE-6396 URL: https://issues.apache.org/jira/browse/HBASE-6396 Project: HBase Issue Type: Bug Affects Versions: 0.94.1 Reporter: Zhihong Ted Yu Assignee: Zhihong Ted Yu Fix For: 0.96.0 Attachments: 6396-v2.txt HADOOP-8350 changed the signature of NetUtils.getInputStream(). This leads to NoSuchMethodError in HBaseClient$Connection.setupIOstreams(). See https://issues.apache.org/jira/browse/HADOOP-8350?focusedCommentId=13414276&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13414276
[jira] [Commented] (HBASE-6419) PersistentMetricsTimeVaryingRate gets used for non-time-based metrics (part2 of HBASE-6220)
[ https://issues.apache.org/jira/browse/HBASE-6419?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13417250#comment-13417250 ] Zhihong Ted Yu commented on HBASE-6419: --- I ran the two tests above and they passed with patch. Integrated to trunk. Thanks for the patch, Paul. Thanks for the review, Stack. PersistentMetricsTimeVaryingRate gets used for non-time-based metrics (part2 of HBASE-6220) --- Key: HBASE-6419 URL: https://issues.apache.org/jira/browse/HBASE-6419 Project: HBase Issue Type: Improvement Reporter: stack Assignee: Paul Cavallaro Attachments: ServerMetrics_HBASE_6220_Flush_Metrics.patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6419) PersistentMetricsTimeVaryingRate gets used for non-time-based metrics (part2 of HBASE-6220)
[ https://issues.apache.org/jira/browse/HBASE-6419?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13417255#comment-13417255 ] David S. Wang commented on HBASE-6419: -- +1 on the review. PersistentMetricsTimeVaryingRate gets used for non-time-based metrics (part2 of HBASE-6220) --- Key: HBASE-6419 URL: https://issues.apache.org/jira/browse/HBASE-6419 Project: HBase Issue Type: Improvement Reporter: stack Assignee: Paul Cavallaro Attachments: ServerMetrics_HBASE_6220_Flush_Metrics.patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-6396) Fix NoSuchMethodError running against hadoop 2.0
[ https://issues.apache.org/jira/browse/HBASE-6396?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Hsieh updated HBASE-6396: -- Affects Version/s: (was: 0.94.1) 0.94.0 Fix NoSuchMethodError running against hadoop 2.0 Key: HBASE-6396 URL: https://issues.apache.org/jira/browse/HBASE-6396 Project: HBase Issue Type: Bug Affects Versions: 0.94.0 Reporter: Zhihong Ted Yu Assignee: Zhihong Ted Yu Labels: hadoop-2.0 Fix For: 0.96.0 Attachments: 6396-v2.txt HADOOP-8350 changed the signature of NetUtils.getInputStream() This leads to NoSuchMethodError in HBaseClient$Connection.setupIOstreams(). See https://issues.apache.org/jira/browse/HADOOP-8350?focusedCommentId=13414276page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13414276 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-6405) Create Hadoop compatibilty modules and Metrics2 implementation of replication metrics
[ https://issues.apache.org/jira/browse/HBASE-6405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesse Yates updated HBASE-6405: --- Attachment: hbase-6405-addendum-2.patch Eclipse is having trouble understanding the maven-dependency-plugin usage in hbase-hadoop2-compat. It seems to not be able to find the plugin goal build-classpath with the missing version. Also, the m2e lifecycle mapping needs to be told how to handle the lifecycle. Attaching patch to fix the above as well as fix formatting on the added modules. Create Hadoop compatibilty modules and Metrics2 implementation of replication metrics - Key: HBASE-6405 URL: https://issues.apache.org/jira/browse/HBASE-6405 Project: HBase Issue Type: Sub-task Reporter: Zhihong Ted Yu Assignee: Elliott Clark Fix For: 0.96.0 Attachments: 6405.txt, HBASE-6405-ADD.patch, hbase-6405-addendum-2.patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6396) Fix NoSuchMethodError running against hadoop 2.0
[ https://issues.apache.org/jira/browse/HBASE-6396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13417276#comment-13417276 ] Andrew Purtell commented on HBASE-6396: --- +1 to commit to trunk and 0.94 branch. Thanks Ted. Fix NoSuchMethodError running against hadoop 2.0 Key: HBASE-6396 URL: https://issues.apache.org/jira/browse/HBASE-6396 Project: HBase Issue Type: Bug Affects Versions: 0.94.0 Reporter: Zhihong Ted Yu Assignee: Zhihong Ted Yu Labels: hadoop-2.0 Fix For: 0.96.0 Attachments: 6396-v2.txt HADOOP-8350 changed the signature of NetUtils.getInputStream() This leads to NoSuchMethodError in HBaseClient$Connection.setupIOstreams(). See https://issues.apache.org/jira/browse/HADOOP-8350?focusedCommentId=13414276page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13414276 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HBASE-6420) Gracefully shutdown logsyncer
Jimmy Xiang created HBASE-6420: -- Summary: Gracefully shutdown logsyncer Key: HBASE-6420 URL: https://issues.apache.org/jira/browse/HBASE-6420 Project: HBase Issue Type: Bug Components: wal Reporter: Jimmy Xiang Currently, when an HLog is closed, logSyncerThread is interrupted. logSyncer could be in the middle of syncing the writer. We should avoid interrupting the sync.
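One way to picture the shutdown described in HBASE-6420 is a syncer loop that is told to stop through a flag and a notify, and is then joined, rather than being interrupted. The sketch below is only an illustration under that assumption: GracefulSyncer, its fields, and its sync() placeholder are hypothetical names and are not the HLog or LogSyncer code.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for a periodic log syncer; NOT the actual HLog code.
class GracefulSyncer implements Runnable {
    private final Object lock = new Object();
    private boolean closing = false;               // guarded by lock
    private final long intervalMs;
    final AtomicInteger completedSyncs = new AtomicInteger();

    GracefulSyncer(long intervalMs) {
        this.intervalMs = intervalMs;
    }

    @Override
    public void run() {
        while (true) {
            synchronized (lock) {
                if (closing) {
                    return;
                }
                try {
                    // Sleep between syncs; close() wakes us early via notifyAll().
                    lock.wait(intervalMs);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                if (closing) {
                    return;
                }
            }
            // The sync runs outside the lock and is never interrupted by close().
            sync();
        }
    }

    // Placeholder for the real work of syncing the writer.
    void sync() {
        completedSyncs.incrementAndGet();
    }

    // Ask the loop to stop, then wait for any in-flight sync to finish.
    void close(Thread syncerThread) {
        synchronized (lock) {
            closing = true;
            lock.notifyAll();
        }
        boolean interrupted = false;
        while (true) {
            try {
                syncerThread.join();
                break;
            } catch (InterruptedException e) {
                interrupted = true;   // remember the interrupt, keep waiting
            }
        }
        if (interrupted) {
            Thread.currentThread().interrupt();
        }
    }
}
```

The point of the design is that close() only signals and waits, so a sync that has already started always runs to completion before the thread exits.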
[jira] [Commented] (HBASE-6405) Create Hadoop compatibilty modules and Metrics2 implementation of replication metrics
[ https://issues.apache.org/jira/browse/HBASE-6405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13417279#comment-13417279 ] Zhihong Ted Yu commented on HBASE-6405: --- Addendum 2 looks good. Create Hadoop compatibilty modules and Metrics2 implementation of replication metrics - Key: HBASE-6405 URL: https://issues.apache.org/jira/browse/HBASE-6405 Project: HBase Issue Type: Sub-task Reporter: Zhihong Ted Yu Assignee: Elliott Clark Fix For: 0.96.0 Attachments: 6405.txt, HBASE-6405-ADD.patch, hbase-6405-addendum-2.patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6404) Collect p50, p75 and p95 stats
[ https://issues.apache.org/jira/browse/HBASE-6404?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13417283#comment-13417283 ] Arjen Roodselaar commented on HBASE-6404: - Yes, this would actually be the same as HBASE-6261. I should have given that one to our Eng team. Collect p50, p75 and p95 stats -- Key: HBASE-6404 URL: https://issues.apache.org/jira/browse/HBASE-6404 Project: HBase Issue Type: Improvement Components: monitoring Reporter: Arjen Roodselaar Assignee: Liyin Tang Stats in current versions of HBase are currently exposed as avg, min and max. This gives a skewed view of performance as the outliers are usually the indicators of problems. Please revise the stats collection framework to use true buckets and expose the p50, p75 and p95 values of these buckets through JMX. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
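To illustrate why avg, min and max can hide what the p50, p75 and p95 values show, here is a minimal sketch that computes nearest-rank percentiles over a small in-memory sample. It is an assumption-laden illustration only: PercentileSketch is a hypothetical class, and this is neither HBase's histogram implementation nor the bucketed approach the request asks for.

```java
import java.util.Arrays;

// Hypothetical helper for illustration; not part of HBase's metrics code.
class PercentileSketch {
    // Nearest-rank percentile: the smallest sample such that at least
    // p percent of all samples are less than or equal to it.
    static long percentile(long[] samples, double p) {
        if (samples.length == 0) {
            throw new IllegalArgumentException("no samples");
        }
        if (p <= 0 || p > 100) {
            throw new IllegalArgumentException("p must be in (0, 100]");
        }
        long[] sorted = samples.clone();
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(p / 100.0 * sorted.length);
        return sorted[rank - 1];
    }

    // Arithmetic mean, shown only for comparison with the percentiles.
    static double average(long[] samples) {
        long sum = 0;
        for (long s : samples) {
            sum += s;
        }
        return (double) sum / samples.length;
    }
}
```

With the latencies 1 through 9 plus a single outlier of 1000, the average is 104.5, while p50 is 5, p75 is 8 and p95 is 1000. The percentiles separate the typical case from the outlier, which the average alone does not.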
[jira] [Commented] (HBASE-6405) Create Hadoop compatibilty modules and Metrics2 implementation of replication metrics
[ https://issues.apache.org/jira/browse/HBASE-6405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13417292#comment-13417292 ] Elliott Clark commented on HBASE-6405: -- Jesse: Your added patch fails to compile when compiling for hadoop2. It's pretty important that hbase-hadoop1-compat has an explicit dependency on hadoop1's version of things, and hbase-hadoop2-compat has an explicit dependency on hadoop2 versions. If I add <version>${hadoop-one.version}</version> back in where it was removed, things seem to build for me. Is there some reason that only hbase-hadoop2 has the maven m2e stuff added? Create Hadoop compatibilty modules and Metrics2 implementation of replication metrics - Key: HBASE-6405 URL: https://issues.apache.org/jira/browse/HBASE-6405 Project: HBase Issue Type: Sub-task Reporter: Zhihong Ted Yu Assignee: Elliott Clark Fix For: 0.96.0 Attachments: 6405.txt, HBASE-6405-ADD.patch, hbase-6405-addendum-2.patch
[jira] [Commented] (HBASE-6396) Fix NoSuchMethodError running against hadoop 2.0
[ https://issues.apache.org/jira/browse/HBASE-6396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13417293#comment-13417293 ] Lars Hofhansl commented on HBASE-6396: -- Strange that I do not see this problem as long as I build 0.94 against Hadoop-2. Fix NoSuchMethodError running against hadoop 2.0 Key: HBASE-6396 URL: https://issues.apache.org/jira/browse/HBASE-6396 Project: HBase Issue Type: Bug Affects Versions: 0.94.0 Reporter: Zhihong Ted Yu Assignee: Zhihong Ted Yu Labels: hadoop-2.0 Fix For: 0.96.0 Attachments: 6396-v2.txt HADOOP-8350 changed the signature of NetUtils.getInputStream() This leads to NoSuchMethodError in HBaseClient$Connection.setupIOstreams(). See https://issues.apache.org/jira/browse/HADOOP-8350?focusedCommentId=13414276page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13414276 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Resolved] (HBASE-5443) Add PB-based calls to HRegionInterface
[ https://issues.apache.org/jira/browse/HBASE-5443?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jimmy Xiang resolved HBASE-5443. Resolution: Fixed Add PB-based calls to HRegionInterface -- Key: HBASE-5443 URL: https://issues.apache.org/jira/browse/HBASE-5443 Project: HBase Issue Type: Task Components: ipc, master, migration, regionserver Reporter: Todd Lipcon Assignee: Jimmy Xiang Fix For: 0.96.0 Attachments: region_java-proto-mapping.pdf -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6404) Collect p50, p75 and p95 stats
[ https://issues.apache.org/jira/browse/HBASE-6404?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13417298#comment-13417298 ] Elliott Clark commented on HBASE-6404: -- Only kind of. HBASE-6261 is talking about replacing our histogram implementation to make it more accurate. HBase already has a histogram implementation that works. HBASE-6261 is just about accuracy for the 99.95% and 99.995% percentile numbers on extremely skewed distributions. This seems to be about using a histogram in more places, where we currently use PersistentMetricsTimeVaryingRate. Collect p50, p75 and p95 stats -- Key: HBASE-6404 URL: https://issues.apache.org/jira/browse/HBASE-6404 Project: HBase Issue Type: Improvement Components: monitoring Reporter: Arjen Roodselaar Assignee: Liyin Tang Stats in current versions of HBase are currently exposed as avg, min and max. This gives a skewed view of performance as the outliers are usually the indicators of problems. Please revise the stats collection framework to use true buckets and expose the p50, p75 and p95 values of these buckets through JMX.
[jira] [Commented] (HBASE-6261) Better approximate high-percentile percentile latency metrics
[ https://issues.apache.org/jira/browse/HBASE-6261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13417302#comment-13417302 ] Andrew Wang commented on HBASE-6261: It'd be these files from hadoop-common: * src/main/java/org/apache/hadoop/metrics2/lib/MutableQuantiles.java * src/main/java/org/apache/hadoop/metrics2/lib/Quantiles.java * src/main/java/org/apache/hadoop/metrics2/util/SampleQuantiles.java {{wc -l}} reports it's 534 lines across those three files, heavily commented of course. {{MutableQuantiles}} is a hadoop2 metrics2 interface for SampleQuantiles, and might need to be modified for use in HBase. I haven't looked at what Elliott's done for HBASE-4050 yet. Better approximate high-percentile percentile latency metrics - Key: HBASE-6261 URL: https://issues.apache.org/jira/browse/HBASE-6261 Project: HBase Issue Type: New Feature Reporter: Andrew Wang Assignee: Andrew Wang Labels: metrics Attachments: Latencyestimation.pdf The existing reservoir-sampling based latency metrics in HBase are not well-suited for providing accurate estimates of high-percentile (e.g. 90th, 95th, or 99th) latency. This is a well-studied problem in the literature (see [1] and [2]); the question is determining which methods best suit our needs and then implementing them. Ideally, we should be able to estimate these high percentiles with minimal memory and CPU usage as well as minimal error (e.g. 1% error on 90th, or .1% on 99th). It's also desirable to provide this over different time-based sliding windows, e.g. last 1 min, 5 mins, 15 mins, and 1 hour. I'll note that this would also be useful in HDFS, or really anywhere latency metrics are kept. [1] http://www.cs.rutgers.edu/~muthu/bquant.pdf [2] http://infolab.stanford.edu/~manku/papers/04pods-sliding.pdf
[jira] [Commented] (HBASE-6406) TestReplicationPeer.testResetZooKeeperSession and TestZooKeeper.testClientSessionExpired fail frequently
[ https://issues.apache.org/jira/browse/HBASE-6406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13417310#comment-13417310 ] Lars Hofhansl commented on HBASE-6406: -- Hmm... Looks like TestReplication is hanging in setup waiting for the root region to be assigned. TestZooKeeper also appears to be waiting for a RegionServer to start. These would seem to be more general issues with starting the MiniCluster TestReplicationPeer.testResetZooKeeperSession and TestZooKeeper.testClientSessionExpired fail frequently Key: HBASE-6406 URL: https://issues.apache.org/jira/browse/HBASE-6406 Project: HBase Issue Type: Bug Affects Versions: 0.94.1 Reporter: Lars Hofhansl Assignee: Lars Hofhansl Fix For: 0.94.2 Attachments: testReplication.jstack, testZooKeeper.jstack Looking back through the 0.94 test runs these two tests accounted for 11 of 34 failed tests. They should be fixed or (temporarily) disabled. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6396) Fix NoSuchMethodError running against hadoop 2.0
[ https://issues.apache.org/jira/browse/HBASE-6396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13417311#comment-13417311 ] Andrew Purtell commented on HBASE-6396: --- That is strange. Is this a case of building with one profile active but running/testing with another Jon? Fix NoSuchMethodError running against hadoop 2.0 Key: HBASE-6396 URL: https://issues.apache.org/jira/browse/HBASE-6396 Project: HBase Issue Type: Bug Affects Versions: 0.94.0 Reporter: Zhihong Ted Yu Assignee: Zhihong Ted Yu Labels: hadoop-2.0 Fix For: 0.96.0 Attachments: 6396-v2.txt HADOOP-8350 changed the signature of NetUtils.getInputStream() This leads to NoSuchMethodError in HBaseClient$Connection.setupIOstreams(). See https://issues.apache.org/jira/browse/HADOOP-8350?focusedCommentId=13414276page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13414276 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HBASE-6421) [pom] add jettison and fix netty specification
Jesse Yates created HBASE-6421: -- Summary: [pom] add jettison and fix netty specification Key: HBASE-6421 URL: https://issues.apache.org/jira/browse/HBASE-6421 Project: HBase Issue Type: Bug Reporter: Jesse Yates Assignee: Jesse Yates Attachments: hbase-6421-v0.patch Currently, jettison isn't required for testing hbase-server, but TestSchemaConfigured requires it, causing the compile phase (at least on my MBP) to fail. Further, in cleaning up the poms, netty should be declared in the parent hbase/pom.xml and then inherited in the subclass. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-6421) [pom] add jettison and fix netty specification
[ https://issues.apache.org/jira/browse/HBASE-6421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesse Yates updated HBASE-6421: --- Attachment: hbase-6421-v0.patch Attaching patch with fix. Compiles and starts testing fine locally. [pom] add jettison and fix netty specification -- Key: HBASE-6421 URL: https://issues.apache.org/jira/browse/HBASE-6421 Project: HBase Issue Type: Bug Reporter: Jesse Yates Assignee: Jesse Yates Attachments: hbase-6421-v0.patch Currently, jettison isn't required for testing hbase-server, but TestSchemaConfigured requires it, causing the compile phase (at least on my MBP) to fail. Further, in cleaning up the poms, netty should be declared in the parent hbase/pom.xml and then inherited in the subclass. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6421) [pom] add jettison and fix netty specification
[ https://issues.apache.org/jira/browse/HBASE-6421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13417324#comment-13417324 ] Zhihong Ted Yu commented on HBASE-6421: --- Patch didn't compile against hadoop 2.0: {code} == == Checking against hadoop 2.0 build == == {code} [pom] add jettison and fix netty specification -- Key: HBASE-6421 URL: https://issues.apache.org/jira/browse/HBASE-6421 Project: HBase Issue Type: Bug Reporter: Jesse Yates Assignee: Jesse Yates Attachments: hbase-6421-v0.patch Currently, jettison isn't required for testing hbase-server, but TestSchemaConfigured requires it, causing the compile phase (at least on my MBP) to fail. Further, in cleaning up the poms, netty should be declared in the parent hbase/pom.xml and then inherited in the subclass. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6396) Fix NoSuchMethodError running against hadoop 2.0
[ https://issues.apache.org/jira/browse/HBASE-6396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13417339#comment-13417339 ] Jonathan Hsieh commented on HBASE-6396: --- Andrew, might be - I've been building 0.94 with the 23 profile but running on top of a hadoop 2 installation. With Ted's patch, it seems the shell works but the master web interface still gets the exception. I'm digging further. Fix NoSuchMethodError running against hadoop 2.0 Key: HBASE-6396 URL: https://issues.apache.org/jira/browse/HBASE-6396 Project: HBase Issue Type: Bug Affects Versions: 0.94.0 Reporter: Zhihong Ted Yu Assignee: Zhihong Ted Yu Labels: hadoop-2.0 Fix For: 0.96.0 Attachments: 6396-v2.txt HADOOP-8350 changed the signature of NetUtils.getInputStream(). This leads to NoSuchMethodError in HBaseClient$Connection.setupIOstreams(). See https://issues.apache.org/jira/browse/HADOOP-8350?focusedCommentId=13414276&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13414276
[jira] [Updated] (HBASE-4255) Expose CatalogJanitor controls
[ https://issues.apache.org/jira/browse/HBASE-4255?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Devaraj Das updated HBASE-4255: --- Attachment: 4255-5.1.patch This is the patch from RB. Expose CatalogJanitor controls -- Key: HBASE-4255 URL: https://issues.apache.org/jira/browse/HBASE-4255 Project: HBase Issue Type: Improvement Affects Versions: 0.90.4 Reporter: Jean-Daniel Cryans Assignee: Devaraj Das Fix For: 0.96.0 Attachments: 4255-4.2.patch, 4255-5.1.patch When doing surgery or other operational tasks, it's nice to be able to have the .META. table quickly cleaned of split parents. The CatalogJanitor already has controls baked in (currently used in unit tests), I think we should expose this the same way we do with the balancer, that is: - start - stop - request a run A client would need to go through HBaseAdmin, and shell commands need to be created. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-5547) Don't delete HFiles when in backup mode
[ https://issues.apache.org/jira/browse/HBASE-5547?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesse Yates updated HBASE-5547: --- Release Note: All HFiles are now automatically archived to the configured hbase.table.archive.directory. HFiles are cleaned similarly to HLogs - a cleaner delegate chain that is instantiated as per the configured classes. Similar to hlog cleaners, a TimeToLiveHFileCleaner is used by default, and specified under the hbase.master.hfilecleaner.plugins configuration key. In unifying the two cleaner interfaces, the new configuration parameter to use the TimeToLiveHLogCleaner is: <property> <name>hbase.master.logcleaner.plugins</name> <value>org.apache.hadoop.hbase.master.cleaner.TimeToLiveLogCleaner</value> </property> It is still enabled by default, so nothing needs to be changed if you are not modifying the logcleaners. Don't delete HFiles when in backup mode - Key: HBASE-5547 URL: https://issues.apache.org/jira/browse/HBASE-5547 Project: HBase Issue Type: New Feature Reporter: Lars Hofhansl Assignee: Jesse Yates Fix For: 0.94.2 Attachments: 5547-v12.txt, hbase-5447-v8.patch, hbase-5447-v8.patch, hbase-5547-v9.patch, java_HBASE-5547_v13.patch, java_HBASE-5547_v4.patch, java_HBASE-5547_v5.patch, java_HBASE-5547_v6.patch, java_HBASE-5547_v7.patch This came up in a discussion I had with Stack. It would be nice if HBase could be notified that a backup is in progress (via a znode for example) and in that case either: 1. rename HFiles to be deleted to file.bck 2. rename the HFiles into a special directory 3. rename them to a general trash directory (which would not need to be tied to backup mode). That way it should be able to get a consistent backup based on HFiles (HDFS snapshots or hard links would be better options here, but we do not have those). #1 makes cleanup a bit harder.
[jira] [Commented] (HBASE-6396) Fix NoSuchMethodError running against hadoop 2.0
[ https://issues.apache.org/jira/browse/HBASE-6396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13417354#comment-13417354 ] Lars Hofhansl commented on HBASE-6396: -- Generally we're getting into muddy waters here I think. In some cases we abstract Hadoop versions away with reflection, in other cases HBase needs to be built with a specific hadoop profile. Fix NoSuchMethodError running against hadoop 2.0 Key: HBASE-6396 URL: https://issues.apache.org/jira/browse/HBASE-6396 Project: HBase Issue Type: Bug Affects Versions: 0.94.0 Reporter: Zhihong Ted Yu Assignee: Zhihong Ted Yu Labels: hadoop-2.0 Fix For: 0.96.0 Attachments: 6396-v2.txt HADOOP-8350 changed the signature of NetUtils.getInputStream(). This leads to NoSuchMethodError in HBaseClient$Connection.setupIOstreams(). See https://issues.apache.org/jira/browse/HADOOP-8350?focusedCommentId=13414276&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13414276
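As a rough illustration of what "abstracting versions away with reflection" can look like, the sketch below resolves a static method by name and parameter types at run time, so the caller has no compile-time link to one particular signature of the target. ReflectiveCall and invokeStatic are hypothetical names, and a JDK method stands in for the target; this is not the 6396-v2.txt patch and makes no claim about how NetUtils is actually handled.

```java
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;

// Hypothetical helper for illustration; not code from HBase or the patch.
class ReflectiveCall {
    // Look up a public static method by name and parameter types at run time
    // and invoke it. The lookup ignores the method's return type.
    static Object invokeStatic(String className, String methodName,
                               Class<?>[] paramTypes, Object... args) {
        try {
            Class<?> clazz = Class.forName(className);
            Method method = clazz.getMethod(methodName, paramTypes);
            return method.invoke(null, args);
        } catch (ClassNotFoundException | NoSuchMethodException
                 | IllegalAccessException e) {
            // The method is missing in the library version on the classpath.
            throw new UnsupportedOperationException(
                className + "." + methodName + " is not available", e);
        } catch (InvocationTargetException e) {
            // The target method itself threw; surface its cause.
            throw new RuntimeException(e.getCause());
        }
    }
}
```

A call such as ReflectiveCall.invokeStatic("java.lang.Integer", "parseInt", new Class<?>[] { String.class }, "42") returns the boxed Integer 42. The trade-off raised in the thread still applies: a missing method is only discovered at run time, whereas building against a specific profile surfaces it at compile time.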
[jira] [Commented] (HBASE-6396) Fix NoSuchMethodError running against hadoop 2.0
[ https://issues.apache.org/jira/browse/HBASE-6396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13417355#comment-13417355 ] Andrew Purtell commented on HBASE-6396: --- What do you think Lars, err on the side of reflection here? Fix NoSuchMethodError running against hadoop 2.0 Key: HBASE-6396 URL: https://issues.apache.org/jira/browse/HBASE-6396 Project: HBase Issue Type: Bug Affects Versions: 0.94.0 Reporter: Zhihong Ted Yu Assignee: Zhihong Ted Yu Labels: hadoop-2.0 Fix For: 0.96.0 Attachments: 6396-v2.txt HADOOP-8350 changed the signature of NetUtils.getInputStream() This leads to NoSuchMethodError in HBaseClient$Connection.setupIOstreams(). See https://issues.apache.org/jira/browse/HADOOP-8350?focusedCommentId=13414276page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13414276 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5547) Don't delete HFiles when in backup mode
[ https://issues.apache.org/jira/browse/HBASE-5547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13417366#comment-13417366 ] Zhihong Ted Yu commented on HBASE-5547: --- Here is the link about InterruptedException handling: http://www.ibm.com/developerworks/java/library/j-jtp05236/index.html Take a look at Listing 3 under 'Don't swallow interrupts'. Don't delete HFiles when in backup mode - Key: HBASE-5547 URL: https://issues.apache.org/jira/browse/HBASE-5547 Project: HBase Issue Type: New Feature Reporter: Lars Hofhansl Assignee: Jesse Yates Fix For: 0.94.2 Attachments: 5547-v12.txt, hbase-5447-v8.patch, hbase-5447-v8.patch, hbase-5547-v9.patch, java_HBASE-5547_v13.patch, java_HBASE-5547_v4.patch, java_HBASE-5547_v5.patch, java_HBASE-5547_v6.patch, java_HBASE-5547_v7.patch This came up in a discussion I had with Stack. It would be nice if HBase could be notified that a backup is in progress (via a znode for example) and in that case either: 1. rename HFiles to be deleted to file.bck 2. rename the HFiles into a special directory 3. rename them to a general trash directory (which would not need to be tied to backup mode). That way it should be able to get a consistent backup based on HFiles (HDFS snapshots or hard links would be better options here, but we do not have those). #1 makes cleanup a bit harder.
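The general idiom behind "don't swallow interrupts" is that code which catches InterruptedException and cannot rethrow it should set the thread's interrupt flag again, so callers further up can still see the interruption. The sketch below shows that idiom in isolation; InterruptSketch and its methods are hypothetical names, and it is not a copy of the listing in the linked article or of the HBASE-5547 patch.

```java
// Hypothetical helper for illustration; not code from the patch under review.
class InterruptSketch {
    // Sleep without declaring InterruptedException, but preserve the
    // interrupt for callers instead of silently discarding it.
    static void pause(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            // Thread.sleep cleared the flag when it threw; set it again.
            Thread.currentThread().interrupt();
        }
    }

    // Demonstrates that the interrupt status survives a call to pause().
    static boolean statusSurvives() {
        Thread.currentThread().interrupt();   // simulate a pending interrupt
        pause(1000);                          // returns at once: sleep throws
        return Thread.interrupted();          // reads and clears the flag
    }
}
```

If the catch block were left empty, statusSurvives() would return false, and a caller polling the interrupt flag to decide when to stop would never learn that a shutdown had been requested.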
[jira] [Commented] (HBASE-6419) PersistentMetricsTimeVaryingRate gets used for non-time-based metrics (part2 of HBASE-6220)
[ https://issues.apache.org/jira/browse/HBASE-6419?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13417370#comment-13417370 ] Hudson commented on HBASE-6419: --- Integrated in HBase-TRUNK #3144 (See [https://builds.apache.org/job/HBase-TRUNK/3144/]) HBASE-6419 PersistentMetricsTimeVaryingRate gets used for non-time-based metrics (part2 of HBASE-6220) (Paul Cavallaro) (Revision 1363016) Result = FAILURE tedyu : Files : * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/metrics/RegionServerMetrics.java PersistentMetricsTimeVaryingRate gets used for non-time-based metrics (part2 of HBASE-6220) --- Key: HBASE-6419 URL: https://issues.apache.org/jira/browse/HBASE-6419 Project: HBase Issue Type: Improvement Reporter: stack Assignee: Paul Cavallaro Attachments: ServerMetrics_HBASE_6220_Flush_Metrics.patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6400) Add getMasterAdmin() and getMasterMonitor() to HConnection
[ https://issues.apache.org/jira/browse/HBASE-6400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13417368#comment-13417368 ] Hudson commented on HBASE-6400: --- Integrated in HBase-TRUNK #3144 (See [https://builds.apache.org/job/HBase-TRUNK/3144/]) HBASE-6400 Add getMasterAdmin() and getMasterMonitor() to HConnection (Enis) (Revision 1363009) Result = FAILURE tedyu : Files : * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/client/HConnection.java * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java Add getMasterAdmin() and getMasterMonitor() to HConnection -- Key: HBASE-6400 URL: https://issues.apache.org/jira/browse/HBASE-6400 Project: HBase Issue Type: Improvement Reporter: Enis Soztutar Assignee: Enis Soztutar Fix For: 0.96.0 Attachments: 6400-v2.patch, HBASE-6400_v1.patch HConnection used to have getMaster() which returns HMasterInterface, but after HBASE-6039 it has been removed. I think we need to expose HConnection.getMasterAdmin() and getMasterMonitor() a la HConnection.getAdmin(), and getClient(). HConnectionImplementation has getKeepAliveMasterAdmin() but, I see no reason to leak keep alive classes to upper layers. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6220) PersistentMetricsTimeVaryingRate gets used for non-time-based metrics
[ https://issues.apache.org/jira/browse/HBASE-6220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13417369#comment-13417369 ] Hudson commented on HBASE-6220: --- Integrated in HBase-TRUNK #3144 (See [https://builds.apache.org/job/HBase-TRUNK/3144/]) HBASE-6419 PersistentMetricsTimeVaryingRate gets used for non-time-based metrics (part2 of HBASE-6220) (Paul Cavallaro) (Revision 1363016) Result = FAILURE tedyu : Files : * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/metrics/RegionServerMetrics.java PersistentMetricsTimeVaryingRate gets used for non-time-based metrics - Key: HBASE-6220 URL: https://issues.apache.org/jira/browse/HBASE-6220 Project: HBase Issue Type: Bug Components: metrics Affects Versions: 0.96.0 Reporter: David S. Wang Assignee: Paul Cavallaro Priority: Minor Labels: noob Attachments: ServerMetrics_HBASE_6220.patch, ServerMetrics_HBASE_6220_Flush_Metrics.patch PersistentMetricsTimeVaryingRate gets used for metrics that are not time-based, leading to confusing names such as avg_time for compaction size, etc. You have to read the code in order to understand that this is actually referring to bytes, not seconds. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6396) Fix NoSuchMethodError running against hadoop 2.0
[ https://issues.apache.org/jira/browse/HBASE-6396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13417379#comment-13417379 ] Zhihong Ted Yu commented on HBASE-6396: --- Clarification: the goal of the patch was not to make a build compiled against hadoop 1 work against hadoop 2. The goal is to make a build against the 0.23 profile work with hadoop 2. Fix NoSuchMethodError running against hadoop 2.0 Key: HBASE-6396 URL: https://issues.apache.org/jira/browse/HBASE-6396 Project: HBase Issue Type: Bug Affects Versions: 0.94.0 Reporter: Zhihong Ted Yu Assignee: Zhihong Ted Yu Labels: hadoop-2.0 Fix For: 0.96.0 Attachments: 6396-v2.txt HADOOP-8350 changed the signature of NetUtils.getInputStream() This leads to NoSuchMethodError in HBaseClient$Connection.setupIOstreams(). See https://issues.apache.org/jira/browse/HADOOP-8350?focusedCommentId=13414276&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13414276 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HBASE-6422) Add switch in LoadIncrementalHFiles API to allow for programatically changing
Jeff Lord created HBASE-6422: Summary: Add switch in LoadIncrementalHFiles API to allow for programatically changing Key: HBASE-6422 URL: https://issues.apache.org/jira/browse/HBASE-6422 Project: HBase Issue Type: New Feature Components: mapred Affects Versions: 0.94.0 Reporter: Jeff Lord HBase bulk load often requires manual chown and permission changes. Usually it goes something like this: you try to run completebulkload and it fails with the following hdfs error: org.apache.hadoop.security.AccessControlException: org.apache.hadoop.security.AccessControlException: Permission denied: user=hbase, access=WRITE, inode=mydata: hadoop:supergroup:rwxr-xr-x To work around this mismatch between Hadoop and HBase user permissions, you can make both users share a group: that is, the user where you run the MapReduce jobs and the user running HBase. Then, after running your MapReduce job, you can chgrp the output directory to the HBase group, and run chmod g+w. This allows the bulk loader to move the files into the HBase data directory. It would be useful if there was a way to do this in the LoadIncrementalHFiles API. The same holds for linux permissions: If we have: file owner:me group:me We can chown to group:x and owner:x, without needing special permissions chown x:x file Then when we trigger bulk load, HBase does the fs -mv, and finds that the owner is itself, so there are no permission hitches and it goes smoothly. We are thinking that a switch to turn this on/off (default off) would be nice to have. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
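The manual "chmod g+w" step described in HBASE-6422 above can be illustrated with a small sketch. This is an assumption-heavy stand-in: it uses java.nio against a local POSIX file system rather than HDFS, it applies the change recursively, and the class GroupWritable is a hypothetical name. It is not the LoadIncrementalHFiles API and not a proposed patch; it only shows what "add group write to the output directory" means programmatically.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.PosixFilePermission;
import java.nio.file.attribute.PosixFilePermissions;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper for illustration; works on a local POSIX file system only.
class GroupWritable {
    // Local equivalent of "chmod -R g+w root": adds group write to the
    // directory and to everything below it, leaving other bits unchanged.
    static void addGroupWrite(Path root) throws IOException {
        List<Path> all;
        try (Stream<Path> paths = Files.walk(root)) {
            all = paths.collect(Collectors.toList());
        }
        for (Path p : all) {
            Set<PosixFilePermission> perms =
                new HashSet<>(Files.getPosixFilePermissions(p));
            perms.add(PosixFilePermission.GROUP_WRITE);
            Files.setPosixFilePermissions(p, perms);
        }
    }

    // Builds a throwaway output directory with rwxr-xr-x permissions (as in
    // the error message above) and checks that group write is added.
    static boolean demo() {
        try {
            Path dir = Files.createTempDirectory("bulkload-out");
            Path file = Files.createFile(dir.resolve("part-00000"));
            Files.setPosixFilePermissions(dir, PosixFilePermissions.fromString("rwxr-xr-x"));
            Files.setPosixFilePermissions(file, PosixFilePermissions.fromString("rw-r--r--"));
            addGroupWrite(dir);
            return Files.getPosixFilePermissions(dir).contains(PosixFilePermission.GROUP_WRITE)
                && Files.getPosixFilePermissions(file).contains(PosixFilePermission.GROUP_WRITE);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The chgrp half of the workaround is left out here because it depends on which groups exist on the machine.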
[jira] [Commented] (HBASE-6220) PersistentMetricsTimeVaryingRate gets used for non-time-based metrics
[ https://issues.apache.org/jira/browse/HBASE-6220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13417388#comment-13417388 ] Paul Cavallaro commented on HBASE-6220: --- Should I resolve this or someone else? Thanks. PersistentMetricsTimeVaryingRate gets used for non-time-based metrics - Key: HBASE-6220 URL: https://issues.apache.org/jira/browse/HBASE-6220 Project: HBase Issue Type: Bug Components: metrics Affects Versions: 0.96.0 Reporter: David S. Wang Assignee: Paul Cavallaro Priority: Minor Labels: noob Attachments: ServerMetrics_HBASE_6220.patch, ServerMetrics_HBASE_6220_Flush_Metrics.patch PersistentMetricsTimeVaryingRate gets used for metrics that are not time-based, leading to confusing names such as avg_time for compaction size, etc. You have to read the code in order to understand that this is actually referring to bytes, not seconds. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-6220) PersistentMetricsTimeVaryingRate gets used for non-time-based metrics
[ https://issues.apache.org/jira/browse/HBASE-6220?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhihong Ted Yu updated HBASE-6220: -- Resolution: Fixed Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) PersistentMetricsTimeVaryingRate gets used for non-time-based metrics - Key: HBASE-6220 URL: https://issues.apache.org/jira/browse/HBASE-6220 Project: HBase Issue Type: Bug Components: metrics Affects Versions: 0.96.0 Reporter: David S. Wang Assignee: Paul Cavallaro Priority: Minor Labels: noob Attachments: ServerMetrics_HBASE_6220.patch, ServerMetrics_HBASE_6220_Flush_Metrics.patch PersistentMetricsTimeVaryingRate gets used for metrics that are not time-based, leading to confusing names such as avg_time for compaction size, etc. You have to read the code in order to understand that this is actually referring to bytes, not seconds. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-6422) Add switch in LoadIncrementalHFiles API to allow for programatically changing perms on output directory
[ https://issues.apache.org/jira/browse/HBASE-6422?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeff Lord updated HBASE-6422: - Summary: Add switch in LoadIncrementalHFiles API to allow for programatically changing perms on output directory (was: Add switch in LoadIncrementalHFiles API to allow for programatically changing ) Add switch in LoadIncrementalHFiles API to allow for programatically changing perms on output directory --- Key: HBASE-6422 URL: https://issues.apache.org/jira/browse/HBASE-6422 Project: HBase Issue Type: New Feature Components: mapred Affects Versions: 0.94.0 Reporter: Jeff Lord Hbase bulk load often requires manual chown and permission changes. Usually it goes something like try to run completebulkload and it fails with the following hdfs error: org.apache.hadoop.security.AccessControlException: org.apache.hadoop.security.AccessControlException: Permission denied: user=hbase, access=WRITE, inode=mydata: hadoop:supergroup:rwxr-xr-x To work around this mismatch between Hadoop and HBase user permissions, you can make both users share a group: that is, the user where you run the MapReduce jobs and the user running HBase. Then, after running your MapReduce job, you can chgrp the output directory to the HBase group, and run chmod g+w. This allows the bulk loader to move the files into the HBase data directory. It would be useful if there was a way to do this in the LoadIncrementalHFiles API It is the case of linux permissions too: If we have: file owner:me group:me We can chown to group:x and owner:x, without needing special permissions chown x:x file Then when we trigger bulk load, HBase does the fs -mv, and finds that the owner is itself, so no permission hitches. Goes smooth. We are thinking that a switch to turn this on/off (default off) would be nice to have. -- This message is automatically generated by JIRA. 
[jira] [Commented] (HBASE-4255) Expose CatalogJanitor controls
[ https://issues.apache.org/jira/browse/HBASE-4255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13417398#comment-13417398 ] Hadoop QA commented on HBASE-4255: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12537036/4255-5.1.patch against trunk revision . +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. +1 hadoop2.0. The patch compiles against the hadoop 2.0 profile. +1 javadoc. The javadoc tool did not generate any warning messages. -1 javac. The applied patch generated 5 javac compiler warnings (more than the trunk's current 4 warnings). -1 findbugs. The patch appears to introduce 11 new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. 
The patch failed these unit tests: Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/2400//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2400//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2400//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2400//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2400//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2400//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/2400//console This message is automatically generated. Expose CatalogJanitor controls -- Key: HBASE-4255 URL: https://issues.apache.org/jira/browse/HBASE-4255 Project: HBase Issue Type: Improvement Affects Versions: 0.90.4 Reporter: Jean-Daniel Cryans Assignee: Devaraj Das Fix For: 0.96.0 Attachments: 4255-4.2.patch, 4255-5.1.patch When doing surgery or other operational tasks, it's nice to be able to have the .META. table quickly cleaned of split parents. The CatalogJanitor already has controls baked in (currently used in unit tests), I think we should expose this the same way we do with the balancer, that is: - start - stop - request a run A client would need to go through HBaseAdmin, and shell commands need to be created. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4255) Expose CatalogJanitor controls
[ https://issues.apache.org/jira/browse/HBASE-4255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13417400#comment-13417400 ] Zhihong Ted Yu commented on HBASE-4255: --- No test failure from https://builds.apache.org/job/PreCommit-HBASE-Build/2400//testReport/: {code} Results : Tests run: 1021, Failures: 0, Errors: 0, Skipped: 9 {code} Expose CatalogJanitor controls -- Key: HBASE-4255 URL: https://issues.apache.org/jira/browse/HBASE-4255 Project: HBase Issue Type: Improvement Affects Versions: 0.90.4 Reporter: Jean-Daniel Cryans Assignee: Devaraj Das Fix For: 0.96.0 Attachments: 4255-4.2.patch, 4255-5.1.patch When doing surgery or other operational tasks, it's nice to be able to have the .META. table quickly cleaned of split parents. The CatalogJanitor already has controls baked in (currently used in unit tests), I think we should expose this the same way we do with the balancer, that is: - start - stop - request a run A client would need to go through HBaseAdmin, and shell commands need to be created. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
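The three controls requested in HBASE-4255 above (start, stop, request a run) can be pictured as a switch plus an on-demand trigger, in the style of the balancer switch. The sketch below is hypothetical throughout: JanitorControl and its methods are invented for illustration and are neither the attached patches nor the HBaseAdmin API. In particular, letting a requested run proceed while the periodic chore is stopped is an assumption of this sketch, not something the issue specifies.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical control surface; not the HBASE-4255 patch, not HBaseAdmin.
class JanitorControl {
    private final AtomicBoolean enabled = new AtomicBoolean(true);
    private final AtomicInteger scans = new AtomicInteger();

    // "start" / "stop": flips the switch and returns the previous state,
    // so a caller can restore it afterwards.
    boolean setEnabled(boolean on) {
        return enabled.getAndSet(on);
    }

    boolean isEnabled() {
        return enabled.get();
    }

    // Called by the periodic timer; skips the scan while stopped.
    // Returns true if a scan was performed.
    boolean periodicTick() {
        if (!enabled.get()) {
            return false;
        }
        scan();
        return true;
    }

    // "request a run": operator-triggered scan. Assumption of this sketch:
    // it runs even when the periodic chore is stopped.
    int requestRun() {
        return scan();
    }

    // A real janitor would scan .META. for split parents here; this one
    // only counts invocations and returns the running total.
    private int scan() {
        return scans.incrementAndGet();
    }

    int scanCount() {
        return scans.get();
    }
}
```

Returning the previous state from setEnabled lets an operator script stop the janitor for surgery and put it back exactly as it was.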
[jira] [Commented] (HBASE-6396) Fix NoSuchMethodError running against hadoop 2.0
[ https://issues.apache.org/jira/browse/HBASE-6396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13417410#comment-13417410 ] Jonathan Hsieh commented on HBASE-6396: --- Since 0.92, we've needed to recompile -- this is due to some classes that became interfaces, and an enum that got moved and brought back in via inheritance. These changes left client code source-compatible but not binary-compatible -- it required recompilation. Since 0.21, 0.22, and 0.23 are basically deprecated by hadoop 1.0 and 2.0, I'm leaning towards just having 0.94 support 1.0.x and 2.0.x hadoops. (Does anyone really run or test this against 0.22 or 0.21?) I'll post a question on the mailing lists. Fix NoSuchMethodError running against hadoop 2.0 Key: HBASE-6396 URL: https://issues.apache.org/jira/browse/HBASE-6396 Project: HBase Issue Type: Bug Affects Versions: 0.94.0 Reporter: Zhihong Ted Yu Assignee: Zhihong Ted Yu Labels: hadoop-2.0 Fix For: 0.96.0 Attachments: 6396-v2.txt HADOOP-8350 changed the signature of NetUtils.getInputStream() This leads to NoSuchMethodError in HBaseClient$Connection.setupIOstreams(). See https://issues.apache.org/jira/browse/HADOOP-8350?focusedCommentId=13414276&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13414276 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6396) Fix NoSuchMethodError running against hadoop 2.0
[ https://issues.apache.org/jira/browse/HBASE-6396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13417657#comment-13417657 ] Lars Hofhansl commented on HBASE-6396: -- @Andy and @Jon: I think it should be one way or the other. Can we support Hadoop 1.0.x and Hadoop 2.0.x with just reflection? That would be preferable, but it seems we cannot. @Ted: I'd rather add a Hadoop 2.0.x profile (which is in fact what I did when I built 0.94 for Hadoop 2) Fix NoSuchMethodError running against hadoop 2.0 Key: HBASE-6396 URL: https://issues.apache.org/jira/browse/HBASE-6396 Project: HBase Issue Type: Bug Affects Versions: 0.94.0 Reporter: Zhihong Ted Yu Assignee: Zhihong Ted Yu Labels: hadoop-2.0 Fix For: 0.96.0 Attachments: 6396-v2.txt HADOOP-8350 changed the signature of NetUtils.getInputStream() This leads to NoSuchMethodError in HBaseClient$Connection.setupIOstreams(). See https://issues.apache.org/jira/browse/HADOOP-8350?focusedCommentId=13414276page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13414276 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
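For readers unfamiliar with the "reflection" option being weighed in the HBASE-6396 comments above: the idea is to resolve a method at run time by name and parameter types, so the caller carries no compiled-in method descriptor that can go stale when a library changes a signature. The sketch below uses a made-up stand-in class, StreamFactoryV2, rather than Hadoop's NetUtils, and assumes the change keeps the method name and parameter types. Whether this approach is adequate for the real NetUtils change is exactly what the thread is debating, so treat it as an illustration of the mechanism only.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.lang.reflect.Method;

// Stand-in for a library class whose method signature differs between
// versions. Hypothetical; this is not Hadoop's NetUtils.
class StreamFactoryV2 {
    // Imagine an older release declared the return type as InputStream.
    public static ByteArrayInputStream getInputStream(byte[] data) {
        return new ByteArrayInputStream(data);
    }
}

class ReflectiveCaller {
    // Looks the method up by name and parameter types only. The declared
    // return type plays no part in the lookup, unlike a compiled call site.
    static InputStream open(Class<?> factory, byte[] data)
            throws ReflectiveOperationException {
        Method m = factory.getMethod("getInputStream", byte[].class);
        return (InputStream) m.invoke(null, (Object) data);
    }

    // Opens a one-byte stream through the reflective path and reads it back.
    static int firstByte() {
        try (InputStream in = open(StreamFactoryV2.class, new byte[] {42})) {
            return in.read();
        } catch (ReflectiveOperationException | IOException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

The trade-off raised in the thread applies here too: a mistake in the method name or parameter types only shows up when the code runs, not when it compiles.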
[jira] [Commented] (HBASE-6421) [pom] add jettison and fix netty specification
[ https://issues.apache.org/jira/browse/HBASE-6421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13417661#comment-13417661 ] Jesse Yates commented on HBASE-6421: @Ted: how are you testing it? I ran 'mvn clean compile' and 'mvn clean compile -Phadoop.profile=2.0' and both worked fine. Same deal with test, though I didn't wait for the whole test suite. [pom] add jettison and fix netty specification -- Key: HBASE-6421 URL: https://issues.apache.org/jira/browse/HBASE-6421 Project: HBase Issue Type: Bug Reporter: Jesse Yates Assignee: Jesse Yates Attachments: hbase-6421-v0.patch Currently, jettison isn't required for testing hbase-server, but TestSchemaConfigured requires it, causing the compile phase (at least on my MBP) to fail. Further, in cleaning up the poms, netty should be declared in the parent hbase/pom.xml and then inherited in the subclass. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HBASE-6423) Writes should not block reads on blocking updates to memstores
Karthik Ranganathan created HBASE-6423: -- Summary: Writes should not block reads on blocking updates to memstores Key: HBASE-6423 URL: https://issues.apache.org/jira/browse/HBASE-6423 Project: HBase Issue Type: Bug Reporter: Karthik Ranganathan Assignee: Amitanand Aiyer We have a big data use case where we turn off WAL and have a ton of reads and writes. We found that: 1. flushing a memstore takes a while (GZIP compression) 2. incoming writes cause the new memstore to grow in an unbounded fashion 3. this triggers blocking memstore updates 4. in turn, this causes all the RPC handler threads to block on writes to that memstore 5. we are not able to read during this time as RPC handlers are blocked At a higher level, we should not hold up the RPC threads while blocking updates, and we should build in some sort of rate control. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
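One way to read the HBASE-6423 suggestion above ("we should not hold up the RPC threads while blocking updates") is that a write arriving while the memstore is over its limit should be turned away immediately rather than parking the handler thread, leaving handlers free for reads. The sketch below illustrates that shape under stated assumptions: MemstoreWriteGate, its byte accounting, and the retry-later contract are all hypothetical and are not HBase code or the eventual fix.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical admission gate; names and limits are illustrative only.
class MemstoreWriteGate {
    private final long blockingLimitBytes;
    private final AtomicLong memstoreBytes = new AtomicLong();

    MemstoreWriteGate(long blockingLimitBytes) {
        this.blockingLimitBytes = blockingLimitBytes;
    }

    // Admits the write if it fits under the limit. Otherwise returns false
    // at once, so the calling handler thread is released and the client can
    // be told to retry later instead of the thread blocking on the flush.
    boolean tryWrite(long sizeBytes) {
        while (true) {
            long current = memstoreBytes.get();
            if (current + sizeBytes > blockingLimitBytes) {
                return false;
            }
            if (memstoreBytes.compareAndSet(current, current + sizeBytes)) {
                return true;
            }
            // Lost a race with another writer or a flush; re-read and retry.
        }
    }

    // Called once a flush has persisted flushedBytes of memstore data.
    void onFlush(long flushedBytes) {
        memstoreBytes.addAndGet(-flushedBytes);
    }

    long size() {
        return memstoreBytes.get();
    }
}
```

A gate like this is also a natural place to hang the "rate control" mentioned in the issue, for example by having rejected clients back off before retrying.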
[jira] [Commented] (HBASE-5547) Don't delete HFiles when in backup mode
[ https://issues.apache.org/jira/browse/HBASE-5547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13417664#comment-13417664 ] Jesse Yates commented on HBASE-5547: Something Matteo mentioned in the latest review: {quote} MasterFileSystem contains deleteRegion() and deleteTable() that call fs.delete() with the recursive flag on. These two methods get called by DeleteTableHandler (drop table). In a backup/snapshot situation we want to keep the regions/hfiles. {quote} The question is whether we want to scope this in this patch or do it in another patch, e.g. HBASE-6205. Don't delete HFiles when in backup mode - Key: HBASE-5547 URL: https://issues.apache.org/jira/browse/HBASE-5547 Project: HBase Issue Type: New Feature Reporter: Lars Hofhansl Assignee: Jesse Yates Fix For: 0.94.2 Attachments: 5547-v12.txt, hbase-5447-v8.patch, hbase-5447-v8.patch, hbase-5547-v9.patch, java_HBASE-5547_v13.patch, java_HBASE-5547_v4.patch, java_HBASE-5547_v5.patch, java_HBASE-5547_v6.patch, java_HBASE-5547_v7.patch This came up in a discussion I had with Stack. It would be nice if HBase could be notified that a backup is in progress (via a znode for example) and in that case either: 1. rename HFiles to be deleted to file.bck 2. rename the HFiles into a special directory 3. rename them to a general trash directory (which would not need to be tied to backup mode). That way it should be able to get a consistent backup based on HFiles (HDFS snapshots or hard links would be better options here, but we do not have those). #1 makes cleanup a bit harder. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
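Option 2 from the HBASE-5547 description above (rename HFiles into a special directory instead of deleting them while a backup is running) can be sketched as follows. This is an illustration under assumptions, not the attached patches: it uses java.nio on a local file system in place of HDFS, the backup-mode flag is a plain field rather than a znode, and HFileDisposer is a hypothetical name.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper; local-file-system stand-in for the HDFS case.
class HFileDisposer {
    private final Path archiveDir;
    private volatile boolean backupMode;

    HFileDisposer(Path archiveDir) {
        this.archiveDir = archiveDir;
    }

    // In the issue this state would come from a znode; here it is a flag.
    void setBackupMode(boolean on) {
        this.backupMode = on;
    }

    // While a backup is in progress the file is moved into the archive
    // directory so the backup still sees it; otherwise it is deleted.
    void dispose(Path hfile) throws IOException {
        if (backupMode) {
            Files.createDirectories(archiveDir);
            Files.move(hfile, archiveDir.resolve(hfile.getFileName()),
                StandardCopyOption.REPLACE_EXISTING);
        } else {
            Files.delete(hfile);
        }
    }

    // Exercises both paths in a temporary directory.
    static boolean demo() {
        try {
            Path root = Files.createTempDirectory("hfile-demo");
            Path archive = root.resolve("archive");
            HFileDisposer disposer = new HFileDisposer(archive);

            Path kept = Files.createFile(root.resolve("a.hfile"));
            disposer.setBackupMode(true);
            disposer.dispose(kept);
            boolean archived = Files.notExists(kept)
                && Files.exists(archive.resolve("a.hfile"));

            Path dropped = Files.createFile(root.resolve("b.hfile"));
            disposer.setBackupMode(false);
            disposer.dispose(dropped);
            boolean deleted = Files.notExists(dropped)
                && Files.notExists(archive.resolve("b.hfile"));

            return archived && deleted;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The recursive deletes in deleteRegion() and deleteTable() mentioned in the comment would need the same treatment per file or per directory, which is the scoping question being asked.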
[jira] [Commented] (HBASE-6406) TestReplicationPeer.testResetZooKeeperSession and TestZooKeeper.testClientSessionExpired fail frequently
[ https://issues.apache.org/jira/browse/HBASE-6406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13417665#comment-13417665 ] Zhihong Ted Yu commented on HBASE-6406: --- For trunk, TestZooKeeper hung with the following output:
{code}
2012-07-18 13:24:34,764 INFO [Master:0;sdev25.arch.ebay.com,59816,1342643039714] master.HMaster(455): HMaster main thread exiting
2012-07-18 13:24:34,764 INFO [RegionServer:2;sdev25.arch.ebay.com,60707,1342643074759] zookeeper.RecoverableZooKeeper(102): The identifier of this process is 15496@sdev25
2012-07-18 13:24:34,772 DEBUG [RegionServer:2;sdev25.arch.ebay.com,60707,1342643074759-EventThread] zookeeper.ZooKeeperWatcher(262): regionserver:60707 Received ZooKeeper Event, type=None, state=SyncConnected, path=null
2012-07-18 13:24:34,773 DEBUG [RegionServer:2;sdev25.arch.ebay.com,60707,1342643074759] zookeeper.ZKUtil(238): regionserver:60707 /hbase/master does not exist. Watcher is set.
2012-07-18 13:24:34,774 DEBUG [RegionServer:2;sdev25.arch.ebay.com,60707,1342643074759-EventThread] zookeeper.ZooKeeperWatcher(339): regionserver:60707-0x1389bc2dddb000c connected
2012-07-18 13:24:35,062 INFO [sdev25.arch.ebay.com,59816,1342643039714.splitLogManagerTimeoutMonitor] hbase.Chore(82): sdev25.arch.ebay.com,59816,1342643039714.splitLogManagerTimeoutMonitor exiting
2012-07-18 13:24:35,080 DEBUG [RegionServer:0;sdev25.arch.ebay.com,48349,1342643039994] regionserver.HRegionServer(1817): No master found; retry
2012-07-18 13:24:36,081 DEBUG [RegionServer:0;sdev25.arch.ebay.com,48349,1342643039994] regionserver.HRegionServer(1817): No master found; retry
{code}
TestReplicationPeer.testResetZooKeeperSession and TestZooKeeper.testClientSessionExpired fail frequently Key: HBASE-6406 URL: https://issues.apache.org/jira/browse/HBASE-6406 Project: HBase Issue Type: Bug Affects Versions: 0.94.1 Reporter: Lars Hofhansl Assignee: Lars Hofhansl Fix For: 0.94.2 Attachments: testReplication.jstack, testZooKeeper.jstack Looking back through the 0.94 test runs these two tests accounted for 11 of 34 failed tests. They should be fixed or (temporarily) disabled. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Comment Edited] (HBASE-6406) TestReplicationPeer.testResetZooKeeperSession and TestZooKeeper.testClientSessionExpired fail frequently
[ https://issues.apache.org/jira/browse/HBASE-6406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13417665#comment-13417665 ] Zhihong Ted Yu edited comment on HBASE-6406 at 7/18/12 8:50 PM: For trunk, TestZooKeeper hung with the following output: {code} 2012-07-18 13:24:34,764 INFO [Master:0;X.ebay.com,59816,1342643039714] master.HMaster(455): HMaster main thread exiting 2012-07-18 13:24:34,764 INFO [RegionServer:2;X.ebay.com,60707,1342643074759] zookeeper.RecoverableZooKeeper(102): The identifier of this process is 15496@sdev25 2012-07-18 13:24:34,772 DEBUG [RegionServer:2;X.ebay.com,60707,1342643074759-EventThread] zookeeper.ZooKeeperWatcher(262): regionserver:60707 Received ZooKeeper Event, type=None, state=SyncConnected, path=null 2012-07-18 13:24:34,773 DEBUG [RegionServer:2;X.ebay.com,60707,1342643074759] zookeeper.ZKUtil(238): regionserver:60707 /hbase/master does not exist. Watcher is set. 2012-07-18 13:24:34,774 DEBUG [RegionServer:2;X.ebay.com,60707,1342643074759-EventThread] zookeeper.ZooKeeperWatcher(339): regionserver:60707-0x1389bc2dddb000c connected 2012-07-18 13:24:35,062 INFO [X.ebay.com,59816,1342643039714.splitLogManagerTimeoutMonitor] hbase.Chore(82): X.ebay.com,59816,1342643039714.splitLogManagerTimeoutMonitor exiting 2012-07-18 13:24:35,080 DEBUG [RegionServer:0;X.ebay.com,48349,1342643039994] regionserver.HRegionServer(1817): No master found; retry 2012-07-18 13:24:36,081 DEBUG [RegionServer:0;X.ebay.com,48349,1342643039994] regionserver.HRegionServer(1817): No master found; retry{code} was (Author: zhi...@ebaysf.com): For trunk, TestZooKeeper hung with the following output: {code} 2012-07-18 13:24:34,764 INFO [Master:0;sdev25.arch.ebay.com,59816,1342643039714] master.HMaster(455): HMaster main thread exiting 2012-07-18 13:24:34,764 INFO [RegionServer:2;sdev25.arch.ebay.com,60707,1342643074759] zookeeper.RecoverableZooKeeper(102): The identifier of this process is 15496@sdev25 2012-07-18 13:24:34,772 
DEBUG [RegionServer:2;sdev25.arch.ebay.com,60707,1342643074759-EventThread] zookeeper.ZooKeeperWatcher(262): regionserver:60707 Received ZooKeeper Event, type=None, state=SyncConnected, path=null 2012-07-18 13:24:34,773 DEBUG [RegionServer:2;sdev25.arch.ebay.com,60707,1342643074759] zookeeper.ZKUtil(238): regionserver:60707 /hbase/master does not exist. Watcher is set. 2012-07-18 13:24:34,774 DEBUG [RegionServer:2;sdev25.arch.ebay.com,60707,1342643074759-EventThread] zookeeper.ZooKeeperWatcher(339): regionserver:60707-0x1389bc2dddb000c connected 2012-07-18 13:24:35,062 INFO [sdev25.arch.ebay.com,59816,1342643039714.splitLogManagerTimeoutMonitor] hbase.Chore(82): sdev25.arch.ebay.com,59816,1342643039714.splitLogManagerTimeoutMonitor exiting 2012-07-18 13:24:35,080 DEBUG [RegionServer:0;sdev25.arch.ebay.com,48349,1342643039994] regionserver.HRegionServer(1817): No master found; retry 2012-07-18 13:24:36,081 DEBUG [RegionServer:0;sdev25.arch.ebay.com,48349,1342643039994] regionserver.HRegionServer(1817): No master found; retry {code} TestReplicationPeer.testResetZooKeeperSession and TestZooKeeper.testClientSessionExpired fail frequently Key: HBASE-6406 URL: https://issues.apache.org/jira/browse/HBASE-6406 Project: HBase Issue Type: Bug Affects Versions: 0.94.1 Reporter: Lars Hofhansl Assignee: Lars Hofhansl Fix For: 0.94.2 Attachments: testReplication.jstack, testZooKeeper.jstack Looking back through the 0.94 test runs these two tests accounted for 11 of 34 failed tests. They should be fixed or (temporarily) disabled. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6396) Fix NoSuchMethodError running against hadoop 2.0
[ https://issues.apache.org/jira/browse/HBASE-6396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13417667#comment-13417667 ] Andrew Purtell commented on HBASE-6396: --- @Lars, not yet, but trunk is evolving such that we can move code sections implemented as reflection into version specific implementations guarded by Maven profiles. Fix NoSuchMethodError running against hadoop 2.0 Key: HBASE-6396 URL: https://issues.apache.org/jira/browse/HBASE-6396 Project: HBase Issue Type: Bug Affects Versions: 0.94.0 Reporter: Zhihong Ted Yu Assignee: Zhihong Ted Yu Labels: hadoop-2.0 Fix For: 0.96.0 Attachments: 6396-v2.txt HADOOP-8350 changed the signature of NetUtils.getInputStream() This leads to NoSuchMethodError in HBaseClient$Connection.setupIOstreams(). See https://issues.apache.org/jira/browse/HADOOP-8350?focusedCommentId=13414276page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13414276 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6396) Fix NoSuchMethodError running against hadoop 2.0
[ https://issues.apache.org/jira/browse/HBASE-6396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13417669#comment-13417669 ] Jonathan Hsieh commented on HBASE-6396: --- @Lars, ideally yeah. Actually, in this case where we have source-compatible changes from hdfs that require a recompile, I'd rather not use reflection - we'd need to run the whole test suite to catch cross-version problems instead of just doing the compile check that currently happens. (+1 hadoop23 compile check added to the hadoopqa script). I created a hadoop 2.0 profile locally and compiled everything against hadoop 2.0 jars instead of hadoop 0.23.x jars. This build didn't include this patch, and seems to have gotten past where the other build got stuck. With Ted's patch, my guess is that the reflection trick caught one place but that there are others in the source where the trick needed to be done again. I second the idea of adding a hadoop 2.0 profile, and possibly dropping hadoop 0.23, 0.21 build support (but keeping 0.22, which is being used by Ted's crew). Bottom line, this patch isn't really needed for hadoop 2.0. I'm not sure if this patch is needed with 0.22 hadoops -- it would be up to folks using that to vet it there. Fix NoSuchMethodError running against hadoop 2.0 Key: HBASE-6396 URL: https://issues.apache.org/jira/browse/HBASE-6396 Project: HBase Issue Type: Bug Affects Versions: 0.94.0 Reporter: Zhihong Ted Yu Assignee: Zhihong Ted Yu Labels: hadoop-2.0 Fix For: 0.96.0 Attachments: 6396-v2.txt HADOOP-8350 changed the signature of NetUtils.getInputStream(). This leads to NoSuchMethodError in HBaseClient$Connection.setupIOstreams(). See https://issues.apache.org/jira/browse/HADOOP-8350?focusedCommentId=13414276&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13414276
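The trade-off described in the comment above (reflection survives a signature change, but any mismatch only surfaces at run time) can be illustrated with a small sketch. Everything here is hypothetical: `StreamUtilStandIn` is a stand-in written for this example and is not Hadoop's `NetUtils`, and the "return type became a subtype" change is an assumed example of a source-compatible change, not a statement about what HADOOP-8350 did.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.lang.reflect.Method;

/** Illustrative only: none of these names come from HBase or Hadoop. */
final class ReflectionCallSketch {

  /** Hypothetical stand-in for a utility whose return type differs between releases. */
  public static final class StreamUtilStandIn {
    // Assumed example: this "release" returns a subtype instead of plain InputStream.
    public static ByteArrayInputStream getInputStream(byte[] data) {
      return new ByteArrayInputStream(data);
    }
  }

  /** Looks the method up by name and parameter types only; the return type is not part of the lookup. */
  static InputStream openViaReflection(Class<?> utilClass, byte[] data) {
    try {
      Method m = utilClass.getMethod("getInputStream", byte[].class);
      return (InputStream) m.invoke(null, (Object) data);
    } catch (ReflectiveOperationException e) {
      // A mismatch is only detected here, at run time, never by the compiler.
      throw new IllegalStateException("getInputStream(byte[]) is not callable", e);
    }
  }

  /** Demo helper: first byte read through the reflective call, or -1 on an I/O error. */
  static int firstByte(byte[] data) {
    try (InputStream in = openViaReflection(StreamUtilStandIn.class, data)) {
      return in.read();
    } catch (IOException e) {
      return -1;
    }
  }

  public static void main(String[] args) {
    System.out.println(firstByte(new byte[] {42}));
  }
}
```

A directly compiled call site is linked against the full method descriptor, return type included, which is why recompiling against the new jars (the compile check mentioned above) is enough when the change is source-compatible.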
[jira] [Commented] (HBASE-6421) [pom] add jettison and fix netty specification
[ https://issues.apache.org/jira/browse/HBASE-6421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13417670#comment-13417670 ] Zhihong Ted Yu commented on HBASE-6421: --- I got the following:
{code}
[ERROR] The build could not read 1 project -> [Help 1]
org.apache.maven.project.ProjectBuildingException: Some problems were encountered while processing the POMs:
[ERROR] 'dependencies.dependency.version' for org.apache.hadoop:hadoop-core:jar is missing. @ line 64, column 17
[ERROR] 'dependencies.dependency.version' for org.apache.hadoop:hadoop-test:jar is missing. @ line 91, column 17
  at org.apache.maven.project.DefaultProjectBuilder.build(DefaultProjectBuilder.java:339)
  at org.apache.maven.DefaultMaven.collectProjects(DefaultMaven.java:632)
  at org.apache.maven.DefaultMaven.getProjectsForMavenReactor(DefaultMaven.java:581)
  at org.apache.maven.DefaultMaven.doExecute(DefaultMaven.java:233)
{code}
Here is the command: mvn clean test help:active-profiles -X -DskipTests -Dhadoop.profile=2.0
[pom] add jettison and fix netty specification -- Key: HBASE-6421 URL: https://issues.apache.org/jira/browse/HBASE-6421 Project: HBase Issue Type: Bug Reporter: Jesse Yates Assignee: Jesse Yates Attachments: hbase-6421-v0.patch Currently, jettison isn't required for testing hbase-server, but TestSchemaConfigured requires it, causing the compile phase (at least on my MBP) to fail. Further, in cleaning up the poms, netty should be declared in the parent hbase/pom.xml and then inherited in the subclass.
[jira] [Commented] (HBASE-6396) Fix NoSuchMethodError running against hadoop 2.0
[ https://issues.apache.org/jira/browse/HBASE-6396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13417673#comment-13417673 ] Andrew Purtell commented on HBASE-6396: --- We need to control this special casing / version combinatorics. I do not think we should officially support any Hadoop other than 1.0 and 2.0 in our project build and test procedures. Shops running specialty versions can fix up our artifacts as needed, IMO. Fix NoSuchMethodError running against hadoop 2.0 Key: HBASE-6396 URL: https://issues.apache.org/jira/browse/HBASE-6396 Project: HBase Issue Type: Bug Affects Versions: 0.94.0 Reporter: Zhihong Ted Yu Assignee: Zhihong Ted Yu Labels: hadoop-2.0 Fix For: 0.96.0 Attachments: 6396-v2.txt HADOOP-8350 changed the signature of NetUtils.getInputStream(). This leads to NoSuchMethodError in HBaseClient$Connection.setupIOstreams(). See https://issues.apache.org/jira/browse/HADOOP-8350?focusedCommentId=13414276&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13414276
[jira] [Commented] (HBASE-6421) [pom] add jettison and fix netty specification
[ https://issues.apache.org/jira/browse/HBASE-6421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13417677#comment-13417677 ] Jesse Yates commented on HBASE-6421: hmmm, odd. I'll put a new version up. I think this got munged with my addendum to HBASE-6405. [pom] add jettison and fix netty specification -- Key: HBASE-6421 URL: https://issues.apache.org/jira/browse/HBASE-6421 Project: HBase Issue Type: Bug Reporter: Jesse Yates Assignee: Jesse Yates Attachments: hbase-6421-v0.patch Currently, jettison isn't required for testing hbase-server, but TestSchemaConfigured requires it, causing the compile phase (at least on my MBP) to fail. Further, in cleaning up the poms, netty should be declared in the parent hbase/pom.xml and then inherited in the subclass. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4470) ServerNotRunningException coming out of assignRootAndMeta kills the Master
[ https://issues.apache.org/jira/browse/HBASE-4470?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13417679#comment-13417679 ] stack commented on HBASE-4470: -- You want to make patches for the other branches G and I'll apply them all at once or you want to make new issues to do that and have this applied to 0.90 so Jon can do his 0.90.7? ServerNotRunningException coming out of assignRootAndMeta kills the Master -- Key: HBASE-4470 URL: https://issues.apache.org/jira/browse/HBASE-4470 Project: HBase Issue Type: Bug Affects Versions: 0.90.4 Reporter: Jean-Daniel Cryans Assignee: Gregory Chanan Priority: Critical Fix For: 0.90.7 Attachments: HBASE-4470-90.patch I'm surprised we still have issues like that and I didn't get a hit while googling so forgive me if there's already a jira about it. When the master starts it verifies the locations of root and meta before assigning them, if the server is started but not running you'll get this: {quote} 2011-09-23 04:47:44,859 WARN org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation: RemoteException connecting to RS org.apache.hadoop.ipc.RemoteException: org.apache.hadoop.hbase.ipc.ServerNotRunningException: Server is not running yet at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1038) at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:771) at org.apache.hadoop.hbase.ipc.HBaseRPC$Invoker.invoke(HBaseRPC.java:257) at $Proxy6.getProtocolVersion(Unknown Source) at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:419) at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:393) at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:444) at org.apache.hadoop.hbase.ipc.HBaseRPC.waitForProxy(HBaseRPC.java:349) at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getHRegionConnection(HConnectionManager.java:969) at 
org.apache.hadoop.hbase.catalog.CatalogTracker.getCachedConnection(CatalogTracker.java:388) at org.apache.hadoop.hbase.catalog.CatalogTracker.getMetaServerConnection(CatalogTracker.java:287) at org.apache.hadoop.hbase.catalog.CatalogTracker.verifyMetaRegionLocation(CatalogTracker.java:484) at org.apache.hadoop.hbase.master.HMaster.assignRootAndMeta(HMaster.java:441) at org.apache.hadoop.hbase.master.HMaster.finishInitialization(HMaster.java:388) at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:282) {quote} I hit that 3-4 times this week while debugging something else. The worst is that when you restart the master it sees that as a failover, but none of the regions are assigned so it takes an eternity to get back fully online. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4470) ServerNotRunningException coming out of assignRootAndMeta kills the Master
[ https://issues.apache.org/jira/browse/HBASE-4470?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13417683#comment-13417683 ] Gregory Chanan commented on HBASE-4470: --- I'll make the patches for the other branches and fix the tabs...sorry about that. Greg ServerNotRunningException coming out of assignRootAndMeta kills the Master -- Key: HBASE-4470 URL: https://issues.apache.org/jira/browse/HBASE-4470 Project: HBase Issue Type: Bug Affects Versions: 0.90.4 Reporter: Jean-Daniel Cryans Assignee: Gregory Chanan Priority: Critical Fix For: 0.90.7 Attachments: HBASE-4470-90.patch I'm surprised we still have issues like that and I didn't get a hit while googling so forgive me if there's already a jira about it. When the master starts it verifies the locations of root and meta before assigning them, if the server is started but not running you'll get this: {quote} 2011-09-23 04:47:44,859 WARN org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation: RemoteException connecting to RS org.apache.hadoop.ipc.RemoteException: org.apache.hadoop.hbase.ipc.ServerNotRunningException: Server is not running yet at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1038) at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:771) at org.apache.hadoop.hbase.ipc.HBaseRPC$Invoker.invoke(HBaseRPC.java:257) at $Proxy6.getProtocolVersion(Unknown Source) at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:419) at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:393) at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:444) at org.apache.hadoop.hbase.ipc.HBaseRPC.waitForProxy(HBaseRPC.java:349) at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getHRegionConnection(HConnectionManager.java:969) at org.apache.hadoop.hbase.catalog.CatalogTracker.getCachedConnection(CatalogTracker.java:388) at 
org.apache.hadoop.hbase.catalog.CatalogTracker.getMetaServerConnection(CatalogTracker.java:287) at org.apache.hadoop.hbase.catalog.CatalogTracker.verifyMetaRegionLocation(CatalogTracker.java:484) at org.apache.hadoop.hbase.master.HMaster.assignRootAndMeta(HMaster.java:441) at org.apache.hadoop.hbase.master.HMaster.finishInitialization(HMaster.java:388) at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:282) {quote} I hit that 3-4 times this week while debugging something else. The worst is that when you restart the master it sees that as a failover, but none of the regions are assigned so it takes an eternity to get back fully online. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
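One way to keep a "server is started but not running yet" response from propagating out of the startup path is to treat it as "location not verified yet" and retry. The sketch below only illustrates that idea and is not taken from the attached patch: `ServerProbe`, `verifyWithRetries`, and the local `ServerNotRunningException` are hypothetical stand-ins (the real exception is `org.apache.hadoop.hbase.ipc.ServerNotRunningException`, as the stack trace shows).

```java
/** Illustrative retry loop; all types here are stand-ins, not HBase classes. */
final class CatalogVerifySketch {

  /** Hypothetical stand-in for org.apache.hadoop.hbase.ipc.ServerNotRunningException. */
  static final class ServerNotRunningException extends Exception {
    ServerNotRunningException(String msg) {
      super(msg);
    }
  }

  /** Hypothetical probe that contacts the server hosting a catalog region. */
  interface ServerProbe {
    void check() throws ServerNotRunningException;
  }

  /** Returns true once the probe succeeds, false if the server never became ready. */
  static boolean verifyWithRetries(ServerProbe probe, int maxAttempts, long pauseMs) {
    for (int attempt = 1; attempt <= maxAttempts; attempt++) {
      try {
        probe.check();
        return true;
      } catch (ServerNotRunningException e) {
        // Not fatal: the server exists but has not finished starting.
        if (attempt == maxAttempts) {
          return false;
        }
        try {
          Thread.sleep(pauseMs);
        } catch (InterruptedException ie) {
          Thread.currentThread().interrupt();
          return false;
        }
      }
    }
    return false;
  }

  public static void main(String[] args) {
    int[] calls = {0};
    boolean ok = verifyWithRetries(() -> {
      if (++calls[0] < 3) {
        throw new ServerNotRunningException("Server is not running yet");
      }
    }, 5, 10L);
    System.out.println(ok + " after " + calls[0] + " attempts");
  }
}
```

Returning false instead of throwing leaves the decision to the caller, which could then fall back to reassigning the catalog region rather than aborting.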
[jira] [Updated] (HBASE-6392) UnknownRegionException blocks hbck from sideline big overlap regions
[ https://issues.apache.org/jira/browse/HBASE-6392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jimmy Xiang updated HBASE-6392: --- Status: Open (was: Patch Available) UnknownRegionException blocks hbck from sideline big overlap regions Key: HBASE-6392 URL: https://issues.apache.org/jira/browse/HBASE-6392 Project: HBase Issue Type: Bug Reporter: Jimmy Xiang Assignee: Jimmy Xiang Attachments: 6392-trunk.patch, 6392-trunk_v2.patch Before sidelining a big overlap region, hbck tries to close it and offline it at first. However, sometimes, it throws NotServingRegion or UnknownRegionException. It could be because the region is not open/assigned at all, or some other issue. We should figure out why and fix it. By the way, it's better to print out in the log the command line to bulk load back sidelined regions, if any. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-6392) UnknownRegionException blocks hbck from sideline big overlap regions
[ https://issues.apache.org/jira/browse/HBASE-6392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jimmy Xiang updated HBASE-6392: --- Attachment: 6392-trunk_v2.patch UnknownRegionException blocks hbck from sideline big overlap regions Key: HBASE-6392 URL: https://issues.apache.org/jira/browse/HBASE-6392 Project: HBase Issue Type: Bug Reporter: Jimmy Xiang Assignee: Jimmy Xiang Attachments: 6392-trunk.patch, 6392-trunk_v2.patch Before sidelining a big overlap region, hbck tries to close it and offline it at first. However, sometimes, it throws NotServingRegion or UnknownRegionException. It could be because the region is not open/assigned at all, or some other issue. We should figure out why and fix it. By the way, it's better to print out in the log the command line to bulk load back sidelined regions, if any. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-6392) UnknownRegionException blocks hbck from sideline big overlap regions
[ https://issues.apache.org/jira/browse/HBASE-6392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jimmy Xiang updated HBASE-6392: --- Status: Patch Available (was: Open) Added a unit test case to cover that closeRegion throws NotServingRegionException. UnknownRegionException blocks hbck from sideline big overlap regions Key: HBASE-6392 URL: https://issues.apache.org/jira/browse/HBASE-6392 Project: HBase Issue Type: Bug Reporter: Jimmy Xiang Assignee: Jimmy Xiang Attachments: 6392-trunk.patch, 6392-trunk_v2.patch Before sidelining a big overlap region, hbck tries to close it and offline it at first. However, sometimes, it throws NotServingRegion or UnknownRegionException. It could be because the region is not open/assigned at all, or some other issue. We should figure out why and fix it. By the way, it's better to print out in the log the command line to bulk load back sidelined regions, if any. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
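The close-then-offline step in the description can be made tolerant of a region that is not open or not known, so that sidelining is not blocked. This is a minimal sketch under that assumption, not the code in 6392-trunk_v2.patch: `RegionOps`, `closeAndOffline`, and both exception classes are hypothetical stand-ins defined only for this example.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative only: tolerate "region not open / not known" before sidelining. */
final class SidelinePrepSketch {

  /** Hypothetical stand-ins for the exceptions named in the issue. */
  static final class NotServingRegionException extends Exception {}

  static final class UnknownRegionException extends Exception {}

  /** Hypothetical view of the two admin operations hbck performs first. */
  interface RegionOps {
    void closeRegion(String regionName) throws NotServingRegionException;

    void offline(String regionName) throws UnknownRegionException;
  }

  /** Attempts both steps and returns warnings instead of aborting the sideline. */
  static List<String> closeAndOffline(RegionOps ops, String regionName) {
    List<String> warnings = new ArrayList<>();
    try {
      ops.closeRegion(regionName);
    } catch (NotServingRegionException e) {
      // Assumption: a region nobody serves has nothing left to close.
      warnings.add("close skipped, region not being served: " + regionName);
    }
    try {
      ops.offline(regionName);
    } catch (UnknownRegionException e) {
      // Assumption: a region the master does not know is already effectively offline.
      warnings.add("offline skipped, region unknown: " + regionName);
    }
    return warnings;
  }
}
```

Collecting warnings rather than swallowing the exceptions keeps the underlying cause visible in the hbck log, which matters because the issue still asks why the exceptions occur.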
[jira] [Commented] (HBASE-6389) Modify the conditions to ensure that Master waits for sufficient number of Region Servers before starting region assignments
[ https://issues.apache.org/jira/browse/HBASE-6389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13417697#comment-13417697 ] Zhihong Ted Yu commented on HBASE-6389: --- I tried to see why TestZooKeeper hung strangely: {code} 2012-07-18 14:05:59,533 DEBUG [pool-57-thread-1] zookeeper.ZKUtil(1142): master:52861-0x1389be8bd6e-0x1389be8bd6e000a-0x1389be8bd6e000b Retrieved 39 byte(s) of data from znode /hbase/root-region-server and set watcher; X.ebay.com,44052,1342645522433 2012-07-18 14:05:59,533 WARN [pool-52-thread-1] zookeeper.RecoverableZooKeeper(218): Possibly transient ZooKeeper exception: org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /hbase/root-region-server 2012-07-18 14:05:59,533 INFO [pool-52-thread-1] util.RetryCounter(55): Sleeping 2000ms before retry #1... 2012-07-18 14:05:59,536 INFO [main] ipc.HBaseRpcMetrics(66): Initializing RPC Metrics with hostName=MiniHBaseCluster$MiniHBaseClusterRegionServer, port=44030 2012-07-18 14:05:59,537 INFO [Master:0;X.ebay.com,52861,1342645522110] master.HMaster(455): HMaster main thread exiting {code} Basically the test hung in setup(). I then traced where TestZooKeeper stopped showing up in test result and this was the first URL giving me 404 error: https://builds.apache.org/view/G-L/view/HBase/job/HBase-TRUNK/3126/testReport/org.apache.hadoop.hbase/TestZooKeeper/ That was when this patch went in. Modify the conditions to ensure that Master waits for sufficient number of Region Servers before starting region assignments Key: HBASE-6389 URL: https://issues.apache.org/jira/browse/HBASE-6389 Project: HBase Issue Type: Bug Components: master Affects Versions: 0.94.0, 0.96.0 Reporter: Aditya Kishore Assignee: Aditya Kishore Priority: Critical Fix For: 0.96.0, 0.94.1 Attachments: HBASE-6389_trunk.patch, HBASE-6389_trunk.patch Continuing from HBASE-6375. 
It seems I was mistaken in my assumption that changing the value of hbase.master.wait.on.regionservers.mintostart to a sufficient number (from the default of 1) can help prevent assignment of all regions to one (or a small number of) region server(s). While this was the case in 0.90.x and 0.92.x, the behavior has changed from 0.94.0 onwards to address HBASE-4993. From 0.94.0 onwards, the Master will proceed immediately after the timeout has lapsed, even if hbase.master.wait.on.regionservers.mintostart has not been reached. Reading the current conditions of waitForRegionServers() clarifies it:
{code:title=ServerManager.java (trunk rev:1360470)}
  /**
   * Wait for the region servers to report in.
   * We will wait until one of this condition is met:
   *  - the master is stopped
   *  - the 'hbase.master.wait.on.regionservers.timeout' is reached
   *  - the 'hbase.master.wait.on.regionservers.maxtostart' number of
   *    region servers is reached
   *  - the 'hbase.master.wait.on.regionservers.mintostart' is reached AND
   *   there have been no new region server in for
   *      'hbase.master.wait.on.regionservers.interval' time
   *
   * @throws InterruptedException
   */
  public void waitForRegionServers(MonitoredTask status)
  throws InterruptedException {
..
    while (
      !this.master.isStopped() &&
        slept < timeout &&
        count < maxToStart &&
        (lastCountChange+interval > now || count < minToStart)
      ){
{code}
So with the current conditions, the wait will end as soon as the timeout is reached, even if fewer RSes have checked in with the Master, and the master will proceed with the region assignment among these RSes alone. As mentioned in -[HBASE-4993|https://issues.apache.org/jira/browse/HBASE-4993?focusedCommentId=13237196#comment-13237196]-, and I concur, this could have a disastrous effect in a large cluster, especially now that MSLAB is turned on.
To enforce the required quorum as specified by hbase.master.wait.on.regionservers.mintostart irrespective of timeout, these conditions need to be modified as following {code:title=ServerManager.java} .. /** * Wait for the region servers to report in. * We will wait until one of this condition is met: * - the master is stopped * - the 'hbase.master.wait.on.regionservers.maxtostart' number of *region servers is reached * - the 'hbase.master.wait.on.regionservers.mintostart' is reached AND * there have been no new region server in for * 'hbase.master.wait.on.regionservers.interval'
[jira] [Commented] (HBASE-6272) In-memory region state is inconsistent
[ https://issues.apache.org/jira/browse/HBASE-6272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13417701#comment-13417701 ] Jimmy Xiang commented on HBASE-6272: I changed the patch a little bit and posted it on RB: https://reviews.apache.org/r/5717/ In-memory region state is inconsistent -- Key: HBASE-6272 URL: https://issues.apache.org/jira/browse/HBASE-6272 Project: HBase Issue Type: Bug Reporter: Jimmy Xiang Assignee: Jimmy Xiang AssignmentManager stores region state related information in several places: regionsInTransition, regions (region info to server name map), and servers (server name to region info set map). However, access to these places is not coordinated properly, which leads to inconsistent in-memory region state information. Sometimes a region can even be offline and not in transition.
[jira] [Reopened] (HBASE-6389) Modify the conditions to ensure that Master waits for sufficient number of Region Servers before starting region assignments
[ https://issues.apache.org/jira/browse/HBASE-6389?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhihong Ted Yu reopened HBASE-6389: --- After reverting the patch, the test passed smoothly:
{code}
Running org.apache.hadoop.hbase.TestZooKeeper
Tests run: 11, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 48.678 sec

Results :

Tests run: 11, Failures: 0, Errors: 0, Skipped: 0

[INFO]
[INFO] --- maven-surefire-plugin:2.10:test (secondPartTestsExecution) @ hbase-server ---
[INFO] Tests are skipped.
[INFO]
[INFO] BUILD SUCCESS
[INFO]
[INFO] Total time: 58.563s
{code}
Modify the conditions to ensure that Master waits for sufficient number of Region Servers before starting region assignments Key: HBASE-6389 URL: https://issues.apache.org/jira/browse/HBASE-6389 Project: HBase Issue Type: Bug Components: master Affects Versions: 0.94.0, 0.96.0 Reporter: Aditya Kishore Assignee: Aditya Kishore Priority: Critical Fix For: 0.96.0, 0.94.1 Attachments: HBASE-6389_trunk.patch, HBASE-6389_trunk.patch
Continuing from HBASE-6375. It seems I was mistaken in my assumption that changing the value of hbase.master.wait.on.regionservers.mintostart to a sufficient number (from the default of 1) can help prevent assignment of all regions to one (or a small number of) region server(s). While this was the case in 0.90.x and 0.92.x, the behavior has changed from 0.94.0 onwards to address HBASE-4993. From 0.94.0 onwards, the Master will proceed immediately after the timeout has lapsed, even if hbase.master.wait.on.regionservers.mintostart has not been reached. Reading the current conditions of waitForRegionServers() clarifies it:
{code:title=ServerManager.java (trunk rev:1360470)}
  /**
   * Wait for the region servers to report in.
   * We will wait until one of this condition is met:
   *  - the master is stopped
   *  - the 'hbase.master.wait.on.regionservers.timeout' is reached
   *  - the 'hbase.master.wait.on.regionservers.maxtostart' number of
   *    region servers is reached
   *  - the 'hbase.master.wait.on.regionservers.mintostart' is reached AND
   *   there have been no new region server in for
   *      'hbase.master.wait.on.regionservers.interval' time
   *
   * @throws InterruptedException
   */
  public void waitForRegionServers(MonitoredTask status)
  throws InterruptedException {
..
    while (
      !this.master.isStopped() &&
        slept < timeout &&
        count < maxToStart &&
        (lastCountChange+interval > now || count < minToStart)
      ){
{code}
So with the current conditions, the wait will end as soon as the timeout is reached, even if fewer RSes have checked in with the Master, and the master will proceed with the region assignment among these RSes alone. As mentioned in -[HBASE-4993|https://issues.apache.org/jira/browse/HBASE-4993?focusedCommentId=13237196#comment-13237196]-, and I concur, this could have a disastrous effect in a large cluster, especially now that MSLAB is turned on. To enforce the required quorum as specified by hbase.master.wait.on.regionservers.mintostart irrespective of the timeout, these conditions need to be modified as follows:
{code:title=ServerManager.java}
..
  /**
   * Wait for the region servers to report in.
   * We will wait until one of this condition is met:
   *  - the master is stopped
   *  - the 'hbase.master.wait.on.regionservers.maxtostart' number of
   *    region servers is reached
   *  - the 'hbase.master.wait.on.regionservers.mintostart' is reached AND
   *   there have been no new region server in for
   *      'hbase.master.wait.on.regionservers.interval' time AND
   *   the 'hbase.master.wait.on.regionservers.timeout' is reached
   *
   * @throws InterruptedException
   */
  public void waitForRegionServers(MonitoredTask status)
..
..
    int minToStart = this.master.getConfiguration().
        getInt("hbase.master.wait.on.regionservers.mintostart", 1);
    int maxToStart = this.master.getConfiguration().
        getInt("hbase.master.wait.on.regionservers.maxtostart", Integer.MAX_VALUE);
    if (maxToStart < minToStart) {
      maxToStart = minToStart;
    }
..
..
    while (
      !this.master.isStopped() &&
        count < maxToStart &&
        (lastCountChange+interval > now || timeout > slept || count < minToStart)
      ){
..
{code}
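The difference between the two wait conditions in the HBASE-6389 description can be checked in isolation by writing each as a pure predicate. This is only a sketch: the class and method names are made up for the example, the comparison operators are reconstructed from the Javadoc quoted in the description, and the numeric values in `main` are arbitrary example inputs.

```java
/** Sketch of the two loop conditions from the HBASE-6389 description; not the real ServerManager. */
final class WaitConditionSketch {

  /** Condition described for 0.94.0 onwards: the timeout alone is enough to end the wait. */
  static boolean keepWaitingCurrent(boolean stopped, long slept, long timeout, int count,
      int minToStart, int maxToStart, long lastCountChange, long interval, long now) {
    return !stopped
        && slept < timeout
        && count < maxToStart
        && (lastCountChange + interval > now || count < minToStart);
  }

  /** Proposed condition: the timeout only ends the wait once minToStart has been met. */
  static boolean keepWaitingProposed(boolean stopped, long slept, long timeout, int count,
      int minToStart, int maxToStart, long lastCountChange, long interval, long now) {
    return !stopped
        && count < maxToStart
        && (lastCountChange + interval > now || timeout > slept || count < minToStart);
  }

  public static void main(String[] args) {
    // The timeout has elapsed, but only 1 of the 3 required region servers has checked in.
    boolean current =
        keepWaitingCurrent(false, 5000, 4500, 1, 3, Integer.MAX_VALUE, 0, 1500, 5000);
    boolean proposed =
        keepWaitingProposed(false, 5000, 4500, 1, 3, Integer.MAX_VALUE, 0, 1500, 5000);
    System.out.println("current keeps waiting: " + current);
    System.out.println("proposed keeps waiting: " + proposed);
  }
}
```

With those inputs the current condition stops waiting while the proposed one continues, which is the behavior change the description asks for. Note that the proposed form can wait indefinitely if fewer than minToStart servers ever report in, unless the master is stopped.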
[jira] [Updated] (HBASE-6420) Gracefully shutdown logsyncer
[ https://issues.apache.org/jira/browse/HBASE-6420?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jimmy Xiang updated HBASE-6420: --- Status: Patch Available (was: Open) Gracefully shutdown logsyncer - Key: HBASE-6420 URL: https://issues.apache.org/jira/browse/HBASE-6420 Project: HBase Issue Type: Bug Components: wal Reporter: Jimmy Xiang Assignee: Jimmy Xiang Attachments: 6420-trunk.patch Currently, when closing an HLog, logSyncerThread is interrupted. logSyncer could be in the middle of syncing the writer. We should avoid interrupting the sync.
[jira] [Updated] (HBASE-6420) Gracefully shutdown logsyncer
[ https://issues.apache.org/jira/browse/HBASE-6420?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jimmy Xiang updated HBASE-6420: --- Attachment: 6420-trunk.patch Gracefully shutdown logsyncer - Key: HBASE-6420 URL: https://issues.apache.org/jira/browse/HBASE-6420 Project: HBase Issue Type: Bug Components: wal Reporter: Jimmy Xiang Assignee: Jimmy Xiang Attachments: 6420-trunk.patch Currently, when closing an HLog, logSyncerThread is interrupted. logSyncer could be in the middle of syncing the writer. We should avoid interrupting the sync.
[jira] [Assigned] (HBASE-6420) Gracefully shutdown logsyncer
[ https://issues.apache.org/jira/browse/HBASE-6420?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jimmy Xiang reassigned HBASE-6420: -- Assignee: Jimmy Xiang Gracefully shutdown logsyncer - Key: HBASE-6420 URL: https://issues.apache.org/jira/browse/HBASE-6420 Project: HBase Issue Type: Bug Components: wal Reporter: Jimmy Xiang Assignee: Jimmy Xiang Attachments: 6420-trunk.patch Currently, when closing an HLog, logSyncerThread is interrupted. logSyncer could be in the middle of syncing the writer. We should avoid interrupting the sync.
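A common way to stop a periodic worker without interrupting it mid-operation is to signal it through a flag and a monitor, so it only exits between syncs. The sketch below shows that general pattern and is not the content of 6420-trunk.patch: `LogSyncerSketch`, `syncAction`, and `demo` are hypothetical names, and the sleeping lambda merely simulates a sync.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Illustrative periodic syncer that is stopped by a flag instead of Thread.interrupt(). */
final class LogSyncerSketch implements Runnable {
  private final Object lock = new Object();
  private boolean closing = false; // guarded by lock
  private final long intervalMs;
  private final Runnable syncAction;

  LogSyncerSketch(long intervalMs, Runnable syncAction) {
    this.intervalMs = intervalMs;
    this.syncAction = syncAction;
  }

  @Override
  public void run() {
    while (true) {
      synchronized (lock) {
        if (!closing) {
          try {
            lock.wait(intervalMs); // pause between syncs; close() wakes us early
          } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return;
          }
        }
        if (closing) {
          return; // exit only between syncs
        }
      }
      syncAction.run(); // runs outside the lock; close() never interrupts it
    }
  }

  /** Asks the syncer to stop; an in-flight sync is allowed to finish first. */
  void close() {
    synchronized (lock) {
      closing = true;
      lock.notifyAll();
    }
  }

  /** Demo: true if the thread stopped and every sync that started also finished. */
  static boolean demo() {
    AtomicInteger started = new AtomicInteger();
    AtomicInteger finished = new AtomicInteger();
    LogSyncerSketch syncer = new LogSyncerSketch(5, () -> {
      started.incrementAndGet();
      try {
        Thread.sleep(20); // pretend the sync takes a while
      } catch (InterruptedException e) {
        return; // an interrupted sync would never count as finished
      }
      finished.incrementAndGet();
    });
    Thread t = new Thread(syncer, "logSyncer-sketch");
    t.start();
    try {
      Thread.sleep(60);
      syncer.close();
      t.join(2000);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return false;
    }
    return !t.isAlive() && started.get() == finished.get();
  }
}
```

The caller pairs `close()` with `join()`, as `demo()` does, so that the writer is only closed after the last sync has returned.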
[jira] [Updated] (HBASE-6373) Add more context information to audit log messages
[ https://issues.apache.org/jira/browse/HBASE-6373?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin updated HBASE-6373: -- Affects Version/s: 0.96.0 Status: Patch Available (was: Open) Add more context information to audit log messages -- Key: HBASE-6373 URL: https://issues.apache.org/jira/browse/HBASE-6373 Project: HBase Issue Type: Improvement Components: security Affects Versions: 0.96.0 Reporter: Marcelo Vanzin Priority: Minor Attachments: accesscontroller.patch, accesscontroller.patch The attached patch adds more information to the audit log messages; namely, it includes the IP address where the request originated, if it's available. The patch is against trunk, but I've tested it against the 0.92 branch. I didn't find any unit test for this code, please let me know if I missed something. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira