[jira] [Updated] (HBASE-4379) [hbck] Does not complain about tables with no end region [Z,]
[ https://issues.apache.org/jira/browse/HBASE-4379?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anoop Sam John updated HBASE-4379: -- Status: Open (was: Patch Available) [hbck] Does not complain about tables with no end region [Z,] - Key: HBASE-4379 URL: https://issues.apache.org/jira/browse/HBASE-4379 Project: HBase Issue Type: Bug Components: hbck Affects Versions: 0.92.0, 0.90.5 Reporter: Jonathan Hsieh Assignee: Anoop Sam John Attachments: 0001-HBASE-4379-hbck-does-not-complain-about-tables-with-.patch, HBASE-4379_94.patch, hbase-4379.v2.patch hbck does not detect or have an error condition when the last region of a table is missing (end key != ''). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6267) hbase.store.delete.expired.storefile should be true by default
[ https://issues.apache.org/jira/browse/HBASE-6267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13401194#comment-13401194 ] Hudson commented on HBASE-6267: --- Integrated in HBase-0.94 #281 (See [https://builds.apache.org/job/HBase-0.94/281/]) HBASE-6267. hbase.store.delete.expired.storefile should be true by default (Revision 1353813) Result = FAILURE apurtell : Files :
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java
* /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/io/hfile/TestScannerSelectionUsingTTL.java
hbase.store.delete.expired.storefile should be true by default -- Key: HBASE-6267 URL: https://issues.apache.org/jira/browse/HBASE-6267 Project: HBase Issue Type: Improvement Components: regionserver Affects Versions: 0.96.0, 0.94.1 Reporter: Andrew Purtell Assignee: Andrew Purtell Fix For: 0.96.0, 0.94.1 Attachments: HBASE-6267-0.94.patch, HBASE-6267.patch HBASE-5199 introduces this logic into Store:
{code}
+    // Delete the expired store files before the compaction selection.
+    if (conf.getBoolean("hbase.store.delete.expired.storefile", false)
+        && (ttl != Long.MAX_VALUE) && (this.scanInfo.minVersions == 0)) {
+      CompactSelection expiredSelection = compactSelection
+          .selectExpiredStoreFilesToCompact(
+              EnvironmentEdgeManager.currentTimeMillis() - this.ttl);
+
+      // If there are any expired store files, delete them by compaction.
+      if (expiredSelection != null) {
+        return expiredSelection;
+      }
+    }
{code}
Is there any reason why that should not be default {{true}}?
[jira] [Updated] (HBASE-4379) [hbck] Does not complain about tables with no end region [Z,]
[ https://issues.apache.org/jira/browse/HBASE-4379?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anoop Sam John updated HBASE-4379: -- Attachment: HBASE-4379_94_V2.patch HBASE-4379_Trunk.patch
[jira] [Commented] (HBASE-6200) KeyComparator.compareWithoutRow can be wrong when families have the same prefix
[ https://issues.apache.org/jira/browse/HBASE-6200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13401195#comment-13401195 ] Jieshan Bean commented on HBASE-6200: - Ya... You reminded me. I have one idea to optimize this: 1. If the left family length != the right family length, comparing the column families alone is enough. 2. If the left family length == the right family length, we can compare family and qualifier together. So no matter which case, only one comparison will happen. I will test it right now. KeyComparator.compareWithoutRow can be wrong when families have the same prefix --- Key: HBASE-6200 URL: https://issues.apache.org/jira/browse/HBASE-6200 Project: HBase Issue Type: Bug Affects Versions: 0.90.6, 0.92.1, 0.94.0 Reporter: Jean-Daniel Cryans Assignee: Jieshan Bean Priority: Blocker Fix For: 0.90.7, 0.92.2, 0.96.0, 0.94.1 Attachments: 6200-trunk-v2.patch, HBASE-6200-90-v2.patch, HBASE-6200-90.patch, HBASE-6200-92-v2.patch, HBASE-6200-92.patch, HBASE-6200-94-v2.patch, HBASE-6200-94.patch, HBASE-6200-trunk-v2.patch, HBASE-6200-trunk.patch, PerformanceTestCase-6200-94.patch As reported by Desert Rose on IRC and on the ML, {{Result}} has a weird behavior when some families share the same prefix. He posted a link to his code to show how it fails, http://pastebin.com/7TBA1XGh Basically {{KeyComparator.compareWithoutRow}} doesn't differentiate families and qualifiers, so f:a is said to be bigger than f1:, which is false. Then what happens is that the KVs are returned in the right order from the RS, but {{Result.binarySearch}} uses {{KeyComparator.compareWithoutRow}}, which has a different sorting, so the end result is undetermined. I added some debug and I can see that the data is returned in the right order, but {{Arrays.binarySearch}} returned the wrong KV, which is then verified against the passed family and qualifier, which fails, so null is returned.
I don't know how frequent it is for users to have families with the same prefix, but those that do, and that use those families at the same time, will have big correctness issues. This is why I mark this as a blocker.
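The misordering described above can be reproduced with plain byte comparison. Below is a minimal standalone sketch, not HBase code; the `compare` helper mimics an unsigned lexicographic comparison like `Bytes.compareTo`:

```java
/**
 * Sketch of why comparing the column (family + qualifier) as one
 * concatenated byte run misorders families sharing a prefix:
 * "f:a" concatenates to "fa", "f1:" concatenates to "f1",
 * and 'a' (0x61) > '1' (0x31), even though family "f" sorts
 * before family "f1".
 */
public class ColumnOrderSketch {

    // Unsigned lexicographic byte comparison, similar to Bytes.compareTo.
    static int compare(byte[] left, byte[] right) {
        int n = Math.min(left.length, right.length);
        for (int i = 0; i < n; i++) {
            int diff = (left[i] & 0xff) - (right[i] & 0xff);
            if (diff != 0) return diff;
        }
        return left.length - right.length;
    }

    public static void main(String[] args) {
        byte[] concatFA = "fa".getBytes(); // family "f" + qualifier "a"
        byte[] concatF1 = "f1".getBytes(); // family "f1" + empty qualifier
        // Concatenated comparison claims f:a > f1: -- the reported bug.
        System.out.println(compare(concatFA, concatF1) > 0); // true
        // Comparing families first gives the correct order: f < f1.
        System.out.println(compare("f".getBytes(), "f1".getBytes()) < 0); // true
    }
}
```

This also shows why Jieshan's optimization is safe: when family lengths differ, the families alone already decide the order, so only one comparison is needed either way.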
[jira] [Commented] (HBASE-6233) [brainstorm] snapshots: hardlink alternatives
[ https://issues.apache.org/jira/browse/HBASE-6233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13401198#comment-13401198 ] Matteo Bertozzi commented on HBASE-6233: @Jon Yes, on taking a snapshot you rename the hfile into the .snapshot/files directory and replace it with a symlink. You also need to create a symlink in the .snapshot/name/ folder (the one that describes the snapshot). When you want to restore, you just have to create a symlink to the file. I see two advantages to this approach. One is that the code remains unchanged: fs.delete() stays fs.delete() (all the symlink code is done in takeSnapshot() and nothing changes from the hbase point of view). The other one is: * hbase 0.96 ships with snapshots (hardlink alternative) * hbase 0.98 ships with snapshots + hdfs hardlinks. If you use the approach I've described, a user who has taken snapshots using 0.96 doesn't have to do anything special to migrate to 0.98: symlinks to .snapshot/files/ keep working, the future 'take snapshot' just creates a hardlink in .snapshot/name/, and restore is another hardlink against .snapshot/name. In the other case (catch the exception and retry) you need to keep the logic in 0.98 or write some fancy script that searches for the Reference files and replaces them with hardlinks. [brainstorm] snapshots: hardlink alternatives - Key: HBASE-6233 URL: https://issues.apache.org/jira/browse/HBASE-6233 Project: HBase Issue Type: Brainstorming Reporter: Matteo Bertozzi Assignee: Matteo Bertozzi Discussion ticket around snapshots and hardlink alternatives.
(See the HDFS-3370 discussion about hardlinks and implementation problems.) (Taking WAL out of the discussion for a moment and focusing on hfiles.) With hardlinks available, taking a snapshot would be fairly easy: * (hfiles are immutable) * hardlink to .snapshot/name to take a snapshot * hardlink from .snapshot/name to restore the snapshot * No code change needed (on fs.delete() only one reference is deleted) But we don't have hardlinks, so what are the alternatives?
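The rename-plus-symlink scheme Matteo describes can be illustrated on a local filesystem. The sketch below is a hypothetical analogue using java.nio with made-up paths (HDFS symlinks would go through different APIs, and this is not the actual HBase layout): taking a snapshot moves the immutable hfile under .snapshot/files/ and leaves symlinks behind, so a later delete of the table's link does not destroy the snapshotted data.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;

public class SnapshotLinkSketch {

    static void takeSnapshot(Path hfile, Path snapshotRoot, String name) throws IOException {
        Path files = Files.createDirectories(snapshotRoot.resolve("files"));
        Path snapDir = Files.createDirectories(snapshotRoot.resolve(name));
        Path stored = files.resolve(hfile.getFileName().toString());
        Files.move(hfile, stored);               // real bytes now live in .snapshot/files/
        Files.createSymbolicLink(hfile, stored); // table keeps working through a link
        // snapshot "manifest" is just another link against the stored file
        Files.createSymbolicLink(snapDir.resolve(hfile.getFileName().toString()), stored);
    }

    static void restore(Path snapshotRoot, String name, Path dest) throws IOException {
        // Restoring is just creating another symlink per snapshotted file.
        try (DirectoryStream<Path> links = Files.newDirectoryStream(snapshotRoot.resolve(name))) {
            for (Path link : links) {
                Path stored = Files.readSymbolicLink(link);
                Path target = dest.resolve(link.getFileName().toString());
                Files.deleteIfExists(target);
                Files.createSymbolicLink(target, stored);
            }
        }
    }

    public static void main(String[] args) throws IOException {
        Path root = Files.createTempDirectory("snapsketch");
        Path table = Files.createDirectories(root.resolve("table"));
        Path hfile = Files.write(table.resolve("hfile1"), new byte[]{1, 2, 3});
        takeSnapshot(hfile, Files.createDirectories(root.resolve(".snapshot")), "snap1");
        Files.delete(hfile); // "drop" deletes only the table's link, not the data
        restore(root.resolve(".snapshot"), "snap1", table);
        System.out.println(Files.readAllBytes(table.resolve("hfile1")).length); // 3
    }
}
```

The key property this illustrates is the one Matteo points out: fs.delete() on the table's copy stays a plain delete, because the link, not the data, is removed.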
[jira] [Commented] (HBASE-6233) [brainstorm] snapshots: hardlink alternatives
[ https://issues.apache.org/jira/browse/HBASE-6233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13401204#comment-13401204 ] Zhihong Ted Yu commented on HBASE-6233: --- From the discussion of HDFS-3370, it is unknown when hdfs hardlinks would get accepted.
[jira] [Commented] (HBASE-6200) KeyComparator.compareWithoutRow can be wrong when families have the same prefix
[ https://issues.apache.org/jira/browse/HBASE-6200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13401205#comment-13401205 ] Zhihong Ted Yu commented on HBASE-6200: --- The above approach should work.
[jira] [Commented] (HBASE-6220) PersistentMetricsTimeVaryingRate gets used for non-time-based metrics
[ https://issues.apache.org/jira/browse/HBASE-6220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13401212#comment-13401212 ] Hadoop QA commented on HBASE-6220: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12533426/ServerMetrics_HBASE_6220.patch against trunk revision . +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. +1 hadoop2.0. The patch compiles against the hadoop 2.0 profile. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. -1 findbugs. The patch appears to introduce 6 new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these unit tests: org.apache.hadoop.hbase.replication.TestReplication Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/2254//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2254//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2254//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/2254//console This message is automatically generated. PersistentMetricsTimeVaryingRate gets used for non-time-based metrics - Key: HBASE-6220 URL: https://issues.apache.org/jira/browse/HBASE-6220 Project: HBase Issue Type: Bug Components: metrics Affects Versions: 0.96.0 Reporter: David S. 
Wang Priority: Minor Labels: noob Attachments: ServerMetrics_HBASE_6220.patch PersistentMetricsTimeVaryingRate gets used for metrics that are not time-based, leading to confusing names such as avg_time for compaction size, etc. You have to read the code in order to understand that this is actually referring to bytes, not seconds.
[jira] [Commented] (HBASE-6170) Timeouts for row lock and scan should be separate
[ https://issues.apache.org/jira/browse/HBASE-6170?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13401219#comment-13401219 ] Hadoop QA commented on HBASE-6170: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12533425/HBASE-6170v1.patch against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 6 new or modified tests. +1 hadoop2.0. The patch compiles against the hadoop 2.0 profile. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. -1 findbugs. The patch appears to introduce 6 new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these unit tests: Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/2255//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2255//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2255//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/2255//console This message is automatically generated. Timeouts for row lock and scan should be separate - Key: HBASE-6170 URL: https://issues.apache.org/jira/browse/HBASE-6170 Project: HBase Issue Type: Improvement Components: regionserver Affects Versions: 0.94.0 Reporter: Otis Gospodnetic Assignee: Chris Trezzo Priority: Minor Fix For: 0.96.0 Attachments: HBASE-6170v1.patch Apparently the timeout used for row locking and for scanning is global. It would be better to have two separate timeouts. 
(opening the issue to make Lars George happy)
[jira] [Commented] (HBASE-4379) [hbck] Does not complain about tables with no end region [Z,]
[ https://issues.apache.org/jira/browse/HBASE-4379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13401220#comment-13401220 ] Hadoop QA commented on HBASE-4379: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12533442/HBASE-4379_94_V2.patch against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. -1 patch. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/2256//console This message is automatically generated.
[jira] [Updated] (HBASE-4379) [hbck] Does not complain about tables with no end region [Z,]
[ https://issues.apache.org/jira/browse/HBASE-4379?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anoop Sam John updated HBASE-4379: -- Attachment: HBASE-4379_Trunk.patch
[jira] [Updated] (HBASE-4379) [hbck] Does not complain about tables with no end region [Z,]
[ https://issues.apache.org/jira/browse/HBASE-4379?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anoop Sam John updated HBASE-4379: -- Status: Open (was: Patch Available)
[jira] [Updated] (HBASE-4379) [hbck] Does not complain about tables with no end region [Z,]
[ https://issues.apache.org/jira/browse/HBASE-4379?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anoop Sam John updated HBASE-4379: -- Attachment: (was: HBASE-4379_Trunk.patch)
[jira] [Updated] (HBASE-4379) [hbck] Does not complain about tables with no end region [Z,]
[ https://issues.apache.org/jira/browse/HBASE-4379?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anoop Sam John updated HBASE-4379: -- Status: Patch Available (was: Open)
[jira] [Commented] (HBASE-6205) Support an option to keep data of dropped table for some time
[ https://issues.apache.org/jira/browse/HBASE-6205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13401228#comment-13401228 ] Anoop Sam John commented on HBASE-6205: --- How about trying Devaraj's idea? If we do it this way, we need to see whether it will have some impact on tools like HBCK. Support an option to keep data of dropped table for some time - Key: HBASE-6205 URL: https://issues.apache.org/jira/browse/HBASE-6205 Project: HBase Issue Type: New Feature Affects Versions: 0.94.0, 0.96.0 Reporter: chunhui shen Assignee: chunhui shen Fix For: 0.96.0 Attachments: HBASE-6205.patch, HBASE-6205v2.patch, HBASE-6205v3.patch, HBASE-6205v4.patch, HBASE-6205v5.patch A user may drop a table accidentally because of erroneous code or other uncertain reasons. Unfortunately, it happened in our environment because one user made a mistake between the production cluster and the testing cluster. So I just give a suggestion: do we need to support an option to keep the data of a dropped table for some time, e.g. 1 day? In the patch: We make a new dir named .trashtables in the root dir. In DeleteTableHandler, we move files in the dropped table's dir to the trash table dir instead of deleting them directly. And we create a new class TrashCleaner which, with a periodic check, cleans dropped tables whose keep time has expired. The default keep time for dropped tables is 1 day, and the check period is 1 hour.
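The .trashtables flow described in the issue can be sketched roughly as follows. Class and method names here are illustrative only, not the patch's actual code, and this uses the local filesystem rather than HDFS:

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;

public class TrashTablesSketch {
    static final long KEEP_MS = 24L * 60 * 60 * 1000; // default: keep dropped tables 1 day

    static Path moveToTrash(Path tableDir, Path trashRoot) throws IOException {
        Files.createDirectories(trashRoot);
        // Encode the drop time in the entry name so the cleaner needs no extra state.
        Path entry = trashRoot.resolve(tableDir.getFileName() + "." + System.currentTimeMillis());
        return Files.move(tableDir, entry);
    }

    // Called periodically (e.g. every hour): remove entries past their keep time.
    static void cleanExpired(Path trashRoot, long now) throws IOException {
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(trashRoot)) {
            for (Path entry : entries) {
                String name = entry.getFileName().toString();
                long droppedAt = Long.parseLong(name.substring(name.lastIndexOf('.') + 1));
                if (now - droppedAt > KEEP_MS) {
                    deleteRecursively(entry);
                }
            }
        }
    }

    static void deleteRecursively(Path p) throws IOException {
        if (Files.isDirectory(p)) {
            try (DirectoryStream<Path> children = Files.newDirectoryStream(p)) {
                for (Path c : children) deleteRecursively(c);
            }
        }
        Files.delete(p);
    }

    public static void main(String[] args) throws IOException {
        Path root = Files.createTempDirectory("trashsketch");
        Path table = Files.createDirectories(root.resolve("mytable"));
        Path trash = root.resolve(".trashtables");
        moveToTrash(table, trash);
        cleanExpired(trash, System.currentTimeMillis());                   // within keep time
        System.out.println(Files.list(trash).count());                     // 1
        cleanExpired(trash, System.currentTimeMillis() + KEEP_MS + 1000);  // past keep time
        System.out.println(Files.list(trash).count());                     // 0
    }
}
```

Timestamping the trash entry at drop time is what lets a stateless periodic cleaner decide expiry; it also sidesteps name collisions if the same table is dropped and recreated.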
[jira] [Commented] (HBASE-6205) Support an option to keep data of dropped table for some time
[ https://issues.apache.org/jira/browse/HBASE-6205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13401239#comment-13401239 ] chunhui shen commented on HBASE-6205: - bq. How about trying Devaraj's idea? If a user dropped the table and then creates a table with the same name, will something go wrong? Another problem: if we set the table disable_delete, can the user see this table?
[jira] [Commented] (HBASE-6228) Fixup daughters twice cause daughter region assigned twice
[ https://issues.apache.org/jira/browse/HBASE-6228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13401243#comment-13401243 ] ramkrishna.s.vasudevan commented on HBASE-6228: --- +1 Chunhui. Your explanation is right. Sorry for making noise here. :) Thanks. Fixup daughters twice cause daughter region assigned twice --- Key: HBASE-6228 URL: https://issues.apache.org/jira/browse/HBASE-6228 Project: HBase Issue Type: Bug Components: master Reporter: chunhui shen Assignee: chunhui shen Fix For: 0.96.0 Attachments: HBASE-6228.patch, HBASE-6228v2.patch, HBASE-6228v2.patch First, how does fixing up daughters twice happen? 1. We fixupDaughters at the end of HMaster#finishInitialization. 2. ServerShutdownHandler will fixupDaughters when reassigning a region through ServerShutdownHandler#processDeadRegion. When fixing up daughters, we add the daughters to .META., but that couldn't prevent the above case, because of FindDaughterVisitor. The detail is as follows: Suppose region A is a split parent region, and its daughter region B is missing. 1. First, the ServerShutdownHandler thread fixes up the daughter, so it adds daughter region B to .META. with serverName=null, and assigns the daughter. 2. Then, the Master's initialization thread will also find that daughter region B is missing and assign it. This is because FindDaughterVisitor considers a daughter missing if its serverName=null.
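The double-assignment sequence in the description can be shown with a toy model (hypothetical names, not HBase code): both fixup paths see serverName == null and trigger an assignment, unless an in-flight guard makes the fixup idempotent.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.atomic.AtomicInteger;

public class DoubleFixupSketch {
    static final AtomicInteger assignments = new AtomicInteger();
    static final Set<String> inFlight = Collections.synchronizedSet(new HashSet<>());

    // Naive check: assigns whenever .META. shows no server for the daughter.
    static void naiveFixup(String region, String serverName) {
        if (serverName == null) assignments.incrementAndGet();
    }

    // Guarded check: the second caller sees the region is already being assigned.
    static void guardedFixup(String region, String serverName) {
        if (serverName == null && inFlight.add(region)) assignments.incrementAndGet();
    }

    public static void main(String[] args) {
        naiveFixup("daughterB", null);          // ServerShutdownHandler path
        naiveFixup("daughterB", null);          // master initialization path
        System.out.println(assignments.get());  // 2 -- the region gets assigned twice

        assignments.set(0);
        guardedFixup("daughterB", null);
        guardedFixup("daughterB", null);
        System.out.println(assignments.get());  // 1
    }
}
```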
[jira] [Commented] (HBASE-6205) Support an option to keep data of dropped table for some time
[ https://issues.apache.org/jira/browse/HBASE-6205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13401245#comment-13401245 ] ramkrishna.s.vasudevan commented on HBASE-6205: --- bq. if we set the table disable_delete, can the user see this table? I think here the user should be able to see it until the table is really deleted. bq. creates a table with the same name, will something go wrong? If the drop is done completely then we should allow table creation with the same name. Maybe until then we should not allow it. Just my thoughts on this.
[jira] [Created] (HBASE-6269) Lazyseek should use the maxSequenseId StoreFile's KeyValue as the latest KeyValue
ShiXing created HBASE-6269: -- Summary: Lazyseek should use the maxSequenseId StoreFile's KeyValue as the latest KeyValue Key: HBASE-6269 URL: https://issues.apache.org/jira/browse/HBASE-6269 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.94.0 Reporter: ShiXing Assignee: ShiXing Attachments: HBASE-6269-v1.patch While fixing HBASE-6195, I found that the test case sometimes fails: https://builds.apache.org/job/HBase-0.94/259/. If there are two Puts/Increments with the same row, family, qualifier, and timestamp but different memstoreTS values, and we do a memstore flush after each Put/Increment, there will be two StoreFiles containing the same KeyValue (differing only in memstoreTS and SequenceId). When I then get the row, I always get the old record. The test case is as follows:
{code}
public void testPutWithMemStoreFlush() throws Exception {
  Configuration conf = HBaseConfiguration.create();
  String method = "testPutWithMemStoreFlush";
  byte[] tableName = Bytes.toBytes(method);
  byte[] family = Bytes.toBytes("family");
  byte[] qualifier = Bytes.toBytes("qualifier");
  byte[] row = Bytes.toBytes("putRow");
  byte[] value = null;
  this.region = initHRegion(tableName, method, conf, family);
  Put put = null;
  Get get = null;
  List<KeyValue> kvs = null;
  Result res = null;

  put = new Put(row);
  value = Bytes.toBytes("value0");
  put.add(family, qualifier, 1234567L, value);
  region.put(put);
  System.out.print("get value before flush after put value0 : ");
  get = new Get(row);
  get.addColumn(family, qualifier);
  get.setMaxVersions();
  res = this.region.get(get, null);
  kvs = res.getColumn(family, qualifier);
  for (int i = 0; i < kvs.size(); i++) {
    System.out.println(Bytes.toString(kvs.get(i).getValue()));
  }
  region.flushcache();
  System.out.print("get value after flush after put value0 : ");
  get = new Get(row);
  get.addColumn(family, qualifier);
  get.setMaxVersions();
  res = this.region.get(get, null);
  kvs = res.getColumn(family, qualifier);
  for (int i = 0; i < kvs.size(); i++) {
    System.out.println(Bytes.toString(kvs.get(i).getValue()));
  }

  put = new Put(row);
  value = Bytes.toBytes("value1");
  put.add(family, qualifier, 1234567L, value);
  region.put(put);
  System.out.print("get value before flush after put value1 : ");
  get = new Get(row);
  get.addColumn(family, qualifier);
  get.setMaxVersions();
  res = this.region.get(get, null);
  kvs = res.getColumn(family, qualifier);
  for (int i = 0; i < kvs.size(); i++) {
    System.out.println(Bytes.toString(kvs.get(i).getValue()));
  }
  region.flushcache();
  System.out.print("get value after flush after put value1 : ");
  get = new Get(row);
  get.addColumn(family, qualifier);
  get.setMaxVersions();
  res = this.region.get(get, null);
  kvs = res.getColumn(family, qualifier);
  for (int i = 0; i < kvs.size(); i++) {
    System.out.println(Bytes.toString(kvs.get(i).getValue()));
  }

  put = new Put(row);
  value = Bytes.toBytes("value2");
  put.add(family, qualifier, 1234567L, value);
  region.put(put);
  System.out.print("get value before flush after put value2 : ");
  get = new Get(row);
  get.addColumn(family, qualifier);
  get.setMaxVersions();
  res = this.region.get(get, null);
  kvs = res.getColumn(family, qualifier);
  for (int i = 0; i < kvs.size(); i++) {
    System.out.println(Bytes.toString(kvs.get(i).getValue()));
  }
  region.flushcache();
  System.out.print("get value after flush after put value2 : ");
  get = new Get(row);
  get.addColumn(family, qualifier);
  get.setMaxVersions();
  res = this.region.get(get, null);
  kvs = res.getColumn(family, qualifier);
  for (int i = 0; i < kvs.size(); i++) {
    System.out.println(Bytes.toString(kvs.get(i).getValue()));
  }
}
{code}
and the result prints as follows:
{noformat}
get value before flush after put value0 : value0
get value after flush after put value0 : value0
get value before flush after put value1 : value1
get value after flush after put value1 : value0
get value before flush after put value2 : value2
get value after flush after put value2 : value0
{noformat}
Analyzing the StoreFileScanner code with lazy seek: the StoreFileScanners are sorted by SequenceId, so the latest StoreFile is on top of the KeyValueHeap, and the KeyValue from the latest StoreFile is compared against the second-latest StoreFile. But the second-latest StoreFile generates a fake KeyValue for the same row, family, and qualifier, except with the maximum timestamp and memstoreTS 0. So the KeyValue from the latest StoreFile is incorrectly judged as not newer than the one from the second-latest. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
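The heap-ordering problem described above can be sketched with a toy comparator. This is a hypothetical model, not the real HBase 0.94 API: an unseeked scanner sits in the heap behind a "fake" key carrying the maximum timestamp and memstoreTS 0, which makes it sort ahead of the genuinely newer KeyValue.

```java
import java.util.Comparator;

public class LazySeekSketch {
    // Toy stand-in for a KeyValue whose row/family/qualifier already match.
    static final class Key {
        final long timestamp;
        final long memstoreTS;
        Key(long ts, long mvcc) { timestamp = ts; memstoreTS = mvcc; }
    }

    // Newest first: larger timestamp wins, then larger memstoreTS.
    static final Comparator<Key> NEWEST_FIRST = (a, b) -> {
        int c = Long.compare(b.timestamp, a.timestamp);
        return c != 0 ? c : Long.compare(b.memstoreTS, a.memstoreTS);
    };

    public static void main(String[] args) {
        Key realLatest = new Key(1234567L, 8L);           // from the newest StoreFile
        Key fakeFromOlder = new Key(Long.MAX_VALUE, 0L);  // fake key of an unseeked older file

        // The fake key's MAX_VALUE timestamp makes it sort ahead of the
        // genuinely newer KeyValue, so the heap pops the older StoreFile first.
        System.out.println(NEWEST_FIRST.compare(fakeFromOlder, realLatest) < 0); // true
    }
}
```

Once the older scanner is popped first and really seeks, its actual KeyValue wins the tie on timestamp, which matches the "always get the old record" symptom.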
[jira] [Updated] (HBASE-6269) Lazyseek should use the maxSequenseId StoreFile's KeyValue as the latest KeyValue
[ https://issues.apache.org/jira/browse/HBASE-6269?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ShiXing updated HBASE-6269: --- Attachment: HBASE-6269-v1.patch
[jira] [Commented] (HBASE-6195) Increment data will be lost when the memstore is flushed
[ https://issues.apache.org/jira/browse/HBASE-6195?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13401267#comment-13401267 ] ShiXing commented on HBASE-6195: I found that the problem is introduced by the lazy seek. I have opened a JIRA for this problem: HBASE-6269. Increment data will be lost when the memstore is flushed Key: HBASE-6195 URL: https://issues.apache.org/jira/browse/HBASE-6195 Project: HBase Issue Type: Bug Components: regionserver Reporter: ShiXing Assignee: ShiXing Fix For: 0.96.0, 0.94.1 Attachments: 6195-trunk-V7.patch, 6195.addendum, HBASE-6195-trunk-V2.patch, HBASE-6195-trunk-V3.patch, HBASE-6195-trunk-V4.patch, HBASE-6195-trunk-V5.patch, HBASE-6195-trunk-V6.patch, HBASE-6195-trunk.patch There are two problems in increment() now. First: the timestamp (the variable {{now}}) in HRegion's increment() is generated before the rowLock is acquired, so when multiple threads increment the same row, a thread that generated its timestamp earlier may acquire the lock later. Because increment stores only one version, the result has been correct so far. But when the region is flushing, the increment reads the kv from whichever of the snapshot and the memstore has the larger timestamp and writes it back to the memstore. If the snapshot's timestamp is larger than the memstore's, the increment reads the old data and then increments it, which is wrong. Second: there is another risk in increment. Because it writes to the memstore first and then to the HLog, if the HLog write fails, the client can still read the incremented value. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
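The first problem above, the timestamp being chosen before the row lock is taken, can be sketched with a minimal single-cell model (hypothetical code, not HRegion itself): a thread that read the clock earlier can apply its increment later, leaving the single stored version stamped with an older timestamp.

```java
public class IncrementRaceSketch {
    // A single-version cell: only the last written value/timestamp survive.
    static long cellValue = 0;
    static long cellTs = 0;

    // Applies an increment whose timestamp was chosen BEFORE locking.
    static synchronized void applyIncrement(long tsChosenBeforeLock, long delta) {
        cellValue += delta;
        cellTs = tsChosenBeforeLock; // last writer's timestamp wins, even if older
    }

    public static void main(String[] args) {
        long tsA = 1; // thread A reads the clock first...
        long tsB = 2; // ...thread B reads it later,
        applyIncrement(tsB, 1); // but B acquires the row lock first
        applyIncrement(tsA, 1); // A applies afterwards with the OLDER timestamp
        // The cell now holds value 2 stamped with timestamp 1: on a flush,
        // comparing snapshot vs. memstore by timestamp can pick the wrong copy.
        System.out.println(cellValue + "@" + cellTs);
    }
}
```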
[jira] [Updated] (HBASE-6200) KeyComparator.compareWithoutRow can be wrong when families have the same prefix
[ https://issues.apache.org/jira/browse/HBASE-6200?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jieshan Bean updated HBASE-6200: Attachment: (was: HBASE-6200-90.patch) KeyComparator.compareWithoutRow can be wrong when families have the same prefix --- Key: HBASE-6200 URL: https://issues.apache.org/jira/browse/HBASE-6200 Project: HBase Issue Type: Bug Affects Versions: 0.90.6, 0.92.1, 0.94.0 Reporter: Jean-Daniel Cryans Assignee: Jieshan Bean Priority: Blocker Fix For: 0.90.7, 0.92.2, 0.96.0, 0.94.1 Attachments: 6200-trunk-v2.patch, HBASE-6200-90-v2.patch, HBASE-6200-92-v2.patch, HBASE-6200-94-v2.patch, PerformanceTestCase-6200-94.patch As reported by Desert Rose on IRC and on the ML, {{Result}} has a weird behavior when some families share the same prefix. He posted a link to his code to show how it fails, http://pastebin.com/7TBA1XGh Basically {{KeyComparator.compareWithoutRow}} doesn't differentiate families and qualifiers, so f:a is said to be bigger than f1:, which is false. What happens is that the KVs are returned in the right order from the RS, but {{Result.binarySearch}} uses {{KeyComparator.compareWithoutRow}}, which has a different sorting, so the end result is undetermined. I added some debug output and I can see that the data is returned in the right order but {{Arrays.binarySearch}} returned the wrong KV, which is then verified against the passed family and qualifier; that check fails, so null is returned. I don't know how frequent it is for users to have families with the same prefix, but those who do, and who use those families at the same time, will have big correctness issues. This is why I mark this as a blocker. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
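The f:a vs. f1: mis-ordering can be demonstrated without HBase. In this self-contained sketch a plain byte-wise compare stands in for the real comparator: flattening family+qualifier into one buffer orders "f:a" after "f1:", while comparing the family first gives the correct order.

```java
import java.util.Arrays;

public class PrefixFamilySketch {
    // Lexicographic unsigned byte comparison, shorter array first on ties.
    static int compareBytes(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int d = (a[i] & 0xff) - (b[i] & 0xff);
            if (d != 0) return d;
        }
        return a.length - b.length;
    }

    static byte[] concat(byte[] x, byte[] y) {
        byte[] r = Arrays.copyOf(x, x.length + y.length);
        System.arraycopy(y, 0, r, x.length, y.length);
        return r;
    }

    public static void main(String[] args) {
        byte[] famF = "f".getBytes(), famF1 = "f1".getBytes();
        byte[] qualA = "a".getBytes(), qualEmpty = "".getBytes();

        // Flattened comparison: "fa" vs "f1" -> 'a' (0x61) > '1' (0x31),
        // so f:a wrongly sorts AFTER f1: .
        System.out.println(compareBytes(concat(famF, qualA), concat(famF1, qualEmpty)) > 0);

        // Comparing families first: "f" < "f1", so f:a correctly comes first.
        System.out.println(compareBytes(famF, famF1) < 0);
    }
}
```

Because the region server sorts one way and Result.binarySearch the other, the binary search can land on the wrong KeyValue, which explains the null results seen in the report.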
[jira] [Updated] (HBASE-6200) KeyComparator.compareWithoutRow can be wrong when families have the same prefix
[ https://issues.apache.org/jira/browse/HBASE-6200?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jieshan Bean updated HBASE-6200: Attachment: (was: HBASE-6200-94.patch)
[jira] [Commented] (HBASE-4379) [hbck] Does not complain about tables with no end region [Z,]
[ https://issues.apache.org/jira/browse/HBASE-4379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13401286#comment-13401286 ] Hadoop QA commented on HBASE-4379: --
-1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12533445/HBASE-4379_Trunk.patch against trunk revision .
+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 3 new or modified tests.
+1 hadoop2.0. The patch compiles against the hadoop 2.0 profile.
+1 javadoc. The javadoc tool did not generate any warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
-1 findbugs. The patch appears to introduce 7 new Findbugs (version 1.3.9) warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
+1 core tests. The patch passed unit tests in .
Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/2257//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2257//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2257//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/2257//console
This message is automatically generated.
[hbck] Does not complain about tables with no end region [Z,] - Key: HBASE-4379 URL: https://issues.apache.org/jira/browse/HBASE-4379 Project: HBase Issue Type: Bug Components: hbck Affects Versions: 0.90.5, 0.92.0, 0.94.0, 0.96.0 Reporter: Jonathan Hsieh Assignee: Anoop Sam John Fix For: 0.96.0, 0.94.1 Attachments: 0001-HBASE-4379-hbck-does-not-complain-about-tables-with-.patch, HBASE-4379_94.patch, HBASE-4379_94_V2.patch, HBASE-4379_Trunk.patch, hbase-4379.v2.patch hbck does not detect or have an error condition when the last region of a table is missing (end key != ''). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5631) hbck should handle case where .tableinfo file is missing.
[ https://issues.apache.org/jira/browse/HBASE-5631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13401288#comment-13401288 ] Anoop Sam John commented on HBASE-5631: --- One point: without the .tableinfo file in HDFS, HBCK cannot fix HDFS integrity issues. To recreate the .tableinfo file in HDFS we need the table's HTD instance. We can try getting this from the Master or the RSs: on the RS side, the HRegion instances hold HTDs, and on the Master it may already be cached in FSTableDescriptors before the file actually went missing. We can try every possible source (hopefully one of them will still have the HTD) and recreate the .tableinfo file. Note that this can work only in online mode; and if the .tableinfo file is missing, the offline-mode fixes won't work either. Please validate my analysis. hbck should handle case where .tableinfo file is missing. - Key: HBASE-5631 URL: https://issues.apache.org/jira/browse/HBASE-5631 Project: HBase Issue Type: Improvement Components: hbck Affects Versions: 0.92.2, 0.94.0, 0.96.0 Reporter: Jonathan Hsieh 0.92+ branches have a .tableinfo file which could be missing from hdfs. hbck should be able to detect and repair this properly. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
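The recovery idea above, try each online source of the HTD in turn and rebuild .tableinfo from the first hit, can be sketched as follows. Everything here is hypothetical (the TableDescriptor record, the supplier lambdas, the table name "usertable"); the real Master/RS lookups and file write are stubbed out.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;
import java.util.function.Supplier;

public class TableInfoRecoverySketch {
    // Hypothetical stand-in for HBase's HTableDescriptor.
    record TableDescriptor(String name) {}

    // Try each online source in order; empty means offline / nothing to rebuild from.
    static Optional<TableDescriptor> recover(List<Supplier<TableDescriptor>> sources) {
        for (Supplier<TableDescriptor> s : sources) {
            TableDescriptor htd = s.get(); // e.g. Master cache, then each RS
            if (htd != null) return Optional.of(htd);
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // The Master's FSTableDescriptors cache misses, but one RS still
        // holds the HTD in an HRegion, so recovery succeeds online.
        Optional<TableDescriptor> htd = recover(Arrays.asList(
            () -> null,                               // Master cache: already lost
            () -> new TableDescriptor("usertable"))); // a region server's copy
        System.out.println(htd.map(TableDescriptor::name).orElse("unrecoverable"));
    }
}
```

With no live Master or RS to query, every supplier returns null, which matches the comment's point that offline-mode fixes cannot work without the .tableinfo file.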
[jira] [Commented] (HBASE-6269) Lazyseek should use the maxSequenseId StoreFile's KeyValue as the latest KeyValue
[ https://issues.apache.org/jira/browse/HBASE-6269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13401294#comment-13401294 ] Ted Yu commented on HBASE-6269: --- Can you generate patch for trunk for Hadoop aa ?
[jira] [Comment Edited] (HBASE-6269) Lazyseek should use the maxSequenseId StoreFile's KeyValue as the latest KeyValue
[ https://issues.apache.org/jira/browse/HBASE-6269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13401294#comment-13401294 ] Ted Yu edited comment on HBASE-6269 at 6/26/12 11:00 AM: - Can you generate patch for trunk for Hadoop QA ? was (Author: yuzhih...@gmail.com): Can you generate patch for trunk for Hadoop aa ?
[jira] [Commented] (HBASE-6269) Lazyseek should use the maxSequenseId StoreFile's KeyValue as the latest KeyValue
[ https://issues.apache.org/jira/browse/HBASE-6269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13401296#comment-13401296 ] ramkrishna.s.vasudevan commented on HBASE-6269: --- @ShiXing If the data is not flushed, we are able to get value1, which is the latest. But when we flush, we have this behavioral change where the StoreFileScanner (KeyValueHeap) gives us the older value?
[jira] [Updated] (HBASE-6200) KeyComparator.compareWithoutRow can be wrong when families have the same prefix
[ https://issues.apache.org/jira/browse/HBASE-6200?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jieshan Bean updated HBASE-6200: Attachment: 6200-trunk-v3.patch

I tried my best to reduce the amount of computation in this version of the patch. I also changed the test case to benchmark the comparison directly (1,000,000 compares each time), with the same steps:
1. Compare 'famia:qualia' with 'famib:qualia'. Run 100 times and measure the time consumed (using System.currentTimeMillis() to get the current time).
2. Compare 'fami:qualia' with 'fami:qualib'. Run 1,000,000 times and measure the time consumed.
3. Repeat steps 1~2 for 20 times, accumulate the total time consumed at step 1 and step 2, then calculate the average.

Test code:
{noformat}
for (int loop = 0; loop < 20; loop++) {
  long start = System.currentTimeMillis();
  for (int i = 0; i < 100; i++) {
    compareIgnoringPrefix(c, 0, kvf_a, kvf_b);
  }
  long end = System.currentTimeMillis();
  long useTimeA = end - start;
  start = end;
  for (int i = 0; i < 100; i++) {
    compareIgnoringPrefix(c, 0, kvq_a, kvq_b);
  }
  end = System.currentTimeMillis();
  long useTimeB = end - start;
  totalTimeA += useTimeA;
  totalTimeB += useTimeB;
}

private void compareIgnoringPrefix(KeyValue.KeyComparator c, int common,
    KeyValue less, KeyValue greater) {
  int cmp = c.compareIgnoringPrefix(common, less.getBuffer(),
      less.getOffset() + KeyValue.ROW_OFFSET, less.getKeyLength(),
      greater.getBuffer(),
      greater.getOffset() + KeyValue.ROW_OFFSET, greater.getKeyLength());
}
{noformat}

And this is the new result:

[without patch 6200]
{noformat}
Compare {famia:qualia} with {famib:qualia}, run for 1,000,000 times. used time - 50
Compare {fami:qualia} with {fami:qualib}, run for 1,000,000 times. used time - 58
{noformat}

[with patch 6200]
{noformat}
Compare {famia:qualia} with {famib:qualia}, run for 1,000,000 times. used time - 56
Compare {fami:qualia} with {fami:qualib}, run for 1,000,000 times. used time - 64
{noformat}

KeyComparator.compareWithoutRow can be wrong when families have the same prefix --- Key: HBASE-6200 URL: https://issues.apache.org/jira/browse/HBASE-6200 Project: HBase Issue Type: Bug Affects Versions: 0.90.6, 0.92.1, 0.94.0 Reporter: Jean-Daniel Cryans Assignee: Jieshan Bean Priority: Blocker Fix For: 0.90.7, 0.92.2, 0.96.0, 0.94.1 Attachments: 6200-trunk-v2.patch, 6200-trunk-v3.patch, HBASE-6200-90-v2.patch, HBASE-6200-92-v2.patch, HBASE-6200-94-v2.patch

As reported by Desert Rose on IRC and on the ML, {{Result}} has a weird behavior when some families share the same prefix. He posted a link to his code to show how it fails: http://pastebin.com/7TBA1XGh Basically {{KeyComparator.compareWithoutRow}} doesn't differentiate families and qualifiers, so f:a is said to be bigger than f1:, which is false. Then what happens is that the KVs are returned in the right order from the RS, but {{Result.binarySearch}} uses {{KeyComparator.compareWithoutRow}}, which has a different sort order, so the end result is undetermined. I added some debug output and I can see that the data is returned in the right order, but {{Arrays.binarySearch}} returned the wrong KV, which is then verified against the passed family and qualifier; that check fails, so null is returned. I don't know how frequent it is for users to have families with the same prefix, but those that do, and that use those families at the same time, will have big correctness issues. This is why I mark this as a blocker.

-- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
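The prefix bug is easier to see with a toy comparator. The sketch below is hypothetical, self-contained code (not HBase's actual KeyComparator): comparing family and qualifier as one concatenated byte blob disagrees with comparing the family first whenever one family name is a prefix of another, which is exactly the f:a vs f1: case above.

```java
public class PrefixCompare {
    // Plain lexicographic compare over raw bytes, as a flawed
    // compareWithoutRow effectively does over family+qualifier.
    static int flatCompare(byte[] a, byte[] b) {
        int len = Math.min(a.length, b.length);
        for (int i = 0; i < len; i++) {
            int d = (a[i] & 0xff) - (b[i] & 0xff);
            if (d != 0) return d;
        }
        return a.length - b.length;
    }

    // Correct ordering: compare the family first, then the qualifier.
    static int familyAwareCompare(byte[] fam1, byte[] qual1,
                                  byte[] fam2, byte[] qual2) {
        int d = flatCompare(fam1, fam2);
        return d != 0 ? d : flatCompare(qual1, qual2);
    }

    public static void main(String[] args) {
        // KV from family "f", qualifier "a" vs KV from family "f1", empty qualifier.
        byte[] famA = "f".getBytes();  byte[] qualA = "a".getBytes();
        byte[] famB = "f1".getBytes(); byte[] qualB = "".getBytes();
        // Flat compare of the concatenated bytes says "fa" > "f1" ('a' > '1')...
        System.out.println(flatCompare("fa".getBytes(), "f1".getBytes()) > 0); // true
        // ...but family-first compare says f:a sorts BEFORE f1: ("f" < "f1").
        System.out.println(familyAwareCompare(famA, qualA, famB, qualB) < 0);  // true
    }
}
```

The two comparators disagree, so a binary search using one order over data sorted in the other can land on the wrong KeyValue, which matches the null-return behavior described in the report.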
[jira] [Updated] (HBASE-6200) KeyComparator.compareWithoutRow can be wrong when families have the same prefix
[ https://issues.apache.org/jira/browse/HBASE-6200?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jieshan Bean updated HBASE-6200: Attachment: (was: PerformanceTestCase-6200-94.patch)
[jira] [Updated] (HBASE-6200) KeyComparator.compareWithoutRow can be wrong when families have the same prefix
[ https://issues.apache.org/jira/browse/HBASE-6200?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jieshan Bean updated HBASE-6200: Attachment: PerformanceTest-trunk.patch
[jira] [Updated] (HBASE-6200) KeyComparator.compareWithoutRow can be wrong when families have the same prefix
[ https://issues.apache.org/jira/browse/HBASE-6200?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jieshan Bean updated HBASE-6200: Attachment: (was: PerformanceTest-trunk.patch)
[jira] [Updated] (HBASE-6200) KeyComparator.compareWithoutRow can be wrong when families have the same prefix
[ https://issues.apache.org/jira/browse/HBASE-6200?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jieshan Bean updated HBASE-6200: Attachment: (was: HBASE-6200-92-v2.patch)
[jira] [Updated] (HBASE-6200) KeyComparator.compareWithoutRow can be wrong when families have the same prefix
[ https://issues.apache.org/jira/browse/HBASE-6200?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jieshan Bean updated HBASE-6200: Attachment: (was: HBASE-6200-90-v2.patch)
[jira] [Updated] (HBASE-6200) KeyComparator.compareWithoutRow can be wrong when families have the same prefix
[ https://issues.apache.org/jira/browse/HBASE-6200?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jieshan Bean updated HBASE-6200: Attachment: (was: HBASE-6200-94-v2.patch)
[jira] [Commented] (HBASE-6267) hbase.store.delete.expired.storefile should be true by default
[ https://issues.apache.org/jira/browse/HBASE-6267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13401322#comment-13401322 ] Hudson commented on HBASE-6267: --- Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #69 (See [https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/69/]) HBASE-6267. hbase.store.delete.expired.storefile should be true by default (Revision 1353812) Result = FAILURE apurtell : Files : * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java * /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/io/hfile/TestScannerSelectionUsingTTL.java

hbase.store.delete.expired.storefile should be true by default -- Key: HBASE-6267 URL: https://issues.apache.org/jira/browse/HBASE-6267 Project: HBase Issue Type: Improvement Components: regionserver Affects Versions: 0.96.0, 0.94.1 Reporter: Andrew Purtell Assignee: Andrew Purtell Fix For: 0.96.0, 0.94.1 Attachments: HBASE-6267-0.94.patch, HBASE-6267.patch

HBASE-5199 introduces this logic into Store:
{code}
+    // Delete the expired store files before the compaction selection.
+    if (conf.getBoolean("hbase.store.delete.expired.storefile", false)
+        && (ttl != Long.MAX_VALUE) && (this.scanInfo.minVersions == 0)) {
+      CompactSelection expiredSelection = compactSelection
+          .selectExpiredStoreFilesToCompact(
+              EnvironmentEdgeManager.currentTimeMillis() - this.ttl);
+
+      // If there are any expired store files, delete them by compaction.
+      if (expiredSelection != null) {
+        return expiredSelection;
+      }
+    }
{code}
Is there any reason why that should not default to {{true}}?
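To illustrate the idea behind the quoted patch, here is a hedged, self-contained sketch (the `FileInfo` class and method names are illustrative, not HBase's actual API): a store file whose newest cell is already older than the TTL cutoff can be dropped wholesale instead of being rewritten by compaction.

```java
import java.util.ArrayList;
import java.util.List;

public class ExpiredSelection {
    // Illustrative stand-in for a store file's metadata.
    static class FileInfo {
        final String name;
        final long maxTimestamp; // timestamp of the newest cell in the file
        FileInfo(String name, long maxTimestamp) {
            this.name = name;
            this.maxTimestamp = maxTimestamp;
        }
    }

    // Return the files in which every cell is older than the expiry point
    // (now - ttl); those files can be deleted without a real compaction.
    static List<FileInfo> selectExpired(List<FileInfo> files, long expiredBefore) {
        List<FileInfo> expired = new ArrayList<>();
        for (FileInfo f : files) {
            if (f.maxTimestamp < expiredBefore) {
                expired.add(f);
            }
        }
        return expired;
    }

    public static void main(String[] args) {
        long now = 1_000_000L, ttl = 100L;
        List<FileInfo> files = new ArrayList<>();
        files.add(new FileInfo("old.hfile", now - 500));  // fully expired
        files.add(new FileInfo("fresh.hfile", now - 10)); // still within TTL
        // Only "old.hfile" qualifies for wholesale deletion.
        System.out.println(selectExpired(files, now - ttl).size()); // prints 1
    }
}
```

Because the selection only ever removes data that a scan would filter out anyway, enabling it by default changes no query results, which is the argument for flipping the default to true.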
[jira] [Commented] (HBASE-6269) Lazyseek should use the maxSequenseId StoreFile's KeyValue as the latest KeyValue
[ https://issues.apache.org/jira/browse/HBASE-6269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13401325#comment-13401325 ] ramkrishna.s.vasudevan commented on HBASE-6269: --- @ShiXing Yes, I agree that instead of getting the highest store file's scanner we get the second highest. And since in this case comparing for '0' should be fine, I feel. It's better we fix this, though it may be very rare to hit this problem.

Lazyseek should use the maxSequenseId StoreFile's KeyValue as the latest KeyValue - Key: HBASE-6269 URL: https://issues.apache.org/jira/browse/HBASE-6269 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.94.0 Reporter: ShiXing Assignee: ShiXing Attachments: HBASE-6269-v1.patch

While fixing HBASE-6195, I happened to find that the test case sometimes fails: https://builds.apache.org/job/HBase-0.94/259/. If there are two Put/Increment operations with the same row, family, qualifier, and timestamp but different memstoreTS, and we flush the memstore after each Put/Increment, there will be two StoreFiles with the same KeyValue (except for memstoreTS and SequenceId).
When I get the row, I always get the old records. The test case looks like this:
{code}
public void testPutWithMemStoreFlush() throws Exception {
  Configuration conf = HBaseConfiguration.create();
  String method = "testPutWithMemStoreFlush";
  byte[] tableName = Bytes.toBytes(method);
  byte[] family = Bytes.toBytes("family");
  byte[] qualifier = Bytes.toBytes("qualifier");
  byte[] row = Bytes.toBytes("putRow");
  byte[] value = null;
  this.region = initHRegion(tableName, method, conf, family);
  Put put = null;
  Get get = null;
  List<KeyValue> kvs = null;
  Result res = null;

  put = new Put(row);
  value = Bytes.toBytes("value0");
  put.add(family, qualifier, 1234567l, value);
  region.put(put);
  System.out.print("get value before flush after put value0 : ");
  get = new Get(row);
  get.addColumn(family, qualifier);
  get.setMaxVersions();
  res = this.region.get(get, null);
  kvs = res.getColumn(family, qualifier);
  for (int i = 0; i < kvs.size(); i++) {
    System.out.println(Bytes.toString(kvs.get(i).getValue()));
  }
  region.flushcache();
  System.out.print("get value after flush after put value0 : ");
  get = new Get(row);
  get.addColumn(family, qualifier);
  get.setMaxVersions();
  res = this.region.get(get, null);
  kvs = res.getColumn(family, qualifier);
  for (int i = 0; i < kvs.size(); i++) {
    System.out.println(Bytes.toString(kvs.get(i).getValue()));
  }

  put = new Put(row);
  value = Bytes.toBytes("value1");
  put.add(family, qualifier, 1234567l, value);
  region.put(put);
  System.out.print("get value before flush after put value1 : ");
  get = new Get(row);
  get.addColumn(family, qualifier);
  get.setMaxVersions();
  res = this.region.get(get, null);
  kvs = res.getColumn(family, qualifier);
  for (int i = 0; i < kvs.size(); i++) {
    System.out.println(Bytes.toString(kvs.get(i).getValue()));
  }
  region.flushcache();
  System.out.print("get value after flush after put value1 : ");
  get = new Get(row);
  get.addColumn(family, qualifier);
  get.setMaxVersions();
  res = this.region.get(get, null);
  kvs = res.getColumn(family, qualifier);
  for (int i = 0; i < kvs.size(); i++) {
    System.out.println(Bytes.toString(kvs.get(i).getValue()));
  }

  put = new Put(row);
  value = Bytes.toBytes("value2");
  put.add(family, qualifier, 1234567l, value);
  region.put(put);
  System.out.print("get value before flush after put value2 : ");
  get = new Get(row);
  get.addColumn(family, qualifier);
  get.setMaxVersions();
  res = this.region.get(get, null);
  kvs = res.getColumn(family, qualifier);
  for (int i = 0; i < kvs.size(); i++) {
    System.out.println(Bytes.toString(kvs.get(i).getValue()));
  }
  region.flushcache();
  System.out.print("get value after flush after put value2 : ");
  get = new Get(row);
  get.addColumn(family, qualifier);
  get.setMaxVersions();
  res = this.region.get(get, null);
  kvs = res.getColumn(family, qualifier);
  for (int i = 0; i < kvs.size(); i++) {
    System.out.println(Bytes.toString(kvs.get(i).getValue()));
  }
}
{code}
and the result prints as follows:
{code}
get value before flush after put value0 : value0
get value after flush after put value0 : value0
get value before flush after put value1 : value1
get value after flush after put value1 : value0
get value before flush after put value2 : value2
get value after flush after put value2 : value0
{code}
I analyzed the code for StoreFileScanner with lazy seek: the StoreFileScanners are sorted by SequenceId, so the latest StoreFile is on top of the KeyValueHeap, and the KeyValue from the latest StoreFile is compared to the second latest
[jira] [Commented] (HBASE-6228) Fixup daughters twice cause daughter region assigned twice
[ https://issues.apache.org/jira/browse/HBASE-6228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13401326#comment-13401326 ] Jonathan Hsieh commented on HBASE-6228: --- Is there any way we can add tests for these subtle recovery fixes? Part of me says we should just take something like a lock on the region (in zk, possibly moving it into RIT) before we start fixing them up like this, to make this obvious and to eliminate these classes of races.

Fixup daughters twice cause daughter region assigned twice --- Key: HBASE-6228 URL: https://issues.apache.org/jira/browse/HBASE-6228 Project: HBase Issue Type: Bug Components: master Reporter: chunhui shen Assignee: chunhui shen Fix For: 0.96.0 Attachments: HBASE-6228.patch, HBASE-6228v2.patch, HBASE-6228v2.patch

First, how does fixing up daughters twice happen?
1. We fix up daughters at the end of HMaster#finishInitialization.
2. ServerShutdownHandler fixes up daughters when reassigning a region through ServerShutdownHandler#processDeadRegion.
When fixing up daughters, we add the daughters to .META., but that cannot prevent the above case, because of FindDaughterVisitor. The details are as follows. Suppose region A is a split parent region, and its daughter region B is missing:
1. First, the ServerShutdownHandler thread fixes up the daughter, so it adds daughter region B to .META. with serverName=null and assigns the daughter.
2. Then, the master's initialization thread will also find that daughter region B is missing and assign it. This is because FindDaughterVisitor considers a daughter missing if its serverName=null.
[jira] [Commented] (HBASE-6269) Lazyseek should use the maxSequenseId StoreFile's KeyValue as the latest KeyValue
[ https://issues.apache.org/jira/browse/HBASE-6269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13401383#comment-13401383 ] ShiXing commented on HBASE-6269: @anoop There are 2 StoreFiles after flushing twice, and sf2's sequenceId > sf1's sequenceId. When we get:

Step 1. sf2 is the highest StoreFileScanner, and it calls enforceSeek() in KeyValueHeap.pollRealKV(), so KeyValue2 is read out from the StoreFile by a real seek. It is then compared to the fake KeyValue (call it FakeKeyValue) generated by KeyValue.createFirstOnRow() in StoreScanner.next(). The FakeKeyValue's row, family, qualifier, timestamp, and memstoreTS (always 0 for a StoreFileScanner) are the same as KeyValue2's, except that its key type is Maximum, while the key type in KeyValue2 is Put. So
{code}
comparator.compare(curKV=KeyValue2, nextKV=FakeKeyValue) = 251 > 0
{code}
This means the highest StoreFileScanner's top KeyValue is not considered higher than the second's. The values, for example, are:
{code}
KeyValue2    : putRow/family:qualifier/1234567/Put/vlen=6/ts=0
FakeKeyValue : putRow/family:qualifier/1234567/Maximum/vlen=0/ts=0
{code}
And then the second highest StoreFileScanner becomes the highest, and the previous highest is added back to the heap.

Step 2. sf1's top KeyValue is read out; call it KeyValue1. Its value is the same as KeyValue2, fetched again by heap.peek():
{code}
KeyValue1 : putRow/family:qualifier/1234567/Put/vlen=6/ts=0
{code}

Step 3. KeyValue1 is compared to KeyValue2, and
{code}
comparator.compare(curKV=KeyValue1, nextKV=KeyValue2) = 0
{code}
so sf1's scanner is returned as the highest StoreFileScanner.

My solution is: if all the top KeyValues read out from the StoreFileScanners are the same (compare returns 0), then we should keep the scanners' original order by sequenceId.
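The proposed tie-break can be sketched with a toy heap. This is hypothetical, self-contained code (FakeScanner, peekKey, and seqId are illustrative names, not HBase's actual scanner classes): when two scanners' top keys compare equal, the scanner backed by the store file with the larger sequence id must win, so the newest value is read first.

```java
import java.util.Comparator;
import java.util.PriorityQueue;

public class ScannerOrder {
    // Illustrative stand-in for a StoreFileScanner.
    static class FakeScanner {
        final String peekKey; // top key the scanner currently points at
        final long seqId;     // larger = newer store file
        FakeScanner(String peekKey, long seqId) {
            this.peekKey = peekKey;
            this.seqId = seqId;
        }
    }

    // Compare by key; on an exact key tie, the HIGHER sequence id sorts
    // first, so the newest store file's value wins the heap.
    static final Comparator<FakeScanner> CMP = (a, b) -> {
        int d = a.peekKey.compareTo(b.peekKey);
        return d != 0 ? d : Long.compare(b.seqId, a.seqId);
    };

    public static void main(String[] args) {
        PriorityQueue<FakeScanner> heap = new PriorityQueue<>(CMP);
        // Two flushes produced identical keys (same row/family/qualifier/ts).
        heap.add(new FakeScanner("putRow/family:qualifier/1234567/Put", 1)); // holds "value0"
        heap.add(new FakeScanner("putRow/family:qualifier/1234567/Put", 2)); // holds "value1"
        System.out.println(heap.poll().seqId); // prints 2: the newer file is read first
    }
}
```

Without the sequence-id fallback the two equal keys tie arbitrarily, which is how the test above ends up returning value0 after the second flush.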
[jira] [Updated] (HBASE-6269) Lazyseek should use the maxSequenseId StoreFile's KeyValue as the latest KeyValue
[ https://issues.apache.org/jira/browse/HBASE-6269?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ShiXing updated HBASE-6269: --- Attachment: HBASE-6269-trunk-V1.patch
[jira] [Updated] (HBASE-6269) Lazyseek should use the maxSequenseId StoreFile's KeyValue as the latest KeyValue
[ https://issues.apache.org/jira/browse/HBASE-6269?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ramkrishna.s.vasudevan updated HBASE-6269: -- Status: Patch Available (was: Open) Lazyseek should use the maxSequenseId StoreFile's KeyValue as the latest KeyValue - Key: HBASE-6269 URL: https://issues.apache.org/jira/browse/HBASE-6269 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.94.0 Reporter: ShiXing Assignee: ShiXing Attachments: HBASE-6269-trunk-V1.patch, HBASE-6269-v1.patch While fixing HBASE-6195, I happened to find that the test case sometimes fails, https://builds.apache.org/job/HBase-0.94/259/. If there are two Puts/Increments with the same row, family, qualifier and timestamp but different memstoreTS, and we flush the memstore after each Put/Increment, there will be two StoreFiles with the same KeyValue (differing only in memstoreTS and sequenceId). When I get the row, I always get the old record; the test case is like this:
{code}
public void testPutWithMemStoreFlush() throws Exception {
  Configuration conf = HBaseConfiguration.create();
  String method = "testPutWithMemStoreFlush";
  byte[] tableName = Bytes.toBytes(method);
  byte[] family = Bytes.toBytes("family");
  byte[] qualifier = Bytes.toBytes("qualifier");
  byte[] row = Bytes.toBytes("putRow");
  byte[] value = null;
  this.region = initHRegion(tableName, method, conf, family);
  Put put = null;
  Get get = null;
  List<KeyValue> kvs = null;
  Result res = null;

  put = new Put(row);
  value = Bytes.toBytes("value0");
  put.add(family, qualifier, 1234567L, value);
  region.put(put);
  System.out.print("get value before flush after put value0 : ");
  get = new Get(row);
  get.addColumn(family, qualifier);
  get.setMaxVersions();
  res = this.region.get(get, null);
  kvs = res.getColumn(family, qualifier);
  for (int i = 0; i < kvs.size(); i++) {
    System.out.println(Bytes.toString(kvs.get(i).getValue()));
  }
  region.flushcache();
  System.out.print("get value after flush after put value0 : ");
  get = new Get(row);
  get.addColumn(family, qualifier);
  get.setMaxVersions();
  res = this.region.get(get, null);
  kvs = res.getColumn(family, qualifier);
  for (int i = 0; i < kvs.size(); i++) {
    System.out.println(Bytes.toString(kvs.get(i).getValue()));
  }

  put = new Put(row);
  value = Bytes.toBytes("value1");
  put.add(family, qualifier, 1234567L, value);
  region.put(put);
  System.out.print("get value before flush after put value1 : ");
  get = new Get(row);
  get.addColumn(family, qualifier);
  get.setMaxVersions();
  res = this.region.get(get, null);
  kvs = res.getColumn(family, qualifier);
  for (int i = 0; i < kvs.size(); i++) {
    System.out.println(Bytes.toString(kvs.get(i).getValue()));
  }
  region.flushcache();
  System.out.print("get value after flush after put value1 : ");
  get = new Get(row);
  get.addColumn(family, qualifier);
  get.setMaxVersions();
  res = this.region.get(get, null);
  kvs = res.getColumn(family, qualifier);
  for (int i = 0; i < kvs.size(); i++) {
    System.out.println(Bytes.toString(kvs.get(i).getValue()));
  }

  put = new Put(row);
  value = Bytes.toBytes("value2");
  put.add(family, qualifier, 1234567L, value);
  region.put(put);
  System.out.print("get value before flush after put value2 : ");
  get = new Get(row);
  get.addColumn(family, qualifier);
  get.setMaxVersions();
  res = this.region.get(get, null);
  kvs = res.getColumn(family, qualifier);
  for (int i = 0; i < kvs.size(); i++) {
    System.out.println(Bytes.toString(kvs.get(i).getValue()));
  }
  region.flushcache();
  System.out.print("get value after flush after put value2 : ");
  get = new Get(row);
  get.addColumn(family, qualifier);
  get.setMaxVersions();
  res = this.region.get(get, null);
  kvs = res.getColumn(family, qualifier);
  for (int i = 0; i < kvs.size(); i++) {
    System.out.println(Bytes.toString(kvs.get(i).getValue()));
  }
}
{code}
and the result prints as follows:
{code}
get value before flush after put value0 : value0
get value after flush after put value0 : value0
get value before flush after put value1 : value1
get value after flush after put value1 : value0
get value before flush after put value2 : value2
get value after flush after put value2 : value0
{code}
I analyzed the code for StoreFileScanner with lazy seek: the StoreFileScanners are sorted by sequenceId, so the latest StoreFile is at the top of the KeyValueHeap, and the KeyValue for the latest StoreFile will
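The invariant the issue asks for can be sketched outside HBase. This is an illustration with hypothetical names, not the StoreFileScanner code: a heap that breaks key ties by descending sequence id hands back the duplicate from the newest store file first, which is exactly what the buggy lazy-seek path fails to guarantee.

```java
import java.util.Comparator;
import java.util.PriorityQueue;

// Minimal sketch (not the HBase API): a "scanner" is reduced to the key it is
// positioned on plus the sequence id of its backing store file.
public class LazySeekOrder {
    static final class Scanner {
        final String key;      // row/family/qualifier/timestamp collapsed into one string
        final long sequenceId; // higher = written later
        final String value;
        Scanner(String key, long sequenceId, String value) {
            this.key = key; this.sequenceId = sequenceId; this.value = value;
        }
    }

    // Compare by key first; on equal keys the scanner from the newer store file
    // (larger sequence id) must come out of the heap first, per HBASE-6269.
    static final Comparator<Scanner> HEAP_ORDER = (a, b) -> {
        int cmp = a.key.compareTo(b.key);
        if (cmp != 0) return cmp;
        return Long.compare(b.sequenceId, a.sequenceId); // newer file wins key ties
    };

    public static String newestValue(Scanner... scanners) {
        PriorityQueue<Scanner> heap = new PriorityQueue<>(HEAP_ORDER);
        for (Scanner s : scanners) heap.add(s);
        // Top of the heap: the newest version of the smallest key.
        return heap.peek().value;
    }

    public static void main(String[] args) {
        // Two flushes produced two files with an identical KeyValue; only
        // memstoreTS/sequenceId differ. The later file (seqId 2) must win.
        String v = newestValue(new Scanner("row/f/q/1234567", 1L, "value0"),
                               new Scanner("row/f/q/1234567", 2L, "value1"));
        if (!v.equals("value1")) throw new AssertionError("expected value1, got " + v);
        System.out.println("newest = " + v);
    }
}
```

Without the sequence-id tie-break, whichever scanner happens to sit on top of the heap wins, which is how the test above keeps reading value0 after a flush.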
[jira] [Updated] (HBASE-6200) KeyComparator.compareWithoutRow can be wrong when families have the same prefix
[ https://issues.apache.org/jira/browse/HBASE-6200?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhihong Ted Yu updated HBASE-6200: -- Status: Patch Available (was: Open) KeyComparator.compareWithoutRow can be wrong when families have the same prefix --- Key: HBASE-6200 URL: https://issues.apache.org/jira/browse/HBASE-6200 Project: HBase Issue Type: Bug Affects Versions: 0.94.0, 0.92.1, 0.90.6 Reporter: Jean-Daniel Cryans Assignee: Jieshan Bean Priority: Blocker Fix For: 0.90.7, 0.92.2, 0.96.0, 0.94.1 Attachments: 6200-trunk-v2.patch, 6200-trunk-v3.patch As reported by Desert Rose on IRC and on the ML, {{Result}} behaves oddly when some families share the same prefix. He posted a link to code that shows how it fails, http://pastebin.com/7TBA1XGh Basically {{KeyComparator.compareWithoutRow}} doesn't differentiate families and qualifiers, so f:a is said to be bigger than f1:, which is false. What happens is that the KVs are returned in the right order from the RS, but {{Result.binarySearch}} then uses {{KeyComparator.compareWithoutRow}}, which sorts differently, so the end result is undetermined. I added some debug output and can see that the data is returned in the right order, but {{Arrays.binarySearch}} returned the wrong KV, which is then verified against the passed family and qualifier; that check fails, so null is returned. I don't know how frequently users have families with the same prefix, but those that do, and that use those families at the same time, will have big correctness issues. This is why I mark this as a blocker. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
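The misordering is easy to reproduce with plain strings. A minimal sketch with hypothetical method names (not the actual KeyComparator code): comparing the concatenated family+qualifier bytes puts f:a after f1:, while comparing the family on its own first yields the order the region server actually returns.

```java
public class ColumnOrder {
    // Buggy ordering: treat family+qualifier as one byte sequence, the way
    // KeyComparator.compareWithoutRow effectively does per this issue.
    static int concatCompare(String fam1, String qual1, String fam2, String qual2) {
        return (fam1 + qual1).compareTo(fam2 + qual2);
    }

    // Correct ordering: the family is compared on its own before the qualifier,
    // so a shorter family always sorts before a longer one sharing its prefix.
    static int familyFirstCompare(String fam1, String qual1, String fam2, String qual2) {
        int cmp = fam1.compareTo(fam2);
        return cmp != 0 ? cmp : qual1.compareTo(qual2);
    }

    public static void main(String[] args) {
        // f:a vs f1: — concatenation compares "fa" with "f1", and since
        // 'a' (0x61) > '1' (0x31), f:a is declared the bigger column...
        if (concatCompare("f", "a", "f1", "") <= 0) throw new AssertionError();
        // ...but with the family compared first, "f" < "f1", so f:a
        // correctly sorts before f1:.
        if (familyFirstCompare("f", "a", "f1", "") >= 0) throw new AssertionError();
        System.out.println("concat: f:a > f1:  family-first: f:a < f1:");
    }
}
```

Because {{Arrays.binarySearch}} assumes the array is sorted by the very comparator it is given, feeding it data sorted the other way makes the result of the search undefined, matching the symptom described above.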
[jira] [Created] (HBASE-6270) If all data is locally cached, undo locking and context switching so we are cpu bound
stack created HBASE-6270: Summary: If all data is locally cached, undo locking and context switching so we are cpu bound Key: HBASE-6270 URL: https://issues.apache.org/jira/browse/HBASE-6270 Project: HBase Issue Type: Bug Components: performance Reporter: stack See Dhruba's blog here, towards the end, where he talks about HBase: http://hadoopblog.blogspot.com.es/2012/05/hadoop-and-solid-state-drives.html He says that when all data is local and cached, we bind ourselves up with locks and context switching, so much so that we are unable to use all of the CPU. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-6200) KeyComparator.compareWithoutRow can be wrong when families have the same prefix
[ https://issues.apache.org/jira/browse/HBASE-6200?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhihong Ted Yu updated HBASE-6200: -- Attachment: 6200-trunk-v4.txt Patch v4 reorders some assignments so that variables are calculated immediately before they are used. KeyComparator.compareWithoutRow can be wrong when families have the same prefix --- Key: HBASE-6200 URL: https://issues.apache.org/jira/browse/HBASE-6200 Project: HBase Issue Type: Bug Affects Versions: 0.90.6, 0.92.1, 0.94.0 Reporter: Jean-Daniel Cryans Assignee: Jieshan Bean Priority: Blocker Fix For: 0.90.7, 0.92.2, 0.96.0, 0.94.1 Attachments: 6200-trunk-v2.patch, 6200-trunk-v3.patch, 6200-trunk-v4.txt As reported by Desert Rose on IRC and on the ML, {{Result}} behaves oddly when some families share the same prefix. He posted a link to code that shows how it fails, http://pastebin.com/7TBA1XGh Basically {{KeyComparator.compareWithoutRow}} doesn't differentiate families and qualifiers, so f:a is said to be bigger than f1:, which is false. What happens is that the KVs are returned in the right order from the RS, but {{Result.binarySearch}} then uses {{KeyComparator.compareWithoutRow}}, which sorts differently, so the end result is undetermined. I added some debug output and can see that the data is returned in the right order, but {{Arrays.binarySearch}} returned the wrong KV, which is then verified against the passed family and qualifier; that check fails, so null is returned. I don't know how frequently users have families with the same prefix, but those that do, and that use those families at the same time, will have big correctness issues. This is why I mark this as a blocker. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-5967) OpenDataException because HBaseProtos.ServerLoad cannot be converted to an open data type
[ https://issues.apache.org/jira/browse/HBASE-5967?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhihong Ted Yu updated HBASE-5967: -- Attachment: 5967-v2.patch Re-attaching patch v2. OpenDataException because HBaseProtos.ServerLoad cannot be converted to an open data type - Key: HBASE-5967 URL: https://issues.apache.org/jira/browse/HBASE-5967 Project: HBase Issue Type: Bug Affects Versions: 0.96.0 Reporter: Jimmy Xiang Assignee: Gregory Chanan Priority: Minor Fix For: 0.96.0 Attachments: 5967-v2.patch, HBASE-5967-v2.patch, HBASE-5967.patch, master.log I saw this error in the master log: Caused by: java.lang.IllegalArgumentException: Method org.apache.hadoop.hbase.master.MXBean.getRegionServers has parameter or return type that cannot be translated into an open type at com.sun.jmx.mbeanserver.ConvertingMethod.from(ConvertingMethod.java:32) at com.sun.jmx.mbeanserver.MXBeanIntrospector.mFrom(MXBeanIntrospector.java:63) at com.sun.jmx.mbeanserver.MXBeanIntrospector.mFrom(MXBeanIntrospector.java:33) at com.sun.jmx.mbeanserver.MBeanAnalyzer.initMaps(MBeanAnalyzer.java:118) at com.sun.jmx.mbeanserver.MBeanAnalyzer.init(MBeanAnalyzer.java:99) ... 14 more Caused by: javax.management.openmbean.OpenDataException: Cannot convert type: java.util.Map<java.lang.String, org.apache.hadoop.hbase.ServerLoad> at com.sun.jmx.mbeanserver.OpenConverter.openDataException(OpenConverter.jav -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6200) KeyComparator.compareWithoutRow can be wrong when families have the same prefix
[ https://issues.apache.org/jira/browse/HBASE-6200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13401454#comment-13401454 ] stack commented on HBASE-6200: -- Thanks for taking time on perf Jieshan. +1 on commit (It looks like you have enough tests). KeyComparator.compareWithoutRow can be wrong when families have the same prefix --- Key: HBASE-6200 URL: https://issues.apache.org/jira/browse/HBASE-6200 Project: HBase Issue Type: Bug Affects Versions: 0.90.6, 0.92.1, 0.94.0 Reporter: Jean-Daniel Cryans Assignee: Jieshan Bean Priority: Blocker Fix For: 0.90.7, 0.92.2, 0.96.0, 0.94.1 Attachments: 6200-trunk-v2.patch, 6200-trunk-v3.patch, 6200-trunk-v4.txt As reported by Desert Rose on IRC and on the ML, {{Result}} behaves oddly when some families share the same prefix. He posted a link to code that shows how it fails, http://pastebin.com/7TBA1XGh Basically {{KeyComparator.compareWithoutRow}} doesn't differentiate families and qualifiers, so f:a is said to be bigger than f1:, which is false. What happens is that the KVs are returned in the right order from the RS, but {{Result.binarySearch}} then uses {{KeyComparator.compareWithoutRow}}, which sorts differently, so the end result is undetermined. I added some debug output and can see that the data is returned in the right order, but {{Arrays.binarySearch}} returned the wrong KV, which is then verified against the passed family and qualifier; that check fails, so null is returned. I don't know how frequently users have families with the same prefix, but those that do, and that use those families at the same time, will have big correctness issues. This is why I mark this as a blocker. -- This message is automatically generated by JIRA. 
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5967) OpenDataException because HBaseProtos.ServerLoad cannot be converted to an open data type
[ https://issues.apache.org/jira/browse/HBASE-5967?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13401455#comment-13401455 ] stack commented on HBASE-5967: -- Is this the same as HBASE-5971? (If so, let's close HBASE-5971.) +1 on patch (I would have gone another route, banging my head trying to make ServerLoad resolve as an OpenData type... That would have taken 100x longer and in the end might not have worked... This is the better way to go). OpenDataException because HBaseProtos.ServerLoad cannot be converted to an open data type - Key: HBASE-5967 URL: https://issues.apache.org/jira/browse/HBASE-5967 Project: HBase Issue Type: Bug Affects Versions: 0.96.0 Reporter: Jimmy Xiang Assignee: Gregory Chanan Priority: Minor Fix For: 0.96.0 Attachments: 5967-v2.patch, HBASE-5967-v2.patch, HBASE-5967.patch, master.log I saw this error in the master log: Caused by: java.lang.IllegalArgumentException: Method org.apache.hadoop.hbase.master.MXBean.getRegionServers has parameter or return type that cannot be translated into an open type at com.sun.jmx.mbeanserver.ConvertingMethod.from(ConvertingMethod.java:32) at com.sun.jmx.mbeanserver.MXBeanIntrospector.mFrom(MXBeanIntrospector.java:63) at com.sun.jmx.mbeanserver.MXBeanIntrospector.mFrom(MXBeanIntrospector.java:33) at com.sun.jmx.mbeanserver.MBeanAnalyzer.initMaps(MBeanAnalyzer.java:118) at com.sun.jmx.mbeanserver.MBeanAnalyzer.init(MBeanAnalyzer.java:99) ... 14 more Caused by: javax.management.openmbean.OpenDataException: Cannot convert type: java.util.Map<java.lang.String, org.apache.hadoop.hbase.ServerLoad> at com.sun.jmx.mbeanserver.OpenConverter.openDataException(OpenConverter.jav -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
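The route preferred here — not forcing ServerLoad through the JMX open-type conversion — can be sketched as follows. All names are hypothetical and this is not the committed patch: instead of exposing a Map of a non-convertible type from the MXBean getter, expose a flattened representation built only from types JMX already maps to open types (String, arrays, primitives).

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class MXBeanFlatten {
    // Stand-in for the non-convertible type; the real ServerLoad wraps a protobuf
    // message, which the MXBean introspector cannot turn into a CompositeType.
    static final class ServerLoad {
        final int regions;
        ServerLoad(int regions) { this.regions = regions; }
    }

    // Returning Map<String, ServerLoad> from an MXBean getter fails with
    // OpenDataException at registration time. Flattening to String[] (already
    // an open type) side-steps the conversion entirely.
    static String[] flatten(Map<String, ServerLoad> servers) {
        return servers.entrySet().stream()
            .map(e -> e.getKey() + "=" + e.getValue().regions + " regions")
            .toArray(String[]::new);
    }

    public static void main(String[] args) {
        Map<String, ServerLoad> servers = new LinkedHashMap<>();
        servers.put("rs1.example.com", new ServerLoad(42));
        String[] flat = flatten(servers);
        if (!flat[0].equals("rs1.example.com=42 regions")) throw new AssertionError(flat[0]);
        System.out.println(flat[0]);
    }
}
```

The trade-off is that JMX clients get strings rather than structured CompositeData, but nothing in the bean can fail open-type translation at registration.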
[jira] [Commented] (HBASE-6261) Better approximate high-percentile percentile latency metrics
[ https://issues.apache.org/jira/browse/HBASE-6261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13401463#comment-13401463 ] Otis Gospodnetic commented on HBASE-6261: - @Andrew See https://twitter.com/otisg/status/217487624804376576 Better approximate high-percentile percentile latency metrics - Key: HBASE-6261 URL: https://issues.apache.org/jira/browse/HBASE-6261 Project: HBase Issue Type: New Feature Reporter: Andrew Wang Labels: metrics The existing reservoir-sampling based latency metrics in HBase are not well-suited for providing accurate estimates of high-percentile (e.g. 90th, 95th, or 99th) latency. This is a well-studied problem in the literature (see [1] and [2]), the question is determining which methods best suit our needs and then implementing it. Ideally, we should be able to estimate these high percentiles with minimal memory and CPU usage as well as minimal error (e.g. 1% error on 90th, or .1% on 99th). It's also desirable to provide this over different time-based sliding windows, e.g. last 1 min, 5 mins, 15 mins, and 1 hour. I'll note that this would also be useful in HDFS, or really anywhere latency metrics are kept. [1] http://www.cs.rutgers.edu/~muthu/bquant.pdf [2] http://infolab.stanford.edu/~manku/papers/04pods-sliding.pdf -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
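For context on why reservoir sampling struggles here, a minimal reservoir-based percentile estimator (hypothetical code, not HBase's metrics implementation) looks like this. With only a small bounded sample, few of the top-1% observations survive, which is the error source the issue describes.

```java
import java.util.Arrays;
import java.util.Random;

public class ReservoirPercentile {
    private final long[] sample;
    private long seen = 0;
    private final Random rng = new Random(42); // fixed seed keeps the sketch deterministic

    ReservoirPercentile(int capacity) { this.sample = new long[capacity]; }

    // Classic reservoir sampling (Algorithm R): after n observations, each one
    // remains in the sample with equal probability capacity/n.
    void update(long latencyMicros) {
        if (seen < sample.length) {
            sample[(int) seen] = latencyMicros;
        } else {
            long j = (long) (rng.nextDouble() * (seen + 1));
            if (j < sample.length) sample[(int) j] = latencyMicros;
        }
        seen++;
    }

    // Read the percentile off the sorted sample; accuracy at high percentiles
    // is limited by how few samples land in the tail.
    long percentile(double p) {
        int n = (int) Math.min(seen, sample.length);
        long[] sorted = Arrays.copyOf(sample, n);
        Arrays.sort(sorted);
        return sorted[Math.min(n - 1, (int) (p * n))];
    }

    public static void main(String[] args) {
        ReservoirPercentile r = new ReservoirPercentile(1024);
        for (long i = 1; i <= 100_000; i++) r.update(i); // uniform latencies 1..100000
        // A 1024-entry reservoir keeps only ~10 of the top-1% observations,
        // so the p99 estimate is noisy — the motivation for the stream-summary
        // algorithms cited in [1] and [2].
        System.out.println("estimated p99 = " + r.percentile(0.99) + " (true p99 = 99000)");
    }
}
```

The algorithms in the cited papers trade this uniform sample for quantile-aware summaries with bounded rank error, and add the time-windowing this issue asks for.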
[jira] [Commented] (HBASE-6200) KeyComparator.compareWithoutRow can be wrong when families have the same prefix
[ https://issues.apache.org/jira/browse/HBASE-6200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13401483#comment-13401483 ] Hadoop QA commented on HBASE-6200: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12533487/6200-trunk-v4.txt against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. +1 hadoop2.0. The patch compiles against the hadoop 2.0 profile. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. -1 findbugs. The patch appears to introduce 6 new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these unit tests: org.apache.hadoop.hbase.regionserver.TestServerCustomProtocol org.apache.hadoop.hbase.security.access.TestAccessController Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/2260//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2260//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2260//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/2260//console This message is automatically generated. 
KeyComparator.compareWithoutRow can be wrong when families have the same prefix --- Key: HBASE-6200 URL: https://issues.apache.org/jira/browse/HBASE-6200 Project: HBase Issue Type: Bug Affects Versions: 0.90.6, 0.92.1, 0.94.0 Reporter: Jean-Daniel Cryans Assignee: Jieshan Bean Priority: Blocker Fix For: 0.90.7, 0.92.2, 0.96.0, 0.94.1 Attachments: 6200-trunk-v2.patch, 6200-trunk-v3.patch, 6200-trunk-v4.txt As reported by Desert Rose on IRC and on the ML, {{Result}} behaves oddly when some families share the same prefix. He posted a link to code that shows how it fails, http://pastebin.com/7TBA1XGh Basically {{KeyComparator.compareWithoutRow}} doesn't differentiate families and qualifiers, so f:a is said to be bigger than f1:, which is false. What happens is that the KVs are returned in the right order from the RS, but {{Result.binarySearch}} then uses {{KeyComparator.compareWithoutRow}}, which sorts differently, so the end result is undetermined. I added some debug output and can see that the data is returned in the right order, but {{Arrays.binarySearch}} returned the wrong KV, which is then verified against the passed family and qualifier; that check fails, so null is returned. I don't know how frequently users have families with the same prefix, but those that do, and that use those families at the same time, will have big correctness issues. This is why I mark this as a blocker. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5967) OpenDataException because HBaseProtos.ServerLoad cannot be converted to an open data type
[ https://issues.apache.org/jira/browse/HBASE-5967?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13401486#comment-13401486 ] Hadoop QA commented on HBASE-5967: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12533490/5967-v2.patch against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. +1 hadoop2.0. The patch compiles against the hadoop 2.0 profile. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. -1 findbugs. The patch appears to introduce 6 new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these unit tests: org.apache.hadoop.hbase.regionserver.TestSplitTransactionOnCluster Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/2261//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2261//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2261//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/2261//console This message is automatically generated. 
OpenDataException because HBaseProtos.ServerLoad cannot be converted to an open data type - Key: HBASE-5967 URL: https://issues.apache.org/jira/browse/HBASE-5967 Project: HBase Issue Type: Bug Affects Versions: 0.96.0 Reporter: Jimmy Xiang Assignee: Gregory Chanan Priority: Minor Fix For: 0.96.0 Attachments: 5967-v2.patch, HBASE-5967-v2.patch, HBASE-5967.patch, master.log I saw this error in the master log: Caused by: java.lang.IllegalArgumentException: Method org.apache.hadoop.hbase.master.MXBean.getRegionServers has parameter or return type that cannot be translated into an open type at com.sun.jmx.mbeanserver.ConvertingMethod.from(ConvertingMethod.java:32) at com.sun.jmx.mbeanserver.MXBeanIntrospector.mFrom(MXBeanIntrospector.java:63) at com.sun.jmx.mbeanserver.MXBeanIntrospector.mFrom(MXBeanIntrospector.java:33) at com.sun.jmx.mbeanserver.MBeanAnalyzer.initMaps(MBeanAnalyzer.java:118) at com.sun.jmx.mbeanserver.MBeanAnalyzer.init(MBeanAnalyzer.java:99) ... 14 more Caused by: javax.management.openmbean.OpenDataException: Cannot convert type: java.util.Map<java.lang.String, org.apache.hadoop.hbase.ServerLoad> at com.sun.jmx.mbeanserver.OpenConverter.openDataException(OpenConverter.jav -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5967) OpenDataException because HBaseProtos.ServerLoad cannot be converted to an open data type
[ https://issues.apache.org/jira/browse/HBASE-5967?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13401497#comment-13401497 ] Zhihong Ted Yu commented on HBASE-5967: --- I ran the 4 failed tests manually and they passed. Integrated to trunk. Thanks for the patch, Gregory. Thanks for the review, Stack. OpenDataException because HBaseProtos.ServerLoad cannot be converted to an open data type - Key: HBASE-5967 URL: https://issues.apache.org/jira/browse/HBASE-5967 Project: HBase Issue Type: Bug Affects Versions: 0.96.0 Reporter: Jimmy Xiang Assignee: Gregory Chanan Priority: Minor Fix For: 0.96.0 Attachments: 5967-v2.patch, HBASE-5967-v2.patch, HBASE-5967.patch, master.log I saw this error in the master log: Caused by: java.lang.IllegalArgumentException: Method org.apache.hadoop.hbase.master.MXBean.getRegionServers has parameter or return type that cannot be translated into an open type at com.sun.jmx.mbeanserver.ConvertingMethod.from(ConvertingMethod.java:32) at com.sun.jmx.mbeanserver.MXBeanIntrospector.mFrom(MXBeanIntrospector.java:63) at com.sun.jmx.mbeanserver.MXBeanIntrospector.mFrom(MXBeanIntrospector.java:33) at com.sun.jmx.mbeanserver.MBeanAnalyzer.initMaps(MBeanAnalyzer.java:118) at com.sun.jmx.mbeanserver.MBeanAnalyzer.init(MBeanAnalyzer.java:99) ... 14 more Caused by: javax.management.openmbean.OpenDataException: Cannot convert type: java.util.Map<java.lang.String, org.apache.hadoop.hbase.ServerLoad> at com.sun.jmx.mbeanserver.OpenConverter.openDataException(OpenConverter.jav -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6200) KeyComparator.compareWithoutRow can be wrong when families have the same prefix
[ https://issues.apache.org/jira/browse/HBASE-6200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13401508#comment-13401508 ] Zhihong Ted Yu commented on HBASE-6200: --- I ran the two tests listed above and they passed. @Jieshan: Please prepare patches for the 0.94, 0.92 and 0.90 branches based on patch v4. KeyComparator.compareWithoutRow can be wrong when families have the same prefix --- Key: HBASE-6200 URL: https://issues.apache.org/jira/browse/HBASE-6200 Project: HBase Issue Type: Bug Affects Versions: 0.90.6, 0.92.1, 0.94.0 Reporter: Jean-Daniel Cryans Assignee: Jieshan Bean Priority: Blocker Fix For: 0.90.7, 0.92.2, 0.96.0, 0.94.1 Attachments: 6200-trunk-v2.patch, 6200-trunk-v3.patch, 6200-trunk-v4.txt As reported by Desert Rose on IRC and on the ML, {{Result}} behaves oddly when some families share the same prefix. He posted a link to code that shows how it fails, http://pastebin.com/7TBA1XGh Basically {{KeyComparator.compareWithoutRow}} doesn't differentiate families and qualifiers, so f:a is said to be bigger than f1:, which is false. What happens is that the KVs are returned in the right order from the RS, but {{Result.binarySearch}} then uses {{KeyComparator.compareWithoutRow}}, which sorts differently, so the end result is undetermined. I added some debug output and can see that the data is returned in the right order, but {{Arrays.binarySearch}} returned the wrong KV, which is then verified against the passed family and qualifier; that check fails, so null is returned. I don't know how frequently users have families with the same prefix, but those that do, and that use those families at the same time, will have big correctness issues. This is why I mark this as a blocker. -- This message is automatically generated by JIRA. 
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6261) Better approximate high-percentile percentile latency metrics
[ https://issues.apache.org/jira/browse/HBASE-6261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13401512#comment-13401512 ] Andrew Purtell commented on HBASE-6261: --- Which provoked this response: https://twitter.com/ted_dunning/status/217488314297626625 {quote} The basic techniques from the Mahout OnlineSummarizer will work for this. {quote} Would be great if any subsequent conversation happens on this JIRA instead of in twitterspace. Better approximate high-percentile percentile latency metrics - Key: HBASE-6261 URL: https://issues.apache.org/jira/browse/HBASE-6261 Project: HBase Issue Type: New Feature Reporter: Andrew Wang Labels: metrics The existing reservoir-sampling based latency metrics in HBase are not well-suited for providing accurate estimates of high-percentile (e.g. 90th, 95th, or 99th) latency. This is a well-studied problem in the literature (see [1] and [2]), the question is determining which methods best suit our needs and then implementing it. Ideally, we should be able to estimate these high percentiles with minimal memory and CPU usage as well as minimal error (e.g. 1% error on 90th, or .1% on 99th). It's also desirable to provide this over different time-based sliding windows, e.g. last 1 min, 5 mins, 15 mins, and 1 hour. I'll note that this would also be useful in HDFS, or really anywhere latency metrics are kept. [1] http://www.cs.rutgers.edu/~muthu/bquant.pdf [2] http://infolab.stanford.edu/~manku/papers/04pods-sliding.pdf -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HBASE-6271) In-memory region state is inconsistent
Jimmy Xiang created HBASE-6271: -- Summary: In-memory region state is inconsistent Key: HBASE-6271 URL: https://issues.apache.org/jira/browse/HBASE-6271 Project: HBase Issue Type: Bug Reporter: Jimmy Xiang Assignee: Jimmy Xiang AssignmentManager stores region state related information in several places: regionsInTransition, regions (region info to server name map), and servers (server name to region info set map). However, access to these places is not properly coordinated, which leads to inconsistent in-memory region state. Sometimes a region can even be offline and not in transition. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
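The coordination problem described above — several structures that must change together — can be sketched with a single lock guarding all of them. These are hypothetical names, not the AssignmentManager code: the point is that when every mutation of the three structures happens under one monitor, a reader can never observe a region that is offline yet absent from regionsInTransition.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

public class RegionStateBook {
    private final Set<String> regionsInTransition = new HashSet<>();
    private final Map<String, String> regionToServer = new HashMap<>();       // region -> server
    private final Map<String, Set<String>> serverToRegions = new HashMap<>(); // server -> regions

    // All three structures mutate under one monitor, so no reader interleaves
    // between the individual updates and sees a half-applied assignment.
    public synchronized void assign(String region, String server) {
        regionsInTransition.remove(region);
        String old = regionToServer.put(region, server);
        if (old != null && serverToRegions.containsKey(old)) {
            serverToRegions.get(old).remove(region);
        }
        serverToRegions.computeIfAbsent(server, s -> new HashSet<>()).add(region);
    }

    public synchronized void offline(String region) {
        // An offline region must stay visible in regionsInTransition until it
        // is reassigned — the invariant the issue says is currently violated.
        String old = regionToServer.remove(region);
        if (old != null) serverToRegions.get(old).remove(region);
        regionsInTransition.add(region);
    }

    // A region is in a consistent state if it is either assigned somewhere
    // or known to be in transition; "offline and not in transition" is the bug.
    public synchronized boolean consistent(String region) {
        return regionToServer.containsKey(region) || regionsInTransition.contains(region);
    }

    public static void main(String[] args) {
        RegionStateBook book = new RegionStateBook();
        book.assign("r1", "rs1");
        book.offline("r1");
        if (!book.consistent("r1")) throw new AssertionError("r1 offline and not in transition");
    }
}
```

A coarse single lock is the simplest fix; finer-grained schemes need the same invariant enforced some other way (e.g. a single state machine per region, which is roughly where the follow-up work went).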
[jira] [Created] (HBASE-6272) In-memory region state is inconsistent
Jimmy Xiang created HBASE-6272: -- Summary: In-memory region state is inconsistent Key: HBASE-6272 URL: https://issues.apache.org/jira/browse/HBASE-6272 Project: HBase Issue Type: Bug Reporter: Jimmy Xiang Assignee: Jimmy Xiang AssignmentManager stores region state related information in several places: regionsInTransition, regions (region info to server name map), and servers (server name to region info set map). However, access to these places is not properly coordinated, which leads to inconsistent in-memory region state. Sometimes a region can even be offline and not in transition. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Resolved] (HBASE-6271) In-memory region state is inconsistent
[ https://issues.apache.org/jira/browse/HBASE-6271?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jimmy Xiang resolved HBASE-6271. Resolution: Duplicate Duplicate of HBASE-6272. Clicked twice? Close this one. In-memory region state is inconsistent -- Key: HBASE-6271 URL: https://issues.apache.org/jira/browse/HBASE-6271 Project: HBase Issue Type: Bug Reporter: Jimmy Xiang Assignee: Jimmy Xiang AssignmentManager stores region state related information in several places: regionsInTransition, regions (region info to server name map), and servers (server name to region info set map). However, access to these places is not properly coordinated, which leads to inconsistent in-memory region state. Sometimes a region can even be offline and not in transition. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-6269) Lazyseek should use the maxSequenseId StoreFile's KeyValue as the latest KeyValue
[ https://issues.apache.org/jira/browse/HBASE-6269?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhihong Ted Yu updated HBASE-6269: -- Status: Open (was: Patch Available) Lazyseek should use the maxSequenseId StoreFile's KeyValue as the latest KeyValue - Key: HBASE-6269 URL: https://issues.apache.org/jira/browse/HBASE-6269 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.94.0 Reporter: ShiXing Assignee: ShiXing Attachments: HBASE-6269-trunk-V1.patch, HBASE-6269-v1.patch While fixing HBASE-6195 I happened to find that the test case would sometimes fail, see https://builds.apache.org/job/HBase-0.94/259/. If there are two Puts/Increments with the same row, family, qualifier and timestamp but different memstoreTS, and we do a memstore flush after each Put/Increment, there will be two StoreFiles with the same KeyValue (except for memstoreTS and SequenceId). When I then get the row, I always get the old record. The test case is like this:
{code}
public void testPutWithMemStoreFlush() throws Exception {
  Configuration conf = HBaseConfiguration.create();
  String method = "testPutWithMemStoreFlush";
  byte[] tableName = Bytes.toBytes(method);
  byte[] family = Bytes.toBytes("family");
  byte[] qualifier = Bytes.toBytes("qualifier");
  byte[] row = Bytes.toBytes("putRow");
  byte[] value = null;
  this.region = initHRegion(tableName, method, conf, family);
  Put put = null; Get get = null;
  List<KeyValue> kvs = null; Result res = null;

  put = new Put(row); value = Bytes.toBytes("value0");
  put.add(family, qualifier, 1234567L, value); region.put(put);
  System.out.print("get value before flush after put value0 : ");
  get = new Get(row); get.addColumn(family, qualifier); get.setMaxVersions();
  res = this.region.get(get, null); kvs = res.getColumn(family, qualifier);
  for (int i = 0; i < kvs.size(); i++) { System.out.println(Bytes.toString(kvs.get(i).getValue())); }
  region.flushcache();
  System.out.print("get value after flush after put value0 : ");
  get = new Get(row); get.addColumn(family, qualifier); get.setMaxVersions();
  res = this.region.get(get, null); kvs = res.getColumn(family, qualifier);
  for (int i = 0; i < kvs.size(); i++) { System.out.println(Bytes.toString(kvs.get(i).getValue())); }

  put = new Put(row); value = Bytes.toBytes("value1");
  put.add(family, qualifier, 1234567L, value); region.put(put);
  System.out.print("get value before flush after put value1 : ");
  get = new Get(row); get.addColumn(family, qualifier); get.setMaxVersions();
  res = this.region.get(get, null); kvs = res.getColumn(family, qualifier);
  for (int i = 0; i < kvs.size(); i++) { System.out.println(Bytes.toString(kvs.get(i).getValue())); }
  region.flushcache();
  System.out.print("get value after flush after put value1 : ");
  get = new Get(row); get.addColumn(family, qualifier); get.setMaxVersions();
  res = this.region.get(get, null); kvs = res.getColumn(family, qualifier);
  for (int i = 0; i < kvs.size(); i++) { System.out.println(Bytes.toString(kvs.get(i).getValue())); }

  put = new Put(row); value = Bytes.toBytes("value2");
  put.add(family, qualifier, 1234567L, value); region.put(put);
  System.out.print("get value before flush after put value2 : ");
  get = new Get(row); get.addColumn(family, qualifier); get.setMaxVersions();
  res = this.region.get(get, null); kvs = res.getColumn(family, qualifier);
  for (int i = 0; i < kvs.size(); i++) { System.out.println(Bytes.toString(kvs.get(i).getValue())); }
  region.flushcache();
  System.out.print("get value after flush after put value2 : ");
  get = new Get(row); get.addColumn(family, qualifier); get.setMaxVersions();
  res = this.region.get(get, null); kvs = res.getColumn(family, qualifier);
  for (int i = 0; i < kvs.size(); i++) { System.out.println(Bytes.toString(kvs.get(i).getValue())); }
}
{code}
and the result prints as follows:
{code}
get value before flush after put value0 : value0
get value after flush after put value0 : value0
get value before flush after put value1 : value1
get value after flush after put value1 : value0
get value before flush after put value2 : value2
get value after flush after put value2 : value0
{code}
I analyzed the code for StoreFileScanner with lazy seek: the StoreFileScanners are sorted by SequenceId, so the latest StoreFile is on the top of the KeyValueHeap, and the KeyValue for the latest StoreFile will compare to the
[jira] [Updated] (HBASE-6269) Lazyseek should use the maxSequenseId StoreFile's KeyValue as the latest KeyValue
[ https://issues.apache.org/jira/browse/HBASE-6269?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhihong Ted Yu updated HBASE-6269: -- Status: Patch Available (was: Open) Lazyseek should use the maxSequenseId StoreFile's KeyValue as the latest KeyValue - Key: HBASE-6269 URL: https://issues.apache.org/jira/browse/HBASE-6269 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.94.0 Reporter: ShiXing Assignee: ShiXing Attachments: HBASE-6269-trunk-V1.patch, HBASE-6269-v1.patch
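The tie-break ShiXing is after can be pictured with a plain-Java sketch (class and field names here are hypothetical, not the actual HBase patch): when two scanners sit on KeyValues that compare equal on row/family/qualifier/timestamp, the heap should break the tie by the backing file's max sequence id, descending, so the newest flush wins.

```java
import java.util.Comparator;
import java.util.PriorityQueue;

// Hypothetical model: each "scanner" exposes its current key and the
// sequence id of the store file it reads from.
class LazySeekSketch {
    static final class Scanner {
        final String key;      // stands in for row/family:qualifier/timestamp
        final long sequenceId; // max sequence id of the backing store file
        final String value;
        Scanner(String key, long sequenceId, String value) {
            this.key = key; this.sequenceId = sequenceId; this.value = value;
        }
    }

    // Equal keys: the higher sequence id (newer file) sorts first in the heap.
    static final Comparator<Scanner> HEAP_ORDER = (a, b) -> {
        int c = a.key.compareTo(b.key);
        if (c != 0) return c;
        return Long.compare(b.sequenceId, a.sequenceId); // newer file first
    };

    static String top(Scanner... scanners) {
        PriorityQueue<Scanner> heap = new PriorityQueue<>(HEAP_ORDER);
        for (Scanner s : scanners) heap.add(s);
        return heap.peek().value;
    }

    public static void main(String[] args) {
        // Two flushes wrote the same key; the file with sequence id 2 is newer.
        Scanner older = new Scanner("putRow/family:qualifier/1234567", 1L, "value0");
        Scanner newer = new Scanner("putRow/family:qualifier/1234567", 2L, "value1");
        System.out.println(top(older, newer)); // prints "value1"
    }
}
```

Without the sequence-id tie-break, which scanner ends up on top of the heap is arbitrary, which matches the flaky "always got the old record" behaviour in the test above.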
[jira] [Resolved] (HBASE-6240) Race in HCM.getMaster stalls clients
[ https://issues.apache.org/jira/browse/HBASE-6240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ramkrishna.s.vasudevan resolved HBASE-6240. --- Resolution: Fixed Assignee: ramkrishna.s.vasudevan Hadoop Flags: Reviewed Committed to 0.94. Thanks JD for the patch and review. Thanks to Ted for the review. Will open a follow-up JIRA to address JD's comments over here. Race in HCM.getMaster stalls clients Key: HBASE-6240 URL: https://issues.apache.org/jira/browse/HBASE-6240 Project: HBase Issue Type: Bug Affects Versions: 0.94.0 Reporter: Jean-Daniel Cryans Assignee: ramkrishna.s.vasudevan Priority: Critical Fix For: 0.94.1 Attachments: HBASE-6240.patch, HBASE-6240_1_0.94.patch I found this issue trying to run YCSB on 0.94; I don't think it exists on any other branch. I believe it was introduced in HBASE-5058 (Allow HBaseAdmin to use an existing connection). The issue is that HCM.getMaster follows this recipe:
# Check if the master is non-null and running (if so, return)
# Grab a lock on masterLock
# Nullify this.master
# Try to get a new master
The issue happens at step 3: it should re-run step 1, since while you were waiting on the lock someone else could have already fixed the master for you. What happens right now is that the threads are all able to set the master to null before the others get out of getMaster, and it's a complete mess. Figuring it out took me some time because it doesn't manifest itself right away; silent retries are done in the background.
Basically the first clue was this:
{noformat}
Error doing get: org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed after attempts=10, exceptions:
Tue Jun 19 23:40:46 UTC 2012, org.apache.hadoop.hbase.client.HTable$3@571a4bd4, java.io.IOException: org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@2eb0a3f5 closed
Tue Jun 19 23:40:47 UTC 2012, org.apache.hadoop.hbase.client.HTable$3@571a4bd4, java.io.IOException: org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@2eb0a3f5 closed
Tue Jun 19 23:40:48 UTC 2012, org.apache.hadoop.hbase.client.HTable$3@571a4bd4, java.io.IOException: org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@2eb0a3f5 closed
Tue Jun 19 23:40:49 UTC 2012, org.apache.hadoop.hbase.client.HTable$3@571a4bd4, java.io.IOException: org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@2eb0a3f5 closed
Tue Jun 19 23:40:51 UTC 2012, org.apache.hadoop.hbase.client.HTable$3@571a4bd4, java.io.IOException: org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@2eb0a3f5 closed
Tue Jun 19 23:40:53 UTC 2012, org.apache.hadoop.hbase.client.HTable$3@571a4bd4, java.io.IOException: org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@2eb0a3f5 closed
Tue Jun 19 23:40:57 UTC 2012, org.apache.hadoop.hbase.client.HTable$3@571a4bd4, java.io.IOException: org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@2eb0a3f5 closed
Tue Jun 19 23:41:01 UTC 2012, org.apache.hadoop.hbase.client.HTable$3@571a4bd4, java.io.IOException: org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@2eb0a3f5 closed
Tue Jun 19 23:41:09 UTC 2012, org.apache.hadoop.hbase.client.HTable$3@571a4bd4, java.io.IOException: org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@2eb0a3f5 closed
Tue Jun 19 23:41:25 UTC 2012, org.apache.hadoop.hbase.client.HTable$3@571a4bd4, java.io.IOException: org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@2eb0a3f5 closed
{noformat}
This was caused by the little dance up in HBaseAdmin where it deletes stale connections... which are not stale at all. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
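JD's recipe above is the classic broken double-checked pattern; the fix is to re-run the null check after acquiring the lock. A minimal stand-alone sketch of that shape (names and the String stand-in for the master proxy are hypothetical, not the actual HConnectionManager code):

```java
// Sketch: re-check under the lock, so a thread that waited on masterLock
// does not throw away a master that another thread just restored.
class MasterHolder {
    private final Object masterLock = new Object();
    private volatile String master;  // stands in for the HMaster proxy
    private int reconnects = 0;      // how many times we "re-made" the master

    String getMaster() {
        String m = master;
        if (m != null) return m;               // step 1: fast path, no lock
        synchronized (masterLock) {            // step 2: grab the lock
            if (master != null) return master; // re-run step 1: someone fixed it
            master = "master@" + (++reconnects); // steps 3-4: make a new master
            return master;
        }
    }

    int reconnectCount() { return reconnects; }
}
```

With the re-check, any number of callers racing through getMaster results in a single reconnect; the rest return the value the winner installed, instead of each nullifying and rebuilding the master in turn.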
[jira] [Created] (HBASE-6273) HMasterInterface.isMasterRunning() requires clean up
ramkrishna.s.vasudevan created HBASE-6273: - Summary: HMasterInterface.isMasterRunning() requires clean up Key: HBASE-6273 URL: https://issues.apache.org/jira/browse/HBASE-6273 Project: HBase Issue Type: Bug Components: master Affects Versions: 0.94.0 Reporter: ramkrishna.s.vasudevan Fix For: 0.94.1 This JIRA is in reference to JD's comments regarding the clean up needed in isMasterRunning(). Refer to https://issues.apache.org/jira/browse/HBASE-6240?focusedCommentId=13400772page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13400772
[jira] [Commented] (HBASE-6240) Race in HCM.getMaster stalls clients
[ https://issues.apache.org/jira/browse/HBASE-6240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13401534#comment-13401534 ] ramkrishna.s.vasudevan commented on HBASE-6240: --- HBASE-6273 raised. Race in HCM.getMaster stalls clients Key: HBASE-6240 URL: https://issues.apache.org/jira/browse/HBASE-6240 Project: HBase Issue Type: Bug Affects Versions: 0.94.0 Reporter: Jean-Daniel Cryans Assignee: ramkrishna.s.vasudevan Priority: Critical Fix For: 0.94.1 Attachments: HBASE-6240.patch, HBASE-6240_1_0.94.patch
[jira] [Commented] (HBASE-6228) Fixup daughters twice cause daughter region assigned twice
[ https://issues.apache.org/jira/browse/HBASE-6228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13401539#comment-13401539 ] ramkrishna.s.vasudevan commented on HBASE-6228: --- @Jon So you are not OK with this fix, Jon? Fixup daughters twice cause daughter region assigned twice --- Key: HBASE-6228 URL: https://issues.apache.org/jira/browse/HBASE-6228 Project: HBase Issue Type: Bug Components: master Reporter: chunhui shen Assignee: chunhui shen Fix For: 0.96.0 Attachments: HBASE-6228.patch, HBASE-6228v2.patch, HBASE-6228v2.patch First, how does fixing up daughters twice happen?
1. We fix up daughters at the end of HMaster#finishInitialization.
2. ServerShutdownHandler also fixes up daughters when reassigning regions through ServerShutdownHandler#processDeadRegion.
When fixing up daughters we add the daughters to .META., but that cannot prevent the case above, because of FindDaughterVisitor. The detail is as follows. Suppose region A is a split parent region and its daughter region B is missing:
1. First, the ServerShutdownHandler thread fixes up the daughter, so it adds daughter region B to .META. with serverName=null and assigns the daughter.
2. Then, the Master's initialization thread will also find that daughter region B is missing and assign it, because FindDaughterVisitor considers a daughter missing if its serverName=null.
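The double assignment chunhui describes is a check-then-act race between two fixup paths. One hedged way to picture a fix (an illustration only, not the actual patch, which adjusts FindDaughterVisitor) is to make the "add daughter to .META. and assign" step atomic, so the second fixer observes the first one's work:

```java
import java.util.concurrent.ConcurrentHashMap;

// Illustration: two threads both notice daughter "B" is missing.
// putIfAbsent makes the meta insert atomic, so only the thread that actually
// inserted the row goes on to assign the region.
class DaughterFixup {
    private final ConcurrentHashMap<String, String> meta = new ConcurrentHashMap<>();
    private int assignments = 0;

    void fixupDaughter(String region) {
        // A null return means we were first to add the row, so we assign it;
        // a non-null return means some other fixer already handled it.
        if (meta.putIfAbsent(region, "PENDING_OPEN") == null) {
            assign(region);
        }
    }

    private synchronized void assign(String region) { assignments++; }

    synchronized int assignments() { return assignments; }
}
```

The key point is that "is the daughter missing?" and "record that we are handling it" happen as one atomic operation, not as a read followed by a separate write.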
[jira] [Commented] (HBASE-6060) Regions's in OPENING state from failed regionservers takes a long time to recover
[ https://issues.apache.org/jira/browse/HBASE-6060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13401542#comment-13401542 ] ramkrishna.s.vasudevan commented on HBASE-6060: --- @Stack So for the latest version of the patch, can we move the step where the node is changed from OFFLINE to OPENING and let the remaining part be in the OpenRegionHandler? Regions's in OPENING state from failed regionservers takes a long time to recover - Key: HBASE-6060 URL: https://issues.apache.org/jira/browse/HBASE-6060 Project: HBase Issue Type: Bug Components: master, regionserver Reporter: Enis Soztutar Assignee: rajeshbabu Fix For: 0.96.0, 0.94.1, 0.92.3 Attachments: 6060-94-v3.patch, 6060-94-v4.patch, 6060-94-v4_1.patch, 6060-94-v4_1.patch, 6060-trunk.patch, 6060-trunk.patch, 6060-trunk_2.patch, 6060-trunk_3.patch, 6060_alternative_suggestion.txt, 6060_suggestion2_based_off_v3.patch, 6060_suggestion_based_off_v3.patch, 6060_suggestion_toassign_rs_wentdown_beforerequest.patch, HBASE-6060-92.patch, HBASE-6060-94.patch, HBASE-6060-trunk_4.patch, HBASE-6060_trunk_5.patch We have seen a pattern in tests: regions are stuck in OPENING state for a very long time when the region server that is opening the region fails. My understanding of the process:
- The master calls the rs to open the region. If the rs is offline, a new plan is generated (a new rs is chosen). RegionState is set to PENDING_OPEN (only in master memory; zk still shows OFFLINE). See HRegionServer.openRegion(), HMaster.assign().
- The RegionServer starts opening the region and changes the state in the znode. But that znode is not ephemeral. (See ZkAssign.)
- The rs transitions the zk node from OFFLINE to OPENING. See OpenRegionHandler.process().
- The rs then opens the region and changes the znode from OPENING to OPENED.
- When the rs is killed between the OPENING and OPENED states, zk shows the OPENING state and the master just waits for the rs to change the region state; but since the rs is down, that won't happen.
- There is an AssignmentManager.TimeoutMonitor, which guards exactly against these kinds of conditions. It periodically checks (every 10 sec by default) the regions in transition to see whether they timed out (hbase.master.assignment.timeoutmonitor.timeout). The default timeout is 30 min, which explains what you and I are seeing.
- ServerShutdownHandler in the Master does not reassign regions in the OPENING state, although it handles other states.
Lowering that threshold in the configuration is one option, but I still think we can do better. Will investigate more.
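The TimeoutMonitor behaviour Enis describes, periodically flagging regions whose transition has outlived hbase.master.assignment.timeoutmonitor.timeout (30 minutes by default), can be modelled in a few lines. This is a sketch of the check, not the HBase implementation:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Model: region name -> time (ms) the region entered its current transition
// state. The monitor flags regions that have been in transition too long.
class TimeoutMonitorSketch {
    static final long DEFAULT_TIMEOUT_MS = 30L * 60 * 1000; // 30 min default

    static List<String> timedOut(Map<String, Long> inTransition, long now, long timeoutMs) {
        List<String> out = new ArrayList<>();
        for (Map.Entry<String, Long> e : inTransition.entrySet()) {
            if (now - e.getValue() > timeoutMs) out.add(e.getKey()); // stuck too long
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, Long> rit = new HashMap<>();
        rit.put("regionA", 0L);           // entered OPENING at t=0
        rit.put("regionB", 29L * 60_000); // entered recently
        // At t=31min only regionA has exceeded the 30-minute default.
        System.out.println(timedOut(rit, 31L * 60_000, DEFAULT_TIMEOUT_MS));
    }
}
```

This also makes the complaint concrete: with a 10-second check period but a 30-minute threshold, a region orphaned in OPENING is scanned often but only acted on half an hour later.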
[jira] [Commented] (HBASE-6205) Support an option to keep data of dropped table for some time
[ https://issues.apache.org/jira/browse/HBASE-6205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13401544#comment-13401544 ] Devaraj Das commented on HBASE-6205: Yeah, agree with Ramkrishna on the points. I'd expect little impact on the hbck tool but haven't thought about it deeply enough (ideally, the hbck tool shouldn't need to change - it should treat disable_delete tables the same way as it treats disabled tables today). Support an option to keep data of dropped table for some time - Key: HBASE-6205 URL: https://issues.apache.org/jira/browse/HBASE-6205 Project: HBase Issue Type: New Feature Affects Versions: 0.94.0, 0.96.0 Reporter: chunhui shen Assignee: chunhui shen Fix For: 0.96.0 Attachments: HBASE-6205.patch, HBASE-6205v2.patch, HBASE-6205v3.patch, HBASE-6205v4.patch, HBASE-6205v5.patch A user may drop a table accidentally because of erroneous code or other uncertain reasons. Unfortunately, it happened in our environment because one user made a mistake between the production cluster and the testing cluster. So I just give a suggestion: do we need to support an option to keep the data of a dropped table for some time, e.g. 1 day? In the patch: We make a new dir named .trashtables in the root dir. In the DeleteTableHandler, we move files in the dropped table's dir to the trash table dir instead of deleting them directly. And we create a new class TrashCleaner, which periodically cleans dropped tables once they time out. The default keep time for dropped tables is 1 day, and the check period is 1 hour.
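The mechanism in the description, rename into .trashtables and then have a periodic cleaner delete entries older than the keep time, can be sketched with java.nio.file. Paths, the timestamp-suffix naming scheme, and the helper names are all assumptions for illustration; the real patch works against HDFS through the Hadoop FileSystem API:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Sketch: "drop" moves the table dir into .trashtables instead of deleting it;
// the cleaner removes trashed tables older than the keep time (default 1 day).
class TrashTables {
    static final long KEEP_MS = 24L * 60 * 60 * 1000; // default keep time: 1 day

    static Path moveToTrash(Path rootDir, Path tableDir) throws IOException {
        Path trash = rootDir.resolve(".trashtables");
        Files.createDirectories(trash);
        // Stamp the trash entry with the drop time so the cleaner can age it out.
        Path target = trash.resolve(tableDir.getFileName() + "." + System.currentTimeMillis());
        return Files.move(tableDir, target);
    }

    // The periodic cleaner (check period: 1 hour in the patch) would delete
    // every trash entry for which this returns true.
    static boolean expired(Path trashed, long now) {
        String name = trashed.getFileName().toString();
        long droppedAt = Long.parseLong(name.substring(name.lastIndexOf('.') + 1));
        return now - droppedAt > KEEP_MS;
    }
}
```

The design point is that drop becomes a cheap rename, so an accidental drop is recoverable for the keep window at the cost of the data's disk space.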
[jira] [Created] (HBASE-6274) Proto files should be in the same place
Jimmy Xiang created HBASE-6274: -- Summary: Proto files should be in the same place Key: HBASE-6274 URL: https://issues.apache.org/jira/browse/HBASE-6274 Project: HBase Issue Type: Improvement Affects Versions: 0.96.0 Reporter: Jimmy Xiang Priority: Trivial Fix For: 0.96.0 Currently, proto files are under hbase-server/src/main/protobuf and hbase-server/src/protobuf. It's better to put them together.
[jira] [Commented] (HBASE-6273) HMasterInterface.isMasterRunning() requires clean up
[ https://issues.apache.org/jira/browse/HBASE-6273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13401552#comment-13401552 ] Jean-Daniel Cryans commented on HBASE-6273: --- I think we should have two exceptions:
- MasterNotRunning (you can contact the machine, but you get a connection refused; maybe I'd also include PleaseHoldException)
- MasterUnreachable (unknown host, EOF, and probably other IOEs)
This will really help operability; it has happened a couple of times on the mailing list that someone would say "I got MasterNotRunning but it's running, I can use it" when all they had was a connectivity issue. I'd prefer we don't do this for a point release. HMasterInterface.isMasterRunning() requires clean up Key: HBASE-6273 URL: https://issues.apache.org/jira/browse/HBASE-6273 Project: HBase Issue Type: Bug Components: master Affects Versions: 0.94.0 Reporter: ramkrishna.s.vasudevan Fix For: 0.94.1 This JIRA is in reference to JD's comments regarding the clean up needed in isMasterRunning(). Refer to https://issues.apache.org/jira/browse/HBASE-6240?focusedCommentId=13400772page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13400772
[jira] [Commented] (HBASE-5967) OpenDataException because HBaseProtos.ServerLoad cannot be converted to an open data type
[ https://issues.apache.org/jira/browse/HBASE-5967?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13401556#comment-13401556 ] Hudson commented on HBASE-5967: --- Integrated in HBase-TRUNK #3074 (See [https://builds.apache.org/job/HBase-TRUNK/3074/]) HBASE-5967 OpenDataException because HBaseProtos.ServerLoad cannot be converted to an open data type (Gregory) (Revision 1354098) Result = SUCCESS tedyu : Files : * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/ClusterStatus.java * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/ServerLoad.java * /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/master/TestMasterNoCluster.java OpenDataException because HBaseProtos.ServerLoad cannot be converted to an open data type - Key: HBASE-5967 URL: https://issues.apache.org/jira/browse/HBASE-5967 Project: HBase Issue Type: Bug Affects Versions: 0.96.0 Reporter: Jimmy Xiang Assignee: Gregory Chanan Priority: Minor Fix For: 0.96.0 Attachments: 5967-v2.patch, HBASE-5967-v2.patch, HBASE-5967.patch, master.log I saw this error in the master log:
Caused by: java.lang.IllegalArgumentException: Method org.apache.hadoop.hbase.master.MXBean.getRegionServers has parameter or return type that cannot be translated into an open type
at com.sun.jmx.mbeanserver.ConvertingMethod.from(ConvertingMethod.java:32)
at com.sun.jmx.mbeanserver.MXBeanIntrospector.mFrom(MXBeanIntrospector.java:63)
at com.sun.jmx.mbeanserver.MXBeanIntrospector.mFrom(MXBeanIntrospector.java:33)
at com.sun.jmx.mbeanserver.MBeanAnalyzer.initMaps(MBeanAnalyzer.java:118)
at com.sun.jmx.mbeanserver.MBeanAnalyzer.init(MBeanAnalyzer.java:99)
... 14 more
Caused by: javax.management.openmbean.OpenDataException: Cannot convert type: java.util.Map<java.lang.String, org.apache.hadoop.hbase.ServerLoad>
at com.sun.jmx.mbeanserver.OpenConverter.openDataException(OpenConverter.jav
[jira] [Commented] (HBASE-6272) In-memory region state is inconsistent
[ https://issues.apache.org/jira/browse/HBASE-6272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13401566#comment-13401566 ] Jimmy Xiang commented on HBASE-6272: One example is:
{code}
void regionOnline(HRegionInfo regionInfo, ServerName sn) {
  // no lock, concurrency ok.
  this.regionsInTransition.remove(regionInfo.getEncodedName());
  synchronized (this.regions) {
    // Add check
    ServerName oldSn = this.regions.get(regionInfo);
    if (oldSn != null) {
      LOG.warn("Overwriting " + regionInfo.getEncodedName() + " on " + oldSn + " with " + sn);
    }
    if (isServerOnline(sn)) {
      this.regions.put(regionInfo, sn);
      addToServers(sn, regionInfo);
      this.regions.notifyAll();
    } else {
      LOG.info("The server is not in online servers, ServerName=" + sn.getServerName() + ", region=" + regionInfo.getEncodedName());
    }
  }
}
{code}
If the server is not online any more, the region ends up neither in transition nor online. In-memory region state is inconsistent -- Key: HBASE-6272 URL: https://issues.apache.org/jira/browse/HBASE-6272 Project: HBase Issue Type: Bug Reporter: Jimmy Xiang Assignee: Jimmy Xiang AssignmentManager stores region-state-related information in several places: regionsInTransition, regions (region info to server name map), and servers (server name to region info set map). However, access to these places is not coordinated properly. This leads to inconsistent in-memory region state information. Sometimes a region could even be offline and not in transition.
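The lost-region case Jimmy points out comes from removing the region from regionsInTransition outside the lock, then declining to add it to the online map. A hedged sketch of the invariant-preserving shape (an illustration, not the actual AssignmentManager): one lock guards both structures, and a region always lands in exactly one of them.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Sketch: every state change moves a region between the two collections under
// one lock, so "neither online nor in transition" can never be observed.
class RegionStates {
    private final Map<String, String> online = new HashMap<>(); // region -> server
    private final Set<String> inTransition = new HashSet<>();
    private final Set<String> offlineServers = new HashSet<>();

    synchronized void startTransition(String region) {
        online.remove(region);
        inTransition.add(region);
    }

    synchronized void regionOnline(String region, String server) {
        if (offlineServers.contains(server)) {
            // Target server died meanwhile: keep the region in transition so it
            // can be re-assigned, instead of dropping it from both collections.
            return;
        }
        inTransition.remove(region);
        online.put(region, server);
    }

    synchronized void serverOffline(String server) { offlineServers.add(server); }

    synchronized boolean lost(String region) {
        return !online.containsKey(region) && !inTransition.contains(region);
    }
}
```

The contrast with the quoted snippet: there, the remove from regionsInTransition has already happened by the time the "server not online" branch runs, so the region vanishes from both views; here the remove and the add are a single atomic decision.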
[jira] [Updated] (HBASE-5061) StoreFileLocalityChecker
[ https://issues.apache.org/jira/browse/HBASE-5061?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-5061: -- Attachment: (was: StoreFileLocalityChecker.java) StoreFileLocalityChecker Key: HBASE-5061 URL: https://issues.apache.org/jira/browse/HBASE-5061 Project: HBase Issue Type: New Feature Reporter: Andrew Purtell Assignee: Andrew Purtell Priority: Minor org.apache.hadoop.hbase.HFileLocalityChecker [options] A tool to report the number of local and nonlocal HFile blocks, and the ratio of the two as a percentage. Where options are:
|-f file|Analyze a store file|
|-r region|Analyze all store files for the region|
|-t table|Analyze all store files for regions of the table served by the local regionserver|
|-h host|Consider host local, defaults to the local host|
|-v|Verbose operation|
[jira] [Updated] (HBASE-5061) StoreFileLocalityChecker
[ https://issues.apache.org/jira/browse/HBASE-5061?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-5061: -- Attachment: HBASE-5061-0.94.patch HBASE-5061.patch
[jira] [Updated] (HBASE-5061) StoreFileLocalityChecker
[ https://issues.apache.org/jira/browse/HBASE-5061?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-5061: -- Affects Version/s: 0.94.1 0.96.0 Status: Patch Available (was: Open)
[jira] [Commented] (HBASE-5061) StoreFileLocalityChecker
[ https://issues.apache.org/jira/browse/HBASE-5061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13401571#comment-13401571 ] Andrew Purtell commented on HBASE-5061: --- Updated patch adds '-j' option to produce JSON output. StoreFileLocalityChecker Key: HBASE-5061 URL: https://issues.apache.org/jira/browse/HBASE-5061 Project: HBase Issue Type: New Feature Affects Versions: 0.96.0, 0.94.1 Reporter: Andrew Purtell Assignee: Andrew Purtell Priority: Minor Attachments: HBASE-5061-0.94.patch, HBASE-5061.patch org.apache.hadoop.hbase.HFileLocalityChecker [options] A tool to report the number of local and nonlocal HFile blocks, and the ratio of as a percentage. Where options are: |-f file|Analyze a store file| |-r region|Analyze all store files for the region| |-t table|Analyze all store files for regions of the table served by the local regionserver| |-h host|Consider host local, defaults to the local host| |-v|Verbose operation| -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-6263) Use default mode for HBase Thrift gateway if not specified
[ https://issues.apache.org/jira/browse/HBASE-6263?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-6263: -- Component/s: (was: scripts) Assignee: Andrew Purtell Summary: Use default mode for HBase Thrift gateway if not specified (was: Binscript should pick a default mode for HBase Thrift gateway) Use default mode for HBase Thrift gateway if not specified -- Key: HBASE-6263 URL: https://issues.apache.org/jira/browse/HBASE-6263 Project: HBase Issue Type: Bug Components: thrift Affects Versions: 0.94.0, 0.96.0 Reporter: Andrew Purtell Assignee: Andrew Purtell Priority: Minor Labels: noob Attachments: HBASE-6263-0.94.patch, HBASE-6263.patch The Thrift gateway should start with a default mode if one is not selected. Currently, instead we see: {noformat} Exception in thread main java.lang.AssertionError: Exactly one option out of [-hsha, -nonblocking, -threadpool, -threadedselector] has to be specified at org.apache.hadoop.hbase.thrift.ThriftServerRunner$ImplType.setServerImpl(ThriftServerRunner.java:201) at org.apache.hadoop.hbase.thrift.ThriftServer.processOptions(ThriftServer.java:169) at org.apache.hadoop.hbase.thrift.ThriftServer.doMain(ThriftServer.java:85) at org.apache.hadoop.hbase.thrift.ThriftServer.main(ThriftServer.java:192) {noformat} See also BIGTOP-648. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
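The AssertionError above is thrown because exactly one of the mode flags must currently be passed on the command line. A minimal sketch of the defaulting behavior the issue asks for — this is a hypothetical standalone helper, not the actual HBASE-6263 patch, and the choice of {{-threadpool}} as the fallback is an assumption:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ThriftModeDefault {
    static final List<String> MODES =
        Arrays.asList("-hsha", "-nonblocking", "-threadpool", "-threadedselector");
    // Assumed default; the real patch may pick a different ImplType.
    static final String DEFAULT_MODE = "-threadpool";

    // Returns the selected server mode, falling back to a default instead of
    // asserting when no mode flag is present on the command line.
    public static String selectMode(String[] args) {
        List<String> given = new ArrayList<>();
        for (String a : args) {
            if (MODES.contains(a)) given.add(a);
        }
        if (given.isEmpty()) return DEFAULT_MODE; // was: AssertionError
        if (given.size() > 1) {
            throw new IllegalArgumentException("At most one mode may be given: " + given);
        }
        return given.get(0);
    }

    public static void main(String[] args) {
        // No mode flag given: the helper falls back to the default.
        System.out.println(selectMode(new String[] {"--port", "9090"}));
    }
}
```

Resolving the mode before {{ImplType.setServerImpl}} runs would also let a value from hbase-site.xml take precedence over the hardcoded default, which is what HBASE-6166 (resolved below as a duplicate) asked for.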
[jira] [Updated] (HBASE-6263) Use default mode for HBase Thrift gateway if not specified
[ https://issues.apache.org/jira/browse/HBASE-6263?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-6263: -- Attachment: HBASE-6263-0.94.patch HBASE-6263.patch Use default mode for HBase Thrift gateway if not specified -- Key: HBASE-6263 URL: https://issues.apache.org/jira/browse/HBASE-6263 Project: HBase Issue Type: Bug Components: thrift Affects Versions: 0.94.0, 0.96.0 Reporter: Andrew Purtell Assignee: Andrew Purtell Priority: Minor Labels: noob Attachments: HBASE-6263-0.94.patch, HBASE-6263.patch The Thrift gateway should start with a default mode if one is not selected. Currently, instead we see: {noformat} Exception in thread main java.lang.AssertionError: Exactly one option out of [-hsha, -nonblocking, -threadpool, -threadedselector] has to be specified at org.apache.hadoop.hbase.thrift.ThriftServerRunner$ImplType.setServerImpl(ThriftServerRunner.java:201) at org.apache.hadoop.hbase.thrift.ThriftServer.processOptions(ThriftServer.java:169) at org.apache.hadoop.hbase.thrift.ThriftServer.doMain(ThriftServer.java:85) at org.apache.hadoop.hbase.thrift.ThriftServer.main(ThriftServer.java:192) {noformat} See also BIGTOP-648. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-6263) Use default mode for HBase Thrift gateway if not specified
[ https://issues.apache.org/jira/browse/HBASE-6263?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-6263: -- Affects Version/s: (was: 0.92.1) Status: Patch Available (was: Open) Use default mode for HBase Thrift gateway if not specified -- Key: HBASE-6263 URL: https://issues.apache.org/jira/browse/HBASE-6263 Project: HBase Issue Type: Bug Components: thrift Affects Versions: 0.94.0, 0.96.0 Reporter: Andrew Purtell Assignee: Andrew Purtell Priority: Minor Labels: noob Attachments: HBASE-6263-0.94.patch, HBASE-6263.patch The Thrift gateway should start with a default mode if one is not selected. Currently, instead we see: {noformat} Exception in thread main java.lang.AssertionError: Exactly one option out of [-hsha, -nonblocking, -threadpool, -threadedselector] has to be specified at org.apache.hadoop.hbase.thrift.ThriftServerRunner$ImplType.setServerImpl(ThriftServerRunner.java:201) at org.apache.hadoop.hbase.thrift.ThriftServer.processOptions(ThriftServer.java:169) at org.apache.hadoop.hbase.thrift.ThriftServer.doMain(ThriftServer.java:85) at org.apache.hadoop.hbase.thrift.ThriftServer.main(ThriftServer.java:192) {noformat} See also BIGTOP-648. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-6263) Use default mode for HBase Thrift gateway if not specified
[ https://issues.apache.org/jira/browse/HBASE-6263?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-6263: -- Attachment: (was: HBASE-6263-0.94.patch) Use default mode for HBase Thrift gateway if not specified -- Key: HBASE-6263 URL: https://issues.apache.org/jira/browse/HBASE-6263 Project: HBase Issue Type: Bug Components: thrift Affects Versions: 0.94.0, 0.96.0 Reporter: Andrew Purtell Assignee: Andrew Purtell Priority: Minor Labels: noob Attachments: HBASE-6263-0.94.patch, HBASE-6263.patch The Thrift gateway should start with a default mode if one is not selected. Currently, instead we see: {noformat} Exception in thread main java.lang.AssertionError: Exactly one option out of [-hsha, -nonblocking, -threadpool, -threadedselector] has to be specified at org.apache.hadoop.hbase.thrift.ThriftServerRunner$ImplType.setServerImpl(ThriftServerRunner.java:201) at org.apache.hadoop.hbase.thrift.ThriftServer.processOptions(ThriftServer.java:169) at org.apache.hadoop.hbase.thrift.ThriftServer.doMain(ThriftServer.java:85) at org.apache.hadoop.hbase.thrift.ThriftServer.main(ThriftServer.java:192) {noformat} See also BIGTOP-648. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-6263) Use default mode for HBase Thrift gateway if not specified
[ https://issues.apache.org/jira/browse/HBASE-6263?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-6263: -- Attachment: (was: HBASE-6263.patch) Use default mode for HBase Thrift gateway if not specified -- Key: HBASE-6263 URL: https://issues.apache.org/jira/browse/HBASE-6263 Project: HBase Issue Type: Bug Components: thrift Affects Versions: 0.94.0, 0.96.0 Reporter: Andrew Purtell Assignee: Andrew Purtell Priority: Minor Labels: noob Attachments: HBASE-6263-0.94.patch, HBASE-6263.patch The Thrift gateway should start with a default mode if one is not selected. Currently, instead we see: {noformat} Exception in thread main java.lang.AssertionError: Exactly one option out of [-hsha, -nonblocking, -threadpool, -threadedselector] has to be specified at org.apache.hadoop.hbase.thrift.ThriftServerRunner$ImplType.setServerImpl(ThriftServerRunner.java:201) at org.apache.hadoop.hbase.thrift.ThriftServer.processOptions(ThriftServer.java:169) at org.apache.hadoop.hbase.thrift.ThriftServer.doMain(ThriftServer.java:85) at org.apache.hadoop.hbase.thrift.ThriftServer.main(ThriftServer.java:192) {noformat} See also BIGTOP-648. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-6263) Use default mode for HBase Thrift gateway if not specified
[ https://issues.apache.org/jira/browse/HBASE-6263?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-6263: -- Attachment: HBASE-6263-0.94.patch HBASE-6263.patch Use default mode for HBase Thrift gateway if not specified -- Key: HBASE-6263 URL: https://issues.apache.org/jira/browse/HBASE-6263 Project: HBase Issue Type: Bug Components: thrift Affects Versions: 0.94.0, 0.96.0 Reporter: Andrew Purtell Assignee: Andrew Purtell Priority: Minor Labels: noob Attachments: HBASE-6263-0.94.patch, HBASE-6263.patch The Thrift gateway should start with a default mode if one is not selected. Currently, instead we see: {noformat} Exception in thread main java.lang.AssertionError: Exactly one option out of [-hsha, -nonblocking, -threadpool, -threadedselector] has to be specified at org.apache.hadoop.hbase.thrift.ThriftServerRunner$ImplType.setServerImpl(ThriftServerRunner.java:201) at org.apache.hadoop.hbase.thrift.ThriftServer.processOptions(ThriftServer.java:169) at org.apache.hadoop.hbase.thrift.ThriftServer.doMain(ThriftServer.java:85) at org.apache.hadoop.hbase.thrift.ThriftServer.main(ThriftServer.java:192) {noformat} See also BIGTOP-648. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HBASE-6275) Add conditional Hadoop properties assigment
Lars George created HBASE-6275: -- Summary: Add conditional Hadoop properties assigment Key: HBASE-6275 URL: https://issues.apache.org/jira/browse/HBASE-6275 Project: HBase Issue Type: Improvement Components: client, master, regionserver Affects Versions: 0.96.0 Reporter: Lars George Priority: Minor Fix For: 0.96.0 See https://issues.apache.org/jira/browse/HBASE-3639, we should use VersionInfo to put the proper one in, yet only one of them. Currently we always get this message when you start a daemon or the shell: {noformat}2012-06-25 16:13:44,819 WARN org.apache.hadoop.conf.Configuration: fs.default.name is deprecated. Instead, use fs.defaultFS{noformat} As well as this subsequently sporting the same issue: {noformat}2012-06-25 16:13:44,819 WARN org.apache.hadoop.conf.Configuration: mapred.task.id is deprecated. Instead, use mapreduce.task.attempt.id{noformat} And the shell does: {noformat}12/06/25 16:05:26 WARN conf.Configuration: hadoop.native.lib is deprecated. Instead, use io.native.lib.available{noformat} Talking to Stack he suggest: {quote}We should make a little function under util to do it because it will be reused in a bunch of places (in daemons, shell, out in scripts, etc).{quote} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
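The "little function under util" Stack suggests might look like the sketch below: pick the property name appropriate for the detected Hadoop version so the deprecation warning is never triggered. This is a hypothetical illustration — the real helper would consult {{org.apache.hadoop.util.VersionInfo.getVersion()}}, which is stubbed out here as a plain parameter to keep the sketch self-contained:

```java
public class ConditionalProperty {
    // Returns the filesystem property key to use for the given Hadoop version
    // string. Hadoop 2.x deprecated fs.default.name in favor of fs.defaultFS.
    public static String fsDefaultKey(String hadoopVersion) {
        return hadoopVersion.startsWith("2.") ? "fs.defaultFS" : "fs.default.name";
    }

    public static void main(String[] args) {
        System.out.println(fsDefaultKey("2.0.0")); // fs.defaultFS
        System.out.println(fsDefaultKey("1.0.3")); // fs.default.name
    }
}
```

Keeping the version check in one utility keeps the daemons, shell, and scripts from each reimplementing the conditional assignment.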
[jira] [Commented] (HBASE-6263) Use default mode for HBase Thrift gateway if not specified
[ https://issues.apache.org/jira/browse/HBASE-6263?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13401589#comment-13401589 ] Hadoop QA commented on HBASE-6263: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12533517/HBASE-6263-0.94.patch against trunk revision . +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. -1 patch. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/2263//console This message is automatically generated. Use default mode for HBase Thrift gateway if not specified -- Key: HBASE-6263 URL: https://issues.apache.org/jira/browse/HBASE-6263 Project: HBase Issue Type: Bug Components: thrift Affects Versions: 0.94.0, 0.96.0 Reporter: Andrew Purtell Assignee: Andrew Purtell Priority: Minor Labels: noob Attachments: HBASE-6263-0.94.patch, HBASE-6263.patch The Thrift gateway should start with a default mode if one is not selected. Currently, instead we see: {noformat} Exception in thread main java.lang.AssertionError: Exactly one option out of [-hsha, -nonblocking, -threadpool, -threadedselector] has to be specified at org.apache.hadoop.hbase.thrift.ThriftServerRunner$ImplType.setServerImpl(ThriftServerRunner.java:201) at org.apache.hadoop.hbase.thrift.ThriftServer.processOptions(ThriftServer.java:169) at org.apache.hadoop.hbase.thrift.ThriftServer.doMain(ThriftServer.java:85) at org.apache.hadoop.hbase.thrift.ThriftServer.main(ThriftServer.java:192) {noformat} See also BIGTOP-648. -- This message is automatically generated by JIRA. 
[jira] [Resolved] (HBASE-6166) Allow thrift to start with the server type specified in config
[ https://issues.apache.org/jira/browse/HBASE-6166?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Elliott Clark resolved HBASE-6166. -- Resolution: Duplicate Dupe HBASE-6263 Allow thrift to start with the server type specified in config - Key: HBASE-6166 URL: https://issues.apache.org/jira/browse/HBASE-6166 Project: HBase Issue Type: Improvement Reporter: Elliott Clark Currently the thrift server type must be specified on the command line. If it's already in config it shouldn't be needed. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-6170) Timeouts for row lock and scan should be separate
[ https://issues.apache.org/jira/browse/HBASE-6170?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Trezzo updated HBASE-6170: Attachment: HBASE-6170v1.patch Last run failed due to OOM exception. Resubmitting the patch to get another run. Chris Timeouts for row lock and scan should be separate - Key: HBASE-6170 URL: https://issues.apache.org/jira/browse/HBASE-6170 Project: HBase Issue Type: Improvement Components: regionserver Affects Versions: 0.94.0 Reporter: Otis Gospodnetic Assignee: Chris Trezzo Priority: Minor Fix For: 0.96.0 Attachments: HBASE-6170v1.patch, HBASE-6170v1.patch Apparently the timeout used for row locking and for scanning is global. It would be better to have two separate timeouts. (opening the issue to make Lars George happy) -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-6275) Add conditional Hadoop properties assignment
[ https://issues.apache.org/jira/browse/HBASE-6275?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lars George updated HBASE-6275: --- Summary: Add conditional Hadoop properties assignment (was: Add conditional Hadoop properties assigment) Add conditional Hadoop properties assignment Key: HBASE-6275 URL: https://issues.apache.org/jira/browse/HBASE-6275 Project: HBase Issue Type: Improvement Components: client, master, regionserver Affects Versions: 0.96.0 Reporter: Lars George Priority: Minor Fix For: 0.96.0 See https://issues.apache.org/jira/browse/HBASE-3639, we should use VersionInfo to put the proper one in, yet only one of them. Currently we always get this message when you start a daemon or the shell: {noformat}2012-06-25 16:13:44,819 WARN org.apache.hadoop.conf.Configuration: fs.default.name is deprecated. Instead, use fs.defaultFS{noformat} As well as this subsequently sporting the same issue: {noformat}2012-06-25 16:13:44,819 WARN org.apache.hadoop.conf.Configuration: mapred.task.id is deprecated. Instead, use mapreduce.task.attempt.id{noformat} And the shell does: {noformat}12/06/25 16:05:26 WARN conf.Configuration: hadoop.native.lib is deprecated. Instead, use io.native.lib.available{noformat} Talking to Stack he suggest: {quote}We should make a little function under util to do it because it will be reused in a bunch of places (in daemons, shell, out in scripts, etc).{quote} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5061) StoreFileLocalityChecker
[ https://issues.apache.org/jira/browse/HBASE-5061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13401612#comment-13401612 ] Hadoop QA commented on HBASE-5061: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12533510/HBASE-5061-0.94.patch against trunk revision . +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. +1 hadoop2.0. The patch compiles against the hadoop 2.0 profile. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. -1 findbugs. The patch appears to introduce 6 new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these unit tests: org.apache.hadoop.hbase.io.hfile.TestForceCacheImportantBlocks Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/2262//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2262//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2262//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/2262//console This message is automatically generated. 
StoreFileLocalityChecker Key: HBASE-5061 URL: https://issues.apache.org/jira/browse/HBASE-5061 Project: HBase Issue Type: New Feature Affects Versions: 0.96.0, 0.94.1 Reporter: Andrew Purtell Assignee: Andrew Purtell Priority: Minor Attachments: HBASE-5061-0.94.patch, HBASE-5061.patch org.apache.hadoop.hbase.HFileLocalityChecker [options] A tool to report the number of local and nonlocal HFile blocks, and the ratio of as a percentage. Where options are: |-f file|Analyze a store file| |-r region|Analyze all store files for the region| |-t table|Analyze all store files for regions of the table served by the local regionserver| |-h host|Consider host local, defaults to the local host| |-v|Verbose operation| -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6240) Race in HCM.getMaster stalls clients
[ https://issues.apache.org/jira/browse/HBASE-6240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13401614#comment-13401614 ] Hudson commented on HBASE-6240: --- Integrated in HBase-0.94 #282 (See [https://builds.apache.org/job/HBase-0.94/282/]) HBASE-6240 Race in HCM.getMaster stalls clients Submitted by:J-D, Ram Reviewed by:J-D, Ted (Revision 1354116) Result = FAILURE ramkrishna : Files : * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java Race in HCM.getMaster stalls clients Key: HBASE-6240 URL: https://issues.apache.org/jira/browse/HBASE-6240 Project: HBase Issue Type: Bug Affects Versions: 0.94.0 Reporter: Jean-Daniel Cryans Assignee: ramkrishna.s.vasudevan Priority: Critical Fix For: 0.94.1 Attachments: HBASE-6240.patch, HBASE-6240_1_0.94.patch I found this issue trying to run YCSB on 0.94, I don't think it exists on any other branch. I believe that this was introduced in HBASE-5058 Allow HBaseAdmin to use an existing connection. The issue is that in HCM.getMaster it does this recipe: # Check if the master is null and runs (if so, return) # Grab a lock on masterLock # nullify this.master # try to get a new master The issue happens at 3, it should re-run 1 since while you're waiting on the lock someone else could have already fixed it for you. What happens right now is that the threads are all able to set the master to null before others are able to get out of getMaster and it's a complete mess. Figuring it out took me some time because it doesn't manifest itself right away, silent retries are done in the background. 
Basically the first clue was this: {noformat} Error doing get: org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed after attempts=10, exceptions: Tue Jun 19 23:40:46 UTC 2012, org.apache.hadoop.hbase.client.HTable$3@571a4bd4, java.io.IOException: org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@2eb0a3f5 closed Tue Jun 19 23:40:47 UTC 2012, org.apache.hadoop.hbase.client.HTable$3@571a4bd4, java.io.IOException: org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@2eb0a3f5 closed Tue Jun 19 23:40:48 UTC 2012, org.apache.hadoop.hbase.client.HTable$3@571a4bd4, java.io.IOException: org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@2eb0a3f5 closed Tue Jun 19 23:40:49 UTC 2012, org.apache.hadoop.hbase.client.HTable$3@571a4bd4, java.io.IOException: org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@2eb0a3f5 closed Tue Jun 19 23:40:51 UTC 2012, org.apache.hadoop.hbase.client.HTable$3@571a4bd4, java.io.IOException: org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@2eb0a3f5 closed Tue Jun 19 23:40:53 UTC 2012, org.apache.hadoop.hbase.client.HTable$3@571a4bd4, java.io.IOException: org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@2eb0a3f5 closed Tue Jun 19 23:40:57 UTC 2012, org.apache.hadoop.hbase.client.HTable$3@571a4bd4, java.io.IOException: org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@2eb0a3f5 closed Tue Jun 19 23:41:01 UTC 2012, org.apache.hadoop.hbase.client.HTable$3@571a4bd4, java.io.IOException: org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@2eb0a3f5 closed Tue Jun 19 23:41:09 UTC 2012, org.apache.hadoop.hbase.client.HTable$3@571a4bd4, java.io.IOException: org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@2eb0a3f5 closed Tue Jun 19 23:41:25 UTC 2012, org.apache.hadoop.hbase.client.HTable$3@571a4bd4, java.io.IOException: 
org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@2eb0a3f5 closed {noformat} This was caused by the little dance up in HBaseAdmin where it deletes stale connections... which are not stale at all. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6170) Timeouts for row lock and scan should be separate
[ https://issues.apache.org/jira/browse/HBASE-6170?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13401638#comment-13401638 ] Hadoop QA commented on HBASE-6170: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12533519/HBASE-6170v1.patch against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 6 new or modified tests. +1 hadoop2.0. The patch compiles against the hadoop 2.0 profile. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. -1 findbugs. The patch appears to introduce 6 new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these unit tests: org.apache.hadoop.hbase.security.access.TestZKPermissionsWatcher org.apache.hadoop.hbase.client.TestAdmin Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/2264//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2264//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2264//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/2264//console This message is automatically generated. Timeouts for row lock and scan should be separate - Key: HBASE-6170 URL: https://issues.apache.org/jira/browse/HBASE-6170 Project: HBase Issue Type: Improvement Components: regionserver Affects Versions: 0.94.0 Reporter: Otis Gospodnetic Assignee: Chris Trezzo Priority: Minor Fix For: 0.96.0 Attachments: HBASE-6170v1.patch, HBASE-6170v1.patch Apparently the timeout used for row locking and for scanning is global. 
It would be better to have two separate timeouts. (opening the issue to make Lars George happy) -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-6200) KeyComparator.compareWithoutRow can be wrong when families have the same prefix
[ https://issues.apache.org/jira/browse/HBASE-6200?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhihong Ted Yu updated HBASE-6200: -- Status: Open (was: Patch Available) KeyComparator.compareWithoutRow can be wrong when families have the same prefix --- Key: HBASE-6200 URL: https://issues.apache.org/jira/browse/HBASE-6200 Project: HBase Issue Type: Bug Affects Versions: 0.94.0, 0.92.1, 0.90.6 Reporter: Jean-Daniel Cryans Assignee: Jieshan Bean Priority: Blocker Fix For: 0.90.7, 0.92.2, 0.96.0, 0.94.1 Attachments: 6200-trunk-v2.patch, 6200-trunk-v3.patch, 6200-trunk-v4.txt As reported by Desert Rose on IRC and on the ML, {{Result}} has a weird behavior when some families share the same prefix. He posted a link to his code to show how it fails, http://pastebin.com/7TBA1XGh Basically {{KeyComparator.compareWithoutRow}} doesn't differentiate families and qualifiers so f:a is said to be bigger than f1:, which is false. Then what happens is that the KVs are returned in the right order from the RS but then doing {{Result.binarySearch}} it uses {{KeyComparator.compareWithoutRow}} which has a different sorting so the end result is undetermined. I added some debug and I can see that the data is returned in the right order but {{Arrays.binarySearch}} returned the wrong KV, which is then verified agains the passed family and qualifier which fails so null is returned. I don't know how frequent it is for users to have families with the same prefix, but those that do have that and that use those families at the same time will have big correctness issues. This is why I mark this as a blocker. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
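The ordering bug above can be reproduced in miniature: if family and qualifier are compared as one concatenated run of bytes, f:a sorts after f1:, even though family "f" precedes family "f1". Comparing the family first restores the order. This uses String stand-ins for the byte[] arrays the real {{KeyComparator}} works on, purely for illustration:

```java
public class ColumnCompare {
    // Buggy shape: treats family+qualifier as one opaque byte run, so the
    // comparison of 'a' (0x61) against '1' (0x31) decides the order.
    static int buggyCompare(String fam1, String qual1, String fam2, String qual2) {
        return (fam1 + qual1).compareTo(fam2 + qual2);
    }

    // Fixed shape: order by family first, then by qualifier within a family.
    static int fixedCompare(String fam1, String qual1, String fam2, String qual2) {
        int cmp = fam1.compareTo(fam2);
        return cmp != 0 ? cmp : qual1.compareTo(qual2);
    }

    public static void main(String[] args) {
        System.out.println(buggyCompare("f", "a", "f1", "") > 0); // true: wrong order
        System.out.println(fixedCompare("f", "a", "f1", "") < 0); // true: f: before f1:
    }
}
```

This mismatch is exactly why {{Result.binarySearch}} lands on the wrong KV: the region server returns KVs in the correct order, but the binary search assumes the buggy order.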
[jira] [Commented] (HBASE-6226) move DataBlockEncoding and related classes to hbase-common module
[ https://issues.apache.org/jira/browse/HBASE-6226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13401665#comment-13401665 ] Zhihong Ted Yu commented on HBASE-6226: --- @Matt: Uploading patch onto https://reviews.apache.org is another option. move DataBlockEncoding and related classes to hbase-common module - Key: HBASE-6226 URL: https://issues.apache.org/jira/browse/HBASE-6226 Project: HBase Issue Type: Improvement Components: io, regionserver Affects Versions: 0.96.0 Reporter: Matt Corgan Assignee: Matt Corgan Attachments: HBASE-6226-v1.patch In order to isolate the implementation details of HBASE-4676 (PrefixTrie encoding) and other DataBlockEncoders by putting them in modules, this pulls up the DataBlockEncoding related interfaces into hbase-common. No tests are moved in this patch. The only notable change was trimming a few dependencies on HFileBlock which adds dependencies to much of the regionserver. The test suite passes locally for me. I tried to keep it as simple as possible... let me know if there are any concerns. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4145) Provide metrics for hbase client
[ https://issues.apache.org/jira/browse/HBASE-4145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13401683#comment-13401683 ] Jean-Daniel Cryans commented on HBASE-4145: --- I just stumbled upon this code, it seems there's an issue in {{TableRecordReaderImpl}}. Calling restart() does this: {code} public void restart(byte[] firstRow) throws IOException { currentScan = new Scan(scan); {code} Which by itself is fine since the metrics will be copied from *scan* to *currentScan*, except that it's *currentScan* that has the updated metrics not *scan*. In other words, *currentScan* is the object that is used for scanning so it contains the metrics. If restart() is called, that object is overwritten by the original definition of the {{Scan}}. I think to fix this we could grab the metrics from *currentScan* first then set them back on the new object. Provide metrics for hbase client Key: HBASE-4145 URL: https://issues.apache.org/jira/browse/HBASE-4145 Project: HBase Issue Type: Improvement Reporter: Ming Ma Assignee: Ming Ma Fix For: 0.94.0 Attachments: HBaseClientSideMetrics.jpg Sometimes it is useful to get some metrics from hbase client point of view. This will help understand the metrics for scan/TableInputFormat map job scenario. What to capture, for example, for each ResultScanner object, 1. The number of RPC calls to RSs. 2. The delta time between consecutive RPC calls in the current serialized scan implementation. 3. The number of RPC retry to RSs. 4. The number of NotServingRegionException got. 5. The number of remote RPC calls. This excludes those call that hbase client calls the RS on the same machine. 6. The number of regions accessed. How to capture 1. Metrics framework works for a fixed number of metrics. It doesn't fit this scenario. 2. Use some TBD solution in HBase to capture such dynamic metrics. 
If we assume there is a solution in HBase that the HBase client can use to log this kind of metrics, TableInputFormat can pass the MapReduce task ID to the HBase client as an application scan ID, a small addition to the existing scan API, and the HBase client can log metrics tagged with that ID. That would allow later querying and analysis of the metrics data for a specific MapReduce job. 3. Expose via MapReduce counters. This lacks certain features; for example, there is no good way to access the metrics on a per-map-instance basis, and the MapReduce framework only sums the counter values, so it is tricky to find the max of certain metrics across all mapper instances. However, it might be good enough for now. With this approach, the metrics values will be available via MapReduce counters. a) Have ResultScanner return a new ResultScannerMetrics interface. b) TableInputFormat will access data from ResultScannerMetrics and populate MapReduce counters accordingly.
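J-D's restart() observation above can be modeled without an HBase cluster. In the sketch below, ScanModel is a hypothetical stand-in for HBase's Scan (the names and the plain `metrics` field are invented for illustration; the real metrics plumbing differs). It only shows why rebuilding *currentScan* from the pristine template drops the accumulated metrics, and how the suggested fix, grabbing the metrics from *currentScan* first and setting them back on the new object, preserves them.

```java
// Hypothetical model of TableRecordReaderImpl's restart() metrics bug.
// ScanModel stands in for Scan; "metrics" stands in for the metrics that
// accumulate on the scan object actually used for scanning.
class RestartMetricsSketch {
    static class ScanModel {
        long metrics; // accumulated scan metrics

        ScanModel() {}

        // Copy constructor, like new Scan(scan): copies state from the source.
        ScanModel(ScanModel other) {
            this.metrics = other.metrics;
        }
    }

    // Buggy restart: rebuilding currentScan from the pristine template
    // overwrites the metrics accumulated on the active scan.
    static ScanModel restartBuggy(ScanModel template, ScanModel currentScan) {
        return new ScanModel(template);
    }

    // Suggested fix: save the metrics from currentScan first, then restore
    // them on the freshly built scan object.
    static ScanModel restartFixed(ScanModel template, ScanModel currentScan) {
        long saved = currentScan.metrics;
        ScanModel fresh = new ScanModel(template);
        fresh.metrics = saved;
        return fresh;
    }

    public static void main(String[] args) {
        ScanModel template = new ScanModel();
        ScanModel current = new ScanModel(template);
        current.metrics = 42; // pretend some scanning accumulated metrics

        System.out.println(restartBuggy(template, current).metrics); // 0: lost
        System.out.println(restartFixed(template, current).metrics); // 42: kept
    }
}
```

The copy constructor copies from the template in both cases; the only difference is whether the caller carries the live metrics forward across the rebuild.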
[jira] [Commented] (HBASE-6228) Fixup daughters twice cause daughter region assigned twice
[ https://issues.apache.org/jira/browse/HBASE-6228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13401690#comment-13401690 ] Jonathan Hsieh commented on HBASE-6228: --- I'm -0. (Not going to block if others are ok with it, but am just uncomfortable since there are no tests). I've only recently started spending time looking at the recovery/bugs/races, but my general impression is that I cannot easily tell if this patch (and several similar to it) is just pushing a race from one place to another. If we add tests, we can verify that this change doesn't re-introduce a previously solved problem. I haven't thought out the locking idea yet, but it seems that if we have state races, locks could simplify the reasoning and may eliminate classes of subtle bugs. Fixup daughters twice cause daughter region assigned twice --- Key: HBASE-6228 URL: https://issues.apache.org/jira/browse/HBASE-6228 Project: HBase Issue Type: Bug Components: master Reporter: chunhui shen Assignee: chunhui shen Fix For: 0.96.0 Attachments: HBASE-6228.patch, HBASE-6228v2.patch, HBASE-6228v2.patch First, how does fixing up daughters twice happen? 1. We fixupDaughters at the end of HMaster#finishInitialization. 2. ServerShutdownHandler will fixupDaughters when reassigning regions through ServerShutdownHandler#processDeadRegion. When fixing up daughters, we add the daughters to .META., but that couldn't prevent the above case, because of FindDaughterVisitor. The details are as follows: Suppose region A is a split parent region, and its daughter region B is missing. 1. First, the ServerShutdownHandler thread fixes up the daughter, so it adds daughter region B to .META. with serverName=null and assigns the daughter. 2. Then, the Master's initialization thread will also find that daughter region B is missing and assign it. This is because FindDaughterVisitor considers a daughter missing if its serverName=null.
[jira] [Updated] (HBASE-6200) KeyComparator.compareWithoutRow can be wrong when families have the same prefix
[ https://issues.apache.org/jira/browse/HBASE-6200?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhihong Ted Yu updated HBASE-6200: -- Attachment: 6200-0.94.txt Patch for 0.94 KeyComparator.compareWithoutRow can be wrong when families have the same prefix --- Key: HBASE-6200 URL: https://issues.apache.org/jira/browse/HBASE-6200 Project: HBase Issue Type: Bug Affects Versions: 0.90.6, 0.92.1, 0.94.0 Reporter: Jean-Daniel Cryans Assignee: Jieshan Bean Priority: Blocker Fix For: 0.90.7, 0.92.2, 0.96.0, 0.94.1 Attachments: 6200-0.92.txt, 6200-0.94.txt, 6200-trunk-v2.patch, 6200-trunk-v3.patch, 6200-trunk-v4.txt As reported by Desert Rose on IRC and on the ML, {{Result}} has a weird behavior when some families share the same prefix. He posted a link to his code to show how it fails, http://pastebin.com/7TBA1XGh Basically {{KeyComparator.compareWithoutRow}} doesn't differentiate families and qualifiers, so f:a is said to be bigger than f1:, which is false. Then what happens is that the KVs are returned in the right order from the RS, but {{Result.binarySearch}} uses {{KeyComparator.compareWithoutRow}}, which sorts differently, so the end result is undetermined. I added some debugging and I can see that the data is returned in the right order, but {{Arrays.binarySearch}} returned the wrong KV, which is then verified against the passed family and qualifier; that check fails, so null is returned. I don't know how frequent it is for users to have families with the same prefix, but those that do have that and use those families at the same time will have big correctness issues. This is why I mark this as a blocker.
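The f:a vs f1: mix-up can be reproduced with plain byte comparison. The sketch below is illustrative, not the actual KeyComparator code: it compares the family alone (the correct KeyValue order) against the family and qualifier concatenated with no length boundary, which is effectively what the reported bug amounts to.

```java
// Illustration of the family/qualifier ordering bug from HBASE-6200.
// compareBytes is a minimal lexicographic unsigned-byte comparison in the
// spirit of HBase's Bytes.compareTo (not the actual implementation).
class CompareWithoutRowSketch {
    static int compareBytes(byte[] a, byte[] b) {
        int len = Math.min(a.length, b.length);
        for (int i = 0; i < len; i++) {
            int d = (a[i] & 0xff) - (b[i] & 0xff);
            if (d != 0) return d;
        }
        return a.length - b.length; // on a common prefix, the shorter sorts first
    }

    public static void main(String[] args) {
        // Correct order compares the family first: "f" < "f1",
        // so f:a must sort before f1:.
        int correct = compareBytes("f".getBytes(), "f1".getBytes());
        System.out.println(correct < 0); // true

        // Without a family/qualifier boundary the comparison sees
        // "fa" vs "f1": 'a' (0x61) > '1' (0x31), so f:a wrongly sorts
        // after f1: -- the opposite of the correct order.
        int buggy = compareBytes("fa".getBytes(), "f1".getBytes());
        System.out.println(buggy > 0); // true
    }
}
```

Since the RS returns KVs in the correct order but {{Result.binarySearch}} compares with the buggy order, the two disagree exactly on pairs like this, which is why the lookup can land on the wrong KV.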
[jira] [Updated] (HBASE-6170) Timeouts for row lock and scan should be separate
[ https://issues.apache.org/jira/browse/HBASE-6170?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Trezzo updated HBASE-6170: Attachment: HBASE-6170v1.patch One more run. Chris Timeouts for row lock and scan should be separate - Key: HBASE-6170 URL: https://issues.apache.org/jira/browse/HBASE-6170 Project: HBase Issue Type: Improvement Components: regionserver Affects Versions: 0.94.0 Reporter: Otis Gospodnetic Assignee: Chris Trezzo Priority: Minor Fix For: 0.96.0 Attachments: HBASE-6170v1.patch, HBASE-6170v1.patch, HBASE-6170v1.patch Apparently the timeout used for row locking and for scanning is global. It would be better to have two separate timeouts. (opening the issue to make Lars George happy)
[jira] [Created] (HBASE-6276) TestClassLoading is racy
Andrew Purtell created HBASE-6276: - Summary: TestClassLoading is racy Key: HBASE-6276 URL: https://issues.apache.org/jira/browse/HBASE-6276 Project: HBase Issue Type: Bug Components: coprocessors, test Affects Versions: 0.92.2, 0.96.0, 0.94.1 Reporter: Andrew Purtell Assignee: Andrew Purtell
[jira] [Updated] (HBASE-6276) TestClassLoading is racy
[ https://issues.apache.org/jira/browse/HBASE-6276?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-6276: -- Attachment: HBASE-6276-0.94.patch HBASE-6276.patch TestClassLoading is racy Key: HBASE-6276 URL: https://issues.apache.org/jira/browse/HBASE-6276 Project: HBase Issue Type: Bug Components: coprocessors, test Affects Versions: 0.92.2, 0.96.0, 0.94.1 Reporter: Andrew Purtell Assignee: Andrew Purtell Attachments: HBASE-6276-0.94.patch, HBASE-6276.patch
[jira] [Resolved] (HBASE-6276) TestClassLoading is racy
[ https://issues.apache.org/jira/browse/HBASE-6276?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell resolved HBASE-6276. --- Resolution: Fixed Committed trivial patch to trunk, 0.94, and 0.92 branches. TestClassLoading passes locally on all. TestClassLoading is racy Key: HBASE-6276 URL: https://issues.apache.org/jira/browse/HBASE-6276 Project: HBase Issue Type: Bug Components: coprocessors, test Affects Versions: 0.92.2, 0.96.0, 0.94.1 Reporter: Andrew Purtell Assignee: Andrew Purtell Priority: Minor Attachments: HBASE-6276-0.94.patch, HBASE-6276.patch
[jira] [Updated] (HBASE-6276) TestClassLoading is racy
[ https://issues.apache.org/jira/browse/HBASE-6276?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-6276: -- Priority: Minor (was: Major) TestClassLoading is racy Key: HBASE-6276 URL: https://issues.apache.org/jira/browse/HBASE-6276 Project: HBase Issue Type: Bug Components: coprocessors, test Affects Versions: 0.92.2, 0.96.0, 0.94.1 Reporter: Andrew Purtell Assignee: Andrew Purtell Priority: Minor Attachments: HBASE-6276-0.94.patch, HBASE-6276.patch
[jira] [Commented] (HBASE-6170) Timeouts for row lock and scan should be separate
[ https://issues.apache.org/jira/browse/HBASE-6170?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13401721#comment-13401721 ] Zhihong Ted Yu commented on HBASE-6170: --- Minor comment: {code} + * The lease timeout period for client scans (milliseconds). + */ + private final int scannerLeaseTimeoutPeriod; {code} 'client scans' - 'client scanners' {code} -this.leases = new Leases((int) conf.getLong( +this.leases = new Leases(conf.getInt( {code} I would suggest changing the getInt() calls back to getLong(). Timeouts for row lock and scan should be separate - Key: HBASE-6170 URL: https://issues.apache.org/jira/browse/HBASE-6170 Project: HBase Issue Type: Improvement Components: regionserver Affects Versions: 0.94.0 Reporter: Otis Gospodnetic Assignee: Chris Trezzo Priority: Minor Fix For: 0.96.0 Attachments: HBASE-6170v1.patch, HBASE-6170v1.patch, HBASE-6170v1.patch Apparently the timeout used for row locking and for scanning is global. It would be better to have two separate timeouts. (opening the issue to make Lars George happy)
[jira] [Commented] (HBASE-6276) TestClassLoading is racy
[ https://issues.apache.org/jira/browse/HBASE-6276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13401732#comment-13401732 ] Zhihong Ted Yu commented on HBASE-6276: --- I ran TestClassLoading twice in trunk. I got the following failure twice: {code} testClassLoadingFromLocalFS(org.apache.hadoop.hbase.coprocessor.TestClassLoading) Time elapsed: 0.126 sec ERROR! org.apache.hadoop.hbase.TableExistsException: org.apache.hadoop.hbase.TableExistsException: TestClassLoading at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39) at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27) at java.lang.reflect.Constructor.newInstance(Constructor.java:513) at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:95) at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:79) at org.apache.hadoop.hbase.ipc.WritableRpcEngine$Invoker.invoke(WritableRpcEngine.java:165) at $Proxy21.createTable(Unknown Source) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$MasterHandler.invoke(HConnectionManager.java:1565) at org.apache.hadoop.hbase.client.$Proxy22.createTable(Unknown Source) at org.apache.hadoop.hbase.client.HBaseAdmin$2.call(HBaseAdmin.java:512) at org.apache.hadoop.hbase.client.HBaseAdmin$2.call(HBaseAdmin.java:508) at org.apache.hadoop.hbase.client.HBaseAdmin.execute(HBaseAdmin.java:1983) at org.apache.hadoop.hbase.client.HBaseAdmin.createTableAsync(HBaseAdmin.java:508) at 
org.apache.hadoop.hbase.client.HBaseAdmin.createTable(HBaseAdmin.java:411) at org.apache.hadoop.hbase.client.HBaseAdmin.createTable(HBaseAdmin.java:347) at org.apache.hadoop.hbase.coprocessor.TestClassLoading.testClassLoadingFromLocalFS(TestClassLoading.java:284) {code} TestClassLoading is racy Key: HBASE-6276 URL: https://issues.apache.org/jira/browse/HBASE-6276 Project: HBase Issue Type: Bug Components: coprocessors, test Affects Versions: 0.92.2, 0.96.0, 0.94.1 Reporter: Andrew Purtell Assignee: Andrew Purtell Priority: Minor Attachments: HBASE-6276-0.94.patch, HBASE-6276.patch
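The stack trace shows the test failing because a table named after the test class ("TestClassLoading") already exists. The committed patch is described only as trivial, so the sketch below is not the actual HBASE-6276 change; it is just one common pattern for removing this kind of TableExistsException race in tests: derive a unique table name per test invocation instead of reusing a shared name. The helper and naming scheme are hypothetical.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical helper: unique per-invocation table names, so a leftover
// table from a previous test or run cannot trigger TableExistsException
// on the next createTable() call.
class UniqueTableNameSketch {
    private static final AtomicInteger COUNTER = new AtomicInteger();

    // e.g. "TestClassLoading_testClassLoadingFromLocalFS_0"
    static String tableNameFor(String testClass, String testMethod) {
        return testClass + "_" + testMethod + "_" + COUNTER.getAndIncrement();
    }

    public static void main(String[] args) {
        String first = tableNameFor("TestClassLoading", "testClassLoadingFromLocalFS");
        String second = tableNameFor("TestClassLoading", "testClassLoadingFromLocalFS");
        // Repeated runs of the same test now target distinct tables.
        System.out.println(first.equals(second)); // false
    }
}
```

The alternative, dropping the table in teardown, still races if a previous JVM crashed before cleanup ran; unique names sidestep that entirely.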
[jira] [Created] (HBASE-6277) Metrics for scan object are overwritten when restart() is called
Zhihong Ted Yu created HBASE-6277: - Summary: Metrics for scan object are overwritten when restart() is called Key: HBASE-6277 URL: https://issues.apache.org/jira/browse/HBASE-6277 Project: HBase Issue Type: Bug Reporter: Zhihong Ted Yu From HBASE-4145: There's an issue in {{TableRecordReaderImpl}}. Calling restart() does this: {code} public void restart(byte[] firstRow) throws IOException { currentScan = new Scan(scan); {code} That by itself is fine, since the metrics will be copied from *scan* to *currentScan*, except that it's *currentScan* that has the updated metrics, not *scan*. In other words, *currentScan* is the object that is used for scanning, so it contains the metrics. If restart() is called, that object is overwritten by the original definition of the {{Scan}}. I think to fix this we could grab the metrics from *currentScan* first and then set them back on the new object.
[jira] [Commented] (HBASE-4145) Provide metrics for hbase client
[ https://issues.apache.org/jira/browse/HBASE-4145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13401734#comment-13401734 ] Zhihong Ted Yu commented on HBASE-4145: --- @J-D: HBASE-6277 has been created to address your finding. Provide metrics for hbase client Key: HBASE-4145 URL: https://issues.apache.org/jira/browse/HBASE-4145 Project: HBase Issue Type: Improvement Reporter: Ming Ma Assignee: Ming Ma Fix For: 0.94.0 Attachments: HBaseClientSideMetrics.jpg Sometimes it is useful to get some metrics from the HBase client's point of view. This will help understand the metrics for the scan/TableInputFormat map job scenario. What to capture, for example, for each ResultScanner object: 1. The number of RPC calls to RSs. 2. The delta time between consecutive RPC calls in the current serialized scan implementation. 3. The number of RPC retries to RSs. 4. The number of NotServingRegionExceptions received. 5. The number of remote RPC calls. This excludes calls where the HBase client calls the RS on the same machine. 6. The number of regions accessed. How to capture: 1. The metrics framework works for a fixed number of metrics. It doesn't fit this scenario. 2. Use some TBD solution in HBase to capture such dynamic metrics. If we assume there is a solution in HBase that the HBase client can use to log this kind of metrics, TableInputFormat can pass the MapReduce task ID to the HBase client as an application scan ID, a small addition to the existing scan API, and the HBase client can log metrics tagged with that ID. That would allow later querying and analysis of the metrics data for a specific MapReduce job. 3. Expose via MapReduce counters. This lacks certain features; for example, there is no good way to access the metrics on a per-map-instance basis, and the MapReduce framework only sums the counter values, so it is tricky to find the max of certain metrics across all mapper instances. However, it might be good enough for now. 
With this approach, the metrics values will be available via MapReduce counters. a) Have ResultScanner return a new ResultScannerMetrics interface. b) TableInputFormat will access data from ResultScannerMetrics and populate MapReduce counters accordingly.