[jira] [Commented] (HBASE-3892) Table can't disable
[ https://issues.apache.org/jira/browse/HBASE-3892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13045889#comment-13045889 ]

gaojinchao commented on HBASE-3892:

No, it still needs review and merge.

Table can't disable
-------------------
Key: HBASE-3892
URL: https://issues.apache.org/jira/browse/HBASE-3892
Project: HBase
Issue Type: Bug
Affects Versions: 0.90.3
Reporter: gaojinchao
Fix For: 0.90.4
Attachments: AssignmentManager_90v2.patch, AssignmentManager_90v3.patch, logs.rar

In TimeoutMonitor: if the ZK node exists and its state is RS_ZK_REGION_CLOSED, we should re-send the ZK message when the region close times out. In this case some messages may be lost.

I see. It seems like a bug. This is my analysis:

// The table is disabled and the master sends CLOSE to the region server; the region state is set to PENDING_CLOSE
2011-05-08 17:44:25,745 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Sent CLOSE to serverName=C4C4.site,60020,1304820199467, load=(requests=0, regions=123, usedHeap=4097, maxHeap=8175) for region ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66.

// The master receives the splitting message and clears the region state (PENDING_CLOSE); the identical REGION_SPLIT report then repeats about once a minute, from 17:44:45 through 17:52:46
2011-05-08 17:44:45,530 INFO org.apache.hadoop.hbase.master.ServerManager: Received REGION_SPLIT: ufdr,2011050812#8613817306227#0516,1304845660567.8e9a3b05abe1c3a692999cf5e8dfd9dd.: Daughters; ufdr,2011050812#8613817306227#0516,1304847764729.5e4bca85c33fa6605ffc9a5c2eb94e62., ufdr,2011050812#8613817398167#4032,1304847764729.4418fb197685a21f77e151e401cf8b66. from C4C4.site,60020,1304820199467
2011-05-08 17:46:45,303 WARN org.apache.hadoop.hbase.master.AssignmentManager: Overwriting 4418fb197685a21f77e151e401cf8b66 on serverName=C4C4.site,60020,1304820199467, load=(requests=0, regions=123, usedHeap=4097, maxHeap=8175)
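The failure mode in the log above can be sketched with a minimal, self-contained model (the names RegionState, onCloseTimeout, etc. are invented for illustration, not the real AssignmentManager/TimeoutMonitor API): the split report overwrites PENDING_CLOSE, and the proposed fix is for the timeout path to re-handle the close whenever the ZK node still reports RS_ZK_REGION_CLOSED.

```java
// Illustrative model only; class and method names are stand-ins, not the
// actual HBase master API.
import java.util.HashMap;
import java.util.Map;

public class TimeoutMonitorSketch {
    enum State { PENDING_CLOSE, SPLIT, CLOSED }
    enum ZkState { RS_ZK_REGION_CLOSED, NONE }

    static final Map<String, State> regionsInTransition = new HashMap<>();

    // Proposed fix: on close timeout, if the ZK node still exists in state
    // RS_ZK_REGION_CLOSED, handle the close again instead of dropping it.
    static void onCloseTimeout(String region, ZkState zkState) {
        if (zkState == ZkState.RS_ZK_REGION_CLOSED) {
            regionsInTransition.put(region, State.CLOSED);
        }
    }

    public static void main(String[] args) {
        String region = "4418fb197685a21f77e151e401cf8b66";
        regionsInTransition.put(region, State.PENDING_CLOSE);
        // The REGION_SPLIT report overwrites PENDING_CLOSE (the bug above).
        regionsInTransition.put(region, State.SPLIT);
        // With the re-send check, the close still completes on timeout.
        onCloseTimeout(region, ZkState.RS_ZK_REGION_CLOSED);
        System.out.println(regionsInTransition.get(region)); // CLOSED
    }
}
```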
[jira] [Created] (HBASE-3961) Add Delete.setWriteToWAL functionality
Add Delete.setWriteToWAL functionality
--------------------------------------
Key: HBASE-3961
URL: https://issues.apache.org/jira/browse/HBASE-3961
Project: HBase
Issue Type: Improvement
Components: client, regionserver
Reporter: Bruno Dumon

For puts, write to WAL can be disabled, but for deletes this functionality is missing. The regionserver internally already passes around a writeToWAL flag, but it is not possible to set this from the client. The attached patch introduces this. This changes the serialization format of Delete, so the version was bumped. I verified manually that the WAL is indeed not growing when writeToWAL is set to false.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
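The flag semantics the patch adds can be pictured with a dependency-free stand-in (the class below is not the real org.apache.hadoop.hbase.client.Delete, just a model of the behaviour it gains): writeToWAL defaults to true, and the client can flip it off, mirroring Put.setWriteToWAL.

```java
// Stand-in model of the patched Delete: default-durable, client-settable flag.
public class DeleteWalFlagSketch {
    static class Delete {
        private final byte[] row;
        private boolean writeToWAL = true; // durable by default

        Delete(byte[] row) { this.row = row; }

        // Mirrors Put.setWriteToWAL: false skips the write-ahead log.
        Delete setWriteToWAL(boolean write) {
            this.writeToWAL = write;
            return this;
        }

        boolean getWriteToWAL() { return writeToWAL; }
    }

    public static void main(String[] args) {
        Delete d = new Delete("row1".getBytes());
        System.out.println(d.getWriteToWAL()); // true: WAL used unless opted out
        d.setWriteToWAL(false);                // e.g. for bulk delete jobs
        System.out.println(d.getWriteToWAL());
    }
}
```

Skipping the WAL trades durability for write throughput, so it only makes sense for data that can be regenerated, exactly as with puts.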
[jira] [Updated] (HBASE-3961) Add Delete.setWriteToWAL functionality
[ https://issues.apache.org/jira/browse/HBASE-3961?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bruno Dumon updated HBASE-3961:

Attachment: delete-writetowal-patch.txt

Patch against trunk r1133369.
[jira] [Created] (HBASE-3962) HConnectionManager.getConnection(HBaseConfiguration) returns new connection in default HTable constructor
HConnectionManager.getConnection(HBaseConfiguration) returns new connection in default HTable constructor
---------------------------------------------------------------------------------------------------------
Key: HBASE-3962
URL: https://issues.apache.org/jira/browse/HBASE-3962
Project: HBase
Issue Type: Bug
Components: client
Affects Versions: 0.90.1
Reporter: Philippe

HBase connections are currently indexed by Configuration, which since HBASE-1976 has no equivalence other than object identity. So every time a new configuration instance is passed to the method, a new connection is created. If we create many HTable instances with the same configuration, there is no problem:

HBaseConfiguration config = HBaseConfiguration.create();
HTable table1 = new HTable(config, "table1"); // init connection
HTable table2 = new HTable(config, "table2"); // re-use connection
HTable table3 = new HTable(config, "table3"); // re-use connection

However, if we call the default constructor, or re-call HBaseConfiguration.create(), we pass a new instance of the configuration to the constructor. This causes many connections to be created:

HTable table1 = new HTable("table1"); // init connection
HTable table2 = new HTable("table2"); // init new connection
HTable table3 = new HTable("table3"); // init new connection

I know connections should be pooled, but sometimes we have to create a new connection without having access to a previously instantiated configuration object. Since ZooKeeper has a max client connection limit (default was 30, now is 10), after creating 30 instances of HTable we can no longer access the database.

In addition to this, the HBASE_INSTANCES map does not close the connection when removing the eldest entry. So if we have a larger maxConnection value than the hard-coded MAX_CACHED_HBASE_INSTANCES variable, connections will remain open but won't be closed. MAX_CACHED_HBASE_INSTANCES should actually be set from the hbase.zookeeper.property.maxClientCnxns parameter (value + 1).

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
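The cache-miss behaviour the report describes can be reproduced with a self-contained sketch (the "connection" here is a stand-in string, not the real HConnection; HBASE_INSTANCES is modelled as an identity-keyed map, matching the post-HBASE-1976 equivalence described above):

```java
// Self-contained sketch: connections cached by Configuration object identity,
// so a semantically identical but distinct config always misses the cache.
import java.util.IdentityHashMap;
import java.util.Map;

public class ConnectionCacheSketch {
    static final Map<Object, String> HBASE_INSTANCES = new IdentityHashMap<>();
    static int created = 0;

    static String getConnection(Object conf) {
        // Identity keying: only the exact same object re-uses a connection.
        return HBASE_INSTANCES.computeIfAbsent(conf, c -> "conn-" + (++created));
    }

    public static void main(String[] args) {
        Object config = new Object(); // one shared HBaseConfiguration.create()
        getConnection(config);
        getConnection(config);       // re-used
        System.out.println(created); // 1

        // Each fresh create() (or the default HTable constructor) is a new key.
        getConnection(new Object());
        getConnection(new Object());
        System.out.println(created); // 3
    }
}
```

This is why sharing one configuration object across all HTable instances keeps the client under ZooKeeper's maxClientCnxns limit.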
[jira] [Commented] (HBASE-3961) Add Delete.setWriteToWAL functionality
[ https://issues.apache.org/jira/browse/HBASE-3961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13045990#comment-13045990 ]

Andrew Purtell commented on HBASE-3961:

+1 Going to commit when local tests pass. Thanks for the patch Bruno!
[jira] [Assigned] (HBASE-3961) Add Delete.setWriteToWAL functionality
[ https://issues.apache.org/jira/browse/HBASE-3961?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrew Purtell reassigned HBASE-3961:

Assignee: Bruno Dumon
[jira] [Resolved] (HBASE-3961) Add Delete.setWriteToWAL functionality
[ https://issues.apache.org/jira/browse/HBASE-3961?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrew Purtell resolved HBASE-3961.

Resolution: Fixed
Fix Version/s: 0.92.0

Committed. Relevant local tests pass ok.
[jira] [Commented] (HBASE-3529) Add search to HBase
[ https://issues.apache.org/jira/browse/HBASE-3529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13046016#comment-13046016 ]

Alex Baranau commented on HBASE-3529:

Another problem we faced: it looks like there's an issue in the TestLuceneCoprocessor test life-cycle or something else:
* the testSearchRPC test fails if we run "mvn clean -Dtest=TestLuceneCoprocessor test"; the other two tests pass (it fails on the first assert: expected 20, but found 10)
* if I add @Ignore to the other two tests, i.e. the maven command runs only testSearchRPC, it works well

Add search to HBase
-------------------
Key: HBASE-3529
URL: https://issues.apache.org/jira/browse/HBASE-3529
Project: HBase
Issue Type: Improvement
Affects Versions: 0.90.0
Reporter: Jason Rutherglen
Attachments: HBASE-3529.patch

Using the Apache Lucene library we can add freetext search to HBase. The advantages of this are:
* HBase is highly scalable and distributed
* HBase is realtime
* Lucene is a fast inverted index and will soon be realtime (see LUCENE-2312)
* Lucene offers many types of queries not currently available in HBase (eg, AND, OR, NOT, phrase, etc)
* It's easier to build scalable realtime systems on top of an already architecturally sound, scalable realtime data system, eg, HBase.
* Scaling realtime search will be as simple as scaling HBase.

Phase 1 - Indexing:
* Integrate Lucene into HBase such that an index mirrors a given region. This means cascading adds, updates, and deletes between a Lucene index and an HBase region (and vice versa).
* Define meta-data to mark a region as indexed, and use a Solr schema to allow the user to define the fields and analyzers.
* Integrate with the HLog to ensure that index recovery can occur properly (eg, on region server failure)
* Mirror region splits with indexes (use Lucene's IndexSplitter?)
* When a region is written to HDFS, also write the corresponding Lucene index to HDFS.
* A row key will be the ID of a given Lucene document. The Lucene docstore will explicitly not be used because the document/row data is stored in HBase. We will need to solve what the best data structure is for efficiently mapping a docid to a row key. It could be a docstore, field cache, column stride fields, or some other mechanism.
* Write unit tests for the above

Phase 2 - Queries:
* Enable distributed Lucene queries
* Regions that have Lucene indexes are inherently available and may be searched on, meaning there's no need for a separate search-related system in Zookeeper.
* Integrate search with HBase's RPC mechanism

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-3529) Add search to HBase
[ https://issues.apache.org/jira/browse/HBASE-3529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13046023#comment-13046023 ]

Jason Rutherglen commented on HBASE-3529:

Hi Alex, I have new code I will commit to GitHub.
[jira] [Commented] (HBASE-3529) Add search to HBase
[ https://issues.apache.org/jira/browse/HBASE-3529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13046026#comment-13046026 ]

Alex Baranau commented on HBASE-3529:

Thank you! Berlin is waiting! (kidding, we are going to leave very soon)
[jira] [Created] (HBASE-3963) Schedule all log-splitting at startup all at once
Schedule all log-splitting at startup all at once
-------------------------------------------------
Key: HBASE-3963
URL: https://issues.apache.org/jira/browse/HBASE-3963
Project: HBase
Issue Type: Improvement
Reporter: Prakash Khemani
Assignee: Prakash Khemani

When distributed log splitting is enabled, it is better to call splitLog() for all region servers simultaneously. A large number of splitlog tasks will get scheduled, one for each log file. But a splitlog worker (region server) executes only one task at a time, so there shouldn't be a danger of DFS overload. Scheduling all the tasks at once ensures maximum parallelism.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
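The scheduling idea above can be sketched with a small, self-contained model (names are invented for the sketch; this is not the real SplitLogManager): every split task is enqueued up front, while each worker, standing in for a region server, still pulls only one task at a time, so parallelism is maximal but per-worker DFS load stays bounded.

```java
// Illustrative model of "schedule everything at once, work one at a time".
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class SplitLogScheduleSketch {
    public static void main(String[] args) throws Exception {
        List<String> logFiles = new ArrayList<>();
        for (String rs : new String[] {"rs1", "rs2", "rs3"})
            for (int i = 0; i < 4; i++) logFiles.add(rs + "/hlog-" + i);

        // All tasks scheduled simultaneously, one per log file.
        BlockingQueue<String> tasks = new LinkedBlockingQueue<>(logFiles);
        AtomicInteger done = new AtomicInteger();

        ExecutorService workers = Executors.newFixedThreadPool(3);
        for (int w = 0; w < 3; w++) {
            workers.submit(() -> {
                // A worker claims and splits one log at a time.
                while (tasks.poll() != null) done.incrementAndGet();
            });
        }
        workers.shutdown();
        workers.awaitTermination(10, TimeUnit.SECONDS);
        System.out.println(done.get()); // all 12 logs split
    }
}
```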
[jira] [Commented] (HBASE-1364) [performance] Distributed splitting of regionserver commit logs
[ https://issues.apache.org/jira/browse/HBASE-1364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13046044#comment-13046044 ]

Prakash Khemani commented on HBASE-1364:

Filed https://issues.apache.org/jira/browse/HBASE-3963. Will try to get this done.

[performance] Distributed splitting of regionserver commit logs
---------------------------------------------------------------
Key: HBASE-1364
URL: https://issues.apache.org/jira/browse/HBASE-1364
Project: HBase
Issue Type: Improvement
Components: coprocessors
Reporter: stack
Assignee: Prakash Khemani
Priority: Critical
Fix For: 0.92.0
Attachments: 1364-v5.txt, HBASE-1364.patch, org.apache.hadoop.hbase.master.TestDistributedLogSplitting-output.txt
Time Spent: 8h
Remaining Estimate: 0h

HBASE-1008 has some improvements to our log splitting on regionserver crash, but it needs to run even faster. (Below is from HBASE-1008.)

In the bigtable paper, the split is distributed. If we're going to have 1000 logs, we need to distribute or at least multithread the splitting.

1. As is, regions starting up expect to find one reconstruction log only. We need to make it so they pick up a bunch of edit logs, and it should be fine that logs are elsewhere in HDFS, in an output directory written by all split participants, whether multithreaded or a mapreduce-like distributed process (let's write our distributed sort first as MR so we learn what's involved; the distributed sort, as much as possible, should use MR framework pieces). On startup, regions go to this directory and pick up the files written by split participants, deleting and clearing the dir when all have been read in. Taking multiple logs as input can also make the split process more robust, rather than the current tenuous process which loses all edits if it doesn't make it to the end without error.

2. Each column family rereads the reconstruction log to find its edits. We need to fix that. Split can sort the edits by column family so a store only reads its own edits.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-1364) [performance] Distributed splitting of regionserver commit logs
[ https://issues.apache.org/jira/browse/HBASE-1364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13046048#comment-13046048 ]

stack commented on HBASE-1364:

@mingjian Since you are looking at the distributed code now, maybe you'd be up for having a go at HBASE-3963? Or at least posting a patch that you've tried for Prakash and/or I to review? Thanks.
[jira] [Updated] (HBASE-3946) The splitted region can be online again while the standby hmaster becomes the active one
[ https://issues.apache.org/jira/browse/HBASE-3946?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HBASE-3946:

Resolution: Fixed
Hadoop Flags: [Reviewed]
Status: Resolved (was: Patch Available)

Committed to branch and trunk. Thanks for the patch Jieshan.

The splitted region can be online again while the standby hmaster becomes the active one
----------------------------------------------------------------------------------------
Key: HBASE-3946
URL: https://issues.apache.org/jira/browse/HBASE-3946
Project: HBase
Issue Type: Bug
Affects Versions: 0.90.3
Reporter: Jieshan Bean
Assignee: Jieshan Bean
Fix For: 0.90.4
Attachments: HBASE-3946-V2.patch, HBASE-3946.patch

(The cluster has two HMasters, one active and one standby.)

1. When the active HMaster shuts down, the standby one becomes the active one and goes into the processFailover() method:

if (regionCount == 0) {
  LOG.info("Master startup proceeding: cluster startup");
  this.assignmentManager.cleanoutUnassigned();
  this.assignmentManager.assignAllUserRegions();
} else {
  LOG.info("Master startup proceeding: master failover");
  this.assignmentManager.processFailover();
}

2. After that, the user regions are rebuilt:

Map<HServerInfo, List<Pair<HRegionInfo, Result>>> deadServers = rebuildUserRegions();

3. Here's how rebuildUserRegions works. All the regions (including the split regions) are added to the offlineRegions of offlineServers:

for (Result result : results) {
  Pair<HRegionInfo, HServerInfo> region = MetaReader.metaRowToRegionPairWithInfo(result);
  if (region == null) continue;
  HServerInfo regionLocation = region.getSecond();
  HRegionInfo regionInfo = region.getFirst();
  if (regionLocation == null) {
    // Region not being served, add to region map with no assignment
    // If this needs to be assigned out, it will also be in ZK as RIT
    this.regions.put(regionInfo, null);
  } else if (!serverManager.isServerOnline(regionLocation.getServerName())) {
    // Region is located on a server that isn't online
    List<Pair<HRegionInfo, Result>> offlineRegions = offlineServers.get(regionLocation);
    if (offlineRegions == null) {
      offlineRegions = new ArrayList<Pair<HRegionInfo, Result>>(1);
      offlineServers.put(regionLocation, offlineRegions);
    }
    offlineRegions.add(new Pair<HRegionInfo, Result>(regionInfo, result));
  } else {
    // Region is being served and on an active server
    regions.put(regionInfo, regionLocation);
    addToServers(regionLocation, regionInfo);
  }
}

4. It seems that all the offline regions are added to RIT and come online again: ZKAssign creates a node for each offline region and never considers the split ones.

AssignmentManager#processDeadServers:

private void processDeadServers(
    Map<HServerInfo, List<Pair<HRegionInfo, Result>>> deadServers)
    throws IOException, KeeperException {
  for (Map.Entry<HServerInfo, List<Pair<HRegionInfo, Result>>> deadServer :
      deadServers.entrySet()) {
    List<Pair<HRegionInfo, Result>> regions = deadServer.getValue();
    for (Pair<HRegionInfo, Result> region : regions) {
      HRegionInfo regionInfo = region.getFirst();
      Result result = region.getSecond();
      // If region was in transition (was in zk) force it offline for reassign
      try {
        ZKAssign.createOrForceNodeOffline(watcher, regionInfo, master.getServerName());
      } catch (KeeperException.NoNodeException nne) {
        // This is fine
      }
      // Process with existing RS shutdown code
      ServerShutdownHandler.processDeadRegion(regionInfo, result, this, this.catalogTracker);
    }
  }
}

AssignmentManager#processFailover:

// Process list of dead servers
processDeadServers(deadServers);
// Check existing regions in transition
List<String> nodes = ZKUtil.listChildrenAndWatchForNewChildren(watcher, watcher.assignmentZNode);
if (nodes.isEmpty()) {
  LOG.info("No regions in transition in ZK to process on failover");
  return;
}
LOG.info("Failed-over master needs to process " + nodes.size() + " regions in transition");
for (String encodedRegionName : nodes) {
  processRegionInTransition(encodedRegionName, null);
}

So I think we should check each region before adding it into RIT.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
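The suggested check can be sketched with a self-contained model (the HRegionInfo below is a minimal stand-in, and regionsToReassign is an invented helper, not the real AssignmentManager code): before forcing a dead server's region offline in ZK for reassignment, skip any region already marked as split.

```java
// Sketch of the proposed pre-RIT filter: split parents are never reassigned.
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class DeadServerRegionFilterSketch {
    static class HRegionInfo {
        final String name;
        final boolean split;
        HRegionInfo(String name, boolean split) { this.name = name; this.split = split; }
        boolean isSplit() { return split; }
    }

    static List<String> regionsToReassign(List<HRegionInfo> deadServerRegions) {
        List<String> out = new ArrayList<>();
        for (HRegionInfo r : deadServerRegions) {
            // A split parent region must not be brought back online by the
            // failed-over master; only its daughters are live.
            if (r.isSplit()) continue;
            out.add(r.name);
        }
        return out;
    }

    public static void main(String[] args) {
        List<HRegionInfo> regions = Arrays.asList(
            new HRegionInfo("parent", true),     // already split
            new HRegionInfo("daughterA", false),
            new HRegionInfo("daughterB", false));
        System.out.println(regionsToReassign(regions)); // [daughterA, daughterB]
    }
}
```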
[jira] [Created] (HBASE-3964) Add maxResults per row per CF to Get and Scan.
Add maxResults per row per CF to Get and Scan.
----------------------------------------------
Key: HBASE-3964
URL: https://issues.apache.org/jira/browse/HBASE-3964
Project: HBase
Issue Type: New Feature
Components: client, regionserver
Reporter: Madhuwanti Vaidya
Assignee: Madhuwanti Vaidya
Priority: Minor

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HBASE-3965) Expose major and minor compaction queue status.
Expose major and minor compaction queue status.
-----------------------------------------------
Key: HBASE-3965
URL: https://issues.apache.org/jira/browse/HBASE-3965
Project: HBase
Issue Type: Improvement
Components: master, metrics
Affects Versions: 0.90.2
Reporter: Lohit Vijayarenu
Priority: Minor
Fix For: 0.92.0

It would be good to have metrics (or information) about the major and minor compaction queues exposed via the web UI (plus, if we can get it into metrics, the number of pending major/minor compactions).

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-3962) HConnectionManager.getConnection(HBaseConfiguration) returns new connection in default HTable constructor
[ https://issues.apache.org/jira/browse/HBASE-3962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13046138#comment-13046138 ] Ted Yu commented on HBASE-3962: --- In trunk, HConnectionManager.getConnection() constructs HConnectionKey from Configuration. Meaning, the identity of Configuration has been redefined. Also, connection is closed in finalizer. HConnectionManager.getConnection(HBaseConfiguration) returns new connection in default HTable constructor - Key: HBASE-3962 URL: https://issues.apache.org/jira/browse/HBASE-3962 Project: HBase Issue Type: Bug Components: client Affects Versions: 0.90.1 Reporter: Philippe The HBase instance are currently indexed by Configuration, which since HBASE-1976 does not have any other equivalence that the object equivalence. So, everytime a new configuration is passed to the method a new connection is created. If we create many HTable connections with the same configuration, there is no problem: HBaseConfiguration config = HBaseConfiguration.create(); HTable table 1 = new HTable(config, table1); // init connection HTable table 2 = new HTable(config, table2); // re-use connection HTable table 3 = new HTable(config, table3); // re-use connection However, if we call the default constructor, or re-call HBaseConfiguration.create();, we will pass a new instance of the configuration to the constructor. This will cause many connections to be created: HTable table 1 = new HTable(table1); // init connection HTable table 2 = new HTable(table2); // init new connection HTable table 3 = new HTable(table3); // init new connection I know connection should be pooled, but sometimes we have to create a new connection, and without having access to a previously instanced configuration object. Since zookeeper has a max client connection (default was 30, now is 10), after creating 30 instances of HTable, we can no longer access to the database. 
In addition, the HBASE_INSTANCES map does not close a connection when removing the eldest entry. So if more connections are created than the hard-coded MAX_CACHED_HBASE_INSTANCES value allows, the evicted connections remain open but are never closed. MAX_CACHED_HBASE_INSTANCES should actually be derived from the hbase.zookeeper.property.maxClientCnxns parameter (its value + 1). -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
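The eviction leak described above can be sketched in isolation. This is illustrative code, not HBase's actual HBASE_INSTANCES implementation: the CachedConnection type is a made-up stand-in for the real connection class, and removeEldestEntry closes the evicted entry before dropping it, which is exactly the behavior the report says is missing.

```java
import java.io.Closeable;
import java.util.LinkedHashMap;
import java.util.Map;

public class ClosingLruCache {
    // Hypothetical stand-in for the real HBase connection type; only tracks closed state.
    public static class CachedConnection implements Closeable {
        public boolean closed = false;
        @Override public void close() { closed = true; }
    }

    // LRU map capped at maxEntries; the eldest entry is closed on eviction
    // instead of being silently dropped (and thus leaked until finalization).
    public static <K> Map<K, CachedConnection> newCache(final int maxEntries) {
        // accessOrder=true makes the eldest entry the least recently used one
        return new LinkedHashMap<K, CachedConnection>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, CachedConnection> eldest) {
                if (size() > maxEntries) {
                    eldest.getValue().close();
                    return true;
                }
                return false;
            }
        };
    }

    public static void main(String[] args) {
        Map<String, CachedConnection> cache = ClosingLruCache.<String>newCache(2);
        CachedConnection first = new CachedConnection();
        cache.put("a", first);
        cache.put("b", new CachedConnection());
        cache.put("c", new CachedConnection()); // capacity exceeded: "a" is evicted
        System.out.println(first.closed);       // true: evicted entry was closed
    }
}
```

Without the close() call inside removeEldestEntry, the evicted connection (and its ZooKeeper session) survives until garbage collection runs its finalizer, if ever.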
[jira] [Created] (HBASE-3966) troubleshooting.xml - added section for web UI for master regionserver
troubleshooting.xml - added section for web UI for master regionserver Key: HBASE-3966 URL: https://issues.apache.org/jira/browse/HBASE-3966 Project: HBase Issue Type: Improvement Reporter: Doug Meil Assignee: Doug Meil Priority: Minor Several folks on the dist-list didn't know about the hbase web-interfaces. Added a sub-section in Troubleshooting\tools for this (builtin tools). Moved existing tools into external tools sub-section. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-3966) troubleshooting.xml - added section for web UI for master regionserver
[ https://issues.apache.org/jira/browse/HBASE-3966?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Doug Meil updated HBASE-3966: - Attachment: troubleshooting_HBASE_3966.xml.patch troubleshooting.xml - added section for web UI for master regionserver Key: HBASE-3966 URL: https://issues.apache.org/jira/browse/HBASE-3966 Project: HBase Issue Type: Improvement Reporter: Doug Meil Assignee: Doug Meil Priority: Minor Attachments: troubleshooting_HBASE_3966.xml.patch Several folks on the dist-list didn't know about the hbase web-interfaces. Added a sub-section in Troubleshooting\tools for this (builtin tools). Moved existing tools into external tools sub-section. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-3966) troubleshooting.xml - added section for web UI for master regionserver
[ https://issues.apache.org/jira/browse/HBASE-3966?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Doug Meil updated HBASE-3966: - Status: Patch Available (was: Open) troubleshooting.xml - added section for web UI for master regionserver Key: HBASE-3966 URL: https://issues.apache.org/jira/browse/HBASE-3966 Project: HBase Issue Type: Improvement Reporter: Doug Meil Assignee: Doug Meil Priority: Minor Attachments: troubleshooting_HBASE_3966.xml.patch Several folks on the dist-list didn't know about the hbase web-interfaces. Added a sub-section in Troubleshooting\tools for this (builtin tools). Moved existing tools into external tools sub-section. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-2842) Support BloomFilter error rate on a per-family basis
[ https://issues.apache.org/jira/browse/HBASE-2842?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ming Ma updated HBASE-2842: --- Priority: Minor (was: Trivial) Support BloomFilter error rate on a per-family basis Key: HBASE-2842 URL: https://issues.apache.org/jira/browse/HBASE-2842 Project: HBase Issue Type: Improvement Components: filters, ipc, regionserver, rest, thrift Reporter: Nicolas Spiegelberg Assignee: Ming Ma Priority: Minor The error rate for bloom filters is currently set by the io.hfile.bloom.error.rate global variable. Todd suggested at the last HUG that it would be nice to have per-family config options instead. Trace the Bloom Type code to implement this. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
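For background on why a per-family error rate is worth exposing, the standard Bloom filter sizing formulas (general math, not HBase code) show the direct trade-off between the configured error rate and memory: bits per key = -ln(p) / (ln 2)^2, so every halving of p costs roughly 1.44 extra bits per key.

```java
public class BloomSizing {
    // Bits of filter per stored key for a target false-positive rate p
    // (standard Bloom filter sizing formula: m/n = -ln(p) / (ln 2)^2).
    public static double bitsPerKey(double p) {
        double ln2 = Math.log(2);
        return -Math.log(p) / (ln2 * ln2);
    }

    // Optimal number of hash functions for that sizing: k = (m/n) * ln 2.
    public static int numHashes(double p) {
        return (int) Math.ceil(bitsPerKey(p) * Math.log(2));
    }

    public static void main(String[] args) {
        // tightening p is a real memory cost, which is why one global
        // io.hfile.bloom.error.rate is a blunt instrument
        for (double p : new double[] {0.1, 0.01, 0.001}) {
            System.out.printf("p=%.3f -> %.2f bits/key, %d hashes%n",
                p, bitsPerKey(p), numHashes(p));
        }
    }
}
```

A family serving point reads on mostly-missing keys may justify a tight rate like 0.001, while a family that is only ever scanned gains little from a Bloom filter at all; that is the kind of per-family decision a single global setting cannot express.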
[jira] [Commented] (HBASE-3529) Add search to HBase
[ https://issues.apache.org/jira/browse/HBASE-3529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13046258#comment-13046258 ] Otis Gospodnetic commented on HBASE-3529: - A few more comments/questions for Jason: * I see PKIndexSplitter usage for splitting the index when a region splits. I see you split the index and open 2 IndexWriters for 2 new Lucene indices, but then either you are not adding documents to them, or I'm not seeing it? * Are there issues around distributed search? It looks like it wasn't in your github branch. * What will happen when a region changes its location/regionserver for whatever reason? I see HDFS-2004 got -1ed and you said without that search will be slow. Do you have an alternative plan? * What is the reason for storing those 2 extra row fields? (the UID one and the other one... I think it's called rowStr or something like that) * What about storing the index in HBase itself? (a la Solandra, I suppose) Would this be doable? Would it make things simpler in the sense that any splitting or moving around, etc. may be handled by HBase, and we wouldn't have to make sure the Lucene index always mirrors what's in a region and make sure it follows the region wherever it goes? That last one is Lars' idea/question, and I hope I didn't misunderstand or misrepresent his ideas. Add search to HBase --- Key: HBASE-3529 URL: https://issues.apache.org/jira/browse/HBASE-3529 Project: HBase Issue Type: Improvement Affects Versions: 0.90.0 Reporter: Jason Rutherglen Attachments: HBASE-3529.patch Using the Apache Lucene library we can add freetext search to HBase.
The advantages of this are: * HBase is highly scalable and distributed * HBase is realtime * Lucene is a fast inverted index and will soon be realtime (see LUCENE-2312) * Lucene offers many types of queries not currently available in HBase (eg, AND, OR, NOT, phrase, etc) * It's easier to build scalable realtime systems on top of an already architecturally sound, scalable realtime data system, eg, HBase. * Scaling realtime search will be as simple as scaling HBase. Phase 1 - Indexing: * Integrate Lucene into HBase such that an index mirrors a given region. This means cascading adds, updates, and deletes between a Lucene index and an HBase region (and vice versa). * Define meta-data to mark a region as indexed, and use a Solr schema to allow the user to define the fields and analyzers. * Integrate with the HLog to ensure that index recovery can occur properly (eg, on region server failure) * Mirror region splits with indexes (use Lucene's IndexSplitter?) * When a region is written to HDFS, also write the corresponding Lucene index to HDFS. * A row key will be the ID of a given Lucene document. The Lucene docstore will explicitly not be used because the document/row data is stored in HBase. We will need to solve what the best data structure for efficiently mapping a docid -> row key is. It could be a docstore, field cache, column stride fields, or some other mechanism. * Write unit tests for the above Phase 2 - Queries: * Enable distributed Lucene queries * Regions that have Lucene indexes are inherently available and may be searched on, meaning there's no need for a separate search-related system in Zookeeper. * Integrate search with HBase's RPC mechanism -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-3529) Add search to HBase
[ https://issues.apache.org/jira/browse/HBASE-3529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13046267#comment-13046267 ] Jason Rutherglen commented on HBASE-3529: - Otis, I think many of your questions have been addressed in this issue, though indeed the comment trail is long at this point. bq. Do you have an alternative plan? https://issues.apache.org/jira/browse/HBASE-3529?focusedCommentId=13040465&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13040465 bq. Are there issues around distributed search? It looks like it wasn't in your github branch https://issues.apache.org/jira/browse/HBASE-3529?focusedCommentId=13042913&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13042913 bq. What about storing the index in HBase itself? I think that's a great idea to test, though in a different Jira issue. bq. PKIndexSplitter That's LUCENE-2919. Given it's not been committed, I may need to bring it over into the HBase search source tree. Add search to HBase --- Key: HBASE-3529 URL: https://issues.apache.org/jira/browse/HBASE-3529 Project: HBase Issue Type: Improvement Affects Versions: 0.90.0 Reporter: Jason Rutherglen Attachments: HBASE-3529.patch Using the Apache Lucene library we can add freetext search to HBase. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-3529) Add search to HBase
[ https://issues.apache.org/jira/browse/HBASE-3529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13046274#comment-13046274 ] Otis Gospodnetic commented on HBASE-3529: - Re https://issues.apache.org/jira/browse/HBASE-3529?focusedCommentId=13042913&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13042913 Does that mean that in order to implement distributed search you'll immediately convert this to HBase+Solr instead of HBase+Lucene, so that you don't have to do Lucene-level distributed search? If so, what about the NRTness that will be lost until Solr gets NRT search? Add search to HBase --- Key: HBASE-3529 URL: https://issues.apache.org/jira/browse/HBASE-3529 Project: HBase Issue Type: Improvement Affects Versions: 0.90.0 Reporter: Jason Rutherglen Attachments: HBASE-3529.patch Using the Apache Lucene library we can add freetext search to HBase. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HBASE-3967) Add support to HFileOutputFormat based bulk imports to add Delete mutations
Add support to HFileOutputFormat based bulk imports to add Delete mutations --- Key: HBASE-3967 URL: https://issues.apache.org/jira/browse/HBASE-3967 Project: HBase Issue Type: Improvement Reporter: Kannan Muthukkaruppan During bulk imports, it'll be useful to be able to do delete mutations (either to delete data that already exists in HBase or was inserted earlier during this run of the import). For example, we have a use case, where we are processing a log of data which may have both inserts and deletes in the mix and we want to upload that into HBase using the bulk import mechanism. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-3967) Support deletes in HFileOutputFormat based bulk import mechanism
[ https://issues.apache.org/jira/browse/HBASE-3967?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kannan Muthukkaruppan updated HBASE-3967: - Summary: Support deletes in HFileOutputFormat based bulk import mechanism (was: Add support to HFileOutputFormat based bulk imports to add Delete mutations) Support deletes in HFileOutputFormat based bulk import mechanism Key: HBASE-3967 URL: https://issues.apache.org/jira/browse/HBASE-3967 Project: HBase Issue Type: Improvement Reporter: Kannan Muthukkaruppan During bulk imports, it'll be useful to be able to do delete mutations (either to delete data that already exists in HBase or was inserted earlier during this run of the import). For example, we have a use case, where we are processing a log of data which may have both inserts and deletes in the mix and we want to upload that into HBase using the bulk import mechanism. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HBASE-3968) HLog Pretty Printer
HLog Pretty Printer --- Key: HBASE-3968 URL: https://issues.apache.org/jira/browse/HBASE-3968 Project: HBase Issue Type: New Feature Components: io, regionserver, util Reporter: Nicolas Spiegelberg Assignee: Nicolas Spiegelberg Priority: Minor We currently have a rudimentary way to print HLog data, but it is limited and currently prints key-only information. We need extend this functionality, similar to how we developed HFile's pretty printer. Ideas for functionality: - filter by sequence_id - filter by row / region - option to print values in addition to key info - option to print output in JSON format (so scripts can easily parse for analysis) -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
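The feature list above amounts to plain filtering plus JSON rendering, which can be sketched as follows. The Entry type here is hypothetical and merely stands in for one HLog edit (roughly an HLogKey plus a value); none of this is the actual pretty-printer code.

```java
import java.util.ArrayList;
import java.util.List;

public class WalPrinterSketch {
    // Hypothetical stand-in for a single HLog edit (HLogKey + one value).
    public static final class Entry {
        public final long seq;
        public final String region, row, value;
        public Entry(long seq, String region, String row, String value) {
            this.seq = seq; this.region = region; this.row = row; this.value = value;
        }
    }

    // Filter by region and/or row; null means "no filter", mirroring the
    // proposed filter-by-row / filter-by-region options.
    public static List<Entry> filter(List<Entry> entries, String region, String row) {
        List<Entry> out = new ArrayList<>();
        for (Entry e : entries) {
            if (region != null && !e.region.equals(region)) continue;
            if (row != null && !e.row.equals(row)) continue;
            out.add(e);
        }
        return out;
    }

    // JSON rendering so scripts can parse the dump; values printed only on request.
    public static String toJson(Entry e, boolean printValues) {
        StringBuilder sb = new StringBuilder();
        sb.append("{\"seq\":").append(e.seq)
          .append(",\"region\":\"").append(e.region)
          .append("\",\"row\":\"").append(e.row).append('"');
        if (printValues) sb.append(",\"value\":\"").append(e.value).append('"');
        return sb.append('}').toString();
    }
}
```

Making values opt-in keeps the default output small, and one JSON object per edit means downstream scripts can stream-parse a dump line by line.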
[jira] [Commented] (HBASE-1364) [performance] Distributed splitting of regionserver commit logs
[ https://issues.apache.org/jira/browse/HBASE-1364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13046299#comment-13046299 ] mingjian commented on HBASE-1364: - @stack Prakash I will attach a patch in HBASE-3963. [performance] Distributed splitting of regionserver commit logs --- Key: HBASE-1364 URL: https://issues.apache.org/jira/browse/HBASE-1364 Project: HBase Issue Type: Improvement Components: coprocessors Reporter: stack Assignee: Prakash Khemani Priority: Critical Fix For: 0.92.0 Attachments: 1364-v5.txt, HBASE-1364.patch, org.apache.hadoop.hbase.master.TestDistributedLogSplitting-output.txt Time Spent: 8h Remaining Estimate: 0h HBASE-1008 has some improvements to our log splitting on regionserver crash, but it needs to run even faster. (Below is from HBASE-1008) In the Bigtable paper, the split is distributed. If we're going to have 1000 logs, we need to distribute or at least multithread the splitting. 1. As is, regions starting up expect to find one reconstruction log only. Need to make it so they pick up a bunch of edit logs, and it should be fine that logs are elsewhere in hdfs in an output directory written by all split participants, whether multithreaded or a mapreduce-like distributed process (Let's write our distributed sort first as an MR job so we learn what's involved; the distributed sort, as much as possible, should use MR framework pieces). On startup, regions go to this directory and pick up the files written by split participants, deleting and clearing the dir when all have been read in. Making it so we can take multiple logs for input can also make the split process more robust, rather than the current tenuous process which loses all edits if it doesn't make it to the end without error. 2. Each column family rereads the reconstruction log to find its edits. Need to fix that. Split can sort the edits by column family so each store only reads its edits. -- This message is automatically generated by JIRA. 
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-3723) Major compact should be done when there is only one storefile and some keyvalue is outdated.
[ https://issues.apache.org/jira/browse/HBASE-3723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13046320#comment-13046320 ] stack commented on HBASE-3723: -- Committed to branch. Major compact should be done when there is only one storefile and some keyvalue is outdated. Key: HBASE-3723 URL: https://issues.apache.org/jira/browse/HBASE-3723 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.90.0, 0.90.1 Reporter: zhoushuaifeng Fix For: 0.90.2 Attachments: hbase-3723.txt In the function Store.isMajorCompaction:

if (filesToCompact.size() == 1) {
  // Single file
  StoreFile sf = filesToCompact.get(0);
  long oldest = (sf.getReader().timeRangeTracker == null) ?
      Long.MIN_VALUE :
      now - sf.getReader().timeRangeTracker.minimumTimestamp;
  if (sf.isMajorCompaction() &&
      (this.ttl == HConstants.FOREVER || oldest < this.ttl)) {
    if (LOG.isDebugEnabled()) {
      LOG.debug("Skipping major compaction of " + this.storeNameStr +
        " because one (major) compacted file only and oldestTime " +
        oldest + "ms is < ttl=" + this.ttl);
    }
  }
} else {

When there is only one storefile in the store and some keyvalues' TTLs have expired, the major compaction checker should send this region to the compaction queue and run a major compaction to clean out the outdated data. But according to the code in 0.90.1, it will do nothing. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
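Restated as a standalone predicate, the single-file branch should skip the major compaction only when the file is already the product of one and nothing has expired. The sketch below is illustrative, not the actual hbase-3723.txt patch; the method name and parameters are made up for the example.

```java
public class MajorCompactionCheck {
    // Decide whether a store with exactly one file still deserves a major
    // compaction. Skipping is only safe when the file already resulted from a
    // major compaction AND no cell has outlived the TTL; otherwise expired
    // cells linger on disk because nothing else will trigger a rewrite.
    public static boolean shouldCompactSingleFile(boolean alreadyMajorCompacted,
                                                  long oldestEditAgeMs,
                                                  long ttlMs,
                                                  long foreverTtl) {
        boolean nothingExpired = ttlMs == foreverTtl || oldestEditAgeMs < ttlMs;
        return !(alreadyMajorCompacted && nothingExpired);
    }

    public static void main(String[] args) {
        long FOREVER = Long.MAX_VALUE; // stand-in for HConstants.FOREVER
        // already major-compacted, TTL infinite: skip (prints false)
        System.out.println(shouldCompactSingleFile(true, 100, FOREVER, FOREVER));
        // already major-compacted, but oldest edit outlived a 1s TTL: compact (prints true)
        System.out.println(shouldCompactSingleFile(true, 2000, 1000, FOREVER));
    }
}
```

The quoted 0.90.1 code computes the skip condition correctly but then does nothing in the expired case, which is the bug: the negation of the skip condition never feeds a compaction request.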
[jira] [Updated] (HBASE-3892) Table can't disable
[ https://issues.apache.org/jira/browse/HBASE-3892?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] gaojinchao updated HBASE-3892: -- Attachment: AssignmentManager_90v4.patch Table can't disable --- Key: HBASE-3892 URL: https://issues.apache.org/jira/browse/HBASE-3892 Project: HBase Issue Type: Bug Affects Versions: 0.90.3 Reporter: gaojinchao Fix For: 0.90.4 Attachments: AssignmentManager_90v3.patch, AssignmentManager_90v4.patch, logs.rar In TimeoutMonitor: if the node exists and its state is RS_ZK_REGION_CLOSED, we should send the ZK message again when the region close times out; otherwise some messages may be lost. It seems like a bug: the master sent CLOSE to the region server and set the region state to PENDING_CLOSE, then a received REGION_SPLIT message cleared that PENDING_CLOSE state (the full log analysis is quoted in the Commented notification above).