[jira] [Commented] (HBASE-5347) GC free memory management in Level-1 Block Cache
[ https://issues.apache.org/jira/browse/HBASE-5347?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13211290#comment-13211290 ]

Zhihong Yu commented on HBASE-5347:
-----------------------------------

In filterRow(), we should deref the KeyValue which is removed from the list of KeyValues passed as parameter. In the 0.89-fb branch, I found DependentColumnFilter, where the above change should be added.

GC free memory management in Level-1 Block Cache
------------------------------------------------

                 Key: HBASE-5347
                 URL: https://issues.apache.org/jira/browse/HBASE-5347
             Project: HBase
          Issue Type: Improvement
            Reporter: Prakash Khemani
            Assignee: Prakash Khemani
         Attachments: D1635.5.patch

On eviction of a block from the block cache, instead of waiting for the garbage collector to reuse its memory, reuse the block right away. This will require us to keep reference counts on the HFile blocks. Once we have the reference counts in place we can do our own simple blocks-out-of-slab allocation for the block cache. This will help us with:
* reducing GC pressure, especially in the old generation
* making it possible to have non-Java-heap memory backing the HFile blocks

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
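The description above can be sketched in a few lines. This is a minimal, hypothetical illustration of the idea (reference counts plus a free list standing in for the slab allocator), not the actual D1635 patch; the class and method names are invented:

```java
import java.nio.ByteBuffer;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.atomic.AtomicInteger;

/** Sketch: a reference-counted cache block whose backing buffer goes back to a
    shared free list on the last release, instead of waiting for the GC. */
final class RefCountedBlock {
    private final ByteBuffer buf;
    private final AtomicInteger refCount = new AtomicInteger(1); // creator holds one ref
    private final ConcurrentLinkedQueue<ByteBuffer> freeList;

    RefCountedBlock(ByteBuffer buf, ConcurrentLinkedQueue<ByteBuffer> freeList) {
        this.buf = buf;
        this.freeList = freeList;
    }

    void retain() {
        // A real implementation must not revive an already-recycled block.
        if (refCount.getAndIncrement() <= 0) {
            throw new IllegalStateException("block already recycled");
        }
    }

    /** @return true if this call was the last reference and recycled the buffer. */
    boolean release() {
        int remaining = refCount.decrementAndGet();
        if (remaining == 0) {
            buf.clear();
            freeList.offer(buf); // reused right away; no GC involved
            return true;
        }
        if (remaining < 0) throw new IllegalStateException("double release");
        return false;
    }
}
```

The point Ted and Lars debate below follows directly from this shape: every consumer that drops a KeyValue backed by such a block (filters included) must call release(), which is the accounting burden Lars worries about.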
[jira] [Commented] (HBASE-5200) AM.ProcessRegionInTransition() and AM.handleRegion() race thus leaving the region assignment inconsistent
[ https://issues.apache.org/jira/browse/HBASE-5200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13211390#comment-13211390 ]

ramkrishna.s.vasudevan commented on HBASE-5200:
-----------------------------------------------

Thanks for the commit, Stack. I was just thinking of committing it if it had not been done already.

AM.ProcessRegionInTransition() and AM.handleRegion() race thus leaving the region assignment inconsistent
---------------------------------------------------------------------------------------------------------

                 Key: HBASE-5200
                 URL: https://issues.apache.org/jira/browse/HBASE-5200
             Project: HBase
          Issue Type: Bug
    Affects Versions: 0.90.5
            Reporter: ramkrishna.s.vasudevan
            Assignee: ramkrishna.s.vasudevan
             Fix For: 0.94.0, 0.92.1
         Attachments: 5200-test.txt, 5200-v2.txt, 5200-v3.txt, 5200-v4-092.txt, 5200-v4.txt, 5200-v4no-prefix.txt, HBASE-5200.patch, HBASE-5200_1.patch, HBASE-5200_trunk_latest_with_test_2.patch, TEST-org.apache.hadoop.hbase.master.TestRestartCluster.xml, hbase-5200_90_latest.patch, hbase-5200_90_latest_new.patch

This is the scenario. Consider a case where the balancer is running and is trying to close regions on a RS. Before the close completes, a master switch happens. On master switch, the set of nodes that are in RIT is collected; we first get the data and start watching the node, and only after that is the node data added into RIT. If, in this window (before adding to RIT), the RS to which close was called does a transition in AM.handleRegion(), we miss the handling because the RIT state is null.
{code}
2012-01-13 10:50:46,358 WARN org.apache.hadoop.hbase.master.AssignmentManager: Received CLOSED for region a66d281d231dfcaea97c270698b26b6f from server HOST-192-168-47-205,20020,1326363111288 but region was in the state null and not in expected PENDING_CLOSE or CLOSING states
2012-01-13 10:50:46,358 WARN org.apache.hadoop.hbase.master.AssignmentManager: Received CLOSED for region c12e53bfd48ddc5eec507d66821c4d23 from server HOST-192-168-47-205,20020,1326363111288 but region was in the state null and not in expected PENDING_CLOSE or CLOSING states
2012-01-13 10:50:46,358 WARN org.apache.hadoop.hbase.master.AssignmentManager: Received CLOSED for region 59ae13de8c1eb325a0dd51f4902d2052 from server HOST-192-168-47-205,20020,1326363111288 but region was in the state null and not in expected PENDING_CLOSE or CLOSING states
2012-01-13 10:50:46,359 WARN org.apache.hadoop.hbase.master.AssignmentManager: Received CLOSED for region f45bc9614d7575f35244849af85aa078 from server HOST-192-168-47-205,20020,1326363111288 but region was in the state null and not in expected PENDING_CLOSE or CLOSING states
2012-01-13 10:50:46,359 WARN org.apache.hadoop.hbase.master.AssignmentManager: Received CLOSED for region cc3ecd7054fe6cd4a1159ed92fd62641 from server HOST-192-168-47-204,20020,1326342744518 but region was in the state null and not in expected PENDING_CLOSE or CLOSING states
2012-01-13 10:50:46,359 WARN org.apache.hadoop.hbase.master.AssignmentManager: Received CLOSED for region 3af40478a17fee96b4a192b22c90d5a2 from server HOST-192-168-47-205,20020,1326363111288 but region was in the state null and not in expected PENDING_CLOSE or CLOSING states
2012-01-13 10:50:46,359 WARN org.apache.hadoop.hbase.master.AssignmentManager: Received CLOSED for region e6096a8466e730463e10d3d61f809b92 from server HOST-192-168-47-204,20020,1326342744518 but region was in the state null and not in expected PENDING_CLOSE or CLOSING states
2012-01-13 10:50:46,359 WARN org.apache.hadoop.hbase.master.AssignmentManager: Received CLOSED for region 4806781a1a23066f7baed22b4d237e24 from server HOST-192-168-47-204,20020,1326342744518 but region was in the state null and not in expected PENDING_CLOSE or CLOSING states
2012-01-13 10:50:46,359 WARN org.apache.hadoop.hbase.master.AssignmentManager: Received CLOSED for region d69e104131accaefe21dcc01fddc7629 from server HOST-192-168-47-205,20020,1326363111288 but region was in the state null and not in expected PENDING_CLOSE or CLOSING states
{code}

In the branch, the CLOSING node is created by the RS, leading to further inconsistency.
[jira] [Commented] (HBASE-5399) Cut the link between the client and the zookeeper ensemble
[ https://issues.apache.org/jira/browse/HBASE-5399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13211422#comment-13211422 ]

nkeywal commented on HBASE-5399:
--------------------------------

In v9 I added some comments and fixed some issues.

bq. I'd say lets not check master is there till we need it. Seems like a PITA going ahead and checking master on construction. This changes the behavior but I think its one thats ok to change.

Agreed, removed in v9.

bq. You are doing your own Callables. You've seen the Callables that go on in HTable. Any reason you avoid them? I suppose this is different in that you want to let go of the shared master. Looks fine.

I could also add the master management in the HTable callables. I would still need the one I wrote for HBaseAdmin, but it could be useful to be able to use the master from the HTable callables.

I still need to work on the unit tests; some large tests are failing today:
{noformat}
Failed tests:
  testBackgroundEvictionThread[1](org.apache.hadoop.hbase.io.hfile.TestLruBlockCache)
  testShutdownFixupWhenDaughterHasSplit(org.apache.hadoop.hbase.regionserver.TestSplitTransactionOnCluster): expected:<1> but was:<0>
  queueFailover(org.apache.hadoop.hbase.replication.TestReplication): Waited too much time for queueFailover replication
  testMergeTool(org.apache.hadoop.hbase.util.TestMergeTool): 'merging regions 0 and 1' failed with errCode -1

Tests in error:
  test3686a(org.apache.hadoop.hbase.client.TestScannerTimeout): 64142ms passed since the last invocation, timeout is currently set to 1
{noformat}
For some of them it could be random and unrelated, but I am sure there are real issues among them. I am on vacation next week, so I will come back to it the week after...
Cut the link between the client and the zookeeper ensemble
----------------------------------------------------------

                 Key: HBASE-5399
                 URL: https://issues.apache.org/jira/browse/HBASE-5399
             Project: HBase
          Issue Type: Improvement
          Components: client
    Affects Versions: 0.94.0
         Environment: all
            Reporter: nkeywal
            Assignee: nkeywal
            Priority: Minor
         Attachments: 5399_inprogress.patch, 5399_inprogress.v3.patch, 5399_inprogress.v9.patch

The link is often considered an issue, for various reasons, one of them being that there is a limit on the number of connections that ZK can manage. Stack was suggesting as well to remove the link to the master from HConnection. There are choices to be made considering the existing API (which we don't want to break). The first patches I will submit to hadoop-qa should not be committed: they are here to show the progress of the direction taken.

ZooKeeper is used for:
- public getter, to let the client do whatever it wants, and close ZooKeeper when closing the connection => we have to deprecate this but keep it.
- reading the master address to create a master => now done with a temporary zookeeper connection
- reading the root location => now done with a temporary zookeeper connection, but questionable. Used in the public function locateRegion. To be reworked.
- reading the cluster id => now done once with a temporary zookeeper connection.
- checking if the base node is available => now done once with a zookeeper connection given as a parameter
- isTableDisabled/isTableAvailable => public functions, now done with a temporary zookeeper connection. Called internally from HBaseAdmin and HTable.
- getCurrentNrHRS(): public function to get the number of region servers and create a pool of threads => now done with a temporary zookeeper connection

The Master is used for:
- getMaster: public getter, as for ZooKeeper => we have to deprecate this but keep it.
- isMasterRunning(): public function, used internally by HMerge and HBaseAdmin
- getHTableDescriptor*: public functions offering access to the master => we could make them use a temporary master connection as well.

Main points are:
- hbase class for ZooKeeper: ZooKeeperWatcher is really designed for a strongly coupled architecture ;-). This can be changed, but requires a lot of modifications in these classes (likely adding a class in the middle of the hierarchy, something like that). Anyway, a non-connected client will always be slower, because establishing a tcp connection is slow.
- having a link between ZK and all the clients seems to make sense for some use cases. However, it won't scale if a TCP connection is required for every client.
- if we move the table descriptor part away from the client, we need to find a new place for it.
- we will have the same issue with HBaseAdmin (for both ZK and Master); maybe we can put a timeout on the connection. That would make the whole system less deterministic, however.
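The "temporary zookeeper connection" pattern above (open, read one znode, close, so no session outlives the read) can be sketched as follows. This is a hypothetical stand-in, not the actual patch: ZkSession, readZNode, and the openSessions counter are invented names, and the fake getData merely echoes the path so the sketch is self-contained:

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Sketch of per-read transient connections, the approach the patch takes
    for master address, root location, cluster id, etc. */
final class TransientZkRead {
    // Counts currently open sessions, to illustrate that none outlive a read.
    static final AtomicInteger openSessions = new AtomicInteger();

    /** Stand-in for a short-lived ZooKeeper session. */
    static final class ZkSession implements AutoCloseable {
        ZkSession() { openSessions.incrementAndGet(); }
        byte[] getData(String znode) { return znode.getBytes(); } // fake read
        @Override public void close() { openSessions.decrementAndGet(); }
    }

    /** Each read opens its own session and closes it immediately afterwards,
        so the client holds no permanent link to the ensemble. */
    static byte[] readZNode(String znode) {
        try (ZkSession zk = new ZkSession()) {
            return zk.getData(znode);
        }
    }
}
```

The trade-off nkeywal notes is visible here: every call pays TCP connection setup, which is why a non-connected client will always be slower.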
[jira] [Commented] (HBASE-5399) Cut the link between the client and the zookeeper ensemble
[ https://issues.apache.org/jira/browse/HBASE-5399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13211435#comment-13211435 ]

nkeywal commented on HBASE-5399:
--------------------------------

Moreover, the v9 patch works on trunk as of today.
[jira] [Created] (HBASE-5435) TestForceCacheImportantBlocks fails with OutOfMemoryError
TestForceCacheImportantBlocks fails with OutOfMemoryError
---------------------------------------------------------

                 Key: HBASE-5435
                 URL: https://issues.apache.org/jira/browse/HBASE-5435
             Project: HBase
          Issue Type: Test
            Reporter: Zhihong Yu
             Fix For: 0.94.0

Here is the related stack trace (see https://builds.apache.org/job/HBase-TRUNK/2665/testReport/org.apache.hadoop.hbase.io.hfile/TestForceCacheImportantBlocks/testCacheBlocks_1_/):
{code}
Caused by: java.lang.OutOfMemoryError
  at java.util.zip.Deflater.init(Native Method)
  at java.util.zip.Deflater.<init>(Deflater.java:124)
  at java.util.zip.GZIPOutputStream.<init>(GZIPOutputStream.java:46)
  at java.util.zip.GZIPOutputStream.<init>(GZIPOutputStream.java:58)
  at org.apache.hadoop.hbase.io.hfile.ReusableStreamGzipCodec$ReusableGzipOutputStream$ResetableGZIPOutputStream.<init>(ReusableStreamGzipCodec.java:79)
  at org.apache.hadoop.hbase.io.hfile.ReusableStreamGzipCodec$ReusableGzipOutputStream.<init>(ReusableStreamGzipCodec.java:90)
  at org.apache.hadoop.hbase.io.hfile.ReusableStreamGzipCodec.createOutputStream(ReusableStreamGzipCodec.java:130)
  at org.apache.hadoop.io.compress.GzipCodec.createOutputStream(GzipCodec.java:101)
  at org.apache.hadoop.hbase.io.hfile.Compression$Algorithm.createPlainCompressionStream(Compression.java:239)
  at org.apache.hadoop.hbase.io.hfile.Compression$Algorithm.createCompressionStream(Compression.java:223)
  at org.apache.hadoop.hbase.io.hfile.HFileWriterV1.getCompressingStream(HFileWriterV1.java:270)
  at org.apache.hadoop.hbase.io.hfile.HFileWriterV1.close(HFileWriterV1.java:416)
  at org.apache.hadoop.hbase.regionserver.StoreFile$Writer.close(StoreFile.java:1115)
  at org.apache.hadoop.hbase.regionserver.Store.internalFlushCache(Store.java:706)
  at org.apache.hadoop.hbase.regionserver.Store.flushCache(Store.java:633)
  at org.apache.hadoop.hbase.regionserver.Store.access$400(Store.java:106)
{code}
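The failure above originates in Deflater.init(Native Method): java.util.zip.Deflater backs each instance with native zlib state outside the Java heap, so the OutOfMemoryError here is about native memory, not heap. A minimal sketch of the relevant hygiene (an explicit end() in a finally block, rather than waiting for finalization) under no HBase-specific assumptions:

```java
import java.util.zip.Deflater;

final class DeflaterDemo {
    /** Compress a buffer with a Deflater whose native zlib state is released
        promptly via end(), instead of lingering until finalization. */
    static byte[] compress(byte[] input) {
        Deflater deflater = new Deflater();
        try {
            deflater.setInput(input);
            deflater.finish();
            byte[] buf = new byte[Math.max(64, input.length * 2)];
            int n = deflater.deflate(buf); // enough for small inputs; loop in general
            byte[] out = new byte[n];
            System.arraycopy(buf, 0, out, 0, n);
            return out;
        } finally {
            deflater.end(); // frees the native memory that ran out above
        }
    }
}
```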
[jira] [Commented] (HBASE-5347) GC free memory management in Level-1 Block Cache
[ https://issues.apache.org/jira/browse/HBASE-5347?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13211610#comment-13211610 ]

Lars Hofhansl commented on HBASE-5347:
--------------------------------------

@Ted: That might be the right thing to do in filterRow(), but I am slowly getting the feeling that this will put a lot of onus on various parts of HBase (such as filters) to account for this.
[jira] [Commented] (HBASE-5399) Cut the link between the client and the zookeeper ensemble
[ https://issues.apache.org/jira/browse/HBASE-5399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13211612#comment-13211612 ]

Lars Hofhansl commented on HBASE-5399:
--------------------------------------

@Nicolas: Could you upload the patch to RB? That would make it easier to review. I'm loving all the changes discussed here so far :)
[jira] [Commented] (HBASE-3149) Make flush decisions per column family
[ https://issues.apache.org/jira/browse/HBASE-3149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13211614#comment-13211614 ]

Lars Hofhansl commented on HBASE-3149:
--------------------------------------

@Nicolas: Interesting bit about hstore.compaction.min.size. I'm curious: is 4MB something that works specifically for your setup, or would you generally recommend setting it that low? It probably depends on whether compression is enabled, how many CFs there are, their relative sizes, etc. Maybe instead of defaulting it to the flush size, we could default it to flushsize/2 or flushsize/4...?

Make flush decisions per column family
--------------------------------------

                 Key: HBASE-3149
                 URL: https://issues.apache.org/jira/browse/HBASE-3149
             Project: HBase
          Issue Type: Improvement
          Components: regionserver
            Reporter: Karthik Ranganathan
            Assignee: Nicolas Spiegelberg

Today, the flush decision is made using the aggregate size of all column families. When large and small column families co-exist, this causes many small flushes of the smaller CF. We need to make per-CF flush decisions.
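The contrast in the description can be sketched side by side. This is a hypothetical illustration of the two policies (names invented, limits arbitrary), not the eventual HBase implementation:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Sketch: aggregate vs. per-column-family flush selection. */
final class FlushPolicySketch {
    /** Today: once the aggregate memstore size crosses the limit,
        every CF is flushed, including tiny ones. */
    static List<String> aggregateFlush(Map<String, Long> cfSizes, long flushLimit) {
        long total = cfSizes.values().stream().mapToLong(Long::longValue).sum();
        return total >= flushLimit ? new ArrayList<>(cfSizes.keySet()) : List.of();
    }

    /** Proposed: flush only the families that individually exceed a threshold,
        so a small CF is not sprayed into many tiny files because a big CF
        filled the memstore. */
    static List<String> perFamilyFlush(Map<String, Long> cfSizes, long perCfLimit) {
        List<String> toFlush = new ArrayList<>();
        for (Map.Entry<String, Long> e : cfSizes.entrySet()) {
            if (e.getValue() >= perCfLimit) toFlush.add(e.getKey());
        }
        return toFlush;
    }
}
```

With a 100-unit "big" CF and a 5-unit "small" CF, the aggregate policy flushes both, while the per-family policy flushes only "big", which is exactly the small-flush problem the issue describes.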
[jira] [Commented] (HBASE-4365) Add a decent heuristic for region size
[ https://issues.apache.org/jira/browse/HBASE-4365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13211619#comment-13211619 ]

Lars Hofhansl commented on HBASE-4365:
--------------------------------------

Wouldn't we potentially do a lot of splitting when there are many regionservers? (Maybe I am not grokking this fully.) If we take the square of the number of regions, and say we have 10gb regions and a flush size of 128mb, we'd reach the 10gb only after 9 regions of the table are on the same regionserver. We were planning a region size of 5gb and a flush size of 256mb; that would still be 5 regions. (10gb/128mb ~ 78, 5gb/256mb ~ 19)

Add a decent heuristic for region size
--------------------------------------

                 Key: HBASE-4365
                 URL: https://issues.apache.org/jira/browse/HBASE-4365
             Project: HBase
          Issue Type: Improvement
    Affects Versions: 0.94.0, 0.92.1
            Reporter: Todd Lipcon
            Priority: Critical
              Labels: usability
         Attachments: 4365.txt

A few of us were brainstorming this morning about what the default region size should be. A few general points were made:
- in some ways it's better to be too large than too small, since you can always split a table further, but you can't currently merge regions
- with HFile v2 and multithreaded compactions there are fewer reasons to avoid very large regions (10GB+)
- for small tables you may want a small region size just so you can distribute load better across a cluster
- for big tables, multi-GB is probably best
[jira] [Issue Comment Edited] (HBASE-4365) Add a decent heuristic for region size
[ https://issues.apache.org/jira/browse/HBASE-4365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13211619#comment-13211619 ]

Lars Hofhansl edited comment on HBASE-4365 at 2/20/12 12:39 AM:
----------------------------------------------------------------

Wouldn't we potentially do a lot of splitting when there are many regionservers? (Maybe I am not grokking this fully.) If we take the square of the number of regions, and say we have 10gb regions and a flush size of 128mb, we'd reach the 10gb only after 9 regions of the table are on the same regionserver. We were planning a region size of 5gb and a flush size of 256mb; that would still be 5 regions. (10gb/128mb ~ 78, 5gb/256mb ~ 19)

was (Author: lhofhansl):
The same comment, with "we'd read the 10gb" in place of "we'd reach the 10gb".
[jira] [Commented] (HBASE-4365) Add a decent heuristic for region size
[ https://issues.apache.org/jira/browse/HBASE-4365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13211620#comment-13211620 ]

Lars Hofhansl commented on HBASE-4365:
--------------------------------------

Never use a calculator... 10gb/128mb = 80, 5gb/256mb = 20.
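The arithmetic in this thread can be written down explicitly. A minimal sketch of the heuristic as Lars describes it (effective split size grows with the square of the number of the table's regions on a server, capped at the configured maximum); the class and method names are invented for illustration:

```java
/** Sketch of the region-size heuristic under discussion on HBASE-4365. */
final class SplitHeuristicSketch {
    /** Effective split size: flushSize * regionCount^2, capped at the max. */
    static long splitThreshold(long maxRegionSize, long flushSize, int regionCount) {
        long growing = flushSize * (long) regionCount * (long) regionCount;
        return Math.min(maxRegionSize, growing);
    }

    /** How many regions on one server until the growing threshold
        reaches the configured maximum region size. */
    static int regionsToReachMax(long maxRegionSize, long flushSize) {
        int n = 1;
        while (flushSize * (long) n * n < maxRegionSize) n++;
        return n;
    }
}
```

Plugging in the thread's numbers: with 10gb max and 128mb flushes the cap is reached at 9 regions (9^2 * 128mb = 10368mb), and with 5gb max and 256mb flushes at 5 regions (5^2 * 256mb = 6400mb), matching the comment above.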
[jira] [Commented] (HBASE-5332) Deterministic Compaction Jitter
[ https://issues.apache.org/jira/browse/HBASE-5332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13211641#comment-13211641 ]

Phabricator commented on HBASE-5332:
------------------------------------

nspiegelberg has commented on the revision "[jira] [HBASE-5332] Deterministic Compaction Jitter".

INLINE COMMENTS
  src/main/java/org/apache/hadoop/hbase/regionserver/Store.java:1067 I can null check. Note that this variable is not null-checked in a lot of places in Store.java. I wish there were a way to guarantee non-null.
  src/main/java/org/apache/hadoop/hbase/regionserver/Store.java:1065 I was worried about a case where this check happened while a major compaction was finishing up (see Store.completeCompaction). Besides per-file compactions, the HRegionServer.CompactionChecker thread is another area where this race can occur. Maybe this isn't a practical problem: since we don't cache this calculation anymore, the major compaction interval should normally be very large, so a race condition won't trigger a major compaction.
  src/main/java/org/apache/hadoop/hbase/regionserver/Store.java:1068 Thinking about it more, the pathname alone should suffice since it's a 'cryptographically strong' random UUID.

REVISION DETAIL
  https://reviews.facebook.net/D1785

Deterministic Compaction Jitter
-------------------------------

                 Key: HBASE-5332
                 URL: https://issues.apache.org/jira/browse/HBASE-5332
             Project: HBase
          Issue Type: Improvement
            Reporter: Nicolas Spiegelberg
            Assignee: Nicolas Spiegelberg
            Priority: Minor
         Attachments: D1785.1.patch, D1785.2.patch

Currently, we add jitter to a compaction using delay + jitter*(1 - 2*Math.random()). Since this is non-deterministic, we can get major compaction storms on server restart, as half the Stores that were set to delay + jitter will now be set to delay - jitter. We need a more deterministic way to jitter major compactions so this information can persist across server restarts.
[jira] [Commented] (HBASE-5332) Deterministic Compaction Jitter
[ https://issues.apache.org/jira/browse/HBASE-5332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13211642#comment-13211642 ]

Nicolas Spiegelberg commented on HBASE-5332:
--------------------------------------------

@Lars: Note that the operation you describe operates on a double and converts to a long at the end, so there is no loss of granularity. Also note that the other proposed changes would 2x or 3x the jitter relative to the user-configured percentage value.
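The fix under review can be sketched from the issue description and the inline comments above: keep the existing formula delay + jitter*(1 - 2*random), but draw the random number from a seed that survives restarts (the review suggests the store file pathname, which already embeds a random UUID). This is a hypothetical illustration, not the D1785 patch; the names are invented:

```java
import java.util.Random;

/** Sketch: deterministic compaction jitter, seeded by a stable per-store value. */
final class DeterministicJitter {
    /** Same shape as delay + jitter*(1 - 2*Math.random()), but reproducible:
        the same seed source yields the same jittered delay after a restart,
        avoiding the half-the-stores-flip compaction storm. */
    static long jitteredDelay(long baseDelay, double jitterPct, String stableSeedSource) {
        Random rnd = new Random(stableSeedSource.hashCode()); // e.g. store file pathname
        double offset = jitterPct * (1.0 - 2.0 * rnd.nextDouble()); // in (-jitterPct, +jitterPct]
        return baseDelay + (long) (baseDelay * offset);
    }
}
```

Note the computation stays in double until the final cast to long, which is Nicolas's point above about granularity.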
[jira] [Updated] (HBASE-5332) Deterministic Compaction Jitter
[ https://issues.apache.org/jira/browse/HBASE-5332?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Phabricator updated HBASE-5332:
-------------------------------

    Attachment: D1785.3.patch

nspiegelberg updated the revision "[jira] [HBASE-5332] Deterministic Compaction Jitter".

Reviewers: JIRA, Kannan, aaiyer, stack

Addressed Kannan's peer review. Raised the compaction delay because of false positives when the jitter brought it down to 3 seconds.

REVISION DETAIL
  https://reviews.facebook.net/D1785

AFFECTED FILES
  src/main/java/org/apache/hadoop/hbase/regionserver/Store.java
  src/test/java/org/apache/hadoop/hbase/regionserver/TestCompactSelection.java
  src/test/java/org/apache/hadoop/hbase/regionserver/TestCompaction.java
[jira] [Updated] (HBASE-5424) HTable meet NPE when call getRegionInfo()
[ https://issues.apache.org/jira/browse/HBASE-5424?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhiyuan.dai updated HBASE-5424: --- Attachment: HBase-5424_1.patch HTable meet NPE when call getRegionInfo() - Key: HBASE-5424 URL: https://issues.apache.org/jira/browse/HBASE-5424 Project: HBase Issue Type: Bug Affects Versions: 0.90.1, 0.90.5 Reporter: junhua yang Attachments: HBASE-5424.patch, HBase-5424_1.patch Original Estimate: 48h Remaining Estimate: 48h We hit an NPE when calling getRegionInfo() in our testing environment:
Exception in thread "main" java.lang.NullPointerException
    at org.apache.hadoop.hbase.util.Writables.getWritable(Writables.java:75)
    at org.apache.hadoop.hbase.util.Writables.getHRegionInfo(Writables.java:119)
    at org.apache.hadoop.hbase.client.HTable$2.processRow(HTable.java:395)
    at org.apache.hadoop.hbase.client.MetaScanner.metaScan(MetaScanner.java:190)
    at org.apache.hadoop.hbase.client.MetaScanner.metaScan(MetaScanner.java:95)
    at org.apache.hadoop.hbase.client.MetaScanner.metaScan(MetaScanner.java:73)
    at org.apache.hadoop.hbase.client.HTable.getRegionsInfo(HTable.java:418)
This NPE also prevents table.jsp from showing the region information for the table.
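The stack trace points at Writables.getWritable receiving null bytes, i.e. a .META. row whose info:regioninfo cell is empty or missing. A plausible guard (a sketch only; the actual HBASE-5424 patch is not shown here) is to validate the cell bytes before deserializing:

```java
public class MetaRowGuard {
    // Hypothetical guard: Writables.getWritable(Writables.java:75) throws
    // NullPointerException when handed null bytes, so processRow() should
    // skip catalog rows whose info:regioninfo cell is missing or empty
    // rather than attempt to deserialize them into an HRegionInfo.
    static boolean isUsableRegionInfoCell(byte[] value) {
        return value != null && value.length > 0;
    }

    public static void main(String[] args) {
        System.out.println(isUsableRegionInfoCell(null));           // missing cell: skip the row
        System.out.println(isUsableRegionInfoCell(new byte[0]));    // empty cell: skip the row
        System.out.println(isUsableRegionInfoCell(new byte[]{1}));  // safe to deserialize
    }
}
```

Such half-written catalog rows can appear during splits or failed region updates, which is why a scan over .META. has to tolerate them rather than crash.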
[jira] [Commented] (HBASE-5424) HTable meet NPE when call getRegionInfo()
[ https://issues.apache.org/jira/browse/HBASE-5424?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13211649#comment-13211649 ] Hadoop QA commented on HBASE-5424: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12515208/HBase-5424_1.patch against trunk revision . +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. -1 patch. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/991//console This message is automatically generated.
[jira] [Commented] (HBASE-5431) Improve delete marker handling in Import M/R jobs
[ https://issues.apache.org/jira/browse/HBASE-5431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13211681#comment-13211681 ] Hudson commented on HBASE-5431: --- Integrated in HBase-TRUNK-security #117 (See [https://builds.apache.org/job/HBase-TRUNK-security/117/]) HBASE-5431 Improve delete marker handling in Import M/R jobs (Revision 1290955) Result = FAILURE larsh : Files : * /hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/Delete.java * /hbase/trunk/src/main/java/org/apache/hadoop/hbase/mapreduce/Import.java Improve delete marker handling in Import M/R jobs - Key: HBASE-5431 URL: https://issues.apache.org/jira/browse/HBASE-5431 Project: HBase Issue Type: Sub-task Components: mapreduce Affects Versions: 0.94.0 Reporter: Lars Hofhansl Assignee: Lars Hofhansl Priority: Minor Fix For: 0.94.0 Attachments: 5431.txt Import currently creates a new Delete object for each delete KV found in a Result object. This can be improved with the new Delete API that allows adding a delete KV to an existing Delete object.
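The improvement can be sketched as follows. KV and Delete below are simplified stand-ins for the HBase classes, and addDeleteMarker mirrors the shape of the new API the issue refers to; none of this is the literal patch code:

```java
import java.util.ArrayList;
import java.util.List;

public class ImportDeleteBatching {
    // Simplified stand-in for org.apache.hadoop.hbase.KeyValue.
    static final class KV {
        final boolean isDeleteMarker;
        KV(boolean isDeleteMarker) { this.isDeleteMarker = isDeleteMarker; }
    }

    // Simplified stand-in for org.apache.hadoop.hbase.client.Delete.
    static final class Delete {
        final List<KV> markers = new ArrayList<>();
        void addDeleteMarker(KV kv) { markers.add(kv); }
    }

    // Before the change: one Delete allocated per delete KV in the Result.
    // After: all delete KVs for the row are folded into a single Delete.
    static Delete collectDeletes(List<KV> rowResult) {
        Delete d = null;
        for (KV kv : rowResult) {
            if (kv.isDeleteMarker) {
                if (d == null) d = new Delete(); // at most one Delete per row
                d.addDeleteMarker(kv);
            }
        }
        return d; // null when the row carried no delete markers
    }

    public static void main(String[] args) {
        List<KV> kvs = new ArrayList<>();
        kvs.add(new KV(true));
        kvs.add(new KV(false));
        kvs.add(new KV(true));
        Delete d = collectDeletes(kvs);
        System.out.println(d.markers.size()); // both markers share one Delete object
    }
}
```

Besides saving allocations, batching also means one mutation per row is submitted to the target table instead of one per delete marker.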
[jira] [Updated] (HBASE-5075) regionserver crashed and failover
[ https://issues.apache.org/jira/browse/HBASE-5075?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhiyuan.dai updated HBASE-5075: --- Attachment: (was: 5075.patch) regionserver crashed and failover - Key: HBASE-5075 URL: https://issues.apache.org/jira/browse/HBASE-5075 Project: HBase Issue Type: Improvement Components: monitoring, regionserver, replication, zookeeper Affects Versions: 0.92.1 Reporter: zhiyuan.dai Fix For: 0.90.5 Attachments: HBase-5075-src.patch When a regionserver crashes, it takes too long for the hmaster to be notified, and even after the hmaster learns of the shutdown it takes a long time to recover the hlog's lease. HBase is an online database, so availability is very important. My idea to improve availability: a monitor node checks the regionserver's pid; if the pid no longer exists, we conclude the regionserver is down, delete its znode, and force-close the hlog file. The detection period could then be around 100ms.
[jira] [Updated] (HBASE-5075) regionserver crashed and failover
[ https://issues.apache.org/jira/browse/HBASE-5075?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhiyuan.dai updated HBASE-5075: --- Attachment: HBase-5075-src.patch
[jira] [Updated] (HBASE-5075) regionserver crashed and failover
[ https://issues.apache.org/jira/browse/HBASE-5075?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhiyuan.dai updated HBASE-5075: --- Attachment: Degion of Failure Detection.pdf
[jira] [Commented] (HBASE-5075) regionserver crashed and failover
[ https://issues.apache.org/jira/browse/HBASE-5075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13211691#comment-13211691 ] zhiyuan.dai commented on HBASE-5075: @stack @Lars Hofhansl First, the rpc method getRSPidAndRsZknode fetches the PID and the znode, which includes the domain and service port; this approach is reliable. If we relied on the process list instead, there could be misjudgments. Second, there is a supervisor called RegionServerFailureDetection: we first start the regionserver and then start RegionServerFailureDetection, which acts as a watchdog for the RegionServer. The supervisor fetches the PID and znode via getRSPidAndRsZknode. RegionServerFailureDetection is not affected by long GC pauses. It first checks whether the PID is alive, then checks whether the service port is alive.
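The watchdog loop described above might look like this minimal sketch. It only shows the Linux /proc PID probe; the service-port check, znode deletion, and hlog force-close from the proposal are represented by a callback, and all names here are hypothetical:

```java
import java.nio.file.Files;
import java.nio.file.Paths;

public class PidWatchdog {
    // Sketch of the RegionServerFailureDetection idea: poll the
    // regionserver PID roughly every 100 ms; once it disappears, run the
    // failure action (in the proposal: delete the znode and force-close
    // the hlog so the master fails over quickly).
    static boolean isAlive(long pid) {
        // Linux-specific liveness probe via /proc; the real supervisor
        // would also verify the service port obtained via getRSPidAndRsZknode.
        return Files.isDirectory(Paths.get("/proc/" + pid));
    }

    static void watch(long pid, Runnable onFailure) throws InterruptedException {
        while (isAlive(pid)) {
            Thread.sleep(100); // ~100 ms detection period, as proposed
        }
        onFailure.run();
    }

    public static void main(String[] args) throws InterruptedException {
        // A bogus PID is "dead" immediately, so the failure action fires at once.
        watch(Long.MAX_VALUE, () -> System.out.println("regionserver down: delete znode, close hlog"));
    }
}
```

The PID probe is what makes this independent of long GC pauses: a paused JVM still has a live PID and an open port, so the watchdog will not misjudge it as dead the way a session-timeout mechanism can.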
[jira] [Commented] (HBASE-5399) Cut the link between the client and the zookeeper ensemble
[ https://issues.apache.org/jira/browse/HBASE-5399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13211703#comment-13211703 ] nkeywal commented on HBASE-5399: @Lars and all: done, see https://reviews.apache.org/r/3967/ I sent it to the whole hbase group, I hope that's the right thing to do. I'm on vacation this week so I will see the comments after the 27th... Cut the link between the client and the zookeeper ensemble -- Key: HBASE-5399 URL: https://issues.apache.org/jira/browse/HBASE-5399 Project: HBase Issue Type: Improvement Components: client Affects Versions: 0.94.0 Environment: all Reporter: nkeywal Assignee: nkeywal Priority: Minor Attachments: 5399_inprogress.patch, 5399_inprogress.v3.patch, 5399_inprogress.v9.patch The link is often considered an issue, for various reasons. One of them is that there is a limit on the number of connections that ZK can manage. Stack suggested as well removing the link to the master from HConnection. There are choices to be made considering the existing API (which we don't want to break). The first patches I will submit on hadoop-qa should not be committed: they are here to show the progress on the direction taken.
ZooKeeper is used for:
- public getter, to let the client do whatever it wants, and close ZooKeeper when closing the connection => we have to deprecate this but keep it.
- reading the master address to create a master => now done with a temporary zookeeper connection
- reading the root location => now done with a temporary zookeeper connection, but questionable. Used in the public function locateRegion. To be reworked.
- reading the cluster id => now done once with a temporary zookeeper connection.
- checking if the base node is available => now done once with a zookeeper connection given as a parameter
- isTableDisabled/isTableAvailable => public functions, now done with a temporary zookeeper connection. Called internally from HBaseAdmin and HTable
- getCurrentNrHRS(): public function to get the number of region servers and create a pool of threads => now done with a temporary zookeeper connection
Master is used for:
- getMaster: public getter, as for ZooKeeper => we have to deprecate this but keep it.
- isMasterRunning(): public function, used internally by HMerge and HBaseAdmin
- getHTableDescriptor*: public functions offering access to the master => we could make them use a temporary master connection as well.
Main points are:
- hbase class for ZooKeeper; ZooKeeperWatcher is really designed for a strongly coupled architecture ;-). This can be changed, but requires a lot of modifications in these classes (likely adding a class in the middle of the hierarchy, something like that). Anyway, a non-connected client will always be slower, because establishing a tcp connection is slow.
- having a link between ZK and all the clients seems to make sense for some use cases. However, it won't scale if a TCP connection is required for every client.
- if we move the table descriptor part away from the client, we need to find a new place for it.
- we will have the same issue with HBaseAdmin (for both ZK and Master); maybe we can put a timeout on the connection. That would make the whole system less deterministic, however.
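The "temporary zookeeper connection" pattern the list above keeps returning to can be sketched generically. ZkSession here is a hypothetical stand-in, not the real ZooKeeper client API; the point is the open-read-close lifecycle that leaves the client without a long-lived session:

```java
import java.util.function.Function;
import java.util.function.Supplier;

public class TemporaryConnection {
    // Hypothetical stand-in for a ZooKeeper client session.
    interface ZkSession extends AutoCloseable {
        byte[] getData(String znode);
        @Override void close(); // narrowed: no checked exception, usable in try-with-resources
    }

    // Open a session, perform a single read, and close immediately, so an
    // idle client holds no ZK connection. The trade-off noted in the
    // comment applies: each call pays TCP connection-establishment latency.
    static <R> R withTemporarySession(Supplier<ZkSession> factory,
                                      Function<ZkSession, R> action) {
        try (ZkSession session = factory.get()) {
            return action.apply(session); // session closes as soon as the read completes
        }
    }

    public static void main(String[] args) {
        ZkSession fake = new ZkSession() {
            public byte[] getData(String znode) { return znode.getBytes(); }
            public void close() { System.out.println("session closed"); }
        };
        byte[] data = withTemporarySession(() -> fake, s -> s.getData("/hbase/master"));
        System.out.println(data.length);
    }
}
```

Wrapping every read this way is what lets reads like the master address or cluster id stop counting against ZK's connection limit, at the cost of reconnecting on each call.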