[jira] [Commented] (HBASE-5347) GC free memory management in Level-1 Block Cache

2012-02-19 Thread Zhihong Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5347?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13211290#comment-13211290
 ] 

Zhihong Yu commented on HBASE-5347:
---

In filterRow(), we should deref the KeyValue that is removed from the 
list-of-KeyValues parameter.
In the 0.89-fb branch, I found DependentColumnFilter, where the above change 
should be added.
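The suggested change can be sketched as follows; `KeyValue.deref()` and the filter shape here are hypothetical stand-ins for the ref-counting API under discussion, not the actual HBase classes:

```java
import java.util.Iterator;
import java.util.List;

// Hypothetical sketch: when filterRow(List<KeyValue>) removes entries, also
// deref each removed one so a ref-counted block cache can reclaim the backing
// block. KeyValue here is a toy stand-in, not org.apache.hadoop.hbase.KeyValue.
class KeyValue {
    private int refCount = 1;          // the list initially holds one reference
    void deref() { refCount--; }
    int refCount() { return refCount; }
}

class DependentColumnFilterSketch {
    // Remove non-matching KeyValues and deref each one as it leaves the list.
    static void filterRow(List<KeyValue> kvs,
                          java.util.function.Predicate<KeyValue> keep) {
        Iterator<KeyValue> it = kvs.iterator();
        while (it.hasNext()) {
            KeyValue kv = it.next();
            if (!keep.test(kv)) {
                it.remove();
                kv.deref(); // without this, the removed KV pins its block
            }
        }
    }
}
```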

 GC free memory management in Level-1 Block Cache
 

 Key: HBASE-5347
 URL: https://issues.apache.org/jira/browse/HBASE-5347
 Project: HBase
  Issue Type: Improvement
Reporter: Prakash Khemani
Assignee: Prakash Khemani
 Attachments: D1635.5.patch


 On eviction of a block from the block-cache, instead of waiting for the 
 garbage collector to reuse its memory, reuse the block right away.
 This will require us to keep reference counts on the HFile blocks. Once we 
 have the reference counts in place we can do our own simple 
 blocks-out-of-slab allocation for the block-cache.
 This will help us with
 * reducing gc pressure, especially in the old generation
 * making it possible to have non-java-heap memory backing the HFile blocks
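A minimal sketch of the proposal (reference-counted blocks drawing buffers from a slab free-list); all class and method names below are illustrative assumptions, not HBase's actual implementation:

```java
import java.nio.ByteBuffer;
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.concurrent.atomic.AtomicInteger;

// Sketch: a block carries a reference count; when the count drops to zero
// after eviction, its buffer returns to a slab free-list instead of waiting
// for the garbage collector.
class SlabAllocator {
    private final Deque<ByteBuffer> free = new ArrayDeque<>();
    private final int blockSize;

    SlabAllocator(int blockSize, int numBlocks) {
        this.blockSize = blockSize;
        for (int i = 0; i < numBlocks; i++) {
            // allocateDirect(...) here would give the non-java-heap variant
            free.push(ByteBuffer.allocate(blockSize));
        }
    }

    synchronized ByteBuffer allocate() {
        return free.isEmpty() ? ByteBuffer.allocate(blockSize) : free.pop();
    }

    synchronized void release(ByteBuffer buf) {
        buf.clear();
        free.push(buf); // buffer is immediately reusable
    }

    synchronized int freeBlocks() { return free.size(); }
}

class RefCountedBlock {
    private final AtomicInteger refCount = new AtomicInteger(1); // cache's ref
    final ByteBuffer data;
    private final SlabAllocator slab;

    RefCountedBlock(SlabAllocator slab) {
        this.slab = slab;
        this.data = slab.allocate();
    }

    void retain() { refCount.incrementAndGet(); }

    void release() {
        if (refCount.decrementAndGet() == 0) {
            slab.release(data); // reuse right away, no GC involved
        }
    }
}
```

A reader would `retain()` a block before scanning it and `release()` when done; eviction is just the cache dropping its own reference.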

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5200) AM.ProcessRegionInTransition() and AM.handleRegion() race thus leaving the region assignment inconsistent

2012-02-19 Thread ramkrishna.s.vasudevan (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13211390#comment-13211390
 ] 

ramkrishna.s.vasudevan commented on HBASE-5200:
---

Thanks for the commit, Stack.  I was just thinking of committing it if it had 
not been.

 AM.ProcessRegionInTransition() and AM.handleRegion() race thus leaving the 
 region assignment inconsistent
 -

 Key: HBASE-5200
 URL: https://issues.apache.org/jira/browse/HBASE-5200
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.90.5
Reporter: ramkrishna.s.vasudevan
Assignee: ramkrishna.s.vasudevan
 Fix For: 0.94.0, 0.92.1

 Attachments: 5200-test.txt, 5200-v2.txt, 5200-v3.txt, 
 5200-v4-092.txt, 5200-v4.txt, 5200-v4no-prefix.txt, HBASE-5200.patch, 
 HBASE-5200_1.patch, HBASE-5200_trunk_latest_with_test_2.patch, 
 TEST-org.apache.hadoop.hbase.master.TestRestartCluster.xml, 
 hbase-5200_90_latest.patch, hbase-5200_90_latest_new.patch


 This is the scenario:
 Consider a case where the balancer is running and thus trying to close regions 
 in a RS.
 Before we can close, a master switch happens.
 On master switch, the set of nodes that are in RIT is collected; we first 
 get the data and start watching the node.
 After that, the node data is added into RIT.
 Now, by this time (before adding to RIT), if the RS to which close was issued 
 does a transition in AM.handleRegion(), we miss the handling, saying the RIT 
 state was null.
 {code}
 2012-01-13 10:50:46,358 WARN 
 org.apache.hadoop.hbase.master.AssignmentManager: Received CLOSED for region 
 a66d281d231dfcaea97c270698b26b6f from server 
 HOST-192-168-47-205,20020,1326363111288 but region was in  the state null and 
 not in expected PENDING_CLOSE or CLOSING states
 2012-01-13 10:50:46,358 WARN 
 org.apache.hadoop.hbase.master.AssignmentManager: Received CLOSED for region 
 c12e53bfd48ddc5eec507d66821c4d23 from server 
 HOST-192-168-47-205,20020,1326363111288 but region was in  the state null and 
 not in expected PENDING_CLOSE or CLOSING states
 2012-01-13 10:50:46,358 WARN 
 org.apache.hadoop.hbase.master.AssignmentManager: Received CLOSED for region 
 59ae13de8c1eb325a0dd51f4902d2052 from server 
 HOST-192-168-47-205,20020,1326363111288 but region was in  the state null and 
 not in expected PENDING_CLOSE or CLOSING states
 2012-01-13 10:50:46,359 WARN 
 org.apache.hadoop.hbase.master.AssignmentManager: Received CLOSED for region 
 f45bc9614d7575f35244849af85aa078 from server 
 HOST-192-168-47-205,20020,1326363111288 but region was in  the state null and 
 not in expected PENDING_CLOSE or CLOSING states
 2012-01-13 10:50:46,359 WARN 
 org.apache.hadoop.hbase.master.AssignmentManager: Received CLOSED for region 
 cc3ecd7054fe6cd4a1159ed92fd62641 from server 
 HOST-192-168-47-204,20020,1326342744518 but region was in  the state null and 
 not in expected PENDING_CLOSE or CLOSING states
 2012-01-13 10:50:46,359 WARN 
 org.apache.hadoop.hbase.master.AssignmentManager: Received CLOSED for region 
 3af40478a17fee96b4a192b22c90d5a2 from server 
 HOST-192-168-47-205,20020,1326363111288 but region was in  the state null and 
 not in expected PENDING_CLOSE or CLOSING states
 2012-01-13 10:50:46,359 WARN 
 org.apache.hadoop.hbase.master.AssignmentManager: Received CLOSED for region 
 e6096a8466e730463e10d3d61f809b92 from server 
 HOST-192-168-47-204,20020,1326342744518 but region was in  the state null and 
 not in expected PENDING_CLOSE or CLOSING states
 2012-01-13 10:50:46,359 WARN 
 org.apache.hadoop.hbase.master.AssignmentManager: Received CLOSED for region 
 4806781a1a23066f7baed22b4d237e24 from server 
 HOST-192-168-47-204,20020,1326342744518 but region was in  the state null and 
 not in expected PENDING_CLOSE or CLOSING states
 2012-01-13 10:50:46,359 WARN 
 org.apache.hadoop.hbase.master.AssignmentManager: Received CLOSED for region 
 d69e104131accaefe21dcc01fddc7629 from server 
 HOST-192-168-47-205,20020,1326363111288 but region was in  the state null and 
 not in expected PENDING_CLOSE or CLOSING states
 {code}
 In the branch, the CLOSING node is created by the RS, thus leading to more 
 inconsistency.
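The race described above is a check-then-act gap and can be illustrated with a toy model; the class and method names below are simplified stand-ins, not the real AssignmentManager API:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Illustrative only: on failover the master watches the node first and only
// later publishes its state to the in-memory RIT map; a CLOSED transition
// delivered inside that window finds a null state and is dropped, which is
// exactly the WARN in the log above.
class AssignmentManagerSketch {
    final Map<String, String> regionsInTransition = new ConcurrentHashMap<>();

    // Master failover path: step 1 (getData + watch) happens before step 2.
    void processRegionInTransition(String region, String stateFromZk) {
        // step 1: getData and set watch (simulated) ... window opens here
        // step 2: only now is the state visible to handleRegion()
        regionsInTransition.put(region, stateFromZk);
    }

    // RS transition handler: mimics the WARN path from the log.
    String handleClosed(String region) {
        String state = regionsInTransition.get(region);
        if (state == null) {
            return "WARN: region was in the state null"; // event lost
        }
        regionsInTransition.remove(region);
        return "closed";
    }
}
```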





[jira] [Commented] (HBASE-5399) Cut the link between the client and the zookeeper ensemble

2012-02-19 Thread nkeywal (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13211422#comment-13211422
 ] 

nkeywal commented on HBASE-5399:


In v9 I added some comments and fixed some issues. 

bq. I'd say lets not check master is there till we need it. Seems like a PITA 
going ahead and checking master on construction. This changes the behavior but 
I think its one thats ok to change.

Agreed, removed in v9.

bq. You are doing your own Callables. You've seen the Callables that go on in 
HTable. Any reason you avoid them? I suppose this is different in that you want 
to let go of the shared master. Looks fine.

I could also add the master management to the HTable callables. I would still 
need the one I wrote for HBaseAdmin, but would it be useful to be able to use 
the master from the HTable callables?

I still need to work on the unit tests, some large tests are failing today:
{noformat}
Failed tests:   testBackgroundEvictionThread[1](org.apache.hadoop.hbase.io.hfile.TestLruBlockCache)
  testShutdownFixupWhenDaughterHasSplit(org.apache.hadoop.hbase.regionserver.TestSplitTransactionOnCluster): expected:1 but was:0
  queueFailover(org.apache.hadoop.hbase.replication.TestReplication): Waited too much time for queueFailover replication
  testMergeTool(org.apache.hadoop.hbase.util.TestMergeTool): 'merging regions 0 and 1' failed with errCode -1

Tests in error:
  test3686a(org.apache.hadoop.hbase.client.TestScannerTimeout): 64142ms passed since the last invocation, timeout is currently set to 1
{noformat}

For some of them it could be random and unrelated, but I am sure there are some 
real issues among them.
I am on vacation next week, so I will come back to it the week after...



 Cut the link between the client and the zookeeper ensemble
 --

 Key: HBASE-5399
 URL: https://issues.apache.org/jira/browse/HBASE-5399
 Project: HBase
  Issue Type: Improvement
  Components: client
Affects Versions: 0.94.0
 Environment: all
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor
 Attachments: 5399_inprogress.patch, 5399_inprogress.v3.patch, 
 5399_inprogress.v9.patch


 The link is often considered as an issue, for various reasons. One of them 
 being that there is a limit on the number of connections that ZK can manage. 
 Stack was suggesting as well to remove the link to the master from HConnection.
 There are choices to be made considering the existing API (that we don't want 
 to break).
 The first patches I will submit on hadoop-qa should not be committed: they 
 are here to show the progress on the direction taken.
 ZooKeeper is used for:
 - public getter, to let the client do whatever he wants, and close ZooKeeper 
 when closing the connection => we have to deprecate this but keep it.
 - read the master address to create a master => now done with a temporary 
 zookeeper connection
 - read the root location => now done with a temporary zookeeper connection, 
 but questionable. Used in the public function locateRegion. To be reworked.
 - read the cluster id => now done once with a temporary zookeeper connection.
 - check if the base node is available => now done once with a zookeeper 
 connection given as a parameter
 - isTableDisabled/isTableAvailable => public functions, now done with a 
 temporary zookeeper connection.
  - Called internally from HBaseAdmin and HTable
 - getCurrentNrHRS(): public function to get the number of region servers and 
 create a pool of threads => now done with a temporary zookeeper connection
 -
 Master is used for:
 - getMaster public getter, as for ZooKeeper => we have to deprecate this but 
 keep it.
 - isMasterRunning(): public function, used internally by HMerge & HBaseAdmin
 - getHTableDescriptor*: public functions offering access to the master => 
 we could make them use a temporary master connection as well.
 The main points are:
 - the hbase class for ZooKeeper, ZooKeeperWatcher, is really designed for a 
 strongly coupled architecture ;-). This can be changed, but requires a lot of 
 modifications in these classes (likely adding a class in the middle of the 
 hierarchy, something like that). Anyway, a non-connected client will always be 
 really slower, because it needs a tcp connection, and establishing a tcp 
 connection is slow.
 - having a link between ZK and all the clients seems to make sense for some 
 use cases. However, it won't scale if a TCP connection is required for every 
 client
 - if we move the table descriptor part away from the client, we need to find 
 a new place for it.
 - we will have the same issue with HBaseAdmin (for both ZK & Master); maybe 
 we can put a timeout on the connection. That would make the whole system less 
 deterministic, however.
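The "temporary zookeeper connection" pattern listed above can be sketched as open, read one value, close, so no long-lived link to the ensemble remains; `ZkSession` and its methods are placeholders, not the real ZooKeeperWatcher API:

```java
// Sketch of the open-read-close pattern. The session counter exists only so
// the example can demonstrate that no connection lingers afterwards.
class ZkSession implements AutoCloseable {
    static int openSessions = 0;
    private boolean open = true;

    ZkSession() { openSessions++; }

    String read(String znode) {
        if (!open) throw new IllegalStateException("session closed");
        return "value-of-" + znode; // stand-in for a real getData call
    }

    @Override public void close() { open = false; openSessions--; }
}

class ConnectionSketch {
    // e.g. reading the cluster id once, with no connection kept afterwards
    static String readClusterId() {
        try (ZkSession zk = new ZkSession()) {
            return zk.read("/hbase/hbaseid");
        }
    }
}
```

The trade-off the comment notes applies here: each such read pays TCP connection setup, so a non-connected client is inherently slower.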


[jira] [Commented] (HBASE-5399) Cut the link between the client and the zookeeper ensemble

2012-02-19 Thread nkeywal (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13211435#comment-13211435
 ] 

nkeywal commented on HBASE-5399:


Moreover, the v9 patch works on trunk as of today.






[jira] [Created] (HBASE-5435) TestForceCacheImportantBlocks fails with OutOfMemoryError

2012-02-19 Thread Zhihong Yu (Created) (JIRA)
TestForceCacheImportantBlocks fails with OutOfMemoryError
-

 Key: HBASE-5435
 URL: https://issues.apache.org/jira/browse/HBASE-5435
 Project: HBase
  Issue Type: Test
Reporter: Zhihong Yu
 Fix For: 0.94.0


Here is the related stack trace (see 
https://builds.apache.org/job/HBase-TRUNK/2665/testReport/org.apache.hadoop.hbase.io.hfile/TestForceCacheImportantBlocks/testCacheBlocks_1_/):
{code}
Caused by: java.lang.OutOfMemoryError
at java.util.zip.Deflater.init(Native Method)
at java.util.zip.Deflater.init(Deflater.java:124)
at java.util.zip.GZIPOutputStream.init(GZIPOutputStream.java:46)
at java.util.zip.GZIPOutputStream.init(GZIPOutputStream.java:58)
at org.apache.hadoop.hbase.io.hfile.ReusableStreamGzipCodec$ReusableGzipOutputStream$ResetableGZIPOutputStream.init(ReusableStreamGzipCodec.java:79)
at org.apache.hadoop.hbase.io.hfile.ReusableStreamGzipCodec$ReusableGzipOutputStream.init(ReusableStreamGzipCodec.java:90)
at org.apache.hadoop.hbase.io.hfile.ReusableStreamGzipCodec.createOutputStream(ReusableStreamGzipCodec.java:130)
at org.apache.hadoop.io.compress.GzipCodec.createOutputStream(GzipCodec.java:101)
at org.apache.hadoop.hbase.io.hfile.Compression$Algorithm.createPlainCompressionStream(Compression.java:239)
at org.apache.hadoop.hbase.io.hfile.Compression$Algorithm.createCompressionStream(Compression.java:223)
at org.apache.hadoop.hbase.io.hfile.HFileWriterV1.getCompressingStream(HFileWriterV1.java:270)
at org.apache.hadoop.hbase.io.hfile.HFileWriterV1.close(HFileWriterV1.java:416)
at org.apache.hadoop.hbase.regionserver.StoreFile$Writer.close(StoreFile.java:1115)
at org.apache.hadoop.hbase.regionserver.Store.internalFlushCache(Store.java:706)
at org.apache.hadoop.hbase.regionserver.Store.flushCache(Store.java:633)
at org.apache.hadoop.hbase.regionserver.Store.access$400(Store.java:106)
{code}





[jira] [Commented] (HBASE-5347) GC free memory management in Level-1 Block Cache

2012-02-19 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5347?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13211610#comment-13211610
 ] 

Lars Hofhansl commented on HBASE-5347:
--

@Ted: That might be the right thing to do in filterRow(), but I am slowly 
getting the feeling that this will put a lot of onus on various HBase parts 
(such as filters) to account for this.






[jira] [Commented] (HBASE-5399) Cut the link between the client and the zookeeper ensemble

2012-02-19 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13211612#comment-13211612
 ] 

Lars Hofhansl commented on HBASE-5399:
--

@Nicolas: Could you upload the patch to RB? That would make it easier to review.
I'm loving all changes discussed here so far :)






[jira] [Commented] (HBASE-3149) Make flush decisions per column family

2012-02-19 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13211614#comment-13211614
 ] 

Lars Hofhansl commented on HBASE-3149:
--

@Nicolas: Interesting bit about hstore.compaction.min.size. I'm curious: is 4MB 
something that works specifically for your setup, or would you generally 
recommend setting it that low?
It probably has to do with whether compression is enabled, how many CFs there 
are, their relative sizes, etc.

Maybe instead of defaulting it to flushsize, we could default it to flushsize/2 
or flushsize/4...?


 Make flush decisions per column family
 --

 Key: HBASE-3149
 URL: https://issues.apache.org/jira/browse/HBASE-3149
 Project: HBase
  Issue Type: Improvement
  Components: regionserver
Reporter: Karthik Ranganathan
Assignee: Nicolas Spiegelberg

 Today, the flush decision is made using the aggregate size of all column 
 families. When large and small column families co-exist, this causes many 
 small flushes of the smaller CF. We need to make per-CF flush decisions.
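The per-CF idea can be sketched as a selection function over memstore sizes; the names and thresholds below are illustrative assumptions, not HBase's code:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Sketch: when the region's aggregate memstore size crosses the flush limit,
// flush only the families that are individually large enough to produce a
// reasonably sized file, leaving small CFs alone.
class PerFamilyFlushSketch {
    static List<String> familiesToFlush(Map<String, Long> memstoreSizes,
                                        long regionFlushSize,
                                        long perFamilyMinSize) {
        long total = memstoreSizes.values().stream()
                .mapToLong(Long::longValue).sum();
        List<String> pick = new ArrayList<>();
        if (total < regionFlushSize) return pick; // nothing to do yet
        for (Map.Entry<String, Long> e : memstoreSizes.entrySet()) {
            if (e.getValue() >= perFamilyMinSize) pick.add(e.getKey());
        }
        // Fall back to flushing everything if no single family qualifies.
        if (pick.isEmpty()) pick.addAll(memstoreSizes.keySet());
        return pick;
    }
}
```

With a large and a small CF co-existing, only the large one gets flushed, which is precisely the many-small-flushes problem the issue describes.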





[jira] [Commented] (HBASE-4365) Add a decent heuristic for region size

2012-02-19 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13211619#comment-13211619
 ] 

Lars Hofhansl commented on HBASE-4365:
--

Wouldn't we potentially do a lot of splitting when there are many regionservers?
(Maybe I am not grokking this fully.)

If we take the square of the number of regions, and say we have 10gb 
regions and a flush size of 128mb, we'd reach the 10gb after 9 regions of the 
table on the same regionserver.
We were planning a region size of 5gb and a flush size of 256mb; that would 
still be 5 regions.
(10gb/128mb ~ 78, 5gb/256mb ~ 19)
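The arithmetic above can be checked directly, assuming the heuristic reads as split-size = min(max region size, flush size × regions²); that formula is my reading of the discussion, not a confirmed implementation:

```java
// Hedged sketch of the heuristic as discussed (flushSize * numRegions^2,
// capped at the configured max region size); not a confirmed formula.
class SplitSizeSketch {
    static long splitSize(long flushSize, long maxRegionSize,
                          int regionsOnServer) {
        double candidate =
                (double) flushSize * regionsOnServer * regionsOnServer;
        return (long) Math.min(candidate, (double) maxRegionSize);
    }

    // Smallest region count at which the effective split size hits the cap.
    static int regionsToReachCap(long flushSize, long maxRegionSize) {
        int n = 1;
        while (splitSize(flushSize, maxRegionSize, n) < maxRegionSize) n++;
        return n;
    }
}
```

This reproduces both counts in the comment: with sqrt(10gb/128mb) ≈ 8.9 the cap is reached at 9 regions, and with sqrt(5gb/256mb) ≈ 4.5 at 5 regions.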


 Add a decent heuristic for region size
 --

 Key: HBASE-4365
 URL: https://issues.apache.org/jira/browse/HBASE-4365
 Project: HBase
  Issue Type: Improvement
Affects Versions: 0.94.0, 0.92.1
Reporter: Todd Lipcon
Priority: Critical
  Labels: usability
 Attachments: 4365.txt


 A few of us were brainstorming this morning about what the default region 
 size should be. There were a few general points made:
 - in some ways it's better to be too-large than too-small, since you can 
 always split a table further, but you can't merge regions currently
 - with HFile v2 and multithreaded compactions there are fewer reasons to 
 avoid very-large regions (10GB+)
 - for small tables you may want a small region size just so you can 
 distribute load better across a cluster
 - for big tables, multi-GB is probably best





[jira] [Issue Comment Edited] (HBASE-4365) Add a decent heuristic for region size

2012-02-19 Thread Lars Hofhansl (Issue Comment Edited) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13211619#comment-13211619
 ] 

Lars Hofhansl edited comment on HBASE-4365 at 2/20/12 12:39 AM:


Wouldn't we potentially do a lot of splitting when there are many regionservers?
(Maybe I am not grokking this fully.)

If we take the square of the number of regions, and say we have 10gb 
regions and a flush size of 128mb, we'd reach the 10gb after 9 regions of the 
table on the same regionserver.
We were planning a region size of 5gb and a flush size of 256mb; that would 
still be 5 regions.
(10gb/128mb ~ 78, 5gb/256mb ~ 19)


  was (Author: lhofhansl):
Wouldn't we potentially do a lot of splitting when there are many 
regionservers?
(Maybe I am not grokking this fully)

If we take the square of the of the number of regions, and say we have 10gb 
regions and flush size of 128mb, we'd read the 10gb after at 9 regions of the 
table on the same regionserver.
We were planning a region size of 5gb and flush size of 256mb, that would still 
be 5 regions.
(10gb/128mb ~ 78, 5gb/256mb ~ 19)

  





[jira] [Commented] (HBASE-4365) Add a decent heuristic for region size

2012-02-19 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13211620#comment-13211620
 ] 

Lars Hofhansl commented on HBASE-4365:
--

Never use a calculator... 10gb/128mb = 80, 5gb/256mb = 20.







[jira] [Commented] (HBASE-5332) Deterministic Compaction Jitter

2012-02-19 Thread Phabricator (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13211641#comment-13211641
 ] 

Phabricator commented on HBASE-5332:


nspiegelberg has commented on the revision [jira] [HBASE-5332] Deterministic 
Compaction Jitter.

INLINE COMMENTS
  src/main/java/org/apache/hadoop/hbase/regionserver/Store.java:1067 I can null 
check.  Note that this variable is not null-checked in a lot of places in 
Store.java.  I wish there were a way to guarantee non-null.
  src/main/java/org/apache/hadoop/hbase/regionserver/Store.java:1065 I was 
worried about a case where this check happened while a major compaction was 
finishing up (see Store.completeCompaction).  Besides per-file compactions, 
the HRegionServer.CompactionChecker thread is another area where this race can 
occur.

  Maybe this isn't a practical problem, since we don't cache this calculation 
anymore & the major compaction interval should normally be very large, so a 
race condition won't trigger a major compaction.
  src/main/java/org/apache/hadoop/hbase/regionserver/Store.java:1068 Thinking 
about it more, the pathname alone should suffice, since it's a 
'cryptographically strong' random UUID.

REVISION DETAIL
  https://reviews.facebook.net/D1785


 Deterministic Compaction Jitter
 ---

 Key: HBASE-5332
 URL: https://issues.apache.org/jira/browse/HBASE-5332
 Project: HBase
  Issue Type: Improvement
Reporter: Nicolas Spiegelberg
Assignee: Nicolas Spiegelberg
Priority: Minor
 Attachments: D1785.1.patch, D1785.2.patch


 Currently, we add jitter to a compaction using delay + jitter*(1 - 
 2*Math.random()).  Since this is non-deterministic, we can get major 
 compaction storms on server restart, as half the Stores that were set to 
 delay + jitter will now be set to delay - jitter.  We need a more 
 deterministic way to jitter major compactions so this information can persist 
 across server restarts.
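One way to read the fix under discussion: derive the jitter from a stable per-store seed (the review comments above suggest hashing a store file pathname) so the same value is recomputed after a restart. The following is a sketch of that idea, not the committed patch:

```java
import java.util.Random;

// Sketch: seed the jitter RNG from something stable per store (e.g. a hash of
// a store file path) instead of Math.random(), so the jittered delay is
// identical before and after a server restart.
class DeterministicJitterSketch {
    static long jitteredDelay(long baseDelay, double jitterPct,
                              long stableSeed) {
        Random rng = new Random(stableSeed); // same seed => same jitter
        // Mirrors delay + jitter*(1 - 2*random()), jitter = baseDelay*jitterPct
        double jitter = baseDelay * jitterPct * (1.0 - 2.0 * rng.nextDouble());
        return (long) (baseDelay + jitter);
    }
}
```

Because the seed persists across restarts, no half of the Stores snaps from delay + jitter to delay - jitter on startup, which is the compaction-storm scenario described above.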





[jira] [Commented] (HBASE-5332) Deterministic Compaction Jitter

2012-02-19 Thread Nicolas Spiegelberg (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13211642#comment-13211642
 ] 

Nicolas Spiegelberg commented on HBASE-5332:


@Lars: note that the operation you describe operates on a double and only 
converts to a long at the end, so there is no loss of granularity.  Also note 
that the other proposed changes would double or triple the jitter relative to 
the user-configured percentage value.
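The granularity point can be illustrated with a small sketch. The values are illustrative, and r is a fixed stand-in for Math.random(): staying in the double domain and casting at the end keeps the jitter, while truncating the (1 - 2*r) factor to a long first collapses it to 0 and wipes the jitter out.

```java
public class JitterGranularity {
    // Computed in the double domain, cast to long only at the end.
    static long doubleDomain(long delay, double jitterPct, double r) {
        return (long) (delay + jitterPct * delay * (1 - 2 * r));
    }

    // Hypothetical variant that truncates (1 - 2*r) to a long first:
    // the factor collapses to 0 (or -1), losing the jitter entirely.
    static long truncatedDomain(long delay, double jitterPct, double r) {
        return delay + (long) (jitterPct * delay) * (long) (1 - 2 * r);
    }

    public static void main(String[] args) {
        long delay = 86400000L;   // one day in ms
        double jitterPct = 0.5;   // user-configured jitter fraction
        double r = 0.25;          // fixed stand-in for Math.random()
        System.out.println(doubleDomain(delay, jitterPct, r));    // 108000000
        System.out.println(truncatedDomain(delay, jitterPct, r)); // 86400000
    }
}
```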

 Deterministic Compaction Jitter
 ---

 Key: HBASE-5332
 URL: https://issues.apache.org/jira/browse/HBASE-5332
 Project: HBase
  Issue Type: Improvement
Reporter: Nicolas Spiegelberg
Assignee: Nicolas Spiegelberg
Priority: Minor
 Attachments: D1785.1.patch, D1785.2.patch


 Currently, we add jitter to a compaction using delay + jitter*(1 - 
 2*Math.random()).  Since this is non-deterministic, we can get major 
 compaction storms on server restart as half the Stores that were set to 
 delay + jitter will now be set to delay - jitter.  We need a more 
 deterministic way to jitter major compactions so this information can persist 
 across server restarts.





[jira] [Updated] (HBASE-5332) Deterministic Compaction Jitter

2012-02-19 Thread Phabricator (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5332?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Phabricator updated HBASE-5332:
---

Attachment: D1785.3.patch

nspiegelberg updated the revision [jira] [HBASE-5332] Deterministic Compaction 
Jitter.
Reviewers: JIRA, Kannan, aaiyer, stack

  Addressed Kannan's peer review.  Raised compaction delay because of false 
positives when the jitter brought it to 3 sec.

REVISION DETAIL
  https://reviews.facebook.net/D1785

AFFECTED FILES
  src/main/java/org/apache/hadoop/hbase/regionserver/Store.java
  src/test/java/org/apache/hadoop/hbase/regionserver/TestCompactSelection.java
  src/test/java/org/apache/hadoop/hbase/regionserver/TestCompaction.java


 Deterministic Compaction Jitter
 ---

 Key: HBASE-5332
 URL: https://issues.apache.org/jira/browse/HBASE-5332
 Project: HBase
  Issue Type: Improvement
Reporter: Nicolas Spiegelberg
Assignee: Nicolas Spiegelberg
Priority: Minor
 Attachments: D1785.1.patch, D1785.2.patch, D1785.3.patch


 Currently, we add jitter to a compaction using delay + jitter*(1 - 
 2*Math.random()).  Since this is non-deterministic, we can get major 
 compaction storms on server restart as half the Stores that were set to 
 delay + jitter will now be set to delay - jitter.  We need a more 
 deterministic way to jitter major compactions so this information can persist 
 across server restarts.





[jira] [Updated] (HBASE-5424) HTable meet NPE when call getRegionInfo()

2012-02-19 Thread zhiyuan.dai (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5424?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhiyuan.dai updated HBASE-5424:
---

Attachment: HBase-5424_1.patch

 HTable meet NPE when call getRegionInfo()
 -

 Key: HBASE-5424
 URL: https://issues.apache.org/jira/browse/HBASE-5424
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.90.1, 0.90.5
Reporter: junhua yang
 Attachments: HBASE-5424.patch, HBase-5424_1.patch

   Original Estimate: 48h
  Remaining Estimate: 48h

 We hit an NPE when calling getRegionInfo() in our testing environment.
 Exception in thread "main" java.lang.NullPointerException
 at org.apache.hadoop.hbase.util.Writables.getWritable(Writables.java:75)
 at org.apache.hadoop.hbase.util.Writables.getHRegionInfo(Writables.java:119)
 at org.apache.hadoop.hbase.client.HTable$2.processRow(HTable.java:395)
 at org.apache.hadoop.hbase.client.MetaScanner.metaScan(MetaScanner.java:190)
 at org.apache.hadoop.hbase.client.MetaScanner.metaScan(MetaScanner.java:95)
 at org.apache.hadoop.hbase.client.MetaScanner.metaScan(MetaScanner.java:73)
 at org.apache.hadoop.hbase.client.HTable.getRegionsInfo(HTable.java:418)
 This NPE also prevents table.jsp from showing the region information for this 
 table.
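The trace blows up where Writables.getWritable is handed a null info:regioninfo cell (e.g. a half-written .META. row). A defensive guard along these lines would avoid the NPE; this is a self-contained sketch, with the plain-Map row and the method name standing in for the real Result / MetaScanner visitor types:

```java
import java.util.HashMap;
import java.util.Map;

public class MetaRowGuard {
    // Skip rows whose info:regioninfo cell is missing instead of handing
    // null bytes to getWritable, which is where the reported NPE occurs.
    static String regionInfoOrNull(Map<String, byte[]> metaRow) {
        byte[] bytes = metaRow.get("info:regioninfo");
        if (bytes == null || bytes.length == 0) {
            return null;  // half-written row: skip it rather than NPE
        }
        return new String(bytes);
    }

    public static void main(String[] args) {
        Map<String, byte[]> row = new HashMap<>();
        System.out.println(regionInfoOrNull(row));         // null, no NPE
        row.put("info:regioninfo", "regionA".getBytes());
        System.out.println(regionInfoOrNull(row));         // regionA
    }
}
```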





[jira] [Commented] (HBASE-5424) HTable meet NPE when call getRegionInfo()

2012-02-19 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5424?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13211649#comment-13211649
 ] 

Hadoop QA commented on HBASE-5424:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12515208/HBase-5424_1.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch.

-1 patch.  The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/991//console

This message is automatically generated.

 HTable meet NPE when call getRegionInfo()
 -

 Key: HBASE-5424
 URL: https://issues.apache.org/jira/browse/HBASE-5424
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.90.1, 0.90.5
Reporter: junhua yang
 Attachments: HBASE-5424.patch, HBase-5424_1.patch

   Original Estimate: 48h
  Remaining Estimate: 48h

 We hit an NPE when calling getRegionInfo() in our testing environment.
 Exception in thread "main" java.lang.NullPointerException
 at org.apache.hadoop.hbase.util.Writables.getWritable(Writables.java:75)
 at org.apache.hadoop.hbase.util.Writables.getHRegionInfo(Writables.java:119)
 at org.apache.hadoop.hbase.client.HTable$2.processRow(HTable.java:395)
 at org.apache.hadoop.hbase.client.MetaScanner.metaScan(MetaScanner.java:190)
 at org.apache.hadoop.hbase.client.MetaScanner.metaScan(MetaScanner.java:95)
 at org.apache.hadoop.hbase.client.MetaScanner.metaScan(MetaScanner.java:73)
 at org.apache.hadoop.hbase.client.HTable.getRegionsInfo(HTable.java:418)
 This NPE also prevents table.jsp from showing the region information for this 
 table.





[jira] [Commented] (HBASE-5431) Improve delete marker handling in Import M/R jobs

2012-02-19 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13211681#comment-13211681
 ] 

Hudson commented on HBASE-5431:
---

Integrated in HBase-TRUNK-security #117 (See 
[https://builds.apache.org/job/HBase-TRUNK-security/117/])
HBASE-5431 Improve delete marker handling in Import M/R jobs (Revision 
1290955)

 Result = FAILURE
larsh : 
Files : 
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/Delete.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/mapreduce/Import.java


 Improve delete marker handling in Import M/R jobs
 -

 Key: HBASE-5431
 URL: https://issues.apache.org/jira/browse/HBASE-5431
 Project: HBase
  Issue Type: Sub-task
  Components: mapreduce
Affects Versions: 0.94.0
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
Priority: Minor
 Fix For: 0.94.0

 Attachments: 5431.txt


 Import currently creates a new Delete object for each delete KV found in a 
 Result object.
 This can be improved with the new Delete API, which allows adding a delete KV 
 to an existing Delete object.
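The improvement amounts to batching all delete KVs for a row into one Delete instead of one Delete per KV. A self-contained sketch, where KeyValue and Delete are minimal stand-ins for the real org.apache.hadoop.hbase classes (only addDeleteMarker's shape is assumed from the description above):

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class BatchedDeletes {
    // Minimal stand-ins so the sketch compiles on its own.
    record KeyValue(String row, String qualifier) {}

    static class Delete {
        final String row;
        final List<KeyValue> markers = new ArrayList<>();
        Delete(String row) { this.row = row; }
        void addDeleteMarker(KeyValue kv) { markers.add(kv); }
    }

    // One Delete per row, instead of one Delete per delete KV.
    static Map<String, Delete> batch(List<KeyValue> deleteKvs) {
        Map<String, Delete> perRow = new HashMap<>();
        for (KeyValue kv : deleteKvs) {
            perRow.computeIfAbsent(kv.row(), Delete::new).addDeleteMarker(kv);
        }
        return perRow;
    }

    public static void main(String[] args) {
        List<KeyValue> kvs = List.of(
            new KeyValue("r1", "a"), new KeyValue("r1", "b"), new KeyValue("r2", "a"));
        System.out.println(batch(kvs).size());  // 2 Deletes for 3 delete KVs
    }
}
```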





[jira] [Updated] (HBASE-5075) regionserver crashed and failover

2012-02-19 Thread zhiyuan.dai (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5075?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhiyuan.dai updated HBASE-5075:
---

Attachment: (was: 5075.patch)

 regionserver crashed and failover
 -

 Key: HBASE-5075
 URL: https://issues.apache.org/jira/browse/HBASE-5075
 Project: HBase
  Issue Type: Improvement
  Components: monitoring, regionserver, replication, zookeeper
Affects Versions: 0.92.1
Reporter: zhiyuan.dai
 Fix For: 0.90.5

 Attachments: HBase-5075-src.patch


 When a regionserver crashes, it takes too long to notify the HMaster, and once 
 the HMaster learns of the regionserver's shutdown, it takes a long time to 
 acquire the HLog's lease.
 HBase is an online DB, so availability is very important.
 I have an idea to improve availability: a monitor node checks the 
 regionserver's PID. If that PID does not exist, I assume the RS is down, 
 delete the znode, and force-close the HLog file.
 The detection period could then be about 100 ms.





[jira] [Updated] (HBASE-5075) regionserver crashed and failover

2012-02-19 Thread zhiyuan.dai (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5075?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhiyuan.dai updated HBASE-5075:
---

Attachment: HBase-5075-src.patch

 regionserver crashed and failover
 -

 Key: HBASE-5075
 URL: https://issues.apache.org/jira/browse/HBASE-5075
 Project: HBase
  Issue Type: Improvement
  Components: monitoring, regionserver, replication, zookeeper
Affects Versions: 0.92.1
Reporter: zhiyuan.dai
 Fix For: 0.90.5

 Attachments: HBase-5075-src.patch


 When a regionserver crashes, it takes too long to notify the HMaster, and once 
 the HMaster learns of the regionserver's shutdown, it takes a long time to 
 acquire the HLog's lease.
 HBase is an online DB, so availability is very important.
 I have an idea to improve availability: a monitor node checks the 
 regionserver's PID. If that PID does not exist, I assume the RS is down, 
 delete the znode, and force-close the HLog file.
 The detection period could then be about 100 ms.





[jira] [Updated] (HBASE-5075) regionserver crashed and failover

2012-02-19 Thread zhiyuan.dai (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5075?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhiyuan.dai updated HBASE-5075:
---

Attachment: Degion of Failure Detection.pdf

 regionserver crashed and failover
 -

 Key: HBASE-5075
 URL: https://issues.apache.org/jira/browse/HBASE-5075
 Project: HBase
  Issue Type: Improvement
  Components: monitoring, regionserver, replication, zookeeper
Affects Versions: 0.92.1
Reporter: zhiyuan.dai
 Fix For: 0.90.5

 Attachments: Degion of Failure Detection.pdf, HBase-5075-src.patch


 When a regionserver crashes, it takes too long to notify the HMaster, and once 
 the HMaster learns of the regionserver's shutdown, it takes a long time to 
 acquire the HLog's lease.
 HBase is an online DB, so availability is very important.
 I have an idea to improve availability: a monitor node checks the 
 regionserver's PID. If that PID does not exist, I assume the RS is down, 
 delete the znode, and force-close the HLog file.
 The detection period could then be about 100 ms.





[jira] [Commented] (HBASE-5075) regionserver crashed and failover

2012-02-19 Thread zhiyuan.dai (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13211691#comment-13211691
 ] 

zhiyuan.dai commented on HBASE-5075:


@stack @Lars Hofhansl
First, the RPC method getRSPidAndRsZknode fetches the PID and the znode (which 
includes the domain and service port); this approach is reliable. If we used 
the process list instead, there could be misjudgments.

Second, there is a supervisor called RegionServerFailureDetection: we first 
start the regionserver and then start RegionServerFailureDetection, which acts 
as a watchdog for the RegionServer.

The supervisor (RegionServerFailureDetection) then fetches the PID and znode 
via getRSPidAndRsZknode.

RegionServerFailureDetection is not affected by long GC pauses.

RegionServerFailureDetection first checks whether the PID is alive and then 
checks whether the service port is alive.
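The two liveness checks described above can be sketched as follows. The method names are hypothetical, and ProcessHandle (a modern JDK API) stands in for however the supervisor actually probes the PID; getRSPidAndRsZknode and the znode handling are not shown.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;

public class RsLivenessCheck {
    // Check 1: is a process with the given pid still alive?
    static boolean pidAlive(long pid) {
        return ProcessHandle.of(pid).map(ProcessHandle::isAlive).orElse(false);
    }

    // Check 2: is something still accepting connections on host:port?
    static boolean portAlive(String host, int port, int timeoutMs) {
        try (Socket s = new Socket()) {
            s.connect(new InetSocketAddress(host, port), timeoutMs);
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(pidAlive(ProcessHandle.current().pid()));  // our own pid is alive
        try (ServerSocket ss = new ServerSocket(0)) {                 // ephemeral listener
            System.out.println(portAlive("127.0.0.1", ss.getLocalPort(), 500));
        }
    }
}
```

Only when both checks fail would the supervisor delete the znode and force-close the HLog, which is what makes the short (~100 ms) detection period safe against transient stalls.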

 regionserver crashed and failover
 -

 Key: HBASE-5075
 URL: https://issues.apache.org/jira/browse/HBASE-5075
 Project: HBase
  Issue Type: Improvement
  Components: monitoring, regionserver, replication, zookeeper
Affects Versions: 0.92.1
Reporter: zhiyuan.dai
 Fix For: 0.90.5

 Attachments: Degion of Failure Detection.pdf, HBase-5075-src.patch


 regionserver crashed,it is too long time to notify hmaster.when hmaster know 
 regionserver's shutdown,it is long time to fetch the hlog's lease.
 hbase is a online db, availability is very important.
 i have a idea to improve availability, monitor node to check regionserver's 
 pid.if this pid not exsits,i think the rs down,i will delete the znode,and 
 force close the hlog file.
 so the period maybe 100ms.





[jira] [Commented] (HBASE-5399) Cut the link between the client and the zookeeper ensemble

2012-02-19 Thread nkeywal (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13211703#comment-13211703
 ] 

nkeywal commented on HBASE-5399:


@Lars and all: done, see https://reviews.apache.org/r/3967/
I sent it to the whole hbase group; I hope that's the right thing to do.

I'm on vacation this week so I will see the comments after the 27th... 


 Cut the link between the client and the zookeeper ensemble
 --

 Key: HBASE-5399
 URL: https://issues.apache.org/jira/browse/HBASE-5399
 Project: HBase
  Issue Type: Improvement
  Components: client
Affects Versions: 0.94.0
 Environment: all
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor
 Attachments: 5399_inprogress.patch, 5399_inprogress.v3.patch, 
 5399_inprogress.v9.patch


 The link is often considered an issue, for various reasons, one of them being 
 that there is a limit on the number of connections that ZK can manage. 
 Stack was suggesting as well to remove the link to master from HConnection.
 There are choices to be made considering the existing API (which we don't want 
 to break).
 The first patches I will submit on hadoop-qa should not be committed: they 
 are here to show the progress on the direction taken.
 ZooKeeper is used for:
 - public getter, to let the client do whatever he wants, and close ZooKeeper 
 when closing the connection => we have to deprecate this but keep it.
 - read the master address to create a master => now done with a temporary 
 zookeeper connection
 - read root location => now done with a temporary zookeeper connection, but 
 questionable. Used in the public function locateRegion. To be reworked.
 - read cluster id => now done once with a temporary zookeeper connection.
 - check if the base node is available => now done once with a zookeeper 
 connection given as a parameter
 - isTableDisabled/isTableAvailable => public functions, now done with a 
 temporary zookeeper connection.
  - Called internally from HBaseAdmin and HTable
 - getCurrentNrHRS(): public function to get the number of region servers and 
 create a pool of threads => now done with a temporary zookeeper connection
 -
 Master is used for:
 - getMaster public getter, as for ZooKeeper => we have to deprecate this but 
 keep it.
 - isMasterRunning(): public function, used internally by HMerge & HBaseAdmin
 - getHTableDescriptor*: public functions offering access to the master => 
 we could make them use a temporary master connection as well.
 Main points are:
 - hbase class for ZooKeeper: ZooKeeperWatcher is really designed for a 
 strongly coupled architecture ;-). This can be changed, but requires a lot of 
 modifications in these classes (likely adding a class in the middle of the 
 hierarchy, something like that). Anyway, a non-connected client will always be 
 much slower, because it needs a tcp connection, and establishing a tcp 
 connection is slow.
 - having a link between ZK and all the clients seems to make sense for some 
 use cases. However, it won't scale if a TCP connection is required for every 
 client.
 - if we move the table descriptor part away from the client, we need to find 
 a new place for it.
 - we will have the same issue with HBaseAdmin (for both ZK & Master); maybe we 
 can put a timeout on the connection. That would make the whole system less 
 deterministic, however.
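The "temporary zookeeper connection" pattern above means: open a session, read one value, and close immediately, rather than holding a TCP connection for the client's whole lifetime. In this self-contained sketch, FakeEnsemble is a stand-in for the real org.apache.zookeeper.ZooKeeper client; only the open/read/close shape is the point.

```java
import java.util.Map;

public class TemporaryZk {
    // Stand-in for a ZooKeeper client session backed by an in-memory map.
    static class FakeEnsemble implements AutoCloseable {
        private final Map<String, byte[]> znodes;
        private boolean open = true;
        FakeEnsemble(Map<String, byte[]> znodes) { this.znodes = znodes; }
        byte[] getData(String path) {
            if (!open) throw new IllegalStateException("session closed");
            return znodes.get(path);
        }
        @Override public void close() { open = false; }
    }

    // Read one znode over a short-lived session; the session never outlives
    // the read, so no per-client connection is held against the ensemble.
    static byte[] readOnce(Map<String, byte[]> ensemble, String path) {
        try (FakeEnsemble zk = new FakeEnsemble(ensemble)) {
            return zk.getData(path);
        }
    }

    public static void main(String[] args) {
        byte[] addr = readOnce(Map.of("/hbase/master", "m1:60000".getBytes()), "/hbase/master");
        System.out.println(new String(addr));  // m1:60000
    }
}
```

The cost, as the description notes, is that establishing a fresh TCP connection for each read makes a non-connected client noticeably slower.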
