[jira] [Updated] (HBASE-5514) Compile against hadoop 0.24-SNAPSHOT

2012-03-04 Thread Mingjie Lai (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5514?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mingjie Lai updated HBASE-5514:
---

Attachment: HBASE-5514-3.patch

New patch:

- add 0.24 profile to pom.xml
- use reflection to determine which sync method to use in TestHLog and 
TestHLogSplit.

 Compile against hadoop 0.24-SNAPSHOT
 

 Key: HBASE-5514
 URL: https://issues.apache.org/jira/browse/HBASE-5514
 Project: HBase
  Issue Type: Bug
  Components: build, test
Affects Versions: 0.92.0, 0.94.0
Reporter: Mingjie Lai
Assignee: Mingjie Lai
Priority: Minor
 Fix For: 0.94.0

 Attachments: HBASE-5514-2.patch, HBASE-5514-3.patch, HBASE-5514.patch


 HBase needs to compile against the latest Hadoop trunk, which just had NN HA 
 merged in. 
 1) Add a Hadoop 0.24 profile.
 2) HBASE-5480
 3) HADOOP-8124 removed the deprecated Syncable.sync(), which causes compile 
 errors when building HBase against Hadoop trunk (0.24). TestHLogSplit and 
 TestHLog still call the deprecated sync(); replace it with hflush() so that 
 compilation passes. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5515) Add a processRow API that supports atomic multiple reads and writes on a row

2012-03-04 Thread Phabricator (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5515?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13221825#comment-13221825
 ] 

Phabricator commented on HBASE-5515:


sc has commented on the revision HBASE-5515 [jira] Add a processRow API that 
supports atomic multiple reads and writes on a row.

  I will also read HBASE-5229 to see what the alternatives are. Thanks!

REVISION DETAIL
  https://reviews.facebook.net/D2067


 Add a processRow API that supports atomic multiple reads and writes on a row
 

 Key: HBASE-5515
 URL: https://issues.apache.org/jira/browse/HBASE-5515
 Project: HBase
  Issue Type: New Feature
Reporter: Scott Chen
Assignee: Scott Chen
 Attachments: HBASE-5515.D2067.1.patch, HBASE-5515.D2067.2.patch, 
 HBASE-5515.D2067.3.patch, HBASE-5515.D2067.4.patch


 We have modified HRegion.java internally to do some atomic row processing. It 
 would be nice to have a pluggable API for this.





[jira] [Updated] (HBASE-5510) Change in LB.randomAssignment(List<ServerName> servers) API

2012-03-04 Thread Zhihong Yu (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5510?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihong Yu updated HBASE-5510:
--

Fix Version/s: 0.96.0
 Hadoop Flags: Reviewed

 Change in LB.randomAssignment(List<ServerName> servers) API
 ---

 Key: HBASE-5510
 URL: https://issues.apache.org/jira/browse/HBASE-5510
 Project: HBase
  Issue Type: Improvement
Affects Versions: 0.92.0
Reporter: Anoop Sam John
Assignee: ramkrishna.s.vasudevan
 Fix For: 0.96.0

 Attachments: HBase-5510.patch, HBase-5510_2.patch


  The LB has a randomAssignment(List<ServerName> servers) API, which the AM 
 uses to assign a region from a down RS. [It is also used in other cases, such 
 as a call to the assign() API from a client.]
  I feel it would be better to also pass the HRegionInfo into this method. 
 When the LB chooses a server for a region assignment after an RS goes down, 
 it would be useful for the LB to know which region it is selecting a server 
 for.
 +Scenario+
  When one RS goes down, we want its regions to move to other RSs, but with a 
 particular set of regions staying together. We have a custom load balancer, 
 but this is not possible with the current LB interface. An alternative is to 
 allow a random assignment of the regions when the RS goes down and later 
 rebalance the cluster as needed. But this may assign regions first to one RS 
 and then move them again to another, and for some period my business use 
 case cannot be satisfied.
  I have also seen a JIRA issue about making sure that the ROOT and META 
 regions always sit on specific RSs. With the current LB API this won't be 
 possible in the future.





[jira] [Updated] (HBASE-5510) Change in LB.randomAssignment(List<ServerName> servers) API

2012-03-04 Thread Zhihong Yu (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5510?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihong Yu updated HBASE-5510:
--

Status: Patch Available  (was: Open)





[jira] [Commented] (HBASE-5510) Change in LB.randomAssignment(List<ServerName> servers) API

2012-03-04 Thread Zhihong Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13221909#comment-13221909
 ] 

Zhihong Yu commented on HBASE-5510:
---

Patch v2 looks good.

Minor comments:
{code}
+  balancer.randomAssignment(state.getRegion(),servers));
{code}
A space is needed between comma and servers.
{code}
+  public ServerName randomAssignment(HRegionInfo regionInfo,List<ServerName> servers);
{code}
Wrap long line, please.
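
For illustration only, here is a minimal sketch of what a region-aware randomAssignment could let a custom balancer do. Plain strings stand in for HRegionInfo and ServerName, and the class names, the "<table>,<startKey>" naming convention, and the hash-based choice are all assumptions made for this example, not what the patch implements.

```java
import java.util.Arrays;
import java.util.List;

final class RegionAwareAssignmentSketch {
  /** Hypothetical stand-in for the balancer method once the region is passed in. */
  interface Balancer {
    String randomAssignment(String regionName, List<String> servers);
  }

  /** Sends every region of the same table to the same live server. */
  static final class TableGroupingBalancer implements Balancer {
    @Override
    public String randomAssignment(String regionName, List<String> servers) {
      if (servers == null || servers.isEmpty()) {
        return null; // nothing to assign to
      }
      // Assumed naming convention: "<table>,<startKey>"; group by the table part.
      int comma = regionName.indexOf(',');
      String table = comma < 0 ? regionName : regionName.substring(0, comma);
      // floorMod keeps the index non-negative even for negative hash codes.
      return servers.get(Math.floorMod(table.hashCode(), servers.size()));
    }
  }

  public static void main(String[] args) {
    Balancer lb = new TableGroupingBalancer();
    List<String> live = Arrays.asList("rs1", "rs2", "rs3");
    // Both regions belong to table "orders", so both lines print the same server.
    System.out.println(lb.randomAssignment("orders,a", live));
    System.out.println(lb.randomAssignment("orders,m", live));
  }
}
```

Without the region argument the balancer has nothing to group on, which is the limitation the issue describes.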





[jira] [Updated] (HBASE-5399) Cut the link between the client and the zookeeper ensemble

2012-03-04 Thread nkeywal (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-5399:
---

Status: Open  (was: Patch Available)

 Cut the link between the client and the zookeeper ensemble
 --

 Key: HBASE-5399
 URL: https://issues.apache.org/jira/browse/HBASE-5399
 Project: HBase
  Issue Type: Improvement
  Components: client
Affects Versions: 0.94.0
 Environment: all
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor
 Attachments: 5399.v27.patch, 5399_inprogress.patch, 
 5399_inprogress.v14.patch, 5399_inprogress.v16.patch, 
 5399_inprogress.v18.patch, 5399_inprogress.v20.patch, 
 5399_inprogress.v21.patch, 5399_inprogress.v23.patch, 
 5399_inprogress.v3.patch, 5399_inprogress.v32.patch, 5399_inprogress.v9.patch


 The link is often considered an issue, for various reasons. One of them is 
 that there is a limit on the number of connections that ZK can manage. 
 Stack also suggested removing the link to the master from HConnection.
 There are choices to be made considering the existing API (which we don't 
 want to break).
 The first patches I will submit on hadoop-qa should not be committed: they 
 are here to show the progress on the direction taken.
 ZooKeeper is used for:
 - public getter, to let the client do whatever it wants, and close ZooKeeper 
 when closing the connection => we have to deprecate this but keep it.
 - read the master address to create a master => now done with a temporary 
 zookeeper connection
 - read root location => now done with a temporary zookeeper connection, but 
 questionable. Used in public function locateRegion. To be reworked.
 - read cluster id => now done once with a temporary zookeeper connection.
 - check if base node is available => now done once with a zookeeper 
 connection given as a parameter
 - isTableDisabled/isTableAvailable => public functions, now done with a 
 temporary zookeeper connection.
  - Called internally from HBaseAdmin and HTable
 - getCurrentNrHRS(): public function to get the number of region servers and 
 create a pool of threads => now done with a temporary zookeeper connection
 Master is used for:
 - getMaster public getter, as for ZooKeeper => we have to deprecate this but 
 keep it.
 - isMasterRunning(): public function, used internally by HMerge & HBaseAdmin
 - getHTableDescriptor*: public functions offering access to the master. => 
 we could make them use a temporary master connection as well.
 Main points are:
 - hbase class for ZooKeeper; ZooKeeperWatcher is really designed for a 
 strongly coupled architecture ;-). This can be changed, but requires a lot of 
 modifications in these classes (likely adding a class in the middle of the 
 hierarchy, something like that). Anyway, a non-connected client will always 
 be much slower, because it needs a TCP connection, and establishing a TCP 
 connection is slow.
 - having a link between ZK and all the clients seems to make sense for some 
 use cases. However, it won't scale if a TCP connection is required for every 
 client
 - if we move the table descriptor part away from the client, we need to find 
 a new place for it.
 - we will have the same issue with HBaseAdmin (for both ZK & Master); maybe 
 we can put a timeout on the connection. That would make the whole system less 
 deterministic, however.
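
 To make the "temporary zookeeper connection" idea concrete, here is a rough sketch of the open/read/close pattern. The Connection interface, the helper name, and the in-memory stand-in are hypothetical and exist only for this example; the patch itself works with the HBase ZooKeeper classes.

```java
import java.util.function.Function;
import java.util.function.Supplier;

final class TemporaryConnectionSketch {
  /** Hypothetical minimal view of a ZooKeeper-like connection. */
  interface Connection extends AutoCloseable {
    String read(String path);

    @Override
    void close(); // narrowed so callers need not handle a checked exception
  }

  /** Opens a connection, runs one action, and always closes the connection. */
  static <T> T withTemporaryConnection(Supplier<Connection> factory,
                                       Function<Connection, T> action) {
    try (Connection conn = factory.get()) {
      return action.apply(conn);
    }
  }

  /** In-memory stand-in holding a single path/value pair. */
  static final class SingleValueConnection implements Connection {
    private final String path;
    private final String value;
    boolean closed;

    SingleValueConnection(String path, String value) {
      this.path = path;
      this.value = value;
    }

    @Override
    public String read(String requestedPath) {
      return path.equals(requestedPath) ? value : null;
    }

    @Override
    public void close() {
      closed = true;
    }
  }
}
```

 Each call pays the connection setup cost, which is the slowness noted in the main points, but no connection is held between calls.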





[jira] [Updated] (HBASE-5399) Cut the link between the client and the zookeeper ensemble

2012-03-04 Thread nkeywal (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-5399:
---

Attachment: 5399_inprogress.v32.patch





[jira] [Updated] (HBASE-5399) Cut the link between the client and the zookeeper ensemble

2012-03-04 Thread nkeywal (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-5399:
---

Status: Patch Available  (was: Open)





[jira] [Commented] (HBASE-5399) Cut the link between the client and the zookeeper ensemble

2012-03-04 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13221911#comment-13221911
 ] 

Hadoop QA commented on HBASE-5399:
--

-1 overall.  Here are the results of testing the latest attachment 
  
http://issues.apache.org/jira/secure/attachment/12517002/5399_inprogress.v32.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 30 new or modified tests.

-1 patch.  The patch command could not apply the patch.

Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/1088//console

This message is automatically generated.





[jira] [Updated] (HBASE-5514) Compile against hadoop 0.24-SNAPSHOT

2012-03-04 Thread Zhihong Yu (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5514?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihong Yu updated HBASE-5514:
--

Hadoop Flags: Reviewed
  Status: Patch Available  (was: Open)





[jira] [Commented] (HBASE-5514) Compile against hadoop 0.24-SNAPSHOT

2012-03-04 Thread Zhihong Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13221912#comment-13221912
 ] 

Zhihong Yu commented on HBASE-5514:
---

Similar reflection code is introduced for both tests.
Can we extract the new code into a helper class or method in the 
org.apache.hadoop.hbase.regionserver.wal package?

{code}
+if (syncMethod != null) {
+  syncMethod.invoke(out, new Object[]{});
+}
{code}
Is the above check needed? If the getMethod() calls fail, an exception should 
be thrown, which would fail the test.
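
A possible shape for such a helper, as a sketch only: the class and method names are made up, and preferring hflush() with a fallback to sync() is an assumption about the intended order. The two nested stream classes are stand-ins for the Hadoop output stream so the example runs on its own.

```java
import java.lang.reflect.Method;

final class SyncCompatSketch {
  private SyncCompatSketch() {}

  /** Prefers hflush(); falls back to the older sync() when hflush() is absent. */
  static Method findFlushMethod(Class<?> streamClass) throws NoSuchMethodException {
    try {
      return streamClass.getMethod("hflush");
    } catch (NoSuchMethodException e) {
      return streamClass.getMethod("sync");
    }
  }

  /** Invokes whichever method exists; any reflective failure surfaces as an exception. */
  static void flush(Object out) {
    try {
      findFlushMethod(out.getClass()).invoke(out);
    } catch (ReflectiveOperationException e) {
      throw new IllegalStateException("no usable hflush()/sync() on " + out.getClass(), e);
    }
  }

  /** Stand-in for a stream from a Hadoop version that only has sync(). */
  public static final class OldStream {
    public int syncCalls;

    public void sync() {
      syncCalls++;
    }
  }

  /** Stand-in for a stream from a Hadoop version that has hflush(). */
  public static final class NewStream {
    public int hflushCalls;

    public void hflush() {
      hflushCalls++;
    }
  }
}
```

With this shape no null check is needed: if neither method exists, flush() throws and the test fails.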





[jira] [Updated] (HBASE-5512) Add support for INCLUDE_AND_SEEK_USING_HINT

2012-03-04 Thread Zhihong Yu (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihong Yu updated HBASE-5512:
--

Hadoop Flags: Reviewed
  Status: Patch Available  (was: Open)

 Add support for INCLUDE_AND_SEEK_USING_HINT
 ---

 Key: HBASE-5512
 URL: https://issues.apache.org/jira/browse/HBASE-5512
 Project: HBase
  Issue Type: Improvement
Reporter: Zhihong Yu
Assignee: Lars Hofhansl
 Attachments: 5512-v2.txt, 5512.txt


 This came up from HBASE-2038
 From Anoop:
 - What we want from the filter is to include a row and then seek to the next 
 row we are interested in. I can't see such a facility in our Filter right 
 now. Correct me if I am wrong. Suppose we have already seeked to one row and 
 it needs to be included in the result; then the Filter should return 
 INCLUDE. Only when the next next() call happens can we return 
 SEEK_USING_HINT. So one extra row read is needed. This might even cause one 
 unwanted HFileBlock fetch (who knows).
 Can we add reseek() at a higher level?
 From Lars:
 Yep, for that we'd need to add INCLUDE_AND_SEEK_USING_HINT (similar to the 
 INCLUDE_AND_SEEK_NEXT_ROW that we already have). Shouldn't be hard to add, 
 I'm happy to do that, if that's the route we want to go with this.





[jira] [Commented] (HBASE-5512) Add support for INCLUDE_AND_SEEK_USING_HINT

2012-03-04 Thread Zhihong Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13221917#comment-13221917
 ] 

Zhihong Yu commented on HBASE-5512:
---

For CustomIncludeAndNextUsingHintFilter.java:
{code}
+  public KeyValue getNextKeyHint(KeyValue kv) {
+if (count >= seekHints.length) 
+  return null;
{code}
Braces should be added around the return.
Can we add an assertion above? Since filterAllRemaining() returns true for 
this case, I wonder why getNextKeyHint() would still be called.
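
As a sketch of the braced form, with strings standing in for KeyValue: the class name is made up, and advancing the counter inside getNextKeyHint() is an assumption made to keep the example small, since the test filter in the patch may advance it elsewhere.

```java
final class HintFilterSketch {
  private final String[] seekHints; // stand-ins for KeyValue hints
  private int count = 0;

  HintFilterSketch(String... seekHints) {
    this.seekHints = seekHints.clone();
  }

  /** True once every hint has been handed out. */
  boolean filterAllRemaining() {
    return count >= seekHints.length;
  }

  /** Returns the next hint, or null when none are left. */
  String getNextKeyHint() {
    if (count >= seekHints.length) {
      return null;
    }
    return seekHints[count++];
  }
}
```

The bounds check in getNextKeyHint() mirrors filterAllRemaining(), so the null branch is reached only if the caller asks for a hint after the filter has already reported that nothing remains.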





[jira] [Issue Comment Edited] (HBASE-5512) Add support for INCLUDE_AND_SEEK_USING_HINT

2012-03-04 Thread Zhihong Yu (Issue Comment Edited) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13221915#comment-13221915
 ] 

Zhihong Yu edited comment on HBASE-5512 at 3/4/12 3:45 PM:
---

CustomIncludeAndNextUsingHintFilter.java is missing the license header and class javadoc.
In ScanQueryMatcher.java:
{code}
+boolean filterSeek = false;
{code}
Can the above variable be renamed so that its name reflects the semantics of 
INCLUDE_AND_NEXT_USING_HINT?
That would make the following code easier to understand:
{code}
+if (filterSeek) {
+  switch(colChecker) {
+  case INCLUDE:
{code}

  was (Author: zhi...@ebaysf.com):
CustomIncludeAndNextUsingHintFilter.java misses license and class javadoc.
{code}
+boolean filterSeek = false;
{code}
Can the above variable be named such that the name reflects semantics for 
INCLUDE_AND_NEXT_USING_HINT ?
That would make understanding the following code easier:
{code}
+if (filterSeek) {
+  switch(colChecker) {
+  case INCLUDE:
{code}
  




[jira] [Commented] (HBASE-5510) Change in LB.randomAssignment(List<ServerName> servers) API

2012-03-04 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13221920#comment-13221920
 ] 

Hadoop QA commented on HBASE-5510:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12516978/HBase-5510_2.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified 
tests.
Please justify why no new tests are needed for this patch.
Also please list what manual steps were performed to verify this patch.

-1 javadoc.  The javadoc tool appears to have generated -129 warning 
messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

-1 findbugs.  The patch appears to introduce 154 new Findbugs (version 
1.3.9) warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed these unit tests:
  org.apache.hadoop.hbase.mapreduce.TestImportTsv
  org.apache.hadoop.hbase.mapred.TestTableMapReduce
  org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/1089//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/1089//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/1089//console

This message is automatically generated.

 Change in LB.randomAssignment(List<ServerName> servers) API
 ---

 Key: HBASE-5510
 URL: https://issues.apache.org/jira/browse/HBASE-5510
 Project: HBase
  Issue Type: Improvement
Affects Versions: 0.92.0
Reporter: Anoop Sam John
Assignee: ramkrishna.s.vasudevan
 Fix For: 0.96.0

 Attachments: HBase-5510.patch, HBase-5510_2.patch


 In LB there is a randomAssignment(List<ServerName> servers) API which will be 
 used by AM to assign a region from a down RS. [This will also be used in 
 other cases, like a call to the assign() API from a client]
 I feel it would be better to also pass the HRegionInfo into this method. 
 When the LB is making a choice for a region assignment, when one RS is down, 
 it would be nice if the LB knew for which region it is doing this server 
 selection.
 +Scenario+
 While one RS is down, we want the regions to get moved to other RSs but a set 
 of regions to stay together. We have a custom load balancer, but with the 
 current LB interface this is not possible. Another way is to allow 
 a random assignment of the regions at RS down time. Later, with a cluster 
 balance, I can balance the regions as I need. But this might make regions 
 get assigned first to one RS and then moved again to another. Also, for some 
 period of time my business use case cannot be satisfied.
 Also, I have seen some issues in JIRA which speak about making sure that Root 
 and META regions always sit on some specific RSs. With the current LB API 
 this won't be possible in future.
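
A minimal sketch of why the region argument matters, assuming a toy balancer in which Strings stand in for HRegionInfo and ServerName. The class GroupingBalancerSketch, the groupOf() naming rule and the method signature are all hypothetical and only illustrate the scenario above; they are not the HBase LoadBalancer interface.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;

// Hypothetical balancer that keeps regions of the same group on one server.
class GroupingBalancerSketch {
  private final Map<String, String> groupToServer = new HashMap<String, String>();
  private final Random random = new Random();

  // Assumption: a region named "group.suffix" belongs to the group before the dot.
  static String groupOf(String region) {
    int dot = region.indexOf('.');
    return dot < 0 ? region : region.substring(0, dot);
  }

  // Because the region is passed in, the balancer can reuse the server already
  // chosen for the region's group instead of picking blindly.
  String randomAssignment(String region, List<String> servers) {
    if (servers == null || servers.isEmpty()) {
      return null;
    }
    String group = groupOf(region);
    String chosen = groupToServer.get(group);
    if (chosen == null || !servers.contains(chosen)) {
      // First region of the group, or the previous server is gone: pick randomly.
      chosen = servers.get(random.nextInt(servers.size()));
      groupToServer.put(group, chosen);
    }
    return chosen;
  }
}
```

Without the region parameter the balancer sees only the server list, so it has no way to apply a rule like this at RS down time.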





[jira] [Commented] (HBASE-5514) Compile against hadoop 0.24-SNAPSHOT

2012-03-04 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13221923#comment-13221923
 ] 

Hadoop QA commented on HBASE-5514:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12516981/HBASE-5514-3.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 8 new or modified tests.

-1 javadoc.  The javadoc tool appears to have generated -129 warning 
messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

-1 findbugs.  The patch appears to introduce 154 new Findbugs (version 
1.3.9) warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed these unit tests:
  org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat
  org.apache.hadoop.hbase.mapred.TestTableMapReduce
  org.apache.hadoop.hbase.mapreduce.TestImportTsv

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/1090//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/1090//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/1090//console

This message is automatically generated.

 Compile against hadoop 0.24-SNAPSHOT
 

 Key: HBASE-5514
 URL: https://issues.apache.org/jira/browse/HBASE-5514
 Project: HBase
  Issue Type: Bug
  Components: build, test
Affects Versions: 0.92.0, 0.94.0
Reporter: Mingjie Lai
Assignee: Mingjie Lai
Priority: Minor
 Fix For: 0.94.0

 Attachments: HBASE-5514-2.patch, HBASE-5514-3.patch, HBASE-5514.patch


 Need to compile hbase against the latest hadoop trunk, which just had NN HA 
 merged in. 
 1) add a hadoop 0.24 profile
 2) HBASE-5480
 3) HADOOP-8124 removed the deprecated Syncable.sync(). This causes compile 
 errors for hbase against hadoop trunk (0.24). TestHLogSplit and TestHLog still 
 call the deprecated sync(). Need to replace it with hflush() so that 
 compilation can pass. 
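
The update note for HBASE-5514-3.patch mentions using reflection to decide which sync method to call. A minimal sketch of that idea follows, assuming two stand-in stream classes so that it runs without Hadoop. SyncCompatSketch, OldStream and NewStream are hypothetical names, and this is not the code in the patch.

```java
import java.lang.reflect.Method;

// Hypothetical helper: prefer hflush() when it exists, otherwise fall back to sync().
class SyncCompatSketch {
  // Stand-in for a stream from an older API that only offers sync().
  public static class OldStream {
    public boolean synced = false;
    public void sync() { synced = true; }
  }

  // Stand-in for a stream from a newer API that only offers hflush().
  public static class NewStream {
    public boolean flushed = false;
    public void hflush() { flushed = true; }
  }

  // Invokes the available method and returns its name.
  static String flush(Object out) {
    try {
      Method m;
      try {
        m = out.getClass().getMethod("hflush");
      } catch (NoSuchMethodException e) {
        m = out.getClass().getMethod("sync");
      }
      m.invoke(out);
      return m.getName();
    } catch (ReflectiveOperationException e) {
      throw new IllegalStateException("no usable flush method on " + out.getClass(), e);
    }
  }
}
```

Looking the method up at run time lets the same test source compile against either API, at the cost of losing compile-time checking of the call.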





[jira] [Commented] (HBASE-5512) Add support for INCLUDE_AND_SEEK_USING_HINT

2012-03-04 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13221926#comment-13221926
 ] 

Hadoop QA commented on HBASE-5512:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12516970/5512-v2.txt
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 6 new or modified tests.

-1 javadoc.  The javadoc tool appears to have generated -129 warning 
messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

-1 findbugs.  The patch appears to introduce 154 new Findbugs (version 
1.3.9) warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed unit tests in .

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/1091//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/1091//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/1091//console

This message is automatically generated.





[jira] [Updated] (HBASE-5510) Change in LB.randomAssignment(List<ServerName> servers) API

2012-03-04 Thread Anoop Sam John (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5510?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anoop Sam John updated HBASE-5510:
--

Attachment: HBase-5010_3.patch





[jira] [Commented] (HBASE-5510) Change in LB.randomAssignment(List<ServerName> servers) API

2012-03-04 Thread Zhihong Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13221938#comment-13221938
 ] 

Zhihong Yu commented on HBASE-5510:
---

+1 on patch v3

I will integrate to TRUNK Monday afternoon if there is no objection.





[jira] [Updated] (HBASE-5399) Cut the link between the client and the zookeeper ensemble

2012-03-04 Thread nkeywal (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-5399:
---

Status: Open  (was: Patch Available)

 Cut the link between the client and the zookeeper ensemble
 --

 Key: HBASE-5399
 URL: https://issues.apache.org/jira/browse/HBASE-5399
 Project: HBase
  Issue Type: Improvement
  Components: client
Affects Versions: 0.94.0
 Environment: all
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor
 Attachments: 5399.v27.patch, 5399_inprogress.patch, 
 5399_inprogress.v14.patch, 5399_inprogress.v16.patch, 
 5399_inprogress.v18.patch, 5399_inprogress.v20.patch, 
 5399_inprogress.v21.patch, 5399_inprogress.v23.patch, 
 5399_inprogress.v3.patch, 5399_inprogress.v32.patch, 5399_inprogress.v9.patch


 The link is often considered an issue, for various reasons. One of them 
 is that there is a limit on the number of connections that ZK can manage. 
 Stack also suggested removing the link to the master from HConnection.
 There are choices to be made considering the existing API (that we don't want 
 to break).
 The first patches I will submit on hadoop-qa should not be committed: they 
 are here to show the progress on the direction taken.
 ZooKeeper is used for:
 - public getter, to let the client do whatever he wants, and close ZooKeeper 
 when closing the connection => we have to deprecate this but keep it.
 - read the master address to create a master => now done with a temporary 
 zookeeper connection
 - read root location => now done with a temporary zookeeper connection, but 
 questionable. Used in public function locateRegion. To be reworked.
 - read cluster id => now done once with a temporary zookeeper connection.
 - check if base node is available => now done once with a zookeeper 
 connection given as a parameter
 - isTableDisabled/isTableAvailable => public functions, now done with a 
 temporary zookeeper connection.
  - Called internally from HBaseAdmin and HTable
 - getCurrentNrHRS(): public function to get the number of region servers and 
 create a pool of threads => now done with a temporary zookeeper connection
 Master is used for:
 - getMaster public getter, as for ZooKeeper => we have to deprecate this but 
 keep it.
 - isMasterRunning(): public function, used internally by HMerge & HBaseAdmin
 - getHTableDescriptor*: public functions offering access to the master. => 
 we could make them use a temporary master connection as well.
 Main points are:
 - hbase class for ZooKeeper; ZooKeeperWatcher is really designed for a 
 strongly coupled architecture ;-). This can be changed, but requires a lot of 
 modifications in these classes (likely adding a class in the middle of the 
 hierarchy, something like that). Anyway, a non-connected client will always be 
 much slower, because it needs a tcp connection, and establishing a tcp 
 connection is slow.
 - having a link between ZK and all the clients seems to make sense for some 
 use cases. However, it won't scale if a TCP connection is required for every 
 client
 - if we move the table descriptor part away from the client, we need to find 
 a new place for it.
 - we will have the same issue in HBaseAdmin (for both ZK & Master); maybe we 
 can put a timeout on the connection. That would make the whole system less 
 deterministic, however.
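
The "temporary connection" pattern listed above can be sketched as follows. This is an illustration under stated assumptions only: TempConnection is a hypothetical stand-in for a ZooKeeper session, the paths are made up, and none of the names come from the HBase client code or the attached patches.

```java
// Hypothetical sketch of short-lived connections replacing a permanent link.
class TemporaryConnectionSketch {
  // Counters that let a caller observe how connections are used.
  static int openNow = 0;
  static int openedTotal = 0;

  // Stand-in for a ZooKeeper session; a real one would hold a TCP connection.
  static class TempConnection implements AutoCloseable {
    TempConnection() {
      openNow++;
      openedTotal++;
    }

    String read(String path) {
      return "value-of:" + path;
    }

    @Override
    public void close() {
      openNow--;
    }
  }

  private String cachedClusterId;

  // Reads the cluster id once through a short-lived connection, then caches it.
  String getClusterId() {
    if (cachedClusterId == null) {
      try (TempConnection conn = new TempConnection()) {
        cachedClusterId = conn.read("/example/clusterid"); // hypothetical path
      }
    }
    return cachedClusterId;
  }

  // Opens and closes a connection on every call, so no link stays open.
  boolean isTableDisabled(String table) {
    try (TempConnection conn = new TempConnection()) {
      return conn.read("/example/table/" + table).isEmpty();
    }
  }
}
```

The trade-off matches the main points above: no connection is held between calls, but every uncached call pays the connection setup cost.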





[jira] [Updated] (HBASE-5515) Add a processRow API that supports atomic multiple reads and writes on a row

2012-03-04 Thread Phabricator (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Phabricator updated HBASE-5515:
---

Attachment: HBASE-5515.D2067.5.patch

sc updated the revision HBASE-5515 [jira] Add a processRow API that supports 
atomic multiple reads and writes on a row.
Reviewers: tedyu, dhruba, JIRA

  Forgot to import InterfaceAudience

REVISION DETAIL
  https://reviews.facebook.net/D2067

AFFECTED FILES
  src/main/java/org/apache/hadoop/hbase/client/HTable.java
  src/main/java/org/apache/hadoop/hbase/client/HTableInterface.java
  src/main/java/org/apache/hadoop/hbase/client/RowProcessor.java
  src/main/java/org/apache/hadoop/hbase/coprocessor/CoprocessorHost.java
  src/main/java/org/apache/hadoop/hbase/io/HbaseObjectWritable.java
  src/main/java/org/apache/hadoop/hbase/ipc/HRegionInterface.java
  src/main/java/org/apache/hadoop/hbase/ipc/WritableRpcEngine.java
  src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
  src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
  src/main/java/org/apache/hadoop/hbase/rest/client/RemoteHTable.java
  src/test/java/org/apache/hadoop/hbase/client/TestProcessRow.java
  src/test/java/org/apache/hadoop/hbase/io/TestHbaseObjectWritable.java


 Add a processRow API that supports atomic multiple reads and writes on a row
 

 Key: HBASE-5515
 URL: https://issues.apache.org/jira/browse/HBASE-5515
 Project: HBase
  Issue Type: New Feature
Reporter: Scott Chen
Assignee: Scott Chen
 Attachments: HBASE-5515.D2067.1.patch, HBASE-5515.D2067.2.patch, 
 HBASE-5515.D2067.3.patch, HBASE-5515.D2067.4.patch, HBASE-5515.D2067.5.patch


 We have modified HRegion.java internally to do some atomic row processing. It 
 would be nice to have a pluggable API for this.
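
To illustrate what a pluggable, atomic read-and-write operation on a single row could look like, here is a minimal sketch that uses an in-memory map and one lock per row. AtomicRowStoreSketch, its RowProcessor interface and the processRow signature are hypothetical; they are not the API proposed in the attached patches.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical in-memory store; a row maps column names to long values.
class AtomicRowStoreSketch {
  // User-supplied logic that may read and write several columns of one row.
  interface RowProcessor<R> {
    R process(Map<String, Long> row);
  }

  private final Map<String, Map<String, Long>> rows = new ConcurrentHashMap<>();
  private final Map<String, ReentrantLock> locks = new ConcurrentHashMap<>();

  // Runs the processor while holding the row's lock, so all of its reads and
  // writes appear atomic to other processRow callers on the same row.
  <R> R processRow(String rowKey, RowProcessor<R> processor) {
    ReentrantLock lock = locks.computeIfAbsent(rowKey, k -> new ReentrantLock());
    lock.lock();
    try {
      Map<String, Long> row = rows.computeIfAbsent(rowKey, k -> new HashMap<>());
      return processor.process(row);
    } finally {
      lock.unlock();
    }
  }
}
```

A caller could, for example, move a value between two columns in one call, and no other processRow caller on that row would observe the intermediate state.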





[jira] [Commented] (HBASE-5510) Change in LB.randomAssignment(List<ServerName> servers) API

2012-03-04 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13221952#comment-13221952
 ] 

Hadoop QA commented on HBASE-5510:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12517004/HBase-5010_3.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified 
tests.
Please justify why no new tests are needed for this patch.
Also please list what manual steps were performed to verify this patch.

-1 javadoc.  The javadoc tool appears to have generated -129 warning 
messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

-1 findbugs.  The patch appears to introduce 154 new Findbugs (version 
1.3.9) warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed these unit tests:
  org.apache.hadoop.hbase.mapreduce.TestImportTsv
  org.apache.hadoop.hbase.mapred.TestTableMapReduce
  org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/1092//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/1092//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/1092//console

This message is automatically generated.





[jira] [Commented] (HBASE-5515) Add a processRow API that supports atomic multiple reads and writes on a row

2012-03-04 Thread Phabricator (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5515?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13221960#comment-13221960
 ] 

Phabricator commented on HBASE-5515:


sc has commented on the revision HBASE-5515 [jira] Add a processRow API that 
supports atomic multiple reads and writes on a row.

  @lhofhansl: After looking at HBASE-5229, I think what you said makes sense. 
RowProcessor is server-side logic and should belong to 
coprocessorExec(), not a standalone HTable API.

  I will make a change and update this patch soon. Thanks for pointing this out!

REVISION DETAIL
  https://reviews.facebook.net/D2067






[jira] [Commented] (HBASE-5515) Add a processRow API that supports atomic multiple reads and writes on a row

2012-03-04 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5515?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13221959#comment-13221959
 ] 

Hadoop QA commented on HBASE-5515:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12517005/HBASE-5515.D2067.5.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 5 new or modified tests.

-1 javadoc.  The javadoc tool appears to have generated -126 warning 
messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

-1 findbugs.  The patch appears to introduce 154 new Findbugs (version 
1.3.9) warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed unit tests in .

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/1093//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/1093//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/1093//console

This message is automatically generated.





[jira] [Updated] (HBASE-5515) Add a processRow API that supports atomic multiple reads and writes on a row

2012-03-04 Thread Phabricator (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Phabricator updated HBASE-5515:
---

Attachment: HBASE-5515.D2067.6.patch

sc updated the revision HBASE-5515 [jira] Add a processRow API that supports 
atomic multiple reads and writes on a row.
Reviewers: tedyu, dhruba, JIRA

  Addressed Lars's review comments.
  Now the API is used from table.coprocessorExec().
  Thanks!

REVISION DETAIL
  https://reviews.facebook.net/D2067

AFFECTED FILES
  src/main/java/org/apache/hadoop/hbase/coprocessor/ProcessRowEndpoint.java
  src/main/java/org/apache/hadoop/hbase/coprocessor/ProcessRowProtocol.java
  src/main/java/org/apache/hadoop/hbase/coprocessor/RowProcessor.java
  src/main/java/org/apache/hadoop/hbase/io/HbaseObjectWritable.java
  src/main/java/org/apache/hadoop/hbase/ipc/HRegionInterface.java
  src/main/java/org/apache/hadoop/hbase/ipc/WritableRpcEngine.java
  src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
  src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
  src/test/java/org/apache/hadoop/hbase/coprocessor/TestProcessRowEndpoint.java
  src/test/java/org/apache/hadoop/hbase/io/TestHbaseObjectWritable.java






[jira] [Updated] (HBASE-5515) Add a processRow API that supports atomic multiple reads and writes on a row

2012-03-04 Thread Phabricator (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Phabricator updated HBASE-5515:
---

Attachment: HBASE-5515.D2067.7.patch

sc updated the revision HBASE-5515 [jira] Add a processRow API that supports 
atomic multiple reads and writes on a row.
Reviewers: tedyu, dhruba, JIRA

  Fixed some comments

REVISION DETAIL
  https://reviews.facebook.net/D2067

AFFECTED FILES
  src/main/java/org/apache/hadoop/hbase/coprocessor/ProcessRowEndpoint.java
  src/main/java/org/apache/hadoop/hbase/coprocessor/ProcessRowProtocol.java
  src/main/java/org/apache/hadoop/hbase/coprocessor/RowProcessor.java
  src/main/java/org/apache/hadoop/hbase/io/HbaseObjectWritable.java
  src/main/java/org/apache/hadoop/hbase/ipc/HRegionInterface.java
  src/main/java/org/apache/hadoop/hbase/ipc/WritableRpcEngine.java
  src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
  src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
  src/test/java/org/apache/hadoop/hbase/coprocessor/TestProcessRowEndpoint.java
  src/test/java/org/apache/hadoop/hbase/io/TestHbaseObjectWritable.java


 Add a processRow API that supports atomic multiple reads and writes on a row
 

 Key: HBASE-5515
 URL: https://issues.apache.org/jira/browse/HBASE-5515
 Project: HBase
  Issue Type: New Feature
Reporter: Scott Chen
Assignee: Scott Chen
 Attachments: HBASE-5515.D2067.1.patch, HBASE-5515.D2067.2.patch, 
 HBASE-5515.D2067.3.patch, HBASE-5515.D2067.4.patch, HBASE-5515.D2067.5.patch, 
 HBASE-5515.D2067.6.patch, HBASE-5515.D2067.7.patch


 We have modified HRegion.java internally to do some atomic row processing. It 
 would be nice to have a pluggable API for this.





[jira] [Updated] (HBASE-5399) Cut the link between the client and the zookeeper ensemble

2012-03-04 Thread nkeywal (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-5399:
---

Attachment: 5399.v38.patch

 Cut the link between the client and the zookeeper ensemble
 --

 Key: HBASE-5399
 URL: https://issues.apache.org/jira/browse/HBASE-5399
 Project: HBase
  Issue Type: Improvement
  Components: client
Affects Versions: 0.94.0
 Environment: all
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor
 Attachments: 5399.v27.patch, 5399.v38.patch, 5399_inprogress.patch, 
 5399_inprogress.v14.patch, 5399_inprogress.v16.patch, 
 5399_inprogress.v18.patch, 5399_inprogress.v20.patch, 
 5399_inprogress.v21.patch, 5399_inprogress.v23.patch, 
 5399_inprogress.v3.patch, 5399_inprogress.v32.patch, 5399_inprogress.v9.patch


 The link is often considered an issue, for various reasons. One of them 
 is that there is a limit on the number of connections that ZK can manage. 
 Stack also suggested removing the link to the master from HConnection.
 There are choices to be made considering the existing API (which we don't want 
 to break).
 The first patches I will submit on hadoop-qa should not be committed: they 
 are here to show the progress on the direction taken.
 ZooKeeper is used for:
 - public getter, to let the client do whatever it wants, and close ZooKeeper 
 when closing the connection => we have to deprecate this but keep it.
 - reading the master address to create a master => now done with a temporary 
 zookeeper connection
 - reading the root location => now done with a temporary zookeeper connection, 
 but questionable. Used in public function locateRegion. To be reworked.
 - reading the cluster id => now done once with a temporary zookeeper connection.
 - checking if the base node is available => now done once with a zookeeper 
 connection given as a parameter
 - isTableDisabled/isTableAvailable => public functions, now done with a 
 temporary zookeeper connection.
  - Called internally from HBaseAdmin and HTable
 - getCurrentNrHRS(): public function to get the number of region servers and 
 create a thread pool => now done with a temporary zookeeper connection
 Master is used for:
 - getMaster public getter, as for ZooKeeper => we have to deprecate this but 
 keep it.
 - isMasterRunning(): public function, used internally by HMerge & HBaseAdmin
 - getHTableDescriptor*: public functions offering access to the master. => 
 we could make them use a temporary master connection as well.
 Main points are:
 - hbase class for ZooKeeper: ZooKeeperWatcher is really designed for a 
 strongly coupled architecture ;-). This can be changed, but requires a lot of 
 modifications in these classes (likely adding a class in the middle of the 
 hierarchy, something like that). In any case, a non-connected client will 
 always be much slower, because it has to establish a TCP connection each 
 time, and that is slow.
 - having a link between ZK and all the clients seems to make sense for some 
 use cases. However, it won't scale if a TCP connection is required for every 
 client.
 - if we move the table descriptor part away from the client, we need to find 
 a new place for it.
 - we will have the same issue with HBaseAdmin (for both ZK & Master); maybe we 
 can put a timeout on the connection. That would make the whole system less 
 deterministic, however.
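
 The "temporary zookeeper connection" pattern mentioned in the list above can 
 be sketched as follows. This is only an illustration under stated 
 assumptions: CoordinationClient is a hypothetical stand-in for a ZooKeeper 
 session, and withTemporaryConnection is an invented helper, neither taken 
 from the patches nor from the ZooKeeper API.

```java
import java.util.function.Function;
import java.util.function.Supplier;

public class TemporaryConnectionSketch {
    /** Hypothetical stand-in for a ZooKeeper session; not an HBase or ZooKeeper type. */
    public interface CoordinationClient extends AutoCloseable {
        String read(String path);

        @Override
        void close(); // narrowed: no checked exception, to keep the sketch short
    }

    /**
     * Opens a connection, runs a single action against it, and always closes it,
     * so the client holds no long-lived link to the ensemble.
     */
    public static <T> T withTemporaryConnection(
            Supplier<CoordinationClient> factory,
            Function<CoordinationClient, T> action) {
        try (CoordinationClient client = factory.get()) {
            return action.apply(client);
        }
    }
}
```

 The trade-off is the one the description names: each call pays the cost of 
 establishing a connection, in exchange for not keeping one open per client.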





[jira] [Updated] (HBASE-5399) Cut the link between the client and the zookeeper ensemble

2012-03-04 Thread nkeywal (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-5399:
---

Status: Patch Available  (was: Open)

 Cut the link between the client and the zookeeper ensemble
 --

 Key: HBASE-5399
 URL: https://issues.apache.org/jira/browse/HBASE-5399
 Project: HBase
  Issue Type: Improvement
  Components: client
Affects Versions: 0.94.0
 Environment: all
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor
 Attachments: 5399.v27.patch, 5399.v38.patch, 5399_inprogress.patch, 
 5399_inprogress.v14.patch, 5399_inprogress.v16.patch, 
 5399_inprogress.v18.patch, 5399_inprogress.v20.patch, 
 5399_inprogress.v21.patch, 5399_inprogress.v23.patch, 
 5399_inprogress.v3.patch, 5399_inprogress.v32.patch, 5399_inprogress.v9.patch







[jira] [Updated] (HBASE-5515) Add a processRow API that supports atomic multiple reads and writes on a row

2012-03-04 Thread Phabricator (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Phabricator updated HBASE-5515:
---

Attachment: HBASE-5515.D2067.8.patch

sc updated the revision HBASE-5515 [jira] Add a processRow API that supports 
atomic multiple reads and writes on a row.
Reviewers: tedyu, dhruba, JIRA

  Add a static helper function to allow easier use of the RowProcessor

REVISION DETAIL
  https://reviews.facebook.net/D2067

AFFECTED FILES
  src/main/java/org/apache/hadoop/hbase/coprocessor/ProcessRowEndpoint.java
  src/main/java/org/apache/hadoop/hbase/coprocessor/ProcessRowProtocol.java
  src/main/java/org/apache/hadoop/hbase/coprocessor/RowProcessor.java
  src/main/java/org/apache/hadoop/hbase/io/HbaseObjectWritable.java
  src/main/java/org/apache/hadoop/hbase/ipc/HRegionInterface.java
  src/main/java/org/apache/hadoop/hbase/ipc/WritableRpcEngine.java
  src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
  src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
  src/test/java/org/apache/hadoop/hbase/coprocessor/TestProcessRowEndpoint.java
  src/test/java/org/apache/hadoop/hbase/io/TestHbaseObjectWritable.java


 Add a processRow API that supports atomic multiple reads and writes on a row
 

 Key: HBASE-5515
 URL: https://issues.apache.org/jira/browse/HBASE-5515
 Project: HBase
  Issue Type: New Feature
Reporter: Scott Chen
Assignee: Scott Chen
 Attachments: HBASE-5515.D2067.1.patch, HBASE-5515.D2067.2.patch, 
 HBASE-5515.D2067.3.patch, HBASE-5515.D2067.4.patch, HBASE-5515.D2067.5.patch, 
 HBASE-5515.D2067.6.patch, HBASE-5515.D2067.7.patch, HBASE-5515.D2067.8.patch


 We have modified HRegion.java internally to do some atomic row processing. It 
 would be nice to have a pluggable API for this.





[jira] [Commented] (HBASE-5399) Cut the link between the client and the zookeeper ensemble

2012-03-04 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13221973#comment-13221973
 ] 

Hadoop QA commented on HBASE-5399:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12517013/5399.v38.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 30 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

-1 javac.  The patch appears to cause mvn compile goal to fail.

-1 findbugs.  The patch appears to cause Findbugs (version 1.3.9) to fail.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

 -1 core tests.  The patch failed these unit tests:
 

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/1095//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/1095//console

This message is automatically generated.

 Cut the link between the client and the zookeeper ensemble
 --

 Key: HBASE-5399
 URL: https://issues.apache.org/jira/browse/HBASE-5399
 Project: HBase
  Issue Type: Improvement
  Components: client
Affects Versions: 0.94.0
 Environment: all
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor
 Attachments: 5399.v27.patch, 5399.v38.patch, 5399_inprogress.patch, 
 5399_inprogress.v14.patch, 5399_inprogress.v16.patch, 
 5399_inprogress.v18.patch, 5399_inprogress.v20.patch, 
 5399_inprogress.v21.patch, 5399_inprogress.v23.patch, 
 5399_inprogress.v3.patch, 5399_inprogress.v32.patch, 5399_inprogress.v9.patch







[jira] [Updated] (HBASE-5399) Cut the link between the client and the zookeeper ensemble

2012-03-04 Thread nkeywal (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-5399:
---

Status: Open  (was: Patch Available)

 Cut the link between the client and the zookeeper ensemble
 --

 Key: HBASE-5399
 URL: https://issues.apache.org/jira/browse/HBASE-5399
 Project: HBase
  Issue Type: Improvement
  Components: client
Affects Versions: 0.94.0
 Environment: all
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor
 Attachments: 5399.v27.patch, 5399.v38.patch, 5399_inprogress.patch, 
 5399_inprogress.v14.patch, 5399_inprogress.v16.patch, 
 5399_inprogress.v18.patch, 5399_inprogress.v20.patch, 
 5399_inprogress.v21.patch, 5399_inprogress.v23.patch, 
 5399_inprogress.v3.patch, 5399_inprogress.v32.patch, 5399_inprogress.v9.patch







[jira] [Updated] (HBASE-5399) Cut the link between the client and the zookeeper ensemble

2012-03-04 Thread nkeywal (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-5399:
---

Attachment: 5399.v39.patch

 Cut the link between the client and the zookeeper ensemble
 --

 Key: HBASE-5399
 URL: https://issues.apache.org/jira/browse/HBASE-5399
 Project: HBase
  Issue Type: Improvement
  Components: client
Affects Versions: 0.94.0
 Environment: all
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor
 Attachments: 5399.v27.patch, 5399.v38.patch, 5399.v39.patch, 
 5399_inprogress.patch, 5399_inprogress.v14.patch, 5399_inprogress.v16.patch, 
 5399_inprogress.v18.patch, 5399_inprogress.v20.patch, 
 5399_inprogress.v21.patch, 5399_inprogress.v23.patch, 
 5399_inprogress.v3.patch, 5399_inprogress.v32.patch, 5399_inprogress.v9.patch







[jira] [Updated] (HBASE-5399) Cut the link between the client and the zookeeper ensemble

2012-03-04 Thread nkeywal (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-5399:
---

Status: Patch Available  (was: Open)

 Cut the link between the client and the zookeeper ensemble
 --

 Key: HBASE-5399
 URL: https://issues.apache.org/jira/browse/HBASE-5399
 Project: HBase
  Issue Type: Improvement
  Components: client
Affects Versions: 0.94.0
 Environment: all
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor
 Attachments: 5399.v27.patch, 5399.v38.patch, 5399.v39.patch, 
 5399_inprogress.patch, 5399_inprogress.v14.patch, 5399_inprogress.v16.patch, 
 5399_inprogress.v18.patch, 5399_inprogress.v20.patch, 
 5399_inprogress.v21.patch, 5399_inprogress.v23.patch, 
 5399_inprogress.v3.patch, 5399_inprogress.v32.patch, 5399_inprogress.v9.patch







[jira] [Commented] (HBASE-5515) Add a processRow API that supports atomic multiple reads and writes on a row

2012-03-04 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5515?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13221976#comment-13221976
 ] 

Hadoop QA commented on HBASE-5515:
--

-1 overall.  Here are the results of testing the latest attachment 
  
http://issues.apache.org/jira/secure/attachment/12517011/HBASE-5515.D2067.7.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 5 new or modified tests.

-1 javadoc.  The javadoc tool appears to have generated -128 warning 
messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

-1 findbugs.  The patch appears to introduce 154 new Findbugs (version 
1.3.9) warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

 -1 core tests.  The patch failed these unit tests:
   org.apache.hadoop.hbase.mapreduce.TestImportTsv
  org.apache.hadoop.hbase.mapred.TestTableMapReduce
  org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/1094//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/1094//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/1094//console

This message is automatically generated.

 Add a processRow API that supports atomic multiple reads and writes on a row
 

 Key: HBASE-5515
 URL: https://issues.apache.org/jira/browse/HBASE-5515
 Project: HBase
  Issue Type: New Feature
Reporter: Scott Chen
Assignee: Scott Chen
 Attachments: HBASE-5515.D2067.1.patch, HBASE-5515.D2067.2.patch, 
 HBASE-5515.D2067.3.patch, HBASE-5515.D2067.4.patch, HBASE-5515.D2067.5.patch, 
 HBASE-5515.D2067.6.patch, HBASE-5515.D2067.7.patch, HBASE-5515.D2067.8.patch


 We have modified HRegion.java internally to do some atomic row processing. It 
 would be nice to have a pluggable API for this.





[jira] [Commented] (HBASE-5515) Add a processRow API that supports atomic multiple reads and writes on a row

2012-03-04 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5515?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13221981#comment-13221981
 ] 

Hadoop QA commented on HBASE-5515:
--

-1 overall.  Here are the results of testing the latest attachment 
  
http://issues.apache.org/jira/secure/attachment/12517014/HBASE-5515.D2067.8.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 5 new or modified tests.

-1 javadoc.  The javadoc tool appears to have generated -128 warning 
messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

-1 findbugs.  The patch appears to introduce 154 new Findbugs (version 
1.3.9) warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

 -1 core tests.  The patch failed these unit tests:
   org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat
  org.apache.hadoop.hbase.mapred.TestTableMapReduce
  org.apache.hadoop.hbase.mapreduce.TestImportTsv

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/1096//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/1096//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/1096//console

This message is automatically generated.

 Add a processRow API that supports atomic multiple reads and writes on a row
 

 Key: HBASE-5515
 URL: https://issues.apache.org/jira/browse/HBASE-5515
 Project: HBase
  Issue Type: New Feature
Reporter: Scott Chen
Assignee: Scott Chen
 Attachments: HBASE-5515.D2067.1.patch, HBASE-5515.D2067.2.patch, 
 HBASE-5515.D2067.3.patch, HBASE-5515.D2067.4.patch, HBASE-5515.D2067.5.patch, 
 HBASE-5515.D2067.6.patch, HBASE-5515.D2067.7.patch, HBASE-5515.D2067.8.patch


 We have modified HRegion.java internally to do some atomic row processing. It 
 would be nice to have a pluggable API for this.





[jira] [Commented] (HBASE-5399) Cut the link between the client and the zookeeper ensemble

2012-03-04 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13221995#comment-13221995
 ] 

Hadoop QA commented on HBASE-5399:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12517015/5399.v39.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 30 new or modified tests.

-1 javadoc.  The javadoc tool appears to have generated -129 warning 
messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

-1 findbugs.  The patch appears to introduce 155 new Findbugs (version 
1.3.9) warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

 -1 core tests.  The patch failed these unit tests:
   org.apache.hadoop.hbase.TestZooKeeper
  org.apache.hadoop.hbase.TestRegionRebalancing

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/1097//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/1097//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/1097//console

This message is automatically generated.

 Cut the link between the client and the zookeeper ensemble
 --

 Key: HBASE-5399
 URL: https://issues.apache.org/jira/browse/HBASE-5399
 Project: HBase
  Issue Type: Improvement
  Components: client
Affects Versions: 0.94.0
 Environment: all
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor
 Attachments: 5399.v27.patch, 5399.v38.patch, 5399.v39.patch, 
 5399_inprogress.patch, 5399_inprogress.v14.patch, 5399_inprogress.v16.patch, 
 5399_inprogress.v18.patch, 5399_inprogress.v20.patch, 
 5399_inprogress.v21.patch, 5399_inprogress.v23.patch, 
 5399_inprogress.v3.patch, 5399_inprogress.v32.patch, 5399_inprogress.v9.patch


 The link is often considered an issue, for various reasons. One of them 
 is that there is a limit on the number of connections that ZK can manage. 
 Stack also suggested removing the link to the master from HConnection.
 There are choices to be made considering the existing API (which we don't want 
 to break).
 The first patches I will submit on hadoop-qa should not be committed: they 
 are here to show the progress on the direction taken.
 ZooKeeper is used for:
 - public getter, to let the client do whatever it wants, and close ZooKeeper 
 when closing the connection => we have to deprecate this but keep it.
 - reading the master address to create a master => now done with a temporary 
 zookeeper connection
 - reading the root location => now done with a temporary zookeeper connection, 
 but questionable. Used in public function locateRegion. To be reworked.
 - reading the cluster id => now done once with a temporary zookeeper connection.
 - checking if the base node is available => now done once with a zookeeper 
 connection given as a parameter
 - isTableDisabled/isTableAvailable => public functions, now done with a 
 temporary zookeeper connection.
  - Called internally from HBaseAdmin and HTable
 - getCurrentNrHRS(): public function to get the number of region servers and 
 create a pool of threads => now done with a temporary zookeeper connection
 Master is used for:
 - getMaster public getter, as for ZooKeeper => we have to deprecate this but 
 keep it.
 - isMasterRunning(): public function, used internally by HMerge & HBaseAdmin
 - getHTableDescriptor*: public functions offering access to the master. => 
 we could make them use a temporary master connection as well.
 Main points are:
 - hbase class for ZooKeeper; ZooKeeperWatcher is really designed for a 
 strongly coupled architecture ;-). This can be changed, but requires a lot of 
 modifications in these classes (likely adding a class in the middle of the 
 hierarchy, something like that). Anyway, a non-connected client will always be 
 much slower, because it's a tcp connection, and establishing a tcp 
 connection is slow.
 - having a link between ZK and all the clients seems to make sense for some 
 use cases. However, it won't scale if a TCP connection is required for every 
 client
 - if we move the table descriptor part away from the client, we need to find 
 a new place for it.
 - we will have the same issue with HBaseAdmin (for both ZK & Master); maybe we 
 can put a timeout on the connection. That would make the whole system less 
 deterministic however.
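
 The "temporary zookeeper connection" idea above can be sketched roughly as 
 follows. This is only an illustration, not the code in the patch: 
 TempZkConnection, its read() method and the value it returns are 
 hypothetical stand-ins and do not use the real ZooKeeper or HBase client 
 API. The point is that the session is opened for a single read and always 
 closed afterwards, so no long-lived link to the ensemble remains.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for a ZooKeeper session; not the HBase or ZooKeeper API. */
final class TempZkConnection implements AutoCloseable {
    /** Number of sessions currently open, so leaks are easy to spot. */
    static final AtomicInteger OPEN = new AtomicInteger();

    TempZkConnection() {
        OPEN.incrementAndGet(); // a real client would pay for a TCP connect here
    }

    /** Pretends to read a znode; the returned value is made up for the example. */
    String read(String path) {
        return "value-of:" + path;
    }

    @Override
    public void close() {
        OPEN.decrementAndGet();
    }
}

final class TemporaryConnectionSketch {
    /** Opens a session, reads once, and always closes it, even on failure. */
    static String readClusterId() {
        try (TempZkConnection zk = new TempZkConnection()) {
            return zk.read("/hbase/hbaseid"); // path chosen for illustration only
        }
    }

    public static void main(String[] args) {
        System.out.println(readClusterId());
        System.out.println("open sessions: " + TempZkConnection.OPEN.get());
    }
}
```

 The trade-off is the one the description mentions: every call pays the cost 
 of establishing a connection, in exchange for not holding one open per client.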


[jira] [Commented] (HBASE-5515) Add a processRow API that supports atomic multiple reads and writes on a row

2012-03-04 Thread Phabricator (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5515?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13221999#comment-13221999
 ] 

Phabricator commented on HBASE-5515:


tedyu has commented on the revision HBASE-5515 [jira] Add a processRow API 
that supports atomic multiple reads and writes on a row.

INLINE COMMENTS
  src/test/java/org/apache/hadoop/hbase/coprocessor/TestProcessRowEndpoint.java:210
  Can we remove the braces for this code block (ending at line 223)?
  src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java:4318
  In the processor, multiple scans may be performed.
  Should we introduce a timeout for this invocation?
  src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java:4368
  What if the WAL sync was unsuccessful?
  We should be prepared to remove keys from the memstore.
  See line 2204 of doMiniBatchPut().
  src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java:4364
  Does step 9 have to be in the finally block?

REVISION DETAIL
  https://reviews.facebook.net/D2067


 Add a processRow API that supports atomic multiple reads and writes on a row
 

 Key: HBASE-5515
 URL: https://issues.apache.org/jira/browse/HBASE-5515
 Project: HBase
  Issue Type: New Feature
Reporter: Scott Chen
Assignee: Scott Chen
 Attachments: HBASE-5515.D2067.1.patch, HBASE-5515.D2067.2.patch, 
 HBASE-5515.D2067.3.patch, HBASE-5515.D2067.4.patch, HBASE-5515.D2067.5.patch, 
 HBASE-5515.D2067.6.patch, HBASE-5515.D2067.7.patch, HBASE-5515.D2067.8.patch


 We have modified HRegion.java internally to do some atomic row processing. It 
 would be nice to have a pluggable API for this.





[jira] [Updated] (HBASE-5515) Add a processRow API that supports atomic multiple reads and writes on a row

2012-03-04 Thread Zhihong Yu (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihong Yu updated HBASE-5515:
--

Comment: was deleted

(was: -1 overall.  Here are the results of testing the latest attachment 
  
http://issues.apache.org/jira/secure/attachment/12517011/HBASE-5515.D2067.7.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 5 new or modified tests.

-1 javadoc.  The javadoc tool appears to have generated -128 warning 
messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

-1 findbugs.  The patch appears to introduce 154 new Findbugs (version 
1.3.9) warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

 -1 core tests.  The patch failed these unit tests:
   org.apache.hadoop.hbase.mapreduce.TestImportTsv
  org.apache.hadoop.hbase.mapred.TestTableMapReduce
  org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/1094//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/1094//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/1094//console

This message is automatically generated.)

 Add a processRow API that supports atomic multiple reads and writes on a row
 

 Key: HBASE-5515
 URL: https://issues.apache.org/jira/browse/HBASE-5515
 Project: HBase
  Issue Type: New Feature
Reporter: Scott Chen
Assignee: Scott Chen
 Attachments: HBASE-5515.D2067.1.patch, HBASE-5515.D2067.2.patch, 
 HBASE-5515.D2067.3.patch, HBASE-5515.D2067.4.patch, HBASE-5515.D2067.5.patch, 
 HBASE-5515.D2067.6.patch, HBASE-5515.D2067.7.patch, HBASE-5515.D2067.8.patch


 We have modified HRegion.java internally to do some atomic row processing. It 
 would be nice to have a pluggable API for this.





[jira] [Commented] (HBASE-5473) Metrics does not push pread time

2012-03-04 Thread dhruba borthakur (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13222094#comment-13222094
 ] 

dhruba borthakur commented on HBASE-5473:
-

This is committed to trunk. Can this be closed?

 Metrics does not push pread time
 

 Key: HBASE-5473
 URL: https://issues.apache.org/jira/browse/HBASE-5473
 Project: HBase
  Issue Type: Bug
  Components: metrics
Reporter: dhruba borthakur
Assignee: dhruba borthakur
Priority: Minor
 Fix For: 0.94.0

 Attachments: D1947.1.patch, D1947.1.patch, D1947.1.patch, D1947.patch


 The RegionServerMetrics is not pushing the pread times to the MetricsRecord





[jira] [Commented] (HBASE-5074) support checksums in HBase block cache

2012-03-04 Thread dhruba borthakur (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13222096#comment-13222096
 ] 

dhruba borthakur commented on HBASE-5074:
-

This has been running successfully for days on end in my clusters. Stack: please 
let me know if your testing showed anything amiss. Thanks.

 support checksums in HBase block cache
 --

 Key: HBASE-5074
 URL: https://issues.apache.org/jira/browse/HBASE-5074
 Project: HBase
  Issue Type: Improvement
  Components: regionserver
Reporter: dhruba borthakur
Assignee: dhruba borthakur
 Attachments: D1521.1.patch, D1521.1.patch, D1521.10.patch, 
 D1521.10.patch, D1521.10.patch, D1521.10.patch, D1521.10.patch, 
 D1521.11.patch, D1521.11.patch, D1521.12.patch, D1521.12.patch, 
 D1521.2.patch, D1521.2.patch, D1521.3.patch, D1521.3.patch, D1521.4.patch, 
 D1521.4.patch, D1521.5.patch, D1521.5.patch, D1521.6.patch, D1521.6.patch, 
 D1521.7.patch, D1521.7.patch, D1521.8.patch, D1521.8.patch, D1521.9.patch, 
 D1521.9.patch


 The current implementation of HDFS stores the data in one block file and the 
 metadata (checksum) in another block file. This means that every read into the 
 HBase block cache actually consumes two disk iops, one to the data file and 
 one to the checksum file. This is a major problem for scaling HBase, because 
 HBase is usually bottlenecked on the number of random disk iops that the 
 storage hardware offers.
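
 To make the idea concrete, here is a minimal sketch of keeping a checksum 
 inline with the data so that a single read returns both. It is an 
 assumption-laden illustration, not the layout used by the attached patches: 
 the 4-byte CRC32 trailer and the names writeBlock/readBlock are invented 
 for this example.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;
import java.util.zip.CRC32;

final class InlineChecksumSketch {
    /** Appends a 4-byte CRC32 trailer so data and checksum are stored together. */
    static byte[] writeBlock(byte[] data) {
        CRC32 crc = new CRC32();
        crc.update(data, 0, data.length);
        return ByteBuffer.allocate(data.length + 4)
                .put(data)
                .putInt((int) crc.getValue())
                .array();
    }

    /** Verifies the trailer and returns the payload; throws if the block is corrupt. */
    static byte[] readBlock(byte[] block) {
        if (block.length < 4) {
            throw new IllegalArgumentException("block too short");
        }
        int dataLen = block.length - 4;
        CRC32 crc = new CRC32();
        crc.update(block, 0, dataLen);
        int stored = ByteBuffer.wrap(block, dataLen, 4).getInt();
        if ((int) crc.getValue() != stored) {
            throw new IllegalStateException("checksum mismatch");
        }
        return Arrays.copyOf(block, dataLen);
    }

    public static void main(String[] args) {
        byte[] block = writeBlock("hello".getBytes());
        System.out.println(new String(readBlock(block)));
    }
}
```

 With this kind of layout the verification happens on the bytes already 
 fetched, so no second disk access to a separate checksum file is needed.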





[jira] [Commented] (HBASE-5477) Cannot build RPM for hbase-0.92.0

2012-03-04 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13222112#comment-13222112
 ] 

Hudson commented on HBASE-5477:
---

Integrated in HBase-0.92-security #96 (See 
[https://builds.apache.org/job/HBase-0.92-security/96/])
HBASE-5477 Cannot build RPM for hbase-0.92.0 (Revision 1294317)

 Result = FAILURE
stack : 
Files : 
* /hbase/branches/0.92/CHANGES.txt
* /hbase/branches/0.92/src/packages/rpm/spec/hbase.spec
* /hbase/branches/0.92/src/packages/update-hbase-env.sh


 Cannot build RPM for hbase-0.92.0
 -

 Key: HBASE-5477
 URL: https://issues.apache.org/jira/browse/HBASE-5477
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.0
 Environment: Operating system: CentOS 6.2
 {code}
 $ java -version
 java version "1.6.0_22"
 OpenJDK Runtime Environment (IcedTea6 1.10.6) (rhel-1.43.1.10.6.el6_2-x86_64)
 OpenJDK 64-Bit Server VM (build 20.0-b11, mixed mode)
 {code}
 {code}
 $ mvn -v
 Warning: JAVA_HOME environment variable is not set.
 Apache Maven 2.2.1 (r801777; 2009-08-06 12:16:01-0700)
 Java version: 1.6.0_22
 Java home: /usr/lib/jvm/java-1.6.0-openjdk-1.6.0.0.x86_64/jre
 Default locale: en_US, platform encoding: UTF-8
 OS name: "linux" version: "2.6.32-220.el6.x86_64" arch: "amd64" Family: "unix"
 {code}
Reporter: Benjamin Lee
 Fix For: 0.92.1, 0.94.0

 Attachments: build.log, hbase-0.92.0.patch


 Steps to reproduce:
 {code}
 tar xzvf hbase-0.92.0.tar.gz
 cd hbase-0.92.0
 mvn -Dmaven.test.skip.exec=true -P rpm install
 {code}
 Failure output and patch will be attached.





[jira] [Commented] (HBASE-5286) bin/hbase's logic of adding Hadoop jar files to the classpath is fragile when presented with split packaged Hadoop 0.23 installation

2012-03-04 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13222114#comment-13222114
 ] 

Hudson commented on HBASE-5286:
---

Integrated in HBase-0.92-security #96 (See 
[https://builds.apache.org/job/HBase-0.92-security/96/])
HBASE-5286 bin/hbase's logic of adding Hadoop jar files to the classpath is 
fragile when presented with split packaged Hadoop 0.23 installation (Revision 
129)

 Result = FAILURE
stack : 
Files : 
* /hbase/branches/0.92/CHANGES.txt
* /hbase/branches/0.92/bin/hbase
* 
/hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/util/GetJavaProperty.java


 bin/hbase's logic of adding Hadoop jar files to the classpath is fragile when 
 presented with split packaged Hadoop 0.23 installation
 

 Key: HBASE-5286
 URL: https://issues.apache.org/jira/browse/HBASE-5286
 Project: HBase
  Issue Type: Bug
  Components: scripts
Affects Versions: 0.92.0
Reporter: Roman Shaposhnik
Assignee: Roman Shaposhnik
 Fix For: 0.92.1, 0.94.0

 Attachments: HBASE-5286.patch.txt


 Here's the bit from bin/hbase that might need TLC now that Hadoop can be 
 spotted in the wild in split-package configuration:
 {noformat}
 #If avail, add Hadoop to the CLASSPATH and to the JAVA_LIBRARY_PATH
 if [ ! -z $HADOOP_HOME ]; then
   HADOOPCPPATH=
   if [ -z $HADOOP_CONF_DIR ]; then
     HADOOPCPPATH=$(append_path "${HADOOPCPPATH}" "${HADOOP_HOME}/conf")
   else
     HADOOPCPPATH=$(append_path "${HADOOPCPPATH}" "${HADOOP_CONF_DIR}")
   fi
   if [ "`echo ${HADOOP_HOME}/hadoop-core*.jar`" != "${HADOOP_HOME}/hadoop-core*.jar" ] ; then
     HADOOPCPPATH=$(append_path "${HADOOPCPPATH}" `ls ${HADOOP_HOME}/hadoop-core*.jar | head -1`)
   else
     HADOOPCPPATH=$(append_path "${HADOOPCPPATH}" `ls ${HADOOP_HOME}/hadoop-common*.jar | head -1`)
     HADOOPCPPATH=$(append_path "${HADOOPCPPATH}" `ls ${HADOOP_HOME}/hadoop-hdfs*.jar | head -1`)
     HADOOPCPPATH=$(append_path "${HADOOPCPPATH}" `ls ${HADOOP_HOME}/hadoop-mapred*.jar | head -1`)
   fi
 {noformat}
 There are a couple of issues with the above code:
   0. HADOOP_HOME is now deprecated in Hadoop 0.23
   1. the list of jar files added to the classpath should be revised
   2. we need to figure out a more robust way to get the jar files that are 
      needed onto the classpath (things like hadoop-mapred*.jar tend to match 
      src/test jars as well)
 Better yet, it would be useful to look into whether we can transition HBase's 
 bin/hbase onto using bin/hadoop as a launcher script instead of direct JAVA 
 invocations (Pig, Hive, Sqoop and Mahout already do that)





[jira] [Commented] (HBASE-5415) FSTableDescriptors should handle random folders in hbase.root.dir better

2012-03-04 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13222116#comment-13222116
 ] 

Hudson commented on HBASE-5415:
---

Integrated in HBase-0.92-security #96 (See 
[https://builds.apache.org/job/HBase-0.92-security/96/])
HBASE-5415  FSTableDescriptors should handle random folders in
   hbase.root.dir better (Revision 1293041)

 Result = FAILURE
jdcryans : 
Files : 
* /hbase/branches/0.92/CHANGES.txt
* 
/hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/TableDescriptors.java
* 
/hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/master/CatalogJanitor.java
* 
/hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/master/DefaultLoadBalancer.java
* 
/hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/master/handler/TableEventHandler.java
* 
/hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/util/FSTableDescriptors.java
* 
/hbase/branches/0.92/src/test/java/org/apache/hadoop/hbase/master/TestCatalogJanitor.java
* 
/hbase/branches/0.92/src/test/java/org/apache/hadoop/hbase/util/TestFSTableDescriptors.java


 FSTableDescriptors should handle random folders in hbase.root.dir better
 

 Key: HBASE-5415
 URL: https://issues.apache.org/jira/browse/HBASE-5415
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.0
Reporter: Jean-Daniel Cryans
Assignee: Jean-Daniel Cryans
Priority: Critical
 Fix For: 0.92.1, 0.94.0

 Attachments: HBASE-5415.patch


 I faked an upgrade on a test cluster using our dev data so I had to distcp 
 the data between the two clusters, but after starting up and doing the 
 migration and whatnot the web UI didn't show any table. The reason was in the 
 master's log:
 {quote}
 org.apache.hadoop.hbase.TableExistsException: No descriptor for _distcp_logs_e0ehek
   at org.apache.hadoop.hbase.util.FSTableDescriptors.get(FSTableDescriptors.java:164)
   at org.apache.hadoop.hbase.util.FSTableDescriptors.getAll(FSTableDescriptors.java:182)
   at org.apache.hadoop.hbase.master.HMaster.getHTableDescriptors(HMaster.java:1554)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
   at java.lang.reflect.Method.invoke(Method.java:597)
   at org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:364)
   at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1326)
 {quote}
 I don't think we need to show a full stack (just a WARN maybe), this 
 shouldn't kill the request (still see tables in the web UI), and why is that 
 a TableExistsException?
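
 The behaviour asked for above (warn and carry on, rather than fail the 
 whole request) could look roughly like the sketch below. It is not the 
 committed fix: the method name tablesWithDescriptors and the use of plain 
 strings and a set in place of the filesystem and FSTableDescriptors are 
 assumptions made to keep the example self-contained.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.logging.Logger;

final class DescriptorScanSketch {
    private static final Logger LOG = Logger.getLogger("DescriptorScanSketch");

    /**
     * Returns the directories that have a descriptor. Directories without one
     * (for example distcp leftovers) are logged at WARNING and skipped.
     */
    static List<String> tablesWithDescriptors(List<String> dirs, Set<String> haveDescriptor) {
        List<String> tables = new ArrayList<>();
        for (String dir : dirs) {
            if (!haveDescriptor.contains(dir)) {
                LOG.warning("No descriptor for " + dir + ", skipping");
                continue; // one stray folder must not fail the whole listing
            }
            tables.add(dir);
        }
        return tables;
    }

    public static void main(String[] args) {
        List<String> dirs = Arrays.asList("usertable", "_distcp_logs_e0ehek", "events");
        Set<String> known = new HashSet<>(Arrays.asList("usertable", "events"));
        System.out.println(tablesWithDescriptors(dirs, known));
    }
}
```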





[jira] [Commented] (HBASE-5466) Opening a table also opens the metatable and never closes it.

2012-03-04 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13222113#comment-13222113
 ] 

Hudson commented on HBASE-5466:
---

Integrated in HBase-0.92-security #96 (See 
[https://builds.apache.org/job/HBase-0.92-security/96/])
HBASE-5466  Opening a table also opens the metatable and never closes it - 
place JIRA at the right position in CHANGES.txt
   (Ashley Taylor) (Revision 1293048)
HBASE-5466  Opening a table also opens the metatable and never closes it 
   (Ashley Taylor) (Revision 1293045)

 Result = FAILURE
tedyu : 
Files : 
* /hbase/branches/0.92/CHANGES.txt

tedyu : 
Files : 
* /hbase/branches/0.92/CHANGES.txt
* 
/hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/client/MetaScanner.java


 Opening a table also opens the metatable and never closes it.
 -

 Key: HBASE-5466
 URL: https://issues.apache.org/jira/browse/HBASE-5466
 Project: HBase
  Issue Type: Bug
  Components: client
Affects Versions: 0.90.5, 0.92.0
Reporter: Ashley Taylor
Assignee: Ashley Taylor
 Fix For: 0.92.1

 Attachments: MetaScanner_HBASE_5466(2).patch, 
 MetaScanner_HBASE_5466(3).patch, MetaScanner_HBASE_5466.patch


 Having upgraded to the CDH3U3 version of HBase, we found we had a ZooKeeper 
 connection leak. Tracking it down, we found that closing the connection will 
 only close the ZooKeeper connection if all calls to get the connection have 
 been closed; there is an incCount and a decCount in the HConnection class.
 When a table is opened, it makes a call to the MetaScanner class, which opens a 
 connection to the meta table; this table never gets closed.
 This causes the count in the HConnection class to never return to zero, 
 meaning that the ZooKeeper connection will not close when we close all the 
 tables or call
 HConnectionManager.deleteConnection(config, true);
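
 The leak described above can be reproduced in miniature with a plain 
 reference counter. The sketch below is an illustration only: the class 
 RefCountedConnection is hypothetical and merely borrows the incCount and 
 decCount names from the report; it is not the HConnection implementation.

```java
/** Hypothetical reference-counted connection; names mirror the report, not HBase source. */
final class RefCountedConnection {
    private int refCount = 0;
    private boolean zkOpen = false;

    /** Called each time a caller obtains the shared connection. */
    synchronized void incCount() {
        if (refCount == 0) {
            zkOpen = true; // first user opens the underlying ZooKeeper session
        }
        refCount++;
    }

    /** Called each time a caller closes its handle. */
    synchronized void decCount() {
        if (refCount > 0 && --refCount == 0) {
            zkOpen = false; // last user closes the underlying session
        }
    }

    synchronized boolean isZkOpen() { return zkOpen; }

    synchronized int getRefCount() { return refCount; }
}

final class RefCountLeakSketch {
    /** Opens and closes a user table; the meta handle is released only if closeMeta is true. */
    static RefCountedConnection openAndCloseTable(boolean closeMeta) {
        RefCountedConnection conn = new RefCountedConnection();
        conn.incCount();      // user table handle
        conn.incCount();      // meta table handle opened while scanning
        if (closeMeta) {
            conn.decCount();  // the fix: release the meta handle after scanning
        }
        conn.decCount();      // user closes the table
        return conn;
    }

    public static void main(String[] args) {
        System.out.println("leaky: zk open = " + openAndCloseTable(false).isZkOpen());
        System.out.println("fixed: zk open = " + openAndCloseTable(true).isZkOpen());
    }
}
```

 Without the matching decCount for the meta handle the count stays at one, 
 so the underlying session is never closed.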





[jira] [Commented] (HBASE-5489) Add HTable accessor to get regions for a key range

2012-03-04 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13222115#comment-13222115
 ] 

Hudson commented on HBASE-5489:
---

Integrated in HBase-0.92-security #96 (See 
[https://builds.apache.org/job/HBase-0.92-security/96/])
HBASE-5489 Addendum (Revision 1296008)
HBASE-5489 Add HTable accessor to get regions for a key range (Revision 1295767)
HBASE-5489 Add HTable accessor to get regions for a key range (Revision 1295727)

 Result = FAILURE
larsh : 
Files : 
* /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/client/HTable.java

stack : 
Files : 
* /hbase/branches/0.92/CHANGES.txt
* /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/client/HTable.java
* 
/hbase/branches/0.92/src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java

stack : 
Files : 
* /hbase/branches/0.92/CHANGES.txt


 Add HTable accessor to get regions for a key range
 --

 Key: HBASE-5489
 URL: https://issues.apache.org/jira/browse/HBASE-5489
 Project: HBase
  Issue Type: Improvement
  Components: client
Reporter: David S. Wang
Assignee: David S. Wang
Priority: Minor
 Fix For: 0.92.1, 0.94.0

 Attachments: HBASE-5489-2.patch, HBASE-5489-3-0.92.1.patch, 
 HBASE-5489-3.patch, HBASE-5489-4.patch


 It would be nice to have an accessor to find all regions that overlap with a 
 particular range of keys. Right now, the only way to accomplish that is to 
 call HTable.getStartEndKeys(), then follow that with calls to 
 getRegionLocation() for the range of keys you are interested in.  This 
 algorithm has 2 drawbacks:
 * It returns more keys than is necessary most of the time.  This is 
 especially evident if there are a lot of regions comprising the table and the 
 range of keys is small.
 * It always does a scan of .META. via MetaScannerVisitor for at least 
 HTable.getStartEndKeys(), and perhaps for HRegionLocations that are not 
 already cached by the client.
 An accessor that limited its scans to a specified range could avoid scanning 
 .META. at all if the HRegionLocations being fetched were already cached by 
 the client, thereby potentially making this operation faster in common cases.
 Here's a proposal for the accessor:
   /**
    * Get the corresponding regions for an arbitrary range of keys.
    * <p>
    * @param startKey Starting row in range, inclusive
    * @param endKey Ending row in range, inclusive
    * @return A list of HRegionLocations corresponding to the regions that
    * contain the specified range
    * @throws IOException if a remote or network exception occurs
    */
   public List<HRegionLocation> getRegionsInRange(final byte [] startKey,
     final byte [] endKey) throws IOException
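
 As a rough illustration of the lookup such an accessor performs, the sketch 
 below finds the regions overlapping an inclusive key range from a sorted 
 map of region start keys. It is not the proposed implementation: it uses 
 String keys and region names instead of byte[] and HRegionLocation, it does 
 no caching or .META. access, and the name regionsInRange is made up for the 
 example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class RegionsInRangeSketch {
    /**
     * Returns names of regions overlapping [startKey, endKey], both inclusive.
     * regionsByStartKey maps each region's start key to its name; the first
     * region is assumed to start at the empty string so every key has a floor entry.
     */
    static List<String> regionsInRange(TreeMap<String, String> regionsByStartKey,
                                       String startKey, String endKey) {
        List<String> result = new ArrayList<>();
        if (startKey.compareTo(endKey) > 0) {
            return result; // empty range
        }
        // The region holding startKey has the greatest start key <= startKey.
        String first = regionsByStartKey.floorKey(startKey);
        if (first == null) {
            return result; // no region covers startKey
        }
        // Every region whose start key lies in [first, endKey] overlaps the range.
        for (Map.Entry<String, String> e
                : regionsByStartKey.subMap(first, true, endKey, true).entrySet()) {
            result.add(e.getValue());
        }
        return result;
    }

    public static void main(String[] args) {
        TreeMap<String, String> regions = new TreeMap<>();
        regions.put("", "region-1");   // keys before "g"
        regions.put("g", "region-2");  // keys from "g" up to "p"
        regions.put("p", "region-3");  // keys from "p" onward
        System.out.println(regionsInRange(regions, "h", "k"));
        System.out.println(regionsInRange(regions, "c", "q"));
    }
}
```

 Only the regions touching the requested range are visited, which is the 
 saving over fetching every start and end key first.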





[jira] [Commented] (HBASE-5430) Fix licenses in 0.92.1 -- RAT plugin won't pass

2012-03-04 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13222118#comment-13222118
 ] 

Hudson commented on HBASE-5430:
---

Integrated in HBase-0.92-security #96 (See 
[https://builds.apache.org/job/HBase-0.92-security/96/])
HBASE-5430 Fix licenses in 0.92.1 -- RAT plugin won't pass (Revision 
1296356)

 Result = FAILURE
stack : 
Files : 
* /hbase/branches/0.92/pom.xml


 Fix licenses in 0.92.1 -- RAT plugin won't pass
 ---

 Key: HBASE-5430
 URL: https://issues.apache.org/jira/browse/HBASE-5430
 Project: HBase
  Issue Type: Bug
Reporter: stack
Assignee: stack
Priority: Blocker
 Fix For: 0.92.1

 Attachments: 5430.txt


 Use the -Drelease profile to see that we are missing 30 or so licenses.  Fix.





[jira] [Commented] (HBASE-5317) Fix TestHFileOutputFormat to work against hadoop 0.23

2012-03-04 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13222119#comment-13222119
 ] 

Hudson commented on HBASE-5317:
---

Integrated in HBase-0.92-security #96 (See 
[https://builds.apache.org/job/HBase-0.92-security/96/])
HBASE-5317  Fix TestHFileOutputFormat to work against hadoop 0.23
   (Gregory Taylor) (Revision 1293306)

 Result = FAILURE
tedyu : 
Files : 
* /hbase/branches/0.92/CHANGES.txt
* /hbase/branches/0.92/pom.xml
* 
/hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/mapreduce/HFileOutputFormat.java
* 
/hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/mapreduce/TableMapReduceUtil.java
* 
/hbase/branches/0.92/src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java
* 
/hbase/branches/0.92/src/test/java/org/apache/hadoop/hbase/client/TestMetaMigrationRemovingHTD.java


 Fix TestHFileOutputFormat to work against hadoop 0.23
 -

 Key: HBASE-5317
 URL: https://issues.apache.org/jira/browse/HBASE-5317
 Project: HBase
  Issue Type: Bug
  Components: test
Affects Versions: 0.92.0, 0.94.0
Reporter: Gregory Chanan
Assignee: Gregory Chanan
 Fix For: 0.92.1, 0.94.0

 Attachments: HBASE-5317-v0.patch, HBASE-5317-v1.patch, 
 HBASE-5317-v3.patch, HBASE-5317-v4.patch, HBASE-5317-v5.patch, 
 HBASE-5317-v6.patch, HBASE-5317to0.92.patch, 
 TEST-org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat.xml


 Running
 mvn -Dhadoop.profile=23 test -P localTests -Dtest=org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat
 yields this on 0.92:
 Failed tests:
   testColumnFamilyCompression(org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat): HFile for column family info-A not found
 Tests in error:
   test_TIMERANGE(org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat): /home/gchanan/workspace/apache92/target/test-data/276cbd0c-c771-4f81-9ba8-c464c9dd7486/test_TIMERANGE_present/_temporary/0/_temporary/_attempt_200707121733_0001_m_00_0 (Is a directory)
   testMRIncrementalLoad(org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat): TestTable
   testMRIncrementalLoadWithSplit(org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat): TestTable
 It looks like on trunk, this also results in an error:
   testExcludeMinorCompaction(org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat): TestTable
 I have a patch that fixes testColumnFamilyCompression and test_TIMERANGE, but 
 haven't fixed the other 3 yet.





[jira] [Commented] (HBASE-5502) region_mover.rb fails to load regions back to original server for regions only containing empty tables.

2012-03-04 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5502?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13222121#comment-13222121
 ] 

Hudson commented on HBASE-5502:
---

Integrated in HBase-0.92-security #96 (See 
[https://builds.apache.org/job/HBase-0.92-security/96/])
HBASE-5502 region_mover.rb fails to load regions back to original server 
for regions only containing empty tables. (Revision 1295705)

 Result = FAILURE
stack : 
Files : 
* /hbase/branches/0.92/CHANGES.txt
* /hbase/branches/0.92/bin/region_mover.rb


 region_mover.rb fails to load regions back to original server for regions 
 only containing empty tables.
 ---

 Key: HBASE-5502
 URL: https://issues.apache.org/jira/browse/HBASE-5502
 Project: HBase
  Issue Type: Bug
  Components: scripts
Affects Versions: 0.92.0
 Environment: Ubuntu precise
Reporter: James Page
Priority: Minor
 Fix For: 0.92.1, 0.94.0

 Attachments: HDFS-5502.patch


 The region_mover loadRegion function incorrectly uses 'isSuccessfulScan':
 {noformat} 
   for r in regions
     exists = false
     begin
       exists = isSuccessfulScan(admin, r)
     rescue org.apache.hadoop.hbase.NotServingRegionException => e
       $LOG.info("Failed scan of " + e.message)
     end
 {noformat} 
 isSuccessfulScan throws an exception when it fails rather than returning 
 status.
 As a result empty regions don't get restored - this is the case in a fresh 
 install (which is how I discovered this) with no user table.
 Modifying the code to set exists IF isSuccessfulScan does not throw an 
 exception worked for me:
 {noformat}
   for r in regions
     exists = false
     begin
       isSuccessfulScan(admin, r)
       exists = true
     rescue org.apache.hadoop.hbase.NotServingRegionException => e
       $LOG.info("Failed scan of " + e.message)
     end
 {noformat}





[jira] [Commented] (HBASE-5511) More doc on maven release process

2012-03-04 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13222120#comment-13222120
 ] 

Hudson commented on HBASE-5511:
---

Integrated in HBase-0.92-security #96 (See 
[https://builds.apache.org/job/HBase-0.92-security/96/])
HBASE-5511 More doc on maven release process (Revision 1296318)

 Result = FAILURE
stack : 
Files : 
* /hbase/branches/0.92/pom.xml


 More doc on maven release process
 -

 Key: HBASE-5511
 URL: https://issues.apache.org/jira/browse/HBASE-5511
 Project: HBase
  Issue Type: Task
Reporter: stack
Assignee: stack
 Fix For: 0.92.1, 0.94.0

 Attachments: doc.txt








[jira] [Commented] (HBASE-5481) Uncaught UnknownHostException prevents HBase from starting

2012-03-04 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13222126#comment-13222126
 ] 

Hudson commented on HBASE-5481:
---

Integrated in HBase-0.92-security #96 (See 
[https://builds.apache.org/job/HBase-0.92-security/96/])
HBASE-5481 Uncaught UnknownHostException prevents HBase from starting 
(Revision 1294218)

 Result = FAILURE
stack : 
Files : 
* /hbase/branches/0.92/CHANGES.txt
* 
/hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/catalog/CatalogTracker.java


 Uncaught UnknownHostException prevents HBase from starting
 --

 Key: HBASE-5481
 URL: https://issues.apache.org/jira/browse/HBASE-5481
 Project: HBase
  Issue Type: Bug
  Components: regionserver
Affects Versions: 0.92.0
Reporter: Benoit Sigoure
Assignee: Benoit Sigoure
 Fix For: 0.92.1, 0.94.0

 Attachments: 
 0001-Properly-handle-UnknownHostException-when-checking-M.patch, 
 5481-trunk.txt


 If a host gets decommissioned and its hostname no longer resolves, and it was 
 previously hosting ROOT or META, HBase won't be able to start up.  This 
 easily happens when moving across networks (e.g. developing HBase on a 
 laptop), but can also happen during cluster-wide maintenance where HBase is 
 shut down, then one or more nodes get decommissioned such that their 
 hostnames no longer resolve.
 {code}
 2012-02-26 20:05:48,339 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Assigning region 
 -ROOT-,,0.70236052 to nowwhat.tsunanet.net,54092,1330315542087
 [...]
 2012-02-26 20:05:48,456 INFO org.apache.hadoop.hbase.regionserver.HRegion: 
 Onlined -ROOT-,,0.70236052; next sequenceid=268
 2012-02-26 20:05:48,456 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: 
 regionserver:54092-0x135bcfbb0580001 Attempting to transition node 
 70236052/-ROOT- from RS_ZK_REGION_OPENING to RS_ZK_REGION_OPENING
 2012-02-26 20:05:48,458 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: 
 regionserver:54092-0x135bcfbb0580001 Successfully transitioned node 70236052 
 from RS_ZK_REGION_OPENING to RS_ZK_REGION_OPENING
 2012-02-26 20:05:48,459 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Handling 
 transition=RS_ZK_REGION_OPENING, 
 server=nowwhat.tsunanet.net,54092,1330315542087, region=70236052/-ROOT-
 2012-02-26 20:05:48,459 INFO 
 org.apache.hadoop.hbase.regionserver.HRegionServer: Post open deploy tasks 
 for region=-ROOT-,,0.70236052, daughter=false
 2012-02-26 20:05:48,460 INFO 
 org.apache.hadoop.hbase.catalog.RootLocationEditor: Setting ROOT region 
 location in ZooKeeper as nowwhat.tsunanet.net,54092,1330315542087
 2012-02-26 20:05:48,466 INFO 
 org.apache.hadoop.hbase.regionserver.HRegionServer: Done with post open 
 deploy task for region=-ROOT-,,0.70236052, daughter=false
 2012-02-26 20:05:48,466 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: 
 regionserver:54092-0x135bcfbb0580001 Attempting to transition node 
 70236052/-ROOT- from RS_ZK_REGION_OPENING to RS_ZK_REGION_OPENED
 2012-02-26 20:05:48,468 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: 
 regionserver:54092-0x135bcfbb0580001 Successfully transitioned node 70236052 
 from RS_ZK_REGION_OPENING to RS_ZK_REGION_OPENED
 2012-02-26 20:05:48,468 DEBUG 
 org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler: region 
 transitioned to opened in zookeeper: {NAME => '-ROOT-,,0', STARTKEY => '', 
 ENDKEY => '', ENCODED => 70236052,}, server: 
 nowwhat.tsunanet.net,54092,1330315542087
 2012-02-26 20:05:48,468 DEBUG 
 org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler: Opened 
 -ROOT-,,0.70236052 on server:nowwhat.tsunanet.net,54092,1330315542087
 2012-02-26 20:05:48,468 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Handling 
 transition=RS_ZK_REGION_OPENED, 
 server=nowwhat.tsunanet.net,54092,1330315542087, region=70236052/-ROOT-
 2012-02-26 20:05:48,470 INFO 
 org.apache.hadoop.hbase.master.handler.OpenedRegionHandler: Handling OPENED 
 event for -ROOT-,,0.70236052 from nowwhat.tsunanet.net,54092,1330315542087; 
 deleting unassigned node
 2012-02-26 20:05:48,470 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: 
 master:54081-0x135bcfbb058 Deleting existing unassigned node for 70236052 
 that is in expected state RS_ZK_REGION_OPENED
 2012-02-26 20:05:48,472 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: The znode of region 
 -ROOT-,,0.70236052 has been deleted.
 2012-02-26 20:05:48,472 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: 
 master:54081-0x135bcfbb058 Successfully deleted unassigned node for 
 region 70236052 in expected state RS_ZK_REGION_OPENED
 2012-02-26 20:05:48,472 INFO 
 org.apache.hadoop.hbase.master.AssignmentManager: The master has opened the 
 region -ROOT-,,0.70236052 that was online on 
 

[jira] [Commented] (HBASE-5491) Remove HBaseConfiguration.create() call from coprocessor.Exec class

2012-03-04 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5491?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13222123#comment-13222123
 ] 

Hudson commented on HBASE-5491:
---

Integrated in HBase-0.92-security #96 (See 
[https://builds.apache.org/job/HBase-0.92-security/96/])
HBASE-5491 Remove HBaseConfiguration.create() call from coprocessor.Exec 
class (Revision 1295430)

 Result = FAILURE
stack : 
Files : 
* /hbase/branches/0.92/CHANGES.txt
* 
/hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/client/coprocessor/Exec.java
* 
/hbase/branches/0.92/src/test/java/org/apache/hadoop/hbase/coprocessor/TestCoprocessorEndpoint.java


 Remove HBaseConfiguration.create() call from coprocessor.Exec class
 ---

 Key: HBASE-5491
 URL: https://issues.apache.org/jira/browse/HBASE-5491
 Project: HBase
  Issue Type: Improvement
  Components: coprocessors
Affects Versions: 0.92.0
 Environment: all
Reporter: honghua zhu
 Fix For: 0.92.1

 Attachments: HBASE-5491-2.patch, HBASE-5491.patch


 The Exec class has a field: private Configuration conf = 
 HBaseConfiguration.create()
 The client side creates an Exec instance for each statistics request 
 initiated through ExecRPCInvoker, so HBaseConfiguration.create() is called 
 for every request.
 When the server side deserializes the Exec, HBaseConfiguration.create() is 
 called once more.
 HBaseConfiguration.create() is a time-consuming operation.
 private Configuration conf = HBaseConfiguration.create();
 This initialization is only useful for test code 
 (org.apache.hadoop.hbase.coprocessor.TestCoprocessorEndpoint.testExecDeserialization);
 every other place that uses the Exec class passes a Configuration in,
 so there is no need to give the conf field a default value.
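
One way to avoid paying for an expensive default on every instance is to build it only when nobody supplied a value. The sketch below is a generic illustration under that assumption, not the change made in the attached patches: LazyConfHolder and its fallback supplier are hypothetical names, and the type parameter C stands in for a configuration object.

```java
import java.util.function.Supplier;

// Hypothetical holder: the expensive default is created lazily, and only
// if no configuration was passed in before the first read.
class LazyConfHolder<C> {
    private C conf;                      // deliberately no eager default
    private final Supplier<C> fallback;  // expensive factory, used at most once

    LazyConfHolder(Supplier<C> fallback) {
        this.fallback = fallback;
    }

    void setConf(C conf) {
        this.conf = conf;
    }

    C getConf() {
        if (conf == null) {
            conf = fallback.get();       // cost paid only when really needed
        }
        return conf;
    }
}
```

Callers that always pass a configuration in never trigger the fallback, so constructing or deserializing an instance stays cheap.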





[jira] [Commented] (HBASE-5351) hbase completebulkload to a new table fails in a race

2012-03-04 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13222122#comment-13222122
 ] 

Hudson commented on HBASE-5351:
---

Integrated in HBase-0.92-security #96 (See 
[https://builds.apache.org/job/HBase-0.92-security/96/])
HBASE-5351 hbase completebulkload to a new table fails in a race (Revision 
1293480)

 Result = FAILURE
stack : 
Files : 
* /hbase/branches/0.92/CHANGES.txt
* 
/hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java
* 
/hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/mapreduce/LoadIncrementalHFiles.java


 hbase completebulkload to a new table fails in a race
 -

 Key: HBASE-5351
 URL: https://issues.apache.org/jira/browse/HBASE-5351
 Project: HBase
  Issue Type: Bug
  Components: mapreduce
Affects Versions: 0.92.0, 0.94.0
Reporter: Gregory Chanan
Assignee: Gregory Chanan
 Fix For: 0.92.1, 0.94.0

 Attachments: HBASE-5351-v1.patch, HBASE-5351-v2.patch, 
 HBASE-5351-v2.patch, HBASE-5351.patch


 I have a test that tests vanilla use of importtsv with importtsv.bulk.output 
 option followed by completebulkload to a new table.
 This sometimes fails as follows:
 11/12/19 15:02:39 WARN client.HConnectionManager$HConnectionImplementation: 
 Encountered problems when prefetch META table:
 org.apache.hadoop.hbase.TableNotFoundException: Cannot find row in .META. for 
 table: ml_items_copy, row=ml_items_copy,,99
 at org.apache.hadoop.hbase.client.MetaScanner.metaScan(MetaScanner.java:157)
 at org.apache.hadoop.hbase.client.MetaScanner.access$000(MetaScanner.java:52)
 at org.apache.hadoop.hbase.client.MetaScanner$1.connect(MetaScanner.java:130)
 at org.apache.hadoop.hbase.client.MetaScanner$1.connect(MetaScanner.java:127)
 at 
 org.apache.hadoop.hbase.client.HConnectionManager.execute(HConnectionManager.java:359)
 at org.apache.hadoop.hbase.client.MetaScanner.metaScan(MetaScanner.java:127)
 at org.apache.hadoop.hbase.client.MetaScanner.metaScan(MetaScanner.java:103)
 at 
 org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.prefetchRegionCache(HConnectionManager.java:875)
 at 
 org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegionInMeta(HConnectionManager.java:929)
 at 
 org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:817)
 at 
 org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:781)
 at org.apache.hadoop.hbase.client.HTable.finishSetup(HTable.java:247)
 at org.apache.hadoop.hbase.client.HTable.<init>(HTable.java:211)
 at org.apache.hadoop.hbase.client.HTable.<init>(HTable.java:171)
 at 
 org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles.createTable(LoadIncrementalHFiles.java:673)
 at 
 org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles.run(LoadIncrementalHFiles.java:697)
 at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:69)
 at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:83)
 at 
 org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles.main(LoadIncrementalHFiles.java:707)
 The race appears to be calling HbAdmin.createTableAsync(htd, keys) and then 
 creating an HTable object before that call has actually completed.
 The following change to 
 /src/main/java/org/apache/hadoop/hbase/mapreduce/LoadIncrementalHFiles.java 
 appears to fix the problem, but I have not been able to reproduce the race 
 reliably, in order to write a test.
 {code}
 -HTable table = new HTable(this.cfg, tableName);
 -
 -HConnection conn = table.getConnection();
  int ctr = 0;
 -while (!conn.isTableAvailable(table.getTableName()) && 
 (ctr<TABLE_CREATE_MA
 +while (!this.hbAdmin.isTableAvailable(tableName) && 
 (ctr<TABLE_CREATE_MAX_R
 {code}
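
The idea of the change, waiting for an asynchronous create to finish before using its result, can be sketched as a bounded polling loop. This is a minimal illustration that does not use the HBase API: TableAvailabilityWait, the BooleanSupplier probe, and the maxRetries and sleepMillis parameters are all assumptions made for the example.

```java
import java.util.function.BooleanSupplier;

// Hypothetical helper: poll an availability probe a bounded number of times
// before giving up, instead of using the resource right after an async create.
class TableAvailabilityWait {
    static boolean waitUntilAvailable(BooleanSupplier available, int maxRetries, long sleepMillis) {
        for (int ctr = 0; ctr < maxRetries; ctr++) {
            if (available.getAsBoolean()) {
                return true;
            }
            try {
                Thread.sleep(sleepMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();  // preserve interrupt status
                return false;
            }
        }
        // One last probe after the final sleep.
        return available.getAsBoolean();
    }
}
```

Only once this returns true would the caller go on to open the table; the probe itself must not require the table to exist already.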





[jira] [Created] (HBASE-5519) Incorrect warning in splitlogmanager

2012-03-04 Thread Prakash Khemani (Created) (JIRA)
Incorrect warning in splitlogmanager


 Key: HBASE-5519
 URL: https://issues.apache.org/jira/browse/HBASE-5519
 Project: HBase
  Issue Type: Improvement
Reporter: Prakash Khemani


Because of recently added behavior, where the splitlogmanager timeout thread 
gets data from the zk node just to check that the zk node is there, we might 
have multiple watches firing without the task znode expiring.

Remove the poor warning message. (Internally, there was an assert that failed 
in Mikhail's tests.)





[jira] [Created] (HBASE-5518) Incorrect warning in splitlogmanager

2012-03-04 Thread Prakash Khemani (Created) (JIRA)
Incorrect warning in splitlogmanager


 Key: HBASE-5518
 URL: https://issues.apache.org/jira/browse/HBASE-5518
 Project: HBase
  Issue Type: Improvement
Reporter: Prakash Khemani


Because of recently added behavior, where the splitlogmanager timeout thread 
gets data from the zk node just to check that the zk node is there, we might 
have multiple watches firing without the task znode expiring.

Remove the poor warning message. (Internally, there was an assert that failed 
in Mikhail's tests.)





[jira] [Commented] (HBASE-4348) Add metrics for regions in transition

2012-03-04 Thread Himanshu Vashishtha (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13222173#comment-13222173
 ] 

Himanshu Vashishtha commented on HBASE-4348:


I have done some analysis and have some questions regarding the metrics here. 
Please correct me if I am wrong.

RegionServer updates its metrics at fixed intervals defined by 
hbase.regionserver.msginterval, and the metrics are pushed to the record after 
the specified interval. 
The master doesn't use this methodology; rather, whenever the master statistics 
change, the caller is supposed to update the metrics value on its own.

I am wondering why we don't follow the same approach for the master (because the 
stats we are recording are not changing that often?). The reason I am asking is 
that the regionsInTransition map is mutated at a bunch of places in the 
AssignmentManager class; so either I should invoke the method to compute and 
push the stats at all such call sites, or maybe we can have a similar approach 
to the one used by RegionServer.

 Add metrics for regions in transition
 -

 Key: HBASE-4348
 URL: https://issues.apache.org/jira/browse/HBASE-4348
 Project: HBase
  Issue Type: Improvement
  Components: metrics
Affects Versions: 0.92.0
Reporter: Todd Lipcon
Assignee: Himanshu Vashishtha
Priority: Minor
  Labels: noob
 Attachments: 4348-v1.patch, 4348-v2.patch, RITs.png


 The following metrics would be useful for monitoring the master:
 - the number of regions in transition
 - the number of regions in transition that have been in transition for more 
 than a minute
 - how many seconds has the oldest region-in-transition been in transition
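
The three metrics listed above can be derived in one pass over a snapshot of the regions in transition. The sketch below is a rough illustration only: RitMetricsSketch, the map from region name to transition start time in milliseconds, and the threshold parameter are assumptions for the example, not the master's actual data structures.

```java
import java.util.Map;

// Hypothetical snapshot of region-in-transition (RIT) metrics.
class RitMetricsSketch {
    final int total;              // number of regions in transition
    final int overThreshold;      // regions in transition longer than the threshold
    final long oldestAgeSeconds;  // age of the oldest region in transition

    private RitMetricsSketch(int total, int overThreshold, long oldestAgeSeconds) {
        this.total = total;
        this.overThreshold = overThreshold;
        this.oldestAgeSeconds = oldestAgeSeconds;
    }

    // startMillisByRegion maps a region name to the time its transition began.
    static RitMetricsSketch compute(Map<String, Long> startMillisByRegion,
                                    long nowMillis, long thresholdMillis) {
        int over = 0;
        long oldestAgeMillis = 0;
        for (long start : startMillisByRegion.values()) {
            long age = nowMillis - start;
            if (age > thresholdMillis) {
                over++;
            }
            oldestAgeMillis = Math.max(oldestAgeMillis, age);
        }
        return new RitMetricsSketch(startMillisByRegion.size(), over, oldestAgeMillis / 1000);
    }
}
```

Computing from a snapshot like this fits a periodic update; pushing on every mutation would instead need a call at each place the map changes.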





[jira] [Commented] (HBASE-4348) Add metrics for regions in transition

2012-03-04 Thread Himanshu Vashishtha (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13222174#comment-13222174
 ] 

Himanshu Vashishtha commented on HBASE-4348:


I have done some analysis and have some questions regarding the metrics here. 
Please correct me if I am wrong.

RegionServer updates its metrics at fixed intervals defined by 
hbase.regionserver.msginterval, and the metrics are pushed to the record after 
the specified interval. 
The master doesn't use this methodology; rather, whenever the master statistics 
change, the caller is supposed to update the metrics value on its own.

I am wondering why we don't follow the same approach for the master (because the 
stats we are recording are not changing that often?). The reason I am asking is 
that the regionsInTransition map is mutated at a bunch of places in the 
AssignmentManager class; so either I should invoke the method to compute and 
push the stats at all such call sites, or maybe we can have a similar approach 
to the one used by RegionServer.

 Add metrics for regions in transition
 -

 Key: HBASE-4348
 URL: https://issues.apache.org/jira/browse/HBASE-4348
 Project: HBase
  Issue Type: Improvement
  Components: metrics
Affects Versions: 0.92.0
Reporter: Todd Lipcon
Assignee: Himanshu Vashishtha
Priority: Minor
  Labels: noob
 Attachments: 4348-v1.patch, 4348-v2.patch, RITs.png


 The following metrics would be useful for monitoring the master:
 - the number of regions in transition
 - the number of regions in transition that have been in transition for more 
 than a minute
 - how many seconds has the oldest region-in-transition been in transition





[jira] [Commented] (HBASE-5515) Add a processRow API that supports atomic multiple reads and writes on a row

2012-03-04 Thread Phabricator (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5515?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13222175#comment-13222175
 ] 

Phabricator commented on HBASE-5515:


lhofhansl has commented on the revision HBASE-5515 [jira] Add a processRow API 
that supports atomic multiple reads and writes on a row.

  Thanks sc (Scott).
  This approach makes more sense to me. I'll review in more detail tomorrow.

REVISION DETAIL
  https://reviews.facebook.net/D2067


 Add a processRow API that supports atomic multiple reads and writes on a row
 

 Key: HBASE-5515
 URL: https://issues.apache.org/jira/browse/HBASE-5515
 Project: HBase
  Issue Type: New Feature
Reporter: Scott Chen
Assignee: Scott Chen
 Attachments: HBASE-5515.D2067.1.patch, HBASE-5515.D2067.2.patch, 
 HBASE-5515.D2067.3.patch, HBASE-5515.D2067.4.patch, HBASE-5515.D2067.5.patch, 
 HBASE-5515.D2067.6.patch, HBASE-5515.D2067.7.patch, HBASE-5515.D2067.8.patch


 We have modified HRegion.java internally to do some atomic row processing. It 
 would be nice to have a pluggable API for this.





[jira] [Commented] (HBASE-5512) Add support for INCLUDE_AND_SEEK_USING_HINT

2012-03-04 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13222180#comment-13222180
 ] 

Lars Hofhansl commented on HBASE-5512:
--

Let's wait for more input in HBASE-2038. I'm feeling halfway inclined to -1 my 
own patch :)
Imagine we wanted to add INCLUDE_AND_NEXT_ROW as well as INCLUDE_AND_NEXT_COL 
to filter. These outcomes would have to be merged with whatever the column 
tracker returns.
It's not complicated, but will make the code hard to read with questionable 
gain (as I say in HBASE-2038). I think Anoop wants to do some performance 
testing.

 Add support for INCLUDE_AND_SEEK_USING_HINT
 ---

 Key: HBASE-5512
 URL: https://issues.apache.org/jira/browse/HBASE-5512
 Project: HBase
  Issue Type: Improvement
Reporter: Zhihong Yu
Assignee: Lars Hofhansl
 Attachments: 5512-v2.txt, 5512.txt


 This came up from HBASE-2038
 From Anoop:
 - What we wanted from the filter is to include a row and then seek to the next 
 row which we are interested in. I can't see such a facility with our Filter 
 right now. Correct me if I am wrong. So suppose we have already seeked to one 
 row and this needs to be included in the result, then the Filter should return 
 INCLUDE. Only when the next next() call happens can we return a 
 SEEK_USING_HINT. So one extra row read is needed. This might even cause 
 one unwanted HFileBlock fetch (who knows).
 Can we add reseek() at a higher level?
 From Lars:
 Yep, for that we'd need to add INCLUDE_AND_SEEK_USING_HINT (similar to the 
 INCLUDE_AND_SEEK_NEXT_ROW that we already have). Shouldn't be hard to add, 
 I'm happy to do that, if that's the route we want to go with this.
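
The extra read described above can be made concrete with a toy scan. This is a simplified model, not HBase's Filter or scanner code: SeekHintSketch works on integer row numbers rather than KeyValues, and the scan method and its parameters are hypothetical.

```java
import java.util.TreeSet;

// Toy model: rows are the integers 0..rowCount-1 and the filter wants only
// the rows in `wanted`. We count how many rows the scanner has to examine.
class SeekHintSketch {
    enum ReturnCode { INCLUDE, SEEK_USING_HINT, INCLUDE_AND_SEEK_USING_HINT }

    // Returns {rowsExamined, rowsIncluded}.
    static int[] scan(int rowCount, TreeSet<Integer> wanted, boolean combined) {
        int examined = 0;
        int included = 0;
        int row = 0;
        while (row < rowCount) {
            examined++;
            ReturnCode rc = wanted.contains(row)
                    ? (combined ? ReturnCode.INCLUDE_AND_SEEK_USING_HINT : ReturnCode.INCLUDE)
                    : ReturnCode.SEEK_USING_HINT;
            Integer hint = wanted.higher(row);            // next wanted row, or null
            int seekTarget = (hint == null) ? rowCount : hint;
            switch (rc) {
                case INCLUDE:                             // must step to the next row
                    included++;
                    row++;
                    break;
                case SEEK_USING_HINT:                     // skip ahead, nothing included
                    row = seekTarget;
                    break;
                case INCLUDE_AND_SEEK_USING_HINT:         // include and skip in one step
                    included++;
                    row = seekTarget;
                    break;
            }
        }
        return new int[] {examined, included};
    }
}
```

With plain INCLUDE the scanner has to look at the row after each match just to be told where to seek; the combined code removes that step in this model.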





[jira] [Issue Comment Edited] (HBASE-5512) Add support for INCLUDE_AND_SEEK_USING_HINT

2012-03-04 Thread Lars Hofhansl (Issue Comment Edited) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13222180#comment-13222180
 ] 

Lars Hofhansl edited comment on HBASE-5512 at 3/5/12 6:36 AM:
--

Let's wait for more input in HBASE-2038. I'm feeling half way inclined to -1 my 
own patch :)
Imagine we wanted to add INCLUDE_AND_NEXT_ROW as well as INCLUDE_AND_NEXT_COL 
to filter. These outcomes would have to be merged with whatever the column 
tracker returns.
It's not complicated, but will make the code hard to read with questionable 
gain (as I say in HBASE-2038). I think Anoop wants to do some performance 
testing.

  was (Author: lhofhansl):
Let's wait for more input in HBASE-2038. I'm feeling half way include to -1 
my own patch :)
Imagine we wanted to add INCLUDE_AND_NEXT_ROW as well as INCLUDE_AND_NEXT_COL 
to filter. These outcomes would have to merged with whatever the colum tracker 
returns.
It's not complicated, but will make the code hard to read with questionable 
gain (as I say in HBASE-2038). I think Anoop wants to do some performance 
testing.
  
 Add support for INCLUDE_AND_SEEK_USING_HINT
 ---

 Key: HBASE-5512
 URL: https://issues.apache.org/jira/browse/HBASE-5512
 Project: HBase
  Issue Type: Improvement
Reporter: Zhihong Yu
Assignee: Lars Hofhansl
 Attachments: 5512-v2.txt, 5512.txt


 This came up from HBASE-2038
 From Anoop:
 - What we wanted from the filter is to include a row and then seek to the next 
 row which we are interested in. I can't see such a facility with our Filter 
 right now. Correct me if I am wrong. So suppose we have already seeked to one 
 row and this needs to be included in the result, then the Filter should return 
 INCLUDE. Only when the next next() call happens can we return a 
 SEEK_USING_HINT. So one extra row read is needed. This might even cause 
 one unwanted HFileBlock fetch (who knows).
 Can we add reseek() at a higher level?
 From Lars:
 Yep, for that we'd need to add INCLUDE_AND_SEEK_USING_HINT (similar to the 
 INCLUDE_AND_SEEK_NEXT_ROW that we already have). Shouldn't be hard to add, 
 I'm happy to do that, if that's the route we want to go with this.





[jira] [Updated] (HBASE-5074) support checksums in HBase block cache

2012-03-04 Thread Lars Hofhansl (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5074?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-5074:
-

Fix Version/s: 0.94.0

Marking this for 0.94

 support checksums in HBase block cache
 --

 Key: HBASE-5074
 URL: https://issues.apache.org/jira/browse/HBASE-5074
 Project: HBase
  Issue Type: Improvement
  Components: regionserver
Reporter: dhruba borthakur
Assignee: dhruba borthakur
 Fix For: 0.94.0

 Attachments: D1521.1.patch, D1521.1.patch, D1521.10.patch, 
 D1521.10.patch, D1521.10.patch, D1521.10.patch, D1521.10.patch, 
 D1521.11.patch, D1521.11.patch, D1521.12.patch, D1521.12.patch, 
 D1521.2.patch, D1521.2.patch, D1521.3.patch, D1521.3.patch, D1521.4.patch, 
 D1521.4.patch, D1521.5.patch, D1521.5.patch, D1521.6.patch, D1521.6.patch, 
 D1521.7.patch, D1521.7.patch, D1521.8.patch, D1521.8.patch, D1521.9.patch, 
 D1521.9.patch


 The current implementation of HDFS stores the data in one block file and the 
 metadata (checksum) in another block file. This means that every read into the 
 HBase block cache actually consumes two disk iops, one to the datafile and 
 one to the checksum file. This is a major problem for scaling HBase, because 
 HBase is usually bottlenecked on the number of random disk iops that the 
 storage-hardware offers.
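
A rough sketch of the general idea, keeping the checksum next to the data so that a single read brings back both. This is an illustration under stated assumptions, not the format used by the attached patches: InlineChecksumSketch, the 8-byte trailer layout, and the choice of CRC32 are all made up for the example.

```java
import java.nio.ByteBuffer;
import java.util.zip.CRC32;

// Hypothetical block layout: [data bytes][8-byte CRC32 value of the data].
class InlineChecksumSketch {
    // Append the checksum to the data so both travel in one block.
    static byte[] pack(byte[] data) {
        CRC32 crc = new CRC32();
        crc.update(data, 0, data.length);
        return ByteBuffer.allocate(data.length + 8)
                .put(data)
                .putLong(crc.getValue())
                .array();
    }

    // Verify a block read in one piece: recompute and compare the trailer.
    static boolean verify(byte[] block) {
        if (block.length < 8) {
            return false;
        }
        int dataLength = block.length - 8;
        CRC32 crc = new CRC32();
        crc.update(block, 0, dataLength);
        long stored = ByteBuffer.wrap(block, dataLength, 8).getLong();
        return crc.getValue() == stored;
    }
}
```

Because the trailer is part of the block, verification needs no second file and hence no second disk operation.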





[jira] [Updated] (HBASE-5270) Handle potential data loss due to concurrent processing of processFaileOver and ServerShutdownHandler

2012-03-04 Thread chunhui shen (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5270?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

chunhui shen updated HBASE-5270:


Attachment: hbase-5270v10.patch

 Handle potential data loss due to concurrent processing of processFaileOver 
 and ServerShutdownHandler
 -

 Key: HBASE-5270
 URL: https://issues.apache.org/jira/browse/HBASE-5270
 Project: HBase
  Issue Type: Sub-task
  Components: master
Reporter: Zhihong Yu
Assignee: chunhui shen
 Fix For: 0.92.2

 Attachments: 5270-90-testcase.patch, 5270-90-testcasev2.patch, 
 5270-90.patch, 5270-90v2.patch, 5270-90v3.patch, 5270-testcase.patch, 
 5270-testcasev2.patch, hbase-5270.patch, hbase-5270v10.patch, 
 hbase-5270v2.patch, hbase-5270v4.patch, hbase-5270v5.patch, 
 hbase-5270v6.patch, hbase-5270v7.patch, hbase-5270v8.patch, 
 hbase-5270v9.patch, sampletest.txt


 This JIRA continues the effort from HBASE-5179. Starting with Stack's 
 comments about patches for 0.92 and TRUNK:
 Reviewing 0.92v17
 isDeadServerInProgress is a new public method in ServerManager but it does 
 not seem to be used anywhere.
 Does isDeadRootServerInProgress need to be public? Ditto for meta version.
 These method param names are not right: 'definitiveRootServer'; what is meant 
 by definitive? Do they need this qualifier?
 Is there anything in place to stop us expiring a server twice if it's carrying 
 root and meta?
 What is difference between asking assignment manager isCarryingRoot and this 
 variable that is passed in? Should be doc'd at least. Ditto for meta.
 I think I've asked for this a few times - onlineServers needs to be 
 explained... either in javadoc or in comment. This is the param passed into 
 joinCluster. How does it arise? I think I know but am unsure. God love the 
 poor noob that comes awandering this code trying to make sense of it all.
 It looks like we get the list by trawling zk for regionserver znodes that 
 have not checked in. Don't we do this operation earlier in master setup? Are 
 we doing it again here?
 Though distributed split log is configured, we will do in master single 
 process splitting under some conditions with this patch. It's not explained in 
 code why we would do this. Why do we think master log splitting is 'high 
 priority' when it could very well be slower? Should we only go this route if 
 distributed splitting is not going on? Do we know if concurrent distributed 
 log splitting and master splitting works?
 Why would we have dead servers in progress here in master startup? Because a 
 servershutdownhandler fired?
 This patch is different to the patch for 0.90. Should go into trunk first 
 with tests, then 0.92. Should it be in this issue? This issue is really hard 
 to follow now. Maybe this issue is for 0.90.x and new issue for more work on 
 this trunk patch?
 This patch needs to have the v18 differences applied.





[jira] [Commented] (HBASE-5270) Handle potential data loss due to concurrent processing of processFaileOver and ServerShutdownHandler

2012-03-04 Thread chunhui shen (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13222188#comment-13222188
 ] 

chunhui shen commented on HBASE-5270:
-

Patch v10 incorporates Ted's review comments on r/4021. (It also passes the 
previously failing tests.)
As for the logic, I think it is sound now. Waiting for other suggestions.

 Handle potential data loss due to concurrent processing of processFaileOver 
 and ServerShutdownHandler
 -

 Key: HBASE-5270
 URL: https://issues.apache.org/jira/browse/HBASE-5270
 Project: HBase
  Issue Type: Sub-task
  Components: master
Reporter: Zhihong Yu
Assignee: chunhui shen
 Fix For: 0.92.2

 Attachments: 5270-90-testcase.patch, 5270-90-testcasev2.patch, 
 5270-90.patch, 5270-90v2.patch, 5270-90v3.patch, 5270-testcase.patch, 
 5270-testcasev2.patch, hbase-5270.patch, hbase-5270v10.patch, 
 hbase-5270v2.patch, hbase-5270v4.patch, hbase-5270v5.patch, 
 hbase-5270v6.patch, hbase-5270v7.patch, hbase-5270v8.patch, 
 hbase-5270v9.patch, sampletest.txt


 This JIRA continues the effort from HBASE-5179. Starting with Stack's 
 comments about patches for 0.92 and TRUNK:
 Reviewing 0.92v17
 isDeadServerInProgress is a new public method in ServerManager but it does 
 not seem to be used anywhere.
 Does isDeadRootServerInProgress need to be public? Ditto for meta version.
 These method param names are not right: 'definitiveRootServer'; what is meant 
 by definitive? Do they need this qualifier?
 Is there anything in place to stop us expiring a server twice if it's carrying 
 root and meta?
 What is difference between asking assignment manager isCarryingRoot and this 
 variable that is passed in? Should be doc'd at least. Ditto for meta.
 I think I've asked for this a few times - onlineServers needs to be 
 explained... either in javadoc or in comment. This is the param passed into 
 joinCluster. How does it arise? I think I know but am unsure. God love the 
 poor noob that comes awandering this code trying to make sense of it all.
 It looks like we get the list by trawling zk for regionserver znodes that 
 have not checked in. Don't we do this operation earlier in master setup? Are 
 we doing it again here?
 Though distributed split log is configured, we will do in master single 
 process splitting under some conditions with this patch. It's not explained in 
 code why we would do this. Why do we think master log splitting is 'high 
 priority' when it could very well be slower? Should we only go this route if 
 distributed splitting is not going on? Do we know if concurrent distributed 
 log splitting and master splitting works?
 Why would we have dead servers in progress here in master startup? Because a 
 servershutdownhandler fired?
 This patch is different to the patch for 0.90. Should go into trunk first 
 with tests, then 0.92. Should it be in this issue? This issue is really hard 
 to follow now. Maybe this issue is for 0.90.x and new issue for more work on 
 this trunk patch?
 This patch needs to have the v18 differences applied.
