[jira] [Updated] (HBASE-4095) Hlog may not be rolled in a long time if checkLowReplication's request of LogRoll is blocked

2011-08-17 Thread Jieshan Bean (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4095?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jieshan Bean updated HBASE-4095:


Attachment: HBase-4095-V9-trunk.patch
HBase-4095-V9-branch.patch

 Hlog may not be rolled in a long time if checkLowReplication's request of 
 LogRoll is blocked
 

 Key: HBASE-4095
 URL: https://issues.apache.org/jira/browse/HBASE-4095
 Project: HBase
  Issue Type: Bug
  Components: regionserver
Affects Versions: 0.90.3
Reporter: Jieshan Bean
Assignee: Jieshan Bean
 Fix For: 0.90.5

 Attachments: HBASE-4095-90-v2.patch, HBASE-4095-90.patch, 
 HBASE-4095-trunk-v2.patch, HBASE-4095-trunk.patch, 
 HBase-4095-V4-Branch.patch, HBase-4095-V5-Branch.patch, 
 HBase-4095-V5-trunk.patch, HBase-4095-V6-branch.patch, 
 HBase-4095-V6-trunk.patch, HBase-4095-V7-branch.patch, 
 HBase-4095-V7-trunk.patch, HBase-4095-V8-branch.patch, 
 HBase-4095-V8-trunk.patch, HBase-4095-V9-branch.patch, 
 HBase-4095-V9-trunk.patch, HlogFileIsVeryLarge.gif, 
 LatestLLTResults-20110810.rar, RelatedLogs2011-07-28.txt, 
 TestResultForPatch-V4.rar, flowChart-IntroductionToThePatch.gif, 
 hbase-root-regionserver-193-195-5-111.rar, surefire-report-V5-trunk.html, 
 surefire-report-branch.html


 Some large Hlog files (larger than 10G) appeared in our environment, and I 
 found the reason they grew so huge:
 1. The replica count is less than the expected number, so 
 checkLowReplication is called on each sync.
 2. checkLowReplication requests a log roll and sets logRollRequested to 
 true: 
 {noformat}
 private void checkLowReplication() {
   // if the number of replicas in HDFS has fallen below the initial
   // value, then roll logs.
   try {
     int numCurrentReplicas = getLogReplication();
     if (numCurrentReplicas != 0 &&
         numCurrentReplicas < this.initialReplication) {
       LOG.warn("HDFS pipeline error detected. " +
         "Found " + numCurrentReplicas + " replicas but expecting " +
         this.initialReplication + " replicas. " +
         " Requesting close of hlog.");
       requestLogRoll();
       logRollRequested = true;
     }
   } catch (Exception e) {
     LOG.warn("Unable to invoke DFSOutputStream.getNumCurrentReplicas" + e +
       " still proceeding ahead...");
   }
 }
 {noformat}
 3. requestLogRoll() only submits the roll request. The roll may not execute 
 in time, because it must first acquire the unfair cacheFlushLock, which may 
 be held by the cache-flush threads.
 4. logRollRequested stays true until the log roll actually executes, so in 
 the meantime every roll request from sync() is skipped.
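 A minimal sketch (hypothetical class and method names, simplified from the 
 description above) of the flag behavior: once logRollRequested is set, 
 further low-replication checks issue no new roll requests until the roll 
 actually runs and resets the flag. If the roll is stuck waiting on 
 cacheFlushLock, nothing re-requests it in the meantime.

```java
// Illustrative only: condenses the skip-while-requested behavior into one
// class. Real HBase spreads this across HLog.sync()/checkLowReplication().
public class LogRollFlagSketch {
    private boolean logRollRequested = false;
    private int rollRequests = 0;

    // Called on every sync() when replication looks low.
    public void checkLowReplication(int numCurrentReplicas, int initialReplication) {
        if (!logRollRequested
                && numCurrentReplicas != 0
                && numCurrentReplicas < initialReplication) {
            requestLogRoll();
            logRollRequested = true;  // stays true until rollWriter() runs
        }
    }

    private void requestLogRoll() {
        // In HBase this only queues a request; executing the roll still
        // requires acquiring cacheFlushLock, which a flush may hold.
        rollRequests++;
    }

    // The actual roll resets the flag. If it is delayed, no further roll
    // requests are issued, and the hlog keeps growing.
    public void rollWriter() {
        logRollRequested = false;
    }

    public int getRollRequests() { return rollRequests; }
}
```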
 Here are the logs from when the problem happened (note the file size of 
 hlog 193-195-5-111%3A20020.1309937386639 in the last row):
 2011-07-06 15:28:59,284 WARN org.apache.hadoop.hbase.regionserver.wal.HLog: 
 HDFS pipeline error detected. Found 2 replicas but expecting 3 replicas.  
 Requesting close of hlog.
 2011-07-06 15:29:46,714 INFO org.apache.hadoop.hbase.regionserver.wal.HLog: 
 Roll 
 /hbase/.logs/193-195-5-111,20020,1309922880081/193-195-5-111%3A20020.1309937339119,
  entries=32434, filesize=239589754. New hlog 
 /hbase/.logs/193-195-5-111,20020,1309922880081/193-195-5-111%3A20020.1309937386639
 2011-07-06 15:29:56,929 WARN org.apache.hadoop.hbase.regionserver.wal.HLog: 
 HDFS pipeline error detected. Found 2 replicas but expecting 3 replicas.  
 Requesting close of hlog.
 2011-07-06 15:29:56,933 INFO org.apache.hadoop.hbase.regionserver.Store: 
 Renaming flushed file at 
 hdfs://193.195.5.112:9000/hbase/Htable_UFDR_034/a3780cf0c909d8cf8f8ed618b290cc95/.tmp/4656903854447026847
  to 
 hdfs://193.195.5.112:9000/hbase/Htable_UFDR_034/a3780cf0c909d8cf8f8ed618b290cc95/value/8603005630220380983
 2011-07-06 15:29:57,391 INFO org.apache.hadoop.hbase.regionserver.Store: 
 Added 
 hdfs://193.195.5.112:9000/hbase/Htable_UFDR_034/a3780cf0c909d8cf8f8ed618b290cc95/value/8603005630220380983,
  entries=445880, sequenceid=248900, memsize=207.5m, filesize=130.1m
 2011-07-06 15:29:57,478 INFO org.apache.hadoop.hbase.regionserver.HRegion: 
 Finished memstore flush of ~207.5m for region 
 Htable_UFDR_034,07664,1309936974158.a3780cf0c909d8cf8f8ed618b290cc95. in 
 10839ms, sequenceid=248900, compaction requested=false
 2011-07-06 15:28:59,236 INFO org.apache.hadoop.hbase.regionserver.wal.HLog: 
 Roll 
 /hbase/.logs/193-195-5-111,20020,1309922880081/193-195-5-111%3A20020.1309926531955,
  entries=216459, filesize=2370387468. New hlog 
 /hbase/.logs/193-195-5-111,20020,1309922880081/193-195-5-111%3A20020.1309937339119
 2011-07-06 15:29:46,714 INFO org.apache.hadoop.hbase.regionserver.wal.HLog: 
 Roll 
 /hbase/.logs/193-195-5-111,20020,1309922880081/193-195-5-111%3A20020.1309937339119,
  entries=32434, 

[jira] [Updated] (HBASE-4176) Exposing HBase Filters to the Thrift API

2011-08-17 Thread Anirudh Todi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4176?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anirudh Todi updated HBASE-4176:


Attachment: book.xml

Hi Stack - Thanks for the +1

I worked on book.xml for a while. Attached is what I have so far. Let me know 
if you feel I'm going in the right direction and I can polish it and finish it 
off then.

 Exposing HBase Filters to the Thrift API
 

 Key: HBASE-4176
 URL: https://issues.apache.org/jira/browse/HBASE-4176
 Project: HBase
  Issue Type: Improvement
  Components: thrift
Reporter: Anirudh Todi
Assignee: Anirudh Todi
Priority: Minor
 Attachments: Filter Language (3).xml, Filter Language(2).docx, Filter 
 Language(2).xml, Filter Language(3).docx, Filter Language.docx, 
 HBASE-4176.patch, book.xml


 Currently, to use any of the filters, one has to explicitly add a scanner 
 for the filter in the Thrift API, which is messy and verbose. With this 
 patch, I am trying to add support for all the filters in a clean way: the 
 user specifies a filter via a string, and the string is parsed on the 
 server to construct the filter. More information can be found in the 
 attached document named Filter Language.
 This patch extends the progress made by the patches in the HBASE-1744 JIRA 
 (https://issues.apache.org/jira/browse/HBASE-1744).
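 As a rough illustration of the parse-a-filter-from-a-string idea (the names 
 and grammar here are hypothetical, not the patch's actual API), a server 
 could split a spec like "PrefixFilter('row1')" into a filter name and 
 argument before constructing the filter object:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch: parse "FilterName('argument')" into its two parts.
// The real patch defines a richer filter language; see the attached doc.
public class FilterStringSketch {
    private static final Pattern FILTER =
        Pattern.compile("(\\w+)\\('([^']*)'\\)");

    public static String[] parse(String spec) {
        Matcher m = FILTER.matcher(spec.trim());
        if (!m.matches()) {
            throw new IllegalArgumentException("Bad filter spec: " + spec);
        }
        // group(1) = filter name, group(2) = quoted argument
        return new String[] { m.group(1), m.group(2) };
    }
}
```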

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-4211) Do init-sizing of the StringBuilder making a ServerName.

2011-08-17 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13086191#comment-13086191
 ] 

Hudson commented on HBASE-4211:
---

Integrated in HBase-TRUNK #2122 (See 
[https://builds.apache.org/job/HBase-TRUNK/2122/])
HBASE-4211 Do init-sizing of the StringBuilder making a ServerName

stack : 
Files : 
* /hbase/trunk/CHANGES.txt
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/ServerName.java


 Do init-sizing of the StringBuilder making a ServerName.
 

 Key: HBASE-4211
 URL: https://issues.apache.org/jira/browse/HBASE-4211
 Project: HBase
  Issue Type: Bug
Reporter: stack
Assignee: Benoit Sigoure
Priority: Minor
 Fix For: 0.92.0


 Simple patch from Benoît.
 ---
  .../java/org/apache/hadoop/hbase/ServerName.java   |    3 ++-
  1 files changed, 2 insertions(+), 1 deletions(-)

 diff --git a/src/main/java/org/apache/hadoop/hbase/ServerName.java 
 b/src/main/java/org/apache/hadoop/hbase/ServerName.java
 index 6b03832..4ddb5b7 100644
 --- a/src/main/java/org/apache/hadoop/hbase/ServerName.java
 +++ b/src/main/java/org/apache/hadoop/hbase/ServerName.java
 @@ -128,7 +128,8 @@ public class ServerName implements Comparable<ServerName> {
     * startcode formatted as <code>&lt;hostname> ',' &lt;port> ',' 
  &lt;startcode></code>
     */
    public static String getServerName(String hostName, int port, long 
  startcode) {
  -    StringBuilder name = new StringBuilder(hostName);
  +    final StringBuilder name = new StringBuilder(hostName.length() + 1 + 5 + 
  1 + 13);
  +    name.append(hostName);
      name.append(SERVERNAME_SEPARATOR);
      name.append(port);
      name.append(SERVERNAME_SEPARATOR);
 --
 1.7.6.434.g1d2b3
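 A standalone sketch of the capacity arithmetic in the patch: reserve room up 
 front for the hostname, two separators, up to 5 digits of port, and roughly 
 13 digits of startcode, so the StringBuilder never has to reallocate 
 mid-append. Class and constant names below are illustrative stand-ins for 
 the real ServerName code.

```java
// Illustrative re-creation of the patched method, self-contained.
public class ServerNameSketch {
    private static final String SERVERNAME_SEPARATOR = ",";

    public static String getServerName(String hostName, int port, long startcode) {
        // hostname + ',' + port (<= 5 digits) + ',' + startcode (~13 digits):
        // presizing avoids the builder growing (and copying) as we append.
        final StringBuilder name =
            new StringBuilder(hostName.length() + 1 + 5 + 1 + 13);
        name.append(hostName);
        name.append(SERVERNAME_SEPARATOR);
        name.append(port);
        name.append(SERVERNAME_SEPARATOR);
        name.append(startcode);
        return name.toString();
    }
}
```

 The saving is small per call, but getServerName is on hot paths, and a 
 StringBuilder seeded only with the hostname would typically grow at least 
 once while appending the port and startcode.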





[jira] [Created] (HBASE-4212) TestMasterFailover fails occasionally

2011-08-17 Thread gaojinchao (JIRA)
TestMasterFailover fails occasionally
-

 Key: HBASE-4212
 URL: https://issues.apache.org/jira/browse/HBASE-4212
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 0.90.4
Reporter: gaojinchao
 Fix For: 0.90.5


This looks like a bug: the root region stuck in RIT can't be moved. During 
failover the master forces root online but does not clean up the zk node, so 
the test waits forever.

  void processFailover() throws KeeperException, IOException, 
InterruptedException {
 
    // we enforce on-line root.
    HServerInfo hsi =
      this.serverManager.getHServerInfo(this.catalogTracker.getMetaLocation());
    regionOnline(HRegionInfo.FIRST_META_REGIONINFO, hsi);
    hsi =
      this.serverManager.getHServerInfo(this.catalogTracker.getRootLocation());
    regionOnline(HRegionInfo.ROOT_REGIONINFO, hsi);

It seems we should wait for the transition to finish, as is done for the meta 
region:

  int assignRootAndMeta()
  throws InterruptedException, IOException, KeeperException {
    int assigned = 0;
    long timeout = this.conf.getLong("hbase.catalog.verification.timeout",
      1000);

    // Work on ROOT region.  Is it in zk in transition?
    boolean rit = this.assignmentManager.
      processRegionInTransitionAndBlockUntilAssigned(HRegionInfo.ROOT_REGIONINFO);
    if (!catalogTracker.verifyRootRegionLocation(timeout)) {
      this.assignmentManager.assignRoot();
      this.catalogTracker.waitForRoot();

      // we need to add this code to guarantee that the transition has completed
      this.assignmentManager.waitForAssignment(HRegionInfo.ROOT_REGIONINFO);
      assigned++;
    }
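The essence of the proposed fix, blocking until the assignment has fully 
completed rather than returning as soon as a location is known, can be 
sketched with a latch (all names here are hypothetical, not the actual 
AssignmentManager API):

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Illustrative sketch: the master waits until the region's open has fully
// completed (e.g. the unassigned zk node has been deleted), instead of
// proceeding as soon as the region location is published.
public class WaitForAssignmentSketch {
    private final CountDownLatch assigned = new CountDownLatch(1);

    // Invoked by the open handler once the transition is truly done.
    public void markAssignmentComplete() {
        assigned.countDown();
    }

    // Master-side wait, in the spirit of waitForAssignment(...):
    // returns true if the assignment completed within the timeout.
    public boolean waitForAssignment(long timeoutMs) throws InterruptedException {
        return assigned.await(timeoutMs, TimeUnit.MILLISECONDS);
    }
}
```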

logs:
2011-08-16 07:45:40,715 DEBUG 
[RegionServer:0;C4S2.site,47710,1313495126115-EventThread] 
zookeeper.ZooKeeperWatcher(252): regionserver:47710-0x131d2690f780004 Received 
ZooKeeper Event, type=NodeDataChanged, state=SyncConnected, 
path=/hbase/unassigned/70236052
2011-08-16 07:45:40,715 DEBUG [RS_OPEN_ROOT-C4S2.site,47710,1313495126115-0] 
zookeeper.ZKAssign(712): regionserver:47710-0x131d2690f780004 Successfully 
transitioned node 70236052 from RS_ZK_REGION_OPENING to RS_ZK_REGION_OPENING
2011-08-16 07:45:40,715 DEBUG [Thread-760-EventThread] 
zookeeper.ZooKeeperWatcher(252): master:60701-0x131d2690f780009 Received 
ZooKeeper Event, type=NodeDataChanged, state=SyncConnected, 
path=/hbase/unassigned/70236052
2011-08-16 07:45:40,716 INFO  [PostOpenDeployTasks:70236052] 
catalog.RootLocationEditor(62): Setting ROOT region location in ZooKeeper as 
C4S2.site:47710
2011-08-16 07:45:40,716 DEBUG [Thread-760-EventThread] zookeeper.ZKUtil(1109): 
master:60701-0x131d2690f780009 Retrieved 52 byte(s) of data from znode 
/hbase/unassigned/70236052 and set watcher; region=-ROOT-,,0, 
server=C4S2.site,47710,1313495126115, state=RS_ZK_REGION_OPENING
2011-08-16 07:45:40,717 DEBUG [Thread-760-EventThread] 
master.AssignmentManager(477): Handling transition=RS_ZK_REGION_OPENING, 
server=C4S2.site,47710,1313495126115, region=70236052/-ROOT-
2011-08-16 07:45:40,725 DEBUG [RS_OPEN_ROOT-C4S2.site,47710,1313495126115-0] 
zookeeper.ZKAssign(661): regionserver:47710-0x131d2690f780004 Attempting to 
transition node 70236052/-ROOT- from RS_ZK_REGION_OPENING to RS_ZK_REGION_OPENED
2011-08-16 07:45:40,727 DEBUG [RS_OPEN_ROOT-C4S2.site,47710,1313495126115-0] 
zookeeper.ZKUtil(1109): regionserver:47710-0x131d2690f780004 Retrieved 52 
byte(s) of data from znode /hbase/unassigned/70236052; data=region=-ROOT-,,0, 
server=C4S2.site,47710,1313495126115, state=RS_ZK_REGION_OPENING
2011-08-16 07:45:40,740 DEBUG 
[RegionServer:0;C4S2.site,47710,1313495126115-EventThread] 
zookeeper.ZooKeeperWatcher(252): regionserver:47710-0x131d2690f780004 Received 
ZooKeeper Event, type=NodeDataChanged, state=SyncConnected, 
path=/hbase/unassigned/70236052
2011-08-16 07:45:40,740 DEBUG [Thread-760-EventThread] 
zookeeper.ZooKeeperWatcher(252): master:60701-0x131d2690f780009 Received 
ZooKeeper Event, type=NodeDataChanged, state=SyncConnected, 
path=/hbase/unassigned/70236052
2011-08-16 07:45:40,740 DEBUG [RS_OPEN_ROOT-C4S2.site,47710,1313495126115-0] 
zookeeper.ZKAssign(712): regionserver:47710-0x131d2690f780004 Successfully 
transitioned node 70236052 from RS_ZK_REGION_OPENING to RS_ZK_REGION_OPENED
2011-08-16 07:45:40,741 DEBUG [RS_OPEN_ROOT-C4S2.site,47710,1313495126115-0] 
handler.OpenRegionHandler(121): Opened -ROOT-,,0.70236052
2011-08-16 07:45:40,741 DEBUG [Thread-760-EventThread] zookeeper.ZKUtil(1109): 
master:60701-0x131d2690f780009 Retrieved 52 byte(s) of data from znode 
/hbase/unassigned/70236052 and set watcher; region=-ROOT-,,0, 
server=C4S2.site,47710,1313495126115, state=RS_ZK_REGION_OPENED
2011-08-16 07:45:40,741 DEBUG [Thread-760-EventThread] 
master.AssignmentManager(477): Handling transition=RS_ZK_REGION_OPENED, 
server=C4S2.site,47710,1313495126115, region=70236052/-ROOT-

//... The logs say that the zk node can't be cleaned because we have 

[jira] [Updated] (HBASE-4212) TestMasterFailover fails occasionally

2011-08-17 Thread gaojinchao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4212?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

gaojinchao updated HBASE-4212:
--

Attachment: HBASE-4212_branch90V1.patch

 TestMasterFailover fails occasionally
 -

 Key: HBASE-4212
 URL: https://issues.apache.org/jira/browse/HBASE-4212
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 0.90.4
Reporter: gaojinchao
 Fix For: 0.90.5

 Attachments: HBASE-4212_branch90V1.patch



[jira] [Commented] (HBASE-4212) TestMasterFailover fails occasionally

2011-08-17 Thread gaojinchao (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4212?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13086199#comment-13086199
 ] 

gaojinchao commented on HBASE-4212:
---

I have made a patch. Please review it. Thanks.

 TestMasterFailover fails occasionally
 -

 Key: HBASE-4212
 URL: https://issues.apache.org/jira/browse/HBASE-4212
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 0.90.4
Reporter: gaojinchao
 Fix For: 0.90.5

 Attachments: HBASE-4212_branch90V1.patch



[jira] [Commented] (HBASE-4212) TestMasterFailover fails occasionally

2011-08-17 Thread gaojinchao (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4212?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13086202#comment-13086202
 ] 

gaojinchao commented on HBASE-4212:
---

I tested 10 times, and the logs show that META is assigned after ROOT has finished.

2011-08-17 05:06:51,419 DEBUG [MASTER_OPEN_REGION-C4S2.site:47578-0] 
zookeeper.ZKUtil(1109): master:47578-0x131d6fe02e50009 Retrieved 52 byte(s) of 
data from znode /hbase/unassigned/70236052; data=region=-ROOT-,,0, 
server=C4S2.site,60960,1313571996605, state=RS_ZK_REGION_OPENED
2011-08-17 05:06:51,425 DEBUG [Thread-755-EventThread] 
zookeeper.ZooKeeperWatcher(252): master:47578-0x131d6fe02e50009 Received 
ZooKeeper Event, type=NodeDeleted, state=SyncConnected, 
path=/hbase/unassigned/70236052
2011-08-17 05:06:51,425 DEBUG [MASTER_OPEN_REGION-C4S2.site:47578-0] 
zookeeper.ZKAssign(420): master:47578-0x131d6fe02e50009 Successfully deleted 
unassigned node for region 70236052 in expected state RS_ZK_REGION_OPENED
2011-08-17 05:06:51,426 INFO  [Master:0;C4S2.site:47578] master.HMaster(437): 
-ROOT- assigned=1, rit=false, location=C4S2.site:60960
2011-08-17 05:06:51,426 DEBUG [MASTER_OPEN_REGION-C4S2.site:47578-0] 
handler.OpenedRegionHandler(108): Opened region -ROOT-,,0.70236052 on 
C4S2.site,60960,1313571996605
2011-08-17 05:06:51,427 DEBUG [Master:0;C4S2.site:47578] zookeeper.ZKUtil(553): 
master:47578-0x131d6fe02e50009 Unable to get data of znode 
/hbase/unassigned/1028785192 because node does not exist (not an error)
2011-08-17 05:06:51,429 INFO  [Master:0;C4S2.site:47578] 
catalog.CatalogTracker(421): Passed metaserver is null

 TestMasterFailover fails occasionally
 -

 Key: HBASE-4212
 URL: https://issues.apache.org/jira/browse/HBASE-4212
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 0.90.4
Reporter: gaojinchao
 Fix For: 0.90.5

 Attachments: HBASE-4212_branch90V1.patch



[jira] [Updated] (HBASE-4212) TestMasterFailover fails occasionally

2011-08-17 Thread gaojinchao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4212?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

gaojinchao updated HBASE-4212:
--

Assignee: gaojinchao
  Status: Patch Available  (was: Open)

 TestMasterFailover fails occasionally
 -

 Key: HBASE-4212
 URL: https://issues.apache.org/jira/browse/HBASE-4212
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 0.90.4
Reporter: gaojinchao
Assignee: gaojinchao
 Fix For: 0.90.5

 Attachments: HBASE-4212_branch90V1.patch



[jira] [Updated] (HBASE-4212) TestMasterFailover fails occasionally

2011-08-17 Thread gaojinchao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4212?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

gaojinchao updated HBASE-4212:
--

Attachment: HBASE-4212_TrunkV1.patch

 TestMasterFailover fails occasionally
 -

 Key: HBASE-4212
 URL: https://issues.apache.org/jira/browse/HBASE-4212
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 0.90.4
Reporter: gaojinchao
Assignee: gaojinchao
 Fix For: 0.90.5

 Attachments: HBASE-4212_TrunkV1.patch, HBASE-4212_branch90V1.patch


 This looks like a bug: the ROOT region in RIT can't be moved.
 In the failover process, the master enforces ROOT on-line but does not clean up the
 zk node, so the test will wait forever.
   void processFailover() throws KeeperException, IOException,
       InterruptedException {

     // we enforce on-line root.
     HServerInfo hsi =
       this.serverManager.getHServerInfo(this.catalogTracker.getMetaLocation());
     regionOnline(HRegionInfo.FIRST_META_REGIONINFO, hsi);
     hsi =
       this.serverManager.getHServerInfo(this.catalogTracker.getRootLocation());
     regionOnline(HRegionInfo.ROOT_REGIONINFO, hsi);
 It seems that we should wait until the transition finishes, as is done for the
 META region:
   int assignRootAndMeta()
   throws InterruptedException, IOException, KeeperException {
     int assigned = 0;
     long timeout = this.conf.getLong("hbase.catalog.verification.timeout",
       1000);
     // Work on ROOT region.  Is it in zk in transition?
     boolean rit = this.assignmentManager.
       processRegionInTransitionAndBlockUntilAssigned(HRegionInfo.ROOT_REGIONINFO);
     if (!catalogTracker.verifyRootRegionLocation(timeout)) {
       this.assignmentManager.assignRoot();
       this.catalogTracker.waitForRoot();
       // we need to add this code to guarantee that the transition has completed
       this.assignmentManager.waitForAssignment(HRegionInfo.ROOT_REGIONINFO);
       assigned++;
     }
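The race the fix addresses can be modeled abstractly: the region server completes the OPENING-to-OPENED transition asynchronously, so the master must block on that signal before counting the region as assigned. The latch-based sketch below is purely illustrative and is not the real AssignmentManager API.

```java
import java.util.concurrent.CountDownLatch;

// Abstract model of the fix: after triggering an assignment, the caller
// blocks until the OPENING -> OPENED transition is signaled; otherwise it
// races ahead with the region still in transition. Illustrative only.
public class WaitForAssignmentDemo {
    private final CountDownLatch opened = new CountDownLatch(1);
    volatile String state = "RS_ZK_REGION_OPENING";

    // Region server side: completes the transition on its own thread.
    void regionServerOpensRegion() {
        new Thread(() -> {
            state = "RS_ZK_REGION_OPENED";
            opened.countDown();
        }).start();
    }

    // Master side: the added waitForAssignment step.
    void waitForAssignment() throws InterruptedException {
        opened.await(); // block until the transition has completed
    }

    public static void main(String[] args) throws InterruptedException {
        WaitForAssignmentDemo demo = new WaitForAssignmentDemo();
        demo.regionServerOpensRegion();
        demo.waitForAssignment();
        System.out.println(demo.state); // now safe to count as assigned
    }
}
```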
 logs:
 2011-08-16 07:45:40,715 DEBUG 
 [RegionServer:0;C4S2.site,47710,1313495126115-EventThread] 
 zookeeper.ZooKeeperWatcher(252): regionserver:47710-0x131d2690f780004 
 Received ZooKeeper Event, type=NodeDataChanged, state=SyncConnected, 
 path=/hbase/unassigned/70236052
 2011-08-16 07:45:40,715 DEBUG [RS_OPEN_ROOT-C4S2.site,47710,1313495126115-0] 
 zookeeper.ZKAssign(712): regionserver:47710-0x131d2690f780004 Successfully 
 transitioned node 70236052 from RS_ZK_REGION_OPENING to RS_ZK_REGION_OPENING
 2011-08-16 07:45:40,715 DEBUG [Thread-760-EventThread] 
 zookeeper.ZooKeeperWatcher(252): master:60701-0x131d2690f780009 Received 
 ZooKeeper Event, type=NodeDataChanged, state=SyncConnected, 
 path=/hbase/unassigned/70236052
 2011-08-16 07:45:40,716 INFO  [PostOpenDeployTasks:70236052] 
 catalog.RootLocationEditor(62): Setting ROOT region location in ZooKeeper as 
 C4S2.site:47710
 2011-08-16 07:45:40,716 DEBUG [Thread-760-EventThread] 
 zookeeper.ZKUtil(1109): master:60701-0x131d2690f780009 Retrieved 52 byte(s) 
 of data from znode /hbase/unassigned/70236052 and set watcher; 
 region=-ROOT-,,0, server=C4S2.site,47710,1313495126115, 
 state=RS_ZK_REGION_OPENING
 2011-08-16 07:45:40,717 DEBUG [Thread-760-EventThread] 
 master.AssignmentManager(477): Handling transition=RS_ZK_REGION_OPENING, 
 server=C4S2.site,47710,1313495126115, region=70236052/-ROOT-
 2011-08-16 07:45:40,725 DEBUG [RS_OPEN_ROOT-C4S2.site,47710,1313495126115-0] 
 zookeeper.ZKAssign(661): regionserver:47710-0x131d2690f780004 Attempting to 
 transition node 70236052/-ROOT- from RS_ZK_REGION_OPENING to 
 RS_ZK_REGION_OPENED
 2011-08-16 07:45:40,727 DEBUG [RS_OPEN_ROOT-C4S2.site,47710,1313495126115-0] 
 zookeeper.ZKUtil(1109): regionserver:47710-0x131d2690f780004 Retrieved 52 
 byte(s) of data from znode /hbase/unassigned/70236052; data=region=-ROOT-,,0, 
 server=C4S2.site,47710,1313495126115, state=RS_ZK_REGION_OPENING
 2011-08-16 07:45:40,740 DEBUG 
 [RegionServer:0;C4S2.site,47710,1313495126115-EventThread] 
 zookeeper.ZooKeeperWatcher(252): regionserver:47710-0x131d2690f780004 
 Received ZooKeeper Event, type=NodeDataChanged, state=SyncConnected, 
 path=/hbase/unassigned/70236052
 2011-08-16 07:45:40,740 DEBUG [Thread-760-EventThread] 
 zookeeper.ZooKeeperWatcher(252): master:60701-0x131d2690f780009 Received 
 ZooKeeper Event, type=NodeDataChanged, state=SyncConnected, 
 path=/hbase/unassigned/70236052
 2011-08-16 07:45:40,740 DEBUG [RS_OPEN_ROOT-C4S2.site,47710,1313495126115-0] 
 zookeeper.ZKAssign(712): regionserver:47710-0x131d2690f780004 Successfully 
 transitioned node 70236052 from RS_ZK_REGION_OPENING to RS_ZK_REGION_OPENED
 2011-08-16 07:45:40,741 DEBUG [RS_OPEN_ROOT-C4S2.site,47710,1313495126115-0] 
 handler.OpenRegionHandler(121): Opened -ROOT-,,0.70236052
 2011-08-16 07:45:40,741 DEBUG [Thread-760-EventThread] 
 zookeeper.ZKUtil(1109): master:60701-0x131d2690f780009 Retrieved 52 byte(s) 
 of data from znode 

[jira] [Updated] (HBASE-4124) ZK restarted while assigning a region, new active HM re-assign it but the RS warned 'already online on this server'.

2011-08-17 Thread gaojinchao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4124?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

gaojinchao updated HBASE-4124:
--

Attachment: HBASE-4124_Branch90V1_trial.patch

I have made a patch to fix this issue, but so far I have only run the unit
tests. Please review it first and give me some suggestions; I will test it
tomorrow. Thanks.

 ZK restarted while assigning a region, new active HM re-assign it but the RS 
 warned 'already online on this server'.
 

 Key: HBASE-4124
 URL: https://issues.apache.org/jira/browse/HBASE-4124
 Project: HBase
  Issue Type: Bug
  Components: master
Reporter: fulin wang
 Attachments: HBASE-4124_Branch90V1_trial.patch, log.txt

   Original Estimate: 0.4h
  Remaining Estimate: 0.4h

 ZK restarted while assigning a region, new active HM re-assign it but the RS 
 warned 'already online on this server'.
 Issue:
 The RS fails because the region is 'already online on this server' and returns; the HM 
 never receives the message and reports 'Regions in transition timed out'.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-4175) Fix FSUtils.createTableDescriptor()

2011-08-17 Thread ramkrishna.s.vasudevan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ramkrishna.s.vasudevan updated HBASE-4175:
--

Attachment: HBASE-4175_2_with catch block.patch

 Fix FSUtils.createTableDescriptor()
 ---

 Key: HBASE-4175
 URL: https://issues.apache.org/jira/browse/HBASE-4175
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.0
Reporter: Ted Yu
Assignee: ramkrishna.s.vasudevan
 Attachments: HBASE-4175.patch, HBASE-4175_1.patch, HBASE-4175_2_with 
 catch block.patch


 Currently createTableDescriptor() doesn't return anything.
 The caller wouldn't know whether the descriptor is created or not. See 
 exception handling:
 {code}
} catch (IOException ioe) {
  LOG.info("IOException while trying to create tableInfo in HDFS", ioe);
}
 {code}
 We should return a boolean.
 If the table descriptor exists already, maybe we should deserialize from hdfs 
 and compare with htableDescriptor argument. If they differ, I am not sure 
 what the proper action would be.
 Maybe we can add a boolean argument, force, to createTableDescriptor(). When 
 force is true, existing table descriptor would be overwritten.
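 A minimal sketch of the proposed contract — a boolean return plus a force
 flag — with an in-memory map standing in for HDFS. The method name matches
 the issue; everything else (the class, the map-backed "filesystem") is
 illustrative, not the actual FSUtils implementation:
 
```java
import java.util.HashMap;
import java.util.Map;

// Sketch of the proposed createTableDescriptor() contract: return whether
// the descriptor was written, and overwrite only when force is true.
// A HashMap stands in for the HDFS directory; names are illustrative.
public class TableDescriptorSketch {
    private final Map<String, String> fs = new HashMap<>(); // path -> serialized descriptor

    /** Returns true only when the descriptor was (over)written. */
    public boolean createTableDescriptor(String table, String descriptor, boolean force) {
        if (fs.containsKey(table) && !force) {
            return false; // already exists and caller did not ask to overwrite
        }
        fs.put(table, descriptor);
        return true;
    }

    public static void main(String[] args) {
        TableDescriptorSketch s = new TableDescriptorSketch();
        System.out.println(s.createTableDescriptor("t1", "v1", false)); // fresh create succeeds
        System.out.println(s.createTableDescriptor("t1", "v2", false)); // exists, no force: false
        System.out.println(s.createTableDescriptor("t1", "v2", true));  // forced overwrite: true
    }
}
```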





[jira] [Updated] (HBASE-4175) Fix FSUtils.createTableDescriptor()

2011-08-17 Thread ramkrishna.s.vasudevan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ramkrishna.s.vasudevan updated HBASE-4175:
--

Status: Open  (was: Patch Available)

 Fix FSUtils.createTableDescriptor()
 ---

 Key: HBASE-4175
 URL: https://issues.apache.org/jira/browse/HBASE-4175
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.0
Reporter: Ted Yu
Assignee: ramkrishna.s.vasudevan
 Attachments: HBASE-4175.patch, HBASE-4175_1.patch, HBASE-4175_2_with 
 catch block.patch


 Currently createTableDescriptor() doesn't return anything.
 The caller wouldn't know whether the descriptor is created or not. See 
 exception handling:
 {code}
} catch (IOException ioe) {
  LOG.info("IOException while trying to create tableInfo in HDFS", ioe);
}
 {code}
 We should return a boolean.
 If the table descriptor exists already, maybe we should deserialize from hdfs 
 and compare with htableDescriptor argument. If they differ, I am not sure 
 what the proper action would be.
 Maybe we can add a boolean argument, force, to createTableDescriptor(). When 
 force is true, existing table descriptor would be overwritten.





[jira] [Updated] (HBASE-4175) Fix FSUtils.createTableDescriptor()

2011-08-17 Thread ramkrishna.s.vasudevan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ramkrishna.s.vasudevan updated HBASE-4175:
--

Attachment: HBASE-4175_2_without catch block.patch

 Fix FSUtils.createTableDescriptor()
 ---

 Key: HBASE-4175
 URL: https://issues.apache.org/jira/browse/HBASE-4175
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.0
Reporter: Ted Yu
Assignee: ramkrishna.s.vasudevan
 Attachments: HBASE-4175.patch, HBASE-4175_1.patch, HBASE-4175_2_with 
 catch block.patch, HBASE-4175_2_without catch block.patch


 Currently createTableDescriptor() doesn't return anything.
 The caller wouldn't know whether the descriptor is created or not. See 
 exception handling:
 {code}
} catch (IOException ioe) {
  LOG.info("IOException while trying to create tableInfo in HDFS", ioe);
}
 {code}
 We should return a boolean.
 If the table descriptor exists already, maybe we should deserialize from hdfs 
 and compare with htableDescriptor argument. If they differ, I am not sure 
 what the proper action would be.
 Maybe we can add a boolean argument, force, to createTableDescriptor(). When 
 force is true, existing table descriptor would be overwritten.





[jira] [Commented] (HBASE-4175) Fix FSUtils.createTableDescriptor()

2011-08-17 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13086256#comment-13086256
 ] 

ramkrishna.s.vasudevan commented on HBASE-4175:
---

I have submitted two versions: one with a catch block and one without. The
catch-block version I mainly used for logging in one place, and it also limits
the number of changes by going through the default APIs.


 Fix FSUtils.createTableDescriptor()
 ---

 Key: HBASE-4175
 URL: https://issues.apache.org/jira/browse/HBASE-4175
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.0
Reporter: Ted Yu
Assignee: ramkrishna.s.vasudevan
 Attachments: HBASE-4175.patch, HBASE-4175_1.patch, HBASE-4175_2_with 
 catch block.patch, HBASE-4175_2_without catch block.patch


 Currently createTableDescriptor() doesn't return anything.
 The caller wouldn't know whether the descriptor is created or not. See 
 exception handling:
 {code}
} catch(IOException ioe) {
  LOG.info(IOException while trying to create tableInfo in HDFS, ioe);
}
 {code}
 We should return a boolean.
 If the table descriptor exists already, maybe we should deserialize from hdfs 
 and compare with htableDescriptor argument. If they differ, I am not sure 
 what the proper action would be.
 Maybe we can add a boolean argument, force, to createTableDescriptor(). When 
 force is true, existing table descriptor would be overwritten.





[jira] [Updated] (HBASE-4175) Fix FSUtils.createTableDescriptor()

2011-08-17 Thread ramkrishna.s.vasudevan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ramkrishna.s.vasudevan updated HBASE-4175:
--

Status: Patch Available  (was: Open)

 Fix FSUtils.createTableDescriptor()
 ---

 Key: HBASE-4175
 URL: https://issues.apache.org/jira/browse/HBASE-4175
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.0
Reporter: Ted Yu
Assignee: ramkrishna.s.vasudevan
 Attachments: HBASE-4175.patch, HBASE-4175_1.patch, HBASE-4175_2_with 
 catch block.patch, HBASE-4175_2_without catch block.patch


 Currently createTableDescriptor() doesn't return anything.
 The caller wouldn't know whether the descriptor is created or not. See 
 exception handling:
 {code}
} catch(IOException ioe) {
  LOG.info(IOException while trying to create tableInfo in HDFS, ioe);
}
 {code}
 We should return a boolean.
 If the table descriptor exists already, maybe we should deserialize from hdfs 
 and compare with htableDescriptor argument. If they differ, I am not sure 
 what the proper action would be.
 Maybe we can add a boolean argument, force, to createTableDescriptor(). When 
 force is true, existing table descriptor would be overwritten.





[jira] [Commented] (HBASE-4199) blockCache summary - backend

2011-08-17 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13086317#comment-13086317
 ] 

Ted Yu commented on HBASE-4199:
---

Please use better table names below:
{code}
+  private static final String TEST_TABLE = "testFamily";
+  private static final String TEST_TABLE2 = "testFamily2";
{code}
Javadoc for BlockCacheSummaryEntry should mention entry:
{code}
+/**
+ * Represents a summary of the blockCache by Table and ColumnFamily  
+ *
+ */
+public class BlockCacheSummaryEntry implements Writable,
+    Comparable<BlockCacheSummaryEntry> {
{code}
I think the code below:
{code}
+  bcse = new BlockCacheSummaryEntry();
+  bcse.setTable(s[ s.length - 4]);   // 4th from the end
+  bcse.setColumnFamily(s[ s.length - 2]);   // 2nd from the end
{code}
should be replaced with calling the two parameter ctor. The default ctor should 
be made package private.
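The suggested shape — a two-parameter constructor with the no-arg one hidden — might look like the sketch below. The class and field names follow the patch excerpt, but the body, the example path, and the omission of the Writable methods are illustrative, not the patch itself.

```java
// Review suggestion sketched out: a two-parameter constructor, with the
// no-arg one restricted (private here for brevity; the review asked for
// package-private so deserialization can still use it). Illustrative only.
public class BlockCacheSummaryEntry implements Comparable<BlockCacheSummaryEntry> {
    private String table;
    private String columnFamily;

    private BlockCacheSummaryEntry() {} // reserved for deserialization

    public BlockCacheSummaryEntry(String table, String columnFamily) {
        this.table = table;
        this.columnFamily = columnFamily;
    }

    public String getTable() { return table; }
    public String getColumnFamily() { return columnFamily; }

    @Override
    public int compareTo(BlockCacheSummaryEntry o) {
        int c = table.compareTo(o.table);
        return c != 0 ? c : columnFamily.compareTo(o.columnFamily);
    }

    public static void main(String[] args) {
        // The caller's path parsing then collapses to one expression; the
        // path below is made up, but keeps the same relative positions
        // (table 4th from the end, column family 2nd from the end).
        String[] s = "/hbase/testTable/region/testFamily/block".split("/");
        BlockCacheSummaryEntry e =
            new BlockCacheSummaryEntry(s[s.length - 4], s[s.length - 2]);
        System.out.println(e.getTable() + ":" + e.getColumnFamily());
    }
}
```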

 blockCache summary - backend
 

 Key: HBASE-4199
 URL: https://issues.apache.org/jira/browse/HBASE-4199
 Project: HBase
  Issue Type: Sub-task
Reporter: Doug Meil
Assignee: Doug Meil
Priority: Minor
 Attachments: java_HBASE_4199.patch, java_HBASE_4199_v2.patch


 This is the backend work for the blockCache summary.  Change to BlockCache 
 interface, Summarization in LruBlockCache, BlockCacheSummaryEntry, addition 
 to HRegionInterface, and HRegionServer.
 This will NOT include any of the web UI or anything else like that.  That is 
 for another sub-task.





[jira] [Created] (HBASE-4213) Support instant schema updates with out master's intervention (i.e with out enable/disable and bulk assign/unassign)

2011-08-17 Thread Subbu M Iyer (JIRA)
Support instant schema updates with out master's intervention (i.e with out 
enable/disable and bulk assign/unassign)


 Key: HBASE-4213
 URL: https://issues.apache.org/jira/browse/HBASE-4213
 Project: HBase
  Issue Type: Improvement
Reporter: Subbu M Iyer


This Jira is a slight variation in approach to what is being done as part of 
https://issues.apache.org/jira/browse/HBASE-1730

Support instant schema updates such as Modify Table, Add Column, Modify Column 
operations:
1. Without enabling/disabling the table.
2. Without bulk unassigning/assigning regions.





[jira] [Updated] (HBASE-4213) Support instant schema updates with out master's intervention (i.e with out enable/disable and bulk assign/unassign)

2011-08-17 Thread Subbu M Iyer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4213?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Subbu M Iyer updated HBASE-4213:


Attachment: HBASE-4213-Instant_schema_change.patch

 Support instant schema updates with out master's intervention (i.e with out 
 enable/disable and bulk assign/unassign)
 

 Key: HBASE-4213
 URL: https://issues.apache.org/jira/browse/HBASE-4213
 Project: HBase
  Issue Type: Improvement
Reporter: Subbu M Iyer
 Attachments: HBASE-4213-Instant_schema_change.patch


 This Jira is a slight variation in approach to what is being done as part of 
 https://issues.apache.org/jira/browse/HBASE-1730
 Support instant schema updates such as Modify Table, Add Column, Modify 
 Column operations:
 1. Without enabling/disabling the table.
 2. Without bulk unassigning/assigning regions.





[jira] [Commented] (HBASE-4175) Fix FSUtils.createTableDescriptor()

2011-08-17 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13086325#comment-13086325
 ] 

Ted Yu commented on HBASE-4175:
---

+1 on HBASE-4175_2_without catch block.patch

Minor comment: the second F in ForceFul should be lower-cased:
{code}
+  public void testShouldAllowForceFulCreationOfAlreadyExistingTableDescriptor()
{code}

 Fix FSUtils.createTableDescriptor()
 ---

 Key: HBASE-4175
 URL: https://issues.apache.org/jira/browse/HBASE-4175
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.0
Reporter: Ted Yu
Assignee: ramkrishna.s.vasudevan
 Attachments: HBASE-4175.patch, HBASE-4175_1.patch, HBASE-4175_2_with 
 catch block.patch, HBASE-4175_2_without catch block.patch


 Currently createTableDescriptor() doesn't return anything.
 The caller wouldn't know whether the descriptor is created or not. See 
 exception handling:
 {code}
} catch(IOException ioe) {
  LOG.info(IOException while trying to create tableInfo in HDFS, ioe);
}
 {code}
 We should return a boolean.
 If the table descriptor exists already, maybe we should deserialize from hdfs 
 and compare with htableDescriptor argument. If they differ, I am not sure 
 what the proper action would be.
 Maybe we can add a boolean argument, force, to createTableDescriptor(). When 
 force is true, existing table descriptor would be overwritten.





[jira] [Commented] (HBASE-4213) Support instant schema updates with out master's intervention (i.e with out enable/disable and bulk assign/unassign)

2011-08-17 Thread Subbu M Iyer (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13086328#comment-13086328
 ] 

Subbu M Iyer commented on HBASE-4213:
-

As before, for some reason I am not able to attach my patch to the review
board.

Ted/Stack: Can one of you help me with this?

thanks

 Support instant schema updates with out master's intervention (i.e with out 
 enable/disable and bulk assign/unassign)
 

 Key: HBASE-4213
 URL: https://issues.apache.org/jira/browse/HBASE-4213
 Project: HBase
  Issue Type: Improvement
Reporter: Subbu M Iyer
 Attachments: HBASE-4213-Instant_schema_change.patch


 This Jira is a slight variation in approach to what is being done as part of 
 https://issues.apache.org/jira/browse/HBASE-1730
 Support instant schema updates such as Modify Table, Add Column, Modify 
 Column operations:
 1. Without enabling/disabling the table.
 2. Without bulk unassigning/assigning regions.





[jira] [Commented] (HBASE-4203) While master restarts and if the META region's state is OPENING then master cannot assign META until timeout monitor deducts

2011-08-17 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13086332#comment-13086332
 ] 

ramkrishna.s.vasudevan commented on HBASE-4203:
---

@Stack,

I am planning to implement the same logic the timeout monitor uses when it
finds a node in OPENING:
- The existing logic checks whether the node has changed to OPENED; if not, it
forces the node to OFFLINE and starts assignment again. We can do the same here.

Also, this change could be incorporated into the timeout monitor changes I am
currently trying out in HBASE-4015. Or would you like me to submit a separate
patch for this?




 While master restarts and if the META region's state is OPENING then master 
 cannot assign META until timeout monitor deducts
 

 Key: HBASE-4203
 URL: https://issues.apache.org/jira/browse/HBASE-4203
 Project: HBase
  Issue Type: Bug
Reporter: ramkrishna.s.vasudevan
Assignee: ramkrishna.s.vasudevan
Priority: Minor

 1. Start Master and 2 RS.
 2. If any exception happens while opening the META region the state in znode 
 will be OPENING.
 3. If at this point the master restarts then the master will start processing 
 the regions in RIT.
 4. If the znode is found to be in OPENING, the master waits for the timeout 
 monitor to detect it and only then triggers the open.
 5. With the default timeout monitor period (1800 sec/30 min), it can take 
 30 minutes just to open the META region.
 Soln:
 
 Better not to wait for the Timeout monitor period to open catalog tables on 
 Master restart





[jira] [Commented] (HBASE-1730) Near-instantaneous online schema and table state updates

2011-08-17 Thread Subbu M Iyer (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-1730?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13086333#comment-13086333
 ] 

Subbu M Iyer commented on HBASE-1730:
-

In continuation of HBASE-451, I was working on a patch for this issue and just 
realized that a patch has already been submitted for this Jira. I created a 
related patch in HBASE-4213 that follows a slightly different approach to the 
same problem, and thought I would submit my patch here as well.


 

 Near-instantaneous online schema and table state updates
 

 Key: HBASE-1730
 URL: https://issues.apache.org/jira/browse/HBASE-1730
 Project: HBase
  Issue Type: Improvement
Reporter: Andrew Purtell
Assignee: stack
Priority: Critical
 Fix For: 0.92.0

 Attachments: 1730-v2.patch, 1730-v3.patch, 1730.patch, 
 HBASE-1730.patch


 We should not need to take a table offline to update HCD or HTD. 
 One option for that is putting HTDs and HCDs up into ZK, with mirror on disk 
 catalog tables to be used only for cold init scenarios, as discussed on IRC. 
 In this scheme, regionservers hosting regions of a table would watch 
 permanent nodes in ZK associated with that table for schema updates and take 
 appropriate actions out of the watcher. In effect, schema updates become 
 another item in the ToDo list.
 {{/hbase/tables/table-name/schema}}
 Must be associated with a write locking scheme also handled with ZK 
 primitives to avoid situations where one concurrent update clobbers another. 





[jira] [Commented] (HBASE-4213) Support instant schema updates with out master's intervention (i.e with out enable/disable and bulk assign/unassign)

2011-08-17 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13086356#comment-13086356
 ] 

Ted Yu commented on HBASE-4213:
---

I haven't gone through the patch yet.
bq. 5. Master will recursively delete the node /hbase/schema/table name, if 
the number of childrens
  of /hbase/schema/table name is greater than or equal to current number of 
active region servers.
You meant recursively deleting the children of node /hbase/schema/table name, 
right ?

For a region server which joins the cluster after the creation of 
/hbase/schema/table name, it should be able to find out that it has already 
read the most recent HTD. Does it create a child node under /hbase/schema/table 
name ?

 Support instant schema updates with out master's intervention (i.e with out 
 enable/disable and bulk assign/unassign)
 

 Key: HBASE-4213
 URL: https://issues.apache.org/jira/browse/HBASE-4213
 Project: HBase
  Issue Type: Improvement
Reporter: Subbu M Iyer
Assignee: Subbu M Iyer
 Attachments: HBASE-4213-Instant_schema_change.patch


 This Jira is a slight variation in approach to what is being done as part of 
 https://issues.apache.org/jira/browse/HBASE-1730
 Support instant schema updates such as Modify Table, Add Column, Modify 
 Column operations:
 1. Without enabling/disabling the table.
 2. Without bulk unassigning/assigning regions.





[jira] [Issue Comment Edited] (HBASE-4213) Support instant schema updates with out master's intervention (i.e with out enable/disable and bulk assign/unassign)

2011-08-17 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13086356#comment-13086356
 ] 

Ted Yu edited comment on HBASE-4213 at 8/17/11 2:47 PM:


I haven't gone through the patch yet.
bq. 5. Master will recursively delete the node /hbase/schema/table name, if 
the number of childrens of /hbase/schema/table name is greater than or equal 
to current number of active region servers.
You meant recursively deleting the children of node /hbase/schema/table name, 
right ?

For region server which joins the cluster after the creation of 
/hbase/schema/table name, it should be able to find out that it already reads 
the most recent HTD. Does it create a child node under /hbase/schema/table 
name ?

  was (Author: yuzhih...@gmail.com):
I haven't gone through the patch yet.
bq. 5. Master will recursively delete the node /hbase/schema/table name, if 
the number of childrens
  of /hbase/schema/table name is greater than or equal to current number of 
active region servers.
You meant recursively deleting the children of node /hbase/schema/table name, 
right ?

For region server which joins the cluster after the creation of 
/hbase/schema/table name, it should be able to find out that it already reads 
the most recent HTD. Does it create a child node under /hbase/schema/table 
name ?
  
 Support instant schema updates with out master's intervention (i.e with out 
 enable/disable and bulk assign/unassign)
 

 Key: HBASE-4213
 URL: https://issues.apache.org/jira/browse/HBASE-4213
 Project: HBase
  Issue Type: Improvement
Reporter: Subbu M Iyer
Assignee: Subbu M Iyer
 Attachments: HBASE-4213-Instant_schema_change.patch


 This Jira is a slight variation in approach to what is being done as part of 
 https://issues.apache.org/jira/browse/HBASE-1730
 Support instant schema updates such as Modify Table, Add Column, Modify 
 Column operations:
 1. Without enabling/disabling the table.
 2. Without bulk unassigning/assigning regions.





[jira] [Commented] (HBASE-4195) Possible inconsistency in a memstore read after a reseek, possible performance improvement

2011-08-17 Thread nkeywal (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4195?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13086358#comment-13086358
 ] 

nkeywal commented on HBASE-4195:


I can do a simple patch (removing all the code around numIterReseek). However, 
it would conflict with the patch for HBASE-4188/HBASE-1938. Is it possible for 
you to commit this one first? 

Note that I have been able to make this reseek implementation fail as well by 
adding a Thread.sleep between the searches on the two iterators. In other words, 
there is a race condition somewhere. It could be a conflict with the flush 
process: I noticed that a flush cannot happen during a put (lock on 
hregion.update) or a seek (lock on store), but there is nothing to prevent a 
reseek from taking place during the snapshot. But I don't know how long it will 
take to find the real issue behind all this, so a partial fix lowering the 
probability of hitting the issue makes sense...

 Possible inconsistency in a memstore read after a reseek, possible 
 performance improvement
 --

 Key: HBASE-4195
 URL: https://issues.apache.org/jira/browse/HBASE-4195
 Project: HBase
  Issue Type: Bug
  Components: regionserver
Affects Versions: 0.90.4
 Environment: all
Reporter: nkeywal
Priority: Critical

 This follows the dicussion around HBASE-3855, and the random errors (20% 
 failure on trunk) on the unit test 
 org.apache.hadoop.hbase.regionserver.TestHRegion.testWritesWhileGetting
 I saw some points related to numIterReseek, used in the 
 MemStoreScanner#getNext (line 690):
 {noformat}679 protected KeyValue getNext(Iterator<KeyValue> it) {
 680 KeyValue ret = null;
 681 long readPoint = ReadWriteConsistencyControl.getThreadReadPoint();
 682 //DebugPrint.println("  MS@" + hashCode() + ": threadpoint = " + readPoint);
 683
 684 while (ret == null && it.hasNext()) {
 685   KeyValue v = it.next();
 686   if (v.getMemstoreTS() <= readPoint) {
 687     // keep it.
 688     ret = v;
 689   }
 690   numIterReseek--;
 691   if (numIterReseek == 0) {
 692     break;
 693   }
 694 }
 695 return ret;
 696   }{noformat}
 This function is called by seek, reseek, and next. The numIterReseek is only 
 usefull for reseek.
 There are some issues, I am not totally sure it's the root cause of the test 
 case error, but it could explain partly the randomness of the error, and one 
 point is for sure a bug.
 1) In getNext, numIterReseek is decreased, then compared to zero. The seek 
 function sets numIterReseek to zero before calling getNext. It means that the 
 value will be actually negative, hence the test will always fail, and the 
 loop will continue. It is the expected behaviour, but it's quite smart.
 2) In reseek, numIterReseek is not set between the loops on the two 
 iterators. If the numIterReseek is equals to zero after the loop on the first 
 one, the loop on the second one will never call seek, as numIterReseek will 
 be negative.
 3) Still in reseek, the test to call seek is (kvsetNextRow == null && 
 numIterReseek == 0). In other words, if kvsetNextRow is not null when 
 numIterReseek equals zero, numIterReseek will start to be negative at the 
 next iteration and seek will never be called.
 4) You can have side effects if reseek ends with a numIterReseek > 0: the 
 following calls to the next function will decrease numIterReseek to zero, 
 and getNext will break instead of continuing the loop. As a result, later 
 calls to next() may return null or not depending on how the default value 
 for numIterReseek is configured.
 To check if the issue comes from point 4, you can set the numIterReseek to 
 zero before returning in reseek:
 {noformat}  numIterReseek = 0;
   return (kvsetNextRow != null || snapshotNextRow != null);
 }{noformat}
On my env, on trunk, it seems to work, but as it's random I am not really 
sure. I also had to modify the test (I added a loop) to make it fail more 
often; the original test was working quite well here.
It has to be confirmed that this totally fixes (it could be partial or 
unrelated) 
org.apache.hadoop.hbase.regionserver.TestHRegion.testWritesWhileGetting 
before implementing a complete solution.
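The counter behaviour in points 1 and 4 can be reproduced outside HBase. Below 
is a minimal, self-contained sketch (hypothetical class and method names, not 
the actual MemStoreScanner code) of the decrement-then-compare pattern:

```java
// Minimal sketch (hypothetical names, not the actual MemStoreScanner code) of
// the decrement-then-compare pattern described in points 1 and 4.
public class NumIterReseekSketch {
    // Walks up to 'elements' items; mirrors getNext's loop guard.
    static int stepsUntilBreak(int numIterReseek, int elements) {
        int steps = 0;
        for (int i = 0; i < elements; i++) {
            steps++;
            numIterReseek--;          // decrement first...
            if (numIterReseek == 0) { // ...then compare: 0 becomes -1, never caught
                break;
            }
        }
        return steps;
    }

    public static void main(String[] args) {
        // seek() sets numIterReseek to 0: the counter goes negative, the test
        // never fires, and the loop scans every element (point 1).
        System.out.println(stepsUntilBreak(0, 10)); // 10
        // A leftover positive value makes a later call break early (point 4).
        System.out.println(stepsUntilBreak(3, 10)); // 3
    }
}
```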

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-4206) jenkins hash implementation uses longs unnecessarily

2011-08-17 Thread Ron Yang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ron Yang updated HBASE-4206:


Status: Patch Available  (was: Open)

 jenkins hash implementation uses longs unnecessarily
 

 Key: HBASE-4206
 URL: https://issues.apache.org/jira/browse/HBASE-4206
 Project: HBase
  Issue Type: Improvement
  Components: util
Reporter: Ron Yang
Priority: Minor

 I don't believe you need to use long for a, b, c and as a result no longer need 
 to & against INT_MASK.
 At a minimum the private static longs should be made final, and the main 
 method should not print the absolute value of the hash but instead use 
 something like Integer.toHexString
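A quick standalone illustration of the printing suggestion (a sketch, not the 
JenkinsHash main itself): Math.abs loses sign information and even stays 
negative for Integer.MIN_VALUE, while Integer.toHexString renders every int 
unambiguously.

```java
// Standalone sketch of the printing suggestion; not the JenkinsHash main itself.
public class HashPrintSketch {
    public static void main(String[] args) {
        int hash = Integer.MIN_VALUE; // worst case for Math.abs on an int hash
        System.out.println(Math.abs(hash));            // still -2147483648
        System.out.println(Integer.toHexString(hash)); // 80000000
    }
}
```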





[jira] [Updated] (HBASE-4206) jenkins hash implementation uses longs unnecessarily

2011-08-17 Thread Ron Yang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ron Yang updated HBASE-4206:


Attachment: ryang.patch

 jenkins hash implementation uses longs unnecessarily
 

 Key: HBASE-4206
 URL: https://issues.apache.org/jira/browse/HBASE-4206
 Project: HBase
  Issue Type: Improvement
  Components: util
Reporter: Ron Yang
Priority: Minor
 Attachments: ryang.patch


 I don't believe you need to use long for a, b, c and as a result no longer need 
 to & against INT_MASK.
 At a minimum the private static longs should be made final, and the main 
 method should not print the absolute value of the hash but instead use 
 something like Integer.toHexString





[jira] [Commented] (HBASE-4199) blockCache summary - backend

2011-08-17 Thread Doug Meil (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13086405#comment-13086405
 ] 

Doug Meil commented on HBASE-4199:
--

Thanks Ted.

I'll fix the Javadoc and the unit test table constants.

I'm not sure I agree about the overloaded constructor.  Doing this...
{code}
bcse = new BlockCacheSummaryEntry( s[ s.length - 4], s[s.length - 2]);
{code}
... seems less clear to me.  I think the 'setTable' with the comment reminder 
on why it's being done makes more sense.

And doing this...
{code}
String table = ...;   
String cf = ...;
bcse = new BlockCacheSummaryEntry(table, cf);
{code}
... results in basically the same 3 lines of code that exist now.

 blockCache summary - backend
 

 Key: HBASE-4199
 URL: https://issues.apache.org/jira/browse/HBASE-4199
 Project: HBase
  Issue Type: Sub-task
Reporter: Doug Meil
Assignee: Doug Meil
Priority: Minor
 Attachments: java_HBASE_4199.patch, java_HBASE_4199_v2.patch


 This is the backend work for the blockCache summary.  Change to BlockCache 
 interface, Summarization in LruBlockCache, BlockCacheSummaryEntry, addition 
 to HRegionInterface, and HRegionServer.
 This will NOT include any of the web UI or anything else like that.  That is 
 for another sub-task.





[jira] [Assigned] (HBASE-4202) Check filesystem permissions on startup

2011-08-17 Thread ramkrishna.s.vasudevan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4202?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ramkrishna.s.vasudevan reassigned HBASE-4202:
-

Assignee: ramkrishna.s.vasudevan

 Check filesystem permissions on startup
 ---

 Key: HBASE-4202
 URL: https://issues.apache.org/jira/browse/HBASE-4202
 Project: HBase
  Issue Type: Bug
  Components: regionserver
Affects Versions: 0.20.4
 Environment: debian squeeze
Reporter: Matthias Hofschen
Assignee: ramkrishna.s.vasudevan
  Labels: noob

 We added a new node to a 44 node cluster starting the datanode, mapred and 
 regionserver processes on it. The Unix filesystem was configured incorrectly, 
 i.e. /tmp was not writable to processes. All three processes had issues with 
 this. Datanode and mapred shut down on exception.
 Regionserver did not stop; in fact it reported to the master that it was up 
 without regions, so the master assigned regions to it. Regionserver would not 
 accept them, resulting in a constant assign, reject, reassign cycle that put 
 many regions into a state of not being available. There are no logs about 
 this, but we could observe the region count fluctuate by hundreds of regions 
 and the application throwing many NotServingRegion exceptions.
 In fact to the master process the regionserver looked fine, so it kept trying 
 to send regions its way. The regionserver rejected them, so the master/balancer 
 went into an assign/reassign cycle, destabilizing the cluster. Many puts 
 and gets simply failed with NotServingRegionExceptions and took a long time 
 to complete.
 Exception from regionserver:
 2011-08-06 23:57:13,953 INFO 
 org.apache.hadoop.hbase.regionserver.HRegionServer: Got ZooKeeper event, 
 state: SyncConnected, type: NodeCreated, path: /hbase/master
 2011-08-06 23:57:13,957 INFO 
 org.apache.hadoop.hbase.regionserver.HRegionServer: Telling master at 
 17.1.0.1:6 that we are up
 2011-08-06 23:57:13,957 INFO 
 org.apache.hadoop.hbase.regionserver.HRegionServer: Telling master at 
 17.1.0.1:6 that we are up
 2011-08-07 00:07:39.648::INFO:  Logging to STDERR via 
 org.mortbay.log.StdErrLog
 2011-08-07 00:07:39.712::INFO:  jetty-6.1.14
 2011-08-07 00:07:39.742::WARN:  tmpdir
 java.io.IOException: Permission denied
 at java.io.UnixFileSystem.createFileExclusively(Native Method)
 at java.io.File.checkAndCreate(File.java:1704)
 at java.io.File.createTempFile(File.java:1792)
 at java.io.File.createTempFile(File.java:1828)
 at 
 org.mortbay.jetty.webapp.WebAppContext.getTempDirectory(WebAppContext.java:745)
 at 
 org.mortbay.jetty.webapp.WebAppContext.doStart(WebAppContext.java:458)
 at 
 org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:50)
 at 
 org.mortbay.jetty.handler.HandlerCollection.doStart(HandlerCollection.java:152)
 at 
 org.mortbay.jetty.handler.ContextHandlerCollection.doStart(ContextHandlerCollection.java:156)
 at 
 org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:50)
 at 
 org.mortbay.jetty.handler.HandlerWrapper.doStart(HandlerWrapper.java:130)
 at org.mortbay.jetty.Server.doStart(Server.java:222)
 at 
 org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:50)
 at org.apache.hadoop.http.HttpServer.start(HttpServer.java:461)
 at 
 org.apache.hadoop.hbase.regionserver.HRegionServer.startServiceThreads(HRegionServer.java:1168)
 at 
 org.apache.hadoop.hbase.regionserver.HRegionServer.<init>(HRegionServer.java:792)
 at 
 org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:430)
 at java.lang.Thread.run(Thread.java:619)
 Exception from datanode:
 2011-08-06 23:37:20,444 INFO org.apache.hadoop.http.HttpServer: Jetty bound 
 to port 50075
 2011-08-06 23:37:20,444 INFO org.mortbay.log: jetty-6.1.14
 2011-08-06 23:37:20,469 WARN org.mortbay.log: tmpdir
 java.io.IOException: Permission denied
 at java.io.UnixFileSystem.createFileExclusively(Native Method)
 at java.io.File.checkAndCreate(File.java:1704)
 at java.io.File.createTempFile(File.java:1792)
 at java.io.File.createTempFile(File.java:1828)
 at 
 org.mortbay.jetty.webapp.WebAppContext.getTempDirectory(WebAppContext.java:745)
 at 
 org.mortbay.jetty.webapp.WebAppContext.doStart(WebAppContext.java:458)
 at 
 org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:50)
 at 
 org.mortbay.jetty.handler.HandlerCollection.doStart(HandlerCollection.java:152)
 at 
 org.mortbay.jetty.handler.ContextHandlerCollection.doStart(ContextHandlerCollection.java:156)
 at 
 org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:50)
 at 
 

[jira] [Commented] (HBASE-4199) blockCache summary - backend

2011-08-17 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13086421#comment-13086421
 ] 

Ted Yu commented on HBASE-4199:
---

Please use curly braces for the code directive.
The advantage of using the two-parameter ctor is that table and column family 
would be set at the same time, reducing the chance of inconsistency between 
them now that the two fields carry default values.

This is my personal opinion.

 blockCache summary - backend
 

 Key: HBASE-4199
 URL: https://issues.apache.org/jira/browse/HBASE-4199
 Project: HBase
  Issue Type: Sub-task
Reporter: Doug Meil
Assignee: Doug Meil
Priority: Minor
 Attachments: java_HBASE_4199.patch, java_HBASE_4199_v2.patch


 This is the backend work for the blockCache summary.  Change to BlockCache 
 interface, Summarization in LruBlockCache, BlockCacheSummaryEntry, addition 
 to HRegionInterface, and HRegionServer.
 This will NOT include any of the web UI or anything else like that.  That is 
 for another sub-task.





[jira] [Updated] (HBASE-4175) Fix FSUtils.createTableDescriptor()

2011-08-17 Thread ramkrishna.s.vasudevan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ramkrishna.s.vasudevan updated HBASE-4175:
--

Status: Open  (was: Patch Available)

 Fix FSUtils.createTableDescriptor()
 ---

 Key: HBASE-4175
 URL: https://issues.apache.org/jira/browse/HBASE-4175
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.0
Reporter: Ted Yu
Assignee: ramkrishna.s.vasudevan
 Attachments: HBASE-4175.patch, HBASE-4175_1.patch, HBASE-4175_2_with 
 catch block.patch, HBASE-4175_2_without catch block.patch


 Currently createTableDescriptor() doesn't return anything.
 The caller wouldn't know whether the descriptor is created or not. See 
 exception handling:
 {code}
} catch(IOException ioe) {
  LOG.info("IOException while trying to create tableInfo in HDFS", ioe);
}
 {code}
 We should return a boolean.
 If the table descriptor exists already, maybe we should deserialize from hdfs 
 and compare with htableDescriptor argument. If they differ, I am not sure 
 what the proper action would be.
 Maybe we can add a boolean argument, force, to createTableDescriptor(). When 
 force is true, existing table descriptor would be overwritten.
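The proposed contract could look roughly like this (a hypothetical sketch with 
an in-memory map standing in for the HDFS table directory; FSUtils' real 
signature may differ):

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch (names are illustrative, not the actual FSUtils API) of
// the proposed contract: createTableDescriptor returns a boolean so the caller
// knows whether the descriptor was written, and a "force" flag allows
// overwriting an existing one. The Map stands in for the HDFS table dir.
public class TableDescriptorSketch {
    private final Map<String, byte[]> fs = new HashMap<>(); // fake "HDFS"

    boolean createTableDescriptor(String tableDir, byte[] descriptor,
                                  boolean force) {
        if (fs.containsKey(tableDir) && !force) {
            return false; // exists and we were not asked to overwrite
        }
        fs.put(tableDir, descriptor);
        return true;      // written (or overwritten under force)
    }
}
```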





[jira] [Updated] (HBASE-4175) Fix FSUtils.createTableDescriptor()

2011-08-17 Thread ramkrishna.s.vasudevan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ramkrishna.s.vasudevan updated HBASE-4175:
--

Attachment: HBASE-4175_3.patch

 Fix FSUtils.createTableDescriptor()
 ---

 Key: HBASE-4175
 URL: https://issues.apache.org/jira/browse/HBASE-4175
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.0
Reporter: Ted Yu
Assignee: ramkrishna.s.vasudevan
 Attachments: HBASE-4175.patch, HBASE-4175_1.patch, HBASE-4175_2_with 
 catch block.patch, HBASE-4175_2_without catch block.patch, HBASE-4175_3.patch


 Currently createTableDescriptor() doesn't return anything.
 The caller wouldn't know whether the descriptor is created or not. See 
 exception handling:
 {code}
} catch(IOException ioe) {
  LOG.info("IOException while trying to create tableInfo in HDFS", ioe);
}
 {code}
 We should return a boolean.
 If the table descriptor exists already, maybe we should deserialize from hdfs 
 and compare with htableDescriptor argument. If they differ, I am not sure 
 what the proper action would be.
 Maybe we can add a boolean argument, force, to createTableDescriptor(). When 
 force is true, existing table descriptor would be overwritten.





[jira] [Commented] (HBASE-4175) Fix FSUtils.createTableDescriptor()

2011-08-17 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13086424#comment-13086424
 ] 

ramkrishna.s.vasudevan commented on HBASE-4175:
---

@Ted
Thanks for your review.

Resubmitted the patch with the mentioned change.

 Fix FSUtils.createTableDescriptor()
 ---

 Key: HBASE-4175
 URL: https://issues.apache.org/jira/browse/HBASE-4175
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.0
Reporter: Ted Yu
Assignee: ramkrishna.s.vasudevan
 Attachments: HBASE-4175.patch, HBASE-4175_1.patch, HBASE-4175_2_with 
 catch block.patch, HBASE-4175_2_without catch block.patch, HBASE-4175_3.patch


 Currently createTableDescriptor() doesn't return anything.
 The caller wouldn't know whether the descriptor is created or not. See 
 exception handling:
 {code}
} catch(IOException ioe) {
  LOG.info("IOException while trying to create tableInfo in HDFS", ioe);
}
 {code}
 We should return a boolean.
 If the table descriptor exists already, maybe we should deserialize from hdfs 
 and compare with htableDescriptor argument. If they differ, I am not sure 
 what the proper action would be.
 Maybe we can add a boolean argument, force, to createTableDescriptor(). When 
 force is true, existing table descriptor would be overwritten.





[jira] [Updated] (HBASE-4175) Fix FSUtils.createTableDescriptor()

2011-08-17 Thread ramkrishna.s.vasudevan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ramkrishna.s.vasudevan updated HBASE-4175:
--

Status: Patch Available  (was: Open)

 Fix FSUtils.createTableDescriptor()
 ---

 Key: HBASE-4175
 URL: https://issues.apache.org/jira/browse/HBASE-4175
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.0
Reporter: Ted Yu
Assignee: ramkrishna.s.vasudevan
 Attachments: HBASE-4175.patch, HBASE-4175_1.patch, HBASE-4175_2_with 
 catch block.patch, HBASE-4175_2_without catch block.patch, HBASE-4175_3.patch


 Currently createTableDescriptor() doesn't return anything.
 The caller wouldn't know whether the descriptor is created or not. See 
 exception handling:
 {code}
} catch(IOException ioe) {
  LOG.info("IOException while trying to create tableInfo in HDFS", ioe);
}
 {code}
 We should return a boolean.
 If the table descriptor exists already, maybe we should deserialize from hdfs 
 and compare with htableDescriptor argument. If they differ, I am not sure 
 what the proper action would be.
 Maybe we can add a boolean argument, force, to createTableDescriptor(). When 
 force is true, existing table descriptor would be overwritten.





[jira] [Commented] (HBASE-4175) Fix FSUtils.createTableDescriptor()

2011-08-17 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13086425#comment-13086425
 ] 

Ted Yu commented on HBASE-4175:
---

+1 on HBASE-4175_3.patch

 Fix FSUtils.createTableDescriptor()
 ---

 Key: HBASE-4175
 URL: https://issues.apache.org/jira/browse/HBASE-4175
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.0
Reporter: Ted Yu
Assignee: ramkrishna.s.vasudevan
 Attachments: HBASE-4175.patch, HBASE-4175_1.patch, HBASE-4175_2_with 
 catch block.patch, HBASE-4175_2_without catch block.patch, HBASE-4175_3.patch


 Currently createTableDescriptor() doesn't return anything.
 The caller wouldn't know whether the descriptor is created or not. See 
 exception handling:
 {code}
} catch(IOException ioe) {
  LOG.info("IOException while trying to create tableInfo in HDFS", ioe);
}
 {code}
 We should return a boolean.
 If the table descriptor exists already, maybe we should deserialize from hdfs 
 and compare with htableDescriptor argument. If they differ, I am not sure 
 what the proper action would be.
 Maybe we can add a boolean argument, force, to createTableDescriptor(). When 
 force is true, existing table descriptor would be overwritten.





[jira] [Updated] (HBASE-4199) blockCache summary - backend

2011-08-17 Thread Doug Meil (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4199?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Doug Meil updated HBASE-4199:
-

Attachment: java_HBASE_4199_v3.patch

Uploading v3 with all requested changes.

 blockCache summary - backend
 

 Key: HBASE-4199
 URL: https://issues.apache.org/jira/browse/HBASE-4199
 Project: HBase
  Issue Type: Sub-task
Reporter: Doug Meil
Assignee: Doug Meil
Priority: Minor
 Attachments: java_HBASE_4199.patch, java_HBASE_4199_v2.patch, 
 java_HBASE_4199_v3.patch


 This is the backend work for the blockCache summary.  Change to BlockCache 
 interface, Summarization in LruBlockCache, BlockCacheSummaryEntry, addition 
 to HRegionInterface, and HRegionServer.
 This will NOT include any of the web UI or anything else like that.  That is 
 for another sub-task.





[jira] [Created] (HBASE-4214) Per-region request counters should be clearer about scope

2011-08-17 Thread Todd Lipcon (JIRA)
Per-region request counters should be clearer about scope
-

 Key: HBASE-4214
 URL: https://issues.apache.org/jira/browse/HBASE-4214
 Project: HBase
  Issue Type: Bug
  Components: metrics, regionserver
Affects Versions: 0.92.0
Reporter: Todd Lipcon
 Fix For: 0.92.0


In testing trunk, I noticed that per-region request counters shown on table.jsp 
are lifetime-scoped, rather than per-second or some other time range. However, 
I'm pretty sure they reset when the region is moved. So, it's hard to use them 
to judge relative hotness of regions from the web UI without hooking it up to 
something like OpenTSDB.





[jira] [Updated] (HBASE-4213) Support instant schema updates with out master's intervention (i.e with out enable/disable and bulk assign/unassign)

2011-08-17 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4213?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-4213:
--

Fix Version/s: 0.92.0

Whatever the resolution this should have the same fix version as HBASE-1730

 Support instant schema updates with out master's intervention (i.e with out 
 enable/disable and bulk assign/unassign)
 

 Key: HBASE-4213
 URL: https://issues.apache.org/jira/browse/HBASE-4213
 Project: HBase
  Issue Type: Improvement
Reporter: Subbu M Iyer
Assignee: Subbu M Iyer
 Fix For: 0.92.0

 Attachments: HBASE-4213-Instant_schema_change.patch


 This Jira is a slight variation in approach to what is being done as part of 
 https://issues.apache.org/jira/browse/HBASE-1730
 Support instant schema updates such as Modify Table, Add Column, Modify 
 Column operations:
 1. Without enabling/disabling the table.
 2. Without bulk unassign/assign of regions.





[jira] [Commented] (HBASE-4206) jenkins hash implementation uses longs unnecessarily

2011-08-17 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13086471#comment-13086471
 ] 

Andrew Purtell commented on HBASE-4206:
---

I'm curious if there are before and after microbenchmarks?

 jenkins hash implementation uses longs unnecessarily
 

 Key: HBASE-4206
 URL: https://issues.apache.org/jira/browse/HBASE-4206
 Project: HBase
  Issue Type: Improvement
  Components: util
Reporter: Ron Yang
Priority: Minor
 Attachments: ryang.patch


 I don't believe you need to use long for a, b, c and as a result no longer need 
 to & against INT_MASK.
 At a minimum the private static longs should be made final, and the main 
 method should not print the absolute value of the hash but instead use 
 something like Integer.toHexString





[jira] [Created] (HBASE-4216) IllegalArgumentException prefetching from META

2011-08-17 Thread Todd Lipcon (JIRA)
IllegalArgumentException prefetching from META
--

 Key: HBASE-4216
 URL: https://issues.apache.org/jira/browse/HBASE-4216
 Project: HBase
  Issue Type: Bug
  Components: client
Affects Versions: 0.92.0
Reporter: Todd Lipcon
Priority: Critical


Received one of these while doing a YCSB test on 26 nodes on trunk:
java.io.IOException: java.lang.IllegalArgumentException: hostname can't be null






[jira] [Commented] (HBASE-4216) IllegalArgumentException prefetching from META

2011-08-17 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13086476#comment-13086476
 ] 

Todd Lipcon commented on HBASE-4216:


{noformat}
11/08/17 10:44:59 WARN client.HConnectionManager$HConnectionImplementation: 
Encountered problems when prefetch META table:
java.io.IOException: java.lang.IllegalArgumentException: hostname can't be null
at 
org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$2.processRow(HConnectionManager.java:822)
at 
org.apache.hadoop.hbase.client.MetaScanner.metaScan(MetaScanner.java:212)
at 
org.apache.hadoop.hbase.client.MetaScanner.access$000(MetaScanner.java:52)
at 
org.apache.hadoop.hbase.client.MetaScanner$1.connect(MetaScanner.java:130)
at 
org.apache.hadoop.hbase.client.MetaScanner$1.connect(MetaScanner.java:127)
at 
org.apache.hadoop.hbase.client.HConnectionManager.execute(HConnectionManager.java:341)
at 
org.apache.hadoop.hbase.client.MetaScanner.metaScan(MetaScanner.java:127)
at 
org.apache.hadoop.hbase.client.MetaScanner.metaScan(MetaScanner.java:103)
at 
org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.prefetchRegionCache(HConnectionManager.java:828)
at 
org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegionInMeta(HConnectionManager.java:882)
at 
org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:770)
at 
org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.relocateRegion(HConnectionManager.java:740)
at 
org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getRegionLocation(HConnectionManager.java:659)
at 
org.apache.hadoop.hbase.client.ServerCallable.instantiateServer(ServerCallable.java:70)
at 
org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getRegionServerWithRetries(HConnectionManager.java:1238)
at org.apache.hadoop.hbase.client.HTable.get(HTable.java:612)
at com.yahoo.ycsb.db.HBaseClient.read(HBaseClient.java:160)
at com.yahoo.ycsb.DBWrapper.read(DBWrapper.java:86)
at 
com.yahoo.ycsb.workloads.CoreWorkload.doTransactionRead(CoreWorkload.java:444)
at 
com.yahoo.ycsb.workloads.CoreWorkload.doTransaction(CoreWorkload.java:391)
at com.yahoo.ycsb.ClientTask.call(ClientTask.java:47)
at com.yahoo.ycsb.RateLimiter.call(RateLimiter.java:53)
at com.yahoo.ycsb.RateLimiter.call(RateLimiter.java:13)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
at java.util.concurrent.FutureTask.run(FutureTask.java:138)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:619)
Caused by: java.lang.IllegalArgumentException: hostname can't be null
at java.net.InetSocketAddress.<init>(InetSocketAddress.java:121)
at 
org.apache.hadoop.hbase.HServerAddress.getResolvedAddress(HServerAddress.java:89)
at org.apache.hadoop.hbase.HServerAddress.<init>(HServerAddress.java:75)
at 
org.apache.hadoop.hbase.HRegionLocation.getServerAddress(HRegionLocation.java:101)
at 
org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.cacheLocation(HConnectionManager.java:1123)
at 
org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.access$000(HConnectionManager.java:439)
at 
org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$2.processRow(HConnectionManager.java:818)
... 27 more
{noformat}

Seems like a race, since it went away.

 IllegalArgumentException prefetching from META
 --

 Key: HBASE-4216
 URL: https://issues.apache.org/jira/browse/HBASE-4216
 Project: HBase
  Issue Type: Bug
  Components: client
Affects Versions: 0.92.0
Reporter: Todd Lipcon
Priority: Critical

 Received one of these while doing a YCSB test on 26 nodes on trunk:
 java.io.IOException: java.lang.IllegalArgumentException: hostname can't be 
 null





[jira] [Created] (HBASE-4217) HRS.closeRegion should be able to close regions with only the encoded name

2011-08-17 Thread Jean-Daniel Cryans (JIRA)
HRS.closeRegion should be able to close regions with only the encoded name
--

 Key: HBASE-4217
 URL: https://issues.apache.org/jira/browse/HBASE-4217
 Project: HBase
  Issue Type: Improvement
Affects Versions: 0.90.4
Reporter: Jean-Daniel Cryans
 Fix For: 0.92.0


We had some sort of an outage this morning due to a few racks losing power, and 
some regions were left in the following state:

ERROR: Region UNKNOWN_REGION on sv4r17s9:60020, 
key=e32bbe1f48c9b3633c557dc0291b90a3, not on HDFS or in META but deployed on 
sv4r17s9:60020

That region was deleted by the master but the region server never got the memo. 
Right now there's no way to force close it because HRS.closeRegion requires an 
HRI and the only way to create one is to get it from .META. which in our case 
doesn't contain a row for that region. Basically we have to wait until that 
server is dead to get rid of the region and make hbck happy.

The required change is to have closeRegion accept an encoded name in both HBA 
(when the RS address is provided) and HRS, since it's able to find it anyway 
from its list of live regions.
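A rough sketch of the proposed lookup (class and field names are illustrative, 
not the actual HRegionServer API): the region server already keeps its live 
regions keyed by name, so the encoded name alone is enough to locate the 
region to close without consulting .META.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch of the proposed lookup; HRegionServer's real online
// regions map and close path differ in detail.
public class CloseRegionSketch {
    // encoded region name -> opaque region handle
    final Map<String, Object> onlineRegions = new ConcurrentHashMap<>();

    boolean closeRegionByEncodedName(String encodedName) {
        Object region = onlineRegions.remove(encodedName);
        if (region == null) {
            return false; // region not served here; nothing to close
        }
        // real code would flush and close the region before removing it
        return true;
    }
}
```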





[jira] [Commented] (HBASE-4176) Exposing HBase Filters to the Thrift API

2011-08-17 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13086486#comment-13086486
 ] 

stack commented on HBASE-4176:
--

Anirudh.  Keep going.  It looks great.  Run xmllint every so often to ensure 
you haven't damaged well-formedness.  Run 'mvn -DskipTests site' to actually 
generate the doc and look at it.  Great stuff. Attach a patch only when done... 
and then I'll commit the whole shebang.  Good stuff.

 Exposing HBase Filters to the Thrift API
 

 Key: HBASE-4176
 URL: https://issues.apache.org/jira/browse/HBASE-4176
 Project: HBase
  Issue Type: Improvement
  Components: thrift
Reporter: Anirudh Todi
Assignee: Anirudh Todi
Priority: Minor
 Attachments: Filter Language (3).xml, Filter Language(2).docx, Filter 
 Language(2).xml, Filter Language(3).docx, Filter Language.docx, 
 HBASE-4176.patch, book.xml


 Currently, to use any of the filters, one has to explicitly add a scanner for 
 the filter in the Thrift API making it messy and long. With this patch, I am 
 trying to add support for all the filters in a clean way. The user specifies 
 a filter via a string. The string is parsed on the server to construct the 
 filter. More information can be found in the attached document named Filter 
 Language
 This patch is trying to extend and further the progress made by the patches 
 in the HBASE-1744 JIRA (https://issues.apache.org/jira/browse/HBASE-1744)





[jira] [Commented] (HBASE-4176) Exposing HBase Filters to the Thrift API

2011-08-17 Thread Anirudh Todi (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13086501#comment-13086501
 ] 

Anirudh Todi commented on HBASE-4176:
-

Great! I can continue working on it. If it's okay with you, can we go ahead 
and commit the code, and add this in a separate commit when I finish the book? 
They seem fairly unrelated.

 Exposing HBase Filters to the Thrift API
 

 Key: HBASE-4176
 URL: https://issues.apache.org/jira/browse/HBASE-4176
 Project: HBase
  Issue Type: Improvement
  Components: thrift
Reporter: Anirudh Todi
Assignee: Anirudh Todi
Priority: Minor
 Attachments: Filter Language (3).xml, Filter Language(2).docx, Filter 
 Language(2).xml, Filter Language(3).docx, Filter Language.docx, 
 HBASE-4176.patch, book.xml


 Currently, to use any of the filters, one has to explicitly add a scanner for 
 the filter in the Thrift API making it messy and long. With this patch, I am 
 trying to add support for all the filters in a clean way. The user specifies 
 a filter via a string. The string is parsed on the server to construct the 
 filter. More information can be found in the attached document named Filter 
 Language
 This patch is trying to extend and further the progress made by the patches 
 in the HBASE-1744 JIRA (https://issues.apache.org/jira/browse/HBASE-1744)





[jira] [Commented] (HBASE-4209) The HBase hbase-daemon.sh SIGKILLs master when stopping it

2011-08-17 Thread Roman Shaposhnik (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13086513#comment-13086513
 ] 

Roman Shaposhnik commented on HBASE-4209:
-

stack, before I submit the patch, I would really appreciate it if you could let 
me know the relationship between bin/stop-hbase.sh and bin/hbase-daemon.sh. I 
was under the impression that whatever stop-hbase.sh triggers in the master 
code would also be triggered by the JVM shutdown hook upon receiving SIGTERM, 
but it doesn't seem to be that way. Do we have to call bin/stop-hbase.sh 
manually from within hbase-daemon.sh before stopping daemons?

 The HBase hbase-daemon.sh SIGKILLs master when stopping it
 --

 Key: HBASE-4209
 URL: https://issues.apache.org/jira/browse/HBASE-4209
 Project: HBase
  Issue Type: Bug
  Components: master
Reporter: Roman Shaposhnik

 There's a bit of code in hbase-daemon.sh that causes the HBase master to be 
 SIGKILLed when stopping it, rather than first trying SIGTERM (as is done for 
 the other daemons). When HBase runs in standalone mode (and the only daemon 
 you need is the master), this causes newly created tables to go missing, 
 because unflushed data is thrown out. If there is no good reason to kill the 
 master with SIGKILL, perhaps we can take that special case out and rely on 
 SIGTERM.
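
A minimal JVM sketch (not HBase code) of why the two signals differ here: a shutdown hook stands in for the master's flush-and-close work. The hook runs on SIGTERM or normal exit, but SIGKILL terminates the JVM immediately, so the hook, and any unflushed-data handling, never executes.

```java
public class ShutdownHookDemo {
    // Stand-in for the master's flush/close work that must run before exit.
    static boolean flushPendingData() {
        System.out.println("flush completed");
        return true;
    }

    public static void main(String[] args) {
        // Runs on SIGTERM or normal exit; SIGKILL (kill -9) skips it entirely.
        Runtime.getRuntime().addShutdownHook(new Thread(ShutdownHookDemo::flushPendingData));
        System.out.println("running; SIGTERM triggers the hook, SIGKILL does not");
    }
}
```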

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-4206) jenkins hash implementation uses longs unnecessarily

2011-08-17 Thread Ron Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13086526#comment-13086526
 ] 

Ron Yang commented on HBASE-4206:
-

Seems about 35% faster on my MBP core i7 osx 10.6:
java version 1.6.0_26
Java(TM) SE Runtime Environment (build 1.6.0_26-b03-384-10M3425)
Java HotSpot(TM) 64-Bit Server VM (build 20.1-b02-384, mixed mode)


 0% Scenario{vm=java, trial=0, benchmark=JenkinsOld, len=5} 29.96 ns; σ=0.45 ns @ 10 trials
 6% Scenario{vm=java, trial=0, benchmark=JenkinsNew, len=5} 15.03 ns; σ=0.13 ns @ 3 trials
13% Scenario{vm=java, trial=0, benchmark=JenkinsOld, len=10} 32.73 ns; σ=0.06 ns @ 3 trials
19% Scenario{vm=java, trial=0, benchmark=JenkinsNew, len=10} 17.75 ns; σ=0.04 ns @ 3 trials
25% Scenario{vm=java, trial=0, benchmark=JenkinsOld, len=15} 55.01 ns; σ=0.20 ns @ 3 trials
31% Scenario{vm=java, trial=0, benchmark=JenkinsNew, len=15} 26.48 ns; σ=0.26 ns @ 3 trials
38% Scenario{vm=java, trial=0, benchmark=JenkinsOld, len=20} 59.97 ns; σ=0.17 ns @ 3 trials
44% Scenario{vm=java, trial=0, benchmark=JenkinsNew, len=20} 29.21 ns; σ=0.12 ns @ 3 trials
50% Scenario{vm=java, trial=0, benchmark=JenkinsOld, len=500} 1103.94 ns; σ=5.87 ns @ 3 trials
56% Scenario{vm=java, trial=0, benchmark=JenkinsNew, len=500} 710.87 ns; σ=0.73 ns @ 3 trials
63% Scenario{vm=java, trial=0, benchmark=JenkinsOld, len=1000} 2206.56 ns; σ=5.04 ns @ 3 trials
69% Scenario{vm=java, trial=0, benchmark=JenkinsNew, len=1000} 1400.48 ns; σ=5.44 ns @ 3 trials
75% Scenario{vm=java, trial=0, benchmark=JenkinsOld, len=10000} 21632.90 ns; σ=38.49 ns @ 3 trials
81% Scenario{vm=java, trial=0, benchmark=JenkinsNew, len=10000} 13975.43 ns; σ=65.42 ns @ 3 trials
88% Scenario{vm=java, trial=0, benchmark=JenkinsOld, len=100000} 216426.33 ns; σ=1378.41 ns @ 3 trials
94% Scenario{vm=java, trial=0, benchmark=JenkinsNew, len=100000} 139348.44 ns; σ=594.38 ns @ 3 trials

   len  benchmark       ns linear runtime
     5 JenkinsOld     30.0 =
     5 JenkinsNew     15.0 =
    10 JenkinsOld     32.7 =
    10 JenkinsNew     17.7 =
    15 JenkinsOld     55.0 =
    15 JenkinsNew     26.5 =
    20 JenkinsOld     60.0 =
    20 JenkinsNew     29.2 =
   500 JenkinsOld   1103.9 =
   500 JenkinsNew    710.9 =
  1000 JenkinsOld   2206.6 =
  1000 JenkinsNew   1400.5 =
 10000 JenkinsOld  21632.9 ===
 10000 JenkinsNew  13975.4 ==
100000 JenkinsOld 216426.3 ==============================
100000 JenkinsNew 139348.4 ===================

Caliper benchmark source:
public static class Benchmark6 extends SimpleBenchmark {
  @Param({"5", "10", "15", "20", "500", "1000", "10000", "100000"}) int len;
  byte[] bs;

  @Override protected void setUp() {
    Random r = new Random();
    bs = new byte[len];
    r.nextBytes(bs);
  }

  public boolean timeJenkinsOld(int reps) {
    int h = 0;
    for (int x = 0; x < reps; x++) {
      h += JenkinsHashOld.hash(bs, h);
    }
    return true;
  }

  public boolean timeJenkinsNew(int reps) {
    int h = 0;
    JenkinsHashNew jh = new JenkinsHashNew();
    for (int x = 0; x < reps; x++) {
      h += jh.hash(bs, 0, len, h);
    }
    return true;
  }
}

 jenkins hash implementation uses longs unnecessarily
 

 Key: HBASE-4206
 URL: https://issues.apache.org/jira/browse/HBASE-4206
 Project: HBase
  Issue Type: Improvement
  Components: util
Reporter: Ron Yang
Priority: Minor
 Attachments: ryang.patch


 I don't believe you need to use long for a, b, c, and as a result you no 
 longer need to AND (&) against INT_MASK.
 At a minimum the private static longs should be made final, and the main 
 method should not print the absolute value of the hash but instead use 
 something like Integer.toHexString
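
The core of the argument can be sketched as follows (this is an illustrative mixing round, not the actual HBase patch): Java `int` arithmetic wraps at 32 bits natively, so a Jenkins-style mix over `int`s needs no explicit `& INT_MASK` step, unlike the `long`-based version.

```java
public class IntMixSketch {
    // One Jenkins-style mixing round on 32-bit ints; overflow wraps natively,
    // so no "& INT_MASK" is needed anywhere.
    static int mix(int a, int b, int c) {
        a -= c; a ^= Integer.rotateLeft(c, 4);  c += b;
        b -= a; b ^= Integer.rotateLeft(a, 6);  a += c;
        c -= b; c ^= Integer.rotateLeft(b, 8);  b += a;
        return c;
    }

    public static void main(String[] args) {
        // Per the report: print the hash with Integer.toHexString rather than
        // its absolute value (Math.abs is lossy for Integer.MIN_VALUE).
        System.out.println(Integer.toHexString(mix(0xdeadbeef, 0xcafebabe, 17)));
    }
}
```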

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Created] (HBASE-4218) Delta Encoding of KeyValues (aka prefix compression)

2011-08-17 Thread Jacek Migdal (JIRA)
Delta Encoding of KeyValues  (aka prefix compression)
-

 Key: HBASE-4218
 URL: https://issues.apache.org/jira/browse/HBASE-4218
 Project: HBase
  Issue Type: Improvement
  Components: io
Reporter: Jacek Migdal


A compression for keys. Keys are sorted in HFile and they are usually very 
similar, so it is possible to design better compression than general-purpose 
algorithms.

It is an additional step designed to be used in memory. It aims to save memory 
in the cache as well as to speed up seeks within HFileBlocks. It should 
improve performance a lot if key lengths are larger than value lengths. For 
example, it makes a lot of sense to use it when the value is a counter.

Initial tests on real data (key length = ~90 bytes, value length = 8 bytes) 
showed that I could achieve a decent level of compression:
 key compression ratio: 92%
 total compression ratio: 85%
 LZO on the same data: 85%
 LZO after delta encoding: 91%
while having much better performance (20-80% faster decompression than LZO). 
Moreover, it should allow far more efficient seeking, which should improve 
performance a bit.

It seems that simple compression algorithms are good enough. Most of the 
savings are due to prefix compression, int128 encoding, timestamp diffs and 
bitfields to avoid duplication. That way, comparisons of compressed data can be 
much faster than a byte comparator (thanks to prefix compression and bitfields).

In order to implement it in HBase, two important design changes will be needed:
- solidify the interface to HFileBlock / HFileReader Scanner to provide seeking 
and iterating; access to the uncompressed buffer in HFileBlock will have bad 
performance
- extend comparators to support comparison assuming that the first N bytes are 
equal (or some fields are equal)

Link to a discussion about something similar:
http://search-hadoop.com/m/5aqGXJEnaD1/hbase+windowssubj=Re+prefix+compression
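
To make the prefix-compression idea concrete, here is a minimal sketch of common-prefix encoding over sorted keys (the names and the `len:suffix` wire format are illustrative, not the patch's actual encoding): each key is stored as the length of the prefix it shares with the previous key, plus its distinct suffix.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class PrefixEncodeSketch {
    // Encode each key as "sharedPrefixLen:suffix" relative to the previous key.
    static List<String> encode(List<String> sortedKeys) {
        List<String> out = new ArrayList<>();
        String prev = "";
        for (String k : sortedKeys) {
            int p = 0;
            int max = Math.min(prev.length(), k.length());
            while (p < max && prev.charAt(p) == k.charAt(p)) p++;
            out.add(p + ":" + k.substring(p));
            prev = k;
        }
        return out;
    }

    // Rebuild the full keys by reusing the shared prefix of the previous key.
    static List<String> decode(List<String> encoded) {
        List<String> out = new ArrayList<>();
        String prev = "";
        for (String e : encoded) {
            int colon = e.indexOf(':');
            int p = Integer.parseInt(e.substring(0, colon));
            String k = prev.substring(0, p) + e.substring(colon + 1);
            out.add(k);
            prev = k;
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> keys = Arrays.asList("row0001/cf:a", "row0001/cf:b", "row0002/cf:a");
        // Suffixes shrink as prefixes repeat, which is where the savings come from.
        System.out.println(PrefixEncodeSketch.encode(keys));
    }
}
```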

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-4218) Delta Encoding of KeyValues (aka prefix compression)

2011-08-17 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13086545#comment-13086545
 ] 

Ted Yu commented on HBASE-4218:
---

bq. Moreover, it should allow far more efficient seeking which should improve 
performance a bit.
Can the performance improvement be quantified?

 Delta Encoding of KeyValues  (aka prefix compression)
 -

 Key: HBASE-4218
 URL: https://issues.apache.org/jira/browse/HBASE-4218
 Project: HBase
  Issue Type: Improvement
  Components: io
Reporter: Jacek Migdal
  Labels: compression

 A compression for keys. Keys are sorted in HFile and they are usually very 
 similar. Because of that, it is possible to design better compression than 
 general purpose algorithms,
 It is an additional step designed to be used in memory. It aims to save 
 memory in cache as well as speeding seeks within HFileBlocks. It should 
 improve performance a lot, if key lengths are larger than value lengths. For 
 example, it makes a lot of sense to use it when value is a counter.
 Initial tests on real data (key length = ~ 90 bytes , value length = 8 bytes) 
 shows that I could achieve decent level of compression:
  key compression ratio: 92%
  total compression ratio: 85%
  LZO on the same data: 85%
  LZO after delta encoding: 91%
 While having much better performance (20-80% faster decompression ratio than 
 LZO). Moreover, it should allow far more efficient seeking which should 
 improve performance a bit.
 It seems that a simple compression algorithms are good enough. Most of the 
 savings are due to prefix compression, int128 encoding, timestamp diffs and 
 bitfields to avoid duplication. That way, comparisons of compressed data can 
 be much faster than a byte comparator (thanks to prefix compression and 
 bitfields).
 In order to implement it in HBase two important changes in design will be 
 needed:
 -solidify interface to HFileBlock / HFileReader Scanner to provide seeking 
 and iterating; access to uncompressed buffer in HFileBlock will have bad 
 performance
 -extend comparators to support comparison assuming that N first bytes are 
 equal (or some fields are equal)
 Link to a discussion about something similar:
 http://search-hadoop.com/m/5aqGXJEnaD1/hbase+windowssubj=Re+prefix+compression

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-4218) Delta Encoding of KeyValues (aka prefix compression)

2011-08-17 Thread Jacek Migdal (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13086556#comment-13086556
 ] 

Jacek Migdal commented on HBASE-4218:
-

Yes, I plan to measure seek performance within one block.

I haven't implemented it yet, but I expect that it will make seeking and 
decompressing KeyValues as fast as operating on uncompressed bytes.

The primary goal is to save memory in buffers.

 Delta Encoding of KeyValues  (aka prefix compression)
 -

 Key: HBASE-4218
 URL: https://issues.apache.org/jira/browse/HBASE-4218
 Project: HBase
  Issue Type: Improvement
  Components: io
Reporter: Jacek Migdal
  Labels: compression

 A compression for keys. Keys are sorted in HFile and they are usually very 
 similar. Because of that, it is possible to design better compression than 
 general purpose algorithms,
 It is an additional step designed to be used in memory. It aims to save 
 memory in cache as well as speeding seeks within HFileBlocks. It should 
 improve performance a lot, if key lengths are larger than value lengths. For 
 example, it makes a lot of sense to use it when value is a counter.
 Initial tests on real data (key length = ~ 90 bytes , value length = 8 bytes) 
 shows that I could achieve decent level of compression:
  key compression ratio: 92%
  total compression ratio: 85%
  LZO on the same data: 85%
  LZO after delta encoding: 91%
 While having much better performance (20-80% faster decompression ratio than 
 LZO). Moreover, it should allow far more efficient seeking which should 
 improve performance a bit.
 It seems that a simple compression algorithms are good enough. Most of the 
 savings are due to prefix compression, int128 encoding, timestamp diffs and 
 bitfields to avoid duplication. That way, comparisons of compressed data can 
 be much faster than a byte comparator (thanks to prefix compression and 
 bitfields).
 In order to implement it in HBase two important changes in design will be 
 needed:
 -solidify interface to HFileBlock / HFileReader Scanner to provide seeking 
 and iterating; access to uncompressed buffer in HFileBlock will have bad 
 performance
 -extend comparators to support comparison assuming that N first bytes are 
 equal (or some fields are equal)
 Link to a discussion about something similar:
 http://search-hadoop.com/m/5aqGXJEnaD1/hbase+windowssubj=Re+prefix+compression

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-4218) Delta Encoding of KeyValues (aka prefix compression)

2011-08-17 Thread Matt Corgan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13086574#comment-13086574
 ] 

Matt Corgan commented on HBASE-4218:


Sorry I haven't chimed in on this in a while, but I've made significant 
progress implementing some of the ideas I mentioned in the discussion you 
linked to: taking a sorted List<KeyValue>, converting it to a compressed 
byte[], and then providing fast mechanisms for reading the byte[] back into 
KeyValues.  It should work for block indexes and data blocks.

I don't think I'll be able to do the full integration into HBase, but I'm 
trying to get the code to a point where it's well designed, tested, and easy 
(or at least possible) to start working into the code base.  I'll try to get 
it on github in the next couple weeks.  I wish I could dedicate more time, but 
it's been a nights/weekends project.

Here's a quick storage format overview.  Class names begin with "Pt" for 
"Prefix Trie".  

A block of KeyValues gets converted to a byte[] composed of 5 sections:

1) PtBlockMeta stores some offsets into the block, the width of some 
byte-encoded integers, etc.  http://pastebin.com/iizJz3f4

2) PtRowNodes are the bulk of the complexity.  They store a trie structure for 
rebuilding the row keys in the block.  Each Leaf node has a list of offsets 
that point to the corresponding columns, timestamps, and data offsets/lengths 
in that row.  The row data is structured for efficient sequential iteration 
and/or individual row lookups.  http://pastebin.com/cb79N0Ge

3) PtColNodes store a trie structure that provides random access to column 
qualifiers.  A PtRowNode points at one of these and it traverses its parents 
backwards through the trie to rebuild the full column qualifier.  Important for 
wide rows.  http://pastebin.com/7rsq7epp

4) TimestampDeltas are byte-encoded deltas from the minimum timestamp in the 
block.  The PtRowNodes contain pointers to these deltas.  The width of all 
deltas is determined by the longest one.  Supports having all timestamps equal 
to the minTimestamp resulting in zero storage cost.

5) A data section made of all data values concatenated together.  The 
PtRowNodes contain the offsets/lengths.
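
The timestamp-delta idea in section 4 can be sketched as below (illustrative only, not Matt's code): deltas are measured from the block's minimum timestamp, all deltas share the width of the largest one, and if every timestamp equals the minimum the width is zero, so the deltas cost nothing.

```java
public class TimestampDeltaSketch {
    // Bytes needed to represent the largest delta from the block minimum;
    // returns 0 when all timestamps are equal.
    static int deltaWidth(long[] timestamps) {
        long min = Long.MAX_VALUE, max = Long.MIN_VALUE;
        for (long t : timestamps) {
            min = Math.min(min, t);
            max = Math.max(max, t);
        }
        long span = max - min;
        int width = 0;
        while (span != 0) {
            width++;
            span >>>= 8;
        }
        return width;
    }

    public static void main(String[] args) {
        System.out.println(deltaWidth(new long[]{100, 100, 100})); // 0: all equal
        System.out.println(deltaWidth(new long[]{100, 400}));      // 2: delta 300 needs 2 bytes
    }
}
```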


My first priority is getting the storage format right.  Then optimizing the 
read path.  Then the write path.  I'd love to hear any comments, and will 
continue to work on getting the full code ready.

 Delta Encoding of KeyValues  (aka prefix compression)
 -

 Key: HBASE-4218
 URL: https://issues.apache.org/jira/browse/HBASE-4218
 Project: HBase
  Issue Type: Improvement
  Components: io
Reporter: Jacek Migdal
  Labels: compression

 A compression for keys. Keys are sorted in HFile and they are usually very 
 similar. Because of that, it is possible to design better compression than 
 general purpose algorithms,
 It is an additional step designed to be used in memory. It aims to save 
 memory in cache as well as speeding seeks within HFileBlocks. It should 
 improve performance a lot, if key lengths are larger than value lengths. For 
 example, it makes a lot of sense to use it when value is a counter.
 Initial tests on real data (key length = ~ 90 bytes , value length = 8 bytes) 
 shows that I could achieve decent level of compression:
  key compression ratio: 92%
  total compression ratio: 85%
  LZO on the same data: 85%
  LZO after delta encoding: 91%
 While having much better performance (20-80% faster decompression ratio than 
 LZO). Moreover, it should allow far more efficient seeking which should 
 improve performance a bit.
 It seems that a simple compression algorithms are good enough. Most of the 
 savings are due to prefix compression, int128 encoding, timestamp diffs and 
 bitfields to avoid duplication. That way, comparisons of compressed data can 
 be much faster than a byte comparator (thanks to prefix compression and 
 bitfields).
 In order to implement it in HBase two important changes in design will be 
 needed:
 -solidify interface to HFileBlock / HFileReader Scanner to provide seeking 
 and iterating; access to uncompressed buffer in HFileBlock will have bad 
 performance
 -extend comparators to support comparison assuming that N first bytes are 
 equal (or some fields are equal)
 Link to a discussion about something similar:
 http://search-hadoop.com/m/5aqGXJEnaD1/hbase+windowssubj=Re+prefix+compression

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Created] (HBASE-4219) Add Per-Column Family Metrics

2011-08-17 Thread Nicolas Spiegelberg (JIRA)
Add Per-Column Family Metrics
-

 Key: HBASE-4219
 URL: https://issues.apache.org/jira/browse/HBASE-4219
 Project: HBase
  Issue Type: New Feature
Reporter: Nicolas Spiegelberg
Assignee: David Goode
 Fix For: 0.92.0


Right now, we have region server level statistics.  However, the read/write 
flow varies a lot based on the column family involved.  We should add dynamic, 
per column family metrics to JMX so we can track each column family 
individually.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-4176) Exposing HBase Filters to the Thrift API

2011-08-17 Thread Anirudh Todi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4176?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anirudh Todi updated HBASE-4176:


Attachment: book2.html
book2.xml

Attaching book2.xml and book2.html - containing my Filter Language document

 Exposing HBase Filters to the Thrift API
 

 Key: HBASE-4176
 URL: https://issues.apache.org/jira/browse/HBASE-4176
 Project: HBase
  Issue Type: Improvement
  Components: thrift
Reporter: Anirudh Todi
Assignee: Anirudh Todi
Priority: Minor
 Attachments: Filter Language (3).xml, Filter Language(2).docx, Filter 
 Language(2).xml, Filter Language(3).docx, Filter Language.docx, 
 HBASE-4176.patch, book.xml, book2.html, book2.xml


 Currently, to use any of the filters, one has to explicitly add a scanner for 
 the filter in the Thrift API making it messy and long. With this patch, I am 
 trying to add support for all the filters in a clean way. The user specifies 
 a filter via a string. The string is parsed on the server to construct the 
 filter. More information can be found in the attached document named Filter 
 Language
 This patch is trying to extend and further the progress made by the patches 
 in the HBASE-1744 JIRA (https://issues.apache.org/jira/browse/HBASE-1744)

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-4213) Support instant schema updates with out master's intervention (i.e with out enable/disable and bulk assign/unassign)

2011-08-17 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13086626#comment-13086626
 ] 

Ted Yu commented on HBASE-4213:
---

In TableEventHandler.java, Exception handling should be enhanced:
{code}
+  } catch (KeeperException e) {
+    LOG.warn("Instant schema change failed for table " + tableName);
+  }
{code}


 Support instant schema updates with out master's intervention (i.e with out 
 enable/disable and bulk assign/unassign)
 

 Key: HBASE-4213
 URL: https://issues.apache.org/jira/browse/HBASE-4213
 Project: HBase
  Issue Type: Improvement
Reporter: Subbu M Iyer
Assignee: Subbu M Iyer
 Fix For: 0.92.0

 Attachments: HBASE-4213-Instant_schema_change.patch


 This Jira is a slight variation in approach to what is being done as part of 
 https://issues.apache.org/jira/browse/HBASE-1730
 Support instant schema updates such as Modify Table, Add Column, and Modify 
 Column operations:
 1. Without enabling/disabling the table.
 2. Without bulk unassign/assign of regions.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-2321) Support RPC interface changes at runtime

2011-08-17 Thread Benoit Sigoure (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-2321?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benoit Sigoure updated HBASE-2321:
--

Hadoop Flags: [Incompatible change, Reviewed]  (was: [Reviewed])

This breaks RPC compatibility.

 Support RPC interface changes at runtime
 

 Key: HBASE-2321
 URL: https://issues.apache.org/jira/browse/HBASE-2321
 Project: HBase
  Issue Type: Improvement
  Components: coprocessors
Reporter: Andrew Purtell
Assignee: Gary Helmling
 Fix For: 0.92.0


 Now we are able to append methods to interfaces without breaking RPC 
 compatibility with earlier releases. However there is no way that I am aware 
 of to dynamically add entire new RPC interfaces. Methods/parameters are fixed 
 to the class used to instantiate the server at that time. Coprocessors need 
 this. They will extend functionality on regions in arbitrary ways. How to 
 support that on the client side? A couple of options:
 1. New RPC from scratch.
 2. Modify HBaseServer such that multiple interface objects can be used for 
 reflection and objects can be added or removed at runtime. 
 3. Have the coprocessor host instantiate new HBaseServer instances on 
 ephemeral ports and publish the endpoints to clients via Zookeeper. Couple 
 this with a small modification to HBaseServer to support elastic thread pools 
 to minimize the number of threads that might be kept around in the JVM. 
 4. Add a generic method to HRegionInterface, an ioctl-like construction, 
 which accepts a ImmutableBytesWritable key and an array of Writable as 
 parameters. 
 My opinion is we should opt for #4 as it is the simplest and most expedient 
 approach. I could also do #3 if consensus prefers. Really we should do #1 but 
 it's not clear who has the time for that at the moment. 

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-4218) Delta Encoding of KeyValues (aka prefix compression)

2011-08-17 Thread Matt Corgan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13086634#comment-13086634
 ] 

Matt Corgan commented on HBASE-4218:


That sounds great Jacek.  Let me know how to get the interfaces, tests, and 
benchmarks when you're ready to share them.  They would be really helpful.

 Delta Encoding of KeyValues  (aka prefix compression)
 -

 Key: HBASE-4218
 URL: https://issues.apache.org/jira/browse/HBASE-4218
 Project: HBase
  Issue Type: Improvement
  Components: io
Reporter: Jacek Migdal
  Labels: compression

 A compression for keys. Keys are sorted in HFile and they are usually very 
 similar. Because of that, it is possible to design better compression than 
 general purpose algorithms,
 It is an additional step designed to be used in memory. It aims to save 
 memory in cache as well as speeding seeks within HFileBlocks. It should 
 improve performance a lot, if key lengths are larger than value lengths. For 
 example, it makes a lot of sense to use it when value is a counter.
 Initial tests on real data (key length = ~ 90 bytes , value length = 8 bytes) 
 shows that I could achieve decent level of compression:
  key compression ratio: 92%
  total compression ratio: 85%
  LZO on the same data: 85%
  LZO after delta encoding: 91%
 While having much better performance (20-80% faster decompression ratio than 
 LZO). Moreover, it should allow far more efficient seeking which should 
 improve performance a bit.
 It seems that a simple compression algorithms are good enough. Most of the 
 savings are due to prefix compression, int128 encoding, timestamp diffs and 
 bitfields to avoid duplication. That way, comparisons of compressed data can 
 be much faster than a byte comparator (thanks to prefix compression and 
 bitfields).
 In order to implement it in HBase two important changes in design will be 
 needed:
 -solidify interface to HFileBlock / HFileReader Scanner to provide seeking 
 and iterating; access to uncompressed buffer in HFileBlock will have bad 
 performance
 -extend comparators to support comparison assuming that N first bytes are 
 equal (or some fields are equal)
 Link to a discussion about something similar:
 http://search-hadoop.com/m/5aqGXJEnaD1/hbase+windowssubj=Re+prefix+compression

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-4220) Lots of DNS queries from client

2011-08-17 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13086643#comment-13086643
 ] 

Todd Lipcon commented on HBASE-4220:


I managed to snag the following stack trace:

{noformat}
"pool-1-thread-1" prio=10 tid=0x2aac380e4800 nid=0x7797 runnable [0x40d4d000]
   java.lang.Thread.State: RUNNABLE
at java.lang.Throwable.fillInStackTrace(Native Method)
- locked <0x2aabb74d06b8> (a java.lang.NumberFormatException)
at java.lang.Throwable.<init>(Throwable.java:196)
at java.lang.Exception.<init>(Exception.java:41)
at java.lang.RuntimeException.<init>(RuntimeException.java:43)
at java.lang.IllegalArgumentException.<init>(IllegalArgumentException.java:36)
at java.lang.NumberFormatException.<init>(NumberFormatException.java:38)
at java.lang.NumberFormatException.forInputString(NumberFormatException.java:48)
at java.lang.Integer.parseInt(Integer.java:449)
at java.lang.Integer.parseInt(Integer.java:499)
at sun.net.util.IPAddressUtil.textToNumericFormatV4(IPAddressUtil.java:94)
at java.net.InetAddress.getAllByName(InetAddress.java:1051)
at java.net.InetAddress.getAllByName(InetAddress.java:1020)
at java.net.InetAddress.getByName(InetAddress.java:970)
at java.net.InetSocketAddress.<init>(InetSocketAddress.java:124)
at org.apache.hadoop.hbase.HServerAddress.<init>(HServerAddress.java:75)
at org.apache.hadoop.hbase.HRegionLocation.getServerAddress(HRegionLocation.java:101)
at org.apache.hadoop.hbase.client.ServerCallable.instantiateServer(ServerCallable.java:71)
at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getRegionServerWithRetries(HConnectionManager.java:1238)
at org.apache.hadoop.hbase.client.HTable.get(HTable.java:612)
{noformat}
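
The trace shows a fresh InetSocketAddress (and hence a resolver call) being constructed per request. One way to avoid the repeated lookups is to resolve each endpoint once and cache the result; this is a hypothetical sketch of that approach, not the eventual HBase fix.

```java
import java.net.InetSocketAddress;
import java.util.concurrent.ConcurrentHashMap;

public class AddressCacheSketch {
    private static final ConcurrentHashMap<String, InetSocketAddress> CACHE =
        new ConcurrentHashMap<>();

    // computeIfAbsent resolves the address only on first access per endpoint;
    // later calls return the cached instance with no new lookup.
    static InetSocketAddress get(String host, int port) {
        return CACHE.computeIfAbsent(host + ":" + port,
            k -> new InetSocketAddress(host, port));
    }

    public static void main(String[] args) {
        // true: the second call returns the same cached instance.
        System.out.println(get("127.0.0.1", 60020) == get("127.0.0.1", 60020));
    }
}
```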

 Lots of DNS queries from client
 ---

 Key: HBASE-4220
 URL: https://issues.apache.org/jira/browse/HBASE-4220
 Project: HBase
  Issue Type: Bug
  Components: client
Affects Versions: 0.92.0
Reporter: Todd Lipcon
Priority: Critical
 Fix For: 0.92.0


 In running a YCSB workload, I managed to DDOS a DNS server since it seems to 
 be flooding lots of DNS requests. Installing nscd on the client machines 
 improved throughput by a factor of 6 and stopped killing the server. These 
 are long-running clients, so it's not clear why we do so many lookups.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Created] (HBASE-4220) Lots of DNS queries from client

2011-08-17 Thread Todd Lipcon (JIRA)
Lots of DNS queries from client
---

 Key: HBASE-4220
 URL: https://issues.apache.org/jira/browse/HBASE-4220
 Project: HBase
  Issue Type: Bug
  Components: client
Affects Versions: 0.92.0
Reporter: Todd Lipcon
Priority: Critical
 Fix For: 0.92.0


While running a YCSB workload, I managed to DDOS a DNS server because the 
client floods it with DNS requests. Installing nscd on the client machines 
improved throughput by a factor of 6 and stopped killing the server. These are 
long-running clients, so it's not clear why we do so many lookups.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-4213) Support instant schema updates with out master's intervention (i.e with out enable/disable and bulk assign/unassign)

2011-08-17 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13086645#comment-13086645
 ] 

Ted Yu commented on HBASE-4213:
---

In HRegionServer.java, the following method would always return true if there 
is no intervening IOException:
{code}
+  public boolean refreshSchema(byte[] tableName) throws IOException {
{code}
I wonder if the return type should be void.

 Support instant schema updates with out master's intervention (i.e with out 
 enable/disable and bulk assign/unassign)
 

 Key: HBASE-4213
 URL: https://issues.apache.org/jira/browse/HBASE-4213
 Project: HBase
  Issue Type: Improvement
Reporter: Subbu M Iyer
Assignee: Subbu M Iyer
 Fix For: 0.92.0

 Attachments: HBASE-4213-Instant_schema_change.patch


 This Jira is a slight variation in approach to what is being done as part of 
 https://issues.apache.org/jira/browse/HBASE-1730
 Support instant schema updates such as Modify Table, Add Column, Modify 
 Column operations:
 1. With out enable/disabling the table.
 2. With out bulk unassign/assign of regions.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-4213) Support instant schema updates with out master's intervention (i.e with out enable/disable and bulk assign/unassign)

2011-08-17 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13086651#comment-13086651
 ] 

Ted Yu commented on HBASE-4213:
---

In EventHandler.java, should isSchemaChangeEvent() include C_M_DELETE_FAMILY as 
well?

 Support instant schema updates with out master's intervention (i.e with out 
 enable/disable and bulk assign/unassign)
 

 Key: HBASE-4213
 URL: https://issues.apache.org/jira/browse/HBASE-4213
 Project: HBase
  Issue Type: Improvement
Reporter: Subbu M Iyer
Assignee: Subbu M Iyer
 Fix For: 0.92.0

 Attachments: HBASE-4213-Instant_schema_change.patch


 This Jira is a slight variation in approach to what is being done as part of 
 https://issues.apache.org/jira/browse/HBASE-1730
 Support instant schema updates such as Modify Table, Add Column, Modify 
 Column operations:
 1. With out enable/disabling the table.
 2. With out bulk unassign/assign of regions.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-4218) Delta Encoding of KeyValues (aka prefix compression)

2011-08-17 Thread Jacek Migdal (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13086650#comment-13086650
 ] 

Jacek Migdal commented on HBASE-4218:
-

So far the implemented interface looks like:
{noformat} 
/**
 * Fast compression of KeyValue. It aims to be fast and efficient
 * using assumptions:
 * - the KeyValue are stored sorted by key
 * - we know the structure of KeyValue
 * - the values are iterated always forward from beginning of block
 * - application specific knowledge 
 * 
 * It is designed to work fast enough to be feasible as in memory compression.
 */
public interface DeltaEncoder {
  /**
   * Compress KeyValues and write them to output buffer.
   * @param writeHere Where to write compressed data.
   * @param rawKeyValues Source of KeyValue for compression.
   * @throws IOException If there is an error in writeHere.
   */
  public void compressKeyValue(OutputStream writeHere, ByteBuffer rawKeyValues)
  throws IOException;
  
  /**
   * Uncompress assuming that original size is known.
   * @param source Compressed stream of KeyValues.
   * @param decompressedSize Size in bytes of uncompressed KeyValues.
   * @return Uncompressed block of KeyValues.
   * @throws IOException If there is an error in source.
   * @throws DeltaEncoderToSmallBufferException If the specified uncompressed
   *   size is too small.
   */
  public ByteBuffer uncompressKeyValue(DataInputStream source,
  int decompressedSize)
  throws IOException, DeltaEncoderToSmallBufferException;
}
{noformat}

I also need some kind of interface for iterating and seeking. I haven't 
written it yet, but I would like to have something like:
{noformat}
  public Iterator<KeyValue> getIterator(ByteBuffer encodedKeyValues);
  public Iterator<KeyValue> getIteratorStartingFrom(ByteBuffer 
encodedKeyValues, byte[] keyBuffer, int offset, int length);
{noformat}
For me this would work, but for your use case I might have to change it to 
something like:
{noformat}
  public EncodingIterator getState(ByteBuffer encodedKeyValues);
class EncodingIterator implements Iterator<KeyValue> {
...
  public void seekToBeginning();
  public void seekTo(byte[] keyBuffer, int offset, int length);
}
{noformat}

I will figure out how we could share the code.
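To make the prefix-compression idea behind this interface concrete, here is a rough, hypothetical sketch. The class and method names are mine, and the one-byte shared-prefix length field is a simplification for illustration, not something taken from the actual patch:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration (not the HBASE-4218 patch itself) of the core
// prefix-compression idea: since KeyValues are stored sorted by key, each
// key can be stored as (shared-prefix length, suffix) relative to the
// previous key.
class PrefixCompressionSketch {

  // Length of the common prefix of a and b.
  static int commonPrefix(byte[] a, byte[] b) {
    int i = 0;
    while (i < a.length && i < b.length && a[i] == b[i]) {
      i++;
    }
    return i;
  }

  // Encode sorted keys; each record is [prefixLen (1 byte)][suffix bytes].
  // A single length byte caps shared prefixes at 255 bytes, which is
  // enough for this sketch.
  static List<byte[]> encode(List<byte[]> sortedKeys) {
    List<byte[]> out = new ArrayList<>();
    byte[] prev = new byte[0];
    for (byte[] key : sortedKeys) {
      int p = Math.min(commonPrefix(prev, key), 255);
      byte[] rec = new byte[1 + key.length - p];
      rec[0] = (byte) p;
      System.arraycopy(key, p, rec, 1, key.length - p);
      out.add(rec);
      prev = key;
    }
    return out;
  }

  // Decoding always iterates forward from the beginning of the block,
  // matching the assumption stated in the DeltaEncoder javadoc.
  static List<byte[]> decode(List<byte[]> encoded) {
    List<byte[]> out = new ArrayList<>();
    byte[] prev = new byte[0];
    for (byte[] rec : encoded) {
      int p = rec[0] & 0xFF;
      byte[] key = new byte[p + rec.length - 1];
      System.arraycopy(prev, 0, key, 0, p);
      System.arraycopy(rec, 1, key, p, rec.length - 1);
      out.add(key);
      prev = key;
    }
    return out;
  }
}
```

A real encoder would additionally exploit the KeyValue structure (timestamp diffs, type bitfields) rather than treating keys as opaque byte arrays.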

 Delta Encoding of KeyValues  (aka prefix compression)
 -

 Key: HBASE-4218
 URL: https://issues.apache.org/jira/browse/HBASE-4218
 Project: HBase
  Issue Type: Improvement
  Components: io
Reporter: Jacek Migdal
  Labels: compression

 A compression scheme for keys. Keys are sorted in an HFile and are usually 
 very similar, so it is possible to design better compression than 
 general-purpose algorithms provide.
 It is an additional step designed to be used in memory. It aims to save 
 memory in the cache as well as to speed up seeks within HFileBlocks. It 
 should improve performance a lot if key lengths are larger than value 
 lengths; for example, it makes a lot of sense to use it when the value is a 
 counter.
 Initial tests on real data (key length ~ 90 bytes, value length = 8 bytes) 
 show that I could achieve a decent level of compression:
  key compression ratio: 92%
  total compression ratio: 85%
  LZO on the same data: 85%
  LZO after delta encoding: 91%
 while offering much better performance (20-80% faster decompression than 
 LZO). Moreover, it should allow far more efficient seeking, which should 
 improve performance a bit.
 It seems that simple compression algorithms are good enough. Most of the 
 savings are due to prefix compression, int128 encoding, timestamp diffs and 
 bitfields to avoid duplication. That way, comparisons of compressed data can 
 be much faster than a byte comparator (thanks to prefix compression and 
 bitfields).
 In order to implement it in HBase, two important changes in design will be 
 needed:
 - solidify the interface to HFileBlock / HFileReader Scanner to provide 
 seeking and iterating; access to the uncompressed buffer in HFileBlock will 
 have bad performance
 - extend comparators to support comparison assuming that the first N bytes 
 are equal (or some fields are equal)
 Link to a discussion about something similar:
 http://search-hadoop.com/m/5aqGXJEnaD1/hbase+windowssubj=Re+prefix+compression

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-4220) Lots of DNS queries from client

2011-08-17 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13086654#comment-13086654
 ] 

Todd Lipcon commented on HBASE-4220:


I think the constructor of HRegionLocation should create and cache the 
HServerAddress.
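A minimal sketch of what that constructor-time caching might look like. The names are hypothetical, and `createUnresolved` is used here only to keep the sketch network-free; the point is that the address object is built once rather than on every lookup:

```java
import java.net.InetSocketAddress;

// Hypothetical sketch of caching the server address at construction time,
// so repeated calls to getServerAddress() do not each trigger a DNS lookup.
class HRegionLocationSketch {
  private final String hostname;
  private final int port;
  private final InetSocketAddress cachedAddress; // built once, in the constructor

  HRegionLocationSketch(String hostname, int port) {
    this.hostname = hostname;
    this.port = port;
    // createUnresolved keeps this sketch free of real DNS traffic; the
    // actual fix would resolve eagerly here, once per location.
    this.cachedAddress = InetSocketAddress.createUnresolved(hostname, port);
  }

  InetSocketAddress getServerAddress() {
    return cachedAddress; // no per-call lookup
  }
}
```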

 Lots of DNS queries from client
 ---

 Key: HBASE-4220
 URL: https://issues.apache.org/jira/browse/HBASE-4220
 Project: HBase
  Issue Type: Bug
  Components: client
Affects Versions: 0.92.0
Reporter: Todd Lipcon
Priority: Critical
 Fix For: 0.92.0


 In running a YCSB workload, I managed to DDoS a DNS server, since the client 
 seems to flood it with DNS requests. Installing nscd on the client machines 
 improved throughput by a factor of 6 and stopped killing the server. These 
 are long-running clients, so it's not clear why we do so many lookups.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Created] (HBASE-4221) Changes necessary to build and run against Hadoop 0.23

2011-08-17 Thread Todd Lipcon (JIRA)
Changes necessary to build and run against Hadoop 0.23
--

 Key: HBASE-4221
 URL: https://issues.apache.org/jira/browse/HBASE-4221
 Project: HBase
  Issue Type: Improvement
Affects Versions: 0.92.0
Reporter: Todd Lipcon
Assignee: Todd Lipcon
 Fix For: 0.92.0


A few modifications necessary to run against today's trunk:
- copy-paste VersionedProtocol into the hbase IPC package
- upgrade protobufs to 2.4.0a
- fix one of the tests in TestHFileOutputFormat for new TaskAttemptContext API
- remove illegal accesses to private members of FSNamesystem in tests (use 
reflection)

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-3917) Separate the Avro schema definition file from the code

2011-08-17 Thread Alex Newman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-3917?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Newman updated HBASE-3917:
---

Attachment: 0001-HBASE-3917.-Separate-the-Avro-schema-definition-file.patch

 Separate the Avro schema definition file from the code
 --

 Key: HBASE-3917
 URL: https://issues.apache.org/jira/browse/HBASE-3917
 Project: HBase
  Issue Type: Improvement
  Components: avro
Affects Versions: 0.90.3
Reporter: Lars George
Priority: Trivial
  Labels: noob
 Fix For: 0.90.5

 Attachments: 
 0001-HBASE-3917.-Separate-the-Avro-schema-definition-file.patch


 The Avro schema files are in the src/main/java path, but should be in 
 /src/main/resources, just like Hbase.thrift. This makes the separation 
 consistent and cleaner.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Assigned] (HBASE-3917) Separate the Avro schema definition file from the code

2011-08-17 Thread Alex Newman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-3917?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Newman reassigned HBASE-3917:
--

Assignee: Alex Newman

 Separate the Avro schema definition file from the code
 --

 Key: HBASE-3917
 URL: https://issues.apache.org/jira/browse/HBASE-3917
 Project: HBase
  Issue Type: Improvement
  Components: avro
Affects Versions: 0.90.3
Reporter: Lars George
Assignee: Alex Newman
Priority: Trivial
  Labels: noob
 Fix For: 0.90.5

 Attachments: 
 0001-HBASE-3917.-Separate-the-Avro-schema-definition-file.patch


 The Avro schema files are in the src/main/java path, but should be in 
 /src/main/resources, just like Hbase.thrift. This makes the separation 
 consistent and cleaner.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-3917) Separate the Avro schema definition file from the code

2011-08-17 Thread Alex Newman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-3917?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Newman updated HBASE-3917:
---

Status: Patch Available  (was: Open)

 Separate the Avro schema definition file from the code
 --

 Key: HBASE-3917
 URL: https://issues.apache.org/jira/browse/HBASE-3917
 Project: HBase
  Issue Type: Improvement
  Components: avro
Affects Versions: 0.90.3
Reporter: Lars George
Priority: Trivial
  Labels: noob
 Fix For: 0.90.5

 Attachments: 
 0001-HBASE-3917.-Separate-the-Avro-schema-definition-file.patch


 The Avro schema files are in the src/main/java path, but should be in 
 /src/main/resources, just like Hbase.thrift. This makes the separation 
 consistent and cleaner.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-2750) Add sanity check for system configs in hbase-daemon wrapper

2011-08-17 Thread Alex Newman (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-2750?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13086687#comment-13086687
 ] 

Alex Newman commented on HBASE-2750:


I assume it should only prevent the regionserver/master daemons from starting?

 Add sanity check for system configs in hbase-daemon wrapper
 ---

 Key: HBASE-2750
 URL: https://issues.apache.org/jira/browse/HBASE-2750
 Project: HBase
  Issue Type: New Feature
  Components: scripts
Affects Versions: 0.90.0
Reporter: Todd Lipcon
Priority: Minor
  Labels: noob

 We should add a config variable like MIN_ULIMIT_TO_START in hbase-env.sh. If 
 the daemon script finds ulimit < this value, it will print a warning and 
 refuse to start. We can set the default to 0 so that this doesn't affect 
 non-production clusters, but in the tuning guide recommend that people change 
 it to the expected ulimit.
 (I've seen it happen all the time where people configure ulimit on some 
 nodes, add a new node to the cluster, forget to re-tune it on the new one, 
 and then that new one borks the whole cluster when it joins.)
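 A sketch of how such a check might look in the daemon wrapper. The function
 name, arguments, and message below are illustrative, not from any actual
 HBase script:

{noformat}
check_ulimit() {
  cur="$1"   # current `ulimit -n` value (may be "unlimited")
  min="$2"   # MIN_ULIMIT_TO_START from hbase-env.sh (0 disables the check)
  if [ "$min" -gt 0 ] && [ "$cur" != "unlimited" ] && [ "$cur" -lt "$min" ]; then
    echo "WARNING: open file limit $cur is below MIN_ULIMIT_TO_START=$min; refusing to start." >&2
    return 1
  fi
  return 0
}

# In hbase-daemon.sh one might then do:
#   check_ulimit "$(ulimit -n)" "${MIN_ULIMIT_TO_START:-0}" || exit 1
{noformat}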

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-4221) Changes necessary to build and run against Hadoop 0.23

2011-08-17 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13086703#comment-13086703
 ] 

Andrew Purtell commented on HBASE-4221:
---

bq. This patch doesn't include the protobuf update - talking with Andrew and 
Gary about updating our protobufs to 2.4.0a to match Hadoop's.

We can just up the protobuf dep for REST to the latest and regenerate. This 
use is quite self-contained.

 Changes necessary to build and run against Hadoop 0.23
 --

 Key: HBASE-4221
 URL: https://issues.apache.org/jira/browse/HBASE-4221
 Project: HBase
  Issue Type: Improvement
Affects Versions: 0.92.0
Reporter: Todd Lipcon
Assignee: Todd Lipcon
 Fix For: 0.92.0

 Attachments: hbase-4221.txt


 A few modifications necessary to run against today's trunk:
 - copy-paste VersionedProtocol into the hbase IPC package
 - upgrade protobufs to 2.4.0a
 - fix one of the tests in TestHFileOutputFormat for new TaskAttemptContext API
 - remove illegal accesses to private members of FSNamesystem in tests (use 
 reflection)

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-4218) Delta Encoding of KeyValues (aka prefix compression)

2011-08-17 Thread Matt Corgan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13086704#comment-13086704
 ] 

Matt Corgan commented on HBASE-4218:


I should be able to work with ByteBuffer as the backing block data.

Like you said above, we'll have to work on smarter iterators and comparators 
that can do most things without instantiating a full KeyValue in its current 
form.  Sounds like it will be a longer-term project to make KeyValue into a 
more flexible interface, so in the meantime there will be places where it has 
to cut a full KeyValue by copying bytes.

 Delta Encoding of KeyValues  (aka prefix compression)
 -

 Key: HBASE-4218
 URL: https://issues.apache.org/jira/browse/HBASE-4218
 Project: HBase
  Issue Type: Improvement
  Components: io
Reporter: Jacek Migdal
  Labels: compression

 A compression scheme for keys. Keys are sorted in an HFile and are usually 
 very similar, so it is possible to design better compression than 
 general-purpose algorithms provide.
 It is an additional step designed to be used in memory. It aims to save 
 memory in the cache as well as to speed up seeks within HFileBlocks. It 
 should improve performance a lot if key lengths are larger than value 
 lengths; for example, it makes a lot of sense to use it when the value is a 
 counter.
 Initial tests on real data (key length ~ 90 bytes, value length = 8 bytes) 
 show that I could achieve a decent level of compression:
  key compression ratio: 92%
  total compression ratio: 85%
  LZO on the same data: 85%
  LZO after delta encoding: 91%
 while offering much better performance (20-80% faster decompression than 
 LZO). Moreover, it should allow far more efficient seeking, which should 
 improve performance a bit.
 It seems that simple compression algorithms are good enough. Most of the 
 savings are due to prefix compression, int128 encoding, timestamp diffs and 
 bitfields to avoid duplication. That way, comparisons of compressed data can 
 be much faster than a byte comparator (thanks to prefix compression and 
 bitfields).
 In order to implement it in HBase, two important changes in design will be 
 needed:
 - solidify the interface to HFileBlock / HFileReader Scanner to provide 
 seeking and iterating; access to the uncompressed buffer in HFileBlock will 
 have bad performance
 - extend comparators to support comparison assuming that the first N bytes 
 are equal (or some fields are equal)
 Link to a discussion about something similar:
 http://search-hadoop.com/m/5aqGXJEnaD1/hbase+windowssubj=Re+prefix+compression

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-4176) Exposing HBase Filters to the Thrift API

2011-08-17 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13086707#comment-13086707
 ] 

stack commented on HBASE-4176:
--

Can you add the patch as a diff against src/docbk/book.xml?  Seems like there 
are a bunch of changes outside of the scope of your filter addition (and your 
book.html is missing stuff added recently).  Thanks Anirudh.

 Exposing HBase Filters to the Thrift API
 

 Key: HBASE-4176
 URL: https://issues.apache.org/jira/browse/HBASE-4176
 Project: HBase
  Issue Type: Improvement
  Components: thrift
Reporter: Anirudh Todi
Assignee: Anirudh Todi
Priority: Minor
 Attachments: Filter Language (3).xml, Filter Language(2).docx, Filter 
 Language(2).xml, Filter Language(3).docx, Filter Language.docx, 
 HBASE-4176.patch, book.xml, book2.html, book2.xml


 Currently, to use any of the filters, one has to explicitly add a scanner for 
 the filter in the Thrift API making it messy and long. With this patch, I am 
 trying to add support for all the filters in a clean way. The user specifies 
 a filter via a string. The string is parsed on the server to construct the 
 filter. More information can be found in the attached document named Filter 
 Language
 This patch is trying to extend and further the progress made by the patches 
 in the HBASE-1744 JIRA (https://issues.apache.org/jira/browse/HBASE-1744)

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-4218) Delta Encoding of KeyValues (aka prefix compression)

2011-08-17 Thread Jonathan Gray (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13086708#comment-13086708
 ] 

Jonathan Gray commented on HBASE-4218:
--

bq. in the mean time there will be places it has to cut a full KeyValue by 
copying bytes
Agreed.  There's some other work going on around slab allocators and object 
reuse that could be paired with this to ameliorate some of that overhead.

 Delta Encoding of KeyValues  (aka prefix compression)
 -

 Key: HBASE-4218
 URL: https://issues.apache.org/jira/browse/HBASE-4218
 Project: HBase
  Issue Type: Improvement
  Components: io
Reporter: Jacek Migdal
  Labels: compression

 A compression scheme for keys. Keys are sorted in an HFile and are usually 
 very similar, so it is possible to design better compression than 
 general-purpose algorithms provide.
 It is an additional step designed to be used in memory. It aims to save 
 memory in the cache as well as to speed up seeks within HFileBlocks. It 
 should improve performance a lot if key lengths are larger than value 
 lengths; for example, it makes a lot of sense to use it when the value is a 
 counter.
 Initial tests on real data (key length ~ 90 bytes, value length = 8 bytes) 
 show that I could achieve a decent level of compression:
  key compression ratio: 92%
  total compression ratio: 85%
  LZO on the same data: 85%
  LZO after delta encoding: 91%
 while offering much better performance (20-80% faster decompression than 
 LZO). Moreover, it should allow far more efficient seeking, which should 
 improve performance a bit.
 It seems that simple compression algorithms are good enough. Most of the 
 savings are due to prefix compression, int128 encoding, timestamp diffs and 
 bitfields to avoid duplication. That way, comparisons of compressed data can 
 be much faster than a byte comparator (thanks to prefix compression and 
 bitfields).
 In order to implement it in HBase, two important changes in design will be 
 needed:
 - solidify the interface to HFileBlock / HFileReader Scanner to provide 
 seeking and iterating; access to the uncompressed buffer in HFileBlock will 
 have bad performance
 - extend comparators to support comparison assuming that the first N bytes 
 are equal (or some fields are equal)
 Link to a discussion about something similar:
 http://search-hadoop.com/m/5aqGXJEnaD1/hbase+windowssubj=Re+prefix+compression

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Created] (HBASE-4222) Make HLog more resilient to write pipeline failures

2011-08-17 Thread Gary Helmling (JIRA)
Make HLog more resilient to write pipeline failures
---

 Key: HBASE-4222
 URL: https://issues.apache.org/jira/browse/HBASE-4222
 Project: HBase
  Issue Type: Improvement
  Components: wal
Reporter: Gary Helmling
 Fix For: 0.92.0


The current implementation of HLog rolling to recover from transient errors in 
the write pipeline seems to have two problems:

# When {{HLog.LogSyncer}} triggers an {{IOException}} during time-based sync 
operations, it triggers a log rolling request in the corresponding catch block, 
but only after escaping from the internal while loop.  As a result, the 
{{LogSyncer}} thread will exit and never be restarted from what I can tell, 
even if the log rolling was successful.
# Log rolling requests triggered by an {{IOException}} in {{sync()}} or 
{{append()}} never happen if no entries have yet been written to the log.  This 
means that write errors are not immediately recovered, which extends the 
exposure to more errors occurring in the pipeline.

In addition, it seems like we should be able to better handle transient 
problems, like a rolling restart of DataNodes while the HBase RegionServers are 
running.  Currently this will reliably cause RegionServer aborts during log 
rolling: either an append or time-based sync triggers an initial 
{{IOException}}, initiating a log rolling request.  However, the log rolling 
then fails in closing the current writer ("All datanodes are bad"), causing a 
RegionServer abort.  In this case, it seems like we should at least allow an 
option to continue with the new writer and only abort on subsequent errors.
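The fix implied by point 1 can be sketched in a few lines. This is a hedged illustration with my own names (not the patch): the essential change is that the IOException is caught inside the sync loop body, so a log roll is requested without the LogSyncer thread dying.

```java
import java.io.IOException;

// Stand-in for the WAL's sync call.
interface WalSync {
  void sync() throws IOException;
}

// Hypothetical sketch of a more resilient time-based syncer.
class ResilientSyncerSketch {
  int rollRequests = 0;     // stand-in for HLog.requestLogRoll()
  boolean threadAlive = true;

  // One iteration of the time-based sync loop. Because the catch block is
  // inside the loop body, a failed sync requests a roll but leaves the
  // loop (and thus the thread) running.
  void syncOnce(WalSync log) {
    try {
      log.sync();
    } catch (IOException e) {
      rollRequests++;       // request a roll; do NOT exit the loop
    }
  }
}
```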

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-3845) data loss because lastSeqWritten can miss memstore edits

2011-08-17 Thread gaojinchao (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13086728#comment-13086728
 ] 

gaojinchao commented on HBASE-3845:
---

Hi, has the patch not been applied to the branch yet?

 data loss because lastSeqWritten can miss memstore edits
 

 Key: HBASE-3845
 URL: https://issues.apache.org/jira/browse/HBASE-3845
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.90.3
Reporter: Prakash Khemani
Assignee: ramkrishna.s.vasudevan
Priority: Critical
 Fix For: 0.90.5

 Attachments: 
 0001-HBASE-3845-data-loss-because-lastSeqWritten-can-miss.patch, 
 HBASE-3845-fix-TestResettingCounters-test.txt, HBASE-3845_1.patch, 
 HBASE-3845_2.patch, HBASE-3845_4.patch, HBASE-3845_5.patch, 
 HBASE-3845_6.patch, HBASE-3845__trunk.patch, HBASE-3845_trunk_2.patch, 
 HBASE-3845_trunk_3.patch


 (I don't have a test case to prove this yet but I have run it by Dhruba and 
 Kannan internally and wanted to put this up for some feedback.)
 In this discussion let us assume that the region has only one column family. 
 That way I can use region/memstore interchangeably.
 After a memstore flush it is possible for lastSeqWritten to have a 
 log-sequence-id for a region that is not the earliest log-sequence-id for 
 that region's memstore.
 HLog.append() does a putIfAbsent into lastSeqWritten. This is to ensure 
 that we only keep track of the earliest log-sequence-number that is present 
 in the memstore.
 Every time the memstore is flushed we remove the region's entry in 
 lastSequenceWritten and wait for the next append to populate this entry 
 again. This is where the problem happens.
 step 1:
 flusher.prepare() snapshots the memstore under 
 HRegion.updatesLock.writeLock().
 step 2 :
 as soon as the updatesLock.writeLock() is released new entries will be added 
 into the memstore.
 step 3 :
 wal.completeCacheFlush() is called. This method removes the region's entry 
 from lastSeqWritten.
 step 4:
 the next append will create a new entry for the region in lastSeqWritten(). 
 But this will be the log seq id of the current append. All the edits that 
 were added in step 2 are missing.
 ==
 as a temporary measure, instead of removing the region's entry in step 3 I 
 will replace it with the log-seq-id of the region-flush-event.
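 The temporary measure in the last paragraph can be sketched with a plain map
 standing in for HLog's internal structure (names simplified; the real map is
 keyed by region name bytes):

{noformat}
import java.util.concurrent.ConcurrentHashMap;

class LastSeqWrittenSketch {
  final ConcurrentHashMap<String, Long> lastSeqWritten = new ConcurrentHashMap<>();

  // HLog.append(): putIfAbsent keeps the earliest seq id in the memstore.
  void append(String region, long seqId) {
    lastSeqWritten.putIfAbsent(region, seqId);
  }

  // Buggy behavior from step 3: the entry is simply removed, losing track
  // of edits added after the snapshot (step 2).
  void completeCacheFlushBuggy(String region) {
    lastSeqWritten.remove(region);
  }

  // Proposed temporary measure: keep a floor at the flush event's seq id.
  void completeCacheFlushFixed(String region, long flushSeqId) {
    lastSeqWritten.put(region, flushSeqId);
  }
}
{noformat}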

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-4213) Support instant schema updates with out master's intervention (i.e with out enable/disable and bulk assign/unassign)

2011-08-17 Thread Subbu M Iyer (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13086745#comment-13086745
 ] 

Subbu M Iyer commented on HBASE-4213:
-

Ted,

Yes. DELETE_FAMILY also needs to be added. Will take care of that.

 You meant recursively deleting the children of node /hbase/schema/<table 
 name>, right?

Yes.

 For a region server which joins the cluster after the creation of 
 /hbase/schema/<table name>, it should be able to find out that it has 
 already read the most recent HTD. Does it create a child node under 
 /hbase/schema/<table name>?

The newly joined RS will not create a child under /hbase/schema/<table name>, 
as it will not be processing the schema change event. 
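The znode bookkeeping described above might be sketched like this, with a plain map standing in for ZooKeeper (paths and method names are illustrative, not from the patch):

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: region servers that process a schema change event
// create a child under /hbase/schema/<table name>; a late-joining RS that
// already read the newest HTD creates no child; the master cleans up.
class SchemaChangeTrackerSketch {
  final Map<String, Set<String>> znodes = new HashMap<>();  // path -> children

  void startSchemaChange(String table) {
    znodes.put("/hbase/schema/" + table, new HashSet<>());
  }

  // Called only by region servers actually processing the event.
  void ackSchemaChange(String table, String regionServer) {
    znodes.get("/hbase/schema/" + table).add(regionServer);
  }

  // Master-side cleanup: recursively delete children, then the node itself.
  void finishSchemaChange(String table) {
    znodes.remove("/hbase/schema/" + table);
  }
}
```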




 Support instant schema updates with out master's intervention (i.e with out 
 enable/disable and bulk assign/unassign)
 

 Key: HBASE-4213
 URL: https://issues.apache.org/jira/browse/HBASE-4213
 Project: HBase
  Issue Type: Improvement
Reporter: Subbu M Iyer
Assignee: Subbu M Iyer
 Fix For: 0.92.0

 Attachments: HBASE-4213-Instant_schema_change.patch


 This Jira is a slight variation in approach to what is being done as part of 
 https://issues.apache.org/jira/browse/HBASE-1730
 Support instant schema updates such as Modify Table, Add Column, Modify 
 Column operations:
 1. Without enabling/disabling the table.
 2. Without bulk unassigning/assigning of regions.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-4213) Support instant schema updates with out master's intervention (i.e with out enable/disable and bulk assign/unassign)

2011-08-17 Thread Subbu M Iyer (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13086751#comment-13086751
 ] 

Subbu M Iyer commented on HBASE-4213:
-

Regarding: 

  public boolean refreshSchema(byte[] tableName) throws IOException {

I agree that it could return void instead of a boolean. Will change that as well.



 Support instant schema updates with out master's intervention (i.e with out 
 enable/disable and bulk assign/unassign)
 

 Key: HBASE-4213
 URL: https://issues.apache.org/jira/browse/HBASE-4213
 Project: HBase
  Issue Type: Improvement
Reporter: Subbu M Iyer
Assignee: Subbu M Iyer
 Fix For: 0.92.0

 Attachments: HBASE-4213-Instant_schema_change.patch


 This Jira is a slight variation in approach to what is being done as part of 
 https://issues.apache.org/jira/browse/HBASE-1730
 Support instant schema updates such as Modify Table, Add Column, Modify 
 Column operations:
 1. Without enabling/disabling the table.
 2. Without bulk unassigning/assigning of regions.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-4223) Support the ability to return a set of rows using Coprocessors

2011-08-17 Thread Nichole Treadway (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4223?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nichole Treadway updated HBASE-4223:


Affects Version/s: 0.92.0
   Status: Patch Available  (was: Open)

 Support the ability to return a set of rows using Coprocessors
 --

 Key: HBASE-4223
 URL: https://issues.apache.org/jira/browse/HBASE-4223
 Project: HBase
  Issue Type: Improvement
  Components: coprocessors
Affects Versions: 0.92.0
Reporter: Nichole Treadway
Priority: Minor

 Currently HBase supports returning the results of aggregation operations 
 using coprocessors with the AggregationClient. It would be useful to include 
 a client and implementation which would return a set of rows which match a 
 certain criteria using coprocessors as well. We have a use case in our 
 business process for this. 
 We have an initial implementation of this, which I've attached. The only 
 limitation that we've found is that it cannot be used to return very large 
 sets of rows. If the result set is very large, it would probably require some 
 sort of pagination.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-4223) Support the ability to return a set of rows using Coprocessors

2011-08-17 Thread Nichole Treadway (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4223?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nichole Treadway updated HBASE-4223:


Attachment: HBASE-4223.patch

 Support the ability to return a set of rows using Coprocessors
 --

 Key: HBASE-4223
 URL: https://issues.apache.org/jira/browse/HBASE-4223
 Project: HBase
  Issue Type: Improvement
  Components: coprocessors
Affects Versions: 0.92.0
Reporter: Nichole Treadway
Priority: Minor
 Attachments: HBASE-4223.patch


 Currently HBase supports returning the results of aggregation operations 
 using coprocessors with the AggregationClient. It would be useful to include 
 a client and implementation which would return a set of rows which match a 
 certain criteria using coprocessors as well. We have a use case in our 
 business process for this. 
 We have an initial implementation of this, which I've attached. The only 
 limitation that we've found is that it cannot be used to return very large 
 sets of rows. If the result set is very large, it would probably require some 
 sort of pagination.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-4071) Data GC: Remove all versions > TTL EXCEPT the last written version

2011-08-17 Thread jirapos...@reviews.apache.org (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13086799#comment-13086799
 ] 

jirapos...@reviews.apache.org commented on HBASE-4071:
--


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/1582/
---

Review request for Ian Varley.


Summary
---

Min versions coupled with TTL.

Note that I unified the GC logic inside the ColumnTrackers. Previously they 
did the versioning, and TTL was handled outside.

What is still open is what to do with Store.getKeyAtOrBefore(...).


This addresses bug HBASE-4071.
https://issues.apache.org/jira/browse/HBASE-4071


Diffs
-

  
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestMinVersions.java
 PRE-CREATION 
  
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestScanWildcardColumnTracker.java
 1158860 
  
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanWildcardColumnTracker.java
 1158860 
  
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java
 1158860 
  
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java
 1158860 
  http://svn.apache.org/repos/asf/hbase/trunk/src/main/ruby/hbase/admin.rb 
1158860 
  
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/HBaseTestCase.java
 1158860 
  
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestExplicitColumnTracker.java
 1158860 
  
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java
 1158860 
  
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ColumnTracker.java
 1158860 
  
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ExplicitColumnTracker.java
 1158860 
  
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java
 1158860 

Diff: https://reviews.apache.org/r/1582/diff


Testing
---

See TestMinVersions.


Thanks,

Lars



 Data GC: Remove all versions > TTL EXCEPT the last written version
 --

 Key: HBASE-4071
 URL: https://issues.apache.org/jira/browse/HBASE-4071
 Project: HBase
  Issue Type: New Feature
Reporter: stack
 Attachments: MinVersions.diff


 We were chatting today about our backup cluster.  What we want is to be able 
 to restore the dataset from any point in time, but only within a limited 
 timeframe -- say one week.  Thereafter, if the versions are older than one 
 week, rather than letting go of all versions older than TTL as we do today, 
 instead let go of all versions EXCEPT the last one written.  So, it's like 
 versions==1 when TTL > one week.  We want to allow that if an error is 
 caught within a week of its happening -- a user mistakenly removes a 
 critical table -- then we'll be able to restore up to the moment just before 
 catastrophe hit; otherwise, we keep one version only.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-4071) Data GC: Remove all versions > TTL EXCEPT the last written version

2011-08-17 Thread jirapos...@reviews.apache.org (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13086804#comment-13086804
 ] 

jirapos...@reviews.apache.org commented on HBASE-4071:
--


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/1582/
---

(Updated 2011-08-18 05:01:59.054304)


Review request for Ian Varley.


Summary
---

A minimum number of versions, coupled with TTL.

Note that I unified the GC logic inside the ColumnTrackers. Previously they 
handled versioning, and TTL was handled outside.

What is still open is what to do with Store.getKeyAtOrBefore(...).
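The min-versions rule discussed above can be sketched as a simple keep/discard predicate. This is a hypothetical illustration only; the names (`MinVersionsGc`, `keep`) are made up for this sketch and are not HBase's actual ColumnTracker API.

```java
// Hypothetical sketch of the min-versions GC rule described above: a version
// is retained if it is still inside the TTL window, OR if it is one of the
// newest minVersions versions of the cell. Illustrative names only; this is
// not HBase's actual ColumnTracker API.
public class MinVersionsGc {

    /**
     * @param versionIndex 0-based index of this version, newest first
     * @param timestamp    write time of this version, in ms
     * @param now          current time, in ms
     * @param ttlMs        time-to-live, in ms
     * @param minVersions  number of versions to keep even when expired
     * @return true if the version should be kept, false if it may be GC'd
     */
    public static boolean keep(int versionIndex, long timestamp,
                               long now, long ttlMs, int minVersions) {
        boolean insideTtl = (now - timestamp) <= ttlMs;            // TTL rule
        boolean neededForMinVersions = versionIndex < minVersions; // min-versions rule
        return insideTtl || neededForMinVersions;
    }
}
```

With minVersions=1 this reproduces the "versions==1 when TTL exceeded" behavior asked for in HBASE-4071: the newest version survives even after its TTL expires.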


Diffs
-

  
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestMinVersions.java
 PRE-CREATION 
  
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestScanWildcardColumnTracker.java
 1158860 
  
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanWildcardColumnTracker.java
 1158860 
  
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java
 1158860 
  
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java
 1158860 
  http://svn.apache.org/repos/asf/hbase/trunk/src/main/ruby/hbase/admin.rb 
1158860 
  
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/HBaseTestCase.java
 1158860 
  
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestExplicitColumnTracker.java
 1158860 
  
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java
 1158860 
  
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ColumnTracker.java
 1158860 
  
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ExplicitColumnTracker.java
 1158860 
  
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java
 1158860 

Diff: https://reviews.apache.org/r/1582/diff


Testing
---

See TestMinVersions.


Thanks,

Lars







[jira] [Commented] (HBASE-3845) data loss because lastSeqWritten can miss memstore edits

2011-08-17 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13086805#comment-13086805
 ] 

ramkrishna.s.vasudevan commented on HBASE-3845:
---

Yes Gao. The fix has not gone into the 0.90.x versions; it is available in 
trunk only.

 data loss because lastSeqWritten can miss memstore edits
 

 Key: HBASE-3845
 URL: https://issues.apache.org/jira/browse/HBASE-3845
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.90.3
Reporter: Prakash Khemani
Assignee: ramkrishna.s.vasudevan
Priority: Critical
 Fix For: 0.90.5

 Attachments: 
 0001-HBASE-3845-data-loss-because-lastSeqWritten-can-miss.patch, 
 HBASE-3845-fix-TestResettingCounters-test.txt, HBASE-3845_1.patch, 
 HBASE-3845_2.patch, HBASE-3845_4.patch, HBASE-3845_5.patch, 
 HBASE-3845_6.patch, HBASE-3845__trunk.patch, HBASE-3845_trunk_2.patch, 
 HBASE-3845_trunk_3.patch


 (I don't have a test case to prove this yet but I have run it by Dhruba and 
 Kannan internally and wanted to put this up for some feedback.)
 In this discussion let us assume that the region has only one column family. 
 That way I can use region/memstore interchangeably.
 After a memstore flush it is possible for lastSeqWritten to have a 
 log-sequence-id for a region that is not the earliest log-sequence-id for 
 that region's memstore.
 HLog.append() does a putIfAbsent into lastSeqWritten. This is to ensure 
 that we only keep track of the earliest log-sequence-number that is present 
 in the memstore.
 Every time the memstore is flushed we remove the region's entry from 
 lastSeqWritten and wait for the next append to populate this entry 
 again. This is where the problem happens.
 step 1:
 flusher.prepare() snapshots the memstore under 
 HRegion.updatesLock.writeLock().
 step 2 :
 as soon as the updatesLock.writeLock() is released new entries will be added 
 into the memstore.
 step 3 :
 wal.completeCacheFlush() is called. This method removes the region's entry 
 from lastSeqWritten.
 step 4:
 the next append will create a new entry for the region in lastSeqWritten(). 
 But this will be the log seq id of the current append. All the edits that 
 were added in step 2 are missing.
 ==
 as a temporary measure, instead of removing the region's entry in step 3 I 
 will replace it with the log-seq-id of the region-flush-event.
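 The four steps above, and the proposed temporary fix, can be modeled with a 
 toy map. This is an illustrative sketch only; the real logic lives in HLog, 
 and the class and method names below are made up.

```java
import java.util.concurrent.ConcurrentHashMap;

// Toy model of the race described in steps 1-4 and of the proposed temporary
// fix. Illustrative names only; the real logic lives in HLog, not here.
public class LastSeqWrittenSketch {

    private final ConcurrentHashMap<String, Long> lastSeqWritten =
        new ConcurrentHashMap<String, Long>();

    // append() records only the EARLIEST seq id still present in the memstore.
    public void append(String region, long seqId) {
        lastSeqWritten.putIfAbsent(region, seqId);
    }

    // Buggy flush completion: removing the entry opens a window in which
    // edits appended after the snapshot (step 2) are forgotten (step 4).
    public void completeCacheFlushBuggy(String region) {
        lastSeqWritten.remove(region);
    }

    // Proposed fix: replace the entry with the flush event's seq id, so the
    // region always keeps a conservative lower bound instead of no entry.
    public void completeCacheFlushFixed(String region, long flushSeqId) {
        lastSeqWritten.put(region, flushSeqId);
    }

    public Long earliestSeqId(String region) {
        return lastSeqWritten.get(region);
    }
}
```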

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-4071) Data GC: Remove all versions > TTL EXCEPT the last written version

2011-08-17 Thread jirapos...@reviews.apache.org (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13086808#comment-13086808
 ] 

jirapos...@reviews.apache.org commented on HBASE-4071:
--


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/1582/
---

(Updated 2011-08-18 05:07:35.742598)


Review request for Ian Varley.


Summary
---

A minimum number of versions, coupled with TTL.

Note that I unified the GC logic inside the ColumnTrackers. Previously they 
handled versioning, and TTL was handled outside.

What is still open is what to do with Store.getKeyAtOrBefore(...).


Diffs
-

  
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestMinVersions.java
 PRE-CREATION 
  
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestScanWildcardColumnTracker.java
 1158860 
  
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanWildcardColumnTracker.java
 1158860 
  
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java
 1158860 
  
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java
 1158860 
  http://svn.apache.org/repos/asf/hbase/trunk/src/main/ruby/hbase/admin.rb 
1158860 
  
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/HBaseTestCase.java
 1158860 
  
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestExplicitColumnTracker.java
 1158860 
  
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java
 1158860 
  
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ColumnTracker.java
 1158860 
  
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ExplicitColumnTracker.java
 1158860 
  
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java
 1158860 

Diff: https://reviews.apache.org/r/1582/diff


Testing
---

See TestMinVersions.


Thanks,

Lars







[jira] [Commented] (HBASE-4176) Exposing HBase Filters to the Thrift API

2011-08-17 Thread jirapos...@reviews.apache.org (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13086821#comment-13086821
 ] 

jirapos...@reviews.apache.org commented on HBASE-4176:
--


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/1326/
---

(Updated 2011-08-18 05:35:32.078889)


Review request for hbase, Todd Lipcon, Ted Yu, Michael Stack, and Jonathan Gray.


Changes
---

Included the Filter Language document in the docbook. Please let me know if 
I should add more documentation, or if there are any formatting issues I can try to address.

P.S. Is trunk broken right now? I get an error saying - BaseMasterObserver is 
not abstract and does not override abstract method 
stop(org.apache.hadoop.hbase.coprocessor.CoprocessorEnvironment) in 
org.apache.hadoop.hbase.coprocessor.Coprocessor

This is completely unrelated to my diff - I checked that the only 
difference between this diff and the previous one I submitted is the 
change to book.xml. Thus I am going ahead and uploading the patch.


Summary
---

https://issues.apache.org/jira/browse/HBASE-4176: Exposing HBase Filters to the 
Thrift API

Currently, to use any of the filters, one has to explicitly add a scanner for 
the filter in the Thrift API, which makes it messy and verbose. 
With this patch, I am trying to add support for all the filters in a clean way. 
The user specifies a filter via a string. The string is parsed on the server to 
construct the filter. More information can be found in the attached document 
named Filter Language.

This patch tries to extend and further the progress made by the patches in 
HBASE-1744.

There is a document attached to the HBASE-4176 JIRA that describes this patch in 
further detail.
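The server-side idea above can be sketched with a minimal parser for a single filter specification. This is a hypothetical illustration; the real ParseFilter grammar (AND/OR, nesting, comparators) is far richer, and the names here (`FilterStringSketch`, `name`, `args`) are made up.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the idea above: the client sends a filter as a
// string such as "PageFilter(10)", and the server parses it into a filter
// name plus arguments before constructing the filter. The real ParseFilter
// grammar is far richer; these names are made up for illustration.
public class FilterStringSketch {

    // Extract the filter name before the opening parenthesis.
    public static String name(String spec) {
        return spec.substring(0, spec.indexOf('(')).trim();
    }

    // Extract the comma-separated argument list between the parentheses.
    public static List<String> args(String spec) {
        String inner = spec.substring(spec.indexOf('(') + 1, spec.lastIndexOf(')'));
        List<String> out = new ArrayList<String>();
        if (inner.trim().isEmpty()) {
            return out;
        }
        for (String part : inner.split(",")) {
            out.add(part.trim());
        }
        return out;
    }
}
```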


This addresses bug HBASE-4176.
https://issues.apache.org/jira/browse/HBASE-4176


Diffs (updated)
-

  /src/docbkx/book.xml 1158977 
  /src/main/java/org/apache/hadoop/hbase/filter/ColumnCountGetFilter.java 
1158977 
  /src/main/java/org/apache/hadoop/hbase/filter/ColumnPaginationFilter.java 
1158977 
  /src/main/java/org/apache/hadoop/hbase/filter/ColumnPrefixFilter.java 1158977 
  /src/main/java/org/apache/hadoop/hbase/filter/ColumnRangeFilter.java 1158977 
  /src/main/java/org/apache/hadoop/hbase/filter/CompareFilter.java 1158977 
  /src/main/java/org/apache/hadoop/hbase/filter/DependentColumnFilter.java 
1158977 
  /src/main/java/org/apache/hadoop/hbase/filter/FamilyFilter.java 1158977 
  /src/main/java/org/apache/hadoop/hbase/filter/FilterBase.java 1158977 
  /src/main/java/org/apache/hadoop/hbase/filter/FirstKeyOnlyFilter.java 1158977 
  /src/main/java/org/apache/hadoop/hbase/filter/InclusiveStopFilter.java 
1158977 
  /src/main/java/org/apache/hadoop/hbase/filter/KeyOnlyFilter.java 1158977 
  /src/main/java/org/apache/hadoop/hbase/filter/MultipleColumnPrefixFilter.java 
1158977 
  /src/main/java/org/apache/hadoop/hbase/filter/PageFilter.java 1158977 
  /src/main/java/org/apache/hadoop/hbase/filter/ParseConstants.java 1158977 
  /src/main/java/org/apache/hadoop/hbase/filter/ParseFilter.java 1158977 
  /src/main/java/org/apache/hadoop/hbase/filter/PrefixFilter.java 1158977 
  /src/main/java/org/apache/hadoop/hbase/filter/QualifierFilter.java 1158977 
  /src/main/java/org/apache/hadoop/hbase/filter/RowFilter.java 1158977 
  
/src/main/java/org/apache/hadoop/hbase/filter/SingleColumnValueExcludeFilter.java
 1158977 
  /src/main/java/org/apache/hadoop/hbase/filter/SingleColumnValueFilter.java 
1158977 
  /src/main/java/org/apache/hadoop/hbase/filter/TimestampsFilter.java 1158977 
  /src/main/java/org/apache/hadoop/hbase/filter/ValueFilter.java 1158977 
  /src/main/java/org/apache/hadoop/hbase/thrift/ThriftServer.java 1158977 
  /src/main/java/org/apache/hadoop/hbase/thrift/generated/Hbase.java 1158977 
  /src/main/java/org/apache/hadoop/hbase/thrift/generated/TScan.java 1158977 
  /src/main/resources/org/apache/hadoop/hbase/thrift/Hbase.thrift 1158977 
  /src/main/ruby/hbase/table.rb 1158977 
  /src/main/ruby/shell/commands/scan.rb 1158977 
  /src/test/java/org/apache/hadoop/hbase/filter/TestParseFilter.java 1158977 

Diff: https://reviews.apache.org/r/1326/diff


Testing
---

patch includes one test: TestParseFilter.java


Thanks,

Anirudh



 Exposing HBase Filters to the Thrift API
 

 Key: HBASE-4176
 URL: https://issues.apache.org/jira/browse/HBASE-4176
 Project: HBase
  Issue Type: Improvement
  Components: thrift
Reporter: Anirudh Todi
Assignee: Anirudh Todi
Priority: Minor
 Attachments: Filter Language (3).xml, Filter Language(2).docx, Filter 
 Language(2).xml, Filter Language(3).docx, Filter Language.docx, 
 

[jira] [Commented] (HBASE-4202) Check filesystem permissions on startup

2011-08-17 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13086827#comment-13086827
 ] 

ramkrishna.s.vasudevan commented on HBASE-4202:
---

@Stack,
I think 0.90.x behaves correctly. I checked the behaviour.  Below are the logs
{noformat}
2011-08-18 10:57:49,345 INFO org.mortbay.log: Logging to 
org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via org.mortbay.log.Slf4jLog
2011-08-18 10:57:49,475 INFO 
org.apache.hadoop.hbase.regionserver.HRegionServer: STOPPED: Failed 
initialization
2011-08-18 10:57:49,479 ERROR 
org.apache.hadoop.hbase.regionserver.HRegionServer: Failed init
java.lang.IllegalArgumentException: Bad temp directory: 
/tmp/hadoop-test666/Jetty/regionserver
at 
org.mortbay.jetty.webapp.WebAppContext.setTempDirectory(WebAppContext.java:1201)
at org.apache.hadoop.http.HttpServer.<init>(HttpServer.java:128)
at org.apache.hadoop.hbase.util.InfoServer.<init>(InfoServer.java:54)
at 
org.apache.hadoop.hbase.regionserver.HRegionServer.startServiceThreads(HRegionServer.java:1262)
at 
org.apache.hadoop.hbase.regionserver.HRegionServer.handleReportForDutyResponse(HRegionServer.java:880)
at 
org.apache.hadoop.hbase.regionserver.HRegionServer.tryReportForDuty(HRegionServer.java:1481)
at 
org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:571)
at java.lang.Thread.run(Thread.java:619)
2011-08-18 10:57:49,488 FATAL 
org.apache.hadoop.hbase.regionserver.HRegionServer: ABORTING region server 
serverName=linux-kxjl,60020,1313645267001, load=(requests=0, regions=0, 
usedHeap=22, maxHeap=995): Unhandled exception: Region server startup failed
java.io.IOException: Region server startup failed
at 
org.apache.hadoop.hbase.regionserver.HRegionServer.convertThrowableToIOE(HRegionServer.java:987)
at 
org.apache.hadoop.hbase.regionserver.HRegionServer.handleReportForDutyResponse(HRegionServer.java:889)
at 
org.apache.hadoop.hbase.regionserver.HRegionServer.tryReportForDuty(HRegionServer.java:1481)
at 
org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:571)
at java.lang.Thread.run(Thread.java:619)
Caused by: java.lang.IllegalArgumentException: Bad temp directory: 
/tmp/hadoop-test666/Jetty/regionserver
at 
org.mortbay.jetty.webapp.WebAppContext.setTempDirectory(WebAppContext.java:1201)
at org.apache.hadoop.http.HttpServer.init(HttpServer.java:128)
at org.apache.hadoop.hbase.util.InfoServer.init(InfoServer.java:54)
at 
org.apache.hadoop.hbase.regionserver.HRegionServer.startServiceThreads(HRegionServer.java:1262)
at 
org.apache.hadoop.hbase.regionserver.HRegionServer.handleReportForDutyResponse(HRegionServer.java:880)
... 3 more
---
2011-08-18 10:57:49,770 INFO org.apache.hadoop.hbase.regionserver.ShutdownHook: 
Shutdown hook starting; hbase.shutdown.hook=true; 
fsShutdownHook=Thread[Thread-14,5,main]
2011-08-18 10:57:49,771 INFO 
org.apache.hadoop.hbase.regionserver.HRegionServer: STOPPED: Shutdown hook
2011-08-18 10:57:49,771 INFO org.apache.hadoop.hbase.regionserver.ShutdownHook: 
Starting fs shutdown hook thread.
2011-08-18 10:57:49,874 INFO org.apache.hadoop.hbase.regionserver.ShutdownHook: 
Shutdown hook finished.
{noformat}

This defect may not be valid for the 0.90.x versions. 
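The startup check this issue asks for can be sketched as a fail-fast directory test run before the server reports for duty. This is an illustrative sketch only, not the actual HRegionServer startup code; the class and method names are made up.

```java
import java.io.File;
import java.io.IOException;

// Sketch of the startup check this issue asks for: verify that a required
// local directory is usable before the regionserver reports for duty, so it
// fails fast instead of joining the cluster in a broken state. Illustrative
// only; not the actual HRegionServer startup code.
public class StartupDirCheck {

    public static void checkWritable(File dir) throws IOException {
        // Create the directory if missing; fail if creation is impossible.
        if (!dir.exists() && !dir.mkdirs()) {
            throw new IOException("Cannot create directory: " + dir);
        }
        // Refuse to start if the path is not a writable directory.
        if (!dir.isDirectory() || !dir.canWrite()) {
            throw new IOException("Bad temp directory (not writable): " + dir);
        }
    }
}
```

Wiring such a check into initialization would turn the silent assign/reject cycle described below into an immediate abort, matching the fail-fast behavior the logs above show for 0.90.x.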

 Check filesystem permissions on startup
 ---

 Key: HBASE-4202
 URL: https://issues.apache.org/jira/browse/HBASE-4202
 Project: HBase
  Issue Type: Bug
  Components: regionserver
Affects Versions: 0.20.4
 Environment: debian squeeze
Reporter: Matthias Hofschen
Assignee: ramkrishna.s.vasudevan
  Labels: noob

 We added a new node to a 44-node cluster, starting the datanode, mapred and 
 regionserver processes on it. The Unix filesystem was configured incorrectly, 
 i.e. /tmp was not writable by the processes. All three processes had issues 
 with this. The datanode and mapred processes shut down on the exception.
 The regionserver did not stop; in fact it reported to the master that it was 
 up, without regions. So the master assigned regions to it. The regionserver 
 would not accept them, resulting in a constant assign, reject, reassign cycle 
 that left many regions unavailable. There are no logs about this, 
 but we could observe the region count fluctuate by hundreds of regions and 
 the application throwing many NotServingRegion exceptions.  
 In fact, to the master process the regionserver looked fine, so it kept trying 
 to send regions its way. The regionserver rejected them, so the master/balancer 
 went into an assign/reassign cycle, destabilizing the cluster. Many puts 
 and gets simply failed with NotServingRegionExceptions and took a long time 
 to complete.
 Exception from