[jira] [Updated] (HBASE-4247) Add isAborted method to the Abortable interface

2011-09-17 Thread Akash Ashok (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4247?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akash Ashok updated HBASE-4247:
---

Attachment: HBase-4247.patch

 Add isAborted method to the Abortable interface
 ---

 Key: HBASE-4247
 URL: https://issues.apache.org/jira/browse/HBASE-4247
 Project: HBase
  Issue Type: Task
Reporter: Akash Ashok
Assignee: Akash Ashok
Priority: Minor
 Fix For: 0.94.0

 Attachments: HBase-4247.patch


 Add a new method isAborted() to the Abortable interface 
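
 A minimal sketch of the interface shape this issue asks for. The
 abort(String, Throwable) signature and the DemoService class are assumptions
 made for illustration; they are not taken from the attached patch.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Sketch only: an Abortable-style interface with the proposed query method.
interface Abortable {
  /** Abort the implementing service; 'why' and 'e' describe the cause. */
  void abort(String why, Throwable e);

  /** @return true once abort() has been called on this instance. */
  boolean isAborted();
}

// Hypothetical implementation used only to show the contract.
class DemoService implements Abortable {
  private final AtomicBoolean aborted = new AtomicBoolean(false);

  @Override
  public void abort(String why, Throwable e) {
    // A real server would also log the reason and begin shutdown here.
    aborted.set(true);
  }

  @Override
  public boolean isAborted() {
    return aborted.get();
  }
}
```

 With a method like this, callers can ask whether an abort has already happened
 instead of tracking that state themselves.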

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-4247) Add isAborted method to the Abortable interface

2011-09-17 Thread Akash Ashok (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4247?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akash Ashok updated HBASE-4247:
---

Status: Patch Available  (was: In Progress)

 Add isAborted method to the Abortable interface
 ---

 Key: HBASE-4247
 URL: https://issues.apache.org/jira/browse/HBASE-4247
 Project: HBase
  Issue Type: Task
Reporter: Akash Ashok
Assignee: Akash Ashok
Priority: Minor
 Fix For: 0.94.0

 Attachments: HBase-4247.patch


 Add a new method isAborted() to the Abortable interface 





[jira] [Updated] (HBASE-4247) Add isAborted method to the Abortable interface

2011-09-17 Thread Akash Ashok (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4247?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akash Ashok updated HBASE-4247:
---

Attachment: HBase-4247-v2.patch

Adding Javadoc to the isAborted() method in the Abortable interface

 Add isAborted method to the Abortable interface
 ---

 Key: HBASE-4247
 URL: https://issues.apache.org/jira/browse/HBASE-4247
 Project: HBase
  Issue Type: Task
Reporter: Akash Ashok
Assignee: Akash Ashok
Priority: Minor
 Fix For: 0.94.0

 Attachments: HBase-4247-v2.patch, HBase-4247.patch


 Add a new method isAborted() to the Abortable interface 





[jira] [Commented] (HBASE-4388) Second start after migration from 90 to trunk crashes

2011-09-17 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13107099#comment-13107099
 ] 

Ted Yu commented on HBASE-4388:
---

The metaMigrated column is a new column. I wonder whether a hostname could 
ever get into this column.

 Second start after migration from 90 to trunk crashes
 -

 Key: HBASE-4388
 URL: https://issues.apache.org/jira/browse/HBASE-4388
 Project: HBase
  Issue Type: Bug
  Components: master, migration
Affects Versions: 0.92.0
Reporter: Todd Lipcon
Assignee: Ted Yu
Priority: Blocker
 Fix For: 0.92.0

 Attachments: 4388.txt, meta.tgz


 I started a trunk cluster to upgrade from 90, inserted a ton of data, then 
 did a clean shutdown. When I started again, I got the following exception:
 11/09/13 12:29:09 INFO master.HMaster: Meta has HRI with HTDs. Updating meta now.
 11/09/13 12:29:09 FATAL master.HMaster: Unhandled exception. Starting shutdown.
 java.lang.NegativeArraySizeException: -102
     at org.apache.hadoop.hbase.util.Bytes.readByteArray(Bytes.java:147)
     at org.apache.hadoop.hbase.HTableDescriptor.readFields(HTableDescriptor.java:606)
     at org.apache.hadoop.hbase.migration.HRegionInfo090x.readFields(HRegionInfo090x.java:641)
     at org.apache.hadoop.hbase.util.Writables.getWritable(Writables.java:133)
     at org.apache.hadoop.hbase.util.Writables.getWritable(Writables.java:103)
     at org.apache.hadoop.hbase.util.Writables.getHRegionInfoForMigration(Writables.java:228)
     at org.apache.hadoop.hbase.catalog.MetaEditor.getHRegionInfoForMigration(MetaEditor.java:350)
     at org.apache.hadoop.hbase.catalog.MetaEditor$1.visit(MetaEditor.java:273)
     at org.apache.hadoop.hbase.catalog.MetaReader.fullScan(MetaReader.java:633)
     at org.apache.hadoop.hbase.catalog.MetaReader.fullScan(MetaReader.java:255)
     at org.apache.hadoop.hbase.catalog.MetaReader.fullScan(MetaReader.java:235)
     at org.apache.hadoop.hbase.catalog.MetaEditor.updateMetaWithNewRegionInfo(MetaEditor.java:284)
     at org.apache.hadoop.hbase.catalog.MetaEditor.migrateRootAndMeta(MetaEditor.java:298)
     at org.apache.hadoop.hbase.master.HMaster.updateMetaWithNewHRI(HMaster.java:529)
     at org.apache.hadoop.hbase.master.HMaster.finishInitialization(HMaster.java:472)
     at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:309)
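
 The exception above is the kind of failure a length-prefixed reader produces
 when it is pointed at bytes written in a different layout. The sketch below
 illustrates that general mechanism only. LengthPrefixDemo and its one-byte
 length prefix are hypothetical simplifications, not the actual
 Bytes.readByteArray code, and the sketch does not claim to identify the root
 cause of this issue.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

class LengthPrefixDemo {
  // Hypothetical, simplified reader: one signed length byte, then the payload.
  static byte[] readByteArray(DataInputStream in) throws IOException {
    int len = in.readByte();
    byte[] result = new byte[len]; // throws NegativeArraySizeException if len < 0
    in.readFully(result);
    return result;
  }

  // Reads one array and reports what happened instead of propagating the error.
  static String tryRead(byte[] serialized) {
    DataInputStream in = new DataInputStream(new ByteArrayInputStream(serialized));
    try {
      return "read " + readByteArray(in).length + " bytes";
    } catch (NegativeArraySizeException e) {
      return "NegativeArraySizeException for length " + serialized[0];
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }
}
```

 If the first byte handed to such a reader is really part of some other field,
 it can decode to a negative length such as -102, and the array allocation
 fails before any payload is read.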





[jira] [Commented] (HBASE-4400) .META. getting stuck if RS hosting it is dead and znode state is in RS_ZK_REGION_OPENED

2011-09-17 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13107104#comment-13107104
 ] 

Ted Yu commented on HBASE-4400:
---

+1 on patch.
For TestMasterFailover#testShouldCheckMasterFailOverWhenMETAIsInOpenedState, 
the following is never read:
{code}
int activeIndex = -1;
{code}
We don't need the above variable.

Around line 188, I think doing the following mimics the situation for this JIRA:
{code}
  if (null != metaRegion) {
    regionServer.abort();
    break;
  }
{code}
i.e. we only need to abort the region server carrying .META.
I ran with the above change and the test passed.

I like the way you transition .META. znode to RS_ZK_REGION_OPENED. I think it 
would be nice to extract lines 196 to 210 into a separate method so that other 
developers can utilize it later.

I think the new method can be put into HBaseTestingUtility.

Good job, Ramkrishna.

 .META. getting stuck if RS hosting it is dead and znode state is in 
 RS_ZK_REGION_OPENED
 ---

 Key: HBASE-4400
 URL: https://issues.apache.org/jira/browse/HBASE-4400
 Project: HBase
  Issue Type: Bug
Reporter: ramkrishna.s.vasudevan
Assignee: ramkrishna.s.vasudevan
 Fix For: 0.92.0, 0.90.5

 Attachments: HBASE-4400_0.90.patch, HBASE-4400_trunk.patch


 Start 2 RS.
 .META. is hosted by RS2, but RS2 goes down while processing.
 Now restart the master and RS1.  The master gets the RS name from the znode, 
 which is in state RS_ZK_REGION_OPENED.  But because RS2 is still not online, 
 the master is not able to process .META. at all.  The logs follow:
 {noformat}
 2011-09-14 16:43:51,949 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Handling 
 transition=RS_ZK_REGION_OPENING, server=linux76,60020,1315998828523, 
 region=70236052/-ROOT-
 2011-09-14 16:43:51,968 INFO org.apache.hadoop.hbase.master.HMaster: -ROOT- 
 assigned=1, rit=false, location=linux76:60020
 2011-09-14 16:43:51,970 INFO 
 org.apache.hadoop.hbase.master.AssignmentManager: Processing region 
 .META.,,1.1028785192 in state RS_ZK_REGION_OPENED
 2011-09-14 16:43:51,970 INFO 
 org.apache.hadoop.hbase.master.AssignmentManager: Failed to find 
 linux146,60020,1315998414623 in list of online servers; skipping registration 
 of open of .META.,,1.1028785192
 2011-09-14 16:43:51,971 INFO 
 org.apache.hadoop.hbase.master.AssignmentManager: Waiting on 1028785192/.META.
 2011-09-14 16:43:51,983 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Handling 
 transition=RS_ZK_REGION_OPENED, server=linux76,60020,1315998828523, 
 region=70236052/-ROOT-
 2011-09-14 16:43:51,986 DEBUG 
 org.apache.hadoop.hbase.master.handler.OpenedRegionHandler: Handling OPENED 
 event for 70236052; deleting unassigned node
 2011-09-14 16:43:51,986 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: 
 master:6-0x13267854032001d Deleting existing unassigned node for 70236052 
 that is in expected state RS_ZK_REGION_OPENED
 2011-09-14 16:43:51,998 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: 
 master:6-0x13267854032001d Successfully deleted unassigned node for 
 region 70236052 in expected state RS_ZK_REGION_OPENED
 2011-09-14 16:43:51,999 DEBUG 
 org.apache.hadoop.hbase.master.handler.OpenedRegionHandler: Opened region 
 -ROOT-,,0.70236052 on linux76,60020,1315998828523
 2011-09-14 16:44:00,945 INFO org.apache.hadoop.hbase.master.ServerManager: 
 Registering server=linux146,60020,1315998839724, regionCount=0, userLoad=false
 2011-09-14 16:46:20,003 INFO 
 org.apache.hadoop.hbase.master.AssignmentManager: Regions in transition timed 
 out:  .META.,,1.1028785192 state=OPEN, ts=0
 2011-09-14 16:46:20,004 ERROR 
 org.apache.hadoop.hbase.master.AssignmentManager: Region has been OPEN for 
 too long, we don't know where region was opened so can't do anything
 {noformat}
 {code}
 regionsInTransition.put(encodedRegionName, new RegionState(
     regionInfo, RegionState.State.OPEN, data.getStamp()));
   
 } else {
   HServerInfo hsi = this.serverManager.getServerInfo(sn);
   if (hsi == null) {
     LOG.info("Failed to find " + sn +
       " in list of online servers; skipping registration of open of " +
       regionInfo.getRegionNameAsString());
   } else {
     new OpenedRegionHandler(master, this, regionInfo, hsi).process();
   }
 }
 {code}
 So the timeout monitor is not able to do anything here:
 {code}
   LOG.error("Region has been OPEN for too long, " +
     "we don't know where region was opened so can't do anything");
   synchronized(regionState) {
     regionState.update(regionState.getState());
   }
 {code}


[jira] [Created] (HBASE-4428) Two methods in CacheTestUtils don't call setDaemon() on the threads

2011-09-17 Thread Ted Yu (JIRA)
Two methods in CacheTestUtils don't call setDaemon() on the threads
---

 Key: HBASE-4428
 URL: https://issues.apache.org/jira/browse/HBASE-4428
 Project: HBase
  Issue Type: Bug
Reporter: Ted Yu
Assignee: Ted Yu


hammerSingleKey() and hammerEviction() don't call t.setDaemon(true) on the 
threads they create.
The correct pattern is:
{code}
  t.setDaemon(true);
  ctx.addThread(t);
{code}
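
A self-contained sketch of the pattern above. DaemonThreadDemo and its Context 
class are hypothetical stand-ins for the test utilities; only 
Thread.setDaemon(true) and the addThread(t) call order come from this issue.

```java
import java.util.ArrayList;
import java.util.List;

class DaemonThreadDemo {
  // Hypothetical stand-in for the test context that tracks worker threads.
  static class Context {
    final List<Thread> threads = new ArrayList<>();

    void addThread(Thread t) {
      threads.add(t);
    }
  }

  // Builds 'count' unstarted worker threads, each marked as a daemon.
  static Context buildWorkers(int count) {
    Context ctx = new Context();
    for (int i = 0; i < count; i++) {
      Thread t = new Thread(() -> { /* test workload would go here */ });
      t.setDaemon(true); // must be called before start()
      ctx.addThread(t);
    }
    return ctx;
  }
}
```

Daemon threads do not keep the JVM alive, so a worker that hangs cannot by 
itself prevent the test JVM from exiting.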





[jira] [Updated] (HBASE-4428) Two methods in CacheTestUtils don't call setDaemon() on the threads

2011-09-17 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4428?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-4428:
--

Attachment: 4428.txt

Running test suite.

 Two methods in CacheTestUtils don't call setDaemon() on the threads
 ---

 Key: HBASE-4428
 URL: https://issues.apache.org/jira/browse/HBASE-4428
 Project: HBase
  Issue Type: Bug
Reporter: Ted Yu
Assignee: Ted Yu
 Attachments: 4428.txt






[jira] [Commented] (HBASE-4428) Two methods in CacheTestUtils don't call setDaemon() on the threads

2011-09-17 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13107146#comment-13107146
 ] 

Ted Yu commented on HBASE-4428:
---

Slab related tests passed in the test suite.

 Two methods in CacheTestUtils don't call setDaemon() on the threads
 ---

 Key: HBASE-4428
 URL: https://issues.apache.org/jira/browse/HBASE-4428
 Project: HBase
  Issue Type: Bug
Reporter: Ted Yu
Assignee: Ted Yu
 Attachments: 4428.txt






[jira] [Commented] (HBASE-4298) Support to drain RS nodes through ZK

2011-09-17 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13107149#comment-13107149
 ] 

Ted Yu commented on HBASE-4298:
---

I reviewed the patch for trunk.

For AssignmentManager.java:
{code}
+    ", exclude=" + drainingServers + ") available servers");
{code}
I think we only need to log the number of draining servers.

For ServerManager.java:
{code}
+  /** Map of region servers that should not get any more new regions */
+  private final Map<ServerName, HServerLoad> drainingServers =
+    new ConcurrentHashMap<ServerName, HServerLoad>();
{code}
The javadoc should state that keys of the map are region servers.

I think removeServerFromDrainList() should return a boolean. 
ServerManager.isServerOnline(sn) should be used instead of checking 
HServerLoad. If sn isn't online, the method should return false. Otherwise true 
is returned.

You may consider doing similar action in addServerToDrainList().

I wonder if Map is needed for drainingServers because it is private and 
getDrainingServersList() only returns the keySet.
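
The suggestions above could look roughly like the following sketch. DrainList, 
markOnline() and the use of String for server names are assumptions made to 
keep the example self-contained; this is not the code from the patch under 
review.

```java
import java.util.Collections;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

class DrainList {
  // Hypothetical stand-in for the set of servers the master considers online.
  private final Set<String> onlineServers = ConcurrentHashMap.newKeySet();
  // A Set is enough here because callers only ever need the server names.
  private final Set<String> drainingServers = ConcurrentHashMap.newKeySet();

  void markOnline(String sn) {
    onlineServers.add(sn);
  }

  boolean isServerOnline(String sn) {
    return onlineServers.contains(sn);
  }

  /** @return false if sn is not online, true otherwise. */
  boolean addServerToDrainList(String sn) {
    if (!isServerOnline(sn)) {
      return false;
    }
    drainingServers.add(sn);
    return true;
  }

  /** @return false if sn is not online, true otherwise. */
  boolean removeServerFromDrainList(String sn) {
    if (!isServerOnline(sn)) {
      return false;
    }
    drainingServers.remove(sn);
    return true;
  }

  Set<String> getDrainingServers() {
    return Collections.unmodifiableSet(drainingServers);
  }
}
```

Returning a boolean lets the caller distinguish a request for an unknown server 
from one that was applied.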

For DrainingServerTracker.java, please remove the year.
The calls to this.serverManager methods are handled differently in add() and 
remove(): one is inside the synchronized block, the other outside. Is there a 
reason?

More to follow.

 Support to drain RS nodes through ZK
 

 Key: HBASE-4298
 URL: https://issues.apache.org/jira/browse/HBASE-4298
 Project: HBase
  Issue Type: Improvement
  Components: master
Affects Versions: 0.90.4
 Environment: all
Reporter: Aravind Gottipati
Priority: Critical
  Labels: patch
 Fix For: 0.92.0, 0.90.5


 HDFS currently has a way to exclude certain datanodes and prevent them from 
 getting new blocks.  HDFS goes one step further and even drains these nodes 
 for you.  This enhancement is a step in that direction.
 The idea is that we mark nodes in zookeeper as draining nodes.  This means 
 that they don't get any more new regions.  These draining nodes look exactly 
 the same as the corresponding nodes in /rs, except they live under /draining.
 Eventually, support for draining them can be added.  I am submitting two 
 patches for review - one for the 0.90 branch and one for trunk (in git).
 Here are the two patches
 0.90 - 
 https://github.com/aravind/hbase/commit/181041e72e7ffe6a4da6d82b431ef7f8c99e62d2
 trunk - 
 https://github.com/aravind/hbase/commit/e127b25ae3b4034103b185d8380f3b7267bc67d5
 I have tested both these patches and they work as advertised.





[jira] [Commented] (HBASE-4298) Support to drain RS nodes through ZK

2011-09-17 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13107154#comment-13107154
 ] 

Ted Yu commented on HBASE-4298:
---

For nodeChildrenChanged(), please change the message in the catch block for 
IOException; it currently mentions a ZK exception.

For ZooKeeperWatcher.java:
{code}
+conf.get("zookeeper.znode.draining", "draining"));
{code}
I think a better name may be zookeeper.znode.draining.rs

Can you write some unit tests for this feature?
Please also share your experience from using this in your environment.

 Support to drain RS nodes through ZK
 

 Key: HBASE-4298
 URL: https://issues.apache.org/jira/browse/HBASE-4298
 Project: HBase
  Issue Type: Improvement
  Components: master
Affects Versions: 0.90.4
 Environment: all
Reporter: Aravind Gottipati
Priority: Critical
  Labels: patch
 Fix For: 0.92.0, 0.90.5






[jira] [Updated] (HBASE-4400) .META. getting stuck if RS hosting it is dead and znode state is in RS_ZK_REGION_OPENED

2011-09-17 Thread ramkrishna.s.vasudevan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4400?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ramkrishna.s.vasudevan updated HBASE-4400:
--

Status: Open  (was: Patch Available)

 .META. getting stuck if RS hosting it is dead and znode state is in 
 RS_ZK_REGION_OPENED
 ---

 Key: HBASE-4400
 URL: https://issues.apache.org/jira/browse/HBASE-4400
 Project: HBase
  Issue Type: Bug
Reporter: ramkrishna.s.vasudevan
Assignee: ramkrishna.s.vasudevan
 Fix For: 0.92.0, 0.90.5

 Attachments: HBASE-4400_0.90.patch, HBASE-4400_trunk.patch






[jira] [Updated] (HBASE-4400) .META. getting stuck if RS hosting it is dead and znode state is in RS_ZK_REGION_OPENED

2011-09-17 Thread ramkrishna.s.vasudevan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4400?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ramkrishna.s.vasudevan updated HBASE-4400:
--

Attachment: HBASE-4400_0.90_1.patch

 .META. getting stuck if RS hosting it is dead and znode state is in 
 RS_ZK_REGION_OPENED
 ---

 Key: HBASE-4400
 URL: https://issues.apache.org/jira/browse/HBASE-4400
 Project: HBase
  Issue Type: Bug
Reporter: ramkrishna.s.vasudevan
Assignee: ramkrishna.s.vasudevan
 Fix For: 0.92.0, 0.90.5

 Attachments: HBASE-4400_0.90.patch, HBASE-4400_0.90_1.patch, 
 HBASE-4400_trunk.patch, HBASE-4400_trunk_1.patch






[jira] [Updated] (HBASE-4400) .META. getting stuck if RS hosting it is dead and znode state is in RS_ZK_REGION_OPENED

2011-09-17 Thread ramkrishna.s.vasudevan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4400?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ramkrishna.s.vasudevan updated HBASE-4400:
--

Attachment: HBASE-4400_trunk_1.patch

Addressing Ted's comments: removing the unused variable and moving the creation 
of the OPENED node to HBaseTestingUtility.

 .META. getting stuck if RS hosting it is dead and znode state is in 
 RS_ZK_REGION_OPENED
 ---

 Key: HBASE-4400
 URL: https://issues.apache.org/jira/browse/HBASE-4400
 Project: HBase
  Issue Type: Bug
Reporter: ramkrishna.s.vasudevan
Assignee: ramkrishna.s.vasudevan
 Fix For: 0.92.0, 0.90.5

 Attachments: HBASE-4400_0.90.patch, HBASE-4400_0.90_1.patch, 
 HBASE-4400_trunk.patch, HBASE-4400_trunk_1.patch






[jira] [Updated] (HBASE-4400) .META. getting stuck if RS hosting it is dead and znode state is in RS_ZK_REGION_OPENED

2011-09-17 Thread ramkrishna.s.vasudevan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4400?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ramkrishna.s.vasudevan updated HBASE-4400:
--

Status: Patch Available  (was: Open)

 .META. getting stuck if RS hosting it is dead and znode state is in 
 RS_ZK_REGION_OPENED
 ---

 Key: HBASE-4400
 URL: https://issues.apache.org/jira/browse/HBASE-4400
 Project: HBase
  Issue Type: Bug
Reporter: ramkrishna.s.vasudevan
Assignee: ramkrishna.s.vasudevan
 Fix For: 0.92.0, 0.90.5

 Attachments: HBASE-4400_0.90.patch, HBASE-4400_0.90_1.patch, 
 HBASE-4400_trunk.patch, HBASE-4400_trunk_1.patch


 Start 2 RS.
 The .META. is being hosted by RS2 but while processing it goes down.
 Now restart the master and RS1.  Master gets the RS name from the znode in 
 RS_ZK_REGION_OPENED.  But as RS2 is not online still the master is not able 
 to process the META at all.  Please find the logs
 {noformat}
 2011-09-14 16:43:51,949 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Handling 
 transition=RS_ZK_REGION_OPENING, server=linux76,60020,1315998828523, 
 region=70236052/-ROOT-
 2011-09-14 16:43:51,968 INFO org.apache.hadoop.hbase.master.HMaster: -ROOT- 
 assigned=1, rit=false, location=linux76:60020
 2011-09-14 16:43:51,970 INFO 
 org.apache.hadoop.hbase.master.AssignmentManager: Processing region 
 .META.,,1.1028785192 in state RS_ZK_REGION_OPENED
 2011-09-14 16:43:51,970 INFO 
 org.apache.hadoop.hbase.master.AssignmentManager: Failed to find 
 linux146,60020,1315998414623 in list of online servers; skipping registration 
 of open of .META.,,1.1028785192
 2011-09-14 16:43:51,971 INFO 
 org.apache.hadoop.hbase.master.AssignmentManager: Waiting on 1028785192/.META.
 2011-09-14 16:43:51,983 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Handling 
 transition=RS_ZK_REGION_OPENED, server=linux76,60020,1315998828523, 
 region=70236052/-ROOT-
 2011-09-14 16:43:51,986 DEBUG 
 org.apache.hadoop.hbase.master.handler.OpenedRegionHandler: Handling OPENED 
 event for 70236052; deleting unassigned node
 2011-09-14 16:43:51,986 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: 
 master:6-0x13267854032001d Deleting existing unassigned node for 70236052 
 that is in expected state RS_ZK_REGION_OPENED
 2011-09-14 16:43:51,998 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: 
 master:6-0x13267854032001d Successfully deleted unassigned node for 
 region 70236052 in expected state RS_ZK_REGION_OPENED
 2011-09-14 16:43:51,999 DEBUG 
 org.apache.hadoop.hbase.master.handler.OpenedRegionHandler: Opened region 
 -ROOT-,,0.70236052 on linux76,60020,1315998828523
 2011-09-14 16:44:00,945 INFO org.apache.hadoop.hbase.master.ServerManager: 
 Registering server=linux146,60020,1315998839724, regionCount=0, userLoad=false
 2011-09-14 16:46:20,003 INFO 
 org.apache.hadoop.hbase.master.AssignmentManager: Regions in transition timed 
 out:  .META.,,1.1028785192 state=OPEN, ts=0
 2011-09-14 16:46:20,004 ERROR 
 org.apache.hadoop.hbase.master.AssignmentManager: Region has been OPEN for 
 too long, we don't know where region was opened so can't do anything
 {noformat}
 {code}
 regionsInTransition.put(encodedRegionName, new RegionState(
 regionInfo, RegionState.State.OPEN, data.getStamp()));
   
 } else {
   HServerInfo hsi = this.serverManager.getServerInfo(sn);
   if (hsi == null) {
 LOG.info("Failed to find " + sn +
   " in list of online servers; skipping registration of open of " +
   regionInfo.getRegionNameAsString());
   } else {
 new OpenedRegionHandler(master, this, regionInfo, hsi).process();
   }
 }
 {code}
 So the timeout monitor is not able to do anything here:
 {code}
   LOG.error("Region has been OPEN for too long, " +
     "we don't know where region was opened so can't do anything");
   synchronized(regionState) {
 regionState.update(regionState.getState());
   }
 {code}





[jira] [Updated] (HBASE-4213) Support for fault tolerant, instant schema updates with out master's intervention (i.e with out enable/disable and bulk assign/unassign) through ZK.

2011-09-17 Thread Subbu M Iyer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4213?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Subbu M Iyer updated HBASE-4213:


Attachment: 4213-V10-Support_instant_schema_changes_through_ZK.patch

 Support for fault tolerant, instant schema updates with out master's 
 intervention (i.e with out enable/disable and bulk assign/unassign) through 
 ZK.
 

 Key: HBASE-4213
 URL: https://issues.apache.org/jira/browse/HBASE-4213
 Project: HBase
  Issue Type: Improvement
Reporter: Subbu M Iyer
Assignee: Subbu M Iyer
 Fix For: 0.92.0

 Attachments: 4213-Instant_Schema_change_through_ZK.patch, 
 4213-V10-Support_instant_schema_changes_through_ZK.patch, 
 4213-V5-Support_instant_schema_changes_through_ZK.patch, 
 4213-V7-Support_instant_schema_changes_through_ZK.patch, 
 4213-V8-Support_instant_schema_changes_through_ZK.patch, 
 4213-V9-Support_instant_schema_changes_through_ZK.patch, 4213-v9.txt, 
 4213.v6, HBASE-4213-Instant_schema_change.patch, 
 HBASE-4213_Instant_schema_change_-Version_2_.patch, 
 HBASE_Instant_schema_change-version_3_.patch


 This Jira is a slight variation in approach to what is being done as part of 
 https://issues.apache.org/jira/browse/HBASE-1730
 Support instant schema updates such as Modify Table, Add Column, Modify 
 Column operations:
 1. Without enabling/disabling the table.
 2. Without bulk unassign/assign of regions.





[jira] [Commented] (HBASE-4428) Two methods in CacheTestUtils don't call setDaemon() on the threads

2011-09-17 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13107171#comment-13107171
 ] 

stack commented on HBASE-4428:
--

+1

 Two methods in CacheTestUtils don't call setDaemon() on the threads
 ---

 Key: HBASE-4428
 URL: https://issues.apache.org/jira/browse/HBASE-4428
 Project: HBase
  Issue Type: Bug
Reporter: Ted Yu
Assignee: Ted Yu
 Attachments: 4428.txt


 hammerSingleKey() and hammerEviction() don't call t.setDaemon(true) on the 
 threads they create.
 The correct pattern is:
 {code}
   t.setDaemon(true);
   ctx.addThread(t);
 {code}
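
The pattern above can be sketched in isolation as follows. This is a minimal illustration only: Ctx is a hypothetical stand-in for the test context that collects the threads, not the class used in CacheTestUtils. The one firm rule it relies on is that Thread.setDaemon() must be called before the thread is started.

```java
import java.util.ArrayList;
import java.util.List;

class DaemonThreadSketch {
    // Hypothetical stand-in for the test context that tracks worker threads.
    static final class Ctx {
        private final List<Thread> threads = new ArrayList<>();

        void addThread(Thread t) {
            threads.add(t);
        }

        List<Thread> threads() {
            return threads;
        }
    }

    // Creates a worker, marks it as a daemon, then registers it with the context.
    static Thread register(Ctx ctx, Runnable work) {
        Thread t = new Thread(work);
        t.setDaemon(true); // must happen before start()
        ctx.addThread(t);
        return t;
    }

    public static void main(String[] args) throws InterruptedException {
        Ctx ctx = new Ctx();
        Thread t = register(ctx, () -> { });
        t.start();
        t.join();
        System.out.println("daemon=" + t.isDaemon());
    }
}
```

A daemon thread does not keep the JVM alive, so a worker that hangs cannot block the test run from exiting.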





[jira] [Commented] (HBASE-4388) Second start after migration from 90 to trunk crashes

2011-09-17 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13107172#comment-13107172
 ] 

stack commented on HBASE-4388:
--

@Ted true... but two bytes seems fine here.

 Second start after migration from 90 to trunk crashes
 -

 Key: HBASE-4388
 URL: https://issues.apache.org/jira/browse/HBASE-4388
 Project: HBase
  Issue Type: Bug
  Components: master, migration
Affects Versions: 0.92.0
Reporter: Todd Lipcon
Assignee: Ted Yu
Priority: Blocker
 Fix For: 0.92.0

 Attachments: 4388.txt, meta.tgz


 I started a trunk cluster to upgrade from 90, inserted a ton of data, then 
 did a clean shutdown. When I started again, I got the following exception:
 11/09/13 12:29:09 INFO master.HMaster: Meta has HRI with HTDs. Updating meta now.
 11/09/13 12:29:09 FATAL master.HMaster: Unhandled exception. Starting shutdown.
 java.lang.NegativeArraySizeException: -102
     at org.apache.hadoop.hbase.util.Bytes.readByteArray(Bytes.java:147)
     at org.apache.hadoop.hbase.HTableDescriptor.readFields(HTableDescriptor.java:606)
     at org.apache.hadoop.hbase.migration.HRegionInfo090x.readFields(HRegionInfo090x.java:641)
     at org.apache.hadoop.hbase.util.Writables.getWritable(Writables.java:133)
     at org.apache.hadoop.hbase.util.Writables.getWritable(Writables.java:103)
     at org.apache.hadoop.hbase.util.Writables.getHRegionInfoForMigration(Writables.java:228)
     at org.apache.hadoop.hbase.catalog.MetaEditor.getHRegionInfoForMigration(MetaEditor.java:350)
     at org.apache.hadoop.hbase.catalog.MetaEditor$1.visit(MetaEditor.java:273)
     at org.apache.hadoop.hbase.catalog.MetaReader.fullScan(MetaReader.java:633)
     at org.apache.hadoop.hbase.catalog.MetaReader.fullScan(MetaReader.java:255)
     at org.apache.hadoop.hbase.catalog.MetaReader.fullScan(MetaReader.java:235)
     at org.apache.hadoop.hbase.catalog.MetaEditor.updateMetaWithNewRegionInfo(MetaEditor.java:284)
     at org.apache.hadoop.hbase.catalog.MetaEditor.migrateRootAndMeta(MetaEditor.java:298)
     at org.apache.hadoop.hbase.master.HMaster.updateMetaWithNewHRI(HMaster.java:529)
     at org.apache.hadoop.hbase.master.HMaster.finishInitialization(HMaster.java:472)
     at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:309)
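
The trace ends in a length-prefixed read allocating an array of size -102. The sketch below shows how that class of failure arises when a reader interprets bytes written in a different layout. It is a simplified model, not the HBase code: LengthPrefixSketch is hypothetical, and it assumes a one-byte signed length prefix, whereas the real reader may encode the length differently.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

class LengthPrefixSketch {
    // Reads a one-byte signed length, then that many payload bytes.
    // Simplified stand-in for a length-prefixed reader; NOT the HBase implementation.
    static byte[] readByteArray(byte[] raw) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(raw))) {
            int len = in.readByte();       // signed byte: 0x9A is read as -102
            byte[] result = new byte[len]; // a negative len throws NegativeArraySizeException
            in.readFully(result);
            return result;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(readByteArray(new byte[] {2, 7, 8}).length);
        try {
            readByteArray(new byte[] {(byte) 0x9A});
        } catch (NegativeArraySizeException e) {
            System.out.println("bad length prefix: " + e.getMessage());
        }
    }
}
```

The point of the sketch is only that a misaligned or re-encoded record surfaces as a nonsensical length rather than as a clear format error.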





[jira] [Commented] (HBASE-4427) It would help to run a standalone HBase's ZK on a different port

2011-09-17 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13107174#comment-13107174
 ] 

stack commented on HBASE-4427:
--

If we change this in hbase-default.xml, it will change it for all runs of HBase, 
not just standalone.  That is probably not what you want.  I see that 
src/test/resources/hbase-site.xml, the configuration we use for tests, 
sets a non-default port for ZK.  Is it too much trouble to set this config in 
your local standalone instance, Roman?  Otherwise it seems like a pretty 
disruptive change to make.

 It would help to run a standalone HBase's ZK on a different port
 

 Key: HBASE-4427
 URL: https://issues.apache.org/jira/browse/HBASE-4427
 Project: HBase
  Issue Type: Improvement
  Components: zookeeper
Affects Versions: 0.90.4
Reporter: Roman Shaposhnik
Assignee: Roman Shaposhnik
Priority: Minor

 It would be extremely helpful to have standalone HBase default to a 
 non-standard port for running its embedded ZK. This would help when running HBase 
 on the same host as a legitimate fully distributed ZK server, etc.
 It seems that the following addition to hbase-default.xml would be enough to 
 make it happen:
 {noformat}
 +  <property>
 +    <name>hbase.zookeeper.property.clientPort</name>
 +    <value>4181</value>
 +  </property>
 {noformat}
 This will take care of the master/client for HBase and can be overridden in 
 hbase-site if needed.
 Thoughts?
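
The layering described above (a value in hbase-default.xml that hbase-site can replace) can be modelled with plain java.util.Properties. This is only an illustration of "site overrides default" and does not use the Hadoop or HBase configuration classes; ClientPortOverrideSketch is hypothetical, and the final fallback of 2181 is ZooKeeper's conventional client port.

```java
import java.util.Properties;

class ClientPortOverrideSketch {
    static final String KEY = "hbase.zookeeper.property.clientPort";

    // Resolves the client port: a site value wins over a default value,
    // and 2181 is used when neither layer sets the key.
    static int clientPort(Properties defaults, Properties site) {
        Properties merged = new Properties(defaults); // defaults act as the fallback layer
        merged.putAll(site);                          // site entries shadow the defaults
        return Integer.parseInt(merged.getProperty(KEY, "2181"));
    }

    public static void main(String[] args) {
        Properties defaults = new Properties();
        defaults.setProperty(KEY, "4181"); // the value proposed in this issue
        Properties site = new Properties();
        System.out.println(clientPort(defaults, site));
        site.setProperty(KEY, "2181");     // an operator overriding it locally
        System.out.println(clientPort(defaults, site));
    }
}
```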





[jira] [Updated] (HBASE-4369) Deprecate HConnection#getZookeeperWatcher in prep for HBASE-1762

2011-09-17 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4369?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-4369:
-

  Resolution: Fixed
Hadoop Flags: [Incompatible change, Reviewed]
  Status: Resolved  (was: Patch Available)

Committed to TRUNK. Thanks for review Andrew.

 Deprecate HConnection#getZookeeperWatcher in prep for HBASE-1762
 

 Key: HBASE-4369
 URL: https://issues.apache.org/jira/browse/HBASE-4369
 Project: HBase
  Issue Type: Task
Reporter: stack
Assignee: stack
Priority: Critical
 Fix For: 0.92.0

 Attachments: 4369.txt


 Need a +1 on this from someone else who agrees HBASE-1762 should be done.  
 Makes sense to me.   Will take a little bit of work doing the actual removal 
 over HBASE-1762 but first step is this deprecating step.





[jira] [Commented] (HBASE-4400) .META. getting stuck if RS hosting it is dead and znode state is in RS_ZK_REGION_OPENED

2011-09-17 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13107180#comment-13107180
 ] 

stack commented on HBASE-4400:
--

+1  Nice test.

 .META. getting stuck if RS hosting it is dead and znode state is in 
 RS_ZK_REGION_OPENED
 ---

 Key: HBASE-4400
 URL: https://issues.apache.org/jira/browse/HBASE-4400
 Project: HBase
  Issue Type: Bug
Reporter: ramkrishna.s.vasudevan
Assignee: ramkrishna.s.vasudevan
 Fix For: 0.92.0, 0.90.5

 Attachments: HBASE-4400_0.90.patch, HBASE-4400_0.90_1.patch, 
 HBASE-4400_trunk.patch, HBASE-4400_trunk_1.patch


 Start 2 RS.
 .META. is hosted by RS2, but RS2 goes down during processing.
 Now restart the master and RS1.  The master gets the RS name from the znode in 
 RS_ZK_REGION_OPENED, but because RS2 is still not online, the master is not able 
 to process .META. at all.  Please find the logs below:
 {noformat}
 2011-09-14 16:43:51,949 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Handling 
 transition=RS_ZK_REGION_OPENING, server=linux76,60020,1315998828523, 
 region=70236052/-ROOT-
 2011-09-14 16:43:51,968 INFO org.apache.hadoop.hbase.master.HMaster: -ROOT- 
 assigned=1, rit=false, location=linux76:60020
 2011-09-14 16:43:51,970 INFO 
 org.apache.hadoop.hbase.master.AssignmentManager: Processing region 
 .META.,,1.1028785192 in state RS_ZK_REGION_OPENED
 2011-09-14 16:43:51,970 INFO 
 org.apache.hadoop.hbase.master.AssignmentManager: Failed to find 
 linux146,60020,1315998414623 in list of online servers; skipping registration 
 of open of .META.,,1.1028785192
 2011-09-14 16:43:51,971 INFO 
 org.apache.hadoop.hbase.master.AssignmentManager: Waiting on 1028785192/.META.
 2011-09-14 16:43:51,983 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Handling 
 transition=RS_ZK_REGION_OPENED, server=linux76,60020,1315998828523, 
 region=70236052/-ROOT-
 2011-09-14 16:43:51,986 DEBUG 
 org.apache.hadoop.hbase.master.handler.OpenedRegionHandler: Handling OPENED 
 event for 70236052; deleting unassigned node
 2011-09-14 16:43:51,986 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: 
 master:6-0x13267854032001d Deleting existing unassigned node for 70236052 
 that is in expected state RS_ZK_REGION_OPENED
 2011-09-14 16:43:51,998 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: 
 master:6-0x13267854032001d Successfully deleted unassigned node for 
 region 70236052 in expected state RS_ZK_REGION_OPENED
 2011-09-14 16:43:51,999 DEBUG 
 org.apache.hadoop.hbase.master.handler.OpenedRegionHandler: Opened region 
 -ROOT-,,0.70236052 on linux76,60020,1315998828523
 2011-09-14 16:44:00,945 INFO org.apache.hadoop.hbase.master.ServerManager: 
 Registering server=linux146,60020,1315998839724, regionCount=0, userLoad=false
 2011-09-14 16:46:20,003 INFO 
 org.apache.hadoop.hbase.master.AssignmentManager: Regions in transition timed 
 out:  .META.,,1.1028785192 state=OPEN, ts=0
 2011-09-14 16:46:20,004 ERROR 
 org.apache.hadoop.hbase.master.AssignmentManager: Region has been OPEN for 
 too long, we don't know where region was opened so can't do anything
 {noformat}
 {code}
 regionsInTransition.put(encodedRegionName, new RegionState(
 regionInfo, RegionState.State.OPEN, data.getStamp()));
   
 } else {
   HServerInfo hsi = this.serverManager.getServerInfo(sn);
   if (hsi == null) {
 LOG.info("Failed to find " + sn +
   " in list of online servers; skipping registration of open of " +
   regionInfo.getRegionNameAsString());
   } else {
 new OpenedRegionHandler(master, this, regionInfo, hsi).process();
   }
 }
 {code}
 So the timeout monitor is not able to do anything here:
 {code}
   LOG.error("Region has been OPEN for too long, " +
     "we don't know where region was opened so can't do anything");
   synchronized(regionState) {
 regionState.update(regionState.getState());
   }
 {code}
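
The decision at the heart of this report can be sketched as below. This is one possible way to handle the case, written as an assumption for illustration; it is not taken from the attached patches. OpenedOnDeadServerSketch and its Action enum are hypothetical. Note from the logs that the server name includes a start code, so the restarted linux146 (start code 1315998839724) does not match the name stored in the znode (start code 1315998414623).

```java
import java.util.HashSet;
import java.util.Set;

class OpenedOnDeadServerSketch {
    enum Action { REGISTER_OPEN, REASSIGN }

    // Decides what to do with a region whose znode says OPENED on serverName.
    // Assumption: when that server is not online, the region is handed back
    // for assignment instead of only being logged and skipped.
    static Action onOpenedZnode(String serverName, Set<String> onlineServers) {
        if (onlineServers.contains(serverName)) {
            return Action.REGISTER_OPEN;
        }
        return Action.REASSIGN;
    }

    public static void main(String[] args) {
        Set<String> online = new HashSet<>();
        online.add("linux76,60020,1315998828523");
        System.out.println(onOpenedZnode("linux146,60020,1315998414623", online));
    }
}
```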





[jira] [Updated] (HBASE-3855) Performance degradation of memstore because reseek is linear

2011-09-17 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-3855?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-3855:
-

Fix Version/s: (was: 0.92.0)
   0.90.5

OK. Moving to 0.90.5.  I did not apply 4195 to the branch BECAUSE it does not 
apply over on the branch (which means I must have been dreaming yesterday when 
I thought I was testing 4195 on 0.90 -- I must have been running it on TRUNK).  
Leaving this as open against 0.90.5 rather than against 0.92 since we don't 
seem to have the issue that caused the reopen in TRUNK (and 4195 improves on 
the original patch here anyways).

 Performance degradation of memstore because reseek is linear
 

 Key: HBASE-3855
 URL: https://issues.apache.org/jira/browse/HBASE-3855
 Project: HBase
  Issue Type: Improvement
Reporter: dhruba borthakur
Priority: Blocker
 Fix For: 0.90.5

 Attachments: memstoreReseek.txt, memstoreReseek2.txt


 The scanner uses reseek to find the next row (or next column) as part of a 
 scan. The reseek code iterates over a Set to position itself at the right 
 place. If there are many thousands of kvs that need to be skipped over, then 
 the time-cost is very high. In this case, a seek would be far cheaper 
 than a reseek.
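
The cost difference can be illustrated with a sorted set from the JDK. This is a sketch under assumptions: integers stand in for kvs, ConcurrentSkipListSet stands in for whatever sorted structure the memstore uses, and ReseekSketch is hypothetical. The linear version visits every skipped element, while the seek version asks the sorted structure to locate the target directly.

```java
import java.util.Iterator;
import java.util.NavigableSet;
import java.util.concurrent.ConcurrentSkipListSet;

class ReseekSketch {
    // Linear "reseek": step forward until an element >= target is found.
    // Returns the number of elements visited along the way.
    static int linearReseekSteps(NavigableSet<Integer> kvs, int target) {
        int steps = 0;
        Iterator<Integer> it = kvs.iterator();
        while (it.hasNext()) {
            steps++;
            if (it.next() >= target) {
                break;
            }
        }
        return steps;
    }

    // "Seek": let the sorted structure find the first element >= target.
    static Integer seek(NavigableSet<Integer> kvs, int target) {
        return kvs.ceiling(target);
    }

    public static void main(String[] args) {
        NavigableSet<Integer> kvs = new ConcurrentSkipListSet<>();
        for (int i = 0; i < 10_000; i++) {
            kvs.add(i);
        }
        System.out.println("steps=" + linearReseekSteps(kvs, 9_000));
        System.out.println("seek=" + seek(kvs, 9_000));
    }
}
```

When only a handful of elements need skipping, the linear walk can still be the cheaper choice, which is presumably why reseek exists at all.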





[jira] [Commented] (HBASE-3833) ability to support includes/excludes list in Hbase

2011-09-17 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13107191#comment-13107191
 ] 

stack commented on HBASE-3833:
--

@Vishal Any luck w/ hacking up another patch?

 ability to support includes/excludes list in Hbase
 --

 Key: HBASE-3833
 URL: https://issues.apache.org/jira/browse/HBASE-3833
 Project: HBase
  Issue Type: Improvement
  Components: client, regionserver
Affects Versions: 0.90.2
Reporter: dhruba borthakur
Assignee: dhruba borthakur
 Attachments: excl-patch.txt, excl-patch.txt


 An HBase cluster currently does not have the ability to specify that the 
 master should accept regionservers only from a specified list. Such a list would 
 help prevent administrative errors where the same machine could be included in 
 two clusters. It would also allow the administrator to easily remove un-ssh-able 
 machines from the cluster.





[jira] [Commented] (HBASE-4298) Support to drain RS nodes through ZK

2011-09-17 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13107194#comment-13107194
 ] 

stack commented on HBASE-4298:
--

@Aravind You should remove the createNode rather than just comment it out.  Nice 
feature.  Do nodes get poked into /draining by an external process?  Let's work on 
a unit test for this stuff (and address Ted's comment above).  When you get a 
chance, stick in some of your experience running this patch here.

 Support to drain RS nodes through ZK
 

 Key: HBASE-4298
 URL: https://issues.apache.org/jira/browse/HBASE-4298
 Project: HBase
  Issue Type: Improvement
  Components: master
Affects Versions: 0.90.4
 Environment: all
Reporter: Aravind Gottipati
Priority: Critical
  Labels: patch
 Fix For: 0.92.0, 0.90.5


 HDFS currently has a way to exclude certain datanodes and prevent them from 
 getting new blocks.  HDFS goes one step further and even drains these nodes 
 for you.  This enhancement is a step in that direction.
 The idea is that we mark nodes in zookeeper as draining nodes.  This means 
 that they don't get any more new regions.  These draining nodes look exactly 
 the same as the corresponding nodes in /rs, except they live under /draining.
 Eventually, support for draining them can be added.  I am submitting two 
 patches for review - one for the 0.90 branch and one for trunk (in git).
 Here are the two patches
 0.90 - 
 https://github.com/aravind/hbase/commit/181041e72e7ffe6a4da6d82b431ef7f8c99e62d2
 trunk - 
 https://github.com/aravind/hbase/commit/e127b25ae3b4034103b185d8380f3b7267bc67d5
 I have tested both these patches and they work as advertised.
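
The idea above, that servers listed under /draining stop receiving new regions, amounts to a set difference when picking assignment targets. The sketch below shows only that selection step; it does not talk to ZooKeeper, DrainingSketch is hypothetical, and the server names are made up.

```java
import java.util.Arrays;
import java.util.Set;
import java.util.TreeSet;

class DrainingSketch {
    // Servers eligible for new regions: everything under /rs minus everything under /draining.
    static Set<String> assignable(Set<String> online, Set<String> draining) {
        Set<String> result = new TreeSet<>(online);
        result.removeAll(draining);
        return result;
    }

    public static void main(String[] args) {
        Set<String> online = new TreeSet<>(Arrays.asList("rs1", "rs2", "rs3"));
        Set<String> draining = new TreeSet<>(Arrays.asList("rs2"));
        System.out.println(assignable(online, draining));
    }
}
```

Regions already hosted on a draining server are untouched by this step, which matches the description: draining them is left for later.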





[jira] [Updated] (HBASE-4419) Resolve build warning messages

2011-09-17 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4419?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-4419:
-

  Resolution: Fixed
Hadoop Flags: [Reviewed]
  Status: Resolved  (was: Patch Available)

Committed to TRUNK.  I see this still:

{code}
[WARNING] Assembly file: 
/Users/stack/checkout/clean-trunk/target/hbase-0.91.0-SNAPSHOT is not a regular 
file (it may be a directory). It cannot be atta
{code}

but other WARNINGs are gone.

Thank you Praveen.

 Resolve build warning messages
 --

 Key: HBASE-4419
 URL: https://issues.apache.org/jira/browse/HBASE-4419
 Project: HBase
  Issue Type: Task
  Components: build
Affects Versions: 0.92.0
Reporter: Praveen Patibandla
Priority: Trivial
 Fix For: 0.92.0

 Attachments: 4419-V1.patch, 4419.patch, 4419.patch


 This item is created to clean up the build log.





[jira] [Updated] (HBASE-4247) Add isAborted method to the Abortable interface

2011-09-17 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4247?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-4247:
-

   Resolution: Fixed
Fix Version/s: (was: 0.94.0)
   0.92.0
 Hadoop Flags: [Reviewed]
   Status: Resolved  (was: Patch Available)

Committed to TRUNK.  Thank you for the patch Akash.

 Add isAborted method to the Abortable interface
 ---

 Key: HBASE-4247
 URL: https://issues.apache.org/jira/browse/HBASE-4247
 Project: HBase
  Issue Type: Task
Reporter: Akash Ashok
Assignee: Akash Ashok
Priority: Minor
 Fix For: 0.92.0

 Attachments: HBase-4247-v2.patch, HBase-4247.patch


 Add a new method isAborted() to the Abortable interface 
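
A minimal sketch of the interface shape this issue describes is shown below. The abort(String, Throwable) signature is written from memory of the HBase interface and should be treated as an assumption; SimpleServer is a hypothetical implementer added for illustration.

```java
class AbortableSketch {
    // Interface shape after the change: callers can now ask whether abort() was invoked.
    interface Abortable {
        void abort(String why, Throwable e);

        boolean isAborted();
    }

    // Hypothetical implementer that records the abort in a volatile flag
    // so that other threads observe it.
    static final class SimpleServer implements Abortable {
        private volatile boolean aborted = false;
        private volatile String reason;

        @Override
        public void abort(String why, Throwable e) {
            this.reason = why;
            this.aborted = true;
        }

        @Override
        public boolean isAborted() {
            return aborted;
        }

        String reason() {
            return reason;
        }
    }

    public static void main(String[] args) {
        SimpleServer server = new SimpleServer();
        server.abort("lost ZooKeeper session", null);
        System.out.println(server.isAborted() + " " + server.reason());
    }
}
```

Any existing class that implements the interface has to add the new method before it compiles again.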





[jira] [Updated] (HBASE-4247) Add isAborted method to the Abortable interface

2011-09-17 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4247?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-4247:
-

Release Note: Add isAborted to Abortable Interface.
Hadoop Flags: [Incompatible change, Reviewed]  (was: [Reviewed])

 Add isAborted method to the Abortable interface
 ---

 Key: HBASE-4247
 URL: https://issues.apache.org/jira/browse/HBASE-4247
 Project: HBase
  Issue Type: Task
Reporter: Akash Ashok
Assignee: Akash Ashok
Priority: Minor
 Fix For: 0.92.0

 Attachments: HBase-4247-v2.patch, HBase-4247.patch


 Add a new method isAborted() to the Abortable interface 





[jira] [Commented] (HBASE-4427) It would help to run a standalone HBase's ZK on a different port

2011-09-17 Thread Roman Shaposhnik (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13107204#comment-13107204
 ] 

Roman Shaposhnik commented on HBASE-4427:
-

It is mostly an out-of-the-box-experience kind of question, and you're absolutely 
right -- I can set it in my hbase-site.xml. However, I would like to understand 
a little better why you think it would be that disruptive. Suppose we 
change hbase-default.xml. The scenarios I see are these:
   * standalone mode -- will work perfectly
   * pseudo-distributed mode (hbase.cluster.distributed == true) with embedded 
ZK -- will work perfectly
   * pseudo-distributed mode (hbase.cluster.distributed == true) with 
standalone ZK (hbase.zookeeper.quorum != null) -- will work perfectly since, as 
far as I understand, hbase.zookeeper.quorum will override 
hbase.zookeeper.property.clientPort.
   * fully-distributed mode with standalone ZK -- see above. It makes little 
difference where the processes run; they will still require the same configs as 
far as hbase-site.xml is concerned.

Am I missing anything?

 It would help to run a standalone HBase's ZK on a different port
 

 Key: HBASE-4427
 URL: https://issues.apache.org/jira/browse/HBASE-4427
 Project: HBase
  Issue Type: Improvement
  Components: zookeeper
Affects Versions: 0.90.4
Reporter: Roman Shaposhnik
Assignee: Roman Shaposhnik
Priority: Minor

 It would be extremely helpful to have standalone HBase default to a 
 non-standard port for running its embedded ZK. This would help when running HBase 
 on the same host as a legitimate fully distributed ZK server, etc.
 It seems that the following addition to hbase-default.xml would be enough to 
 make it happen:
 {noformat}
 +  <property>
 +    <name>hbase.zookeeper.property.clientPort</name>
 +    <value>4181</value>
 +  </property>
 {noformat}
 This will take care of the master/client for HBase and can be overridden in 
 hbase-site if needed.
 Thoughts?





[jira] [Commented] (HBASE-3646) When mapper writes multiple values for a key keep chronological order of values

2011-09-17 Thread Jesse Yates (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13107210#comment-13107210
 ] 

Jesse Yates commented on HBASE-3646:


@Bob (or @stack) what is meant by 'chronological order' - the timestamp from 
HBase or the time that the key is written into the context? Also, in what 
context would the index argument be used? Would you be mapping values back to a 
row? When passing ImmutableBytesWritable to the reducer, are you just passing 
the Row returned from TableRecordReader or your own version? Sending the 
KeyValue pairs from the Result seems to make more sense, at least at first 
blush, to me.

I'm looking at this issue sight unseen (rather than actually having the problem 
myself), so a little in the dark. 

Been thinking about this for a while (and dug into the code a bit), and I'm 
thinking this may be a straight patch to 
org.apache.hadoop.mapreduce.Mapper.Context (est. in the first comment), if we 
need to do anything at all. Given that, do we even need to maintain this 
ticket? Is it just going to be used to track the change we would need to make in 
Hadoop-core?

It seems like if we were going to add something to HBase it would be a class 
that would bind KVs together and be comparable (a KeyValue that is 
WritableComparable), including sorting wrt timestamp (if that is what is meant 
by chronological). So just adding compareTo(KV) using the KeyValueComparator.
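
To make the last idea concrete, here is a rough sketch of a value that carries its own timestamp and sorts by it. It is plain Java rather than a Hadoop WritableComparable, every name in it is hypothetical, and it sorts oldest first, which is an assumption about what "chronological" should mean here.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class TimestampedValueSketch {
    // A value bound to its row and timestamp; ordering is by row, then by timestamp ascending.
    static final class TimestampedValue implements Comparable<TimestampedValue> {
        final String row;
        final long timestamp;
        final int value;

        TimestampedValue(String row, long timestamp, int value) {
            this.row = row;
            this.timestamp = timestamp;
            this.value = value;
        }

        @Override
        public int compareTo(TimestampedValue other) {
            int byRow = row.compareTo(other.row);
            return byRow != 0 ? byRow : Long.compare(timestamp, other.timestamp);
        }
    }

    // Returns the plain values in sorted (row, timestamp) order without modifying the input.
    static List<Integer> valuesInOrder(List<TimestampedValue> emitted) {
        List<TimestampedValue> sorted = new ArrayList<>(emitted);
        Collections.sort(sorted);
        List<Integer> values = new ArrayList<>();
        for (TimestampedValue tv : sorted) {
            values.add(tv.value);
        }
        return values;
    }

    public static void main(String[] args) {
        List<TimestampedValue> emitted = new ArrayList<>();
        emitted.add(new TimestampedValue("row1", 30L, 3));
        emitted.add(new TimestampedValue("row1", 10L, 1));
        emitted.add(new TimestampedValue("row1", 20L, 2));
        System.out.println(valuesInOrder(emitted));
    }
}
```

With something like this the reducer would not need marker tricks to tell values apart, since the order is carried by the values themselves.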

 When mapper writes multiple values for a key keep chronological order of 
 values
 ---

 Key: HBASE-3646
 URL: https://issues.apache.org/jira/browse/HBASE-3646
 Project: HBase
  Issue Type: New Feature
  Components: client
Affects Versions: 0.90.1
 Environment: Cloudera 3.5 VM 
 TableMapper<ImmutableBytesWritable,IntWritable>
 TableReducer<ImmutableBytesWritable,IntWritable, ImmutableBytesWritable>
Reporter: Bob Cummins
Priority: Minor

 When a mapper writes multiple values for a key, the underlying collection class 
 maps each of the values to the key, but not always in chronological order. If 
 chronological order were guaranteed for the values mapped to a key, 
 each value could be understood as a specific, distinct parameter passed 
 from the mapper to the reducer.
 I've done little tricks like having the mapper flag one of the values by 
 making it a negative number, which the reducer recognizes and can write to 
 HBase as a unique column value. This is a kludge workaround which it would be 
 nice not to have to do.
 Used to formulate this suggestion:
 TableMapper<ImmutableBytesWritable,IntWritable>
 TableReducer<ImmutableBytesWritable,IntWritable, ImmutableBytesWritable>





[jira] [Commented] (HBASE-4428) Two methods in CacheTestUtils don't call setDaemon() on the threads

2011-09-17 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13107218#comment-13107218
 ] 

Ted Yu commented on HBASE-4428:
---

Integrated to TRUNK.

Thanks for the review Stack.

 Two methods in CacheTestUtils don't call setDaemon() on the threads
 ---

 Key: HBASE-4428
 URL: https://issues.apache.org/jira/browse/HBASE-4428
 Project: HBase
  Issue Type: Bug
Reporter: Ted Yu
Assignee: Ted Yu
 Attachments: 4428.txt


 hammerSingleKey() and hammerEviction() don't call t.setDaemon(true) on the 
 threads they create.
 The correct pattern is:
 {code}
   t.setDaemon(true);
   ctx.addThread(t);
 {code}





[jira] [Commented] (HBASE-1762) Remove concept of ZooKeeper from HConnection interface

2011-09-17 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-1762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13107235#comment-13107235
 ] 

Hudson commented on HBASE-1762:
---

Integrated in HBase-TRUNK #2226 (See 
[https://builds.apache.org/job/HBase-TRUNK/2226/])
HBASE-4369 Deprecate HConnection#getZookeeperWatcher in prep for HBASE-1762

stack : 
Files : 
* /hbase/trunk/CHANGES.txt
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/HConnection.java


 Remove concept of ZooKeeper from HConnection interface
 --

 Key: HBASE-1762
 URL: https://issues.apache.org/jira/browse/HBASE-1762
 Project: HBase
  Issue Type: Improvement
  Components: client
Affects Versions: 0.20.0
Reporter: Ken Weiner
Assignee: stack
 Attachments: HBASE-1762.patch


 The concept of ZooKeeper is really an implementation detail and should not be 
 exposed in the {{HConnection}} interface.   Therefore, I suggest removing the 
 {{HConnection.getZooKeeperWrapper()}} method from the interface. 
 I couldn't find any uses of this method within the HBase code base except for 
 in one of the unit tests: {{org.apache.hadoop.hbase.TestZooKeeper}}.  This 
 unit test should be changed to instantiate the implementation of 
 {{HConnection}} directly, allowing it to use the {{getZooKeeperWrapper()}} 
 method.  This requires making 
 {{org.apache.hadoop.hbase.client.HConnectionManager.TableServers}} public.  
 (I actually think TableServers should be moved out into an outer class, but 
 in the spirit of small patches, I'll refrain from suggesting that in this 
 issue).
 I'll attach a patch for:
 # The removal of {{HConnection.getZooKeeperWrapper()}}
 # Change of {{TableServers}} class from private to public
 # Direct instantiation of {{TableServers}} within {{TestZooKeeper}}.





[jira] [Commented] (HBASE-4369) Deprecate HConnection#getZookeeperWatcher in prep for HBASE-1762

2011-09-17 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13107236#comment-13107236
 ] 

Hudson commented on HBASE-4369:
---

Integrated in HBase-TRUNK #2226 (See 
[https://builds.apache.org/job/HBase-TRUNK/2226/])
HBASE-4369 Deprecate HConnection#getZookeeperWatcher in prep for HBASE-1762

stack : 
Files : 
* /hbase/trunk/CHANGES.txt
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/HConnection.java


 Deprecate HConnection#getZookeeperWatcher in prep for HBASE-1762
 

 Key: HBASE-4369
 URL: https://issues.apache.org/jira/browse/HBASE-4369
 Project: HBase
  Issue Type: Task
Reporter: stack
Assignee: stack
Priority: Critical
 Fix For: 0.92.0

 Attachments: 4369.txt


 Need a +1 on this from someone else who agrees HBASE-1762 should be done.  
 Makes sense to me.   Will take a little bit of work doing the actual removal 
 over HBASE-1762 but first step is this deprecating step.





[jira] [Updated] (HBASE-4388) Second start after migration from 90 to trunk crashes

2011-09-17 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4388?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-4388:
--

Attachment: 4388-v2.txt

Patch version 2 adopts Todd's suggestion.

 Second start after migration from 90 to trunk crashes
 -

 Key: HBASE-4388
 URL: https://issues.apache.org/jira/browse/HBASE-4388
 Project: HBase
  Issue Type: Bug
  Components: master, migration
Affects Versions: 0.92.0
Reporter: Todd Lipcon
Assignee: Ted Yu
Priority: Blocker
 Fix For: 0.92.0

 Attachments: 4388-v2.txt, 4388.txt, meta.tgz


 I started a trunk cluster to upgrade from 90, inserted a ton of data, then 
 did a clean shutdown. When I started again, I got the following exception:
 11/09/13 12:29:09 INFO master.HMaster: Meta has HRI with HTDs. Updating meta now.
 11/09/13 12:29:09 FATAL master.HMaster: Unhandled exception. Starting shutdown.
 java.lang.NegativeArraySizeException: -102
   at org.apache.hadoop.hbase.util.Bytes.readByteArray(Bytes.java:147)
   at org.apache.hadoop.hbase.HTableDescriptor.readFields(HTableDescriptor.java:606)
   at org.apache.hadoop.hbase.migration.HRegionInfo090x.readFields(HRegionInfo090x.java:641)
   at org.apache.hadoop.hbase.util.Writables.getWritable(Writables.java:133)
   at org.apache.hadoop.hbase.util.Writables.getWritable(Writables.java:103)
   at org.apache.hadoop.hbase.util.Writables.getHRegionInfoForMigration(Writables.java:228)
   at org.apache.hadoop.hbase.catalog.MetaEditor.getHRegionInfoForMigration(MetaEditor.java:350)
   at org.apache.hadoop.hbase.catalog.MetaEditor$1.visit(MetaEditor.java:273)
   at org.apache.hadoop.hbase.catalog.MetaReader.fullScan(MetaReader.java:633)
   at org.apache.hadoop.hbase.catalog.MetaReader.fullScan(MetaReader.java:255)
   at org.apache.hadoop.hbase.catalog.MetaReader.fullScan(MetaReader.java:235)
   at org.apache.hadoop.hbase.catalog.MetaEditor.updateMetaWithNewRegionInfo(MetaEditor.java:284)
   at org.apache.hadoop.hbase.catalog.MetaEditor.migrateRootAndMeta(MetaEditor.java:298)
   at org.apache.hadoop.hbase.master.HMaster.updateMetaWithNewHRI(HMaster.java:529)
   at org.apache.hadoop.hbase.master.HMaster.finishInitialization(HMaster.java:472)
   at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:309)
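
 The top frame of the trace is a length-prefixed byte-array read. As a minimal, hypothetical sketch (this is not the HBase Bytes.readByteArray code; the class and method names are made up for illustration, and a single signed byte stands in for whatever length encoding the real reader uses), the following shows how bytes written in a layout the reader does not expect can surface as a NegativeArraySizeException, and how a reader could validate the length before allocating:

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;

// Hypothetical illustration only; not the HBase implementation.
class LengthPrefixedReader {

    // Naive read: trusts the length prefix, so a negative prefix
    // fails inside the array allocation.
    static byte[] readUnchecked(DataInputStream in) throws IOException {
        int len = in.readByte();        // signed byte used as the length prefix
        byte[] result = new byte[len];  // NegativeArraySizeException if len < 0
        in.readFully(result);
        return result;
    }

    // Defensive read: rejects a negative prefix with a descriptive error
    // instead of failing in the allocation.
    static byte[] readChecked(DataInputStream in) throws IOException {
        int len = in.readByte();
        if (len < 0) {
            throw new IOException("Negative length prefix " + len
                + "; data may be in an unexpected format");
        }
        byte[] result = new byte[len];
        in.readFully(result);
        return result;
    }

    // Convenience wrapper that turns raw bytes into a stream for the readers above.
    static DataInputStream streamOf(byte... bytes) {
        return new DataInputStream(new ByteArrayInputStream(bytes));
    }
}
```

 Feeding readUnchecked a stream whose first byte is -102 fails in the allocation, while readChecked reports the bad prefix as an IOException that a caller can handle.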





[jira] [Commented] (HBASE-4400) .META. getting stuck if RS hosting it is dead and znode state is in RS_ZK_REGION_OPENED

2011-09-17 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13107246#comment-13107246
 ] 

Ted Yu commented on HBASE-4400:
---

Integrated to TRUNK and branch.

Thanks for the patches Ramkrishna.

Thanks for the review Michael.

 .META. getting stuck if RS hosting it is dead and znode state is in 
 RS_ZK_REGION_OPENED
 ---

 Key: HBASE-4400
 URL: https://issues.apache.org/jira/browse/HBASE-4400
 Project: HBase
  Issue Type: Bug
Reporter: ramkrishna.s.vasudevan
Assignee: ramkrishna.s.vasudevan
 Fix For: 0.92.0, 0.90.5

 Attachments: HBASE-4400_0.90.patch, HBASE-4400_0.90_1.patch, 
 HBASE-4400_trunk.patch, HBASE-4400_trunk_1.patch


 Start 2 RS.
 The .META. is being hosted by RS2, but RS2 goes down while processing.
 Now restart the master and RS1.  The master gets the RS name from the znode in 
 RS_ZK_REGION_OPENED, but because RS2 is still not online, the master is not able 
 to process .META. at all.  The logs follow:
 {noformat}
 2011-09-14 16:43:51,949 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Handling 
 transition=RS_ZK_REGION_OPENING, server=linux76,60020,1315998828523, 
 region=70236052/-ROOT-
 2011-09-14 16:43:51,968 INFO org.apache.hadoop.hbase.master.HMaster: -ROOT- 
 assigned=1, rit=false, location=linux76:60020
 2011-09-14 16:43:51,970 INFO 
 org.apache.hadoop.hbase.master.AssignmentManager: Processing region 
 .META.,,1.1028785192 in state RS_ZK_REGION_OPENED
 2011-09-14 16:43:51,970 INFO 
 org.apache.hadoop.hbase.master.AssignmentManager: Failed to find 
 linux146,60020,1315998414623 in list of online servers; skipping registration 
 of open of .META.,,1.1028785192
 2011-09-14 16:43:51,971 INFO 
 org.apache.hadoop.hbase.master.AssignmentManager: Waiting on 1028785192/.META.
 2011-09-14 16:43:51,983 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Handling 
 transition=RS_ZK_REGION_OPENED, server=linux76,60020,1315998828523, 
 region=70236052/-ROOT-
 2011-09-14 16:43:51,986 DEBUG 
 org.apache.hadoop.hbase.master.handler.OpenedRegionHandler: Handling OPENED 
 event for 70236052; deleting unassigned node
 2011-09-14 16:43:51,986 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: 
 master:6-0x13267854032001d Deleting existing unassigned node for 70236052 
 that is in expected state RS_ZK_REGION_OPENED
 2011-09-14 16:43:51,998 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: 
 master:6-0x13267854032001d Successfully deleted unassigned node for 
 region 70236052 in expected state RS_ZK_REGION_OPENED
 2011-09-14 16:43:51,999 DEBUG 
 org.apache.hadoop.hbase.master.handler.OpenedRegionHandler: Opened region 
 -ROOT-,,0.70236052 on linux76,60020,1315998828523
 2011-09-14 16:44:00,945 INFO org.apache.hadoop.hbase.master.ServerManager: 
 Registering server=linux146,60020,1315998839724, regionCount=0, userLoad=false
 2011-09-14 16:46:20,003 INFO 
 org.apache.hadoop.hbase.master.AssignmentManager: Regions in transition timed 
 out:  .META.,,1.1028785192 state=OPEN, ts=0
 2011-09-14 16:46:20,004 ERROR 
 org.apache.hadoop.hbase.master.AssignmentManager: Region has been OPEN for 
 too long, we don't know where region was opened so can't do anything
 {noformat}
 {code}
 regionsInTransition.put(encodedRegionName, new RegionState(
     regionInfo, RegionState.State.OPEN, data.getStamp()));
   
 } else {
   HServerInfo hsi = this.serverManager.getServerInfo(sn);
   if (hsi == null) {
     LOG.info("Failed to find " + sn +
         " in list of online servers; skipping registration of open of " +
         regionInfo.getRegionNameAsString());
   } else {
     new OpenedRegionHandler(master, this, regionInfo, hsi).process();
   }
 }
 {code}
 So the timeout monitor is not able to do anything here:
 {code}
   LOG.error("Region has been OPEN for too long, " +
       "we don't know where region was opened so can't do anything");
   synchronized(regionState) {
     regionState.update(regionState.getState());
   }
 {code}
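
 To make the stuck state concrete, here is a small, hypothetical model of the decision the master faces when it finds a znode in RS_ZK_REGION_OPENED at startup. It is a sketch only, not AssignmentManager code and not the attached patch; the class, enum and method names are assumptions made for illustration. The point is that when the server recorded in the znode is not online, registering the open is impossible, so the region needs to be assigned again rather than left waiting in OPEN:

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical model of the decision described above; not AssignmentManager code.
class OpenedRegionRecovery {

    enum Action { REGISTER_OPEN, REASSIGN }

    // If the server recorded in the znode is still online, the open can be
    // registered. Otherwise the region has no live host, so it has to be
    // assigned again instead of waiting for a timeout that cannot help.
    static Action onOpenedZnode(String serverName, Set<String> onlineServers) {
        return onlineServers.contains(serverName)
            ? Action.REGISTER_OPEN
            : Action.REASSIGN;
    }
}
```

 With the names from the log above, linux76,60020,1315998828523 is online and would yield REGISTER_OPEN, while the stale linux146,60020,1315998414623 would yield REASSIGN.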





[jira] [Commented] (HBASE-4400) .META. getting stuck if RS hosting it is dead and znode state is in RS_ZK_REGION_OPENED

2011-09-17 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13107251#comment-13107251
 ] 

Ted Yu commented on HBASE-4400:
---

Minor change: I renamed metaRegion to region in 
createAndForceNodeToOpenedState()
I also fixed up the anonymous Abortable in createAndForceNodeToOpenedState().

 .META. getting stuck if RS hosting it is dead and znode state is in 
 RS_ZK_REGION_OPENED
 ---

 Key: HBASE-4400
 URL: https://issues.apache.org/jira/browse/HBASE-4400
 Project: HBase
  Issue Type: Bug
Reporter: ramkrishna.s.vasudevan
Assignee: ramkrishna.s.vasudevan
 Fix For: 0.92.0, 0.90.5

 Attachments: HBASE-4400_0.90.patch, HBASE-4400_0.90_1.patch, 
 HBASE-4400_trunk.patch, HBASE-4400_trunk_1.patch


 Start 2 RS.
 The .META. is being hosted by RS2, but RS2 goes down while processing.
 Now restart the master and RS1.  The master gets the RS name from the znode in 
 RS_ZK_REGION_OPENED, but because RS2 is still not online, the master is not able 
 to process .META. at all.  The logs follow:
 {noformat}
 2011-09-14 16:43:51,949 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Handling 
 transition=RS_ZK_REGION_OPENING, server=linux76,60020,1315998828523, 
 region=70236052/-ROOT-
 2011-09-14 16:43:51,968 INFO org.apache.hadoop.hbase.master.HMaster: -ROOT- 
 assigned=1, rit=false, location=linux76:60020
 2011-09-14 16:43:51,970 INFO 
 org.apache.hadoop.hbase.master.AssignmentManager: Processing region 
 .META.,,1.1028785192 in state RS_ZK_REGION_OPENED
 2011-09-14 16:43:51,970 INFO 
 org.apache.hadoop.hbase.master.AssignmentManager: Failed to find 
 linux146,60020,1315998414623 in list of online servers; skipping registration 
 of open of .META.,,1.1028785192
 2011-09-14 16:43:51,971 INFO 
 org.apache.hadoop.hbase.master.AssignmentManager: Waiting on 1028785192/.META.
 2011-09-14 16:43:51,983 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Handling 
 transition=RS_ZK_REGION_OPENED, server=linux76,60020,1315998828523, 
 region=70236052/-ROOT-
 2011-09-14 16:43:51,986 DEBUG 
 org.apache.hadoop.hbase.master.handler.OpenedRegionHandler: Handling OPENED 
 event for 70236052; deleting unassigned node
 2011-09-14 16:43:51,986 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: 
 master:6-0x13267854032001d Deleting existing unassigned node for 70236052 
 that is in expected state RS_ZK_REGION_OPENED
 2011-09-14 16:43:51,998 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: 
 master:6-0x13267854032001d Successfully deleted unassigned node for 
 region 70236052 in expected state RS_ZK_REGION_OPENED
 2011-09-14 16:43:51,999 DEBUG 
 org.apache.hadoop.hbase.master.handler.OpenedRegionHandler: Opened region 
 -ROOT-,,0.70236052 on linux76,60020,1315998828523
 2011-09-14 16:44:00,945 INFO org.apache.hadoop.hbase.master.ServerManager: 
 Registering server=linux146,60020,1315998839724, regionCount=0, userLoad=false
 2011-09-14 16:46:20,003 INFO 
 org.apache.hadoop.hbase.master.AssignmentManager: Regions in transition timed 
 out:  .META.,,1.1028785192 state=OPEN, ts=0
 2011-09-14 16:46:20,004 ERROR 
 org.apache.hadoop.hbase.master.AssignmentManager: Region has been OPEN for 
 too long, we don't know where region was opened so can't do anything
 {noformat}
 {code}
 regionsInTransition.put(encodedRegionName, new RegionState(
     regionInfo, RegionState.State.OPEN, data.getStamp()));
   
 } else {
   HServerInfo hsi = this.serverManager.getServerInfo(sn);
   if (hsi == null) {
     LOG.info("Failed to find " + sn +
         " in list of online servers; skipping registration of open of " +
         regionInfo.getRegionNameAsString());
   } else {
     new OpenedRegionHandler(master, this, regionInfo, hsi).process();
   }
 }
 {code}
 So the timeout monitor is not able to do anything here:
 {code}
   LOG.error("Region has been OPEN for too long, " +
       "we don't know where region was opened so can't do anything");
   synchronized(regionState) {
     regionState.update(regionState.getState());
   }
 {code}





[jira] [Commented] (HBASE-4375) [hbck] Add region coverage visualization to hbck

2011-09-17 Thread Jonathan Hsieh (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13107260#comment-13107260
 ] 

Jonathan Hsieh commented on HBASE-4375:
---

This patch should be good to go for trunk and on 0.90

 [hbck] Add region coverage visualization to hbck
 

 Key: HBASE-4375
 URL: https://issues.apache.org/jira/browse/HBASE-4375
 Project: HBase
  Issue Type: New Feature
Affects Versions: 0.94.0, 0.90.5
Reporter: Jonathan Hsieh
Assignee: Jonathan Hsieh
 Attachments: 
 0001-HBASE-4375-Add-region-coverage-visualization-to-hbck.patch


 After HBASE-4322 and HBASE-4321, we now have an accurate region splits / 
 coverage map for properly identifying holes, overlaps, backwards regions and 
 other kinds of problems in the .META. table.  hbck should display this 
 information so that someone can fix this.
 A simple version for a table with regions [,A], [A,B], [A,C], [C,] would 
 dump out something like this (showing an overlap in [A,B])
 :  ['table,,..', 'table,A,..']
 A: ['table,A,..', 'B'] ['table,A,..', 'C']
 B: ['table,A,..', 'C']  
 C: ['table,C', '']
 null:
 My first thought is that '-details' should dump the full region map including 
 all good and bad regions.  Without -details, any errors should dump info with 
 some context: one region before the problems, the problem regions, and then 
 one region after the problems.
 Alternately we could add a new option or options to dump the region split map.
 What is the preferred way to toggle display of this information in hbck?
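
 The coverage map described above can be sketched as follows. This is a hypothetical, self-contained illustration and not the hbck implementation: the RegionCoverage and Region names are made up, keys are plain strings, and an empty end key is assumed to mean "up to the end of the table". For every split point it lists the regions covering that point, so a list with more than one entry indicates an overlap and an empty list indicates a hole:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical sketch of a region coverage map; not the hbck implementation.
class RegionCoverage {

    // A region covers the half-open key range [start, end); an empty end
    // key means "up to the end of the table".
    static final class Region {
        final String start;
        final String end;

        Region(String start, String end) {
            this.start = start;
            this.end = end;
        }

        boolean covers(String key) {
            return start.compareTo(key) <= 0
                && (end.isEmpty() || key.compareTo(end) < 0);
        }

        @Override
        public String toString() {
            return "[" + start + "," + end + ")";
        }
    }

    // For every split point (each start key and each non-empty end key),
    // collect the regions that cover it, in key order.
    static Map<String, List<Region>> coverage(List<Region> regions) {
        TreeSet<String> splits = new TreeSet<>();
        for (Region r : regions) {
            splits.add(r.start);
            if (!r.end.isEmpty()) {
                splits.add(r.end);
            }
        }
        Map<String, List<Region>> map = new TreeMap<>();
        for (String key : splits) {
            List<Region> covering = new ArrayList<>();
            for (Region r : regions) {
                if (r.covers(key)) {
                    covering.add(r);
                }
            }
            map.put(key, covering);
        }
        return map;
    }
}
```

 For the regions [,A], [A,B], [A,C], [C,] from the example, the entry for split point A holds two regions, which is the overlap the dump above is meant to expose.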





[jira] [Commented] (HBASE-4388) Second start after migration from 90 to trunk crashes

2011-09-17 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13107261#comment-13107261
 ] 

Todd Lipcon commented on HBASE-4388:


Oh, sorry, got my wires crossed between tickets. It's the _other_ ticket where 
we were talking about a version number for something that used to just be 
hostname :) My bad.

 Second start after migration from 90 to trunk crashes
 -

 Key: HBASE-4388
 URL: https://issues.apache.org/jira/browse/HBASE-4388
 Project: HBase
  Issue Type: Bug
  Components: master, migration
Affects Versions: 0.92.0
Reporter: Todd Lipcon
Assignee: Ted Yu
Priority: Blocker
 Fix For: 0.92.0

 Attachments: 4388-v2.txt, 4388.txt, meta.tgz


 I started a trunk cluster to upgrade from 90, inserted a ton of data, then 
 did a clean shutdown. When I started again, I got the following exception:
 11/09/13 12:29:09 INFO master.HMaster: Meta has HRI with HTDs. Updating meta now.
 11/09/13 12:29:09 FATAL master.HMaster: Unhandled exception. Starting shutdown.
 java.lang.NegativeArraySizeException: -102
   at org.apache.hadoop.hbase.util.Bytes.readByteArray(Bytes.java:147)
   at org.apache.hadoop.hbase.HTableDescriptor.readFields(HTableDescriptor.java:606)
   at org.apache.hadoop.hbase.migration.HRegionInfo090x.readFields(HRegionInfo090x.java:641)
   at org.apache.hadoop.hbase.util.Writables.getWritable(Writables.java:133)
   at org.apache.hadoop.hbase.util.Writables.getWritable(Writables.java:103)
   at org.apache.hadoop.hbase.util.Writables.getHRegionInfoForMigration(Writables.java:228)
   at org.apache.hadoop.hbase.catalog.MetaEditor.getHRegionInfoForMigration(MetaEditor.java:350)
   at org.apache.hadoop.hbase.catalog.MetaEditor$1.visit(MetaEditor.java:273)
   at org.apache.hadoop.hbase.catalog.MetaReader.fullScan(MetaReader.java:633)
   at org.apache.hadoop.hbase.catalog.MetaReader.fullScan(MetaReader.java:255)
   at org.apache.hadoop.hbase.catalog.MetaReader.fullScan(MetaReader.java:235)
   at org.apache.hadoop.hbase.catalog.MetaEditor.updateMetaWithNewRegionInfo(MetaEditor.java:284)
   at org.apache.hadoop.hbase.catalog.MetaEditor.migrateRootAndMeta(MetaEditor.java:298)
   at org.apache.hadoop.hbase.master.HMaster.updateMetaWithNewHRI(HMaster.java:529)
   at org.apache.hadoop.hbase.master.HMaster.finishInitialization(HMaster.java:472)
   at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:309)





[jira] [Commented] (HBASE-4247) Add isAborted method to the Abortable interface

2011-09-17 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13107276#comment-13107276
 ] 

Hudson commented on HBASE-4247:
---

Integrated in HBase-TRUNK #2227 (See 
[https://builds.apache.org/job/HBase-TRUNK/2227/])
HBASE-4247 Add isAborted method to the Abortable interface
HBASE-4247 Add isAborted method to the Abortable interface

stack : 
Files : 
* /hbase/trunk/CHANGES.txt

stack : 
Files : 
* /hbase/trunk/CHANGES.txt
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/Abortable.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/replication/master/ReplicationLogCleaner.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/zookeeper/ZooKeeperWatcher.java
* 
/hbase/trunk/src/test/java/org/apache/hadoop/hbase/catalog/TestCatalogTracker.java
* 
/hbase/trunk/src/test/java/org/apache/hadoop/hbase/catalog/TestCatalogTrackerOnCluster.java
* 
/hbase/trunk/src/test/java/org/apache/hadoop/hbase/catalog/TestMetaReaderEditor.java
* 
/hbase/trunk/src/test/java/org/apache/hadoop/hbase/master/TestActiveMasterManager.java
* 
/hbase/trunk/src/test/java/org/apache/hadoop/hbase/master/TestCatalogJanitor.java
* 
/hbase/trunk/src/test/java/org/apache/hadoop/hbase/master/TestClockSkewDetection.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/master/TestLogsCleaner.java
* 
/hbase/trunk/src/test/java/org/apache/hadoop/hbase/master/TestMasterFailover.java
* 
/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/handler/MockRegionServerServices.java
* 
/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/handler/MockServer.java
* 
/hbase/trunk/src/test/java/org/apache/hadoop/hbase/replication/regionserver/TestReplicationSourceManager.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/zookeeper/TestZKTable.java
* 
/hbase/trunk/src/test/java/org/apache/hadoop/hbase/zookeeper/TestZooKeeperNodeTracker.java


 Add isAborted method to the Abortable interface
 ---

 Key: HBASE-4247
 URL: https://issues.apache.org/jira/browse/HBASE-4247
 Project: HBase
  Issue Type: Task
Reporter: Akash Ashok
Assignee: Akash Ashok
Priority: Minor
 Fix For: 0.92.0

 Attachments: HBase-4247-v2.patch, HBase-4247.patch


 Add a new method isAborted() to the Abortable interface 
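
 As a rough sketch of what the change amounts to for an implementer: the issue only states that isAborted() is added to Abortable, so the abort(String, Throwable) signature and the SimpleAbortable class below are assumptions made for illustration, not a copy of the committed code. An implementation needs to remember that abort was called and report it:

```java
// Sketch only. The issue confirms isAborted(); the abort signature and
// the SimpleAbortable class are assumptions for illustration.
interface Abortable {
    void abort(String why, Throwable e);

    boolean isAborted();
}

class SimpleAbortable implements Abortable {
    // volatile so that a thread polling isAborted() sees an abort
    // issued from another thread
    private volatile boolean aborted = false;

    @Override
    public void abort(String why, Throwable e) {
        aborted = true;
    }

    @Override
    public boolean isAborted() {
        return aborted;
    }
}
```

 The long file list in the Hudson comment above is consistent with this: every class that implements the interface, including test mocks, has to gain the new method.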





[jira] [Commented] (HBASE-4428) Two methods in CacheTestUtils don't call setDaemon() on the threads

2011-09-17 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13107277#comment-13107277
 ] 

Hudson commented on HBASE-4428:
---

Integrated in HBase-TRUNK #2227 (See 
[https://builds.apache.org/job/HBase-TRUNK/2227/])
HBASE-4428  Two methods in CacheTestUtils don't call setDaemon() on the 
threads

tedyu : 
Files : 
* /hbase/trunk/CHANGES.txt
* 
/hbase/trunk/src/test/java/org/apache/hadoop/hbase/io/hfile/CacheTestUtils.java


 Two methods in CacheTestUtils don't call setDaemon() on the threads
 ---

 Key: HBASE-4428
 URL: https://issues.apache.org/jira/browse/HBASE-4428
 Project: HBase
  Issue Type: Bug
Reporter: Ted Yu
Assignee: Ted Yu
 Attachments: 4428.txt


 hammerSingleKey() and hammerEviction() don't call t.setDaemon(true) on the 
 threads they create.
 The correct pattern is:
 {code}
   t.setDaemon(true);
   ctx.addThread(t);
 {code}
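
 The pattern above matters because a non-daemon thread keeps the JVM alive until it finishes, so a test worker that never exits can hang the run. A minimal, self-contained sketch of the daemon part follows; the DaemonThreads helper is hypothetical, and ctx.addThread(t) from the snippet above belongs to the test context and is left out here:

```java
// Hypothetical helper illustrating the daemon-thread pattern;
// not CacheTestUtils code.
class DaemonThreads {

    // Creates a thread marked as daemon. setDaemon(true) must be called
    // before start(); a daemon thread does not keep the JVM alive.
    static Thread newDaemon(Runnable work) {
        Thread t = new Thread(work);
        t.setDaemon(true);
        return t;
    }
}
```

 A caller would create the thread with newDaemon(...), register it with whatever test context it uses, and only then start it.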





[jira] [Commented] (HBASE-4419) Resolve build warning messages

2011-09-17 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4419?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13107278#comment-13107278
 ] 

Hudson commented on HBASE-4419:
---

Integrated in HBase-TRUNK #2227 (See 
[https://builds.apache.org/job/HBase-TRUNK/2227/])
HBASE-4419 Resolve build warning messages

stack : 
Files : 
* /hbase/trunk/CHANGES.txt
* /hbase/trunk/pom.xml


 Resolve build warning messages
 --

 Key: HBASE-4419
 URL: https://issues.apache.org/jira/browse/HBASE-4419
 Project: HBase
  Issue Type: Task
  Components: build
Affects Versions: 0.92.0
Reporter: Praveen Patibandla
Priority: Trivial
 Fix For: 0.92.0

 Attachments: 4419-V1.patch, 4419.patch, 4419.patch


 This item is created to clean up the build log.





[jira] [Commented] (HBASE-4400) .META. getting stuck if RS hosting it is dead and znode state is in RS_ZK_REGION_OPENED

2011-09-17 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13107299#comment-13107299
 ] 

Hudson commented on HBASE-4400:
---

Integrated in HBase-TRUNK #2228 (See 
[https://builds.apache.org/job/HBase-TRUNK/2228/])
HBASE-4400 fixed up the anonymous Abortable in 
createAndForceNodeToOpenedState()
HBASE-4400 rename metaRegion to region in 
HBaseTestingUtility.createAndForceNodeToOpenedState()
HBASE-4400  .META. getting stuck if RS hosting it is dead and znode state is in
   RS_ZK_REGION_OPENED (Ramkrishna)

tedyu : 
Files : 
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java

tedyu : 
Files : 
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java

tedyu : 
Files : 
* /hbase/trunk/CHANGES.txt
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java
* 
/hbase/trunk/src/test/java/org/apache/hadoop/hbase/master/TestMasterFailover.java


 .META. getting stuck if RS hosting it is dead and znode state is in 
 RS_ZK_REGION_OPENED
 ---

 Key: HBASE-4400
 URL: https://issues.apache.org/jira/browse/HBASE-4400
 Project: HBase
  Issue Type: Bug
Reporter: ramkrishna.s.vasudevan
Assignee: ramkrishna.s.vasudevan
 Fix For: 0.92.0, 0.90.5

 Attachments: HBASE-4400_0.90.patch, HBASE-4400_0.90_1.patch, 
 HBASE-4400_trunk.patch, HBASE-4400_trunk_1.patch


 Start 2 RS.
 The .META. is being hosted by RS2, but RS2 goes down while processing.
 Now restart the master and RS1.  The master gets the RS name from the znode in 
 RS_ZK_REGION_OPENED, but because RS2 is still not online, the master is not able 
 to process .META. at all.  The logs follow:
 {noformat}
 2011-09-14 16:43:51,949 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Handling 
 transition=RS_ZK_REGION_OPENING, server=linux76,60020,1315998828523, 
 region=70236052/-ROOT-
 2011-09-14 16:43:51,968 INFO org.apache.hadoop.hbase.master.HMaster: -ROOT- 
 assigned=1, rit=false, location=linux76:60020
 2011-09-14 16:43:51,970 INFO 
 org.apache.hadoop.hbase.master.AssignmentManager: Processing region 
 .META.,,1.1028785192 in state RS_ZK_REGION_OPENED
 2011-09-14 16:43:51,970 INFO 
 org.apache.hadoop.hbase.master.AssignmentManager: Failed to find 
 linux146,60020,1315998414623 in list of online servers; skipping registration 
 of open of .META.,,1.1028785192
 2011-09-14 16:43:51,971 INFO 
 org.apache.hadoop.hbase.master.AssignmentManager: Waiting on 1028785192/.META.
 2011-09-14 16:43:51,983 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Handling 
 transition=RS_ZK_REGION_OPENED, server=linux76,60020,1315998828523, 
 region=70236052/-ROOT-
 2011-09-14 16:43:51,986 DEBUG 
 org.apache.hadoop.hbase.master.handler.OpenedRegionHandler: Handling OPENED 
 event for 70236052; deleting unassigned node
 2011-09-14 16:43:51,986 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: 
 master:6-0x13267854032001d Deleting existing unassigned node for 70236052 
 that is in expected state RS_ZK_REGION_OPENED
 2011-09-14 16:43:51,998 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: 
 master:6-0x13267854032001d Successfully deleted unassigned node for 
 region 70236052 in expected state RS_ZK_REGION_OPENED
 2011-09-14 16:43:51,999 DEBUG 
 org.apache.hadoop.hbase.master.handler.OpenedRegionHandler: Opened region 
 -ROOT-,,0.70236052 on linux76,60020,1315998828523
 2011-09-14 16:44:00,945 INFO org.apache.hadoop.hbase.master.ServerManager: 
 Registering server=linux146,60020,1315998839724, regionCount=0, userLoad=false
 2011-09-14 16:46:20,003 INFO 
 org.apache.hadoop.hbase.master.AssignmentManager: Regions in transition timed 
 out:  .META.,,1.1028785192 state=OPEN, ts=0
 2011-09-14 16:46:20,004 ERROR 
 org.apache.hadoop.hbase.master.AssignmentManager: Region has been OPEN for 
 too long, we don't know where region was opened so can't do anything
 {noformat}
 {code}
 regionsInTransition.put(encodedRegionName, new RegionState(
     regionInfo, RegionState.State.OPEN, data.getStamp()));
   
 } else {
   HServerInfo hsi = this.serverManager.getServerInfo(sn);
   if (hsi == null) {
     LOG.info("Failed to find " + sn +
         " in list of online servers; skipping registration of open of " +
         regionInfo.getRegionNameAsString());
   } else {
     new OpenedRegionHandler(master, this, regionInfo, hsi).process();
   }
 }
 {code}
 So the timeout monitor is not able to do anything here:
 {code}
   LOG.error(Region has been OPEN for too long,  +
   we don't know where region was opened so can't do 

[jira] [Created] (HBASE-4429) Provide synchronous balanceSwitch()

2011-09-17 Thread Ted Yu (JIRA)
Provide synchronous balanceSwitch()
---

 Key: HBASE-4429
 URL: https://issues.apache.org/jira/browse/HBASE-4429
 Project: HBase
  Issue Type: Task
Reporter: Ted Yu
Assignee: Ted Yu


Currently balanceSwitch() doesn't guarantee that the balancer isn't running upon 
return.
During code review of HBASE-4213, we found that synchronous behavior for 
turning off the balancer is desired.
We can add blockingBalanceSwitch() to HMasterInterface; when it returns, the 
balancer is guaranteed not to be running.
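
One way to get that guarantee can be sketched as below. This is a hypothetical illustration, not HBase code: the BalancerSwitch class and its method names are made up, and returning the previous switch value is an assumption. The idea is that a balancing pass and the switch take the same lock, so the switch call can only return once any pass in progress has finished:

```java
// Hypothetical sketch; the class and its methods are not HBase APIs.
class BalancerSwitch {
    private final Object lock = new Object();
    private boolean enabled = true;

    // Runs one balancing pass if the switch is on. Holding the lock for
    // the whole pass is what lets switchBlocking() wait for a pass that
    // is already in progress.
    boolean balance(Runnable pass) {
        synchronized (lock) {
            if (!enabled) {
                return false;
            }
            pass.run();
            return true;
        }
    }

    // Sets the switch and returns the previous value. Because it takes
    // the same lock as balance(), it returns only after any running
    // pass has completed.
    boolean switchBlocking(boolean on) {
        synchronized (lock) {
            boolean previous = enabled;
            enabled = on;
            return previous;
        }
    }
}
```

The trade-off of this design is that a caller turning the balancer off may block for as long as one full pass takes.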





[jira] [Work started] (HBASE-4224) Need a flush by regionserver rather than by table option

2011-09-17 Thread Akash Ashok (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HBASE-4224 started by Akash Ashok.

 Need a flush by regionserver rather than by table option
 

 Key: HBASE-4224
 URL: https://issues.apache.org/jira/browse/HBASE-4224
 Project: HBase
  Issue Type: Bug
  Components: shell
Reporter: stack
Assignee: Akash Ashok

 This evening I needed to clean out logs on the cluster.  Logs are kept per 
 regionserver.  To let go of logs, we need to have all edits emptied from 
 memory.  The only flush available is by table or region.  We need to be able 
 to flush the regionserver.  Need to add this.





[jira] [Updated] (HBASE-4428) Two methods in CacheTestUtils don't call setDaemon() on the threads

2011-09-17 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4428?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-4428:
--

Attachment: 4428.addendum

Addendum fixes the deadlock shown in attachment.
ctx.stop() should only be called at the end of testCacheMultiThreaded.

 Two methods in CacheTestUtils don't call setDaemon() on the threads
 ---

 Key: HBASE-4428
 URL: https://issues.apache.org/jira/browse/HBASE-4428
 Project: HBase
  Issue Type: Bug
Reporter: Ted Yu
Assignee: Ted Yu
 Attachments: 4428.addendum, 4428.txt


 hammerSingleKey() and hammerEviction() don't call t.setDaemon(true) on the 
 threads they create.
 The correct pattern is:
 {code}
   t.setDaemon(true);
   ctx.addThread(t);
 {code}





[jira] [Updated] (HBASE-4428) Two methods in CacheTestUtils don't call setDaemon() on the threads

2011-09-17 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4428?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-4428:
--

Attachment: test-TestSingleSizeCache.trace

Trace shows the deadlock between doAnAction() and testCacheMultiThreaded().

 Two methods in CacheTestUtils don't call setDaemon() on the threads
 ---

 Key: HBASE-4428
 URL: https://issues.apache.org/jira/browse/HBASE-4428
 Project: HBase
  Issue Type: Bug
Reporter: Ted Yu
Assignee: Ted Yu
 Attachments: 4428.addendum, 4428.txt, test-TestSingleSizeCache.trace


 hammerSingleKey() and hammerEviction() don't call t.setDaemon(true) on the 
 threads they create.
 The correct pattern is:
 {code}
   t.setDaemon(true);
   ctx.addThread(t);
 {code}





[jira] [Commented] (HBASE-1935) Scan in parallel

2011-09-17 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-1935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13107340#comment-13107340
 ] 

Lars Hofhansl commented on HBASE-1935:
--

I wonder if a better building block would be the ability to submit a scan to a 
region via HTable.

For example we have a need not necessarily for a parallel serial scan, but 
rather for a bunch of parallel scans that (via coprocessors) perform some 
aggregation and then perform a merge sort of the results at the client.
And of course this can also be used for parallel serial scans in the case of 
highly selective filters.

That would make for a very small, simple patch (management of threads, merging 
results, etc. would be application specific and not part of HBase).

The user visible API could be something as simple as (on HTable[Interface]):
ResultScanner getScanner(Scan, HRegionInfo)

And maybe something like the ParallelScannerManager could be added as an 
example(?)
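
To illustrate the application-side part (thread management and merging), here is a minimal sketch that does not use any HBase classes. Each Callable stands in for a scan of one region, such as a call to the proposed getScanner(Scan, HRegionInfo), and is assumed to return that region's rows already sorted; the ParallelRegionScan class and its method names are hypothetical:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.PriorityQueue;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical client-side sketch; each Callable stands in for a scan of
// one region and returns that region's rows already sorted.
class ParallelRegionScan {

    // Runs all region scans on a fixed-size pool, then merges the results.
    static List<String> scanAndMerge(List<Callable<List<String>>> regionScans,
                                     int threads)
            throws InterruptedException, ExecutionException {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<List<String>>> futures = pool.invokeAll(regionScans);
            List<List<String>> perRegion = new ArrayList<>();
            for (Future<List<String>> f : futures) {
                perRegion.add(f.get());
            }
            return mergeSorted(perRegion);
        } finally {
            pool.shutdown();
        }
    }

    // K-way merge of sorted lists. Queue entries are {list index, position},
    // ordered by the row each entry points at.
    static List<String> mergeSorted(List<List<String>> lists) {
        PriorityQueue<int[]> heads = new PriorityQueue<>(
            (a, b) -> lists.get(a[0]).get(a[1])
                .compareTo(lists.get(b[0]).get(b[1])));
        for (int i = 0; i < lists.size(); i++) {
            if (!lists.get(i).isEmpty()) {
                heads.add(new int[] {i, 0});
            }
        }
        List<String> merged = new ArrayList<>();
        while (!heads.isEmpty()) {
            int[] head = heads.poll();
            List<String> source = lists.get(head[0]);
            merged.add(source.get(head[1]));
            if (head[1] + 1 < source.size()) {
                heads.add(new int[] {head[0], head[1] + 1});
            }
        }
        return merged;
    }
}
```

This sketch buffers each region's results fully before merging; a real client would more likely merge from streaming scanners, which is the kind of application-specific choice the comment suggests keeping out of HBase.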


 Scan in parallel
 

 Key: HBASE-1935
 URL: https://issues.apache.org/jira/browse/HBASE-1935
 Project: HBase
  Issue Type: New Feature
  Components: coprocessors
Reporter: stack
 Attachments: pscanner-v2.patch, pscanner-v3.patch, pscanner-v4.patch, 
 pscanner.patch


 A scanner that, rather than scanning in series, scanned multiple regions 
 in parallel would be more involved but could complete much faster, 
 particularly if results are sparse.





[jira] [Commented] (HBASE-4415) Add configuration script for setup HBase (hbase-setup-conf.sh)

2011-09-17 Thread jirapos...@reviews.apache.org (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13107341#comment-13107341
 ] 

jirapos...@reviews.apache.org commented on HBASE-4415:
--


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/1925/#review1957
---


Generally it might be good to be able to define prefixes similar to what 
autoconf scripts do. Such as:
--prefix=DIR[PREFIX/etc]
--sysconfdir=DIR[PREFIX/etc]

And so on.

The hbase-site.xml and hbase-env.sh templates now have to be maintained 
separately, hence this makes config changes harder to maintain.

In general this seems to be mostly useful for beginners of HBase, right?
(Oldtimers will have a mechanism in place to set up HBase: puppet, chef, 
fabric, etc.)

So I wonder if a more useful avenue would be to post this as part of the HBase 
book or some other getting started spot on the HBase site.



/src/packages/hbase-setup-conf.sh
https://reviews.apache.org/r/1925/#comment4436

Isn't /usr a strange default for a Hadoop (or HBase or ZooKeeper) 
installation?
Maybe the script should rather check for a value, and fail if none was 
given.



/src/packages/hbase-setup-conf.sh
https://reviews.apache.org/r/1925/#comment4437

I don't see HBASE_LOG_DIR defaulted anywhere.
Same for HBASE_PID_DIR.



/src/packages/templates/conf/hbase-env.sh
https://reviews.apache.org/r/1925/#comment4438

Wow 2007... that's old :)
I think the latest convention is not to include the year.


- Lars


On 2011-09-16 00:06:21, Eric Yang wrote:
bq.  
bq.  ---
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/1925/
bq.  ---
bq.  
bq.  (Updated 2011-09-16 00:06:21)
bq.  
bq.  
bq.  Review request for hbase.
bq.  
bq.  
bq.  Summary
bq.  ---
bq.  
bq.  Create a post installation script to streamline configuration tasks for HBase.
bq.  
bq.  usage: /usr/sbin/hbase-setup-conf.sh parameters
bq.  
bq.    Optional parameters:
bq.      --hadoop-conf=/etc/hadoop                 Set Hadoop configuration directory location
bq.      --hadoop-home=/usr                        Set Hadoop directory location
bq.      --hadoop-namenode=localhost               Set Hadoop namenode hostname
bq.      --hadoop-replication=3                    Set HDFS replication
bq.      --hbase-home=/usr                         Set HBase directory location
bq.      --hbase-conf=/etc/hbase                   Set HBase configuration directory location
bq.      --hbase-log=/var/log/hbase                Set HBase log directory location
bq.      --hbase-pid=/var/run/hbase                Set HBase pid directory location
bq.      --hbase-user=hbase                        Set HBase user
bq.      --java-home=/usr/java/default             Set JAVA_HOME directory location
bq.      --kerberos-realm=KERBEROS.EXAMPLE.COM     Set Kerberos realm
bq.      --kerberos-principal-id=_HOST             Set Kerberos principal ID
bq.      --keytab-dir=/etc/security/keytabs        Set keytab directory
bq.      --regionservers=localhost                 Set regionservers hostnames
bq.      --zookeeper-home=/usr                     Set ZooKeeper directory location
bq.      --zookeeper-quorum=localhost              Set ZooKeeper Quorum
bq.      --zookeeper-snapshot=/var/lib/zookeeper   Set ZooKeeper snapshot location
bq.  
bq.  
bq.  This addresses bug HBASE-4415.
bq.  https://issues.apache.org/jira/browse/HBASE-4415
bq.  
bq.  
bq.  Diffs
bq.  -
bq.  
bq.    /src/assembly/all.xml 1170899
bq.    /src/packages/hbase-setup-conf.sh PRE-CREATION
bq.    /src/packages/templates/conf/hbase-env.sh PRE-CREATION
bq.    /src/packages/templates/conf/hbase-site.xml PRE-CREATION
bq.  
bq.  Diff: https://reviews.apache.org/r/1925/diff
bq.  
bq.  
bq.  Testing
bq.  ---
bq.  
bq.  
bq.  Thanks,
bq.  
bq.  Eric
bq.  
bq.



 Add configuration script for setup HBase (hbase-setup-conf.sh)
 --

 Key: HBASE-4415
 URL: https://issues.apache.org/jira/browse/HBASE-4415
 Project: HBase
  Issue Type: New Feature
  Components: scripts
 Environment: Java 6, Linux
Reporter: Eric Yang
Assignee: Eric Yang
 Attachments: HBASE-4415-1.patch, HBASE-4415.patch


 The goal of this jira is to provide an installation script for setting up the 
 HBase environment and configuration, following the same *-setup-conf.sh 
 pattern used by all Hadoop-related projects.  For HBase, the usage of the 
 script looks like this:
 {noformat}
 usage: ./hbase-setup-conf.sh parameters
   Optional 

[jira] [Commented] (HBASE-4428) Two methods in CacheTestUtils don't call setDaemon() on the threads

2011-09-17 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13107355#comment-13107355
 ] 

Hudson commented on HBASE-4428:
---

Integrated in HBase-TRUNK #2229 (See 
[https://builds.apache.org/job/HBase-TRUNK/2229/])
HBASE-4428 addendum, call ctx.stop at the end of testCacheMultiThreaded(), 
not in doAnAction()

tedyu : 
Files : 
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/MultithreadedTestUtil.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/io/hfile/CacheTestUtils.java


 Two methods in CacheTestUtils don't call setDaemon() on the threads
 ---

 Key: HBASE-4428
 URL: https://issues.apache.org/jira/browse/HBASE-4428
 Project: HBase
  Issue Type: Bug
Reporter: Ted Yu
Assignee: Ted Yu
 Attachments: 4428.addendum, 4428.txt, test-TestSingleSizeCache.trace


 hammerSingleKey() and hammerEviction() don't call t.setDaemon(true) on the 
 threads they create.
 The correct pattern is:
 {code}
   t.setDaemon(true);
   ctx.addThread(t);
 {code}
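
For readers unfamiliar with the pattern, here is a minimal, self-contained sketch of it. It is not the code from the patch: DaemonThreadSketch, SimpleContext and newWorker are hypothetical names made up for illustration, and SimpleContext is only a stand-in for the test context in MultithreadedTestUtil. The point it tries to show is that setDaemon(true) has to be called before the thread is started, so that a worker left running cannot keep the JVM alive after the test finishes.

```java
import java.util.ArrayList;
import java.util.List;

public class DaemonThreadSketch {

    /** Hypothetical stand-in for a test context; NOT the HBase class. */
    public static class SimpleContext {
        private final List<Thread> threads = new ArrayList<>();

        /** Registers a thread that has not been started yet. */
        public void addThread(Thread t) {
            threads.add(t);
        }

        /** Starts every registered thread. */
        public void startAll() {
            for (Thread t : threads) {
                t.start();
            }
        }

        /** Interrupts every thread and waits up to one second for each to exit. */
        public void stopAll() throws InterruptedException {
            for (Thread t : threads) {
                t.interrupt();
                t.join(1000);
            }
        }

        public List<Thread> threads() {
            return threads;
        }
    }

    /** Creates an unstarted worker that loops until it is interrupted. */
    public static Thread newWorker(String name) {
        Thread t = new Thread(() -> {
            try {
                while (true) {
                    Thread.sleep(10); // placeholder for real test work
                }
            } catch (InterruptedException e) {
                // interrupted by stopAll(): fall through and let the thread end
            }
        }, name);
        // Must happen before start(); calling it on a live thread
        // throws IllegalThreadStateException.
        t.setDaemon(true);
        return t;
    }

    public static void main(String[] args) throws InterruptedException {
        SimpleContext ctx = new SimpleContext();
        for (int i = 0; i < 3; i++) {
            Thread t = newWorker("hammer-" + i);
            ctx.addThread(t);
        }
        ctx.startAll();
        for (Thread t : ctx.threads()) {
            System.out.println(t.getName() + " daemon=" + t.isDaemon());
        }
        ctx.stopAll();
    }
}
```

Even if stopAll() were skipped or a worker ignored the interrupt, the JVM could still exit once main returns, because only daemon threads would remain.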
