[jira] [Commented] (HBASE-4554) Allow set/unset coprocessor table attributes from shell.

2011-10-09 Thread Andrew Purtell (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13123640#comment-13123640
 ] 

Andrew Purtell commented on HBASE-4554:
---

As originally conceived, the 'COPROCESSOR$' prefix for a table attribute name 
informs the coprocessor host that the attribute contains a coprocessor 
specification, and what follows the prefix can be arbitrary. For example, all 
of the following are equally valid:

  - COPROCESSOR$1

  - COPROCESSOR$8453410222

  - COPROCESSOR$FooBar

  - COPROCESSOR$org.apache.hbase.foo.bar.Baz

 Offhand I'm not sure if the code supports this, but it should. 
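
The rule described above can be sketched in a few lines. This is only an illustration under stated assumptions: `CoprocessorAttributeKeys` is a hypothetical helper, not HBase code, and the case-sensitive match on the prefix is an assumption of the sketch.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration; this is not HBase's implementation.
final class CoprocessorAttributeKeys {
    static final String PREFIX = "COPROCESSOR$";

    private CoprocessorAttributeKeys() {}

    /** True if the attribute name starts with the prefix; the suffix may be anything. */
    static boolean isCoprocessorKey(String attributeName) {
        return attributeName != null && attributeName.startsWith(PREFIX);
    }

    /** Returns whatever follows the prefix (possibly an empty string). */
    static String suffixOf(String attributeName) {
        if (!isCoprocessorKey(attributeName)) {
            throw new IllegalArgumentException("not a coprocessor attribute: " + attributeName);
        }
        return attributeName.substring(PREFIX.length());
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList(
            "COPROCESSOR$1",
            "COPROCESSOR$8453410222",
            "COPROCESSOR$FooBar",
            "COPROCESSOR$org.apache.hbase.foo.bar.Baz",
            "MAX_FILESIZE");
        for (String name : names) {
            System.out.println(name + " -> " + isCoprocessorKey(name));
        }
    }
}
```

A check like this accepts all four example names and rejects ordinary attributes such as MAX_FILESIZE.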


> Allow set/unset coprocessor table attributes from shell.
> 
>
> Key: HBASE-4554
> URL: https://issues.apache.org/jira/browse/HBASE-4554
> Project: HBase
>  Issue Type: Improvement
>  Components: coprocessors
>Reporter: Mingjie Lai
>Assignee: Mingjie Lai
> Fix For: 0.92.0
>
>
> A table/region-level coprocessor (a RegionObserver) can be configured by 
> setting an HTD (HTableDescriptor) attribute whose name matches Coprocessor$*. 
> The current shell command, alter, cannot set or unset an arbitrary table 
> attribute. We need that ability in order to configure region-level 
> coprocessors on a table. 
> Proposed new shell:
> {code}
> hbase shell > alter 't1', METHOD => 'table_att', COPROCESSOR$1 => 
> 'hdfs://cp/foo.jar|org.apache.hadoop.hbase.sample|1|'
> hbase shell > describe 't1'
>  {NAME => 't1', COPROCESSOR$1 => 
> 'hdfs://cp/foo.jar|org.apache.hadoop.hbase.sample|1|', MAX_FILESIZE => 
> '134217728', ...}
> hbase shell > alter 't1', METHOD => 'table_att_unset', COPROCESSOR$1
> hbase shell > describe 't1'
>  {NAME => 't1', MAX_FILESIZE => '134217728', ...}
> {code}
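
The attribute value in the example above is a '|'-separated string. Below is a rough sketch of how such a value could be split apart. The class `CoprocessorSpec` is hypothetical, and reading the four fields as jar path, class name, priority, and arguments is an assumption of this sketch, not something the description states.

```java
// Hypothetical value holder; the meaning assigned to each field is an assumption.
final class CoprocessorSpec {
    final String jarPath;
    final String className;
    final int priority;
    final String arguments;

    private CoprocessorSpec(String jarPath, String className, int priority, String arguments) {
        this.jarPath = jarPath;
        this.className = className;
        this.priority = priority;
        this.arguments = arguments;
    }

    static CoprocessorSpec parse(String value) {
        // A limit of -1 keeps the trailing empty field after the final '|'.
        String[] parts = value.split("\\|", -1);
        if (parts.length != 4) {
            throw new IllegalArgumentException("expected 4 '|'-separated fields: " + value);
        }
        return new CoprocessorSpec(parts[0], parts[1], Integer.parseInt(parts[2].trim()), parts[3]);
    }

    public static void main(String[] args) {
        CoprocessorSpec spec = parse("hdfs://cp/foo.jar|org.apache.hadoop.hbase.sample|1|");
        System.out.println(spec.jarPath + " / " + spec.className + " / " + spec.priority);
    }
}
```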

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-4430) Disable TestSlabCache and TestSingleSizedCache temporarily to see if these are cause of build box failure though all tests pass

2011-10-09 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13123643#comment-13123643
 ] 

Hudson commented on HBASE-4430:
---

Integrated in HBase-TRUNK #2310 (See 
[https://builds.apache.org/job/HBase-TRUNK/2310/])
HBASE-4430 Disable TestSlabCache and TestSingleSizedCache temporarily to 
see if these are cause of build box failure though all tests pass -- DISABLED 
TESTSLABCACHE AGAIN... its hanging

stack : 
Files : 
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/io/hfile/slab/TestSlabCache.java


> Disable TestSlabCache and TestSingleSizedCache temporarily to see if these 
> are cause of build box failure though all tests pass
> ---
>
> Key: HBASE-4430
> URL: https://issues.apache.org/jira/browse/HBASE-4430
> Project: HBase
>  Issue Type: Task
>  Components: test
>Reporter: stack
>Assignee: Li Pi
> Fix For: 0.92.0
>
> Attachments: TestSlabCache.trace
>
>






[jira] [Created] (HBASE-4559) Refactor TestAvroServer into an integration test

2011-10-09 Thread Jesse Yates (Created) (JIRA)
Refactor TestAvroServer into an integration test


 Key: HBASE-4559
 URL: https://issues.apache.org/jira/browse/HBASE-4559
 Project: HBase
  Issue Type: Improvement
  Components: test
Reporter: Jesse Yates


TestAvroServer is a beefy test: it spins up a mini cluster, does a large series 
of manipulations, and then spins it down. It takes about 2 minutes to run on a 
local machine, which is on the high side for a 'unit' test. 

This is part of the implementation discussed in 
http://search-hadoop.com/m/L9OzBNEOJK1





[jira] [Updated] (HBASE-4559) Refactor TestAvroServer into an integration test

2011-10-09 Thread Jesse Yates (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4559?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesse Yates updated HBASE-4559:
---

Attachment: java_HBASE_4559.txt






[jira] [Commented] (HBASE-4559) Refactor TestAvroServer into an integration test

2011-10-09 Thread Jesse Yates (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13123655#comment-13123655
 ] 

Jesse Yates commented on HBASE-4559:


The attached patch does not add any new unit tests for the Avro server, since 
everything in it seems to be pass-through functionality to the table admin. 
All the other functionality is already covered by TestAvroUtil.






[jira] [Commented] (HBASE-4539) OpenedRegionHandler racing with itself in ServerShutDownhandler flow leading to HMaster abort

2011-10-09 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13123657#comment-13123657
 ] 

Hudson commented on HBASE-4539:
---

Integrated in HBase-0.92 #54 (See 
[https://builds.apache.org/job/HBase-0.92/54/])
HBASE-4540 OpenedRegionHandler is not enforcing atomicity of the operation 
it is performing.  Also addresses HBASE-4539. (Ram)

ramkrishna : 
Files : 
* /hbase/branches/0.92/CHANGES.txt
* /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java
* /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/master/handler/OpenedRegionHandler.java
* /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/zookeeper/ZKAssign.java
* /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/zookeeper/ZKUtil.java
* /hbase/branches/0.92/src/test/java/org/apache/hadoop/hbase/master/TestOpenedRegionHandler.java


> OpenedRegionHandler racing with itself in ServerShutDownhandler flow leading 
> to HMaster abort
> -
>
> Key: HBASE-4539
> URL: https://issues.apache.org/jira/browse/HBASE-4539
> Project: HBase
>  Issue Type: Bug
>Reporter: ramkrishna.s.vasudevan
>Assignee: ramkrishna.s.vasudevan
>
> Steps to reproduce
> ==
> -> Region R1 is being opened on RS1.
> -> After transitioning the znode to OPENED, RS1 goes down.
> -> If ServerShutdownHandler assigns the region to RS2 before the 
> OpenedRegionHandler executor deletes the znode, RS2 transitions the node to 
> OPENED and the OpenedRegionHandler for that transition deletes the znode.
> -> When the first OpenedRegionHandler then tries to delete the znode, it gets 
> a NoNode exception, which causes the HMaster to abort.
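
One way to picture the failure mode in the steps above: a delete that reports "already gone" as an ordinary result gives the stale handler nothing fatal to react to. The sketch below uses an in-memory map as a stand-in for the unassigned znodes. `TolerantZnodeDelete` and its methods are hypothetical and are not ZooKeeper or HBase APIs.

```java
import java.util.HashMap;
import java.util.Map;

// In-memory stand-in for the unassigned znodes; hypothetical, not ZooKeeper or HBase code.
final class TolerantZnodeDelete {
    private final Map<String, String> znodes = new HashMap<>();

    void create(String region, String state) {
        znodes.put(region, state);
    }

    /** Deletes the node if present; returns false instead of throwing when it is already gone. */
    boolean deleteIfPresent(String region) {
        return znodes.remove(region) != null;
    }

    public static void main(String[] args) {
        TolerantZnodeDelete zk = new TolerantZnodeDelete();
        zk.create("R1", "OPENED");
        // The handler created after the reassignment to RS2 wins the race.
        boolean secondHandler = zk.deleteIfPresent("R1");
        // The stale handler from the RS1 open finds nothing to delete.
        boolean firstHandler = zk.deleteIfPresent("R1");
        System.out.println("second handler deleted znode: " + secondHandler);
        System.out.println("first handler deleted znode: " + firstHandler);
    }
}
```

Whether a missing node should be ignored or should trigger other cleanup is a design decision for the real handler; the sketch only shows that the situation can be detected without an exception reaching the master.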





[jira] [Commented] (HBASE-4540) OpenedRegionHandler is not enforcing atomicity of the operation it is performing

2011-10-09 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13123658#comment-13123658
 ] 

Hudson commented on HBASE-4540:
---

Integrated in HBase-0.92 #54 (See 
[https://builds.apache.org/job/HBase-0.92/54/])
HBASE-4540 OpenedRegionHandler is not enforcing atomicity of the operation 
it is performing.  Also addresses HBASE-4539. (Ram)

ramkrishna : 
Files : 
* /hbase/branches/0.92/CHANGES.txt
* /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java
* /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/master/handler/OpenedRegionHandler.java
* /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/zookeeper/ZKAssign.java
* /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/zookeeper/ZKUtil.java
* /hbase/branches/0.92/src/test/java/org/apache/hadoop/hbase/master/TestOpenedRegionHandler.java


> OpenedRegionHandler is not enforcing atomicity of the operation it is 
> performing
> 
>
> Key: HBASE-4540
> URL: https://issues.apache.org/jira/browse/HBASE-4540
> Project: HBase
>  Issue Type: Bug
>Reporter: ramkrishna.s.vasudevan
>Assignee: ramkrishna.s.vasudevan
> Attachments: HBASE-4540_1.patch
>
>
> -> OpenedRegionHandler has not yet deleted the znode of region R1, which was 
> opened by RS1.
> -> RS1 goes down.
> -> ServerShutdownHandler assigns region R1 to RS2.
> -> The znode of R1 is moved to the OFFLINE state by the master, or to the 
> OPENING state by RS2 if RS2 has started opening the region.
> -> The first OpenedRegionHandler now tries to delete the znode, thinking it 
> is in the OPENED state, but fails.
> -> Although the delete fails, the handler removes the region from RIT and 
> records RS1 as the owner of R1 in the master's memory.
> -> When RS2 finishes opening the region, the master cannot process the open 
> because the region has already been removed from RIT.
> {code}
> Master
> ==
> 2011-10-05 20:49:45,301 INFO 
> org.apache.hadoop.hbase.master.handler.ServerShutdownHandler: Finished 
> processing of shutdown of linux146,60020,1317827727647
> 2011-10-05 20:49:54,177 DEBUG org.apache.hadoop.hbase.master.HMaster: Not 
> running balancer because 1 region(s) in transition: 
> {3e69d628a8bd8e9b7c5e7a2a6e03aad9=t1,,1317827883842.3e69d628a8bd8e9b7c5e7a2a6e03aad9.
>  state=PENDING_OPEN, ts=1317827985272, server=linux76,60020,1317827746847}
> 2011-10-05 20:49:57,720 DEBUG 
> org.apache.hadoop.hbase.master.AssignmentManager: Handling 
> transition=M_ZK_REGION_OFFLINE, server=linux76,6,1317827742012, 
> region=3e69d628a8bd8e9b7c5e7a2a6e03aad9
> 2011-10-05 20:50:14,501 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: 
> master:6-0x132d3dc13090023 Deleting existing unassigned node for 
> 3e69d628a8bd8e9b7c5e7a2a6e03aad9 that is in expected state RS_ZK_REGION_OPENED
> 2011-10-05 20:50:14,505 WARN org.apache.hadoop.hbase.zookeeper.ZKAssign: 
> master:6-0x132d3dc13090023 Attempting to delete unassigned node 
> 3e69d628a8bd8e9b7c5e7a2a6e03aad9 in RS_ZK_REGION_OPENED state but node is in 
> RS_ZK_REGION_OPENING state
> After the region is opened in RS2
> =
> 2011-10-05 20:50:48,066 DEBUG 
> org.apache.hadoop.hbase.master.AssignmentManager: Handling 
> transition=RS_ZK_REGION_OPENING, server=linux76,60020,1317827746847, 
> region=3e69d628a8bd8e9b7c5e7a2a6e03aad9, which is more than 15 seconds late
> 2011-10-05 20:50:48,290 WARN 
> org.apache.hadoop.hbase.master.AssignmentManager: Received OPENING for region 
> 3e69d628a8bd8e9b7c5e7a2a6e03aad9 from server linux76,60020,1317827746847 but 
> region was in  the state null and not in expected PENDING_OPEN or OPENING 
> states
> 2011-10-05 20:50:53,743 DEBUG 
> org.apache.hadoop.hbase.master.AssignmentManager: Handling 
> transition=RS_ZK_REGION_OPENING, server=linux76,60020,1317827746847, 
> region=3e69d628a8bd8e9b7c5e7a2a6e03aad9
> 2011-10-05 20:50:54,182 DEBUG org.apache.hadoop.hbase.master.CatalogJanitor: 
> Scanned 1 catalog row(s) and gc'd 0 unreferenced parent region(s)
> 2011-10-05 20:50:54,397 WARN 
> org.apache.hadoop.hbase.master.AssignmentManager: Received OPENING for region 
> 3e69d628a8bd8e9b7c5e7a2a6e03aad9 from server linux76,60020,1317827746847 but 
> region was in  the state null and not in expected PENDING_OPEN or OPENING 
> states
> {code}
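
The atomicity problem described above can be sketched as follows: the in-memory bookkeeping is updated only when the compare-and-delete of the znode succeeds. This is not the attached patch. `OpenedRegionSketch` and its fields are hypothetical, and a `ConcurrentHashMap` stands in for ZooKeeper.

```java
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical model of the master's bookkeeping; not the HBASE-4540 patch.
final class OpenedRegionSketch {
    final ConcurrentHashMap<String, String> znodeState = new ConcurrentHashMap<>();
    final Set<String> regionsInTransition = ConcurrentHashMap.newKeySet();
    final Map<String, String> regionOwner = new ConcurrentHashMap<>();

    /** Completes an open only if the znode is still in the OPENED state. */
    boolean handleOpened(String region, String server) {
        // remove(key, value) deletes the entry only when its current value equals "OPENED".
        if (!znodeState.remove(region, "OPENED")) {
            // The znode is gone or in another state: leave RIT and ownership untouched.
            return false;
        }
        regionsInTransition.remove(region);
        regionOwner.put(region, server);
        return true;
    }

    public static void main(String[] args) {
        OpenedRegionSketch master = new OpenedRegionSketch();
        master.regionsInTransition.add("R1");
        master.znodeState.put("R1", "OPENING");   // RS2 has started opening R1 after RS1 died
        System.out.println("stale handler for RS1 applied: " + master.handleOpened("R1", "RS1"));
        master.znodeState.put("R1", "OPENED");    // RS2 finishes opening R1
        System.out.println("handler for RS2 applied: " + master.handleOpened("R1", "RS2"));
        System.out.println("owner of R1: " + master.regionOwner.get("R1"));
    }
}
```

With this guard the stale handler leaves R1 in RIT, so the later OPENING and OPENED transitions from RS2 still find the region in the expected state.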





[jira] [Commented] (HBASE-3872) Hole in split transaction rollback; edits to .META. need to be rolled back even if it seems like they didn't make it

2011-10-09 Thread mingjian (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13123662#comment-13123662
 ] 

mingjian commented on HBASE-3872:
-

@stack: Do you think deleting the daughter regions during rollback will cause 
data loss?
  We can reproduce this bug with the following code:
{code:title=SplitTransaction.java|borderStyle=solid}
if (!testing) {
  MetaEditor.offlineParentInMeta(server.getCatalogTracker(),
    this.parent.getRegionInfo(), a.getRegionInfo(), b.getRegionInfo());
  // Simulate a failure right after the .META. edit so that rollback runs.
  throw new IOException("for test!");
}
{code}
  We think "this.journal.add(JournalEntry.PONR);" could be moved before the 
call to MetaEditor.offlineParentInMeta; a rollback would then cause the region 
server to exit rather than delete the daughter regions.
  What do you think?
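
The effect of the proposal can be sketched with a toy journal. Only the name `JournalEntry.PONR` comes from the code discussed here. `SplitJournalSketch`, the other entry names, and the convention that a `false` return means "abort the server" are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of a split journal; only the PONR entry name comes from the thread.
final class SplitJournalSketch {
    enum JournalEntry { CREATED_SPLIT_DIR, CREATED_DAUGHTERS, PONR }

    private final List<JournalEntry> journal = new ArrayList<>();

    void record(JournalEntry entry) {
        journal.add(entry);
    }

    /**
     * Undoes journaled steps newest-first, appending one note per step to undoLog.
     * Returns false, undoing nothing, if PONR was journaled: the caller is then
     * expected to abort the server instead of cleaning up.
     */
    boolean rollback(List<String> undoLog) {
        if (journal.contains(JournalEntry.PONR)) {
            return false;
        }
        for (int i = journal.size() - 1; i >= 0; i--) {
            undoLog.add("undo " + journal.get(i));
        }
        journal.clear();
        return true;
    }

    public static void main(String[] args) {
        SplitJournalSketch split = new SplitJournalSketch();
        split.record(JournalEntry.CREATED_SPLIT_DIR);
        split.record(JournalEntry.CREATED_DAUGHTERS);
        split.record(JournalEntry.PONR);   // journaled before the .META. edit, as proposed
        // Suppose the .META. edit now fails or times out.
        List<String> undoLog = new ArrayList<>();
        if (!split.rollback(undoLog)) {
            System.out.println("past the point of no return: abort the region server");
        }
    }
}
```

Because PONR is journaled before the .META. edit, a failed or timed-out edit can no longer lead to the daughters being deleted while .META. may already reference them.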

> Hole in split transaction rollback; edits to .META. need to be rolled back 
> even if it seems like they didn't make it
> 
>
> Key: HBASE-3872
> URL: https://issues.apache.org/jira/browse/HBASE-3872
> Project: HBase
>  Issue Type: Bug
>  Components: regionserver
>Affects Versions: 0.90.3
>Reporter: stack
>Assignee: stack
>Priority: Blocker
> Fix For: 0.90.4
>
> Attachments: 3872-v2.txt, 3872.txt
>
>
> Saw this interesting one on a cluster of ours.  The cluster was configured 
> with too few handlers, so we saw a lot of the phenomenon where actions were 
> queued but, by the time they got into the server and it tried to respond to 
> the client, the client had disconnected because of the 60-second timeout.  
> Well, the meta edits for a split were queued at the regionserver carrying 
> .META. and by the time it went to write back, the client had gone (the first 
> insert of parent offline with daughter regions added as info:splitA and 
> info:splitB).  The client presumed the edits failed and 'successfully' rolled 
> back the transaction (failing to undo .META. edits thinking they didn't go 
> through).
> A few minutes later the .META. scanner on master runs.  It sees 'no 
> references' in daughters -- the daughters had been cleaned up as part of the 
> split transaction rollback -- so it thinks it's safe to delete the parent.
> Two things:
> + Tighten up the check in master... need to check that the daughter region at 
> least exists and possibly that the daughter region has an entry in .META.
> + Dependent on the edit that fails, schedule rollback edits even though it 
> will seem like they didn't go through.
> This is a pretty critical one.





[jira] [Commented] (HBASE-4539) OpenedRegionHandler racing with itself in ServerShutDownhandler flow leading to HMaster abort

2011-10-09 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13123671#comment-13123671
 ] 

Hudson commented on HBASE-4539:
---

Integrated in HBase-TRUNK #2311 (See 
[https://builds.apache.org/job/HBase-TRUNK/2311/])
HBASE-4540 OpenedRegionHandler is not enforcing atomicity of the operation 
it is performing . Also fixes HBASE-4539 (ram)

ramkrishna : 
Files : 
* /hbase/trunk/CHANGES.txt
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/handler/OpenedRegionHandler.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/zookeeper/ZKAssign.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/zookeeper/ZKUtil.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/master/TestOpenedRegionHandler.java







[jira] [Commented] (HBASE-4540) OpenedRegionHandler is not enforcing atomicity of the operation it is performing

2011-10-09 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13123672#comment-13123672
 ] 

Hudson commented on HBASE-4540:
---

Integrated in HBase-TRUNK #2311 (See 
[https://builds.apache.org/job/HBase-TRUNK/2311/])
HBASE-4540 OpenedRegionHandler is not enforcing atomicity of the operation 
it is performing . Also fixes HBASE-4539 (ram)

ramkrishna : 
Files : 
* /hbase/trunk/CHANGES.txt
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/handler/OpenedRegionHandler.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/zookeeper/ZKAssign.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/zookeeper/ZKUtil.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/master/TestOpenedRegionHandler.java







[jira] [Commented] (HBASE-3872) Hole in split transaction rollback; edits to .META. need to be rolled back even if it seems like they didn't make it

2011-10-09 Thread bluedavy (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13123670#comment-13123670
 ] 

bluedavy commented on HBASE-3872:
-

@stack
The current patch will cause data loss in this situation:
1. Change SplitTransaction as @mingjian described;
2. Create a table and put some data into it from the hbase shell;
3. Split the table from the hbase shell;
4. Kill the region server hosting the table;
5. After the master runs ServerShutdownHandler, the table can be written to 
again, but the data previously written to it is lost.

Also, with the code above, if we don't kill the region server, the parent 
region cannot be written to, even after restarting the cluster.






[jira] [Commented] (HBASE-4218) Delta Encoding of KeyValues (aka prefix compression)

2011-10-09 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13123704#comment-13123704
 ] 

Ted Yu commented on HBASE-4218:
---

There seems to be a typo in a comment in KeyValue.java, which says "row length" 
although the constant is FAMILY_LENGTH_SIZE:
{noformat}
  /** Size in bytes of field the row length */
  public static final int FAMILY_LENGTH_SIZE = Bytes.SIZEOF_BYTE;
{noformat}

> Delta Encoding of KeyValues  (aka prefix compression)
> -
>
> Key: HBASE-4218
> URL: https://issues.apache.org/jira/browse/HBASE-4218
> Project: HBase
>  Issue Type: Improvement
>  Components: io
>Affects Versions: 0.94.0
>Reporter: Jacek Migdal
>  Labels: compression
> Attachments: open-source.diff
>
>
> A compression scheme for keys. Keys are sorted in an HFile and are usually 
> very similar. Because of that, it is possible to design better compression 
> than general-purpose algorithms provide.
> It is an additional step designed to be used in memory. It aims to save 
> memory in the cache as well as to speed up seeks within HFileBlocks. It should 
> improve performance a lot if key lengths are larger than value lengths. For 
> example, it makes a lot of sense to use it when the value is a counter.
> Initial tests on real data (key length = ~ 90 bytes, value length = 8 bytes) 
> show that I could achieve a decent level of compression:
>  key compression ratio: 92%
>  total compression ratio: 85%
>  LZO on the same data: 85%
>  LZO after delta encoding: 91%
> while having much better performance (20-80% faster decompression than 
> LZO). Moreover, it should allow far more efficient seeking, which should 
> improve performance a bit.
> It seems that simple compression algorithms are good enough. Most of the 
> savings are due to prefix compression, int128 encoding, timestamp diffs and 
> bitfields to avoid duplication. That way, comparisons of compressed data can 
> be much faster than a byte comparator (thanks to prefix compression and 
> bitfields).
> In order to implement it in HBase, two important changes in design will be 
> needed:
> - solidify the interface to HFileBlock / HFileReader Scanner to provide 
> seeking and iterating; access to the uncompressed buffer in HFileBlock will 
> have bad performance
> - extend comparators to support comparison assuming that the first N bytes 
> are equal (or some fields are equal)
> Link to a discussion about something similar:
> http://search-hadoop.com/m/5aqGXJEnaD1/hbase+windows&subj=Re+prefix+compression
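
To make the prefix-compression idea concrete, here is a small sketch that stores each sorted key as the number of leading characters it shares with the previous key plus the remaining suffix. It is not the encoder from the attached diff. `PrefixEncodingSketch` is hypothetical, and it works on strings rather than KeyValue bytes to keep the example short.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration of prefix compression over sorted keys.
final class PrefixEncodingSketch {
    /** One encoded key: length of the prefix shared with the previous key, plus the rest. */
    static final class Entry {
        final int shared;
        final String suffix;

        Entry(int shared, String suffix) {
            this.shared = shared;
            this.suffix = suffix;
        }
    }

    static List<Entry> encode(List<String> sortedKeys) {
        List<Entry> out = new ArrayList<>();
        String prev = "";
        for (String key : sortedKeys) {
            int max = Math.min(prev.length(), key.length());
            int shared = 0;
            while (shared < max && prev.charAt(shared) == key.charAt(shared)) {
                shared++;
            }
            out.add(new Entry(shared, key.substring(shared)));
            prev = key;
        }
        return out;
    }

    static List<String> decode(List<Entry> entries) {
        List<String> out = new ArrayList<>();
        String prev = "";
        for (Entry e : entries) {
            // Rebuild the key from the previous key's prefix and the stored suffix.
            String key = prev.substring(0, e.shared) + e.suffix;
            out.add(key);
            prev = key;
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> keys = Arrays.asList("row0001/cf:a", "row0001/cf:b", "row0002/cf:a");
        for (Entry e : encode(keys)) {
            System.out.println(e.shared + " + \"" + e.suffix + "\"");
        }
    }
}
```

The stored shared length is also what would let a comparator skip the bytes already known to be equal, which is the second design change listed above.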





[jira] [Commented] (HBASE-4218) Delta Encoding of KeyValues (aka prefix compression)

2011-10-09 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13123709#comment-13123709
 ] 

Ted Yu commented on HBASE-4218:
---

HFileBlockDeltaEncoder.java, RedundantKVGenerator.java, 
TestBufferedDeltaEncoder.java, and TestDeltaEncoders.java need license headers.

The RedundantKVGenerator constructor has many parameters. Would it be possible 
to use a wrapper object to hold them?
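
As an illustration of the wrapper suggestion, a small options object with a builder might look like the sketch below. Every name here is hypothetical: `GeneratorOptions` and its fields are invented for the example and are not taken from RedundantKVGenerator in the patch.

```java
// Hypothetical options holder; the field names are illustrative only and do not
// come from the patch under review.
final class GeneratorOptions {
    final int rowPrefixCount;
    final int averageKeyLength;
    final long seed;

    private GeneratorOptions(Builder b) {
        this.rowPrefixCount = b.rowPrefixCount;
        this.averageKeyLength = b.averageKeyLength;
        this.seed = b.seed;
    }

    static Builder builder() {
        return new Builder();
    }

    static final class Builder {
        // Defaults let callers set only the values they care about.
        private int rowPrefixCount = 10;
        private int averageKeyLength = 32;
        private long seed = 42L;

        Builder rowPrefixCount(int value) {
            this.rowPrefixCount = value;
            return this;
        }

        Builder averageKeyLength(int value) {
            this.averageKeyLength = value;
            return this;
        }

        Builder seed(long value) {
            this.seed = value;
            return this;
        }

        GeneratorOptions build() {
            return new GeneratorOptions(this);
        }
    }

    public static void main(String[] args) {
        GeneratorOptions options = GeneratorOptions.builder().rowPrefixCount(5).seed(7L).build();
        System.out.println(options.rowPrefixCount + " " + options.averageKeyLength + " " + options.seed);
    }
}
```

A constructor that takes one such object keeps call sites readable and lets tests override a single parameter without repeating the rest.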






[jira] [Created] (HBASE-4560) Script to find hanging test cases in build

2011-10-09 Thread ramkrishna.s.vasudevan (Created) (JIRA)
Script to find hanging test cases in build
--

 Key: HBASE-4560
 URL: https://issues.apache.org/jira/browse/HBASE-4560
 Project: HBase
  Issue Type: Improvement
Reporter: ramkrishna.s.vasudevan
Priority: Minor


A script that parses the console output to get the names of hanging tests. This 
will be very useful for identifying the hanging test cases when, in some 
builds, all the test cases appear to run but the build is still reported as failed.





[jira] [Updated] (HBASE-4560) Script to find hanging test cases in build

2011-10-09 Thread ramkrishna.s.vasudevan (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4560?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ramkrishna.s.vasudevan updated HBASE-4560:
--

Attachment: findHangingTest.sh

A script that parses the console output and finds the test cases that hang.
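
For readers without the attachment, the idea can be sketched as follows. This is not the contents of findHangingTest.sh (which is a shell script); it is a Java illustration that assumes sequential Surefire-style console output, where each test class prints a "Running <class>" line and later a "Tests run: ..." result line. The class name HangingTestFinderSketch is hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Illustrative only: assumes one test class runs at a time and logs in order. */
final class HangingTestFinderSketch {
  private static final String RUNNING = "Running ";
  private static final String TESTS_RUN = "Tests run:";

  /** Returns test classes that were started but never printed a result line. */
  static List<String> findHanging(List<String> consoleLines) {
    Set<String> unfinished = new LinkedHashSet<>();
    String current = null;
    for (String line : consoleLines) {
      String trimmed = line.trim();
      if (trimmed.startsWith(RUNNING)) {
        // A new test class started; remember it until its result shows up.
        current = trimmed.substring(RUNNING.length()).trim();
        unfinished.add(current);
      } else if (trimmed.startsWith(TESTS_RUN) && current != null) {
        // Result line for the most recently started class.
        unfinished.remove(current);
        current = null;
      }
    }
    return new ArrayList<>(unfinished);
  }
}
```

With parallel test execution the "most recently started" pairing no longer holds, so a real script would need a different matching rule.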

> Script to find hanging test cases in build
> --
>
> Key: HBASE-4560
> URL: https://issues.apache.org/jira/browse/HBASE-4560
> Project: HBase
>  Issue Type: Improvement
>Reporter: ramkrishna.s.vasudevan
>Priority: Minor
> Attachments: findHangingTest.sh
>
>
> A script that parses the console output to get the names of hanging tests. This 
> will be very useful for identifying the hanging test cases when, in some 
> builds, all the test cases appear to run but the build is still reported as failed.





[jira] [Created] (HBASE-4561) Update Maven documentation in book

2011-10-09 Thread Jesse Yates (Created) (JIRA)
Update Maven documentation in book
--

 Key: HBASE-4561
 URL: https://issues.apache.org/jira/browse/HBASE-4561
 Project: HBase
  Issue Type: Improvement
  Components: documentation
Reporter: Jesse Yates
Priority: Minor


The Maven documentation is a little out of date and has recently led to some 
confusion about tests. This would clean up the Maven documentation in the book to be 
more explicit about how Maven should be used.





[jira] [Updated] (HBASE-4561) Update Maven documentation in book

2011-10-09 Thread Jesse Yates (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4561?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesse Yates updated HBASE-4561:
---

Attachment: book_HBASE-4561.txt

Adding fix. It states that Maven 3 is the standard and moves the info about integration 
tests up to sit with the rest of the Maven commands.

> Update Maven documentation in book
> --
>
> Key: HBASE-4561
> URL: https://issues.apache.org/jira/browse/HBASE-4561
> Project: HBase
>  Issue Type: Improvement
>  Components: documentation
>Reporter: Jesse Yates
>Priority: Minor
> Attachments: book_HBASE-4561.txt
>
>
> The Maven documentation is a little out of date and has recently led to some 
> confusion about tests. This would clean up the Maven documentation in the book to 
> be more explicit about how Maven should be used.





[jira] [Commented] (HBASE-4218) Delta Encoding of KeyValues (aka prefix compression)

2011-10-09 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13123785#comment-13123785
 ] 

Ted Yu commented on HBASE-4218:
---

For BlockDeltaEncoder.decodeDataBlock():
{code}
  private HFileBlock decodeDataBlock(HFileBlock block, boolean verifyEncoding,
  short exceptDeltaEncoderId) {
{code}
exceptDeltaEncoderId should be called expectedDeltaEncoderId.

RuntimeException is thrown in case of IOException. I think decodeDataBlock() 
can be declared to throw IOException.
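
A small sketch of the difference between the two styles, under stated assumptions: DecodeSignatureSketch and readHeader are hypothetical stand-ins and do not reproduce the patch's decodeDataBlock(). The first method hides the checked exception in a RuntimeException; the second declares it, as suggested above.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;

/** Illustrative only; method names and the 4-byte header are assumptions. */
final class DecodeSignatureSketch {

  /** Current style: the IOException is wrapped and callers cannot handle it specifically. */
  static int decodeWrapping(byte[] encoded) {
    try {
      return readHeader(encoded);
    } catch (IOException e) {
      throw new RuntimeException(e);
    }
  }

  /** Suggested style: declare the IOException and let the caller decide what to do. */
  static int decodeDeclaring(byte[] encoded) throws IOException {
    return readHeader(encoded);
  }

  /** Hypothetical stand-in for decoding: reads a 4-byte int, failing on short input. */
  private static int readHeader(byte[] encoded) throws IOException {
    try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(encoded))) {
      return in.readInt();
    }
  }
}
```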






[jira] [Commented] (HBASE-4218) Delta Encoding of KeyValues (aka prefix compression)

2011-10-09 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13123792#comment-13123792
 ] 

Ted Yu commented on HBASE-4218:
---

For BlockDeltaEncoder.inMemory:
{code}
  private final boolean inMemory;
{code}
Would encodedInMemory be a better name? From the javadoc in the code, it seems 
inMemory indicates whether in-memory encoding is desired.

For BlockDeltaEncoder.afterReadFromDiskAndPuttingInCache(),
{code}
if (block.getBlockType() == BlockType.ENCODED_DATA) {
  throw new IllegalStateException("Unexcepted encoding");
}
{code}
I think block.getDeltaEncodingId() should be included in the exception message. 
Further, can we use a call such as the following to decode the block instead of 
throwing an exception?
{code}
decodeDataBlock(block, true, block.getDeltaEncodingId())
{code}
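
As a sketch of the first suggestion, the message could carry the encoding id so the failure is diagnosable from the log alone. EncodingCheckSketch and its parameters are hypothetical; the short argument stands in for whatever block.getDeltaEncodingId() returns in the patch.

```java
/** Illustrative only; names and message wording are assumptions, not the patch's code. */
final class EncodingCheckSketch {

  /** Builds a message that includes the offending delta encoding id. */
  static String describeUnexpected(short deltaEncodingId) {
    return "Unexpected encoding, delta encoding id=" + deltaEncodingId;
  }

  /** Throws with the id in the message when an already-encoded block is seen. */
  static void requireNotEncoded(boolean encoded, short deltaEncodingId) {
    if (encoded) {
      throw new IllegalStateException(describeUnexpected(deltaEncodingId));
    }
  }
}
```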






[jira] [Commented] (HBASE-4218) Delta Encoding of KeyValues (aka prefix compression)

2011-10-09 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13123799#comment-13123799
 ] 

Ted Yu commented on HBASE-4218:
---

For BlockDeltaEncoder.useEncodedScanner(), why doesn't isCompaction appear in 
the second condition on line 227?

TestHFileBlockDeltaEncoder and DeltaEncodingSeekPerformance need license headers.

For BitsetKeyDeltaEncoder.uncompressKeyValues(), the IllegalStateException on 
line 81 should include source.available() and skipLastBytes in its message.
BitsetKeyDeltaEncoder.isPartEqual() should be named arePartsEqual().







[jira] [Commented] (HBASE-4540) OpenedRegionHandler is not enforcing atomicity of the operation it is performing

2011-10-09 Thread jirapos...@reviews.apache.org (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13123802#comment-13123802
 ] 

jirapos...@reviews.apache.org commented on HBASE-4540:
--



bq.  On 2011-10-08 21:55:31, Michael Stack wrote:
bq.  > 
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/master/TestOpenedRegionHandler.java,
 line 102
bq.  > 
bq.  >
bq.  > Good test.
bq.  > 
bq.  > Would it be possible to test the handler without spinning up the 
cluster?  See TestOpenRegionHandler over under regionserver.handler in tests -- 
they don't spin up a cluster, just zk.  Test can run faster if no dfs+hbase.  
Not important.  For the future.
bq.  
bq.  ramkrishna vasudevan wrote:
bq.  @Stack
bq.  I can do that at least for one of the test cases in 
TestOpenedRegionHandler, but I would have to use MockServer and 
MockRegionServices.
bq.  I will raise a minor improvement task for that. Currently 
MockServer and MockRegionServices are under the regionserver.handler package, but 
the new test case is in the master package, so it would be better to move them to a 
test.utility package and share them across packages. I will go ahead with this 
commit for now and track the new improvement JIRA to closure.

Sounds good Ram.  Yes, we should move these out if more generally useful.


bq.  On 2011-10-08 21:55:31, Michael Stack wrote:
bq.  > 
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/zookeeper/ZKAssign.java,
 line 873
bq.  > 
bq.  >
bq.  > We don't have this method already in our ZK* classes?
bq.  
bq.  ramkrishna vasudevan wrote:
bq.  @Stack
bq.  ZKAssign did not have a getDataAndWatch() that accepts a Stat object.  Only 
ZKUtil had one, but it returned the data as bytes, which then had to be converted to 
RegionTransitionData. 
bq.  Hence I added a utility API in ZKAssign itself, thinking it may also be 
useful in the future.

Sounds good.


- Michael


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/2251/#review2469
---


On 2011-10-08 05:13:32, ramkrishna vasudevan wrote:
bq.  
bq.  ---
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/2251/
bq.  ---
bq.  
bq.  (Updated 2011-10-08 05:13:32)
bq.  
bq.  
bq.  Review request for hbase, Ted Yu, Michael Stack, and Jonathan Gray.
bq.  
bq.  
bq.  Summary
bq.  ---
bq.  
bq.  Fix for handling HBASE-4539 and HBASE-4540.
bq.  Ran all the testcases.  Added one new testcase to verify 
OpenedRegionHandler scenarios.
bq.  Also addresses Ted's comments.
bq.  
bq.  
bq.  This addresses bug HBASE-4540.
bq.  https://issues.apache.org/jira/browse/HBASE-4540
bq.  
bq.  
bq.  Diffs
bq.  -
bq.  
bq.
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java
 1179945 
bq.
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/handler/OpenedRegionHandler.java
 1179945 
bq.
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/zookeeper/ZKAssign.java
 1179945 
bq.
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/zookeeper/ZKUtil.java
 1179945 
bq.
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/master/TestOpenedRegionHandler.java
 PRE-CREATION 
bq.  
bq.  Diff: https://reviews.apache.org/r/2251/diff
bq.  
bq.  
bq.  Testing
bq.  ---
bq.  
bq.  Yes
bq.  
bq.  
bq.  Thanks,
bq.  
bq.  ramkrishna
bq.  
bq.



> OpenedRegionHandler is not enforcing atomicity of the operation it is 
> performing
> 
>
> Key: HBASE-4540
> URL: https://issues.apache.org/jira/browse/HBASE-4540
> Project: HBase
>  Issue Type: Bug
>Reporter: ramkrishna.s.vasudevan
>Assignee: ramkrishna.s.vasudevan
> Attachments: HBASE-4540_1.patch
>
>
> -> OpenedRegionHandler has not yet deleted the znode of the region R1 opened 
> by RS1.
> -> RS1 goes down.
> -> Servershutdownhandler assigns the region R1 to RS2.
> -> The znode of R1 is moved to OFFLINE state by master or OPENING state by 
> RS2 if RS2 has started opening the region.
> -> Now the first OpenedRegionHandler tries to delete the znode, thinking it is 
> in OPENED state, but fails.
> -> Though it fails it removes the node from RIT and adds RS1 as the owner of

[jira] [Resolved] (HBASE-4561) Update Maven documentation in book

2011-10-09 Thread stack (Resolved) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4561?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack resolved HBASE-4561.
--

  Resolution: Fixed
Hadoop Flags: Reviewed

Committed to TRUNK.  Thanks for the patch Jesse.  I changed the 'NOT Maven 2' 
to something like 'Maven 2 may work but we recommend you upgrade to Maven 3' -- 
since Maven 2 seems to work.






[jira] [Commented] (HBASE-4561) Update Maven documentation in book

2011-10-09 Thread Jesse Yates (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13123810#comment-13123810
 ] 

Jesse Yates commented on HBASE-4561:


No problem - I'm not one for prose :)






[jira] [Commented] (HBASE-4388) Second start after migration from 90 to trunk crashes

2011-10-09 Thread Suraj Varma (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13123819#comment-13123819
 ] 

Suraj Varma commented on HBASE-4388:


I applied the 4288-v3 patch locally but still got the same exception 
(0.92.0-SNAPSHOT + 4288-v3 patch pointed to a 0.90.x hbase root dir).

2011-10-09 17:36:47,062 FATAL org.apache.hadoop.hbase.master.HMaster: Unhandled exception. Starting shutdown.
java.lang.NegativeArraySizeException: -108
    at org.apache.hadoop.hbase.util.Bytes.readByteArray(Bytes.java:147)
    at org.apache.hadoop.hbase.HTableDescriptor.readFields(HTableDescriptor.java:621)
    at org.apache.hadoop.hbase.migration.HRegionInfo090x.readFields(HRegionInfo090x.java:641)
    at org.apache.hadoop.hbase.util.Writables.getWritable(Writables.java:133)

Just wanted to give this feedback ... 
Does this need to pull in HBASE-3970 as well to 0.92?

> Second start after migration from 90 to trunk crashes
> -
>
> Key: HBASE-4388
> URL: https://issues.apache.org/jira/browse/HBASE-4388
> Project: HBase
>  Issue Type: Bug
>  Components: master, migration
>Affects Versions: 0.92.0
>Reporter: Todd Lipcon
>Assignee: stack
>Priority: Blocker
> Fix For: 0.92.0
>
> Attachments: 4388-v2.txt, 4388-v3.txt, 4388.txt, meta.tgz
>
>
> I started a trunk cluster to upgrade from 90, inserted a ton of data, then 
> did a clean shutdown. When I started again, I got the following exception:
> 11/09/13 12:29:09 INFO master.HMaster: Meta has HRI with HTDs. Updating meta now.
> 11/09/13 12:29:09 FATAL master.HMaster: Unhandled exception. Starting shutdown.
> java.lang.NegativeArraySizeException: -102
>     at org.apache.hadoop.hbase.util.Bytes.readByteArray(Bytes.java:147)
>     at org.apache.hadoop.hbase.HTableDescriptor.readFields(HTableDescriptor.java:606)
>     at org.apache.hadoop.hbase.migration.HRegionInfo090x.readFields(HRegionInfo090x.java:641)
>     at org.apache.hadoop.hbase.util.Writables.getWritable(Writables.java:133)
>     at org.apache.hadoop.hbase.util.Writables.getWritable(Writables.java:103)
>     at org.apache.hadoop.hbase.util.Writables.getHRegionInfoForMigration(Writables.java:228)
>     at org.apache.hadoop.hbase.catalog.MetaEditor.getHRegionInfoForMigration(MetaEditor.java:350)
>     at org.apache.hadoop.hbase.catalog.MetaEditor$1.visit(MetaEditor.java:273)
>     at org.apache.hadoop.hbase.catalog.MetaReader.fullScan(MetaReader.java:633)
>     at org.apache.hadoop.hbase.catalog.MetaReader.fullScan(MetaReader.java:255)
>     at org.apache.hadoop.hbase.catalog.MetaReader.fullScan(MetaReader.java:235)
>     at org.apache.hadoop.hbase.catalog.MetaEditor.updateMetaWithNewRegionInfo(MetaEditor.java:284)
>     at org.apache.hadoop.hbase.catalog.MetaEditor.migrateRootAndMeta(MetaEditor.java:298)
>     at org.apache.hadoop.hbase.master.HMaster.updateMetaWithNewHRI(HMaster.java:529)
>     at org.apache.hadoop.hbase.master.HMaster.finishInitialization(HMaster.java:472)
>     at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:309)





[jira] [Commented] (HBASE-4388) Second start after migration from 90 to trunk crashes

2011-10-09 Thread Suraj Varma (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13123820#comment-13123820
 ] 

Suraj Varma commented on HBASE-4388:


Sorry - I meant 4388-v3 not 4288-v3 ...






[jira] [Commented] (HBASE-4388) Second start after migration from 90 to trunk crashes

2011-10-09 Thread Suraj Varma (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13123825#comment-13123825
 ] 

Suraj Varma commented on HBASE-4388:


So - looks like HBASE-3970 changes are already in 0.92 (though it is marked as 
trunk in the jira ticket).

So - this issue still exists, it appears.






[jira] [Commented] (HBASE-4388) Second start after migration from 90 to trunk crashes

2011-10-09 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13123827#comment-13123827
 ] 

Ted Yu commented on HBASE-4388:
---

@Suraj:
Can you give more details (including the master log)?
Was your procedure different from the one I described @ 14/Sep/11 23:44?

Thanks


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-3872) Hole in split transaction rollback; edits to .META. need to be rolled back even if it seems like they didn't make it

2011-10-09 Thread bluedavy (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13123837#comment-13123837
 ] 

bluedavy commented on HBASE-3872:
-

We fixed the bug using the code below:
 if (!testing) {
+  this.journal.add(JournalEntry.PONR);
   MetaEditor.offlineParentInMeta(server.getCatalogTracker(), this.parent.getRegionInfo(),
     a.getRegionInfo(), b.getRegionInfo());
 }

-this.journal.add(JournalEntry.PONR);

> Hole in split transaction rollback; edits to .META. need to be rolled back 
> even if it seems like they didn't make it
> 
>
> Key: HBASE-3872
> URL: https://issues.apache.org/jira/browse/HBASE-3872
> Project: HBase
>  Issue Type: Bug
>  Components: regionserver
>Affects Versions: 0.90.3
>Reporter: stack
>Assignee: stack
>Priority: Blocker
> Fix For: 0.90.4
>
> Attachments: 3872-v2.txt, 3872.txt
>
>
> Saw this interesting one on a cluster of ours.  The cluster was configured 
> with too few handlers so we saw lots of the phenomenon where actions were queued 
> but then by the time they got into the server and tried to respond to the 
> client, the client had disconnected because of the timeout of 60 seconds.  
> Well, the meta edits for a split were queued at the regionserver carrying 
> .META. and by the time it went to write back, the client had gone (the first 
> insert of parent offline with daughter regions added as info:splitA and 
> info:splitB).  The client presumed the edits failed and 'successfully' rolled 
> back the transaction (failing to undo .META. edits thinking they didn't go 
> through).
> A few minutes later the .META. scanner on master runs.  It sees 'no 
> references' in daughters -- the daughters had been cleaned up as part of the 
> split transaction rollback -- so it thinks it's safe to delete the parent.
> Two things:
> + Tighten up check in master... need to check daughter region at least exists 
> and possibly the daughter region has an entry in .META.
> + Dependent on the edit that fails, schedule rollback edits though it will 
> seem like they didn't go through.
> This is a pretty critical one.





[jira] [Commented] (HBASE-3872) Hole in split transaction rollback; edits to .META. need to be rolled back even if it seems like they didn't make it

2011-10-09 Thread bluedavy (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13123838#comment-13123838
 ] 

bluedavy commented on HBASE-3872:
-

{code:title=Bar.java|borderStyle=solid}
if (!testing) {
  this.journal.add(JournalEntry.PONR); 
  MetaEditor.offlineParentInMeta(server.getCatalogTracker(),
this.parent.getRegionInfo(), a.getRegionInfo(), b.getRegionInfo());
}
// this.journal.add(JournalEntry.PONR); 
{code} 






[jira] [Commented] (HBASE-3872) Hole in split transaction rollback; edits to .META. need to be rolled back even if it seems like they didn't make it

2011-10-09 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13123842#comment-13123842
 ] 

Ted Yu commented on HBASE-3872:
---

I think the above code should be improved.
If testing is true, JournalEntry.PONR wouldn't be set.
It seems JournalEntry.PONR should be set before the check for testing.

A new JIRA should be opened.
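
A minimal sketch of the ordering suggested above, under stated assumptions: SplitJournalSketch and the metaEditFails flag are hypothetical names for illustration, and this is not the HBase SplitTransaction code. It only shows that journaling PONR before the check for testing records the entry on both paths, and that a rollback which finds PONR in the journal reports failure so the caller can abort.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified model of a journaled split (not HBase's class).
class SplitJournalSketch {
  enum JournalEntry { CLOSED_PARENT_REGION, PONR }

  final List<JournalEntry> journal = new ArrayList<>();

  // Runs the steps up to the point of no return (PONR).
  // PONR is journaled before the testing check, so it is recorded whether or
  // not testing is true, and before the meta edit, so a failed edit still
  // counts as "past PONR".
  void execute(boolean testing, boolean metaEditFails) throws IOException {
    journal.add(JournalEntry.CLOSED_PARENT_REGION);
    journal.add(JournalEntry.PONR);
    if (!testing && metaEditFails) {
      throw new IOException("simulated failure while editing meta");
    }
  }

  // Returns true if every journaled step could be undone, false once PONR
  // has been journaled. Undoing the other steps is omitted in this sketch.
  boolean rollback() {
    for (int i = journal.size() - 1; i >= 0; i--) {
      if (journal.get(i) == JournalEntry.PONR) {
        return false; // caller should abort rather than report a clean rollback
      }
    }
    return true;
  }
}
```

With this ordering, execute(true, false) leaves PONR in the journal as well, so tests exercise the same rollback decision as the non-testing path.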







[jira] [Commented] (HBASE-4561) Update Maven documentation in book

2011-10-09 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13123846#comment-13123846
 ] 

Hudson commented on HBASE-4561:
---

Integrated in HBase-TRUNK #2312 (See 
[https://builds.apache.org/job/HBase-TRUNK/2312/])
HBASE-4561 Update Maven documentation in book

stack : 
Files : 
* /hbase/trunk/src/docbkx/developer.xml


> Update Maven documentation in book
> --
>
> Key: HBASE-4561
> URL: https://issues.apache.org/jira/browse/HBASE-4561
> Project: HBase
>  Issue Type: Improvement
>  Components: documentation
>Reporter: Jesse Yates
>Priority: Minor
> Attachments: book_HBASE-4561.txt
>
>
> The Maven documentation is a little out of date and has recently led to some 
> confusion about tests. This would clean up the Maven documentation in the book to 
> be more explicit about how Maven should be used.





[jira] [Updated] (HBASE-4102) atomicAppend: A put that appends to the latest version of a cell; i.e. reads current value then adds the bytes offered by the client to the tail and writes out a new entr

2011-10-09 Thread Lars Hofhansl (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4102?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-4102:
-

Attachment: 4102.txt

Here's a first patch.
I hopefully minimized the copying and creation of byte[] in the regionserver.
I added a new constructor to KeyValue to be able to create an empty but 
pre-sized KeyValue that can be filled later.

There's a lot of boilerplate that is very similar between Put, Increment, and 
now Append. We could think about factoring some more of it out.
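
To illustrate the idea of minimizing byte[] copies, here is a small sketch that allocates the result once at its final size and copies the old value and the appended bytes straight into it. PreSizedAppendSketch is a hypothetical helper for illustration; it is not the new KeyValue constructor from the patch.

```java
// Hypothetical helper, not part of HBase.
class PreSizedAppendSketch {
  // Allocates the result once at its final size, then fills it in place.
  static byte[] append(byte[] current, byte[] suffix) {
    byte[] result = new byte[current.length + suffix.length];
    System.arraycopy(current, 0, result, 0, current.length);
    System.arraycopy(suffix, 0, result, current.length, suffix.length);
    return result;
  }
}
```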

> atomicAppend: A put that appends to the latest version of a cell; i.e. reads 
> current value then adds the bytes offered by the client to the tail and 
> writes out a new entry
> ---
>
> Key: HBASE-4102
> URL: https://issues.apache.org/jira/browse/HBASE-4102
> Project: HBase
>  Issue Type: New Feature
>Reporter: stack
>Assignee: Lars Hofhansl
> Attachments: 4102.txt
>
>
> It's come up a few times that clients want to add to an existing cell rather 
> than make a new cell each time.  At our place, the frontend keeps a list of 
> urls a user has visited -- their md5s -- and updates it as the user progresses.  
> Rather than read, modify client-side, then write the new value back to hbase, it 
> would be sweet if we could do it all in one operation in the hbase server.  TSDB 
> aims to be space efficient.  Rather than pay the cost of the KV wrapper per 
> metric, it would rather have a KV for an interval and in this KV have a value 
> that is all the metrics for the period.
> It could be done as a coprocessor but this feels more like a fundamental 
> feature.
> Benoît suggests that atomicAppend take a flag to indicate whether or not the 
> client wants to see the resulting cell; often a client won't want to see the 
> result and in this case, why pay the price of formulating and delivering a 
> response that the client just drops.
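
The request above can be sketched in plain Java, assuming an in-memory map in place of a region: the read, the append, and the write happen as one atomic step per cell, and a flag controls whether the new value is returned. AtomicAppendSketch, append, and returnResult are hypothetical names for illustration, not the HBase API.

```java
import java.util.Arrays;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical in-memory stand-in for a server-side append; not HBase code.
class AtomicAppendSketch {
  private final ConcurrentHashMap<String, byte[]> cells = new ConcurrentHashMap<>();

  // Appends suffix to the cell's current value atomically.
  // Returns a copy of the new value only if returnResult is true, else null.
  byte[] append(String cell, byte[] suffix, boolean returnResult) {
    byte[] updated = cells.compute(cell, (key, old) -> {
      byte[] base = (old == null) ? new byte[0] : old;
      byte[] merged = Arrays.copyOf(base, base.length + suffix.length);
      System.arraycopy(suffix, 0, merged, base.length, suffix.length);
      return merged;
    });
    return returnResult ? updated.clone() : null;
  }

  // Reads the current value (a copy), or null if the cell does not exist.
  byte[] get(String cell) {
    byte[] value = cells.get(cell);
    return value == null ? null : value.clone();
  }
}
```

ConcurrentHashMap.compute runs the remapping function atomically for a given key, which stands in here for whatever locking the server would use; skipping the copy when returnResult is false mirrors the point about not building a response the client drops.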





[jira] [Created] (HBASE-4562) When split occurs error,it'll cause data loss

2011-10-09 Thread bluedavy (Created) (JIRA)
When split occurs error,it'll cause data loss
-

 Key: HBASE-4562
 URL: https://issues.apache.org/jira/browse/HBASE-4562
 Project: HBase
  Issue Type: Bug
  Components: regionserver
Affects Versions: 0.90.4
Reporter: bluedavy
Priority: Blocker
 Fix For: 0.90.5


Follow the steps below to reproduce the problem:
1. Change SplitTransaction.java as below to mock the timeout error.
   {code:title=SplitTransaction.java|borderStyle=solid}
   if (!testing) {
     MetaEditor.offlineParentInMeta(server.getCatalogTracker(),
       this.parent.getRegionInfo(), a.getRegionInfo(), b.getRegionInfo());
     throw new IOException("some unexpected error in split");
   }
   {code} 
2. Deploy the modified regionserver code and restart.
3. Create a table and put some data into it.
4. Split the table.
5. Kill the regionserver hosting the table.
6. Wait some time after the master's ServerShutdownHandler.process executes, then 
scan the table; you'll find that the data written earlier is lost.

We can fix the bug with the code below:
{code:title=SplitTransaction.java|borderStyle=solid}
  this.journal.add(JournalEntry.PONR);
  if (!testing) {
    MetaEditor.offlineParentInMeta(server.getCatalogTracker(),
      this.parent.getRegionInfo(), a.getRegionInfo(), b.getRegionInfo());
    throw new IOException("some unexpected error in split");
  }
{code} 





[jira] [Updated] (HBASE-4562) When split occurs error,it'll cause data loss

2011-10-09 Thread bluedavy (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

bluedavy updated HBASE-4562:


Description: 
Follow the steps below to reproduce the problem:
1. Change SplitTransaction.java as below to mock the timeout error.
   {code:title=SplitTransaction.java|borderStyle=solid}
   if (!testing) {
     MetaEditor.offlineParentInMeta(server.getCatalogTracker(),
       this.parent.getRegionInfo(), a.getRegionInfo(), b.getRegionInfo());
     throw new IOException("some unexpected error in split");
   }
   {code} 
2. Deploy the modified regionserver code and restart.
3. Create a table and put some data into it.
4. Split the table.
5. Kill the regionserver hosting the table.
6. Wait some time after the master's ServerShutdownHandler.process executes, then 
scan the table; you'll find that the data written earlier is lost.

We can fix the bug with the code below:
{code:title=SplitTransaction.java|borderStyle=solid}
  this.journal.add(JournalEntry.PONR);
  if (!testing) {
    MetaEditor.offlineParentInMeta(server.getCatalogTracker(),
      this.parent.getRegionInfo(), a.getRegionInfo(), b.getRegionInfo());
    throw new IOException("some unexpected error in split");
  }
{code} 
{code:title=CompactSplitThread.java|borderStyle=solid}
  if (st.rollback(this.server, this.server)) {
    LOG.info("Successful rollback of failed split of " +
      parent.getRegionNameAsString());
  } else {
    this.server.abort("Abort; we got an error after point-of-no-return");
  }
{code}








[jira] [Created] (HBASE-4563) When split doing this.parent.close(false) occurs error,it'll cause the splited region cann't write & read

2011-10-09 Thread bluedavy (Created) (JIRA)
When split doing this.parent.close(false) occurs error,it'll cause the splited 
region cann't write & read
-

 Key: HBASE-4563
 URL: https://issues.apache.org/jira/browse/HBASE-4563
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.90.4
Reporter: bluedavy
Priority: Blocker
 Fix For: 0.90.5


Follow the steps below to reproduce the problem:
1. Change SplitTransaction.java as below to mock the HDFS error.
   {code:title=SplitTransaction.java|borderStyle=solid}
   List hstoreFilesToSplit = this.parent.close(false);
   throw new IOException("some unexpected error in close store files");
   {code} 
2. Deploy the modified regionserver code and restart.
3. Create a table and put some data into it.
4. Split the table.
5. Scan the table; the scan will fail.

We can fix the bug with the code below:
{code:title=SplitTransaction.java|borderStyle=solid}
  List hstoreFilesToSplit = null;
  try {
    hstoreFilesToSplit = this.parent.close(false);
  } catch (IOException e) {
    this.journal.add(JournalEntry.CLOSED_PARENT_REGION);
    throw e;
  }
{code} 
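
A rough, self-contained sketch of why journaling the step inside the catch block matters, assuming a much simplified model: CloseJournalSketch, parentOpen, and the failOnClose flag are hypothetical names for illustration and this is not the HBase code. If the close fails part-way, the region is already unusable, so the journal entry is what tells rollback to reopen it.

```java
import java.io.IOException;
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical, simplified model of the parent-close step (not HBase's class).
class CloseJournalSketch {
  enum Step { CLOSED_PARENT_REGION }

  final Deque<Step> journal = new ArrayDeque<>();
  boolean parentOpen = true;

  // Marks the parent closed, then optionally fails part-way through.
  private void closeParent(boolean fail) throws IOException {
    parentOpen = false;
    if (fail) {
      throw new IOException("simulated error while closing store files");
    }
  }

  void split(boolean failOnClose) throws IOException {
    try {
      closeParent(failOnClose);
    } catch (IOException e) {
      // Journal the step even though it failed, so rollback reopens the parent.
      journal.push(Step.CLOSED_PARENT_REGION);
      throw e;
    }
    journal.push(Step.CLOSED_PARENT_REGION);
  }

  // Undoes journaled steps in reverse order.
  void rollback() {
    while (!journal.isEmpty()) {
      if (journal.pop() == Step.CLOSED_PARENT_REGION) {
        parentOpen = true;
      }
    }
  }
}
```

Without the push in the catch block, the journal would be empty after a failed close, rollback would do nothing, and the parent would stay closed, which matches the failing scan in step 5.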





[jira] [Updated] (HBASE-4562) When split doing offlineParentInMeta occurs error,it'll cause data loss

2011-10-09 Thread bluedavy (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

bluedavy updated HBASE-4562:


Summary: When split doing offlineParentInMeta occurs error,it'll cause data 
loss  (was: When split occurs error,it'll cause data loss)






[jira] [Commented] (HBASE-3872) Hole in split transaction rollback; edits to .META. need to be rolled back even if it seems like they didn't make it

2011-10-09 Thread bluedavy (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13123861#comment-13123861
 ] 

bluedavy commented on HBASE-3872:
-

I created HBASE-4562 and HBASE-4563.






[jira] [Updated] (HBASE-4335) Splits can create temporary holes in .META. that confuse clients and regionservers

2011-10-09 Thread Lars Hofhansl (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4335?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-4335:
-

Attachment: 4335-v5.txt

Added license notice to new test.
Do I see some +1's?

> Splits can create temporary holes in .META. that confuse clients and 
> regionservers
> --
>
> Key: HBASE-4335
> URL: https://issues.apache.org/jira/browse/HBASE-4335
> Project: HBase
>  Issue Type: Bug
>  Components: regionserver
>Affects Versions: 0.90.4
>Reporter: Joe Pallas
>Assignee: Lars Hofhansl
>Priority: Critical
> Fix For: 0.92.0
>
> Attachments: 4335-v2.txt, 4335-v3.txt, 4335-v4.txt, 4335-v5.txt, 
> 4335.txt
>
>
> When a SplitTransaction is performed, three updates are done to .META.:
> 1. The parent region is marked as splitting (and hence offline)
> 2. The first daughter region is added (same start key as parent)
> 3. The second daughter region is added (split key is start key)
> (later, the original parent region is deleted, but that's not important to 
> this discussion)
> Steps 2 and 3 are actually done concurrently by 
> SplitTransaction.DaughterOpener threads.  While the master is notified when a 
> split is complete, the only visibility that clients have is whether the 
> daughter regions have appeared in .META.
> If the second daughter is added to .META. first, then .META. will contain the 
> (offline) parent region followed by the second daughter region.  If the 
> client looks up a key that is greater than (or equal to) the split, the 
> client will find the second daughter region and use it.  If the key is less 
> than the split key, the client will find the parent region and see that it is 
> offline, triggering a retry.
> If the first daughter is added to .META. before the second daughter, there is 
> a window during which .META. has a hole: the first daughter effectively hides 
> the parent region (same start key), but there is no entry for the second 
> daughter.  A region lookup will find the first daughter for all keys in the 
> parent's range, but the first daughter does not include keys at or beyond the 
> split key.
> See HBASE-4333 and HBASE-4334 for details on how this causes problems and 
> suggestions for mitigating this in the client and regionserver.
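
The hole described above can be reproduced with a toy catalog, assuming a sorted map keyed only by region start key (the real .META. row keys carry more than that, so this is a simplification). MetaHoleSketch, Region, and locate are hypothetical names for illustration. A lookup takes the entry with the greatest start key at or below the requested key, so once the first daughter is present it hides the parent, and keys at or beyond the split key resolve to a region that does not contain them until the second daughter appears.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical toy model of a region catalog; not HBase code.
class MetaHoleSketch {
  // A region covering the key range [startKey, endKey).
  static final class Region {
    final String name;
    final String startKey;
    final String endKey;
    final boolean offline;

    Region(String name, String startKey, String endKey, boolean offline) {
      this.name = name;
      this.startKey = startKey;
      this.endKey = endKey;
      this.offline = offline;
    }

    boolean contains(String key) {
      return key.compareTo(startKey) >= 0 && key.compareTo(endKey) < 0;
    }
  }

  // Keyed by start key, so a daughter with the parent's start key replaces it.
  private final TreeMap<String, Region> meta = new TreeMap<>();

  void put(Region region) {
    meta.put(region.startKey, region);
  }

  // Returns the region with the greatest start key <= key, or null if none.
  Region locate(String key) {
    Map.Entry<String, Region> entry = meta.floorEntry(key);
    return entry == null ? null : entry.getValue();
  }
}
```

For example, with a parent covering [a, z) split at m: after only the first daughter [a, m) is added, locate("q") returns that daughter even though it does not contain "q"; after the second daughter [m, z) is added, locate("q") returns a region that does.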





[jira] [Updated] (HBASE-4563) When split doing this.parent.close(false) occurs error,it'll cause the splited region cann't write & read

2011-10-09 Thread bluedavy (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4563?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

bluedavy updated HBASE-4563:


Attachment: HBASE-4562&4563.patch






[jira] [Updated] (HBASE-4562) When split doing offlineParentInMeta occurs error,it'll cause data loss

2011-10-09 Thread bluedavy (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

bluedavy updated HBASE-4562:


Attachment: HBASE-4562&4563.patch






[jira] [Updated] (HBASE-4563) When split doing this.parent.close(false) occurs error,it'll cause the splited region cann't write & read

2011-10-09 Thread bluedavy (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4563?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

bluedavy updated HBASE-4563:


Component/s: regionserver

> When split doing this.parent.close(false) occurs error,it'll cause the 
> splited region cann't write & read
> -
>
> Key: HBASE-4563
> URL: https://issues.apache.org/jira/browse/HBASE-4563
> Project: HBase
>  Issue Type: Bug
>  Components: regionserver
>Affects Versions: 0.90.4
>Reporter: bluedavy
>Priority: Blocker
> Fix For: 0.90.5
>
> Attachments: HBASE-4562&4563.patch
>
>
> Follow the steps below to reproduce the problem:
> 1. Change SplitTransaction.java as below to simulate an HDFS error.
> {code:title=SplitTransaction.java|borderStyle=solid}
>   List hstoreFilesToSplit = this.parent.close(false);
>   throw new IOException("some unexpected error in close store files");
> {code}
> 2. Update the regionserver code and restart.
> 3. Create a table and put some data into it.
> 4. Split the table.
> 5. Scan the table; the scan will fail.
> We can fix the bug with the code below:
> {code:title=SplitTransaction.java|borderStyle=solid}
>   List hstoreFilesToSplit = null;
>   try {
>     hstoreFilesToSplit = this.parent.close(false);
>   } catch (IOException e) {
>     this.journal.add(JournalEntry.CLOSED_PARENT_REGION);
>     throw e;
>   }
> {code} 
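
The proposed fix journals CLOSED_PARENT_REGION even when close throws, presumably so that rollback knows it has to bring the parent back online. The sketch below illustrates that reasoning under a stated assumption: the simulated close takes the region offline and then fails. CloseFailureSketch, FakeRegion, and parentOnlineAfterFailedSplit are hypothetical names, not HBase classes or methods.

```java
import java.io.IOException;
import java.util.ArrayDeque;
import java.util.Deque;

final class CloseFailureSketch {
    enum JournalEntry { CLOSED_PARENT_REGION }

    // Hypothetical region: close() takes it offline and then fails (assumed behavior).
    static final class FakeRegion {
        boolean open = true;

        void close() throws IOException {
            open = false;
            throw new IOException("simulated error while closing store files");
        }

        void reopen() {
            open = true;
        }
    }

    // Returns true if the parent is serving again after the failed split attempt.
    static boolean parentOnlineAfterFailedSplit(boolean journalOnCloseFailure) {
        FakeRegion parent = new FakeRegion();
        Deque<JournalEntry> journal = new ArrayDeque<>();
        try {
            try {
                parent.close();
                journal.push(JournalEntry.CLOSED_PARENT_REGION);
            } catch (IOException e) {
                if (journalOnCloseFailure) {
                    // The proposed fix: journal the step even though close() threw.
                    journal.push(JournalEntry.CLOSED_PARENT_REGION);
                }
                throw e;
            }
        } catch (IOException e) {
            // Rollback: undo journaled steps, most recent first.
            while (!journal.isEmpty()) {
                if (journal.pop() == JournalEntry.CLOSED_PARENT_REGION) {
                    parent.reopen();
                }
            }
        }
        return parent.open;
    }
}
```

Without the journal entry the rollback loop has nothing to undo and the parent stays offline, which matches the failing scan in step 5.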





[jira] [Commented] (HBASE-4102) atomicAppend: A put that appends to the latest version of a cell; i.e. reads current value then adds the bytes offered by the client to the tail and writes out a new en

2011-10-09 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13123876#comment-13123876
 ] 

Ted Yu commented on HBASE-4102:
---

{code}
+/**
+ * Testing of HRegion.incrementColumnValue
+ *
+ */
+public class TestAtomicOperation extends HBaseTestCase {
{code}
Javadoc should be updated for the new test.
For Mutation.java:
{code}
+   * @return the number of different families included in this put
+   */
+  public int numFamilies() {
{code}
I think "put" in the javadoc should be "mutation".
For Append.readFields():
{code}
+if (version > APPEND_VERSION) {
+  throw new IOException("version not supported");
{code}
The value of version should be included in the exception message.
For RegionCoprocessorHost.postAppend():
{code}
+   * @param appent Append object
{code}
There is a typo above: appent should be append.
For HRegion.append():
{code}
+  public Result append(Append append, Integer lockid, boolean writeToWAL)
+  throws IOException {
...
+checkRow(row, "increment");
{code}
The second parameter above should be "append".
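
To illustrate the readFields() remark, here is a minimal sketch of a version check whose exception message carries the offending value. VersionCheckSketch, readVersion, messageFor, and the value 1 for APPEND_VERSION are assumptions made for the example; this is not the code from the patch.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;

final class VersionCheckSketch {
    // Hypothetical value; the constant in the patch may differ.
    static final byte APPEND_VERSION = 1;

    // Reads one version byte and rejects versions newer than this reader understands.
    static void readVersion(DataInputStream in) throws IOException {
        int version = in.readByte();
        if (version > APPEND_VERSION) {
            // Include both the received and the supported version in the message.
            throw new IOException("version " + version
                + " not supported (max supported: " + APPEND_VERSION + ")");
        }
    }

    // Helper for trying it out: returns "ok" or the exception message.
    static String messageFor(byte version) {
        try {
            readVersion(new DataInputStream(new ByteArrayInputStream(new byte[] { version })));
            return "ok";
        } catch (IOException e) {
            return e.getMessage();
        }
    }
}
```

With the value in the message, a log line alone is enough to tell which client version sent the request.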


> atomicAppend: A put that appends to the latest version of a cell; i.e. reads 
> current value then adds the bytes offered by the client to the tail and 
> writes out a new entry
> ---
>
> Key: HBASE-4102
> URL: https://issues.apache.org/jira/browse/HBASE-4102
> Project: HBase
>  Issue Type: New Feature
>Reporter: stack
>Assignee: Lars Hofhansl
> Attachments: 4102.txt
>
>
> It's come up a few times that clients want to add to an existing cell rather 
> than make a new cell each time.  At our place, the frontend keeps a list of 
> urls a user has visited -- their md5s -- and updates it as the user progresses.  
> Rather than read, modify client-side, then write the new value back to hbase, it 
> would be sweet if we could do it all in one operation in the hbase server.  TSDB 
> aims to be space efficient.  Rather than pay the cost of the KV wrapper per 
> metric, it would rather have a KV for an interval and in this KV have a value 
> that is all the metrics for the period.
> It could be done as a coprocessor but this feels more like a fundamental 
> feature.
> Benoît suggests that atomicAppend take a flag to indicate whether or not the 
> client wants to see the resulting cell; often a client won't want to see the 
> result and in this case, why pay the price of formulating and delivering a 
> response that the client just drops?





[jira] [Updated] (HBASE-4540) OpenedRegionHandler is not enforcing atomicity of the operation it is performing

2011-10-09 Thread ramkrishna.s.vasudevan (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4540?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ramkrishna.s.vasudevan updated HBASE-4540:
--

Resolution: Fixed
Status: Resolved  (was: Patch Available)

> OpenedRegionHandler is not enforcing atomicity of the operation it is 
> performing
> 
>
> Key: HBASE-4540
> URL: https://issues.apache.org/jira/browse/HBASE-4540
> Project: HBase
>  Issue Type: Bug
>Reporter: ramkrishna.s.vasudevan
>Assignee: ramkrishna.s.vasudevan
> Attachments: HBASE-4540_1.patch
>
>
> -> OpenedRegionHandler has not yet deleted the znode of the region R1 opened 
> by RS1.
> -> RS1 goes down.
> -> ServerShutdownHandler assigns the region R1 to RS2.
> -> The znode of R1 is moved to OFFLINE state by the master, or to OPENING state by 
> RS2 if RS2 has started opening the region.
> -> Now the first OpenedRegionHandler tries to delete the znode, thinking it is 
> in OPENED state, but fails.
> -> Though the delete fails, it removes the node from RIT and adds RS1 as the owner of 
> R1 in master's memory.
> -> Now when RS2 completes opening the region, the master is not able to open 
> the region because the region has already been removed from RIT.
> {code}
> Master
> ==
> 2011-10-05 20:49:45,301 INFO 
> org.apache.hadoop.hbase.master.handler.ServerShutdownHandler: Finished 
> processing of shutdown of linux146,60020,1317827727647
> 2011-10-05 20:49:54,177 DEBUG org.apache.hadoop.hbase.master.HMaster: Not 
> running balancer because 1 region(s) in transition: 
> {3e69d628a8bd8e9b7c5e7a2a6e03aad9=t1,,1317827883842.3e69d628a8bd8e9b7c5e7a2a6e03aad9.
>  state=PENDING_OPEN, ts=1317827985272, server=linux76,60020,1317827746847}
> 2011-10-05 20:49:57,720 DEBUG 
> org.apache.hadoop.hbase.master.AssignmentManager: Handling 
> transition=M_ZK_REGION_OFFLINE, server=linux76,6,1317827742012, 
> region=3e69d628a8bd8e9b7c5e7a2a6e03aad9
> 2011-10-05 20:50:14,501 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: 
> master:6-0x132d3dc13090023 Deleting existing unassigned node for 
> 3e69d628a8bd8e9b7c5e7a2a6e03aad9 that is in expected state RS_ZK_REGION_OPENED
> 2011-10-05 20:50:14,505 WARN org.apache.hadoop.hbase.zookeeper.ZKAssign: 
> master:6-0x132d3dc13090023 Attempting to delete unassigned node 
> 3e69d628a8bd8e9b7c5e7a2a6e03aad9 in RS_ZK_REGION_OPENED state but node is in 
> RS_ZK_REGION_OPENING state
> After the region is opened in RS2
> =
> 2011-10-05 20:50:48,066 DEBUG 
> org.apache.hadoop.hbase.master.AssignmentManager: Handling 
> transition=RS_ZK_REGION_OPENING, server=linux76,60020,1317827746847, 
> region=3e69d628a8bd8e9b7c5e7a2a6e03aad9, which is more than 15 seconds late
> 2011-10-05 20:50:48,290 WARN 
> org.apache.hadoop.hbase.master.AssignmentManager: Received OPENING for region 
> 3e69d628a8bd8e9b7c5e7a2a6e03aad9 from server linux76,60020,1317827746847 but 
> region was in  the state null and not in expected PENDING_OPEN or OPENING 
> states
> 2011-10-05 20:50:53,743 DEBUG 
> org.apache.hadoop.hbase.master.AssignmentManager: Handling 
> transition=RS_ZK_REGION_OPENING, server=linux76,60020,1317827746847, 
> region=3e69d628a8bd8e9b7c5e7a2a6e03aad9
> 2011-10-05 20:50:54,182 DEBUG org.apache.hadoop.hbase.master.CatalogJanitor: 
> Scanned 1 catalog row(s) and gc'd 0 unreferenced parent region(s)
> 2011-10-05 20:50:54,397 WARN 
> org.apache.hadoop.hbase.master.AssignmentManager: Received OPENING for region 
> 3e69d628a8bd8e9b7c5e7a2a6e03aad9 from server linux76,60020,1317827746847 but 
> region was in  the state null and not in expected PENDING_OPEN or OPENING 
> states
> {code}
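
The sequence above comes down to two actions that are not tied together: deleting the znode and updating the master's in-memory state. One way to picture an atomic version is to update the bookkeeping only when the conditional delete succeeded. The sketch below does that with plain Java maps. It is an illustration only, not the HBASE-4540 patch (the patch is not shown in this thread), and OpenedHandlerSketch and all of its members are hypothetical.

```java
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

final class OpenedHandlerSketch {
    // Hypothetical stand-ins: znode state per region, regions in transition (RIT), region owners.
    final Map<String, String> znodes = new ConcurrentHashMap<>();
    final Set<String> regionsInTransition = ConcurrentHashMap.newKeySet();
    final Map<String, String> owners = new ConcurrentHashMap<>();

    // Deletes the znode only if it exists and is still in the expected state.
    boolean deleteNodeIfInState(String region, String expectedState) {
        return znodes.remove(region, expectedState);
    }

    // Updates the master's bookkeeping only when the conditional delete succeeded.
    boolean handleOpened(String region, String server) {
        if (!deleteNodeIfInState(region, "OPENED")) {
            // Another assignment is in flight: leave RIT and the owner untouched.
            return false;
        }
        regionsInTransition.remove(region);
        owners.put(region, server);
        return true;
    }
}
```

In this sketch the same guard also covers a znode that is already gone, because the conditional remove returns false in that case as well.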





[jira] [Resolved] (HBASE-4539) OpenedRegionHandler racing with itself in ServerShutDownhandler flow leading to HMaster abort

2011-10-09 Thread ramkrishna.s.vasudevan (Resolved) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4539?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ramkrishna.s.vasudevan resolved HBASE-4539.
---

Resolution: Fixed

Fixed as part of HBASE-4540

> OpenedRegionHandler racing with itself in ServerShutDownhandler flow leading 
> to HMaster abort
> -
>
> Key: HBASE-4539
> URL: https://issues.apache.org/jira/browse/HBASE-4539
> Project: HBase
>  Issue Type: Bug
>Reporter: ramkrishna.s.vasudevan
>Assignee: ramkrishna.s.vasudevan
>
> Steps to reproduce
> ==
> -> Region R1 is being opened in RS1.
> -> After moving the znode to OPENED, RS1 goes down.
> -> If ServerShutdownHandler assigns the region to RS2 before the first 
> OpenedRegionHandler executor deletes the znode, RS2 transitions the node to 
> OPENED and the OpenedRegionHandler executor for RS2 deletes the znode.
> -> When the first OpenedRegionHandler then tries to delete the znode, it throws 
> a NoNode exception, which causes the HMaster to abort.





[jira] [Updated] (HBASE-4540) OpenedRegionHandler is not enforcing atomicity of the operation it is performing

2011-10-09 Thread ramkrishna.s.vasudevan (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4540?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ramkrishna.s.vasudevan updated HBASE-4540:
--

Fix Version/s: 0.92.0

> OpenedRegionHandler is not enforcing atomicity of the operation it is 
> performing
> 
>
> Key: HBASE-4540
> URL: https://issues.apache.org/jira/browse/HBASE-4540
> Project: HBase
>  Issue Type: Bug
>Reporter: ramkrishna.s.vasudevan
>Assignee: ramkrishna.s.vasudevan
> Fix For: 0.92.0
>
> Attachments: HBASE-4540_1.patch
>
>
> -> OpenedRegionHandler has not yet deleted the znode of the region R1 opened 
> by RS1.
> -> RS1 goes down.
> -> ServerShutdownHandler assigns the region R1 to RS2.
> -> The znode of R1 is moved to OFFLINE state by the master, or to OPENING state by 
> RS2 if RS2 has started opening the region.
> -> Now the first OpenedRegionHandler tries to delete the znode, thinking it is 
> in OPENED state, but fails.
> -> Though the delete fails, it removes the node from RIT and adds RS1 as the owner of 
> R1 in master's memory.
> -> Now when RS2 completes opening the region, the master is not able to open 
> the region because the region has already been removed from RIT.
> {code}
> Master
> ==
> 2011-10-05 20:49:45,301 INFO 
> org.apache.hadoop.hbase.master.handler.ServerShutdownHandler: Finished 
> processing of shutdown of linux146,60020,1317827727647
> 2011-10-05 20:49:54,177 DEBUG org.apache.hadoop.hbase.master.HMaster: Not 
> running balancer because 1 region(s) in transition: 
> {3e69d628a8bd8e9b7c5e7a2a6e03aad9=t1,,1317827883842.3e69d628a8bd8e9b7c5e7a2a6e03aad9.
>  state=PENDING_OPEN, ts=1317827985272, server=linux76,60020,1317827746847}
> 2011-10-05 20:49:57,720 DEBUG 
> org.apache.hadoop.hbase.master.AssignmentManager: Handling 
> transition=M_ZK_REGION_OFFLINE, server=linux76,6,1317827742012, 
> region=3e69d628a8bd8e9b7c5e7a2a6e03aad9
> 2011-10-05 20:50:14,501 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: 
> master:6-0x132d3dc13090023 Deleting existing unassigned node for 
> 3e69d628a8bd8e9b7c5e7a2a6e03aad9 that is in expected state RS_ZK_REGION_OPENED
> 2011-10-05 20:50:14,505 WARN org.apache.hadoop.hbase.zookeeper.ZKAssign: 
> master:6-0x132d3dc13090023 Attempting to delete unassigned node 
> 3e69d628a8bd8e9b7c5e7a2a6e03aad9 in RS_ZK_REGION_OPENED state but node is in 
> RS_ZK_REGION_OPENING state
> After the region is opened in RS2
> =
> 2011-10-05 20:50:48,066 DEBUG 
> org.apache.hadoop.hbase.master.AssignmentManager: Handling 
> transition=RS_ZK_REGION_OPENING, server=linux76,60020,1317827746847, 
> region=3e69d628a8bd8e9b7c5e7a2a6e03aad9, which is more than 15 seconds late
> 2011-10-05 20:50:48,290 WARN 
> org.apache.hadoop.hbase.master.AssignmentManager: Received OPENING for region 
> 3e69d628a8bd8e9b7c5e7a2a6e03aad9 from server linux76,60020,1317827746847 but 
> region was in  the state null and not in expected PENDING_OPEN or OPENING 
> states
> 2011-10-05 20:50:53,743 DEBUG 
> org.apache.hadoop.hbase.master.AssignmentManager: Handling 
> transition=RS_ZK_REGION_OPENING, server=linux76,60020,1317827746847, 
> region=3e69d628a8bd8e9b7c5e7a2a6e03aad9
> 2011-10-05 20:50:54,182 DEBUG org.apache.hadoop.hbase.master.CatalogJanitor: 
> Scanned 1 catalog row(s) and gc'd 0 unreferenced parent region(s)
> 2011-10-05 20:50:54,397 WARN 
> org.apache.hadoop.hbase.master.AssignmentManager: Received OPENING for region 
> 3e69d628a8bd8e9b7c5e7a2a6e03aad9 from server linux76,60020,1317827746847 but 
> region was in  the state null and not in expected PENDING_OPEN or OPENING 
> states
> {code}





[jira] [Updated] (HBASE-4539) OpenedRegionHandler racing with itself in ServerShutDownhandler flow leading to HMaster abort

2011-10-09 Thread ramkrishna.s.vasudevan (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4539?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ramkrishna.s.vasudevan updated HBASE-4539:
--

Fix Version/s: 0.92.0

> OpenedRegionHandler racing with itself in ServerShutDownhandler flow leading 
> to HMaster abort
> -
>
> Key: HBASE-4539
> URL: https://issues.apache.org/jira/browse/HBASE-4539
> Project: HBase
>  Issue Type: Bug
>Reporter: ramkrishna.s.vasudevan
>Assignee: ramkrishna.s.vasudevan
> Fix For: 0.92.0
>
>
> Steps to reproduce
> ==
> -> Region R1 is being opened in RS1.
> -> After moving the znode to OPENED, RS1 goes down.
> -> If ServerShutdownHandler assigns the region to RS2 before the first 
> OpenedRegionHandler executor deletes the znode, RS2 transitions the node to 
> OPENED and the OpenedRegionHandler executor for RS2 deletes the znode.
> -> When the first OpenedRegionHandler then tries to delete the znode, it throws 
> a NoNode exception, which causes the HMaster to abort.





[jira] [Commented] (HBASE-4562) When split doing offlineParentInMeta occurs error,it'll cause data loss

2011-10-09 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13123881#comment-13123881
 ] 

Ted Yu commented on HBASE-4562:
---

Normally one patch should fix one problem.
{code}
+} catch(IOException e){
+  throw e;
{code}
I saw the above code twice in the patch.
Can you explain why it is needed?

Please also run the test suite and tell us the results.
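
For context on the question above: a catch block that only rethrows the same exception does not change what the caller observes. The small sketch below compares the two forms. RethrowSketch and its methods are hypothetical and unrelated to the patch.

```java
import java.io.IOException;

final class RethrowSketch {
    // Always fails, standing in for any call that can throw IOException.
    static void failing() throws IOException {
        throw new IOException("boom");
    }

    // Catch-and-rethrow with no other work in the catch block.
    static void withRethrow() throws IOException {
        try {
            failing();
        } catch (IOException e) {
            throw e;
        }
    }

    // The same call without any try/catch.
    static void withoutTry() throws IOException {
        failing();
    }

    // Describes what a caller observes for either variant.
    static String outcome(boolean rethrow) {
        try {
            if (rethrow) {
                withRethrow();
            } else {
                withoutTry();
            }
            return "no exception";
        } catch (IOException e) {
            return e.getClass().getSimpleName() + ": " + e.getMessage();
        }
    }
}
```

Both variants surface the same exception type and message, so the catch only earns its place when it does extra work before rethrowing.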

> When split doing offlineParentInMeta occurs error,it'll cause data loss
> ---
>
> Key: HBASE-4562
> URL: https://issues.apache.org/jira/browse/HBASE-4562
> Project: HBase
>  Issue Type: Bug
>  Components: regionserver
>Affects Versions: 0.90.4
>Reporter: bluedavy
>Priority: Blocker
> Fix For: 0.90.5
>
> Attachments: HBASE-4562&4563.patch
>
>
> Follow the steps below to reproduce the problem:
> 1. Change SplitTransaction.java as below to simulate a timeout error.
> {code:title=SplitTransaction.java|borderStyle=solid}
>   if (!testing) {
>     MetaEditor.offlineParentInMeta(server.getCatalogTracker(),
>       this.parent.getRegionInfo(), a.getRegionInfo(), b.getRegionInfo());
>     throw new IOException("some unexpected error in split");
>   }
> {code}
> 2. Update the regionserver code and restart.
> 3. Create a table and put some data into it.
> 4. Split the table.
> 5. Kill the regionserver hosting the table.
> 6. Wait until the master's ServerShutdownHandler.process has run, then 
> scan the table; you'll find that the data written earlier is lost.
> We can fix the bug with the code below:
> {code:title=SplitTransaction.java|borderStyle=solid}
>   this.journal.add(JournalEntry.PONR);
>   if (!testing) {
>     MetaEditor.offlineParentInMeta(server.getCatalogTracker(),
>       this.parent.getRegionInfo(), a.getRegionInfo(), b.getRegionInfo());
>     throw new IOException("some unexpected error in split");
>   }
> {code}
> {code:title=CompactSplitThread.java|borderStyle=solid}
>   if (st.rollback(this.server, this.server)) {
>     LOG.info("Successful rollback of failed split of " +
>       parent.getRegionNameAsString());
>   } else {
>     this.server.abort("Abort; we got an error after point-of-no-return");
>   }
> {code}





[jira] [Commented] (HBASE-4335) Splits can create temporary holes in .META. that confuse clients and regionservers

2011-10-09 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13123882#comment-13123882
 ] 

Ted Yu commented on HBASE-4335:
---

@Lars:
You will get +1's once you report that the test suite passes.

> Splits can create temporary holes in .META. that confuse clients and 
> regionservers
> --
>
> Key: HBASE-4335
> URL: https://issues.apache.org/jira/browse/HBASE-4335
> Project: HBase
>  Issue Type: Bug
>  Components: regionserver
>Affects Versions: 0.90.4
>Reporter: Joe Pallas
>Assignee: Lars Hofhansl
>Priority: Critical
> Fix For: 0.92.0
>
> Attachments: 4335-v2.txt, 4335-v3.txt, 4335-v4.txt, 4335-v5.txt, 
> 4335.txt
>
>
> When a SplitTransaction is performed, three updates are done to .META.:
> 1. The parent region is marked as splitting (and hence offline)
> 2. The first daughter region is added (same start key as parent)
> 3. The second daughter region is added (split key is start key)
> (later, the original parent region is deleted, but that's not important to 
> this discussion)
> Steps 2 and 3 are actually done concurrently by 
> SplitTransaction.DaughterOpener threads.  While the master is notified when a 
> split is complete, the only visibility that clients have is whether the 
> daughter regions have appeared in .META.
> If the second daughter is added to .META. first, then .META. will contain the 
> (offline) parent region followed by the second daughter region.  If the 
> client looks up a key that is greater than (or equal to) the split, the 
> client will find the second daughter region and use it.  If the key is less 
> than the split key, the client will find the parent region and see that it is 
> offline, triggering a retry.
> If the first daughter is added to .META. before the second daughter, there is 
> a window during which .META. has a hole: the first daughter effectively hides 
> the parent region (same start key), but there is no entry for the second 
> daughter.  A region lookup will find the first daughter for all keys in the 
> parent's range, but the first daughter does not include keys at or beyond the 
> split key.
> See HBASE-4333 and HBASE-4334 for details on how this causes problems and 
> suggestions for mitigating this in the client and regionserver.
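
The two orderings described above can be reproduced with a toy model of .META.: a sorted map from start key to region, where a lookup takes the entry with the greatest start key at or below the requested key. This is a deliberate simplification (real .META. rows are keyed by full region name), and MetaHoleSketch, Region, and lookup are hypothetical names.

```java
import java.util.Map;
import java.util.TreeMap;

final class MetaHoleSketch {
    // Hypothetical region record covering [startKey, endKey).
    static final class Region {
        final String name;
        final String startKey;
        final String endKey;
        final boolean offline;

        Region(String name, String startKey, String endKey, boolean offline) {
            this.name = name;
            this.startKey = startKey;
            this.endKey = endKey;
            this.offline = offline;
        }

        boolean contains(String key) {
            return key.compareTo(startKey) >= 0 && key.compareTo(endKey) < 0;
        }
    }

    // Simplified .META.: one entry per start key, so a daughter that shares
    // the parent's start key hides the parent.
    private final TreeMap<String, Region> meta = new TreeMap<>();

    void add(Region r) {
        meta.put(r.startKey, r);
    }

    // Finds the region with the greatest start key <= key and reports what a client would see.
    String lookup(String key) {
        Map.Entry<String, Region> e = meta.floorEntry(key);
        if (e == null) {
            return "no region";
        }
        Region r = e.getValue();
        if (r.offline) {
            return "retry (" + r.name + " offline)";
        }
        if (!r.contains(key)) {
            return "hole (" + r.name + " does not cover key)";
        }
        return r.name;
    }
}
```

Adding the offline parent [a, z) and then only the second daughter [m, z) gives a retry for key c and a hit for key q. Adding the parent and then only the first daughter [a, m) gives a hit for c but a hole for q, which is the window the description warns about.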





[jira] [Updated] (HBASE-4562) When split doing offlineParentInMeta occurs error,it'll cause data loss

2011-10-09 Thread bluedavy (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

bluedavy updated HBASE-4562:


Attachment: (was: HBASE-4562&4563.patch)

> When split doing offlineParentInMeta occurs error,it'll cause data loss
> ---
>
> Key: HBASE-4562
> URL: https://issues.apache.org/jira/browse/HBASE-4562
> Project: HBase
>  Issue Type: Bug
>  Components: regionserver
>Affects Versions: 0.90.4
>Reporter: bluedavy
>Priority: Blocker
> Fix For: 0.90.5
>
>
> Follow the steps below to reproduce the problem:
> 1. Change SplitTransaction.java as below to simulate a timeout error.
> {code:title=SplitTransaction.java|borderStyle=solid}
>   if (!testing) {
>     MetaEditor.offlineParentInMeta(server.getCatalogTracker(),
>       this.parent.getRegionInfo(), a.getRegionInfo(), b.getRegionInfo());
>     throw new IOException("some unexpected error in split");
>   }
> {code}
> 2. Update the regionserver code and restart.
> 3. Create a table and put some data into it.
> 4. Split the table.
> 5. Kill the regionserver hosting the table.
> 6. Wait until the master's ServerShutdownHandler.process has run, then 
> scan the table; you'll find that the data written earlier is lost.
> We can fix the bug with the code below:
> {code:title=SplitTransaction.java|borderStyle=solid}
>   this.journal.add(JournalEntry.PONR);
>   if (!testing) {
>     MetaEditor.offlineParentInMeta(server.getCatalogTracker(),
>       this.parent.getRegionInfo(), a.getRegionInfo(), b.getRegionInfo());
>     throw new IOException("some unexpected error in split");
>   }
> {code}
> {code:title=CompactSplitThread.java|borderStyle=solid}
>   if (st.rollback(this.server, this.server)) {
>     LOG.info("Successful rollback of failed split of " +
>       parent.getRegionNameAsString());
>   } else {
>     this.server.abort("Abort; we got an error after point-of-no-return");
>   }
> {code}





[jira] [Updated] (HBASE-4563) When split doing this.parent.close(false) occurs error,it'll cause the splited region cann't write & read

2011-10-09 Thread bluedavy (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4563?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

bluedavy updated HBASE-4563:


Attachment: (was: HBASE-4562&4563.patch)

> When split doing this.parent.close(false) occurs error,it'll cause the 
> splited region cann't write & read
> -
>
> Key: HBASE-4563
> URL: https://issues.apache.org/jira/browse/HBASE-4563
> Project: HBase
>  Issue Type: Bug
>  Components: regionserver
>Affects Versions: 0.90.4
>Reporter: bluedavy
>Priority: Blocker
> Fix For: 0.90.5
>
>
> Follow the steps below to reproduce the problem:
> 1. Change SplitTransaction.java as below to simulate an HDFS error.
> {code:title=SplitTransaction.java|borderStyle=solid}
>   List hstoreFilesToSplit = this.parent.close(false);
>   throw new IOException("some unexpected error in close store files");
> {code}
> 2. Update the regionserver code and restart.
> 3. Create a table and put some data into it.
> 4. Split the table.
> 5. Scan the table; the scan will fail.
> We can fix the bug with the code below:
> {code:title=SplitTransaction.java|borderStyle=solid}
>   List hstoreFilesToSplit = null;
>   try {
>     hstoreFilesToSplit = this.parent.close(false);
>   } catch (IOException e) {
>     this.journal.add(JournalEntry.CLOSED_PARENT_REGION);
>     throw e;
>   }
> {code} 
