[jira] [Commented] (HBASE-4562) When split doing offlineParentInMeta encounters error, it'll cause data loss

2011-10-17 Thread bluedavy (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13128685#comment-13128685
 ] 

bluedavy commented on HBASE-4562:
-

the patch-0.90 is for 0.90.4...

 When split doing offlineParentInMeta encounters error, it'll cause data loss
 

 Key: HBASE-4562
 URL: https://issues.apache.org/jira/browse/HBASE-4562
 Project: HBase
  Issue Type: Bug
  Components: regionserver
Affects Versions: 0.90.4
Reporter: bluedavy
Assignee: bluedavy
Priority: Blocker
 Fix For: 0.90.5

 Attachments: HBASE-4562-0.90.patch, HBASE-4562-0.92.patch, 
 HBASE-4562-trunk.patch, test-4562-0.90.txt, test-4562-0.92.txt, 
 test-4562-trunk.txt


 Follow the steps below to reproduce the problem:
 1. Change SplitTransaction.java as below to mock a timeout error.
{code:title=SplitTransaction.java|borderStyle=solid}
    if (!testing) {
      MetaEditor.offlineParentInMeta(server.getCatalogTracker(),
        this.parent.getRegionInfo(), a.getRegionInfo(), b.getRegionInfo());
      // Injected fault: fail right after the parent has been offlined in META.
      throw new IOException("some unexpected error in split");
    }
{code} 
 2. Rebuild the regionserver with this change and restart it.
 3. Create a table and put some data into it.
 4. Split the table.
 5. Kill the regionserver hosting the table.
 6. Wait until the master's ServerShutdownHandler.process has run, then 
 scan the table; the data written earlier is lost.
 The attached patch fixes the bug.
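
The failure mode in the steps above can be modelled without a cluster. The following is a minimal sketch and is not taken from the attached patch: `ToyMeta`, `ToySplit`, and `SplitFaultDemo` are hypothetical names, and the model assumes (as the reproduction suggests) that the catalog edit has already taken effect by the time the caller sees the error, so a rollback that only restores local state leaves the catalog saying the parent is offline.

```java
import java.io.IOException;
import java.util.ArrayDeque;
import java.util.Deque;

// Toy model only: none of these classes exist in HBase.
class SplitFaultDemo {
    // Returns {parentServingLocally, parentOfflineInCatalog} after a faulted
    // split followed by a naive rollback.
    static boolean[] runFaultedSplit() {
        ToyMeta meta = new ToyMeta();
        ToySplit split = new ToySplit(meta);
        try {
            split.execute(true);
        } catch (IOException e) {
            split.naiveRollback();
        }
        return new boolean[] { split.parentServing, meta.parentOffline };
    }

    public static void main(String[] args) {
        boolean[] state = runFaultedSplit();
        System.out.println("parent serving locally: " + state[0]);
        System.out.println("parent offline in catalog: " + state[1]);
    }
}

class ToyMeta {
    boolean parentOffline = false;   // what the catalog records about the parent
}

class ToySplit {
    enum Step { CLOSED_PARENT, OFFLINED_PARENT_IN_META }

    final ToyMeta meta;
    final Deque<Step> journal = new ArrayDeque<>();
    boolean parentServing = true;    // the region server's in-memory view

    ToySplit(ToyMeta meta) {
        this.meta = meta;
    }

    // Runs the split; injectFault mimics the IOException added in step 1.
    void execute(boolean injectFault) throws IOException {
        parentServing = false;
        journal.push(Step.CLOSED_PARENT);

        meta.parentOffline = true;   // the catalog edit lands first
        journal.push(Step.OFFLINED_PARENT_IN_META);
        if (injectFault) {
            // The caller sees a failure even though the edit was applied.
            throw new IOException("some unexpected error in split");
        }
    }

    // True once the catalog edit may have been applied.
    boolean pastPointOfNoReturn() {
        return journal.contains(Step.OFFLINED_PARENT_IN_META);
    }

    // Naive rollback: reopens the parent locally but never repairs the catalog.
    void naiveRollback() {
        while (!journal.isEmpty()) {
            if (journal.pop() == Step.CLOSED_PARENT) {
                parentServing = true;
            }
        }
    }
}
```

In this model the region server and the catalog disagree after the rollback. A check such as `pastPointOfNoReturn()` is one way a transaction could decide that rolling back is no longer safe; whether the real fix works that way is not stated in this thread.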

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-2856) TestAcidGuarantee broken on trunk

2011-10-17 Thread jirapos...@reviews.apache.org (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-2856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13128687#comment-13128687
 ] 

jirapos...@reviews.apache.org commented on HBASE-2856:
--


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/2224/#review2614
---


Amit: Did you rebase before uploading the new patch? That, unfortunately, is 
making it hard to isolate the changes between r6 and r7. Will review tomorrow 
morning.

But I did read your description about the issues you mentioned. 

Regarding (b)-- we had already discussed in person. That makes sense.

And really nice catch on (a) too!! That is indeed subtle and tricky. Super!!!


- Kannan


On 2011-10-15 04:08:41, Amitanand Aiyer wrote:
bq.  
bq.  ---
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/2224/
bq.  ---
bq.  
bq.  (Updated 2011-10-15 04:08:41)
bq.  
bq.  
bq.  Review request for Ted Yu, Michael Stack, Kannan Muthukkaruppan, and Karthik Ranganathan.
bq.  
bq.  
bq.  Summary
bq.  ---
bq.  
bq.  address the 2856 issues by writing the memstoreTS to the disk.
bq.  
bq.  version v11 of the patch.
bq.  
bq.  uploading it here for easier review process.
bq.  
bq.  
bq.  This addresses bug HBASE-2856.
bq.  https://issues.apache.org/jira/browse/HBASE-2856
bq.  
bq.  
bq.  Diffs
bq.  -
bq.  
bq.    /pom.xml 1183581
bq.    /src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileReader.java 1183581
bq.    /src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV1.java 1183581
bq.    /src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java 1183581
bq.    /src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV2.java 1183581
bq.    /src/main/java/org/apache/hadoop/hbase/regionserver/ColumnCount.java 1183581
bq.    /src/main/java/org/apache/hadoop/hbase/regionserver/ColumnTracker.java 1183581
bq.    /src/main/java/org/apache/hadoop/hbase/regionserver/ExplicitColumnTracker.java 1183581
bq.    /src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java 1183581
bq.    /src/main/java/org/apache/hadoop/hbase/regionserver/MemStore.java 1183581
bq.    /src/main/java/org/apache/hadoop/hbase/regionserver/ReadWriteConsistencyControl.java 1183581
bq.    /src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java 1183581
bq.    /src/main/java/org/apache/hadoop/hbase/regionserver/ScanWildcardColumnTracker.java 1183581
bq.    /src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 1183581
bq.    /src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java 1183581
bq.    /src/main/java/org/apache/hadoop/hbase/regionserver/StoreFileScanner.java 1183581
bq.    /src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java 1183581
bq.    /src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java 1183581
bq.    /src/test/java/org/apache/hadoop/hbase/TestAcidGuarantees.java 1183581
bq.    /src/test/java/org/apache/hadoop/hbase/io/hfile/TestCacheOnWrite.java 1183581
bq.    /src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileWriterV2.java 1183581
bq.    /src/test/java/org/apache/hadoop/hbase/regionserver/TestExplicitColumnTracker.java 1183581
bq.    /src/test/java/org/apache/hadoop/hbase/regionserver/TestFSErrorsExposed.java 1183581
bq.    /src/test/java/org/apache/hadoop/hbase/regionserver/TestScanWildcardColumnTracker.java 1183581
bq.    /src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreFile.java 1183581
bq.  
bq.  Diff: https://reviews.apache.org/r/2224/diff
bq.  
bq.  
bq.  Testing
bq.  ---
bq.  
bq.  mvn test
bq.  
bq.  
bq.  Thanks,
bq.  
bq.  Amitanand
bq.  
bq.



 TestAcidGuarantee broken on trunk 
 --

 Key: HBASE-2856
 URL: https://issues.apache.org/jira/browse/HBASE-2856
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.89.20100621
Reporter: ryan rawson
Assignee: Amitanand Aiyer
Priority: Blocker
 Fix For: 0.94.0

 Attachments: 2856-v2.txt, 2856-v3.txt, 2856-v4.txt, 2856-v5.txt, 
 acid.txt


 TestAcidGuarantee has a test whereby it attempts to read a number of columns 
 from a row, and every so often the first column of N is different, when it 
 should be the same.  This is a bug deep inside the scanner whereby the first 
 peek() of a row is done at time T then the rest of the read is done at T+1 
 after a flush, thus the memstoreTS data is lost, and previously 'uncommitted' 
 data becomes committed and flushed to disk.
 One possible solution is 
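
The visibility problem described for HBASE-2856 can be illustrated with a small model. This is a sketch under stated assumptions, not HBase code: `MemstoreTsDemo`, `Cell`, `read`, and `flush` are hypothetical names, a cell is taken to be visible when its write number (memstoreTS) is at or below the reader's read point, and a flush that discards the write number is modelled by resetting it to 0.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: these are not HBase classes.
class MemstoreTsDemo {
    static final class Cell {
        final String column;
        final String value;
        final long memstoreTs;   // write number of the transaction that wrote the cell

        Cell(String column, String value, long memstoreTs) {
            this.column = column;
            this.value = value;
            this.memstoreTs = memstoreTs;
        }
    }

    // A reader sees a cell only if its write number is at or below the read point.
    static List<String> read(List<Cell> cells, long readPoint) {
        List<String> visible = new ArrayList<>();
        for (Cell c : cells) {
            if (c.memstoreTs <= readPoint) {
                visible.add(c.column + "=" + c.value);
            }
        }
        return visible;
    }

    // Copies cells to "disk"; dropping the write number makes every cell look committed.
    static List<Cell> flush(List<Cell> memstore, boolean keepMemstoreTs) {
        List<Cell> onDisk = new ArrayList<>();
        for (Cell c : memstore) {
            onDisk.add(new Cell(c.column, c.value, keepMemstoreTs ? c.memstoreTs : 0L));
        }
        return onDisk;
    }

    // One row: a committed write (number 1) and an in-flight write (number 2).
    static List<Cell> sampleRow() {
        List<Cell> cells = new ArrayList<>();
        cells.add(new Cell("c1", "old", 1));
        cells.add(new Cell("c2", "old", 1));
        cells.add(new Cell("c1", "new", 2));
        cells.add(new Cell("c2", "new", 2));
        return cells;
    }

    public static void main(String[] args) {
        long readPoint = 1;
        System.out.println("before flush:          " + read(sampleRow(), readPoint));
        System.out.println("flush drops memstoreTS: " + read(flush(sampleRow(), false), readPoint));
        System.out.println("flush keeps memstoreTS: " + read(flush(sampleRow(), true), readPoint));
    }
}
```

With the write number dropped, a reader at read point 1 starts seeing the in-flight "new" cells after the flush. Keeping the write number through the flush, which is what the review summary calls writing the memstoreTS to disk, preserves the reader's view in this model.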

[jira] [Commented] (HBASE-4562) When split doing offlineParentInMeta encounters error, it'll cause data loss

2011-10-17 Thread bluedavy (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13128690#comment-13128690
 ] 

bluedavy commented on HBASE-4562:
-

em,OK,I renamed the current patch for 0.90.4.

 When split doing offlineParentInMeta encounters error, it'll cause data loss
 

 Key: HBASE-4562
 URL: https://issues.apache.org/jira/browse/HBASE-4562
 Project: HBase
  Issue Type: Bug
  Components: regionserver
Affects Versions: 0.90.4
Reporter: bluedavy
Assignee: bluedavy
Priority: Blocker
 Fix For: 0.90.5

 Attachments: HBASE-4562-0.90.4.patch, HBASE-4562-0.92.patch, 
 HBASE-4562-trunk.patch, test-4562-0.90.4.txt, test-4562-0.92.txt, 
 test-4562-trunk.txt






[jira] [Updated] (HBASE-4562) When split doing offlineParentInMeta encounters error, it'll cause data loss

2011-10-17 Thread bluedavy (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

bluedavy updated HBASE-4562:


Attachment: (was: HBASE-4562-0.90.patch)

 When split doing offlineParentInMeta encounters error, it'll cause data loss
 

 Key: HBASE-4562
 URL: https://issues.apache.org/jira/browse/HBASE-4562
 Project: HBase
  Issue Type: Bug
  Components: regionserver
Affects Versions: 0.90.4
Reporter: bluedavy
Assignee: bluedavy
Priority: Blocker
 Fix For: 0.90.5

 Attachments: HBASE-4562-0.90.4.patch, HBASE-4562-0.92.patch, 
 HBASE-4562-trunk.patch, test-4562-0.90.4.txt, test-4562-0.92.txt, 
 test-4562-trunk.txt






[jira] [Updated] (HBASE-4562) When split doing offlineParentInMeta encounters error, it'll cause data loss

2011-10-17 Thread bluedavy (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

bluedavy updated HBASE-4562:


Attachment: (was: test-4562-0.90.txt)

 When split doing offlineParentInMeta encounters error, it'll cause data loss
 

 Key: HBASE-4562
 URL: https://issues.apache.org/jira/browse/HBASE-4562
 Project: HBase
  Issue Type: Bug
  Components: regionserver
Affects Versions: 0.90.4
Reporter: bluedavy
Assignee: bluedavy
Priority: Blocker
 Fix For: 0.90.5

 Attachments: HBASE-4562-0.90.4.patch, HBASE-4562-0.92.patch, 
 HBASE-4562-trunk.patch, test-4562-0.90.4.txt, test-4562-0.92.txt, 
 test-4562-trunk.txt






[jira] [Updated] (HBASE-4562) When split doing offlineParentInMeta encounters error, it'll cause data loss

2011-10-17 Thread bluedavy (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

bluedavy updated HBASE-4562:


Attachment: test-4562-0.90.4.txt
HBASE-4562-0.90.4.patch

 When split doing offlineParentInMeta encounters error, it'll cause data loss
 

 Key: HBASE-4562
 URL: https://issues.apache.org/jira/browse/HBASE-4562
 Project: HBase
  Issue Type: Bug
  Components: regionserver
Affects Versions: 0.90.4
Reporter: bluedavy
Assignee: bluedavy
Priority: Blocker
 Fix For: 0.90.5

 Attachments: HBASE-4562-0.90.4.patch, HBASE-4562-0.92.patch, 
 HBASE-4562-trunk.patch, test-4562-0.90.4.txt, test-4562-0.92.txt, 
 test-4562-trunk.txt






[jira] [Commented] (HBASE-4536) Allow CF to retain deleted rows

2011-10-17 Thread jirapos...@reviews.apache.org (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4536?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13128698#comment-13128698
 ] 

jirapos...@reviews.apache.org commented on HBASE-4536:
--


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/2178/#review2616
---



http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java
https://reviews.apache.org/r/2178/#comment5911

Please add 'any' in front of columns and change columns to singular form.
Maybe columns.size should be checked against 0 as well?



http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java
https://reviews.apache.org/r/2178/#comment5912

Much shorter and readable now.


- Ted


On 2011-10-17 05:32:49, Lars Hofhansl wrote:
bq.  
bq.  ---
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/2178/
bq.  ---
bq.  
bq.  (Updated 2011-10-17 05:32:49)
bq.  
bq.  
bq.  Review request for hbase, Ted Yu and Jonathan Gray.
bq.  
bq.  
bq.  Summary
bq.  ---
bq.  
bq.  HBase timerange Gets and Scans allow time travel in HBase, i.e. looking at the state of the data at any point in the past, provided the data is still around.
bq.  This did not work for deletes, however. Deletes would always mask all puts in the past.
bq.  This change adds a flag that can be set on HColumnDescriptor to enable retention of deleted rows.
bq.  These rows are still subject to TTL and/or VERSIONS.
bq.  
bq.  This changes the following:
bq.  1. There is a new flag on HColumnDescriptor enabling that behavior.
bq.  2. Allow gets/scans with a timerange to retrieve rows hidden by a delete marker, if the timerange does not include the delete marker.
bq.  3. Do not unconditionally collect all deleted rows during a compaction.
bq.  4. Allow a raw Scan, which retrieves all delete markers and deleted rows.
bq.  
bq.  The change is small'ish, but the logic is intricate, so please review carefully.
bq.  
bq.  
bq.  This addresses bug HBASE-4536.
bq.  https://issues.apache.org/jira/browse/HBASE-4536
bq.  
bq.  
bq.  Diffs
bq.  -
bq.  
bq.    http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java 1184947
bq.    http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/KeyValue.java 1184947
bq.    http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/Scan.java 1184947
bq.    http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ColumnTracker.java 1184947
bq.    http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ExplicitColumnTracker.java 1184947
bq.    http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java 1184947
bq.    http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanWildcardColumnTracker.java 1184947
bq.    http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 1184947
bq.    http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java 1184947
bq.    http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java 1184947
bq.    http://svn.apache.org/repos/asf/hbase/trunk/src/main/ruby/hbase/admin.rb 1184947
bq.    http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/HBaseTestCase.java 1184947
bq.    http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestCompaction.java 1184947
bq.    http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestExplicitColumnTracker.java 1184947
bq.    http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestKeepDeletes.java PRE-CREATION
bq.    http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestMemStore.java 1184947
bq.    http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestMinVersions.java 1184947
bq.    http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestQueryMatcher.java 1184947
bq.    http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestScanWildcardColumnTracker.java 1184947
bq.
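
The timerange rule in the HBASE-4536 review summary (a delete marker hides older puts unless retention is enabled and the marker falls outside the requested range) can be sketched as follows. This is a toy model, not the code under review: `KeepDeletesDemo`, `Put`, `read`, and `sample` are hypothetical names, the range is assumed to be `[0, maxTs)`, and a marker at timestamp `d` is assumed to cover every put with a timestamp at or below `d`.

```java
import java.util.Arrays;
import java.util.List;

// Toy model only: these are not HBase classes.
class KeepDeletesDemo {
    static final class Put {
        final long ts;
        final String value;

        Put(long ts, String value) {
            this.ts = ts;
            this.value = value;
        }
    }

    // Returns the newest visible value with timestamp below maxTs, or null if none.
    static String read(List<Put> puts, List<Long> deleteMarkers, long maxTs, boolean keepDeleted) {
        Put best = null;
        for (Put p : puts) {
            if (p.ts >= maxTs) {
                continue;                                   // outside the time range
            }
            boolean masked = false;
            for (long d : deleteMarkers) {
                boolean covers = d >= p.ts;                 // marker is at or after the put
                boolean applies = !keepDeleted || d < maxTs; // with retention, only in-range markers mask
                if (covers && applies) {
                    masked = true;
                }
            }
            if (!masked && (best == null || p.ts > best.ts)) {
                best = p;
            }
        }
        return best == null ? null : best.value;
    }

    // One put at ts=10 and one delete marker at ts=20.
    static String sample(long maxTs, boolean keepDeleted) {
        List<Put> puts = Arrays.asList(new Put(10, "v1"));
        List<Long> markers = Arrays.asList(20L);
        return read(puts, markers, maxTs, keepDeleted);
    }

    public static void main(String[] args) {
        System.out.println("range [0,15), no retention: " + sample(15, false));
        System.out.println("range [0,15), retention:    " + sample(15, true));
        System.out.println("range [0,25), retention:    " + sample(25, true));
    }
}
```

Only the second call returns the put: the marker at 20 lies outside `[0, 15)`, so with retention enabled it no longer hides the value written at 10.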

[jira] [Commented] (HBASE-4563) When error occurs in this.parent.close(false) of split, the split region cannot write or read

2011-10-17 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13128709#comment-13128709
 ] 

Hudson commented on HBASE-4563:
---

Integrated in HBase-TRUNK #2329 (See 
[https://builds.apache.org/job/HBase-TRUNK/2329/])
HBASE-4563  When error occurs in this.parent.close(false) of split, the 
split region cannot write or read

larsh : 
Files : 
* /hbase/trunk/CHANGES.txt
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/SplitTransaction.java


 When error occurs in this.parent.close(false) of split, the split region 
 cannot write or read
 -

 Key: HBASE-4563
 URL: https://issues.apache.org/jira/browse/HBASE-4563
 Project: HBase
  Issue Type: Bug
  Components: regionserver
Affects Versions: 0.90.4, 0.92.0
Reporter: bluedavy
Assignee: bluedavy
Priority: Blocker
 Fix For: 0.90.5

 Attachments: HBASE-4563-0.90.patch, HBASE-4563-0.92.patch, 
 HBASE-4563-trunk.patch, test-4563-0.90.txt, test-4563-0.92.txt, 
 test-4563-trunk.txt


 Follow the steps below to reproduce the problem:
 1. Change SplitTransaction.java as below to mock an HDFS error.
{code:title=SplitTransaction.java|borderStyle=solid}
    List<StoreFile> hstoreFilesToSplit = this.parent.close(false);
    // Injected fault: fail right after the parent region has been closed.
    throw new IOException("some unexpected error in close store files");
{code} 
 2. Rebuild the regionserver with this change and restart it.
 3. Create a table and put some data into it.
 4. Split the table.
 5. Scan the table; the scan fails.
 The attached patch fixes the bug.
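
A small model shows why a failure during the close step can leave the region unusable. This is a sketch and not the attached patch: `CloseRollbackDemo` and `ToyRegion` are hypothetical, and the idea of recording the journal entry before attempting the close is an assumption about one possible remedy.

```java
import java.io.IOException;
import java.util.ArrayDeque;
import java.util.Deque;

// Toy model only: these are not HBase classes.
class CloseRollbackDemo {
    static final class ToyRegion {
        boolean open = true;

        // Closes the region, then optionally fails the way the injected fault does.
        void close(boolean failAfterClose) throws IOException {
            open = false;
            if (failAfterClose) {
                throw new IOException("some unexpected error in close store files");
            }
        }

        void reopen() {
            open = true;
        }
    }

    // Returns whether the parent is serving again after a split whose close step failed.
    static boolean servingAfterFailedSplit(boolean journalBeforeClose) {
        ToyRegion parent = new ToyRegion();
        Deque<String> journal = new ArrayDeque<>();
        try {
            if (journalBeforeClose) {
                journal.push("CLOSED_PARENT_REGION");   // recorded before the risky step
            }
            parent.close(true);
            if (!journalBeforeClose) {
                journal.push("CLOSED_PARENT_REGION");   // never reached when close throws
            }
        } catch (IOException e) {
            // Rollback undoes only the steps that were journaled.
            while (!journal.isEmpty()) {
                if ("CLOSED_PARENT_REGION".equals(journal.pop())) {
                    parent.reopen();
                }
            }
        }
        return parent.open;
    }

    public static void main(String[] args) {
        System.out.println("journal after close:  serving = " + servingAfterFailedSplit(false));
        System.out.println("journal before close: serving = " + servingAfterFailedSplit(true));
    }
}
```

When the entry is written only after a successful close, the rollback has nothing to undo and the region stays closed, which matches the symptom in step 5.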





[jira] [Updated] (HBASE-4562) When split doing offlineParentInMeta encounters error, it'll cause data loss

2011-10-17 Thread bluedavy (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

bluedavy updated HBASE-4562:


Attachment: test-4562-0.90.txt
HBASE-4562-0.90.patch

 When split doing offlineParentInMeta encounters error, it'll cause data loss
 

 Key: HBASE-4562
 URL: https://issues.apache.org/jira/browse/HBASE-4562
 Project: HBase
  Issue Type: Bug
  Components: regionserver
Affects Versions: 0.90.4
Reporter: bluedavy
Assignee: bluedavy
Priority: Blocker
 Fix For: 0.90.5

 Attachments: HBASE-4562-0.90.4.patch, HBASE-4562-0.90.patch, 
 HBASE-4562-0.92.patch, HBASE-4562-trunk.patch, test-4562-0.90.4.txt, 
 test-4562-0.90.txt, test-4562-0.92.txt, test-4562-trunk.txt






[jira] [Commented] (HBASE-4562) When split doing offlineParentInMeta encounters error, it'll cause data loss

2011-10-17 Thread bluedavy (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13128713#comment-13128713
 ] 

bluedavy commented on HBASE-4562:
-

@Lars
I attached the patch for the latest 0.90; please apply it again and commit, thanks.

 When split doing offlineParentInMeta encounters error, it'll cause data loss
 

 Key: HBASE-4562
 URL: https://issues.apache.org/jira/browse/HBASE-4562
 Project: HBase
  Issue Type: Bug
  Components: regionserver
Affects Versions: 0.90.4
Reporter: bluedavy
Assignee: bluedavy
Priority: Blocker
 Fix For: 0.90.5

 Attachments: HBASE-4562-0.90.4.patch, HBASE-4562-0.90.patch, 
 HBASE-4562-0.92.patch, HBASE-4562-trunk.patch, test-4562-0.90.4.txt, 
 test-4562-0.90.txt, test-4562-0.92.txt, test-4562-trunk.txt






[jira] [Created] (HBASE-4601) HBase table admin tool

2011-10-17 Thread Thomas Pan (Created) (JIRA)
HBase table admin tool
--

 Key: HBASE-4601
 URL: https://issues.apache.org/jira/browse/HBASE-4601
 Project: HBase
  Issue Type: New Feature
Reporter: Thomas Pan


Similar to the HFile tool, we need an HBase table-level tool to handle 
the following tasks:
1. Balance table-level regions among all the region servers in the cluster. 
Different balancing algorithms should be available to choose from.
2. Create tables with the proper settings
3. Alter existing tables
4. Compact existing tables, whether minor or major
5. Erase existing tables





[jira] [Commented] (HBASE-4528) The put operation can release the rowlock before sync-ing the Hlog

2011-10-17 Thread jirapos...@reviews.apache.org (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13128730#comment-13128730
 ] 

jirapos...@reviews.apache.org commented on HBASE-4528:
--


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/2141/#review2617
---


@Dhruba
Any figures to know the improvement in write performance?

- ramkrishna


On 2011-10-17 04:39:55, Dhruba Borthakur wrote:
bq.  
bq.  ---
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/2141/
bq.  ---
bq.  
bq.  (Updated 2011-10-17 04:39:55)
bq.  
bq.  
bq.  Review request for hbase.
bq.  
bq.  
bq.  Summary
bq.  ---
bq.  
bq.  This changes the multiPut operation so that the sync to the WAL occurs outside the rowlock.
bq.  
bq.  This enhancement is done only to HRegion.mut(Put[]) because this is the only method that gets invoked from an application. HRegion.put(Put) is used only by unit tests and should possibly be deprecated.
bq.  
bq.  
bq.  This addresses bug HBASE-4528.
bq.  https://issues.apache.org/jira/browse/HBASE-4528
bq.  
bq.  
bq.  Diffs
bq.  -
bq.  
bq.    /src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java 1184991
bq.    /src/main/java/org/apache/hadoop/hbase/regionserver/KeyValueSkipListSet.java 1184991
bq.    /src/main/java/org/apache/hadoop/hbase/regionserver/MemStore.java 1184991
bq.    /src/main/java/org/apache/hadoop/hbase/regionserver/ReadWriteConsistencyControl.java 1184991
bq.    /src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 1184991
bq.    /src/main/java/org/apache/hadoop/hbase/regionserver/StoreFlusher.java 1184991
bq.    /src/test/java/org/apache/hadoop/hbase/regionserver/TestParallelPut.java PRE-CREATION
bq.    /src/test/java/org/apache/hadoop/hbase/regionserver/TestStore.java 1184991
bq.    /src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestLogRolling.java 1184991
bq.  
bq.  Diff: https://reviews.apache.org/r/2141/diff
bq.  
bq.  
bq.  Testing
bq.  ---
bq.  
bq.  I ran TestLogRolling over and over again, about 50 times, and it did not fail a single time.
bq.  
bq.  
bq.  Thanks,
bq.  
bq.  Dhruba
bq.  
bq.



 The put operation can release the rowlock before sync-ing the Hlog
 --

 Key: HBASE-4528
 URL: https://issues.apache.org/jira/browse/HBASE-4528
 Project: HBase
  Issue Type: Improvement
  Components: regionserver
Reporter: dhruba borthakur
Assignee: dhruba borthakur
 Fix For: 0.94.0

 Attachments: appendNoSync5.txt, appendNoSyncPut1.txt, 
 appendNoSyncPut2.txt, appendNoSyncPut3.txt, appendNoSyncPut4.txt, 
 appendNoSyncPut5.txt


 This allows for better throughput when there are hot rows. A single row 
 update improves from 100 puts/sec/server to 5000 puts/sec/server.
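
The idea of releasing the row lock before syncing the log can be sketched in a few lines. This is a toy model, not the patch: `SyncOutsideLockDemo` and `ToyWal` are hypothetical, the log is an in-memory list, and the sketch ignores read visibility (a real implementation would presumably also have to keep the edit hidden from readers until the sync completes).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantLock;

// Toy model only: these are not HBase classes.
class SyncOutsideLockDemo {
    // In-memory stand-in for a write-ahead log with separate append and sync steps.
    static final class ToyWal {
        private final List<String> entries = new ArrayList<>();
        private long syncedUpTo = 0;

        synchronized long append(String edit) {
            entries.add(edit);
            return entries.size();          // transaction id of this edit
        }

        synchronized void sync(long txid) {
            if (txid > syncedUpTo) {
                syncedUpTo = txid;          // a real log would wait on disk I/O here
            }
        }

        synchronized long syncedUpTo() {
            return syncedUpTo;
        }
    }

    private final ToyWal wal = new ToyWal();
    private final Map<String, ReentrantLock> rowLocks = new ConcurrentHashMap<>();
    private final Map<String, String> memstore = new ConcurrentHashMap<>();

    // Applies one put and reports whether the row lock was still held at sync time.
    boolean put(String row, String value) {
        ReentrantLock lock = rowLocks.computeIfAbsent(row, r -> new ReentrantLock());
        long txid;
        lock.lock();
        try {
            txid = wal.append(row + "=" + value);   // buffered, not yet durable
            memstore.put(row, value);
        } finally {
            lock.unlock();                          // release before the slow part
        }
        boolean heldDuringSync = lock.isHeldByCurrentThread();
        wal.sync(txid);                             // other writers to this row are not blocked
        return heldDuringSync;
    }

    long durableUpTo() {
        return wal.syncedUpTo();
    }

    public static void main(String[] args) {
        SyncOutsideLockDemo region = new SyncOutsideLockDemo();
        System.out.println("lock held during sync: " + region.put("row1", "a"));
        System.out.println("durable up to txid: " + region.durableUpTo());
    }
}
```

Because the lock covers only the append and the in-memory update, writers to a hot row queue behind a short critical section instead of behind each other's syncs.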





[jira] [Commented] (HBASE-4563) When error occurs in this.parent.close(false) of split, the split region cannot write or read

2011-10-17 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13128742#comment-13128742
 ] 

Hudson commented on HBASE-4563:
---

Integrated in HBase-0.92 #66 (See 
[https://builds.apache.org/job/HBase-0.92/66/])
HBASE-4563  When error occurs in this.parent.close(false) of split, the 
split region cannot write or read

larsh : 
Files : 
* /hbase/branches/0.92/CHANGES.txt
* /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/regionserver/SplitTransaction.java


 When error occurs in this.parent.close(false) of split, the split region 
 cannot write or read
 -

 Key: HBASE-4563
 URL: https://issues.apache.org/jira/browse/HBASE-4563
 Project: HBase
  Issue Type: Bug
  Components: regionserver
Affects Versions: 0.90.4, 0.92.0
Reporter: bluedavy
Assignee: bluedavy
Priority: Blocker
 Fix For: 0.90.5

 Attachments: HBASE-4563-0.90.patch, HBASE-4563-0.92.patch, 
 HBASE-4563-trunk.patch, test-4563-0.90.txt, test-4563-0.92.txt, 
 test-4563-trunk.txt






[jira] [Commented] (HBASE-4486) Improve Javadoc for HTableDescriptor

2011-10-17 Thread ramkrishna.s.vasudevan (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13128747#comment-13128747
 ] 

ramkrishna.s.vasudevan commented on HBASE-4486:
---

{code}
+   * Construct a table descriptor specifying table name.
* @param name Table name.
{code}
We have two constructors. Your explanation clearly says that one constructor 
takes the table name and the other takes the table name as a byte array.
Maybe we can also add
* @param name Table name as a byte array.
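
The suggestion can be shown on a stand-in class. `ToyTableDescriptor` below is hypothetical and is not HTableDescriptor; it only illustrates giving each overloaded constructor a `@param` line that says which form of the name it takes.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical stand-in used only to illustrate the Javadoc wording.
class ToyTableDescriptor {
    private final byte[] name;

    /**
     * Construct a table descriptor specifying the table name.
     * @param name Table name.
     */
    ToyTableDescriptor(String name) {
        this(name.getBytes(StandardCharsets.UTF_8));
    }

    /**
     * Construct a table descriptor specifying the table name as a byte array.
     * @param name Table name as a byte array.
     */
    ToyTableDescriptor(byte[] name) {
        this.name = name.clone();   // defensive copy so callers cannot mutate the name
    }

    /** @return the table name decoded as a UTF-8 string. */
    String getNameAsString() {
        return new String(name, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println(new ToyTableDescriptor("t1").getNameAsString());
    }
}
```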

 Improve Javadoc for HTableDescriptor
 

 Key: HBASE-4486
 URL: https://issues.apache.org/jira/browse/HBASE-4486
 Project: HBase
  Issue Type: Improvement
  Components: client, documentation
Reporter: Akash Ashok
Assignee: Akash Ashok
Priority: Minor
 Attachments: HBase-4486-v2.patch, HBase-4486.patch, 
 HTableDescriptor-v2.html, HTableDescriptor.html








[jira] [Commented] (HBASE-4591) TTL for old HLogs should be calculated from last modification time.

2011-10-17 Thread ramkrishna.s.vasudevan (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13128761#comment-13128761
 ] 

ramkrishna.s.vasudevan commented on HBASE-4591:
---

+1

 TTL for old HLogs should be calculated from last modification time.
 ---

 Key: HBASE-4591
 URL: https://issues.apache.org/jira/browse/HBASE-4591
 Project: HBase
  Issue Type: Improvement
  Components: master
Affects Versions: 0.89.20100621
Reporter: Madhuwanti Vaidya
Assignee: Madhuwanti Vaidya
Priority: Minor







[jira] [Commented] (HBASE-4595) HFilePrettyPrinter Scanned kv count always 0

2011-10-17 Thread ramkrishna.s.vasudevan (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13128764#comment-13128764
 ] 

ramkrishna.s.vasudevan commented on HBASE-4595:
---

+1

 HFilePrettyPrinter Scanned kv count always 0
 

 Key: HBASE-4595
 URL: https://issues.apache.org/jira/browse/HBASE-4595
 Project: HBase
  Issue Type: Bug
  Components: io
Affects Versions: 0.92.0, 0.94.0, 0.92.1
Reporter: Matteo Bertozzi
Assignee: Matteo Bertozzi
Priority: Minor
 Attachments: HBASE-4595.patch


 The count variable used to print the Scanned kv count is never 
 incremented.
 A local count variable in scanKeysValues() method is updated instead.
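The shadowing described above can be illustrated with a minimal sketch. This is not the HFilePrettyPrinter source; the class and method names below are hypothetical and only show how a local variable can hide a field of the same name.

```java
// Hypothetical sketch, not the actual HFilePrettyPrinter source.
class ScanCountSketch {
    // Field that the summary line would report.
    private int count = 0;

    // Buggy variant: a local variable shadows the field, so the field stays 0.
    void scanShadowed(String[] keyValues) {
        int count = 0; // shadows this.count
        for (String kv : keyValues) {
            count++;
        }
    }

    // Fixed variant: increment the field directly.
    void scanFixed(String[] keyValues) {
        for (String kv : keyValues) {
            count++;
        }
    }

    int scannedCount() {
        return count;
    }
}
```

After scanShadowed() the reported count is still 0, which matches the symptom in the issue title; after scanFixed() it equals the number of entries scanned.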





[jira] [Assigned] (HBASE-4462) Properly treating SocketTimeoutException

2011-10-17 Thread ramkrishna.s.vasudevan (Assigned) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4462?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ramkrishna.s.vasudevan reassigned HBASE-4462:
-

Assignee: ramkrishna.s.vasudevan

 Properly treating SocketTimeoutException
 

 Key: HBASE-4462
 URL: https://issues.apache.org/jira/browse/HBASE-4462
 Project: HBase
  Issue Type: Improvement
Affects Versions: 0.90.4
Reporter: Jean-Daniel Cryans
Assignee: ramkrishna.s.vasudevan
 Fix For: 0.90.5


 SocketTimeoutException is currently treated like any IOE inside of 
 HCM.getRegionServerWithRetries and I think this is a problem. This method 
 should only do retries in cases where we are pretty sure the operation will 
 complete, but with STE we already waited for (by default) 60 seconds and 
 nothing happened.
 I found this while debugging Douglas Campbell's problem on the mailing list 
 where it seemed like he was using the same scanner from multiple threads, but 
 actually it was just the same client doing retries while the first run didn't 
 even finish yet (that's another problem). You could see the first scanner, 
 then up to two other handlers waiting for it to finish in order to run 
 (because of the synchronization on RegionScanner).
 So what should we do? We could treat STE as a DoNotRetryException and let the 
 client deal with it, or we could retry only once.
 There's also the option of having a different behavior for get/put/icv/scan, 
 the issue with operations that modify a cell is that you don't know if the 
 operation completed or not (same when a RS dies hard after completing let's 
 say a Put but just before returning to the client).
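One of the options discussed above, treating SocketTimeoutException like a do-not-retry error, could look roughly like the sketch below. It is not HCM.getRegionServerWithRetries; RetrySketch and callWithRetries are hypothetical names, and only standard JDK classes are used.

```java
import java.io.IOException;
import java.net.SocketTimeoutException;
import java.util.concurrent.Callable;

// Hypothetical sketch, not HConnectionManager code.
class RetrySketch {
    // Runs op up to maxAttempts times. Ordinary IOExceptions are retried, but a
    // SocketTimeoutException is rethrown at once: the caller has already waited
    // a full timeout and cannot tell whether the operation ran on the server.
    static <T> T callWithRetries(Callable<T> op, int maxAttempts) throws Exception {
        IOException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return op.call();
            } catch (SocketTimeoutException ste) {
                throw ste; // treated like a do-not-retry error
            } catch (IOException ioe) {
                last = ioe; // assumed transient: try again
            }
        }
        if (last == null) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        throw last;
    }
}
```

The "retry only once" alternative would replace the immediate rethrow with a counter. Whether either policy is right for operations that modify a cell is exactly the open question in the description.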





[jira] [Work started] (HBASE-4602) Make the suite run in at least half the time

2011-10-17 Thread nkeywal (Work started) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4602?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HBASE-4602 started by nkeywal.

 Make the suite run in at least half the time
 

 Key: HBASE-4602
 URL: https://issues.apache.org/jira/browse/HBASE-4602
 Project: HBase
  Issue Type: Umbrella
 Environment: All.
Reporter: nkeywal
Assignee: nkeywal

 - Cutting down on the number of cluster spinups by coalescing related tests 
 rather than have each spin up its own cluster
 - Make cluster start/stop faster
 - Rewriting long-running tests so they do not need to be run on a cluster; 
 e.g. by instead mocking expected signals/messages
 - Move long running tests out of the unit test suite to instead run as part 
 of the recently introduced 'integration test' step





[jira] [Created] (HBASE-4602) Make the suite run in at least half the time

2011-10-17 Thread nkeywal (Created) (JIRA)
Make the suite run in at least half the time


 Key: HBASE-4602
 URL: https://issues.apache.org/jira/browse/HBASE-4602
 Project: HBase
  Issue Type: Umbrella
 Environment: All.
Reporter: nkeywal
Assignee: nkeywal


- Cutting down on the number of cluster spinups by coalescing related tests 
rather than have each spin up its own cluster
- Make cluster start/stop faster
- Rewriting long-running tests so they do not need to be run on a cluster; e.g. 
by instead mocking expected signals/messages
- Move long running tests out of the unit test suite to instead run as part of 
the recently introduced 'integration test' step





[jira] [Updated] (HBASE-4602) Make the suite run in at least half the time

2011-10-17 Thread nkeywal (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4602?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-4602:
---

Attachment: tests_list.xlsx

Test list and execution time, as of 14th oct. 2011

 Make the suite run in at least half the time
 

 Key: HBASE-4602
 URL: https://issues.apache.org/jira/browse/HBASE-4602
 Project: HBase
  Issue Type: Umbrella
 Environment: All.
Reporter: nkeywal
Assignee: nkeywal
 Attachments: tests_list.xlsx


 - Cutting down on the number of cluster spinups by coalescing related tests 
 rather than have each spin up its own cluster
 - Make cluster start/stop faster
 - Rewriting long-running tests so they do not need to be run on a cluster; 
 e.g. by instead mocking expected signals/messages
 - Move long running tests out of the unit test suite to instead run as part 
 of the recently introduced 'integration test' step





[jira] [Created] (HBASE-4603) Uneeded sleep time for tests in hbase.master.ServerManager#waitForRegionServers

2011-10-17 Thread nkeywal (Created) (JIRA)
Uneeded sleep time for tests in hbase.master.ServerManager#waitForRegionServers
---

 Key: HBASE-4603
 URL: https://issues.apache.org/jira/browse/HBASE-4603
 Project: HBase
  Issue Type: Improvement
  Components: test
Affects Versions: 0.92.0
 Environment: all.
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor


This function waits for at least 2 times 
hbase.master.wait.on.regionservers.interval, which defaults to 3 seconds, i.e. 6 
seconds for every mini HBase cluster start.

In the context of a mini cluster, this is not useful, as the region servers are 
created locally.

Changing this to a lower value such as 100ms saves 5.8 seconds per HBase cluster 
start. It should lower the build time on the Apache server by more than 8%.

Being more aggressive (removing all the wait time) could be possible as well. 
To be studied later.
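The 5.8-second figure follows from the two intervals quoted above. A small sketch of the arithmetic only, with hypothetical names that do not come from ServerManager:

```java
// Hypothetical sketch of the arithmetic only; names are not from HBase.
class WaitSavingSketch {
    // Minimum wait when the master sleeps `checks` times for `intervalMs` each.
    static long minimumWaitMs(long intervalMs, int checks) {
        return intervalMs * checks;
    }

    public static void main(String[] args) {
        long before = minimumWaitMs(3000, 2); // default interval of 3 seconds
        long after = minimumWaitMs(100, 2);   // proposed interval of 100 ms
        System.out.println("saved ms per cluster start: " + (before - after));
    }
}
```

With two waits per start, 2 x 3000 ms minus 2 x 100 ms gives the 5800 ms saving mentioned in the description.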





[jira] [Updated] (HBASE-4603) Uneeded sleep time for tests in hbase.master.ServerManager#waitForRegionServers

2011-10-17 Thread nkeywal (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4603?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-4603:
---

Attachment: 20111017_4603_MiniHBaseCluster.patch

Fix by setting the value of hbase.master.wait.on.regionservers.interval in 
the MiniHBaseCluster class.

 Uneeded sleep time for tests in 
 hbase.master.ServerManager#waitForRegionServers
 ---

 Key: HBASE-4603
 URL: https://issues.apache.org/jira/browse/HBASE-4603
 Project: HBase
  Issue Type: Improvement
  Components: test
Affects Versions: 0.92.0
 Environment: all.
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor
 Attachments: 20111017_4603_MiniHBaseCluster.patch


 This function waits for at least 2 times 
 hbase.master.wait.on.regionservers.interval, which defaults to 3 seconds, i.e. 6 
 seconds for every mini HBase cluster start.
 In the context of a mini cluster, this is not useful, as the region servers are 
 created locally.
 Changing this to a lower value such as 100ms saves 5.8 seconds per HBase 
 cluster start. It should lower the build time on the Apache server by more 
 than 8%.
 Being more aggressive (removing all the wait time) could be possible as 
 well. To be studied later.





[jira] [Commented] (HBASE-4570) Scan ACID problem with concurrent puts.

2011-10-17 Thread Jonathan Hsieh (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13128867#comment-13128867
 ] 

Jonathan Hsieh commented on HBASE-4570:
---

@stack I've done testing on trunk and a 0.90 branch, and the symptoms 
encountered with the testing programs are fixed.  It would be great to get this 
into 0.90, 0.92 and trunk.  Thanks!

 Scan ACID problem with concurrent puts.
 ---

 Key: HBASE-4570
 URL: https://issues.apache.org/jira/browse/HBASE-4570
 Project: HBase
  Issue Type: Bug
  Components: client, regionserver
Affects Versions: 0.90.1, 0.90.3
Reporter: Jonathan Hsieh
 Attachments: 4570-instrumentation.tgz, hbase-4570.tgz, hbase-4570.txt


 When scanning a table sometimes rows that have multiple column families get 
 split into two rows if there are concurrent writes.  In this particular case 
 we are overwriting the contents of a Get directly back onto itself as a Put.
 For example, this is a two cf row (with f1, f2, .. f9 cfs).  It is 
 actually returned as two rows (#55 and #56). Interestingly if the two were 
 merged we would have a single proper row.
 Row row024461 had time stamps: [55: 
 keyvalues={row024461/f0:data/1318200440867/Put/vlen=1000, 
 row024461/f0:qual/1318200440867/Put/vlen=10, 
 row024461/f1:data/1318200440867/Put/vlen=1000, 
 row024461/f1:qual/1318200440867/Put/vlen=10, 
 row024461/f2:data/1318200440867/Put/vlen=1000, 
 row024461/f2:qual/1318200440867/Put/vlen=10, 
 row024461/f3:data/1318200440867/Put/vlen=1000, 
 row024461/f3:qual/1318200440867/Put/vlen=10, 
 row024461/f4:data/1318200440867/Put/vlen=1000, 
 row024461/f4:qual/1318200440867/Put/vlen=10}, 
 56: keyvalues={row024461/f5:data/1318200440867/Put/vlen=1000, 
 row024461/f5:qual/1318200440867/Put/vlen=10, 
 row024461/f6:data/1318200440867/Put/vlen=1000, 
 row024461/f6:qual/1318200440867/Put/vlen=10, 
 row024461/f7:data/1318200440867/Put/vlen=1000, 
 row024461/f7:qual/1318200440867/Put/vlen=10, 
 row024461/f8:data/1318200440867/Put/vlen=1000, 
 row024461/f8:qual/1318200440867/Put/vlen=10, 
 row024461/f9:data/1318200440867/Put/vlen=1000, 
 row024461/f9:qual/1318200440867/Put/vlen=10}]
 I've only tested this on 0.90.1+patches and 0.90.3+patches, but it is 
 consistent and duplicatable.





[jira] [Created] (HBASE-4604) hbase.client.TestHTablePool could start a single cluster instead of one per method

2011-10-17 Thread nkeywal (Created) (JIRA)
hbase.client.TestHTablePool could start a single cluster instead of one per 
method
--

 Key: HBASE-4604
 URL: https://issues.apache.org/jira/browse/HBASE-4604
 Project: HBase
  Issue Type: Improvement
  Components: test
Affects Versions: 0.92.0
 Environment: all
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor


This test starts/stops one cluster per method, while it would be possible to 
start it once for the whole class.

Using a single cluster allows the test to take 20s instead of 175s (measured 
after HBASE-4603; the difference was much larger before).
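The difference between the two setups can be sketched without a real cluster. FakeCluster below is a hypothetical stand-in that only counts starts. In JUnit 4 terms, the change corresponds to moving cluster startup from a per-method @Before method to a static @BeforeClass method.

```java
// Hypothetical sketch: FakeCluster stands in for a mini cluster.
class ClusterReuseSketch {
    static int starts = 0;

    static class FakeCluster {
        FakeCluster() {
            starts++; // each construction models one costly cluster start
        }
    }

    // One cluster per test method, as the test does today.
    static int runPerMethod(int testMethods) {
        starts = 0;
        for (int i = 0; i < testMethods; i++) {
            new FakeCluster();
        }
        return starts;
    }

    // One cluster shared by every test method in the class.
    static int runPerClass(int testMethods) {
        starts = 0;
        FakeCluster shared = new FakeCluster();
        for (int i = 0; i < testMethods; i++) {
            // a test body would use `shared` here
        }
        return starts;
    }
}
```

Sharing one cluster assumes the test methods do not depend on starting from a fresh cluster state.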





[jira] [Updated] (HBASE-4604) hbase.client.TestHTablePool could start a single cluster instead of one per method

2011-10-17 Thread nkeywal (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4604?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-4604:
---

Attachment: 20111017_4604_TestHTablePool.patch

 hbase.client.TestHTablePool could start a single cluster instead of one per 
 method
 --

 Key: HBASE-4604
 URL: https://issues.apache.org/jira/browse/HBASE-4604
 Project: HBase
  Issue Type: Improvement
  Components: test
Affects Versions: 0.92.0
 Environment: all
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor
 Attachments: 20111017_4604_TestHTablePool.patch


 This test starts/stops one cluster per method, while it would be possible to 
 start it once for the whole class.
 Using a single cluster allows the test to take 20s instead of 175s (measured 
 after HBASE-4603; the difference was much larger before).





[jira] [Commented] (HBASE-4552) multi-CF bulk load is not atomic across column families

2011-10-17 Thread Todd Lipcon (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13129013#comment-13129013
 ] 

Todd Lipcon commented on HBASE-4552:


The trick is making sure it's atomic inside the region server - not just that 
the client sends all of the files for a given region in one RPC. If there are 
any concurrent scanners, then they should either see all of the new data or 
none of the new data on a given row. So we need some region-wide coordination. 
I think probably we have to take a write-lock on HRegion#lock
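The region-wide coordination suggested above can be sketched with a plain read/write lock. This is not HRegion code; RegionBulkLoadSketch and its methods are hypothetical, and files are represented by simple strings.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical sketch, not HRegion code.
class RegionBulkLoadSketch {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    // column family -> currently visible file name
    private final Map<String, String> visibleFiles = new HashMap<>();

    // Installs the files for all families while holding the write lock, so no
    // reader can observe only some of them.
    void bulkLoad(Map<String, String> familyToFile) {
        lock.writeLock().lock();
        try {
            visibleFiles.putAll(familyToFile);
        } finally {
            lock.writeLock().unlock();
        }
    }

    // Readers take the read lock and get a consistent copy.
    Map<String, String> snapshot() {
        lock.readLock().lock();
        try {
            return new HashMap<>(visibleFiles);
        } finally {
            lock.readLock().unlock();
        }
    }
}
```

In this sketch a reader sees either none or all of the newly loaded files. A real scanner would presumably need to hold the read lock for as long as it is assembling a row, which the sketch does not model.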

 multi-CF bulk load is not atomic across column families
 ---

 Key: HBASE-4552
 URL: https://issues.apache.org/jira/browse/HBASE-4552
 Project: HBase
  Issue Type: Bug
  Components: regionserver
Affects Versions: 0.92.0
Reporter: Todd Lipcon
 Fix For: 0.92.0


 Currently the bulk load API simply imports one HFile at a time. With 
 multi-column-family support, this is inappropriate, since different CFs show 
 up separately. Instead, the IPC endpoint should take a map of CF -> HFiles, so 
 we can online them all under a single region-wide lock.





[jira] [Commented] (HBASE-4486) Improve Javadoc for HTableDescriptor

2011-10-17 Thread Akash Ashok (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13129017#comment-13129017
 ] 

Akash Ashok commented on HBASE-4486:


Thanks Ram. I shall modify that. I would be glad to incorporate if there are 
any other comments.

 Improve Javadoc for HTableDescriptor
 

 Key: HBASE-4486
 URL: https://issues.apache.org/jira/browse/HBASE-4486
 Project: HBase
  Issue Type: Improvement
  Components: client, documentation
Reporter: Akash Ashok
Assignee: Akash Ashok
Priority: Minor
 Attachments: HBase-4486-v2.patch, HBase-4486.patch, 
 HTableDescriptor-v2.html, HTableDescriptor.html








[jira] [Updated] (HBASE-4570) Scan ACID problem with concurrent puts.

2011-10-17 Thread Ted Yu (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-4570:
--

Fix Version/s: 0.90.5

 Scan ACID problem with concurrent puts.
 ---

 Key: HBASE-4570
 URL: https://issues.apache.org/jira/browse/HBASE-4570
 Project: HBase
  Issue Type: Bug
  Components: client, regionserver
Affects Versions: 0.90.1, 0.90.3
Reporter: Jonathan Hsieh
 Fix For: 0.90.5

 Attachments: 4570-instrumentation.tgz, hbase-4570.tgz, hbase-4570.txt


 When scanning a table sometimes rows that have multiple column families get 
 split into two rows if there are concurrent writes.  In this particular case 
 we are overwriting the contents of a Get directly back onto itself as a Put.
 For example, this is a two cf row (with f1, f2, .. f9 cfs).  It is 
 actually returned as two rows (#55 and #56). Interestingly if the two were 
 merged we would have a single proper row.
 Row row024461 had time stamps: [55: 
 keyvalues={row024461/f0:data/1318200440867/Put/vlen=1000, 
 row024461/f0:qual/1318200440867/Put/vlen=10, 
 row024461/f1:data/1318200440867/Put/vlen=1000, 
 row024461/f1:qual/1318200440867/Put/vlen=10, 
 row024461/f2:data/1318200440867/Put/vlen=1000, 
 row024461/f2:qual/1318200440867/Put/vlen=10, 
 row024461/f3:data/1318200440867/Put/vlen=1000, 
 row024461/f3:qual/1318200440867/Put/vlen=10, 
 row024461/f4:data/1318200440867/Put/vlen=1000, 
 row024461/f4:qual/1318200440867/Put/vlen=10}, 
 56: keyvalues={row024461/f5:data/1318200440867/Put/vlen=1000, 
 row024461/f5:qual/1318200440867/Put/vlen=10, 
 row024461/f6:data/1318200440867/Put/vlen=1000, 
 row024461/f6:qual/1318200440867/Put/vlen=10, 
 row024461/f7:data/1318200440867/Put/vlen=1000, 
 row024461/f7:qual/1318200440867/Put/vlen=10, 
 row024461/f8:data/1318200440867/Put/vlen=1000, 
 row024461/f8:qual/1318200440867/Put/vlen=10, 
 row024461/f9:data/1318200440867/Put/vlen=1000, 
 row024461/f9:qual/1318200440867/Put/vlen=10}]
 I've only tested this on 0.90.1+patches and 0.90.3+patches, but it is 
 consistent and duplicatable.





[jira] [Commented] (HBASE-4570) Scan ACID problem with concurrent puts.

2011-10-17 Thread Todd Lipcon (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13129026#comment-13129026
 ] 

Todd Lipcon commented on HBASE-4570:


Cool, I will commit this to 90, 92, and trunk momentarily.

 Scan ACID problem with concurrent puts.
 ---

 Key: HBASE-4570
 URL: https://issues.apache.org/jira/browse/HBASE-4570
 Project: HBase
  Issue Type: Bug
  Components: client, regionserver
Affects Versions: 0.90.1, 0.90.3
Reporter: Jonathan Hsieh
 Fix For: 0.90.5

 Attachments: 4570-instrumentation.tgz, hbase-4570.tgz, hbase-4570.txt


 When scanning a table sometimes rows that have multiple column families get 
 split into two rows if there are concurrent writes.  In this particular case 
 we are overwriting the contents of a Get directly back onto itself as a Put.
 For example, this is a two cf row (with f1, f2, .. f9 cfs).  It is 
 actually returned as two rows (#55 and #56). Interestingly if the two were 
 merged we would have a single proper row.
 Row row024461 had time stamps: [55: 
 keyvalues={row024461/f0:data/1318200440867/Put/vlen=1000, 
 row024461/f0:qual/1318200440867/Put/vlen=10, 
 row024461/f1:data/1318200440867/Put/vlen=1000, 
 row024461/f1:qual/1318200440867/Put/vlen=10, 
 row024461/f2:data/1318200440867/Put/vlen=1000, 
 row024461/f2:qual/1318200440867/Put/vlen=10, 
 row024461/f3:data/1318200440867/Put/vlen=1000, 
 row024461/f3:qual/1318200440867/Put/vlen=10, 
 row024461/f4:data/1318200440867/Put/vlen=1000, 
 row024461/f4:qual/1318200440867/Put/vlen=10}, 
 56: keyvalues={row024461/f5:data/1318200440867/Put/vlen=1000, 
 row024461/f5:qual/1318200440867/Put/vlen=10, 
 row024461/f6:data/1318200440867/Put/vlen=1000, 
 row024461/f6:qual/1318200440867/Put/vlen=10, 
 row024461/f7:data/1318200440867/Put/vlen=1000, 
 row024461/f7:qual/1318200440867/Put/vlen=10, 
 row024461/f8:data/1318200440867/Put/vlen=1000, 
 row024461/f8:qual/1318200440867/Put/vlen=10, 
 row024461/f9:data/1318200440867/Put/vlen=1000, 
 row024461/f9:qual/1318200440867/Put/vlen=10}]
 I've only tested this on 0.90.1+patches and 0.90.3+patches, but it is 
 consistent and duplicatable.





[jira] [Resolved] (HBASE-4570) Scan ACID problem with concurrent puts.

2011-10-17 Thread Todd Lipcon (Resolved) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon resolved HBASE-4570.


   Resolution: Fixed
Fix Version/s: 0.92.0
 Assignee: Jonathan Hsieh
 Hadoop Flags: Reviewed

Fixed in 90, 92, trunk branches

 Scan ACID problem with concurrent puts.
 ---

 Key: HBASE-4570
 URL: https://issues.apache.org/jira/browse/HBASE-4570
 Project: HBase
  Issue Type: Bug
  Components: client, regionserver
Affects Versions: 0.90.1, 0.90.3
Reporter: Jonathan Hsieh
Assignee: Jonathan Hsieh
 Fix For: 0.92.0, 0.90.5

 Attachments: 4570-instrumentation.tgz, hbase-4570.tgz, hbase-4570.txt


 When scanning a table sometimes rows that have multiple column families get 
 split into two rows if there are concurrent writes.  In this particular case 
 we are overwriting the contents of a Get directly back onto itself as a Put.
 For example, this is a two cf row (with f1, f2, .. f9 cfs).  It is 
 actually returned as two rows (#55 and #56). Interestingly if the two were 
 merged we would have a single proper row.
 Row row024461 had time stamps: [55: 
 keyvalues={row024461/f0:data/1318200440867/Put/vlen=1000, 
 row024461/f0:qual/1318200440867/Put/vlen=10, 
 row024461/f1:data/1318200440867/Put/vlen=1000, 
 row024461/f1:qual/1318200440867/Put/vlen=10, 
 row024461/f2:data/1318200440867/Put/vlen=1000, 
 row024461/f2:qual/1318200440867/Put/vlen=10, 
 row024461/f3:data/1318200440867/Put/vlen=1000, 
 row024461/f3:qual/1318200440867/Put/vlen=10, 
 row024461/f4:data/1318200440867/Put/vlen=1000, 
 row024461/f4:qual/1318200440867/Put/vlen=10}, 
 56: keyvalues={row024461/f5:data/1318200440867/Put/vlen=1000, 
 row024461/f5:qual/1318200440867/Put/vlen=10, 
 row024461/f6:data/1318200440867/Put/vlen=1000, 
 row024461/f6:qual/1318200440867/Put/vlen=10, 
 row024461/f7:data/1318200440867/Put/vlen=1000, 
 row024461/f7:qual/1318200440867/Put/vlen=10, 
 row024461/f8:data/1318200440867/Put/vlen=1000, 
 row024461/f8:qual/1318200440867/Put/vlen=10, 
 row024461/f9:data/1318200440867/Put/vlen=1000, 
 row024461/f9:qual/1318200440867/Put/vlen=10}]
 I've only tested this on 0.90.1+patches and 0.90.3+patches, but it is 
 consistent and duplicatable.





[jira] [Commented] (HBASE-2739) Master should fail to start if it cannot successfully split logs

2011-10-17 Thread Alex Newman (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-2739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13129027#comment-13129027
 ] 

Alex Newman commented on HBASE-2739:


poke

 Master should fail to start if it cannot successfully split logs
 

 Key: HBASE-2739
 URL: https://issues.apache.org/jira/browse/HBASE-2739
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 0.20.4, 0.90.0
Reporter: Todd Lipcon
Assignee: Alex Newman
Priority: Critical

 In trunk, in splitLogAfterStartup(), we log the error splitting, but don't 
 shut down. Depending on configuration, we should probably shut down here 
 rather than continue with dataloss.
 In 0.20, we print the stacktrace to stdout in verifyClusterState, but 
 continue through and often fail to start up 





[jira] [Commented] (HBASE-3929) Add option to HFile tool to produce basic stats

2011-10-17 Thread Todd Lipcon (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13129035#comment-13129035
 ] 

Todd Lipcon commented on HBASE-3929:


Thanks for updating the patch to trunk. A couple of comments (fun to look back 
over my own code from a few months back):

- let's rename {{pkv}} to {{prevKV}}
- in the case of an empty HFile, we would currently throw a divide-by-zero. In 
LongStats.toString, we should check for count == 0 and return "no data" or 
something
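The guard suggested in the second point might look like the sketch below. The fields and output format are assumptions for illustration; the real LongStats class in the patch may differ.

```java
// Hypothetical sketch; the real LongStats in the patch may differ.
class LongStatsSketch {
    private long min = Long.MAX_VALUE;
    private long max = Long.MIN_VALUE;
    private long sum = 0;
    private long count = 0;

    void collect(long value) {
        min = Math.min(min, value);
        max = Math.max(max, value);
        sum += value;
        count++;
    }

    @Override
    public String toString() {
        if (count == 0) {
            // Integer division below would throw ArithmeticException here.
            return "no data";
        }
        return "count: " + count + " min: " + min + " max: " + max
            + " mean: " + (sum / count);
    }
}
```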


 Add option to HFile tool to produce basic stats
 ---

 Key: HBASE-3929
 URL: https://issues.apache.org/jira/browse/HBASE-3929
 Project: HBase
  Issue Type: New Feature
  Components: io
Affects Versions: 0.92.0
Reporter: Todd Lipcon
Assignee: Todd Lipcon
 Fix For: 0.94.0

 Attachments: hbase-3929-draft.patch, hbase-3929-draft.txt


 In looking at HBASE-3421 I wrote a small tool to scan an HFile and produce 
 some basic statistics about it:
 - min/mean/max key size, value size (uncompressed)
 - min/mean/max number of columns per row (uncompressed)
 - min/mean/max number of bytes per row (uncompressed)
 - the key of the largest row

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-2739) Master should fail to start if it cannot successfully split logs

2011-10-17 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-2739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13129036#comment-13129036
 ] 

Ted Yu commented on HBASE-2739:
---

In TRUNK, MasterFileSystem.splitLog() has the following:
{code}
      } catch (IOException e) {
        LOG.error("Failed distributed splitting " + serverNames, e);
      }
...
    } catch (IOException e) {
      LOG.error("Failed splitting " + logDir.toString(), e);
    } finally {

{code}
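For comparison, a fail-fast variant of such a catch block could be sketched as follows. This is not MasterFileSystem code; the failOnError flag, the Splitter interface and the method name are hypothetical and only illustrate the "depending on configuration, shut down" idea from the issue description.

```java
import java.io.IOException;

// Hypothetical sketch; the flag and method names are illustrative only.
class SplitLogSketch {
    interface Splitter {
        void split() throws IOException;
    }

    // Returns true when the split succeeded. When failOnError is set, the
    // IOException propagates so the caller can abort startup instead of
    // continuing with possibly missing edits.
    static boolean splitLog(Splitter splitter, boolean failOnError) throws IOException {
        try {
            splitter.split();
            return true;
        } catch (IOException e) {
            System.err.println("Failed splitting: " + e.getMessage());
            if (failOnError) {
                throw e;
            }
            return false;
        }
    }
}
```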

 Master should fail to start if it cannot successfully split logs
 

 Key: HBASE-2739
 URL: https://issues.apache.org/jira/browse/HBASE-2739
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 0.20.4, 0.90.0
Reporter: Todd Lipcon
Assignee: Alex Newman
Priority: Critical

 In trunk, in splitLogAfterStartup(), we log the error splitting, but don't 
 shut down. Depending on configuration, we should probably shut down here 
 rather than continue with dataloss.
 In 0.20, we print the stacktrace to stdout in verifyClusterState, but 
 continue through and often fail to start up 





[jira] [Created] (HBASE-4605) Add constraints as a top-level feature

2011-10-17 Thread Jesse Yates (Created) (JIRA)
Add constraints as a top-level feature
--

 Key: HBASE-4605
 URL: https://issues.apache.org/jira/browse/HBASE-4605
 Project: HBase
  Issue Type: Improvement
  Components: client, coprocessors
Affects Versions: 0.94.0
Reporter: Jesse Yates
Assignee: Jesse Yates


From Jesse's comment on dev:
{quote}
What I would like to propose is a simple interface that people can use to 
implement a 'constraint' (matching the classic database definition). This would 
help ease of adoption by helping HBase more easily check that box, help 
minimize code duplication across organizations, and lead to easier adoption.

Essentially, people would implement a 'Constraint' interface for checking keys 
before they are put into a table. Puts that are valid get written to the table, 
but if not, people can throw an exception that gets propagated back to the 
client explaining why the put was invalid.

Constraints would be set on a per-table basis and the user would be expected to 
ensure the jars containing the constraint are present on the machines serving 
that table.

Yes, people could roll their own mechanism for doing this via coprocessors each 
time, but this would make it easier to do so, so you only have to implement a 
very minimal interface and not worry about the specifics.
{quote}
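A rough sketch of what such a minimal interface might look like is given below. None of these types are HBase APIs: Constraint, ConstraintException and Table are hypothetical stand-ins, and rows are modelled as plain strings rather than Put objects.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the proposal; none of these types are HBase APIs.
class ConstraintSketch {
    // Propagated back to the caller when a put is rejected.
    static class ConstraintException extends Exception {
        ConstraintException(String message) {
            super(message);
        }
    }

    // The minimal interface a user would implement.
    interface Constraint {
        void check(String rowKey, String value) throws ConstraintException;
    }

    // Stand-in for a table with constraints set on a per-table basis.
    static class Table {
        private final List<Constraint> constraints = new ArrayList<>();
        private final List<String> rows = new ArrayList<>();

        void addConstraint(Constraint c) {
            constraints.add(c);
        }

        // Writes the row only if every constraint accepts it.
        void put(String rowKey, String value) throws ConstraintException {
            for (Constraint c : constraints) {
                c.check(rowKey, value);
            }
            rows.add(rowKey + "=" + value);
        }

        int size() {
            return rows.size();
        }
    }
}
```

A constraint that rejects empty values, for example, could be registered as a lambda: `table.addConstraint((k, v) -> { if (v.isEmpty()) throw new ConstraintSketch.ConstraintException("empty value"); });`.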





[jira] [Resolved] (HBASE-4579) CST.requestCompaction semantics changed, logs are now spammed when too many store files

2011-10-17 Thread Jean-Daniel Cryans (Resolved) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4579?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jean-Daniel Cryans resolved HBASE-4579.
---

  Resolution: Fixed
Release Note: MemStoreFlusher will now request only 1 compaction when 
waiting because there are too many store files, instead of one per check
Hadoop Flags: Reviewed

Committed to 0.92 and trunk, thanks for the review Ted!

 CST.requestCompaction semantics changed, logs are now spammed when too many 
 store files
 ---

 Key: HBASE-4579
 URL: https://issues.apache.org/jira/browse/HBASE-4579
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.0
Reporter: Jean-Daniel Cryans
Assignee: Jean-Daniel Cryans
Priority: Critical
 Fix For: 0.92.0

 Attachments: HBASE-4579-v2.patch, HBASE-4579.patch


 Another bug I'm not so sure what's going on. I see this in my log:
 {quote}
 2011-10-12 00:23:43,435 DEBUG org.apache.hadoop.hbase.regionserver.Store: 
 info: no store files to compact
 2011-10-12 00:23:44,335 DEBUG org.apache.hadoop.hbase.regionserver.Store: 
 info: no store files to compact
 2011-10-12 00:23:45,236 DEBUG org.apache.hadoop.hbase.regionserver.Store: 
 info: no store files to compact
 2011-10-12 00:23:46,136 DEBUG org.apache.hadoop.hbase.regionserver.Store: 
 info: no store files to compact
 2011-10-12 00:23:47,036 DEBUG org.apache.hadoop.hbase.regionserver.Store: 
 info: no store files to compact
 2011-10-12 00:23:47,936 DEBUG org.apache.hadoop.hbase.regionserver.Store: 
 info: no store files to compact
 {quote}
 It spams for a while, and a little later instead I get:
 {quote}
 2011-10-12 00:26:52,139 DEBUG org.apache.hadoop.hbase.regionserver.Store: 
 Skipped compaction of info.  Only 2 file(s) of size 176.4m have met 
 compaction criteria.
 2011-10-12 00:26:53,040 DEBUG org.apache.hadoop.hbase.regionserver.Store: 
 Skipped compaction of info.  Only 2 file(s) of size 176.4m have met 
 compaction criteria.
 2011-10-12 00:26:53,940 DEBUG org.apache.hadoop.hbase.regionserver.Store: 
 Skipped compaction of info.  Only 2 file(s) of size 176.4m have met 
 compaction criteria.
 2011-10-12 00:26:54,840 DEBUG org.apache.hadoop.hbase.regionserver.Store: 
 Skipped compaction of info.  Only 2 file(s) of size 176.4m have met 
 compaction criteria.
 2011-10-12 00:26:55,741 DEBUG org.apache.hadoop.hbase.regionserver.Store: 
 Skipped compaction of info.  Only 2 file(s) of size 176.4m have met 
 compaction criteria.
 2011-10-12 00:26:56,641 DEBUG org.apache.hadoop.hbase.regionserver.Store: 
 Skipped compaction of info.  Only 2 file(s) of size 176.4m have met 
 compaction criteria.
 {quote}
 I believe I also saw something like that for flushes, but the region was 
 closing so at least I know why it was spamming (would be nice if it just 
 unrequested the flush):
 {quote}
 2011-10-12 00:26:40,693 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: 
 Flush requested on 
 TestTable,0038168581,1318378894213.2beb8a1e29382a8d3e90a88b9662e5f5.
 2011-10-12 00:26:40,694 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: 
 NOT flushing memstore for region 
 TestTable,0038168581,1318378894213.2beb8a1e29382a8d3e90a88b9662e5f5., 
 flushing=false, writesEnabled=false
 2011-10-12 00:26:40,733 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: 
 Flush requested on 
 TestTable,0038168581,1318378894213.2beb8a1e29382a8d3e90a88b9662e5f5.
 2011-10-12 00:26:40,733 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: 
 NOT flushing memstore for region 
 TestTable,0038168581,1318378894213.2beb8a1e29382a8d3e90a88b9662e5f5., 
 flushing=false, writesEnabled=false
 2011-10-12 00:26:40,873 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: 
 Flush requested on 
 TestTable,0038168581,1318378894213.2beb8a1e29382a8d3e90a88b9662e5f5.
 2011-10-12 00:26:40,873 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: 
 NOT flushing memstore for region 
 TestTable,0038168581,1318378894213.2beb8a1e29382a8d3e90a88b9662e5f5., 
 flushing=false, writesEnabled=false
 2011-10-12 00:26:40,873 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: 
 Flush requested on 
 TestTable,0038168581,1318378894213.2beb8a1e29382a8d3e90a88b9662e5f5.
 2011-10-12 00:26:40,873 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: 
 NOT flushing memstore for region 
 TestTable,0038168581,1318378894213.2beb8a1e29382a8d3e90a88b9662e5f5., 
 flushing=false, writesEnabled=false
 2011-10-12 00:26:40,921 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: 
 Flush requested on 
 TestTable,0038168581,1318378894213.2beb8a1e29382a8d3e90a88b9662e5f5.
 2011-10-12 00:26:40,922 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: 
 NOT flushing memstore for region 
 TestTable,0038168581,1318378894213.2beb8a1e29382a8d3e90a88b9662e5f5., 
 flushing=false, 

[jira] [Commented] (HBASE-4605) Add constraints as a top-level feature

2011-10-17 Thread Jesse Yates (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13129064#comment-13129064
 ] 

Jesse Yates commented on HBASE-4605:


I've been thinking about how to implement this and see two ways to go about 
it.

Method 1:
My idea is to write a ConstraintProcessor that is a system-level CP with 
system-wide support for setting constraints on a table. This requires adding 
'top-level' configuration values that the user would set for the constraints 
to run (which would be ordered like coprocessors), but users would just 
implement the 'Constraint' interface. This means modifying HTD and the shell to 
enable all of this.

This means people need to distribute the jars and set conf values, similar to 
what they would have to do before, but we would handle making sure the 
implemented Constraints get run in the right place and propagate the errors 
(e.g. ConstraintFailedException) back to the users. 

Pluses:
1) Allows users to get back multiple reasons why a put failed.
2) Allows a ConstraintImpl to be a subclass of any arbitrary class and not 
bound to some abstract constraint.
3) People don't have to worry about it being a coprocessor - it is notionally 
divorced.

Minuses:
1) Requires changing a bunch of code in HTableDescriptor and essentially 
duplicating a lot of the checking/setting already done for coprocessors. This 
can be worked around by generalizing the mechanism for storing classes in the 
HTD.

Method 2 (already implemented, patch coming):
Add a superclass AbstractConstraint which only exposes a check(Put) method. It 
is actually a Coprocessor which is loaded, processes the check and then returns 
the error to the client (wrapped in an IOException) on failure. 
Pluses:
1) We don't have to implement any new mechanisms for specifying the constraint, 
people just have to add it as a coprocessor.
Minuses:
1) It could be confusing since with this mechanism, you just want people to 
think in terms of Constraints, not coprocessors
2) You are bound to extending the AbstractCoprocessor, not just implementing 
the interface
3) If just one constraint fails, then the put is rejected, so you can't find 
out all the reasons it would fail (useful if cleaning data).
4) It doesn't really help 'simplify' the use of HBase. In fact, it increases 
the complexity.
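
To make the "multiple reasons" point of Method 1 concrete, here is a minimal, self-contained sketch. It is not the patch: SimplePut is a hypothetical stand-in for Put, and the Constraint interface, checkAll helper and exception shape are assumptions made only for illustration.

```java
import java.util.ArrayList;
import java.util.List;

public class ConstraintSketch {
    /** Hypothetical stand-in for an HBase Put: just a row key and a value. */
    static final class SimplePut {
        final String row;
        final String value;
        SimplePut(String row, String value) {
            this.row = row;
            this.value = value;
        }
    }

    /** Hypothetical minimal interface: return null if valid, else a reason. */
    interface Constraint {
        String check(SimplePut put);
    }

    /** Hypothetical exception that carries every failure reason. */
    static final class ConstraintFailedException extends Exception {
        final List<String> reasons;
        ConstraintFailedException(List<String> reasons) {
            super("Put rejected: " + reasons);
            this.reasons = reasons;
        }
    }

    /** Runs the constraints in order and reports all failures at once. */
    static void checkAll(List<Constraint> constraints, SimplePut put)
            throws ConstraintFailedException {
        List<String> reasons = new ArrayList<>();
        for (Constraint c : constraints) {
            String reason = c.check(put);
            if (reason != null) {
                reasons.add(reason);
            }
        }
        if (!reasons.isEmpty()) {
            throw new ConstraintFailedException(reasons);
        }
    }

    public static void main(String[] args) {
        List<Constraint> constraints = new ArrayList<>();
        constraints.add(p -> p.row.isEmpty() ? "row key is empty" : null);
        constraints.add(p -> p.value.length() > 10 ? "value longer than 10 chars" : null);
        try {
            checkAll(constraints, new SimplePut("", "a value that is far too long"));
        } catch (ConstraintFailedException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

With Method 2 the first failing check would end the put, so only one reason would come back.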

 Add constraints as a top-level feature
 --

 Key: HBASE-4605
 URL: https://issues.apache.org/jira/browse/HBASE-4605
 Project: HBase
  Issue Type: Improvement
  Components: client, coprocessors
Affects Versions: 0.94.0
Reporter: Jesse Yates
Assignee: Jesse Yates

 From Jesse's comment on dev:
 {quote}
 What I would like to propose is a simple interface that people can use to 
 implement a 'constraint' (matching the classic database definition). This 
 would help ease of adoption by helping HBase more easily check that box, help 
 minimize code duplication across organizations, and lead to easier adoption.
 Essentially, people would implement a 'Constraint' interface for checking 
 keys before they are put into a table. Puts that are valid get written to the 
 table, but if not, people can throw an exception that gets propagated 
 back to the client explaining why the put was invalid.
 Constraints would be set on a per-table basis and the user would be expected 
 to ensure the jars containing the constraint are present on the machines 
 serving that table.
 Yes, people could roll their own mechanism for doing this via coprocessors 
 each time, but this would make it easier to do so, so you only have to 
 implement a very minimal interface and not worry about the specifics.
 {quote}





[jira] [Created] (HBASE-4606) Remove spam in HCM and fix a list.size == 0

2011-10-17 Thread Jean-Daniel Cryans (Created) (JIRA)
Remove spam in HCM and fix a list.size == 0
---

 Key: HBASE-4606
 URL: https://issues.apache.org/jira/browse/HBASE-4606
 Project: HBase
  Issue Type: Improvement
Reporter: Jean-Daniel Cryans
Assignee: Jean-Daniel Cryans
 Fix For: 0.92.0


As discussed on the ML, HCM in 0.92 is being spammy with expecting X results 
which is a debug leftover. Also right next to it I see a list.size == 0, which 
should be converted into isEmpty.
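
For reference, the size-versus-isEmpty change is of the following kind. This is a generic sketch, not the HCM code; the describe method and its messages are made up for the example.

```java
import java.util.ArrayList;
import java.util.List;

public class EmptyCheckSketch {
    /** Hypothetical helper; shows the isEmpty() form of the check. */
    static String describe(List<String> results) {
        // Before: if (results.size() == 0) { ... }
        if (results.isEmpty()) {
            return "no results";
        }
        return results.size() + " result(s)";
    }

    public static void main(String[] args) {
        List<String> results = new ArrayList<>();
        System.out.println(describe(results));
        results.add("row1");
        System.out.println(describe(results));
    }
}
```

Both forms behave the same for a java.util.List; isEmpty() just states the intent directly.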





[jira] [Updated] (HBASE-4606) Remove spam in HCM and fix a list.size == 0

2011-10-17 Thread Jean-Daniel Cryans (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4606?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jean-Daniel Cryans updated HBASE-4606:
--

Attachment: HBASE-4606.patch

Here's the patch I'm committing.

 Remove spam in HCM and fix a list.size == 0
 ---

 Key: HBASE-4606
 URL: https://issues.apache.org/jira/browse/HBASE-4606
 Project: HBase
  Issue Type: Improvement
Reporter: Jean-Daniel Cryans
Assignee: Jean-Daniel Cryans
 Fix For: 0.92.0

 Attachments: HBASE-4606.patch


 As discussed on the ML, HCM in 0.92 is being spammy with expecting X 
 results which is a debug leftover. Also right next to it I see a list.size 
 == 0, which should be converted into isEmpty.





[jira] [Commented] (HBASE-4606) Remove spam in HCM and fix a list.size == 0

2011-10-17 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13129077#comment-13129077
 ] 

Ted Yu commented on HBASE-4606:
---

+1 on patch.

 Remove spam in HCM and fix a list.size == 0
 ---

 Key: HBASE-4606
 URL: https://issues.apache.org/jira/browse/HBASE-4606
 Project: HBase
  Issue Type: Improvement
Reporter: Jean-Daniel Cryans
Assignee: Jean-Daniel Cryans
 Fix For: 0.92.0

 Attachments: HBASE-4606.patch


 As discussed on the ML, HCM in 0.92 is being spammy with expecting X 
 results which is a debug leftover. Also right next to it I see a list.size 
 == 0, which should be converted into isEmpty.





[jira] [Commented] (HBASE-1611) Have shell output binary hex-encoded rather than octal-encoded

2011-10-17 Thread Alex Newman (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-1611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13129085#comment-13129085
 ] 

Alex Newman commented on HBASE-1611:


I am a bit confused about this bug. When I print strings, it seems to be 
trying to use hexadecimal, but every once in a while it just falls apart.

For instance the following works fine
\x00\x00\x00\x00 column=on:\x00\x00\x1A\x1F, timestamp=1318533963647, 
value=\x00\x00\x00\x00

but
 \x00\x00\x00\x00 column=on:\x00\x00\x1A$, timestamp=1318533963647, 
value=\x00\x00\x00\x00
falls apart, for some reason.

Is this what you are discussing?

 Have shell output binary hex-encoded rather than octal-encoded
 --

 Key: HBASE-1611
 URL: https://issues.apache.org/jira/browse/HBASE-1611
 Project: HBase
  Issue Type: Bug
Reporter: stack
  Labels: noob

 Native Ruby String dump and inspect output unprintables in octal.  Don't seem 
 to be able to change that fact.  Figure way to do them as hex to match 
 binaries in UI.





[jira] [Updated] (HBASE-3929) Add option to HFile tool to produce basic stats

2011-10-17 Thread Matteo Bertozzi (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-3929?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matteo Bertozzi updated HBASE-3929:
---

Attachment: HBASE-3929-v2.patch

 Add option to HFile tool to produce basic stats
 ---

 Key: HBASE-3929
 URL: https://issues.apache.org/jira/browse/HBASE-3929
 Project: HBase
  Issue Type: New Feature
  Components: io
Affects Versions: 0.92.0
Reporter: Todd Lipcon
Assignee: Todd Lipcon
 Fix For: 0.94.0

 Attachments: HBASE-3929-v2.patch, hbase-3929-draft.patch, 
 hbase-3929-draft.txt


 In looking at HBASE-3421 I wrote a small tool to scan an HFile and produce 
 some basic statistics about it:
 - min/mean/max key size, value size (uncompressed)
 - min/mean/max number of columns per row (uncompressed)
 - min/mean/max number of bytes per row (uncompressed)
 - the key of the largest row





[jira] [Commented] (HBASE-3929) Add option to HFile tool to produce basic stats

2011-10-17 Thread Matteo Bertozzi (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13129088#comment-13129088
 ] 

Matteo Bertozzi commented on HBASE-3929:


Currently HFilePrettyPrinter raises a couple of exceptions if the HFile is 
empty, just because it doesn't check whether seekTo() returns true or false, 
and the first call after seekTo() is a scanner.getKeyValue(), so you get an 
NPE...

I've added a v2 patch with the pkv rename, count == 0 handled, and seekTo 
checked to fix the NPE.
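
The shape of the seekTo() guard, as a self-contained sketch. The Scanner interface below is a hypothetical stand-in written for this example, not the real HFileScanner, and countEntries is not a method from the patch.

```java
public class SeekGuardSketch {
    /** Hypothetical stand-in for a file scanner. */
    interface Scanner {
        boolean seekTo();      // false when the file has no entries
        String getKeyValue();  // only valid after a successful seekTo()
        boolean next();        // false once the last entry has been passed
    }

    /** Counts entries, returning 0 for an empty file instead of failing. */
    static int countEntries(Scanner scanner) {
        if (!scanner.seekTo()) {
            return 0; // empty file: never touch getKeyValue()
        }
        int count = 0;
        do {
            scanner.getKeyValue();
            count++;
        } while (scanner.next());
        return count;
    }

    /** Builds an in-memory scanner over the given entries. */
    static Scanner over(final String... entries) {
        return new Scanner() {
            private int pos = -1;
            public boolean seekTo() {
                if (entries.length == 0) {
                    return false;
                }
                pos = 0;
                return true;
            }
            public String getKeyValue() {
                return entries[pos]; // would throw if seekTo() had failed
            }
            public boolean next() {
                return ++pos < entries.length;
            }
        };
    }

    public static void main(String[] args) {
        System.out.println(countEntries(over()));
        System.out.println(countEntries(over("k1", "k2")));
    }
}
```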

 Add option to HFile tool to produce basic stats
 ---

 Key: HBASE-3929
 URL: https://issues.apache.org/jira/browse/HBASE-3929
 Project: HBase
  Issue Type: New Feature
  Components: io
Affects Versions: 0.92.0
Reporter: Todd Lipcon
Assignee: Todd Lipcon
 Fix For: 0.94.0

 Attachments: HBASE-3929-v2.patch, hbase-3929-draft.patch, 
 hbase-3929-draft.txt


 In looking at HBASE-3421 I wrote a small tool to scan an HFile and produce 
 some basic statistics about it:
 - min/mean/max key size, value size (uncompressed)
 - min/mean/max number of columns per row (uncompressed)
 - min/mean/max number of bytes per row (uncompressed)
 - the key of the largest row





[jira] [Resolved] (HBASE-4606) Remove spam in HCM and fix a list.size == 0

2011-10-17 Thread Jean-Daniel Cryans (Resolved) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4606?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jean-Daniel Cryans resolved HBASE-4606.
---

  Resolution: Fixed
Hadoop Flags: Reviewed

Committed to 0.92 and trunk, thanks for taking a look at the patch Ted.

 Remove spam in HCM and fix a list.size == 0
 ---

 Key: HBASE-4606
 URL: https://issues.apache.org/jira/browse/HBASE-4606
 Project: HBase
  Issue Type: Improvement
Reporter: Jean-Daniel Cryans
Assignee: Jean-Daniel Cryans
 Fix For: 0.92.0

 Attachments: HBASE-4606.patch


 As discussed on the ML, HCM in 0.92 is being spammy with expecting X 
 results which is a debug leftover. Also right next to it I see a list.size 
 == 0, which should be converted into isEmpty.





[jira] [Commented] (HBASE-4605) Add constraints as a top-level feature

2011-10-17 Thread Gary Helmling (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13129094#comment-13129094
 ] 

Gary Helmling commented on HBASE-4605:
--

{quote}
My idea is to write a ConstraintProcessor that is a system level CP with system 
wide support for setting constraints on a table. this requires add 'top-level' 
configuration values that the user would set for constraints to run (which 
would be ordered like coprocessors), but they would just implement the 
'Constraint' interface. This means modifying HTD and the shell to enable all 
these
{quote}

I like the idea of using a system level coprocessor with a minimal extension 
interface for the checks to be performed.  For the actual interface, you could 
even use Predicate from the google guava lib, or have Constraint just be a 
named interface that extends Predicate<Put>.  Not critical, but plugging in to 
a standard interface instead of doing a one-off may enable future uses...

For setting the constraint implementations to be applied per table, I agree 
that using table attributes is probably easiest.  But I don't see why we need 
to modify HTableDescriptor to enable this?  We currently have HBASE-4554, which 
is looking to enable setting table attributes for coprocessors from the shell.  
It seems like we could make that sufficiently generic to enable both the 
coprocessors case and this with just changes to the shell code?

 Add constraints as a top-level feature
 --

 Key: HBASE-4605
 URL: https://issues.apache.org/jira/browse/HBASE-4605
 Project: HBase
  Issue Type: Improvement
  Components: client, coprocessors
Affects Versions: 0.94.0
Reporter: Jesse Yates
Assignee: Jesse Yates

 From Jesse's comment on dev:
 {quote}
 What I would like to propose is a simple interface that people can use to 
 implement a 'constraint' (matching the classic database definition). This 
 would help ease of adoption by helping HBase more easily check that box, help 
 minimize code duplication across organizations, and lead to easier adoption.
 Essentially, people would implement a 'Constraint' interface for checking 
 keys before they are put into a table. Puts that are valid get written to the 
 table, but if not, people can throw an exception that gets propagated 
 back to the client explaining why the put was invalid.
 Constraints would be set on a per-table basis and the user would be expected 
 to ensure the jars containing the constraint are present on the machines 
 serving that table.
 Yes, people could roll their own mechanism for doing this via coprocessors 
 each time, but this would make it easier to do so, so you only have to 
 implement a very minimal interface and not worry about the specifics.
 {quote}





[jira] [Commented] (HBASE-4528) The put operation can release the rowlock before sync-ing the Hlog

2011-10-17 Thread jirapos...@reviews.apache.org (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13129096#comment-13129096
 ] 

jirapos...@reviews.apache.org commented on HBASE-4528:
--



bq.  On 2011-10-17 08:46:53, ramkrishna vasudevan wrote:
bq.   @Dhruba
bq.   Any figures to know the improvement in write performance?

I had updated the performance numbers in the JIRA comments. Please let me know 
if they are adequate.


- Dhruba


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/2141/#review2617
---


On 2011-10-17 04:39:55, Dhruba Borthakur wrote:
bq.  
bq.  ---
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/2141/
bq.  ---
bq.  
bq.  (Updated 2011-10-17 04:39:55)
bq.  
bq.  
bq.  Review request for hbase.
bq.  
bq.  
bq.  Summary
bq.  ---
bq.  
bq.  This changes the multiPut operation so that the sync to the WAL occurs 
outside the rowlock.
bq.  
bq.  This enhancement is done only to HRegion.mut(Put[]) because this is the 
only method that gets invoked from an application. The HRegion.put(Put) is used 
only by unit tests and should possibly be deprecated.
bq.  
bq.  
bq.  This addresses bug HBASE-4528.
bq.  https://issues.apache.org/jira/browse/HBASE-4528
bq.  
bq.  
bq.  Diffs
bq.  -
bq.  
bq.    /src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java 1184991 
bq.    /src/main/java/org/apache/hadoop/hbase/regionserver/KeyValueSkipListSet.java 1184991 
bq.    /src/main/java/org/apache/hadoop/hbase/regionserver/MemStore.java 1184991 
bq.    /src/main/java/org/apache/hadoop/hbase/regionserver/ReadWriteConsistencyControl.java 1184991 
bq.    /src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 1184991 
bq.    /src/main/java/org/apache/hadoop/hbase/regionserver/StoreFlusher.java 1184991 
bq.    /src/test/java/org/apache/hadoop/hbase/regionserver/TestParallelPut.java PRE-CREATION 
bq.    /src/test/java/org/apache/hadoop/hbase/regionserver/TestStore.java 1184991 
bq.    /src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestLogRolling.java 1184991 
bq.  
bq.  Diff: https://reviews.apache.org/r/2141/diff
bq.  
bq.  
bq.  Testing
bq.  ---
bq.  
bq.  I ran TestLogRolling over and over again, about 50 times, and it did not 
fail a single time.
bq.  
bq.  
bq.  Thanks,
bq.  
bq.  Dhruba
bq.  
bq.



 The put operation can release the rowlock before sync-ing the Hlog
 --

 Key: HBASE-4528
 URL: https://issues.apache.org/jira/browse/HBASE-4528
 Project: HBase
  Issue Type: Improvement
  Components: regionserver
Reporter: dhruba borthakur
Assignee: dhruba borthakur
 Fix For: 0.94.0

 Attachments: appendNoSync5.txt, appendNoSyncPut1.txt, 
 appendNoSyncPut2.txt, appendNoSyncPut3.txt, appendNoSyncPut4.txt, 
 appendNoSyncPut5.txt


 This allows for better throughput when there are hot rows. A single row 
 update improves from 100 puts/sec/server to 5000 puts/sec/server.
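
 The ordering this relies on can be sketched as follows. This is a simplified illustration only, not the patch: the class, the in-memory lists and the method names other than the appendNoSync idea from the attachment names are assumptions, and it leaves out keeping unsynced edits invisible to readers, which the real change also has to handle.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReentrantLock;

public class SyncOutsideLockSketch {
    private final ReentrantLock rowLock = new ReentrantLock();
    private final List<String> pendingEdits = new ArrayList<>(); // appended, not yet synced
    private final List<String> durableEdits = new ArrayList<>(); // synced to the log

    /** Appends under the row lock, then syncs after the lock is released. */
    void put(String edit) {
        rowLock.lock();
        try {
            appendNoSync(edit); // cheap: buffer the edit only
        } finally {
            rowLock.unlock();
        }
        sync(); // slow part, no longer blocks other writers to the same row
    }

    private synchronized void appendNoSync(String edit) {
        pendingEdits.add(edit);
    }

    private synchronized void sync() {
        durableEdits.addAll(pendingEdits);
        pendingEdits.clear();
    }

    synchronized int durableCount() {
        return durableEdits.size();
    }

    boolean isRowLocked() {
        return rowLock.isLocked();
    }

    public static void main(String[] args) {
        SyncOutsideLockSketch region = new SyncOutsideLockSketch();
        region.put("row1/col:a=1");
        region.put("row1/col:a=2");
        System.out.println(region.durableCount());
    }
}
```

 Because the lock only covers the append, other writers to a hot row can queue their edits while an earlier sync is still in flight.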





[jira] [Commented] (HBASE-4274) RS should periodically ping its HLog pipeline even if no writes are arriving

2011-10-17 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13129147#comment-13129147
 ] 

Ted Yu commented on HBASE-4274:
---

Gary has addressed rolling restart of DNs.
Can we move this issue to 0.94?

 RS should periodically ping its HLog pipeline even if no writes are arriving
 

 Key: HBASE-4274
 URL: https://issues.apache.org/jira/browse/HBASE-4274
 Project: HBase
  Issue Type: Improvement
  Components: regionserver, wal
Affects Versions: 0.92.0
Reporter: Todd Lipcon
Assignee: Todd Lipcon
Priority: Critical
 Fix For: 0.92.0


 If you restart HDFS underneath HBase, when HBase isn't taking any write load, 
 the region servers won't notice that there's any problem until the next 
 time they take a write, at which point they will abort (because the pipeline 
 is gone from beneath them). It would be better if they wrote some garbage to 
 their HLog once every few seconds as a sort of keepalive, so they will 
 aggressively abort as soon as there's an issue.
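
 A keepalive of this kind could look roughly like the sketch below. LogPipeline and Abortable are hypothetical stand-ins defined for the example, and the period is whatever the caller passes in; none of this is taken from an actual patch.

```java
import java.io.IOException;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class WalKeepAliveSketch {
    /** Hypothetical stand-in for the write-ahead log pipeline. */
    interface LogPipeline {
        void sync() throws IOException;
    }

    /** Hypothetical stand-in for the server's abort hook. */
    interface Abortable {
        void abort(String why, Throwable cause);
    }

    /** One keepalive probe: returns true if the pipeline still works. */
    static boolean pingOnce(LogPipeline log, Abortable server) {
        try {
            log.sync();
            return true;
        } catch (IOException e) {
            server.abort("HLog pipeline keepalive failed", e);
            return false;
        }
    }

    /** Schedules the probe every periodSeconds; the caller shuts the executor down. */
    static ScheduledExecutorService start(LogPipeline log, Abortable server,
                                          long periodSeconds) {
        ScheduledExecutorService executor = Executors.newSingleThreadScheduledExecutor();
        executor.scheduleAtFixedRate(() -> pingOnce(log, server),
                periodSeconds, periodSeconds, TimeUnit.SECONDS);
        return executor;
    }

    public static void main(String[] args) {
        LogPipeline broken = () -> {
            throw new IOException("pipeline gone");
        };
        Abortable server = (why, cause) ->
                System.out.println("ABORT: " + why + " (" + cause.getMessage() + ")");
        System.out.println(pingOnce(broken, server));
    }
}
```

 The probe notices a dead pipeline within one period even when no client writes arrive.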





[jira] [Updated] (HBASE-4361) Certain filter expressions fail in the shell

2011-10-17 Thread Ted Yu (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4361?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-4361:
--

Fix Version/s: (was: 0.92.0)
   0.94.0

Moving the remaining work to 0.94

 Certain filter expressions fail in the shell
 

 Key: HBASE-4361
 URL: https://issues.apache.org/jira/browse/HBASE-4361
 Project: HBase
  Issue Type: Bug
  Components: filters, shell
Affects Versions: 0.92.0
Reporter: Todd Lipcon
Priority: Critical
 Fix For: 0.94.0

 Attachments: Filter Language.docx, small-improvements.txt


 Running the following in the shell hangs and then fails:
 {noformat}
 scan 't1', { FILTER = SingleColumnValueFilter(, '1', 'f1', 'col_a') }
 {noformat}
 The error seems to be: org.jruby.exceptions.RaiseException: (NoMethodError) 
 undefined method `write' for true:TrueClass





[jira] [Commented] (HBASE-4605) Add constraints as a top-level feature

2011-10-17 Thread nkeywal (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13129155#comment-13129155
 ] 

nkeywal commented on HBASE-4605:


This would be a very useful feature. 

For what it's worth, I am quite fine with method 2.

Some points I found useful in SQL engines: 
- the ability to disable the constraints for a connection (basically because 
you're doing a big set of operations and you want the best possible 
performance)
- the ability to disable them globally (upgrade)

I also wonder how we should manage the evolution of the constraints. On a SQL 
db, there is only one possible set; considering the amount of data, the 
traditional upgrade of a traditional SQL db may not be possible here, so 
managing the evolution of constraints would make sense.

A comment as well: in a SQL system, the constraints are linked to the 
transaction: they are checked once the transaction is committed. This is 
important for cross-table constraint checks, as there is no transaction between 
tables in HBase. I would tend to believe that's mainly a question of 
documentation (insert in the right order), but it's something to remember 
anyway (especially as you want to duplicate the relationships with HBase more 
than with a SQL db)...

I don't know the status of HCatalog, but I think the HCatalog schema will be 
transformable into HBase constraints, adding value to the two of them...

 Add constraints as a top-level feature
 --

 Key: HBASE-4605
 URL: https://issues.apache.org/jira/browse/HBASE-4605
 Project: HBase
  Issue Type: Improvement
  Components: client, coprocessors
Affects Versions: 0.94.0
Reporter: Jesse Yates
Assignee: Jesse Yates

 From Jesse's comment on dev:
 {quote}
 What I would like to propose is a simple interface that people can use to 
 implement a 'constraint' (matching the classic database definition). This 
 would help ease of adoption by helping HBase more easily check that box, help 
 minimize code duplication across organizations, and lead to easier adoption.
 Essentially, people would implement a 'Constraint' interface for checking 
 keys before they are put into a table. Puts that are valid get written to the 
 table, but if not, people can throw an exception that gets propagated 
 back to the client explaining why the put was invalid.
 Constraints would be set on a per-table basis and the user would be expected 
 to ensure the jars containing the constraint are present on the machines 
 serving that table.
 Yes, people could roll their own mechanism for doing this via coprocessors 
 each time, but this would make it easier to do so, so you only have to 
 implement a very minimal interface and not worry about the specifics.
 {quote}





[jira] [Commented] (HBASE-4388) Second start after migration from 90 to trunk crashes

2011-10-17 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13129156#comment-13129156
 ] 

Ted Yu commented on HBASE-4388:
---

The new test would be similar to TestDFSUpgradeFromImage which uses 
hadoop-14-dfs-dir.tgz as a source for the old image.
We should create some HBase table using hbase-0.90 and tar the entire dataset.
Then write a unit test which untars the old dataset, and starts hbase-0.92. 
This should successfully read the old image and upgrade it.

 Second start after migration from 90 to trunk crashes
 -

 Key: HBASE-4388
 URL: https://issues.apache.org/jira/browse/HBASE-4388
 Project: HBase
  Issue Type: Bug
  Components: master, migration
Affects Versions: 0.92.0
Reporter: Todd Lipcon
Assignee: stack
Priority: Blocker
 Fix For: 0.92.0

 Attachments: 4388-v2.txt, 4388-v3.txt, 4388.txt, 
 hbase-master-nase.log, meta.tgz


 I started a trunk cluster to upgrade from 90, inserted a ton of data, then 
 did a clean shutdown. When I started again, I got the following exception:
 11/09/13 12:29:09 INFO master.HMaster: Meta has HRI with HTDs. Updating meta now.
 11/09/13 12:29:09 FATAL master.HMaster: Unhandled exception. Starting shutdown.
 java.lang.NegativeArraySizeException: -102
   at org.apache.hadoop.hbase.util.Bytes.readByteArray(Bytes.java:147)
   at org.apache.hadoop.hbase.HTableDescriptor.readFields(HTableDescriptor.java:606)
   at org.apache.hadoop.hbase.migration.HRegionInfo090x.readFields(HRegionInfo090x.java:641)
   at org.apache.hadoop.hbase.util.Writables.getWritable(Writables.java:133)
   at org.apache.hadoop.hbase.util.Writables.getWritable(Writables.java:103)
   at org.apache.hadoop.hbase.util.Writables.getHRegionInfoForMigration(Writables.java:228)
   at org.apache.hadoop.hbase.catalog.MetaEditor.getHRegionInfoForMigration(MetaEditor.java:350)
   at org.apache.hadoop.hbase.catalog.MetaEditor$1.visit(MetaEditor.java:273)
   at org.apache.hadoop.hbase.catalog.MetaReader.fullScan(MetaReader.java:633)
   at org.apache.hadoop.hbase.catalog.MetaReader.fullScan(MetaReader.java:255)
   at org.apache.hadoop.hbase.catalog.MetaReader.fullScan(MetaReader.java:235)
   at org.apache.hadoop.hbase.catalog.MetaEditor.updateMetaWithNewRegionInfo(MetaEditor.java:284)
   at org.apache.hadoop.hbase.catalog.MetaEditor.migrateRootAndMeta(MetaEditor.java:298)
   at org.apache.hadoop.hbase.master.HMaster.updateMetaWithNewHRI(HMaster.java:529)
   at org.apache.hadoop.hbase.master.HMaster.finishInitialization(HMaster.java:472)
   at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:309)





[jira] [Updated] (HBASE-4254) Get tests passing on Hadoop 23

2011-10-17 Thread Ted Yu (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4254?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-4254:
--

Fix Version/s: (was: 0.92.0)
   0.94.0

HBASE-4510 is marked for 0.94
Moving this issue to 0.94 as well.

 Get tests passing on Hadoop 23
 --

 Key: HBASE-4254
 URL: https://issues.apache.org/jira/browse/HBASE-4254
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.0
Reporter: Todd Lipcon
Assignee: Todd Lipcon
Priority: Critical
 Fix For: 0.94.0


 Currently some 30 or so tests are failing on the HBase-trunk-on-hadoop-23 
 build. It looks like most are reflection-based issues.





[jira] [Commented] (HBASE-4254) Get tests passing on Hadoop 23

2011-10-17 Thread Todd Lipcon (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13129168#comment-13129168
 ] 

Todd Lipcon commented on HBASE-4254:


Why move to 0.94? This isn't marked a blocker, and we should be targeting the 
92 branch, even if it doesn't make it for 0.92.0.

 Get tests passing on Hadoop 23
 --

 Key: HBASE-4254
 URL: https://issues.apache.org/jira/browse/HBASE-4254
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.0
Reporter: Todd Lipcon
Assignee: Todd Lipcon
Priority: Critical
 Fix For: 0.94.0


 Currently some 30 or so tests are failing on the HBase-trunk-on-hadoop-23 
 build. It looks like most are reflection-based issues.





[jira] [Commented] (HBASE-4254) Get tests passing on Hadoop 23

2011-10-17 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13129178#comment-13129178
 ] 

Ted Yu commented on HBASE-4254:
---

Feel free to bring this and HBASE-4510 to 0.92, knowing they're not blockers.

 Get tests passing on Hadoop 23
 --

 Key: HBASE-4254
 URL: https://issues.apache.org/jira/browse/HBASE-4254
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.0
Reporter: Todd Lipcon
Assignee: Todd Lipcon
Priority: Critical
 Fix For: 0.94.0


 Currently some 30 or so tests are failing on the HBase-trunk-on-hadoop-23 
 build. It looks like most are reflection-based issues.





[jira] [Updated] (HBASE-4585) Avoid seek operation when current kv is deleted

2011-10-17 Thread Liyin Tang (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4585?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Liyin Tang updated HBASE-4585:
--

Attachment: (was: hbase-4585-apache.patch)

 Avoid seek operation when current kv is deleted
 ---

 Key: HBASE-4585
 URL: https://issues.apache.org/jira/browse/HBASE-4585
 Project: HBase
  Issue Type: Improvement
Reporter: Liyin Tang
Assignee: Liyin Tang
 Attachments: hbase-4585-89.patch


 When the current kv is deleted during the matching in the ScanQueryMatcher, 
 currently the matcher will return skip and continue to seek.
 Actually, if the current kv is deleted because of family deleted or column 
 deleted, the matcher should seek to next col.
 If the current kv is deleted because of version deleted, the matcher should 
 just return skip.
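
A minimal sketch of the behaviour proposed above, for illustration only. The names `DeleteKind`, `MatchCode` and `DeletedKvPolicy.onDeletedKv` are hypothetical stand-ins and are not taken from ScanQueryMatcher or from the attached patches:

```java
/** Hypothetical reason why the current KeyValue is masked by a delete. */
enum DeleteKind { FAMILY_DELETED, COLUMN_DELETED, VERSION_DELETED }

/** Hypothetical subset of the codes a matcher could return. */
enum MatchCode { SKIP, SEEK_NEXT_COL }

final class DeletedKvPolicy {
  private DeletedKvPolicy() {}

  /** Decides what the matcher should do once it knows why the kv is deleted. */
  static MatchCode onDeletedKv(DeleteKind kind) {
    switch (kind) {
      case FAMILY_DELETED:
      case COLUMN_DELETED:
        // Every remaining version of this column is masked too, so jump ahead.
        return MatchCode.SEEK_NEXT_COL;
      case VERSION_DELETED:
        // Only this one version is masked; older versions may still be visible.
        return MatchCode.SKIP;
      default:
        throw new IllegalArgumentException("unknown delete kind: " + kind);
    }
  }
}
```

The point of the split is that only a version delete leaves other cells of the same column potentially visible, so only that case needs to keep stepping through the column.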





[jira] [Updated] (HBASE-4585) Avoid seek operation when current kv is deleted

2011-10-17 Thread Liyin Tang (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4585?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Liyin Tang updated HBASE-4585:
--

Attachment: (was: hbase-4585-trunk.patch)

 Avoid seek operation when current kv is deleted
 ---

 Key: HBASE-4585
 URL: https://issues.apache.org/jira/browse/HBASE-4585
 Project: HBase
  Issue Type: Improvement
Reporter: Liyin Tang
Assignee: Liyin Tang
 Attachments: hbase-4585-89.patch


 When the current kv is deleted during the matching in the ScanQueryMatcher, 
 currently the matcher will return skip and continue to seek.
 Actually, if the current kv is deleted because of family deleted or column 
 deleted, the matcher should seek to next col.
 If the current kv is deleted because of version deleted, the matcher should 
 just return skip.





[jira] [Updated] (HBASE-4585) Avoid seek operation when current kv is deleted

2011-10-17 Thread Liyin Tang (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4585?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Liyin Tang updated HBASE-4585:
--

Attachment: hbase-4585-apache.patch

The patch for apache-trunk is ready.

 Avoid seek operation when current kv is deleted
 ---

 Key: HBASE-4585
 URL: https://issues.apache.org/jira/browse/HBASE-4585
 Project: HBase
  Issue Type: Improvement
Reporter: Liyin Tang
Assignee: Liyin Tang
 Attachments: hbase-4585-89.patch


 When the current kv is deleted during the matching in the ScanQueryMatcher, 
 currently the matcher will return skip and continue to seek.
 Actually, if the current kv is deleted because of family deleted or column 
 deleted, the matcher should seek to next col.
 If the current kv is deleted because of version deleted, the matcher should 
 just return skip.





[jira] [Commented] (HBASE-4553) The update of .tableinfo is not atomic; we remove then rename

2011-10-17 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13129188#comment-13129188
 ] 

Ted Yu commented on HBASE-4553:
---

We should impose an upper limit on the amount of time we wait in the loop.
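
One way such an upper limit could look, as a rough sketch. `BoundedWait.waitFor` and its parameters are made up for illustration and do not come from the patch under review:

```java
import java.util.function.BooleanSupplier;

final class BoundedWait {
  private BoundedWait() {}

  /**
   * Polls the condition until it holds or maxWaitMs has elapsed.
   * Returns true if the condition held, false on timeout or interrupt.
   */
  static boolean waitFor(BooleanSupplier condition, long maxWaitMs, long pollMs) {
    final long deadline = System.nanoTime() + maxWaitMs * 1_000_000L;
    while (!condition.getAsBoolean()) {
      if (System.nanoTime() - deadline >= 0) {
        return false; // upper limit reached: give up instead of looping forever
      }
      try {
        Thread.sleep(pollMs);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt(); // preserve the interrupt for the caller
        return false;
      }
    }
    return true;
  }
}
```

The caller then decides what a `false` return means, for example failing the operation with a clear message rather than hanging.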

 The update of .tableinfo is not atomic; we remove then rename
 -

 Key: HBASE-4553
 URL: https://issues.apache.org/jira/browse/HBASE-4553
 Project: HBase
  Issue Type: Task
Reporter: stack
Priority: Critical
 Fix For: 0.92.0

 Attachments: HBase-4553-TestAvroServer.patch


 This comes out of HBASE-4547.  The rename in 0.20 HDFS fails if the file 
 already exists.  In 0.20+ it's better, but there are still 'some' issues if 
 there is an existing reader when the file is renamed.  This issue is about 
 fixing this (though we depend on a fix first being in HDFS).





[jira] [Updated] (HBASE-4585) Avoid seek operation when current kv is deleted

2011-10-17 Thread Liyin Tang (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4585?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Liyin Tang updated HBASE-4585:
--

Attachment: hbase-4585-apache-trunk.patch

 Avoid seek operation when current kv is deleted
 ---

 Key: HBASE-4585
 URL: https://issues.apache.org/jira/browse/HBASE-4585
 Project: HBase
  Issue Type: Improvement
Reporter: Liyin Tang
Assignee: Liyin Tang
 Attachments: hbase-4585-89.patch, hbase-4585-apache-trunk.patch


 When the current kv is deleted during the matching in the ScanQueryMatcher, 
 currently the matcher will return skip and continue to seek.
 Actually, if the current kv is deleted because of family deleted or column 
 deleted, the matcher should seek to next col.
 If the current kv is deleted because of version deleted, the matcher should 
 just return skip.





[jira] [Updated] (HBASE-4538) NPE in AssignmentManager#updateTimers

2011-10-17 Thread Ted Yu (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4538?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-4538:
--

Fix Version/s: (was: 0.92.0)
   0.94.0

This was the last time NPE happened, 28 builds ago:
https://builds.apache.org/view/G-L/view/HBase/job/HBase-0.92/42/testReport/junit/org.apache.hadoop.hbase.client/TestAdmin/testOnlineChangeTableSchema/

If it happens again, we would get accurate line numbers.

 NPE in AssignmentManager#updateTimers
 -

 Key: HBASE-4538
 URL: https://issues.apache.org/jira/browse/HBASE-4538
 Project: HBase
  Issue Type: Bug
Reporter: stack
 Fix For: 0.94.0


 Saw this in a failed TestAdmin on 0.92
 {code}
 2011-10-05 01:18:58,890 ERROR [MASTER_OPEN_REGION-sv4r9s38,52146,131098450-2] executor.EventHandler(171): Caught throwable while processing event RS_ZK_REGION_OPENED
 java.lang.NullPointerException
   at org.apache.hadoop.hbase.master.AssignmentManager.updateTimers(AssignmentManager.java:1053)
   at org.apache.hadoop.hbase.master.AssignmentManager.regionOnline(AssignmentManager.java:1027)
   at org.apache.hadoop.hbase.master.handler.OpenedRegionHandler.process(OpenedRegionHandler.java:108)
   at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:168)
   at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
   at java.lang.Thread.run(Thread.java:662)
 {code}





[jira] [Work started] (HBASE-4585) Avoid seek operation when current kv is deleted

2011-10-17 Thread Liyin Tang (Work started) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4585?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HBASE-4585 started by Liyin Tang.

 Avoid seek operation when current kv is deleted
 ---

 Key: HBASE-4585
 URL: https://issues.apache.org/jira/browse/HBASE-4585
 Project: HBase
  Issue Type: Improvement
Reporter: Liyin Tang
Assignee: Liyin Tang
 Attachments: hbase-4585-89.patch, hbase-4585-apache-trunk.patch


 When the current kv is deleted during the matching in the ScanQueryMatcher, 
 currently the matcher will return skip and continue to seek.
 Actually, if the current kv is deleted because of family deleted or column 
 deleted, the matcher should seek to next col.
 If the current kv is deleted because of version deleted, the matcher should 
 just return skip.





[jira] [Commented] (HBASE-4605) Add constraints as a top-level feature

2011-10-17 Thread Jesse Yates (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13129203#comment-13129203
 ] 

Jesse Yates commented on HBASE-4605:


@gary:
{quote}
I like the idea of using a system level coprocessor with a minimal extension 
interface for the checks to be performed. For the actual interface, you could 
even use Predicate from the google guava lib, or have Constraint just be a 
named interface that extends Predicate<Put>. Not critical, but plugging in to a 
standard interface instead of doing a one-off may enable future uses...
{quote}
That is exactly what I was thinking for the top-level implementation

{quote}
seems like we could make that sufficiently generic to enable both the 
coprocessors case and this with just changes to the shell code
{quote}
+1 

Right now coprocessors have a special syntax for loading on table level, which 
feels kind of clunky to do by hand (specifying COPROCESSOR$). I feel like we 
could definitely help enable setting values with a more concrete syntax (like a 
setCoprocessor method that we have on the HTableDescriptor now), which should 
handle the numbering, etc. 

So using an abstract version of the stuff from 4554 would definitely help with 
that. I don't know if we can just update the shell though - we would 
probably need to update the java connection as well.

Right now the code for storing things in the conf would be fine, we just need 
to abstract it a little bit, so it would look something like:
{code}
public void addCoprocessor(String name) {
  addProcessingElement("coprocessor$", name);
}

public void addConstraint(String name) {
  addProcessingElement("constraint$", name);
}

public void addProcessingElement(String tag, String value) {
  // ... all the checking/adding currently done in addCoprocessor
}
{code}

@keywal:

Since they are just table configuration values, turning them on/off will be 
relatively painless. 

Cross-table transactions are a separate can of worms and really go against the 
whole design paradigm of HBase (see discussion on dev about this). This would 
be optimized to do single table checking, though people could implement cross 
table checks at serious cost (and later we can build in more optimized 
mechanisms if it is a common thing people do).

{quote}
HCatalog schema will be transformable as HBase constraints, adding value to the 
two of them...
{quote}

That should be super simple, it would just take a simple tool to create the 
corresponding constraints. I would use constraints to enforce things like data 
sanitization, rather than schema enforcement (it's the last-ditch barrier to 
things going into a table properly, since shipping things across the wire is 
expensive), which should be done client side, but it could definitely work.



 Add constraints as a top-level feature
 --

 Key: HBASE-4605
 URL: https://issues.apache.org/jira/browse/HBASE-4605
 Project: HBase
  Issue Type: Improvement
  Components: client, coprocessors
Affects Versions: 0.94.0
Reporter: Jesse Yates
Assignee: Jesse Yates

 From Jesse's comment on dev:
 {quote}
 What I would like to propose is a simple interface that people can use to 
 implement a 'constraint' (matching the classic database definition). This 
 would help ease of adoption by helping HBase more easily check that box, help 
 minimize code duplication across organizations, and lead to easier adoption.
 Essentially, people would implement a 'Constraint' interface for checking 
 keys before they are put into a table. Puts that are valid get written to the 
 table, but if not, people can throw an exception that gets propagated 
 back to the client explaining why the put was invalid.
 Constraints would be set on a per-table basis and the user would be expected 
 to ensure the jars containing the constraint are present on the machines 
 serving that table.
 Yes, people could roll their own mechanism for doing this via coprocessors 
 each time, but this would make it easier to do so, so you only have to 
 implement a very minimal interface and not worry about the specifics.
 {quote}
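
To make the proposal concrete, here is a rough sketch of what such an interface could look like. Everything in it is hypothetical: `SimplePut` stands in for a real Put, `ConstrainedTable` is a toy in-memory table, and the unchecked `ConstraintViolationException` is only an assumption about how the failure would be reported. None of these names come from HBase or from a patch.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a put: one row key and one value. */
final class SimplePut {
  final String row;
  final String value;

  SimplePut(String row, String value) {
    this.row = row;
    this.value = value;
  }
}

/** Thrown when a put violates a constraint; the message explains why. */
class ConstraintViolationException extends RuntimeException {
  ConstraintViolationException(String message) {
    super(message);
  }
}

/** The minimal interface a user would implement. */
interface Constraint {
  void check(SimplePut put);
}

/** Toy table that runs every registered constraint before accepting a put. */
final class ConstrainedTable {
  private final List<Constraint> constraints = new ArrayList<>();
  private final List<SimplePut> rows = new ArrayList<>();

  void addConstraint(Constraint constraint) {
    constraints.add(constraint);
  }

  void put(SimplePut put) {
    for (Constraint constraint : constraints) {
      constraint.check(put); // throws ConstraintViolationException if invalid
    }
    rows.add(put); // only reached when every constraint accepted the put
  }

  int size() {
    return rows.size();
  }
}
```

A constraint is then a one-method implementation, e.g. a lambda that throws `ConstraintViolationException("empty value")` when `put.value` is empty; valid puts are stored and invalid ones surface the reason to the caller.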





[jira] [Commented] (HBASE-2739) Master should fail to start if it cannot successfully split logs

2011-10-17 Thread Alex Newman (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-2739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13129225#comment-13129225
 ] 

Alex Newman commented on HBASE-2739:


- Is the appropriate way to fail, just to throw? What's the science here?
- I just created a conf variable which controls whether or not this should 
cause a shutdown. Are there any other cases where we do/don't want to shutdown? 
Do we need a different variable for distributed / single node splitting?
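
For discussion, a sketch of the "conf variable plus throw" approach. The key name `example.master.abort.on.logsplit.failure`, the class `SplitFailurePolicy`, and the default of aborting are all assumptions made for illustration; this is not the code in the patch:

```java
import java.io.IOException;
import java.util.Properties;

final class SplitFailurePolicy {
  /** Hypothetical configuration key; not a real HBase setting. */
  static final String ABORT_KEY = "example.master.abort.on.logsplit.failure";

  private SplitFailurePolicy() {}

  /** Assumed default: abort, rather than continue with possible data loss. */
  static boolean shouldAbort(Properties conf) {
    return Boolean.parseBoolean(conf.getProperty(ABORT_KEY, "true"));
  }

  /** Rethrows to stop startup when configured, otherwise reports and continues. */
  static void onSplitFailure(Properties conf, IOException cause) {
    if (shouldAbort(conf)) {
      throw new IllegalStateException("log splitting failed, aborting startup", cause);
    }
    System.err.println("log splitting failed, continuing: " + cause.getMessage());
  }
}
```

Keeping the decision in one place would make it easy to add a second key later if distributed and single-node splitting turn out to need different behaviour.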

 Master should fail to start if it cannot successfully split logs
 

 Key: HBASE-2739
 URL: https://issues.apache.org/jira/browse/HBASE-2739
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 0.20.4, 0.90.0
Reporter: Todd Lipcon
Assignee: Alex Newman
Priority: Critical

 In trunk, in splitLogAfterStartup(), we log the error splitting, but don't 
 shut down. Depending on configuration, we should probably shut down here 
 rather than continue with dataloss.
 In 0.20, we print the stacktrace to stdout in verifyClusterState, but 
 continue through and often fail to start up 





[jira] [Updated] (HBASE-4585) Avoid seek operation when current kv is deleted

2011-10-17 Thread Ted Yu (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4585?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-4585:
--

Fix Version/s: 0.94.0

 Avoid seek operation when current kv is deleted
 ---

 Key: HBASE-4585
 URL: https://issues.apache.org/jira/browse/HBASE-4585
 Project: HBase
  Issue Type: Improvement
Reporter: Liyin Tang
Assignee: Liyin Tang
 Fix For: 0.94.0

 Attachments: hbase-4585-89.patch, hbase-4585-apache-trunk.patch


 When the current kv is deleted during the matching in the ScanQueryMatcher, 
 currently the matcher will return skip and continue to seek.
 Actually, if the current kv is deleted because of family deleted or column 
 deleted, the matcher should seek to next col.
 If the current kv is deleted because of version deleted, the matcher should 
 just return skip.





[jira] [Created] (HBASE-4607) SplitLogWorker should correctly terminate when waiting for ZK node

2011-10-17 Thread Mikhail Bautin (Created) (JIRA)
SplitLogWorker should correctly terminate when waiting for ZK node
--

 Key: HBASE-4607
 URL: https://issues.apache.org/jira/browse/HBASE-4607
 Project: HBase
  Issue Type: Bug
Reporter: Mikhail Bautin
Assignee: Mikhail Bautin
Priority: Minor


This is an attempt to fix the fact that SplitLogWorker threads are not being 
terminated properly in some unit tests. This probably does not happen in 
production because the master always creates the log-splitting ZK node, but it 
does happen in 89-fb. Thanks to Prakash Khemani for help on this.
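
A common shape for this kind of fix, sketched under the assumption that the worker polls for the node in a loop. `NodeWaiter` and its `nodeExists` check are hypothetical and do not reflect the actual SplitLogWorker code:

```java
import java.util.function.BooleanSupplier;

final class NodeWaiter implements Runnable {
  private final BooleanSupplier nodeExists; // hypothetical check, e.g. a ZK lookup
  private volatile boolean stopRequested;
  private volatile boolean nodeSeen;

  NodeWaiter(BooleanSupplier nodeExists) {
    this.nodeExists = nodeExists;
  }

  @Override
  public void run() {
    // Re-check the stop flag on every pass so a missing node cannot pin the thread.
    while (!stopRequested) {
      if (nodeExists.getAsBoolean()) {
        nodeSeen = true;
        return;
      }
      try {
        Thread.sleep(10);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        return; // treat an interrupt as a request to terminate
      }
    }
  }

  void stop() {
    stopRequested = true;
  }

  boolean sawNode() {
    return nodeSeen;
  }
}
```

A test harness that never creates the node can then call `stop()` and interrupt the thread, and the worker exits instead of waiting indefinitely.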





[jira] [Updated] (HBASE-4254) Get tests passing on Hadoop 23

2011-10-17 Thread Ted Yu (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4254?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-4254:
--

Fix Version/s: (was: 0.94.0)
   0.92.0

 Get tests passing on Hadoop 23
 --

 Key: HBASE-4254
 URL: https://issues.apache.org/jira/browse/HBASE-4254
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.0
Reporter: Todd Lipcon
Assignee: Todd Lipcon
Priority: Critical
 Fix For: 0.92.0


 Currently some 30 or so tests are failing on the HBase-trunk-on-hadoop-23 
 build. It looks like most are reflection-based issues.





[jira] [Commented] (HBASE-4606) Remove spam in HCM and fix a list.size == 0

2011-10-17 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13129279#comment-13129279
 ] 

Hudson commented on HBASE-4606:
---

Integrated in HBase-0.92 #71 (See 
[https://builds.apache.org/job/HBase-0.92/71/])
HBASE-4606  Remove spam in HCM and fix a list.size == 0

jdcryans : 
Files : 
* /hbase/branches/0.92/CHANGES.txt
* 
/hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java


 Remove spam in HCM and fix a list.size == 0
 ---

 Key: HBASE-4606
 URL: https://issues.apache.org/jira/browse/HBASE-4606
 Project: HBase
  Issue Type: Improvement
Reporter: Jean-Daniel Cryans
Assignee: Jean-Daniel Cryans
 Fix For: 0.92.0

 Attachments: HBASE-4606.patch


 As discussed on the ML, HCM in 0.92 is being spammy with "expecting X 
 results", which is a debug leftover. Also right next to it I see a list.size 
 == 0, which should be converted into isEmpty.
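
The second change is the usual idiom swap. A tiny illustration, where `ResultCheck.hasNoResults` is a made-up helper rather than code from HConnectionManager:

```java
import java.util.List;

final class ResultCheck {
  private ResultCheck() {}

  /** Equivalent to results.size() == 0, but states the intent directly. */
  static boolean hasNoResults(List<?> results) {
    return results.isEmpty();
  }
}
```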





[jira] [Commented] (HBASE-4460) Support running an embedded ThriftServer within a RegionServer

2011-10-17 Thread jirapos...@reviews.apache.org (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13129286#comment-13129286
 ] 

jirapos...@reviews.apache.org commented on HBASE-4460:
--


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/2410/
---

Review request for hbase, Dhruba Borthakur, Gary Helmling, Michael Stack, and 
Andrew Purtell.


Summary
---

Rather than a separate process, it can be advantageous in some situations for 
each RegionServer to embed its own ThriftServer. This allows each embedded 
ThriftServer to short-circuit any queries that should be executed on the local 
RS and skip the extra hop. This then enables the building of fat Thrift clients 
that cache region locations and avoid extra hops altogether.


This addresses bug HBASE-4460.
https://issues.apache.org/jira/browse/HBASE-4460


Diffs
-

  /src/main/java/org/apache/hadoop/hbase/thrift/ThriftServer.java 1174376 
  /src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java 
1174376 
  /src/main/java/org/apache/hadoop/hbase/regionserver/HRegionThriftServer.java 
PRE-CREATION 

Diff: https://reviews.apache.org/r/2410/diff


Testing
---

Running this already on our hbase-92-based branch and running test site.


Thanks,

Jonathan



 Support running an embedded ThriftServer within a RegionServer
 --

 Key: HBASE-4460
 URL: https://issues.apache.org/jira/browse/HBASE-4460
 Project: HBase
  Issue Type: New Feature
  Components: regionserver, thrift
Reporter: Jonathan Gray
Assignee: Jonathan Gray
 Attachments: HBASE-4460-v1.patch


 Rather than a separate process, it can be advantageous in some situations for 
 each RegionServer to embed its own ThriftServer.  This allows each embedded 
 ThriftServer to short-circuit any queries that should be executed on the 
 local RS and skip the extra hop.  This then enables the building of fat 
 Thrift clients that cache region locations and avoid extra hops altogether.
 This JIRA is just about the embedded ThriftServer.  Will open others for the 
 rest.





[jira] [Resolved] (HBASE-4562) When split doing offlineParentInMeta encounters error, it'll cause data loss

2011-10-17 Thread Lars Hofhansl (Resolved) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl resolved HBASE-4562.
--

  Resolution: Fixed
Hadoop Flags: Reviewed

Committed to 0.90, 0.92, and trunk.

The 0.90 patch still did not apply to the current 0.90 branch. I applied the 
changes manually this time, but in the future it would be great to base patches 
on the latest state of the branch in SVN.


 When split doing offlineParentInMeta encounters error, it'll cause data loss
 

 Key: HBASE-4562
 URL: https://issues.apache.org/jira/browse/HBASE-4562
 Project: HBase
  Issue Type: Bug
  Components: regionserver
Affects Versions: 0.90.4
Reporter: bluedavy
Assignee: bluedavy
Priority: Blocker
 Fix For: 0.90.5

 Attachments: HBASE-4562-0.90.4.patch, HBASE-4562-0.90.patch, 
 HBASE-4562-0.92.patch, HBASE-4562-trunk.patch, test-4562-0.90.4.txt, 
 test-4562-0.90.txt, test-4562-0.92.txt, test-4562-trunk.txt


 Follow the steps below to reproduce the problem:
 1. Change SplitTransaction.java as below, to mock a timeout error.
{code:title=SplitTransaction.java|borderStyle=solid}
   if (!testing) {
     MetaEditor.offlineParentInMeta(server.getCatalogTracker(),
       this.parent.getRegionInfo(), a.getRegionInfo(), b.getRegionInfo());
     throw new IOException("some unexpected error in split");
   }
{code} 
 2. Update the regionserver code and restart.
 3. Create a table and put some data into it.
 4. Split the table.
 5. Kill the regionserver that hosts the table.
 6. Wait until the master's ServerShutdownHandler.process has executed, then 
 scan the table; you'll find that the data written earlier is lost.
 The attached patch fixes the bug.





[jira] [Updated] (HBASE-4607) Split log worker should terminate properly when waiting for znode

2011-10-17 Thread Mikhail Bautin (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4607?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mikhail Bautin updated HBASE-4607:
--

Summary: Split log worker should terminate properly when waiting for znode  
(was: SplitLogWorker should correctly terminate when waiting for ZK node)

 Split log worker should terminate properly when waiting for znode
 -

 Key: HBASE-4607
 URL: https://issues.apache.org/jira/browse/HBASE-4607
 Project: HBase
  Issue Type: Bug
Reporter: Mikhail Bautin
Assignee: Mikhail Bautin
Priority: Minor

 This is an attempt to fix the fact that SplitLogWorker threads are not being 
 terminated properly in some unit tests. This probably does not happen in 
 production because the master always creates the log-splitting ZK node, but 
 it does happen in 89-fb. Thanks to Prakash Khemani for help on this.





[jira] [Commented] (HBASE-4603) Uneeded sleep time for tests in hbase.master.ServerManager#waitForRegionServers

2011-10-17 Thread Jonathan Gray (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4603?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13129295#comment-13129295
 ] 

Jonathan Gray commented on HBASE-4603:
--

There was a nice param in HBASE-3380 that is in 90 but not 92/trunk.  I'm going 
to see if we can get that brought into the active branches, then we can just 
set the maxServers config to the # of RS set to start, and then it will just 
work instantly w/o having to wait for this interval/sleep loop.

 Uneeded sleep time for tests in 
 hbase.master.ServerManager#waitForRegionServers
 ---

 Key: HBASE-4603
 URL: https://issues.apache.org/jira/browse/HBASE-4603
 Project: HBase
  Issue Type: Improvement
  Components: test
Affects Versions: 0.92.0
 Environment: all.
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor
 Attachments: 20111017_4603_MiniHBaseCluster.patch


 This function waits for at least 2 times 
 hbase.master.wait.on.regionservers.interval, which defaults to 3 seconds, i.e. 
 6 seconds for every mini HBase cluster start.
 In the context of a mini cluster this is not useful, as the region servers are 
 created locally.
 Changing this to a lower value such as 100ms saves 5.8 seconds per HBase 
 cluster start. It should lower the build time on the apache server by more 
 than 8%.
 Being more aggressive (removing all the wait time) could be possible as 
 well. To be studied later.
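
The 5.8-second figure follows directly from the "at least 2 times the interval" behaviour described above. A small worked check, where `StartupWaitEstimate` is a hypothetical helper and not part of ServerManager:

```java
final class StartupWaitEstimate {
  private StartupWaitEstimate() {}

  /** Minimum wait, assuming (per the description) two full intervals are always slept. */
  static long minimumWaitMs(long intervalMs) {
    return 2 * intervalMs;
  }

  /** Time saved per cluster start when the interval is lowered. */
  static long savedMs(long oldIntervalMs, long newIntervalMs) {
    return minimumWaitMs(oldIntervalMs) - minimumWaitMs(newIntervalMs);
  }
}
```

With the default of 3000 ms the minimum wait is 6000 ms; at 100 ms it is 200 ms, so `savedMs(3000, 100)` comes to 5800 ms per start.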





[jira] [Commented] (HBASE-4607) Split log worker should terminate properly when waiting for znode

2011-10-17 Thread jirapos...@reviews.apache.org (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13129297#comment-13129297
 ] 

jirapos...@reviews.apache.org commented on HBASE-4607:
--


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/2411/
---

Review request for hbase and Prakash Khemani.


Summary
---

This is an attempt to fix the fact that SplitLogWorker threads are not being 
terminated properly in some unit tests. This probably does not happen in 
production because the master always creates the log-splitting ZK node, but it 
does happen in 89-fb. Thanks to Prakash Khemani for help on this.


This addresses bug hbase-4607.
https://issues.apache.org/jira/browse/hbase-4607


Diffs
-

  src/main/java/org/apache/hadoop/hbase/regionserver/SplitLogWorker.java 
a43e0b3 

Diff: https://reviews.apache.org/r/2411/diff


Testing
---

I will run unit tests and post an update when they have passed.


Thanks,

Mikhail



 Split log worker should terminate properly when waiting for znode
 -

 Key: HBASE-4607
 URL: https://issues.apache.org/jira/browse/HBASE-4607
 Project: HBase
  Issue Type: Bug
Reporter: Mikhail Bautin
Assignee: Mikhail Bautin
Priority: Minor

 This is an attempt to fix the fact that SplitLogWorker threads are not being 
 terminated properly in some unit tests. This probably does not happen in 
 production because the master always creates the log-splitting ZK node, but 
 it does happen in 89-fb. Thanks to Prakash Khemani for help on this.
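
A minimal sketch of the general idea, not the actual patch: a worker that waits for a node to appear re-checks an exit flag on every wake-up, so a stop request ends the wait even if the node is never created. The class name, the `BooleanSupplier` stand-in for the ZooKeeper existence check, and the 100 ms poll are all assumptions made for illustration.

```java
import java.util.function.BooleanSupplier;

/** Hypothetical worker that waits for a znode but can still be stopped. */
final class ZnodeWaitSketch implements Runnable {
    private final BooleanSupplier znodeExists; // stand-in for a ZooKeeper existence check
    private final Object lock = new Object();
    private final Thread thread = new Thread(this, "split-log-worker-sketch");
    private boolean exitWorker = false;        // guarded by lock
    private volatile boolean enteredTaskLoop = false;

    ZnodeWaitSketch(BooleanSupplier znodeExists) {
        this.znodeExists = znodeExists;
    }

    void start() {
        thread.start();
    }

    @Override
    public void run() {
        synchronized (lock) {
            // Re-check the exit flag on every wake-up so that a stop request
            // ends the wait even if the znode never appears.
            while (!exitWorker && !znodeExists.getAsBoolean()) {
                try {
                    lock.wait(100);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
            }
            if (exitWorker) {
                return; // stop requested: do not fall through into the task loop
            }
        }
        enteredTaskLoop = true; // a real worker would start processing tasks here
    }

    /** Requests termination and waits up to timeoutMs; true if the thread ended. */
    boolean stop(long timeoutMs) {
        synchronized (lock) {
            exitWorker = true;
            lock.notifyAll();
        }
        try {
            thread.join(timeoutMs);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return !thread.isAlive();
    }

    boolean enteredTaskLoop() {
        return enteredTaskLoop;
    }
}
```

In a unit test where nothing creates the node, `new ZnodeWaitSketch(() -> false)` followed by `start()` and `stop(5000)` should return `true` without the worker ever reaching the task loop.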





[jira] [Commented] (HBASE-3380) Master failover can split logs of live servers

2011-10-17 Thread Jonathan Gray (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13129293#comment-13129293
 ] 

Jonathan Gray commented on HBASE-3380:
--

So it looks like we thought we'd do a proper fix for 0.92, but do we have one?  
There are some good config params that were committed as part of this JIRA into 
0.90 that are now not available in 0.92.

Should this be committed to 0.92 and trunk?  I'd like to at least bring these 
config params over since they are pretty nice (and will make a more elegant 
solution to stuff like HBASE-4603).

 Master failover can split logs of live servers
 --

 Key: HBASE-3380
 URL: https://issues.apache.org/jira/browse/HBASE-3380
 Project: HBase
  Issue Type: Bug
Reporter: Jean-Daniel Cryans
Assignee: Jonathan Gray
Priority: Blocker
 Fix For: 0.90.0

 Attachments: HBASE-3380-v1.patch, HBASE-3380-v2.patch


 The reason why TestMasterFailover fails is that when it does the master 
 failover, the new master doesn't wait long enough for all region servers to 
 check in, so it goes ahead and splits logs... which doesn't work because of the 
 way lease timeouts work:
 {noformat}
 2010-12-21 07:30:36,977 DEBUG [Master:0;vesta.apache.org:33170] 
 wal.HLogSplitter(256): Splitting hlog 1 of 1:
  
 hdfs://localhost:49187/user/hudson/.logs/vesta.apache.org,38743,1292916616340/vesta.apache.org%3A38743.1292916617204,
  length=0
 2010-12-21 07:30:36,977 DEBUG [WriterThread-1] 
 wal.HLogSplitter$WriterThread(619): Writer thread 
 Thread[WriterThread-1,5,main]: starting
 2010-12-21 07:30:36,977 DEBUG [WriterThread-2] 
 wal.HLogSplitter$WriterThread(619): Writer thread 
 Thread[WriterThread-2,5,main]: starting
 2010-12-21 07:30:36,977 INFO  [Master:0;vesta.apache.org:33170] 
 util.FSUtils(625): Recovering file
  
 hdfs://localhost:49187/user/hudson/.logs/vesta.apache.org,38743,1292916616340/vesta.apache.org%3A38743.1292916617204
 2010-12-21 07:30:36,979 WARN  [IPC Server handler 8 on 49187] 
 namenode.FSNamesystem(1122): DIR* NameSystem.startFile:
  failed to create file 
 /user/hudson/.logs/vesta.apache.org,38743,1292916616340/vesta.apache.org%3A38743.1292916617204
  for
  DFSClient_hb_m_vesta.apache.org:33170_1292916630791 on client 127.0.0.1, 
 because this file is already being created by
  DFSClient_hb_rs_vesta.apache.org,38743,1292916616340_1292916617166 on 
 127.0.0.1
 ...
 2010-12-21 07:33:44,332 WARN  [Master:0;vesta.apache.org:33170] 
 util.FSUtils(644): Waited 187354ms for lease recovery on
  
 hdfs://localhost:49187/user/hudson/.logs/vesta.apache.org,38743,1292916616340/vesta.apache.org%3A38743.1292916617204:
  org.apache.hadoop.hdfs.protocol.AlreadyBeingCreatedException: failed to 
 create file
  
 /user/hudson/.logs/vesta.apache.org,38743,1292916616340/vesta.apache.org%3A38743.1292916617204
  for DFSClient_hb_m_vesta.apache.org:33170_1292916630791 on client 127.0.0.1, 
 because this file is already
  being created by 
 DFSClient_hb_rs_vesta.apache.org,38743,1292916616340_1292916617166 on 
 127.0.0.1
 {noformat}
 I think that we should always check in ZK the number of live region servers 
 before waiting for them to check in; this way we know how many we should 
 expect during failover. There's also a case where we still want to time out, 
 since an RS can die during that time, but we should wait a bit longer.
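
The proposal above can be sketched as a bounded wait: given an expected count (assumed here to have been read from ZooKeeper beforehand), poll until that many servers have checked in or a timeout expires. This is an illustration only; the class, method, and parameter names are hypothetical and do not correspond to the committed patch.

```java
import java.util.function.IntSupplier;

/** Hypothetical sketch of "wait for the expected count, but still time out". */
final class RegionServerWaitSketch {
    /**
     * Polls checkedIn until it reaches expected or timeoutMs elapses.
     * Returns the number of servers seen when the wait ended.
     */
    static int waitForCheckIns(int expected, IntSupplier checkedIn,
                               long timeoutMs, long intervalMs) {
        long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
        int seen = checkedIn.getAsInt();
        // Keep waiting only while servers are missing and time remains;
        // a server that died in the meantime must not block startup forever.
        while (seen < expected && deadline - System.nanoTime() > 0) {
            try {
                Thread.sleep(intervalMs);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break;
            }
            seen = checkedIn.getAsInt();
        }
        return seen;
    }
}
```

When every expected server has already checked in, the call returns immediately; when one never shows up, it returns the partial count once the timeout has passed, and the caller decides how to proceed.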





[jira] [Created] (HBASE-4608) HLog Compression

2011-10-17 Thread Li Pi (Created) (JIRA)
HLog Compression


 Key: HBASE-4608
 URL: https://issues.apache.org/jira/browse/HBASE-4608
 Project: HBase
  Issue Type: New Feature
Reporter: Li Pi
Assignee: Li Pi


The current bottleneck to HBase write speed is replicating the WAL appends 
across different datanodes. We can speed up this process by compressing the 
HLog. Current plan involves using a dictionary to compress table name, region 
id, cf name, and possibly other bits of repeated data. Also, HLog format may be 
changed in other ways to produce a smaller HLog.
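
As a rough illustration of the dictionary idea, and not the design that was eventually implemented, repeated values such as a table name or column family name can be replaced by a small integer the second time they occur. The class below is a hypothetical in-memory sketch; it ignores eviction, persistence, and the on-disk format.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical dictionary mapping repeated strings to small indexes. */
final class WalDictionarySketch {
    private final Map<String, Integer> indexByValue = new HashMap<>();
    private final List<String> valueByIndex = new ArrayList<>();

    /** Returns the index for value, assigning the next free index if unseen. */
    int indexOf(String value) {
        Integer idx = indexByValue.get(value);
        if (idx == null) {
            idx = valueByIndex.size();
            indexByValue.put(value, idx);
            valueByIndex.add(value);
        }
        return idx;
    }

    /** Reverse lookup used when reading the log back. */
    String valueAt(int index) {
        return valueByIndex.get(index);
    }

    int size() {
        return valueByIndex.size();
    }
}
```

A writer would emit the full value the first time and only the index afterwards; a reader that rebuilds the same dictionary in the same order can recover the original values with `valueAt`.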





[jira] [Commented] (HBASE-3380) Master failover can split logs of live servers

2011-10-17 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13129308#comment-13129308
 ] 

Ted Yu commented on HBASE-3380:
---

+1 on bringing over the parameters.





[jira] [Commented] (HBASE-3380) Master failover can split logs of live servers

2011-10-17 Thread Jean-Daniel Cryans (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13129334#comment-13129334
 ] 

Jean-Daniel Cryans commented on HBASE-3380:
---

Let's do it, +1.





[jira] [Resolved] (HBASE-4574) Node deleted but still in RIT printed too often

2011-10-17 Thread Jean-Daniel Cryans (Resolved) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4574?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jean-Daniel Cryans resolved HBASE-4574.
---

   Resolution: Duplicate
Fix Version/s: (was: 0.92.0)

Dup of HBASE-4308

 Node deleted but still in RIT printed too often
 -

 Key: HBASE-4574
 URL: https://issues.apache.org/jira/browse/HBASE-4574
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.0
Reporter: Jean-Daniel Cryans

 Looking at the 0.92 master logs, I see I often get this message:
 bq. WARN org.apache.hadoop.hbase.master.AssignmentManager: Node deleted but 
 still in RIT: TestTable,blah. state=OPEN, ts=1318369648361, server=blah
 The issue seems to be due to a race between OpenedRegionHandler and watchers 
 in AssignmentManager. Specifically, ORH first deletes the znode then deletes 
 the in-memory RIT data structure (via regionOnline). Between the two steps a 
 watcher is triggered and if it arrives first then it will see the region 
 still in RIT.
 If the message is really supposed to be a warning then in the current form 
 it's useless as people will see this message 99% of the time because of this 
 race.
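
The interleaving described above can be replayed deterministically in a toy model. The sketch assumes only the two handler steps named in the description (delete the znode, then remove the in-memory RIT entry via regionOnline); the class and method names are hypothetical and it is not AssignmentManager code.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical model of the ordering between the handler and the watcher. */
final class RitRaceSketch {
    private final Set<String> regionsInTransition = new HashSet<>();

    void startTransition(String region) {
        regionsInTransition.add(region);
    }

    /** Second handler step: drop the in-memory RIT entry. */
    void regionOnline(String region) {
        regionsInTransition.remove(region);
    }

    /** Watcher for a deleted znode: true means the warning would be logged. */
    boolean nodeDeletedWouldWarn(String region) {
        return regionsInTransition.contains(region);
    }

    public static void main(String[] args) {
        RitRaceSketch am = new RitRaceSketch();
        am.startTransition("TestTable,blah");
        // The znode has been deleted (step 1) and the watcher fires before step 2.
        System.out.println(am.nodeDeletedWouldWarn("TestTable,blah")); // true
        am.regionOnline("TestTable,blah");
        // A watcher that runs after step 2 finds nothing to warn about.
        System.out.println(am.nodeDeletedWouldWarn("TestTable,blah")); // false
    }
}
```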





[jira] [Commented] (HBASE-3380) Master failover can split logs of live servers

2011-10-17 Thread Jonathan Gray (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13129339#comment-13129339
 ] 

Jonathan Gray commented on HBASE-3380:
--

What's the best practice here?  Should I just commit this to 92 and trunk and 
make a note here?  Should I open a new jira since this is so old?

(Thanks for input guys)





[jira] [Commented] (HBASE-4607) Split log worker should terminate properly when waiting for znode

2011-10-17 Thread jirapos...@reviews.apache.org (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13129340#comment-13129340
 ] 

jirapos...@reviews.apache.org commented on HBASE-4607:
--


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/2411/#review2635
---



src/main/java/org/apache/hadoop/hbase/regionserver/SplitLogWorker.java
https://reviews.apache.org/r/2411/#comment5940

If exitWorker is false, we would enter taskLoop at line 167.
Is this desirable?


- Ted


On 2011-10-17 22:41:49, Mikhail Bautin wrote:
bq.  
bq.  ---
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/2411/
bq.  ---
bq.  
bq.  (Updated 2011-10-17 22:41:49)
bq.  
bq.  
bq.  Review request for hbase and Prakash Khemani.
bq.  
bq.  
bq.  Summary
bq.  ---
bq.  
bq.  This is an attempt to fix the fact that SplitLogWorker threads are not 
being terminated properly in some unit tests. This probably does not happen in 
production because the master always creates the log-splitting ZK node, but it 
does happen in 89-fb. Thanks to Prakash Khemani for help on this.
bq.  
bq.  
bq.  This addresses bug hbase-4607.
bq.  https://issues.apache.org/jira/browse/hbase-4607
bq.  
bq.  
bq.  Diffs
bq.  -
bq.  
bq.    src/main/java/org/apache/hadoop/hbase/regionserver/SplitLogWorker.java 
a43e0b3 
bq.  
bq.  Diff: https://reviews.apache.org/r/2411/diff
bq.  
bq.  
bq.  Testing
bq.  ---
bq.  
bq.  I will run unit tests and post an update when they have passed.
bq.  
bq.  
bq.  Thanks,
bq.  
bq.  Mikhail
bq.  
bq.







[jira] [Commented] (HBASE-3380) Master failover can split logs of live servers

2011-10-17 Thread Jean-Daniel Cryans (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13129341#comment-13129341
 ] 

Jean-Daniel Cryans commented on HBASE-3380:
---

Since it's almost a year old I'd prefer a new jira.





[jira] [Commented] (HBASE-3380) Master failover can split logs of live servers

2011-10-17 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13129342#comment-13129342
 ] 

Ted Yu commented on HBASE-3380:
---

Please open a new JIRA where we may come up with a better idea.





[jira] [Created] (HBASE-4609) ThriftServer.getRegionInfo() is expecting old ServerName format, need to use new Addressing class instead

2011-10-17 Thread Jonathan Gray (Created) (JIRA)
ThriftServer.getRegionInfo() is expecting old ServerName format, need to use 
new Addressing class instead
-

 Key: HBASE-4609
 URL: https://issues.apache.org/jira/browse/HBASE-4609
 Project: HBase
  Issue Type: Bug
  Components: thrift
Affects Versions: 0.92.0, 0.94.0
Reporter: Jonathan Gray
Assignee: Jonathan Gray
Priority: Minor
 Fix For: 0.92.0


ThriftServer.getRegionInfo() is expecting the old ServerName that doesn't 
include start code.  Need to fix.
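
For illustration only: the log lines quoted elsewhere in this digest show server names written as `vesta.apache.org,38743,1292916616340`, which reads as host, port, and start code separated by commas. Assuming that layout, a parser that accepts the start code could look like the sketch below. The class is hypothetical and is not the Addressing or ServerName API.

```java
/** Hypothetical parser for a "host,port,startcode" server name. */
final class ServerNameSketch {
    final String host;
    final int port;
    final long startCode;

    private ServerNameSketch(String host, int port, long startCode) {
        this.host = host;
        this.port = port;
        this.startCode = startCode;
    }

    /** Parses the three comma-separated fields; rejects any other shape. */
    static ServerNameSketch parse(String name) {
        String[] parts = name.split(",");
        if (parts.length != 3) {
            throw new IllegalArgumentException(
                "expected host,port,startcode but got: " + name);
        }
        return new ServerNameSketch(
            parts[0], Integer.parseInt(parts[1]), Long.parseLong(parts[2]));
    }
}
```

Code written against a two-field name fails on such input, which is the kind of mismatch the issue describes.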





[jira] [Updated] (HBASE-4609) ThriftServer.getRegionInfo() is expecting old ServerName format, need to use new Addressing class instead

2011-10-17 Thread Jonathan Gray (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4609?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Gray updated HBASE-4609:
-

Attachment: HBASE-4609-v1.patch

As advertised.  Against trunk.

 ThriftServer.getRegionInfo() is expecting old ServerName format, need to use 
 new Addressing class instead
 -

 Key: HBASE-4609
 URL: https://issues.apache.org/jira/browse/HBASE-4609
 Project: HBase
  Issue Type: Bug
  Components: thrift
Affects Versions: 0.92.0, 0.94.0
Reporter: Jonathan Gray
Assignee: Jonathan Gray
Priority: Minor
 Fix For: 0.92.0

 Attachments: HBASE-4609-v1.patch


 ThriftServer.getRegionInfo() is expecting the old ServerName that doesn't 
 include start code.  Need to fix.





[jira] [Updated] (HBASE-4609) ThriftServer.getRegionInfo() is expecting old ServerName format, need to use new Addressing class instead

2011-10-17 Thread Jonathan Gray (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4609?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Gray updated HBASE-4609:
-

Status: Patch Available  (was: Open)





[jira] [Commented] (HBASE-3380) Master failover can split logs of live servers

2011-10-17 Thread Jonathan Gray (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13129344#comment-13129344
 ] 

Jonathan Gray commented on HBASE-3380:
--

Heartbeats still exist so I'm not sure much is different in 92 since we tackled 
this, right?

I will open a new JIRA though.

 Master failover can split logs of live servers
 --

 Key: HBASE-3380
 URL: https://issues.apache.org/jira/browse/HBASE-3380
 Project: HBase
  Issue Type: Bug
Reporter: Jean-Daniel Cryans
Assignee: Jonathan Gray
Priority: Blocker
 Fix For: 0.90.0

 Attachments: HBASE-3380-v1.patch, HBASE-3380-v2.patch


 The reason why TestMasterFailover fails is that when it does the master 
 failover, the new master doesn't wait long enough for all region servers to 
 checkin so it goes ahead and split logs... which doesn't work because of the 
 way lease timeouts work:
 {noformat}
 2010-12-21 07:30:36,977 DEBUG [Master:0;vesta.apache.org:33170] 
 wal.HLogSplitter(256): Splitting hlog 1 of 1:
  
 hdfs://localhost:49187/user/hudson/.logs/vesta.apache.org,38743,1292916616340/vesta.apache.org%3A38743.1292916617204,
  length=0
 2010-12-21 07:30:36,977 DEBUG [WriterThread-1] 
 wal.HLogSplitter$WriterThread(619): Writer thread 
 Thread[WriterThread-1,5,main]: starting
 2010-12-21 07:30:36,977 DEBUG [WriterThread-2] 
 wal.HLogSplitter$WriterThread(619): Writer thread 
 Thread[WriterThread-2,5,main]: starting
 2010-12-21 07:30:36,977 INFO  [Master:0;vesta.apache.org:33170] 
 util.FSUtils(625): Recovering file
  
 hdfs://localhost:49187/user/hudson/.logs/vesta.apache.org,38743,1292916616340/vesta.apache.org%3A38743.1292916617204
 2010-12-21 07:30:36,979 WARN  [IPC Server handler 8 on 49187] 
 namenode.FSNamesystem(1122): DIR* NameSystem.startFile:
  failed to create file 
 /user/hudson/.logs/vesta.apache.org,38743,1292916616340/vesta.apache.org%3A38743.1292916617204
  for
  DFSClient_hb_m_vesta.apache.org:33170_1292916630791 on client 127.0.0.1, 
 because this file is already being created by
  DFSClient_hb_rs_vesta.apache.org,38743,1292916616340_1292916617166 on 
 127.0.0.1
 ...
 2010-12-21 07:33:44,332 WARN  [Master:0;vesta.apache.org:33170] 
 util.FSUtils(644): Waited 187354ms for lease recovery on
  
 hdfs://localhost:49187/user/hudson/.logs/vesta.apache.org,38743,1292916616340/vesta.apache.org%3A38743.1292916617204:
  org.apache.hadoop.hdfs.protocol.AlreadyBeingCreatedException: failed to 
 create file
  
 /user/hudson/.logs/vesta.apache.org,38743,1292916616340/vesta.apache.org%3A38743.1292916617204
  for DFSClient_hb_m_vesta.apache.org:33170_1292916630791 on client 127.0.0.1, 
 because this file is already
  being created by 
 DFSClient_hb_rs_vesta.apache.org,38743,1292916616340_1292916617166 on 
 127.0.0.1
 {noformat}
 I think that we should always check the number of live region servers in ZK 
 before waiting for them to check in; this way we know how many to expect 
 during failover. There's also a case where we still want to time out, since 
 an RS can die during that time, but we should wait a bit longer.
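
The idea above can be sketched roughly as follows: wait until the expected number of region servers has checked in, but give up after a timeout in case one died in the meantime. This is only an illustration. The class CheckinWait, the method waitForCheckins, and all parameter names are hypothetical. In a real master the expected count would come from ZK and the supplier would report servers that have heartbeated in.

```java
import java.util.function.IntSupplier;

/** Hypothetical helper; not part of HBase. */
public final class CheckinWait {
    private CheckinWait() {}

    /**
     * Polls checkedIn until it reports at least expected servers or
     * timeoutMs elapses. Returns the count seen when the wait ended, so
     * the caller can tell a full check-in from a timeout.
     */
    public static int waitForCheckins(int expected, IntSupplier checkedIn,
                                      long timeoutMs, long intervalMs) {
        long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
        int seen = checkedIn.getAsInt();
        while (seen < expected && System.nanoTime() - deadline < 0) {
            try {
                Thread.sleep(intervalMs);
            } catch (InterruptedException e) {
                // Preserve the interrupt and stop waiting.
                Thread.currentThread().interrupt();
                break;
            }
            seen = checkedIn.getAsInt();
        }
        return seen;
    }
}
```

If the returned count is lower than expected, the caller knows the wait timed out and can treat the missing servers as possibly dead rather than assuming they never existed.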





[jira] [Created] (HBASE-4610) Port HBASE-3380 (Master failover can split logs of live servers) to 92/trunk (definitely bring in config params, decide if we need to do more to fix the bug)

2011-10-17 Thread Jonathan Gray (Created) (JIRA)
Port HBASE-3380 (Master failover can split logs of live servers) to 92/trunk 
(definitely bring in config params, decide if we need to do more to fix the bug)
-

 Key: HBASE-4610
 URL: https://issues.apache.org/jira/browse/HBASE-4610
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 0.92.0, 0.94.0
Reporter: Jonathan Gray
Assignee: Jonathan Gray
Priority: Critical
 Fix For: 0.92.0


Over in HBASE-3380 we were having some TestMasterFailover flakiness.  We added 
some more config parameters to better control the master startup loop where it 
waits for RS to heartbeat in.  We had thought at the time that 92 would have a 
different solution but it is still relying on heartbeats to learn about RSs.

For now, we should definitely bring these config params into 92/trunk.  
Otherwise this is an incompatible regression. Adding them will also make 
things like what was just reported over in HBASE-4603 trivial to fix in an 
optimal way.





[jira] [Commented] (HBASE-4605) Add constraints as a top-level feature

2011-10-17 Thread Andrew Purtell (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13129393#comment-13129393
 ] 

Andrew Purtell commented on HBASE-4605:
---

+1 on choice #1, with generalization of argument passing to coprocessors via 
table attributes. HBASE-4048 and HBASE-4554 are a start there.

 Add constraints as a top-level feature
 --

 Key: HBASE-4605
 URL: https://issues.apache.org/jira/browse/HBASE-4605
 Project: HBase
  Issue Type: Improvement
  Components: client, coprocessors
Affects Versions: 0.94.0
Reporter: Jesse Yates
Assignee: Jesse Yates

 From Jesse's comment on dev:
 {quote}
 What I would like to propose is a simple interface that people can use to 
 implement a 'constraint' (matching the classic database definition). This 
 would help ease of adoption by helping HBase more easily check that box, help 
 minimize code duplication across organizations, and lead to easier adoption.
 Essentially, people would implement a 'Constraint' interface for checking 
 keys before they are put into a table. Puts that are valid get written to the 
 table, but if not, people can throw an exception that gets propagated 
 back to the client explaining why the put was invalid.
 Constraints would be set on a per-table basis and the user would be expected 
 to ensure the jars containing the constraint are present on the machines 
 serving that table.
 Yes, people could roll their own mechanism for doing this via coprocessors 
 each time, but this would make it easier to do so, so you only have to 
 implement a very minimal interface and not worry about the specifics.
 {quote}
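
To make the proposal concrete, here is a rough sketch of what such a minimal interface might look like. Everything in it is hypothetical: the names Constraint, ConstraintException, MaxKeyLength, and ConstrainedTable are invented for illustration, and the in-memory table stands in for the coprocessor hook that would run on the servers hosting the table. It is not the API from any attached patch.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative only; none of these types exist in HBase under these names. */
public final class ConstraintSketch {
    private ConstraintSketch() {}

    /** Thrown by a constraint to explain why a put was rejected. */
    public static class ConstraintException extends RuntimeException {
        public ConstraintException(String message) {
            super(message);
        }
    }

    /** The minimal interface a user would implement. */
    public interface Constraint {
        void check(byte[] rowKey);
    }

    /** Example constraint: reject row keys longer than a fixed limit. */
    public static class MaxKeyLength implements Constraint {
        private final int maxLength;

        public MaxKeyLength(int maxLength) {
            this.maxLength = maxLength;
        }

        @Override
        public void check(byte[] rowKey) {
            if (rowKey.length > maxLength) {
                throw new ConstraintException(
                    "row key is " + rowKey.length + " bytes; limit is " + maxLength);
            }
        }
    }

    /** Stand-in for a table with per-table constraints. */
    public static class ConstrainedTable {
        private final List<Constraint> constraints = new ArrayList<>();
        private final List<byte[]> rows = new ArrayList<>();

        public void addConstraint(Constraint c) {
            constraints.add(c);
        }

        /** Runs every constraint first; only valid puts are stored. */
        public void put(byte[] rowKey) {
            for (Constraint c : constraints) {
                c.check(rowKey);
            }
            rows.add(rowKey.clone());
        }

        public int rowCount() {
            return rows.size();
        }
    }
}
```

The exception message is what would reach the client, which is why the example constraint spells out both the offending length and the limit.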





[jira] [Created] (HBASE-4611) Add support for Phabricator/Differential as an alternative code review tool

2011-10-17 Thread Jonathan Gray (Created) (JIRA)
Add support for Phabricator/Differential as an alternative code review tool
---

 Key: HBASE-4611
 URL: https://issues.apache.org/jira/browse/HBASE-4611
 Project: HBase
  Issue Type: Task
Reporter: Jonathan Gray


From http://phabricator.org/ : Phabricator is an open source collection of web 
applications which make it easier to write, review, and share source code. It 
is currently available as an early release. Phabricator was developed at 
Facebook.

It's open source so pretty much anyone could host an instance of this software.

To begin with, there will be a public-facing instance located at 
http://reviews.facebook.net (sponsored by Facebook and hosted by the OSUOSL 
http://osuosl.org).

We will use this JIRA to deal with adding (and ensuring) Apache-friendly 
support that will allow us to do code reviews with Phabricator for HBase.





[jira] [Commented] (HBASE-4611) Add support for Phabricator/Differential as an alternative code review tool

2011-10-17 Thread Jonathan Gray (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13129397#comment-13129397
 ] 

Jonathan Gray commented on HBASE-4611:
--

In addition to being a (better) code review tool, the Phabricator suite also 
includes stuff like repo/revision browsing, nice command-line tools, pastebin, 
etc. which should be available for the HBase repos.

 Add support for Phabricator/Differential as an alternative code review tool
 ---

 Key: HBASE-4611
 URL: https://issues.apache.org/jira/browse/HBASE-4611
 Project: HBase
  Issue Type: Task
Reporter: Jonathan Gray

 From http://phabricator.org/ : Phabricator is an open source collection of 
 web applications which make it easier to write, review, and share source 
 code. It is currently available as an early release. Phabricator was 
 developed at Facebook.
 It's open source so pretty much anyone could host an instance of this 
 software.
 To begin with, there will be a public-facing instance located at 
 http://reviews.facebook.net (sponsored by Facebook and hosted by the OSUOSL 
 http://osuosl.org).
 We will use this JIRA to deal with adding (and ensuring) Apache-friendly 
 support that will allow us to do code reviews with Phabricator for HBase.





[jira] [Commented] (HBASE-4562) When split doing offlineParentInMeta encounters error, it'll cause data loss

2011-10-17 Thread bluedavy (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13129416#comment-13129416
 ] 

bluedavy commented on HBASE-4562:
-

em,thks.

 When split doing offlineParentInMeta encounters error, it'll cause data loss
 

 Key: HBASE-4562
 URL: https://issues.apache.org/jira/browse/HBASE-4562
 Project: HBase
  Issue Type: Bug
  Components: regionserver
Affects Versions: 0.90.4
Reporter: bluedavy
Assignee: bluedavy
Priority: Blocker
 Fix For: 0.90.5

 Attachments: HBASE-4562-0.90.4.patch, HBASE-4562-0.90.patch, 
 HBASE-4562-0.92.patch, HBASE-4562-trunk.patch, test-4562-0.90.4.txt, 
 test-4562-0.90.txt, test-4562-0.92.txt, test-4562-trunk.txt


 Follow the steps below to reproduce the problem:
 1. Change SplitTransaction.java as below, to mock a timeout error.
{code:title=SplitTransaction.java|borderStyle=solid}
   if (!testing) {
     MetaEditor.offlineParentInMeta(server.getCatalogTracker(),
       this.parent.getRegionInfo(), a.getRegionInfo(), b.getRegionInfo());
     // Simulate a failure right after the parent has been offlined in meta.
     throw new IOException("some unexpected error in split");
   }
{code} 
 2. Update the regionserver code and restart.
 3. Create a table and put some data into it.
 4. Split the table.
 5. Kill the regionserver hosting the table.
 6. Wait until the master's ServerShutdownHandler.process has executed, then 
 scan the table; you'll find that the data written earlier is lost.
 The attached patch fixes the bug.





[jira] [Commented] (HBASE-4606) Remove spam in HCM and fix a list.size == 0

2011-10-17 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13129420#comment-13129420
 ] 

Hudson commented on HBASE-4606:
---

Integrated in HBase-TRUNK #2332 (See 
[https://builds.apache.org/job/HBase-TRUNK/2332/])
HBASE-4606  Remove spam in HCM and fix a list.size == 0

jdcryans : 
Files : 
* /hbase/trunk/CHANGES.txt
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java


 Remove spam in HCM and fix a list.size == 0
 ---

 Key: HBASE-4606
 URL: https://issues.apache.org/jira/browse/HBASE-4606
 Project: HBase
  Issue Type: Improvement
Reporter: Jean-Daniel Cryans
Assignee: Jean-Daniel Cryans
 Fix For: 0.92.0

 Attachments: HBASE-4606.patch


 As discussed on the ML, HCM in 0.92 is being spammy with "expecting X 
 results", which is a debug leftover. Also, right next to it I see a 
 list.size == 0, which should be converted into isEmpty.
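
As a small illustration of the second point, the two checks are equivalent for a java.util.List, but isEmpty() states the intent directly. The class and method names below are made up for the example and are not taken from HConnectionManager.

```java
import java.util.List;

/** Hypothetical helper showing the preferred emptiness check. */
public final class EmptyCheck {
    private EmptyCheck() {}

    public static boolean nothingToProcess(List<?> list) {
        // Same result as list.size() == 0, but reads as a question
        // about emptiness rather than a comparison against a count.
        return list.isEmpty();
    }
}
```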





[jira] [Commented] (HBASE-4579) CST.requestCompaction semantics changed, logs are now spammed when too many store files

2011-10-17 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13129419#comment-13129419
 ] 

Hudson commented on HBASE-4579:
---

Integrated in HBase-TRUNK #2332 (See 
[https://builds.apache.org/job/HBase-TRUNK/2332/])
HBASE-4579  CST.requestCompaction semantics changed, logs are now
   spammed when too many store files
forgot the CHANGES
HBASE-4579  CST.requestCompaction semantics changed, logs are now
   spammed when too many store files

jdcryans : 
Files : 
* /hbase/trunk/CHANGES.txt

jdcryans : 
Files : 
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/MemStoreFlusher.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java


 CST.requestCompaction semantics changed, logs are now spammed when too many 
 store files
 ---

 Key: HBASE-4579
 URL: https://issues.apache.org/jira/browse/HBASE-4579
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.0
Reporter: Jean-Daniel Cryans
Assignee: Jean-Daniel Cryans
Priority: Critical
 Fix For: 0.92.0

 Attachments: HBASE-4579-v2.patch, HBASE-4579.patch


 Another bug where I'm not so sure what's going on. I see this in my log:
 {quote}
 2011-10-12 00:23:43,435 DEBUG org.apache.hadoop.hbase.regionserver.Store: 
 info: no store files to compact
 2011-10-12 00:23:44,335 DEBUG org.apache.hadoop.hbase.regionserver.Store: 
 info: no store files to compact
 2011-10-12 00:23:45,236 DEBUG org.apache.hadoop.hbase.regionserver.Store: 
 info: no store files to compact
 2011-10-12 00:23:46,136 DEBUG org.apache.hadoop.hbase.regionserver.Store: 
 info: no store files to compact
 2011-10-12 00:23:47,036 DEBUG org.apache.hadoop.hbase.regionserver.Store: 
 info: no store files to compact
 2011-10-12 00:23:47,936 DEBUG org.apache.hadoop.hbase.regionserver.Store: 
 info: no store files to compact
 {quote}
 It spams for a while, and a little later instead I get:
 {quote}
 2011-10-12 00:26:52,139 DEBUG org.apache.hadoop.hbase.regionserver.Store: 
 Skipped compaction of info.  Only 2 file(s) of size 176.4m have met 
 compaction criteria.
 2011-10-12 00:26:53,040 DEBUG org.apache.hadoop.hbase.regionserver.Store: 
 Skipped compaction of info.  Only 2 file(s) of size 176.4m have met 
 compaction criteria.
 2011-10-12 00:26:53,940 DEBUG org.apache.hadoop.hbase.regionserver.Store: 
 Skipped compaction of info.  Only 2 file(s) of size 176.4m have met 
 compaction criteria.
 2011-10-12 00:26:54,840 DEBUG org.apache.hadoop.hbase.regionserver.Store: 
 Skipped compaction of info.  Only 2 file(s) of size 176.4m have met 
 compaction criteria.
 2011-10-12 00:26:55,741 DEBUG org.apache.hadoop.hbase.regionserver.Store: 
 Skipped compaction of info.  Only 2 file(s) of size 176.4m have met 
 compaction criteria.
 2011-10-12 00:26:56,641 DEBUG org.apache.hadoop.hbase.regionserver.Store: 
 Skipped compaction of info.  Only 2 file(s) of size 176.4m have met 
 compaction criteria.
 {quote}
 I believe I also saw something like that for flushes, but the region was 
 closing so at least I know why it was spamming (would be nice if it just 
 unrequested the flush):
 {quote}
 2011-10-12 00:26:40,693 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: 
 Flush requested on 
 TestTable,0038168581,1318378894213.2beb8a1e29382a8d3e90a88b9662e5f5.
 2011-10-12 00:26:40,694 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: 
 NOT flushing memstore for region 
 TestTable,0038168581,1318378894213.2beb8a1e29382a8d3e90a88b9662e5f5., 
 flushing=false, writesEnabled=false
 2011-10-12 00:26:40,733 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: 
 Flush requested on 
 TestTable,0038168581,1318378894213.2beb8a1e29382a8d3e90a88b9662e5f5.
 2011-10-12 00:26:40,733 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: 
 NOT flushing memstore for region 
 TestTable,0038168581,1318378894213.2beb8a1e29382a8d3e90a88b9662e5f5., 
 flushing=false, writesEnabled=false
 2011-10-12 00:26:40,873 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: 
 Flush requested on 
 TestTable,0038168581,1318378894213.2beb8a1e29382a8d3e90a88b9662e5f5.
 2011-10-12 00:26:40,873 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: 
 NOT flushing memstore for region 
 TestTable,0038168581,1318378894213.2beb8a1e29382a8d3e90a88b9662e5f5., 
 flushing=false, writesEnabled=false
 2011-10-12 00:26:40,873 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: 
 Flush requested on 
 TestTable,0038168581,1318378894213.2beb8a1e29382a8d3e90a88b9662e5f5.
 2011-10-12 00:26:40,873 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: 
 NOT flushing memstore for region 
 TestTable,0038168581,1318378894213.2beb8a1e29382a8d3e90a88b9662e5f5., 
 flushing=false, writesEnabled=false
 2011-10-12 

[jira] [Updated] (HBASE-4486) Improve Javadoc for HTableDescriptor

2011-10-17 Thread Akash Ashok (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4486?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akash Ashok updated HBASE-4486:
---

Attachment: HBase-4486-v3.patch

Ram's Comments Incorporated

 Improve Javadoc for HTableDescriptor
 

 Key: HBASE-4486
 URL: https://issues.apache.org/jira/browse/HBASE-4486
 Project: HBase
  Issue Type: Improvement
  Components: client, documentation
Reporter: Akash Ashok
Assignee: Akash Ashok
Priority: Minor
 Attachments: HBase-4486-v2.patch, HBase-4486-v3.patch, 
 HBase-4486.patch, HTableDescriptor-v2.html, HTableDescriptor.html








[jira] [Commented] (HBASE-4580) Create some invalid zk nodes when a clean cluster start.

2011-10-17 Thread jirapos...@reviews.apache.org (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13129430#comment-13129430
 ] 

jirapos...@reviews.apache.org commented on HBASE-4580:
--


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/2420/
---

Review request for hbase.


Summary
---

https://issues.apache.org/jira/browse/HBASE-4580


This addresses bug HBASE-4580.
https://issues.apache.org/jira/browse/HBASE-4580


Diffs
-

  /src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java 1185442 

Diff: https://reviews.apache.org/r/2420/diff


Testing
---

1. I tested it on a real cluster (3 nodes, with a table of 15 regions):
a) restart the cluster.
b) kill the master, then start the master.
c) kill the master and one region server, then start the master.

2. All the UT test cases passed (I tested twice).
Results :

Tests in error:
  
testBadOriginalRootLocation(org.apache.hadoop.hbase.catalog.TestCatalogTrackerOnCluster):
 unknown host: example.org

Tests run: 1031, Failures: 0, Errors: 1, Skipped: 16

The TestCatalogTrackerOnCluster passed in a connected network environment.
 T E S T S
---
Running org.apache.hadoop.hbase.catalog.TestCatalogTrackerOnCluster
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 26.502 sec

Results :

Tests run: 1, Failures: 0, Errors: 0, Skipped: 0


Thanks,

jinchao



 Create some invalid zk nodes when a clean cluster start.
 

 Key: HBASE-4580
 URL: https://issues.apache.org/jira/browse/HBASE-4580
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 0.92.0
Reporter: gaojinchao
Assignee: gaojinchao
 Fix For: 0.92.0

 Attachments: HBASE-4580_TrunkV1.patch


 The logs below show that we created an invalid zk node when restarting a 
 cluster.
 It mistakenly believed that the regions belonged to a dead server.
 2011-10-11 05:05:29,127 INFO org.apache.hadoop.hbase.master.HMaster: Meta 
 updated status = true
 2011-10-11 05:05:29,127 INFO org.apache.hadoop.hbase.master.HMaster: 
 ROOT/Meta already up-to date with new HRI.
 2011-10-11 05:05:29,151 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: 
 master:58198-0x132f23a9a38 Creating (or updating) unassigned node for 
 771d63e9327383159553619a4f2dc74f with OFFLINE state
 2011-10-11 05:05:29,161 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: 
 master:58198-0x132f23a9a38 Creating (or updating) unassigned node for 
 3cf860dd323fe6360f571aeafc129f95 with OFFLINE state
 2011-10-11 05:05:29,170 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: 
 master:58198-0x132f23a9a38 Creating (or updating) unassigned node for 
 4065350214452a9d5c55243c734bef08 with OFFLINE state
 2011-10-11 05:05:29,178 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: 
 master:58198-0x132f23a9a38 Creating (or updating) unassigned node for 
 4e81613f82a39fc6e5e89f96e7b3ccc4 with OFFLINE state
 2011-10-11 05:05:29,187 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: 
 master:58198-0x132f23a9a38 Creating (or updating) unassigned node for 
 e21b9e1545a28953aba0098fda5c9cd9 with OFFLINE state
 2011-10-11 05:05:29,195 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: 
 master:58198-0x132f23a9a38 Creating (or updating) unassigned node for 
 5cd9f55eecd43d088bbd505f6795131f with OFFLINE state
 2011-10-11 05:05:29,229 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: 
 master:58198-0x132f23a9a38 Creating (or updating) unassigned node for 
 db5f641452a70b09b85a92970e4198c7 with OFFLINE state
 2011-10-11 05:05:29,237 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: 
 master:58198-0x132f23a9a38 Creating (or updating) unassigned node for 
 a7b20a653919e7f41bfb2ed349af7d21 with OFFLINE state
 2011-10-11 05:05:29,253 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: 
 master:58198-0x132f23a9a38 Creating (or updating) unassigned node for 
 c9385619425f737eab1a6624d2e097a8 with OFFLINE state
 // we cleaned all zk nodes.
 2011-10-11 05:05:29,262 INFO 
 org.apache.hadoop.hbase.master.AssignmentManager: Clean cluster startup. 
 Assigning userregions
 2011-10-11 05:05:29,262 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: 
 master:58198-0x132f23a9a38 Deleting any existing unassigned nodes
 2011-10-11 05:05:29,367 INFO 
 org.apache.hadoop.hbase.master.AssignmentManager: Bulk assigning 9 region(s) 
 across 1 server(s), retainAssignment=true
 2011-10-11 05:05:29,369 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Timeout-on-RIT=9000
 2011-10-11 05:05:29,369 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: 

[jira] [Updated] (HBASE-4605) Add constraints as a top-level feature

2011-10-17 Thread Jesse Yates (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesse Yates updated HBASE-4605:
---

Attachment: java_Constraint_v2.patch

Constraint implementation that is just added as a coprocessor. Not implemented 
as a Precondition for ease, though it could be ported over to that fairly 
easily. Basically, putting this up for posterity since the consensus seems to 
be pursuing #1 above. 

Also, is there a better way to pass back exceptions from coprocessors? Right 
now, the exception causes a retry, which is a huge timeout problem.

 Add constraints as a top-level feature
 --

 Key: HBASE-4605
 URL: https://issues.apache.org/jira/browse/HBASE-4605
 Project: HBase
  Issue Type: Improvement
  Components: client, coprocessors
Affects Versions: 0.94.0
Reporter: Jesse Yates
Assignee: Jesse Yates
 Attachments: java_Constraint_v2.patch


 From Jesse's comment on dev:
 {quote}
 What I would like to propose is a simple interface that people can use to 
 implement a 'constraint' (matching the classic database definition). This 
 would help ease of adoption by helping HBase more easily check that box, help 
 minimize code duplication across organizations, and lead to easier adoption.
 Essentially, people would implement a 'Constraint' interface for checking 
 keys before they are put into a table. Puts that are valid get written to the 
 table, but if not, people can throw an exception that gets propagated 
 back to the client explaining why the put was invalid.
 Constraints would be set on a per-table basis and the user would be expected 
 to ensure the jars containing the constraint are present on the machines 
 serving that table.
 Yes, people could roll their own mechanism for doing this via coprocessors 
 each time, but this would make it easier to do so, so you only have to 
 implement a very minimal interface and not worry about the specifics.
 {quote}





[jira] [Commented] (HBASE-4580) Create some invalid zk nodes when a clean cluster start.

2011-10-17 Thread jirapos...@reviews.apache.org (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13129444#comment-13129444
 ] 

jirapos...@reviews.apache.org commented on HBASE-4580:
--


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/2420/
---

(Updated 2011-10-18 02:57:00.553590)


Review request for hbase.


Changes
---

Sorry, I uploaded the wrong patch file; it was missing a line of code.


Summary
---

https://issues.apache.org/jira/browse/HBASE-4580


This addresses bug HBASE-4580.
https://issues.apache.org/jira/browse/HBASE-4580


Diffs (updated)
-

  /src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java 1185442 

Diff: https://reviews.apache.org/r/2420/diff


Testing
---

1. I tested it on a real cluster (3 nodes, with a table of 15 regions):
a) restart the cluster.
b) kill the master, then start the master.
c) kill the master and one region server, then start the master.

2. All the UT test cases passed (I tested twice).
Results :

Tests in error:
  
testBadOriginalRootLocation(org.apache.hadoop.hbase.catalog.TestCatalogTrackerOnCluster):
 unknown host: example.org

Tests run: 1031, Failures: 0, Errors: 1, Skipped: 16

The TestCatalogTrackerOnCluster passed in a connected network environment.
 T E S T S
---
Running org.apache.hadoop.hbase.catalog.TestCatalogTrackerOnCluster
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 26.502 sec

Results :

Tests run: 1, Failures: 0, Errors: 0, Skipped: 0


Thanks,

jinchao



 Create some invalid zk nodes when a clean cluster start.
 

 Key: HBASE-4580
 URL: https://issues.apache.org/jira/browse/HBASE-4580
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 0.92.0
Reporter: gaojinchao
Assignee: gaojinchao
 Fix For: 0.92.0

 Attachments: HBASE-4580_TrunkV1.patch


 The logs below show that we created an invalid zk node when restarting a 
 cluster.
 It mistakenly believed that the regions belonged to a dead server.
 2011-10-11 05:05:29,127 INFO org.apache.hadoop.hbase.master.HMaster: Meta 
 updated status = true
 2011-10-11 05:05:29,127 INFO org.apache.hadoop.hbase.master.HMaster: 
 ROOT/Meta already up-to date with new HRI.
 2011-10-11 05:05:29,151 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: 
 master:58198-0x132f23a9a38 Creating (or updating) unassigned node for 
 771d63e9327383159553619a4f2dc74f with OFFLINE state
 2011-10-11 05:05:29,161 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: 
 master:58198-0x132f23a9a38 Creating (or updating) unassigned node for 
 3cf860dd323fe6360f571aeafc129f95 with OFFLINE state
 2011-10-11 05:05:29,170 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: 
 master:58198-0x132f23a9a38 Creating (or updating) unassigned node for 
 4065350214452a9d5c55243c734bef08 with OFFLINE state
 2011-10-11 05:05:29,178 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: 
 master:58198-0x132f23a9a38 Creating (or updating) unassigned node for 
 4e81613f82a39fc6e5e89f96e7b3ccc4 with OFFLINE state
 2011-10-11 05:05:29,187 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: 
 master:58198-0x132f23a9a38 Creating (or updating) unassigned node for 
 e21b9e1545a28953aba0098fda5c9cd9 with OFFLINE state
 2011-10-11 05:05:29,195 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: 
 master:58198-0x132f23a9a38 Creating (or updating) unassigned node for 
 5cd9f55eecd43d088bbd505f6795131f with OFFLINE state
 2011-10-11 05:05:29,229 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: 
 master:58198-0x132f23a9a38 Creating (or updating) unassigned node for 
 db5f641452a70b09b85a92970e4198c7 with OFFLINE state
 2011-10-11 05:05:29,237 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: 
 master:58198-0x132f23a9a38 Creating (or updating) unassigned node for 
 a7b20a653919e7f41bfb2ed349af7d21 with OFFLINE state
 2011-10-11 05:05:29,253 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: 
 master:58198-0x132f23a9a38 Creating (or updating) unassigned node for 
 c9385619425f737eab1a6624d2e097a8 with OFFLINE state
 // we cleaned all zk nodes.
 2011-10-11 05:05:29,262 INFO 
 org.apache.hadoop.hbase.master.AssignmentManager: Clean cluster startup. 
 Assigning userregions
 2011-10-11 05:05:29,262 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: 
 master:58198-0x132f23a9a38 Deleting any existing unassigned nodes
 2011-10-11 05:05:29,367 INFO 
 org.apache.hadoop.hbase.master.AssignmentManager: Bulk assigning 9 region(s) 
 across 1 server(s), retainAssignment=true
 2011-10-11 05:05:29,369 DEBUG 
 

[jira] [Commented] (HBASE-4580) Create some invalid zk nodes when a clean cluster start.

2011-10-17 Thread jirapos...@reviews.apache.org (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13129449#comment-13129449
 ] 

jirapos...@reviews.apache.org commented on HBASE-4580:
--


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/2420/#review2639
---


Overall patch looks good.
TestCatalogTrackerOnCluster#testBadOriginalRootLocation passed.
See minor comments below.


/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java
https://reviews.apache.org/r/2420/#comment5941

Should read 'or regions that were in RIT'



/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java
https://reviews.apache.org/r/2420/#comment5942

Should be on line 2229. 'is' should be 'in'



/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java
https://reviews.apache.org/r/2420/#comment5943

Some people prefer the old format: ampersand at the end of first line 
signifying continuation on the second line



/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java
https://reviews.apache.org/r/2420/#comment5944

There should be a space between if and left parenthesis


- Ted


On 2011-10-18 02:57:00, jinchao gao wrote:
bq.  
bq.  ---
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/2420/
bq.  ---
bq.  
bq.  (Updated 2011-10-18 02:57:00)
bq.  
bq.  
bq.  Review request for hbase.
bq.  
bq.  
bq.  Summary
bq.  ---
bq.  
bq.  https://issues.apache.org/jira/browse/HBASE-4580
bq.  
bq.  
bq.  This addresses bug HBASE-4580.
bq.  https://issues.apache.org/jira/browse/HBASE-4580
bq.  
bq.  
bq.  Diffs
bq.  -
bq.  
bq./src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java 
1185442 
bq.  
bq.  Diff: https://reviews.apache.org/r/2420/diff
bq.  
bq.  
bq.  Testing
bq.  ---
bq.  
bq.  1. I tested it on a real cluster (3 nodes, with a table of 15 regions):
bq.  a) restarted the cluster;
bq.  b) killed the master, then started the master;
bq.  c) killed the master and one region server, then started the master.
bq.  
bq.  2. All the UT test cases passed. (I tested twice.)
bq.  Results :
bq.  
bq.  Tests in error:
bq.    testBadOriginalRootLocation(org.apache.hadoop.hbase.catalog.TestCatalogTrackerOnCluster): unknown host: example.org
bq.  
bq.  Tests run: 1031, Failures: 0, Errors: 1, Skipped: 16
bq.  
bq.  The TestCatalogTrackerOnCluster passed in a connected network environment.
bq.   T E S T S
bq.  ---
bq.  Running org.apache.hadoop.hbase.catalog.TestCatalogTrackerOnCluster
bq.  Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 26.502 sec
bq.  
bq.  Results :
bq.  
bq.  Tests run: 1, Failures: 0, Errors: 0, Skipped: 0
bq.  
bq.  
bq.  Thanks,
bq.  
bq.  jinchao
bq.  
bq.



 Create some invalid zk nodes when a clean cluster start.
 

 Key: HBASE-4580
 URL: https://issues.apache.org/jira/browse/HBASE-4580
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 0.92.0
Reporter: gaojinchao
Assignee: gaojinchao
 Fix For: 0.92.0

 Attachments: HBASE-4580_TrunkV1.patch


 The logs below show that an invalid zk node was created when the cluster was 
 restarted.
 The master mistakenly believed that the regions belonged to a dead server.
 2011-10-11 05:05:29,127 INFO org.apache.hadoop.hbase.master.HMaster: Meta 
 updated status = true
 2011-10-11 05:05:29,127 INFO org.apache.hadoop.hbase.master.HMaster: 
 ROOT/Meta already up-to date with new HRI.
 2011-10-11 05:05:29,151 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: 
 master:58198-0x132f23a9a38 Creating (or updating) unassigned node for 
 771d63e9327383159553619a4f2dc74f with OFFLINE state
 2011-10-11 05:05:29,161 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: 
 master:58198-0x132f23a9a38 Creating (or updating) unassigned node for 
 3cf860dd323fe6360f571aeafc129f95 with OFFLINE state
 2011-10-11 05:05:29,170 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: 
 master:58198-0x132f23a9a38 Creating (or updating) unassigned node for 
 4065350214452a9d5c55243c734bef08 with OFFLINE state
 2011-10-11 05:05:29,178 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: 
 master:58198-0x132f23a9a38 Creating (or updating) unassigned node for 
 4e81613f82a39fc6e5e89f96e7b3ccc4 with OFFLINE state
 2011-10-11 05:05:29,187 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: 
 master:58198-0x132f23a9a38 Creating (or updating) unassigned node for 
 

[jira] [Updated] (HBASE-4580) Some invalid zk nodes were created when a clean cluster restarts

2011-10-17 Thread Ted Yu (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4580?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-4580:
--

Summary: Some invalid zk nodes were created when a clean cluster restarts  
(was: Create some invalid zk nodes when a clean cluster start.)

 Some invalid zk nodes were created when a clean cluster restarts
 

 Key: HBASE-4580
 URL: https://issues.apache.org/jira/browse/HBASE-4580
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 0.92.0
Reporter: gaojinchao
Assignee: gaojinchao
 Fix For: 0.92.0

 Attachments: HBASE-4580_TrunkV1.patch


 The logs below show that an invalid zk node was created when the cluster was 
 restarted.
 The master mistakenly believed that the regions belonged to a dead server.
 2011-10-11 05:05:29,127 INFO org.apache.hadoop.hbase.master.HMaster: Meta 
 updated status = true
 2011-10-11 05:05:29,127 INFO org.apache.hadoop.hbase.master.HMaster: 
 ROOT/Meta already up-to date with new HRI.
 2011-10-11 05:05:29,151 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: 
 master:58198-0x132f23a9a38 Creating (or updating) unassigned node for 
 771d63e9327383159553619a4f2dc74f with OFFLINE state
 2011-10-11 05:05:29,161 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: 
 master:58198-0x132f23a9a38 Creating (or updating) unassigned node for 
 3cf860dd323fe6360f571aeafc129f95 with OFFLINE state
 2011-10-11 05:05:29,170 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: 
 master:58198-0x132f23a9a38 Creating (or updating) unassigned node for 
 4065350214452a9d5c55243c734bef08 with OFFLINE state
 2011-10-11 05:05:29,178 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: 
 master:58198-0x132f23a9a38 Creating (or updating) unassigned node for 
 4e81613f82a39fc6e5e89f96e7b3ccc4 with OFFLINE state
 2011-10-11 05:05:29,187 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: 
 master:58198-0x132f23a9a38 Creating (or updating) unassigned node for 
 e21b9e1545a28953aba0098fda5c9cd9 with OFFLINE state
 2011-10-11 05:05:29,195 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: 
 master:58198-0x132f23a9a38 Creating (or updating) unassigned node for 
 5cd9f55eecd43d088bbd505f6795131f with OFFLINE state
 2011-10-11 05:05:29,229 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: 
 master:58198-0x132f23a9a38 Creating (or updating) unassigned node for 
 db5f641452a70b09b85a92970e4198c7 with OFFLINE state
 2011-10-11 05:05:29,237 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: 
 master:58198-0x132f23a9a38 Creating (or updating) unassigned node for 
 a7b20a653919e7f41bfb2ed349af7d21 with OFFLINE state
 2011-10-11 05:05:29,253 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: 
 master:58198-0x132f23a9a38 Creating (or updating) unassigned node for 
 c9385619425f737eab1a6624d2e097a8 with OFFLINE state
 // we cleaned all zk nodes.
 2011-10-11 05:05:29,262 INFO 
 org.apache.hadoop.hbase.master.AssignmentManager: Clean cluster startup. 
 Assigning userregions
 2011-10-11 05:05:29,262 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: 
 master:58198-0x132f23a9a38 Deleting any existing unassigned nodes
 2011-10-11 05:05:29,367 INFO 
 org.apache.hadoop.hbase.master.AssignmentManager: Bulk assigning 9 region(s) 
 across 1 server(s), retainAssignment=true
 2011-10-11 05:05:29,369 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Timeout-on-RIT=9000
 2011-10-11 05:05:29,369 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Bulk assigning 9 region(s) 
 to C3S3,54366,1318323920153
 2011-10-11 05:05:29,369 INFO 
 org.apache.hadoop.hbase.master.AssignmentManager: Bulk assigning done
 2011-10-11 05:05:29,371 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: 
 master:58198-0x132f23a9a38 Async create of unassigned node for 
 771d63e9327383159553619a4f2dc74f with OFFLINE state
 2011-10-11 05:05:29,371 INFO org.apache.hadoop.hbase.master.HMaster: Master 
 has completed initialization
 2011-10-11 05:05:29,371 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: 
 master:58198-0x132f23a9a38 Async create of unassigned node for 
 3cf860dd323fe6360f571aeafc129f95 with OFFLINE state
 2011-10-11 05:05:29,371 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: 
 master:58198-0x132f23a9a38 Async create of unassigned node for 
 4065350214452a9d5c55243c734bef08 with OFFLINE state
 2011-10-11 05:05:29,371 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: 
 master:58198-0x132f23a9a38 Async create of unassigned node for 
 4e81613f82a39fc6e5e89f96e7b3ccc4 with OFFLINE state
 2011-10-11 05:05:29,371 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: 
 master:58198-0x132f23a9a38 Async create of unassigned node for 
 e21b9e1545a28953aba0098fda5c9cd9 with OFFLINE state
 2011-10-11 05:05:29,372 DEBUG 

[jira] [Commented] (HBASE-4605) Add constraints as a top-level feature

2011-10-17 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13129456#comment-13129456
 ] 

Ted Yu commented on HBASE-4605:
---

See the following from HBASE-4014 w.r.t. exceptions and coprocessors:

The general gist here is to wrap each of 
{Master,RegionServer}CoprocessorHost's coprocessor call inside a 

try { ... } catch (Throwable e) { handleCoprocessorThrowable(e) }

block. 
handleCoprocessorThrowable() is responsible for either passing 'e' along to 
the client (if 'e' is an IOException) or, otherwise, aborting the service 
(Regionserver or Master).
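
The wrapping described above can be sketched roughly as follows. This is a minimal illustration, not the HBASE-4014 code: CoprocessorHostSketch, Hook, invoke() and the aborted flag are hypothetical names, and a real host would presumably call the server's own abort mechanism instead of setting a flag.

```java
import java.io.IOException;

/** Hypothetical stand-in for a coprocessor host; not HBase's actual class. */
class CoprocessorHostSketch {

    /** Hypothetical callback type representing a single coprocessor call. */
    interface Hook {
        void call() throws Exception;
    }

    // Assumption: a boolean flag stands in for aborting the Regionserver or Master.
    private boolean aborted = false;

    boolean isAborted() {
        return aborted;
    }

    /** Wraps one coprocessor call: catch any Throwable and delegate to the handler. */
    void invoke(Hook hook) throws IOException {
        try {
            hook.call();
        } catch (Throwable e) {
            handleCoprocessorThrowable(e);
        }
    }

    /** IOExceptions are passed along to the caller; anything else aborts the service. */
    void handleCoprocessorThrowable(Throwable e) throws IOException {
        if (e instanceof IOException) {
            throw (IOException) e;
        }
        aborted = true;
    }
}
```

With this shape, a coprocessor that throws an IOException only fails the individual request, while any other Throwable is treated as a bug serious enough to stop the service.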

 Add constraints as a top-level feature
 --

 Key: HBASE-4605
 URL: https://issues.apache.org/jira/browse/HBASE-4605
 Project: HBase
  Issue Type: Improvement
  Components: client, coprocessors
Affects Versions: 0.94.0
Reporter: Jesse Yates
Assignee: Jesse Yates
 Attachments: java_Constraint_v2.patch


 From Jesse's comment on dev:
 {quote}
 What I would like to propose is a simple interface that people can use to 
 implement a 'constraint' (matching the classic database definition). This 
 would let HBase more easily check that box, help minimize code duplication 
 across organizations, and lead to easier adoption.
 Essentially, people would implement a 'Constraint' interface for checking 
 keys before they are put into a table. Puts that are valid get written to the 
 table; for an invalid put, the constraint throws an exception that gets 
 propagated back to the client explaining why the put was invalid.
 Constraints would be set on a per-table basis and the user would be expected 
 to ensure the jars containing the constraint are present on the machines 
 serving that table.
 Yes, people could roll their own mechanism for doing this via coprocessors 
 each time, but this would make it easier to do so, so you only have to 
 implement a very minimal interface and not worry about the specifics.
 {quote}
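
A rough sketch of what such an interface could look like, purely as an illustration of the proposal quoted above. The names Constraint, ConstraintException, MaxKeyLengthConstraint and ConstrainedTableSketch are assumptions made for this example and are not taken from the attached patch; the in-memory list stands in for a real table.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical minimal constraint interface: check a key before it is put. */
interface Constraint {
    void check(byte[] rowKey) throws ConstraintException;
}

/** Hypothetical exception type; an IOException so it can travel back to the client. */
class ConstraintException extends IOException {
    ConstraintException(String message) {
        super(message);
    }
}

/** Example constraint (illustrative only): reject keys longer than a fixed length. */
class MaxKeyLengthConstraint implements Constraint {
    private final int maxLength;

    MaxKeyLengthConstraint(int maxLength) {
        this.maxLength = maxLength;
    }

    @Override
    public void check(byte[] rowKey) throws ConstraintException {
        if (rowKey.length > maxLength) {
            throw new ConstraintException(
                "row key is " + rowKey.length + " bytes, limit is " + maxLength);
        }
    }
}

/** In-memory stand-in for a table with per-table constraints. */
class ConstrainedTableSketch {
    private final List<Constraint> constraints = new ArrayList<>();
    private final List<byte[]> rows = new ArrayList<>();

    void addConstraint(Constraint constraint) {
        constraints.add(constraint);
    }

    /** Runs every constraint first; the row is stored only if all of them pass. */
    void put(byte[] rowKey) throws ConstraintException {
        for (Constraint constraint : constraints) {
            constraint.check(rowKey);
        }
        rows.add(rowKey);
    }

    int size() {
        return rows.size();
    }
}
```

Making the exception an IOException is a design choice for this sketch: it fits the coprocessor exception handling discussed in HBASE-4014, where IOExceptions are passed back to the client rather than aborting the server.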

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-4562) When split doing offlineParentInMeta encounters error, it'll cause data loss

2011-10-17 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13129457#comment-13129457
 ] 

Lars Hofhansl commented on HBASE-4562:
--

No problem. Thanks for patch :)

 When split doing offlineParentInMeta encounters error, it'll cause data loss
 

 Key: HBASE-4562
 URL: https://issues.apache.org/jira/browse/HBASE-4562
 Project: HBase
  Issue Type: Bug
  Components: regionserver
Affects Versions: 0.90.4
Reporter: bluedavy
Assignee: bluedavy
Priority: Blocker
 Fix For: 0.90.5

 Attachments: HBASE-4562-0.90.4.patch, HBASE-4562-0.90.patch, 
 HBASE-4562-0.92.patch, HBASE-4562-trunk.patch, test-4562-0.90.4.txt, 
 test-4562-0.90.txt, test-4562-0.92.txt, test-4562-trunk.txt


 Follow the steps below to reproduce the problem:
 1. Change SplitTransaction.java as below, to mock a timeout error.
{code:title=SplitTransaction.java|borderStyle=solid}
   if (!testing) {
     MetaEditor.offlineParentInMeta(server.getCatalogTracker(),
       this.parent.getRegionInfo(), a.getRegionInfo(), b.getRegionInfo());
     // Simulate a failure right after the parent has been offlined in META.
     throw new IOException("some unexpected error in split");
   }
{code} 
 2. Update the regionserver code and restart.
 3. Create a table and put some data into it.
 4. Split the table.
 5. Kill the regionserver hosting the table.
 6. Wait until the master's ServerShutdownHandler.process has executed, then 
 scan the table; you'll find that the data written before is lost.
 The attached patch fixes the bug.





[jira] [Issue Comment Edited] (HBASE-4562) When split doing offlineParentInMeta encounters error, it'll cause data loss

2011-10-17 Thread Lars Hofhansl (Issue Comment Edited) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13129457#comment-13129457
 ] 

Lars Hofhansl edited comment on HBASE-4562 at 10/18/11 3:40 AM:


No problem. Thanks for the patch :)

  was (Author: lhofhansl):
No problem. Thanks for patch :)
  
 When split doing offlineParentInMeta encounters error, it'll cause data loss
 

 Key: HBASE-4562
 URL: https://issues.apache.org/jira/browse/HBASE-4562
 Project: HBase
  Issue Type: Bug
  Components: regionserver
Affects Versions: 0.90.4
Reporter: bluedavy
Assignee: bluedavy
Priority: Blocker
 Fix For: 0.90.5

 Attachments: HBASE-4562-0.90.4.patch, HBASE-4562-0.90.patch, 
 HBASE-4562-0.92.patch, HBASE-4562-trunk.patch, test-4562-0.90.4.txt, 
 test-4562-0.90.txt, test-4562-0.92.txt, test-4562-trunk.txt


 Follow the steps below to reproduce the problem:
 1. Change SplitTransaction.java as below, to mock a timeout error.
{code:title=SplitTransaction.java|borderStyle=solid}
   if (!testing) {
     MetaEditor.offlineParentInMeta(server.getCatalogTracker(),
       this.parent.getRegionInfo(), a.getRegionInfo(), b.getRegionInfo());
     // Simulate a failure right after the parent has been offlined in META.
     throw new IOException("some unexpected error in split");
   }
{code} 
 2. Update the regionserver code and restart.
 3. Create a table and put some data into it.
 4. Split the table.
 5. Kill the regionserver hosting the table.
 6. Wait until the master's ServerShutdownHandler.process has executed, then 
 scan the table; you'll find that the data written before is lost.
 The attached patch fixes the bug.





[jira] [Commented] (HBASE-4510) HDFS-1620 related changes downstream (For compiling with HDFS 0.23+)

2011-10-17 Thread jirapos...@reviews.apache.org (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13129458#comment-13129458
 ] 

jirapos...@reviews.apache.org commented on HBASE-4510:
--


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/2108/
---

(Updated 2011-10-18 03:42:06.955026)


Review request for hbase.


Changes
---

Updated patch that should fix 0.23 builds.


Summary
---

HBase no longer seems to compile against 0.23 after the HDFS-1620 naming 
refactorings were carried out.

Two solutions:

1. We use the new class names. This breaks HBase's backward compatibility with 
older Hadoop releases (is that a concern with future releases?)
2. HBase gets its own set of constants, as the upstream one is not marked for 
public usage. This needs a little more maintenance on HBase's side.

Way (2) seems more viable. I've attached an initial patch that doesn't fix all 
the trouble yet, but introduces the first wave of changes.

The remaining issue lies in the use of DistributedFileSystem's safemode methods 
(which are private audience as well) inside HBase for safemode waits and 
checks (via HBase's FSUtils class). Since this uses an enum, it is more 
difficult to handle without upstream intervention - thoughts?


This addresses bug HBASE-4510.
https://issues.apache.org/jira/browse/HBASE-4510


Diffs (updated)
-

  src/main/java/org/apache/hadoop/hbase/util/FSHDFSUtils.java dcd0937 
  src/main/java/org/apache/hadoop/hbase/util/FSUtils.java 789dd3b 

Diff: https://reviews.apache.org/r/2108/diff


Testing
---


Thanks,

Harsh



 HDFS-1620 related changes downstream (For compiling with HDFS 0.23+)
 

 Key: HBASE-4510
 URL: https://issues.apache.org/jira/browse/HBASE-4510
 Project: HBase
  Issue Type: Task
Affects Versions: 0.94.0
Reporter: Harsh J
Assignee: Harsh J
Priority: Blocker

 HBase no longer seems to compile against 0.23 after the HDFS-1620 naming 
 refactorings were carried out.
 Two solutions:
 * We use the new class names. This breaks HBase's backward compatibility with 
 older Hadoop releases (is that a concern with future releases?)
 * HBase gets its own set of constants, as the upstream one is not marked for 
 public usage. This needs a little more maintenance on HBase's side.




