[jira] [Updated] (HBASE-3845) data loss because lastSeqWritten can miss memstore edits

2011-07-30 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-3845?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-3845:
--

  Resolution: Fixed
Hadoop Flags: [Reviewed]
  Status: Resolved  (was: Patch Available)

 data loss because lastSeqWritten can miss memstore edits
 

 Key: HBASE-3845
 URL: https://issues.apache.org/jira/browse/HBASE-3845
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.90.3
Reporter: Prakash Khemani
Assignee: ramkrishna.s.vasudevan
Priority: Critical
 Fix For: 0.90.5

 Attachments: 
 0001-HBASE-3845-data-loss-because-lastSeqWritten-can-miss.patch, 
 HBASE-3845-fix-TestResettingCounters-test.txt, HBASE-3845_1.patch, 
 HBASE-3845_2.patch, HBASE-3845_4.patch, HBASE-3845_5.patch, 
 HBASE-3845_6.patch, HBASE-3845__trunk.patch, HBASE-3845_trunk_2.patch, 
 HBASE-3845_trunk_3.patch


 (I don't have a test case to prove this yet but I have run it by Dhruba and 
 Kannan internally and wanted to put this up for some feedback.)
 In this discussion let us assume that the region has only one column family. 
 That way I can use region/memstore interchangeably.
 After a memstore flush it is possible for lastSeqWritten to have a 
 log-sequence-id for a region that is not the earliest log-sequence-id for 
 that region's memstore.
 HLog.append() does a putIfAbsent into lastSeqWritten. This is to ensure 
 that we only keep track of the earliest log-sequence-number that is present 
 in the memstore.
 Every time the memstore is flushed we remove the region's entry in 
 lastSeqWritten and wait for the next append to populate this entry 
 again. This is where the problem happens.
 step 1:
 flusher.prepare() snapshots the memstore under 
 HRegion.updatesLock.writeLock().
 step 2:
 as soon as the updatesLock.writeLock() is released, new entries will be added 
 into the memstore.
 step 3:
 wal.completeCacheFlush() is called. This method removes the region's entry 
 from lastSeqWritten.
 step 4:
 the next append will create a new entry for the region in lastSeqWritten. 
 But this will be the log seq id of the current append. All the edits that 
 were added in step 2 have earlier sequence ids, so the entry misses them.
 ==
 as a temporary measure, instead of removing the region's entry in step 3 I 
 will replace it with the log-seq-id of the region-flush-event.
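
The four steps above can be modelled in a few lines of plain Java. This is only a sketch under stated assumptions: SeqTracker, SeqTrackerDemo and their methods are hypothetical names, not the HBase HLog API, and the model is single-threaded so the interleaving is forced by call order. Only the putIfAbsent/remove pattern and the proposed replace-with-flush-id measure come from the description.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/** Hypothetical model of the lastSeqWritten bookkeeping; not the real HLog class. */
class SeqTracker {
    // region name -> earliest log sequence id believed to be in the memstore
    final ConcurrentMap<String, Long> lastSeqWritten =
        new ConcurrentHashMap<String, Long>();
    private long nextSeqId = 1;

    /** Models an append: the id is recorded only if the region has no entry yet. */
    long append(String region) {
        long seqId = nextSeqId++;
        lastSeqWritten.putIfAbsent(region, seqId);
        return seqId;
    }

    /** Models the start of a flush: the flush event takes its own sequence id. */
    long startCacheFlush() {
        return nextSeqId++;
    }

    /** Behaviour described in step 3: drop the region's entry. */
    void completeCacheFlushByRemoving(String region) {
        lastSeqWritten.remove(region);
    }

    /** Proposed temporary measure: keep an entry, set to the flush sequence id. */
    void completeCacheFlushByReplacing(String region, long flushSeqId) {
        lastSeqWritten.put(region, flushSeqId);
    }
}

class SeqTrackerDemo {
    /** Replays steps 1 to 4 and returns the region's entry afterwards. */
    static long entryAfterRace(boolean useFix) {
        SeqTracker t = new SeqTracker();
        t.append("r1");                        // id 1, ends up in the snapshot
        long flushId = t.startCacheFlush();    // id 2, step 1
        t.append("r1");                        // id 3, step 2: not in the snapshot
        if (useFix) {
            t.completeCacheFlushByReplacing("r1", flushId);
        } else {
            t.completeCacheFlushByRemoving("r1");  // step 3
        }
        t.append("r1");                        // id 4, step 4
        return t.lastSeqWritten.get("r1");
    }
}
```

In this model, removing the entry leaves the region tracked at id 4 even though the unflushed edit with id 3 is still in the memstore, while replacing the entry with the flush id leaves it at 2, which is no later than any unflushed edit.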

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-3845) data loss because lastSeqWritten can miss memstore edits

2011-07-30 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13073159#comment-13073159
 ] 

Ted Yu commented on HBASE-3845:
---

Applied to TRUNK.
TestResettingCounters passes now.

Thanks for the patch Anirudh.





[jira] [Commented] (HBASE-3845) data loss because lastSeqWritten can miss memstore edits

2011-07-30 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13073168#comment-13073168
 ] 

Hudson commented on HBASE-3845:
---

Integrated in HBase-TRUNK #2064 (See 
[https://builds.apache.org/job/HBase-TRUNK/2064/])
HBASE-3845 Addendum: relax lastSeqWritten check in case write to WAL is 
skipped

tedyu : 
Files : 
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java






[jira] [Updated] (HBASE-4003) Cleanup Calls Conservatively On Timeout

2011-07-30 Thread Karthick Sankarachary (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4003?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthick Sankarachary updated HBASE-4003:
-

Attachment: (was: HBASE-4003-V2.patch)

 Cleanup Calls Conservatively On Timeout
 ---

 Key: HBASE-4003
 URL: https://issues.apache.org/jira/browse/HBASE-4003
 Project: HBase
  Issue Type: Bug
  Components: client
Affects Versions: 0.90.3
Reporter: Karthick Sankarachary
Assignee: Karthick Sankarachary
 Fix For: 0.92.0

 Attachments: HBASE-4003.patch


 In the event of a socket timeout, the {{HBaseClient}} iterates over the 
 outstanding calls (on that socket), and notifies them that a 
 {{SocketTimeoutException}} has occurred. Ideally, we should be cleaning up 
 just those calls that have been outstanding for longer than the specified 
 socket timeout.
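
A minimal sketch of the conservative cleanup described above, assuming each pending call records when it was started. PendingCall and CallCleaner are hypothetical names and do not reflect the actual HBaseClient internals; the sketch treats a call as expired once it has waited at least the timeout.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for a pending RPC; not the HBaseClient call class. */
class PendingCall {
    final int id;
    final long startTimeMs;

    PendingCall(int id, long startTimeMs) {
        this.id = id;
        this.startTimeMs = startTimeMs;
    }
}

class CallCleaner {
    /**
     * Removes, and returns the ids of, only those calls that have been
     * outstanding for at least timeoutMs. Younger calls stay in the map.
     */
    static List<Integer> cleanupTimedOut(Map<Integer, PendingCall> calls,
                                         long nowMs, long timeoutMs) {
        List<Integer> expired = new ArrayList<Integer>();
        Iterator<Map.Entry<Integer, PendingCall>> it = calls.entrySet().iterator();
        while (it.hasNext()) {
            PendingCall call = it.next().getValue();
            if (nowMs - call.startTimeMs >= timeoutMs) {
                expired.add(call.id);
                it.remove();  // a real client would also notify the caller here
            }
        }
        return expired;
    }
}
```

With a 500 ms timeout at time 1000, a call started at 0 is cleaned up while one started at 900 is left pending.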





[jira] [Updated] (HBASE-4148) HFileOutputFormat doesn't fill in TIMERANGE_KEY metadata

2011-07-30 Thread Jonathan Hsieh (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4148?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Hsieh updated HBASE-4148:
--

Attachment: 0001-HBASE-4148.-HFileOutputFormat-doesn-t-fill-in-TIMERA.patch

 HFileOutputFormat doesn't fill in TIMERANGE_KEY metadata
 

 Key: HBASE-4148
 URL: https://issues.apache.org/jira/browse/HBASE-4148
 Project: HBase
  Issue Type: Bug
  Components: mapreduce
Affects Versions: 0.90.3
Reporter: Todd Lipcon
Assignee: Jonathan Hsieh
 Fix For: 0.90.5

 Attachments: 
 0001-HBASE-4148.-HFileOutputFormat-doesn-t-fill-in-TIMERA.patch


 When HFiles are flushed through the normal path, they include an attribute 
 TIMERANGE_KEY which can be used to cull HFiles when performing a 
 time-restricted scan. Files produced by HFileOutputFormat are currently 
 missing this metadata.
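
To illustrate why this metadata matters, here is a small sketch of tracking the minimum and maximum timestamp while writing a file and using that range to skip the file during a time-restricted scan. TimestampRange is a hypothetical class, not HBase's own implementation, and treating the scan range as half-open [from, to) is an assumption of this sketch.

```java
/** Hypothetical min/max timestamp tracker; not the HBase implementation. */
class TimestampRange {
    private long min = Long.MAX_VALUE;
    private long max = Long.MIN_VALUE;

    /** Called once per cell written to the file. */
    void include(long timestamp) {
        min = Math.min(min, timestamp);
        max = Math.max(max, timestamp);
    }

    boolean isEmpty() {
        return min > max;
    }

    /** True if the tracked range [min, max] intersects the scan range [from, to). */
    boolean overlaps(long from, long to) {
        return !isEmpty() && min < to && max >= from;
    }
}
```

A scanner that reads such a range from file metadata can skip any file for which overlaps(...) is false. A file without the metadata has to be opened and read regardless.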





[jira] [Updated] (HBASE-4148) HFileOutputFormat doesn't fill in TIMERANGE_KEY metadata

2011-07-30 Thread Jonathan Hsieh (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4148?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Hsieh updated HBASE-4148:
--

Status: Patch Available  (was: Open)

Up for review here: https://reviews.apache.org/r/1229/





[jira] [Commented] (HBASE-4148) HFileOutputFormat doesn't fill in TIMERANGE_KEY metadata

2011-07-30 Thread jirapos...@reviews.apache.org (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13073230#comment-13073230
 ] 

jirapos...@reviews.apache.org commented on HBASE-4148:
--


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/1229/
---

Review request for hbase and Todd Lipcon.


Summary
---

When HFiles are flushed through the normal path, they include an attribute 
TIMERANGE_KEY which can be used to cull HFiles when performing a 
time-restricted scan. Files produced by HFileOutputFormat are currently missing 
this metadata.


This addresses bug HBASE-4148.
https://issues.apache.org/jira/browse/HBASE-4148


Diffs
-

  src/main/java/org/apache/hadoop/hbase/mapreduce/HFileOutputFormat.java 8ccdf4d 
  src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java 40efdda 
  src/test/java/org/apache/hadoop/hbase/mapreduce/TestHFileOutputFormat.java 89241eb 

Diff: https://reviews.apache.org/r/1229/diff


Testing
---

Added unit test.  

I don't quite understand why the KeyValue with the larger timestamp (2000) 
must be written before the one with the smaller timestamp (1000). I can 
see the code that enforces this (HFile.checkKey) but not why keys are ordered 
from larger to smaller. Is this an HFile data precondition?

I cannot get the full test suite to pass, with or without this patch. The suite 
seems to time out on tests unrelated to this. I would appreciate some hints or 
pointers on which tests are flaky or take a long time to run.


Thanks,

jmhsieh
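
On the ordering question raised under Testing: HBase sorts KeyValues for the same row and column with the newest timestamp first, and an HFile expects keys to be appended in that sorted order. The sketch below shows such an ordering in simplified form. SimpleCell and SimpleCellOrder are hypothetical names, and comparing Strings instead of raw bytes is a simplification, so this is not the KeyValue comparator itself.

```java
import java.util.Comparator;

/** Hypothetical, simplified cell; real KeyValues hold byte arrays. */
class SimpleCell {
    final String row;
    final String column;
    final long timestamp;

    SimpleCell(String row, String column, long timestamp) {
        this.row = row;
        this.column = column;
        this.timestamp = timestamp;
    }
}

class SimpleCellOrder {
    /** Row ascending, column ascending, then timestamp DESCENDING (newest first). */
    static final Comparator<SimpleCell> COMPARATOR = new Comparator<SimpleCell>() {
        @Override
        public int compare(SimpleCell a, SimpleCell b) {
            int c = a.row.compareTo(b.row);
            if (c != 0) {
                return c;
            }
            c = a.column.compareTo(b.column);
            if (c != 0) {
                return c;
            }
            return Long.compare(b.timestamp, a.timestamp);  // reversed on purpose
        }
    };

    /** True if 'next' may follow 'prev' in a file that requires sorted keys. */
    static boolean inOrder(SimpleCell prev, SimpleCell next) {
        return COMPARATOR.compare(prev, next) <= 0;
    }
}
```

Under this ordering a cell with timestamp 2000 sorts before one with timestamp 1000 for the same row and column, so a reader looking for the latest version finds it first.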







[jira] [Commented] (HBASE-451) Remove HTableDescriptor from HRegionInfo

2011-07-30 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13073279#comment-13073279
 ] 

Hudson commented on HBASE-451:
--

Integrated in HBase-TRUNK #2065 (See 
[https://builds.apache.org/job/HBase-TRUNK/2065/])
HBASE-4032 HBASE-451 improperly breaks public API HRegionInfo#getTableDesc

tedyu : 
Files : 
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/HRegionInfo.java
* /hbase/trunk/CHANGES.txt
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestHRegionInfo.java


 Remove HTableDescriptor from HRegionInfo
 

 Key: HBASE-451
 URL: https://issues.apache.org/jira/browse/HBASE-451
 Project: HBase
  Issue Type: Improvement
  Components: master, regionserver
Affects Versions: 0.2.0
Reporter: Jim Kellerman
Assignee: Subbu M Iyer
Priority: Critical
 Fix For: 0.92.0

 Attachments: 451-addendum-v2.txt, 
 451_support_for_removing_HTD_from_HRI_trunk.txt, 
 HBASE-451-Fixed_broken_TestAdmin.patch, 
 HBASE-451-Fixed_broken_TestAdmin1.patch, 
 HBASE-451_-_First_draft_support_for_removing_HTD_from_HRI1.patch, 
 HBASE-451_-_Fourth_draft_support_for_removing_HTD_from_HRI.patch, 
 HBASE-451_-_Second_draft_-_Remove_HTD_from_HRI.patch, descriptors.txt, 
 fixtestadmin.txt, pass_htd_on_region_construction.txt


 There is an HRegionInfo for every region in HBase. Currently HRegionInfo also 
 contains the HTableDescriptor (the schema). That means we store the schema n 
 times where n is the number of regions in the table.
 Additionally, for every region of the same table that the region server has 
 open, there is a copy of the schema. Thus it is stored in memory once for 
 each open region.
 If HRegionInfo merely contained the table name the HTableDescriptor could be 
 stored in a separate file and easily found.
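
The idea can be sketched as follows: the region record keeps only the table name, and the schema is held once per table and looked up by that name. TableSchema, SlimRegionInfo and SchemaStore are hypothetical names used for illustration, and the in-memory map stands in for the separate file mentioned above.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical schema object, standing in for a table descriptor. */
class TableSchema {
    final String tableName;
    final String[] families;

    TableSchema(String tableName, String... families) {
        this.tableName = tableName;
        this.families = families;
    }
}

/** Hypothetical region record that carries the table name, not the schema. */
class SlimRegionInfo {
    final String tableName;
    final String startKey;

    SlimRegionInfo(String tableName, String startKey) {
        this.tableName = tableName;
        this.startKey = startKey;
    }
}

/** One schema per table, shared by every region of that table. */
class SchemaStore {
    private final Map<String, TableSchema> byTable = new HashMap<String, TableSchema>();

    void put(TableSchema schema) {
        byTable.put(schema.tableName, schema);
    }

    TableSchema schemaFor(SlimRegionInfo region) {
        return byTable.get(region.tableName);
    }

    int size() {
        return byTable.size();
    }
}
```

With n regions of one table, the store holds a single TableSchema and every region resolves to the same instance, instead of n embedded copies.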





[jira] [Commented] (HBASE-4032) HBASE-451 improperly breaks public API HRegionInfo#getTableDesc

2011-07-30 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13073280#comment-13073280
 ] 

Hudson commented on HBASE-4032:
---

Integrated in HBase-TRUNK #2065 (See 
[https://builds.apache.org/job/HBase-TRUNK/2065/])
HBASE-4032 HBASE-451 improperly breaks public API HRegionInfo#getTableDesc

tedyu : 
Files : 
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/HRegionInfo.java
* /hbase/trunk/CHANGES.txt
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestHRegionInfo.java


 HBASE-451 improperly breaks public API HRegionInfo#getTableDesc
 ---

 Key: HBASE-4032
 URL: https://issues.apache.org/jira/browse/HBASE-4032
 Project: HBase
  Issue Type: Bug
Reporter: Andrew Purtell
Assignee: stack
Priority: Blocker
 Fix For: 0.92.0

 Attachments: 4032-v2.txt, 4032-v3.txt, 4032.txt


 After HBASE-451, HRegionInfo#getTableDesc has been modified to always return 
 {{null}}. 
 One immediate effect is broken unit tests.
 That aside, it is not in the spirit of deprecation to actually break the 
 method until after the deprecation cycle; it's a bug.
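
One way to honour a deprecation cycle is to keep the old accessor working by delegating to wherever the data now lives, rather than returning null. The sketch below is illustrative only: TableDesc, DescLookup and CompatRegionInfo are hypothetical names, and it does not claim to show what the committed patch does.

```java
/** Hypothetical descriptor type. */
class TableDesc {
    final String name;

    TableDesc(String name) {
        this.name = name;
    }
}

/** Hypothetical lookup, standing in for wherever descriptors are stored now. */
interface DescLookup {
    TableDesc find(String tableName);
}

class CompatRegionInfo {
    private final String tableName;
    private final DescLookup lookup;

    CompatRegionInfo(String tableName, DescLookup lookup) {
        this.tableName = tableName;
        this.lookup = lookup;
    }

    String getTableName() {
        return tableName;
    }

    /**
     * Still returns a usable descriptor during the deprecation cycle.
     *
     * @deprecated callers should resolve the descriptor by table name instead
     */
    @Deprecated
    TableDesc getTableDesc() {
        return lookup.find(tableName);
    }
}
```

Existing callers of getTableDesc() keep getting a non-null result while they migrate, and the method can be removed once the cycle ends.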





[jira] [Commented] (HBASE-4148) HFileOutputFormat doesn't fill in TIMERANGE_KEY metadata

2011-07-30 Thread jirapos...@reviews.apache.org (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13073328#comment-13073328
 ] 

jirapos...@reviews.apache.org commented on HBASE-4148:
--


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/1229/
---

(Updated 2011-07-31 05:52:30.608713)


Review request for hbase and Todd Lipcon.


Changes
---

Updated to address nit.





 HFileOutputFormat doesn't fill in TIMERANGE_KEY metadata
 

 Key: HBASE-4148
 URL: https://issues.apache.org/jira/browse/HBASE-4148
 Project: HBase
  Issue Type: Bug
  Components: mapreduce
Affects Versions: 0.90.3
Reporter: Todd Lipcon
Assignee: Jonathan Hsieh
 Fix For: 0.90.5

 Attachments: 
 0001-HBASE-4148-HFileOutputFormat-doesn-t-fill-in-TIMERAN.patch, 
 0001-HBASE-4148.-HFileOutputFormat-doesn-t-fill-in-TIMERA.patch

