date:20110730

[jira] [Updated] (HBASE-3845) data loss because lastSeqWritten can miss memstore edits

2011-07-30 Thread Ted Yu (JIRA)

[
https://issues.apache.org/jira/browse/HBASE-3845?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Ted Yu updated HBASE-3845:
--

Resolution: Fixed
Hadoop Flags: [Reviewed]
Status: Resolved (was: Patch Available)

data loss because lastSeqWritten can miss memstore edits

Key: HBASE-3845
URL: https://issues.apache.org/jira/browse/HBASE-3845
Project: HBase
Issue Type: Bug
Affects Versions: 0.90.3
Reporter: Prakash Khemani
Assignee: ramkrishna.s.vasudevan
Priority: Critical
Fix For: 0.90.5

Attachments:
0001-HBASE-3845-data-loss-because-lastSeqWritten-can-miss.patch,
HBASE-3845-fix-TestResettingCounters-test.txt, HBASE-3845_1.patch,
HBASE-3845_2.patch, HBASE-3845_4.patch, HBASE-3845_5.patch,
HBASE-3845_6.patch, HBASE-3845__trunk.patch, HBASE-3845_trunk_2.patch,
HBASE-3845_trunk_3.patch

(I don't have a test case to prove this yet but I have run it by Dhruba and
Kannan internally and wanted to put this up for some feedback.)
In this discussion let us assume that the region has only one column family.
That way I can use region/memstore interchangeably.
After a memstore flush it is possible for lastSeqWritten to have a
log-sequence-id for a region that is not the earliest log-sequence-id for
that region's memstore.
HLog.append() does a putIfAbsent into lastSeqWritten. This is to ensure
that we only keep track of the earliest log-sequence-number that is present
in the memstore.
Every time the memstore is flushed we remove the region's entry in
lastSeqWritten and wait for the next append to populate this entry
again. This is where the problem happens.
Step 1:
flusher.prepare() snapshots the memstore under
HRegion.updatesLock.writeLock().
Step 2:
As soon as updatesLock.writeLock() is released, new entries are added to the
memstore.
Step 3:
wal.completeCacheFlush() is called. This method removes the region's entry
from lastSeqWritten.
Step 4:
The next append creates a new entry for the region in lastSeqWritten, but
with the log-seq-id of that append. All the edits that were added in step 2
are missing.
==
As a temporary measure, instead of removing the region's entry in step 3, I
will replace it with the log-seq-id of the region-flush-event.
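
The four steps can be replayed in a small single-threaded model. This is only a sketch under stated assumptions, not the HLog code: the class LastSeqWrittenSketch, the helper startFlush() and the two completeFlush... methods are hypothetical names, and the flush event is assumed to draw its sequence id from the same counter as appends.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/** Toy, single-threaded model of the bookkeeping described above; not the real HLog. */
class LastSeqWrittenSketch {
    // region name -> earliest log sequence id believed to still be in the memstore
    final ConcurrentMap<String, Long> lastSeqWritten = new ConcurrentHashMap<>();
    private long logSeqNum = 0;

    /** Models HLog.append(): records the id only if the region has no entry yet. */
    long append(String region) {
        long seq = ++logSeqNum;
        lastSeqWritten.putIfAbsent(region, seq);
        return seq;
    }

    /** Hypothetical helper: hands out the sequence id of the flush event (step 1). */
    long startFlush() {
        return ++logSeqNum;
    }

    /** Step 3 as described: drop the region's entry outright. */
    void completeFlushByRemoving(String region) {
        lastSeqWritten.remove(region);
    }

    /** Step 3 with the temporary measure: keep an entry at the flush's sequence id. */
    void completeFlushWithFlushSeqId(String region, long flushSeqId) {
        lastSeqWritten.put(region, flushSeqId);
    }

    /** Replays steps 1-4; returns {earliest unflushed edit, value tracked afterwards}. */
    static long[] replay(boolean useTemporaryMeasure) {
        LastSeqWrittenSketch wal = new LastSeqWrittenSketch();
        String region = "region-1";
        wal.append(region);                           // seq 1, captured by the snapshot
        long flushSeqId = wal.startFlush();           // step 1: flush event gets seq 2
        long earliestUnflushed = wal.append(region);  // step 2: seq 3, entry still says 1
        wal.append(region);                           // step 2: seq 4
        if (useTemporaryMeasure) {
            wal.completeFlushWithFlushSeqId(region, flushSeqId);
        } else {
            wal.completeFlushByRemoving(region);      // step 3: entry removed
        }
        wal.append(region);                           // step 4: seq 5
        return new long[] { earliestUnflushed, wal.lastSeqWritten.get(region) };
    }

    public static void main(String[] args) {
        long[] removed = replay(false);
        long[] patched = replay(true);
        System.out.println("remove entry: earliest unflushed=" + removed[0]
            + ", tracked=" + removed[1]);
        System.out.println("flush seq id: earliest unflushed=" + patched[0]
            + ", tracked=" + patched[1]);
    }
}
```

In this model, removing the entry leaves the tracked value above the earliest unflushed edit, so the edits from step 2 are not covered. Keeping the flush event's id leaves a value that is older than every unflushed edit, which errs on the safe side.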

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HBASE-3845) data loss because lastSeqWritten can miss memstore edits

2011-07-30 Thread Ted Yu (JIRA)

[
https://issues.apache.org/jira/browse/HBASE-3845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13073159#comment-13073159
]

Ted Yu commented on HBASE-3845:
---

Applied to TRUNK.
TestResettingCounters passes now.

Thanks for the patch Anirudh.


[jira] [Commented] (HBASE-3845) data loss because lastSeqWritten can miss memstore edits

2011-07-30 Thread Hudson (JIRA)

[
https://issues.apache.org/jira/browse/HBASE-3845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13073168#comment-13073168
]

Hudson commented on HBASE-3845:
---

Integrated in HBase-TRUNK #2064 (See
[https://builds.apache.org/job/HBase-TRUNK/2064/])
HBASE-3845 Addendum: relax lastSeqWritten check in case write to WAL is
skipped

tedyu :
Files :
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java


[jira] [Updated] (HBASE-4003) Cleanup Calls Conservatively On Timeout

2011-07-30 Thread Karthick Sankarachary (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-4003?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthick Sankarachary updated HBASE-4003:
-

Attachment: (was: HBASE-4003-V2.patch)

 Cleanup Calls Conservatively On Timeout
 ---

 Key: HBASE-4003
 URL: https://issues.apache.org/jira/browse/HBASE-4003
 Project: HBase
  Issue Type: Bug
  Components: client
Affects Versions: 0.90.3
Reporter: Karthick Sankarachary
Assignee: Karthick Sankarachary
 Fix For: 0.92.0

 Attachments: HBASE-4003.patch


 In the event of a socket timeout, the {{HBaseClient}} iterates over the 
 outstanding calls (on that socket), and notifies them that a 
 {{SocketTimeoutException}} has occurred. Ideally, we should be cleaning up 
 just those calls that have been outstanding for longer than the specified 
 socket timeout.
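
 The conservative cleanup can be sketched as follows. This is an illustration under stated assumptions, not the HBaseClient code: CallCleanupSketch, Call, addCall and cleanupTimedOutCalls are hypothetical names, and each call is assumed to record the time at which it was sent.

```java
import java.net.SocketTimeoutException;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Toy model of a connection's outstanding calls; not the real HBaseClient. */
class CallCleanupSketch {
    static final class Call {
        final int id;
        final long startTimeMs;   // when the call was sent
        Exception error;          // set when the call is failed

        Call(int id, long startTimeMs) {
            this.id = id;
            this.startTimeMs = startTimeMs;
        }
    }

    // Outstanding calls on one socket, in the order they were sent.
    final Map<Integer, Call> calls = new LinkedHashMap<>();

    void addCall(int id, long nowMs) {
        calls.put(id, new Call(id, nowMs));
    }

    /**
     * Fails and removes only the calls that have been outstanding for at least
     * the socket timeout; younger calls stay registered. Returns the failed ids.
     */
    List<Integer> cleanupTimedOutCalls(long nowMs, long socketTimeoutMs) {
        List<Integer> failed = new ArrayList<>();
        Iterator<Call> it = calls.values().iterator();
        while (it.hasNext()) {
            Call call = it.next();
            if (nowMs - call.startTimeMs >= socketTimeoutMs) {
                call.error = new SocketTimeoutException("call " + call.id + " timed out");
                failed.add(call.id);
                it.remove();
            }
        }
        return failed;
    }

    public static void main(String[] args) {
        CallCleanupSketch connection = new CallCleanupSketch();
        connection.addCall(1, 0L);     // old call
        connection.addCall(2, 900L);   // sent shortly before the timeout fired
        System.out.println("failed: " + connection.cleanupTimedOutCalls(1000L, 1000L));
        System.out.println("still outstanding: " + connection.calls.keySet());
    }
}
```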


[jira] [Updated] (HBASE-4148) HFileOutputFormat doesn't fill in TIMERANGE_KEY metadata

2011-07-30 Thread Jonathan Hsieh (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-4148?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Hsieh updated HBASE-4148:
--

Attachment: 0001-HBASE-4148.-HFileOutputFormat-doesn-t-fill-in-TIMERA.patch

 HFileOutputFormat doesn't fill in TIMERANGE_KEY metadata
 

 Key: HBASE-4148
 URL: https://issues.apache.org/jira/browse/HBASE-4148
 Project: HBase
  Issue Type: Bug
  Components: mapreduce
Affects Versions: 0.90.3
Reporter: Todd Lipcon
Assignee: Jonathan Hsieh
 Fix For: 0.90.5

 Attachments: 
 0001-HBASE-4148.-HFileOutputFormat-doesn-t-fill-in-TIMERA.patch


 When HFiles are flushed through the normal path, they include an attribute 
 TIMERANGE_KEY which can be used to cull HFiles when performing a 
 time-restricted scan. Files produced by HFileOutputFormat are currently 
 missing this metadata.
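
 The culling that this metadata makes possible can be sketched as follows. This is a minimal illustration, not the StoreFile code: TimeRangeSketch and its methods are hypothetical names, and the scan range is assumed to be a half-open interval [min, max).

```java
/** Toy model of per-file time range metadata; not the real HBase classes. */
class TimeRangeSketch {
    private long min = Long.MAX_VALUE;
    private long max = Long.MIN_VALUE;

    /** Called for every cell written to the file, so the range covers all of them. */
    void include(long timestamp) {
        min = Math.min(min, timestamp);
        max = Math.max(max, timestamp);
    }

    /**
     * Returns false only when no cell in the file can fall inside the scan's
     * range [scanMinInclusive, scanMaxExclusive), so the file can be skipped.
     */
    boolean mayContain(long scanMinInclusive, long scanMaxExclusive) {
        if (min > max) {
            return false; // nothing was recorded
        }
        return min < scanMaxExclusive && max >= scanMinInclusive;
    }

    public static void main(String[] args) {
        TimeRangeSketch file = new TimeRangeSketch();
        file.include(1000L);
        file.include(2000L);
        System.out.println(file.mayContain(1500L, 1600L)); // overlaps: must be read
        System.out.println(file.mayContain(2001L, 3000L)); // disjoint: can be skipped
    }
}
```

 A file that carries no such range gives the scanner nothing to test against, so it presumably has to be opened for every time-restricted scan.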


[jira] [Updated] (HBASE-4148) HFileOutputFormat doesn't fill in TIMERANGE_KEY metadata

2011-07-30 Thread Jonathan Hsieh (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-4148?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Hsieh updated HBASE-4148:
--

Status: Patch Available  (was: Open)

Up for review here: https://reviews.apache.org/r/1229/


[jira] [Commented] (HBASE-4148) HFileOutputFormat doesn't fill in TIMERANGE_KEY metadata

2011-07-30 Thread jirapos...@reviews.apache.org (JIRA)

[
https://issues.apache.org/jira/browse/HBASE-4148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13073230#comment-13073230
]

jirapos...@reviews.apache.org commented on HBASE-4148:
--

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/1229/
---

Review request for hbase and Todd Lipcon.

Summary
---

When HFiles are flushed through the normal path, they include an attribute
TIMERANGE_KEY which can be used to cull HFiles when performing a
time-restricted scan. Files produced by HFileOutputFormat are currently missing
this metadata.

This addresses bug HBASE-4148.
https://issues.apache.org/jira/browse/HBASE-4148

Diffs
-

src/main/java/org/apache/hadoop/hbase/mapreduce/HFileOutputFormat.java 8ccdf4d
src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java 40efdda
src/test/java/org/apache/hadoop/hbase/mapreduce/TestHFileOutputFormat.java 89241eb

Diff: https://reviews.apache.org/r/1229/diff

Testing
---

Added unit test.

I don't quite understand why the KeyValue with the larger timestamp (2000)
must be written before the one with the smaller timestamp (1000). I can see
the code that enforces this (HFile.checkKey) but not why keys are ordered from
larger to smaller timestamp. Is this an HFile data precondition?
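
For context, HBase orders cells of the same row and column by descending timestamp, so that the newest version is encountered first, and an HFile expects its keys in that sorted order. The ordering can be sketched with a simplified comparator. This is only an illustration: KeyOrderSketch and Key are hypothetical stand-ins, not the KeyValue comparator itself, and column family and qualifier are collapsed into a single string.

```java
import java.util.ArrayList;
import java.util.List;

/** Simplified stand-in for cell key ordering; not the real KeyValue comparator. */
class KeyOrderSketch {
    static final class Key {
        final String row;
        final String column;
        final long timestamp;

        Key(String row, String column, long timestamp) {
            this.row = row;
            this.column = column;
            this.timestamp = timestamp;
        }
    }

    /** Row ascending, column ascending, then timestamp DESCENDING (newest first). */
    static int compare(Key a, Key b) {
        int c = a.row.compareTo(b.row);
        if (c != 0) {
            return c;
        }
        c = a.column.compareTo(b.column);
        if (c != 0) {
            return c;
        }
        return Long.compare(b.timestamp, a.timestamp); // operands swapped on purpose
    }

    public static void main(String[] args) {
        List<Key> keys = new ArrayList<>();
        keys.add(new Key("row1", "cf:q", 1000L));
        keys.add(new Key("row1", "cf:q", 2000L));
        keys.sort(KeyOrderSketch::compare);
        for (Key k : keys) {
            System.out.println(k.row + "/" + k.column + "/" + k.timestamp);
        }
    }
}
```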

I cannot get the full test suite to pass, with or without this patch. The
suite seems to time out on tests unrelated to this. I would appreciate some
hints or pointers on which tests are flaky or take a long time to run.

Thanks,

jmhsieh


[jira] [Commented] (HBASE-451) Remove HTableDescriptor from HRegionInfo

2011-07-30 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13073279#comment-13073279
 ] 

Hudson commented on HBASE-451:
--

Integrated in HBase-TRUNK #2065 (See 
[https://builds.apache.org/job/HBase-TRUNK/2065/])
HBASE-4032 HBASE-451 improperly breaks public API HRegionInfo#getTableDesc

tedyu : 
Files : 
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/HRegionInfo.java
* /hbase/trunk/CHANGES.txt
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestHRegionInfo.java


 Remove HTableDescriptor from HRegionInfo
 

 Key: HBASE-451
 URL: https://issues.apache.org/jira/browse/HBASE-451
 Project: HBase
  Issue Type: Improvement
  Components: master, regionserver
Affects Versions: 0.2.0
Reporter: Jim Kellerman
Assignee: Subbu M Iyer
Priority: Critical
 Fix For: 0.92.0

 Attachments: 451-addendum-v2.txt, 
 451_support_for_removing_HTD_from_HRI_trunk.txt, 
 HBASE-451-Fixed_broken_TestAdmin.patch, 
 HBASE-451-Fixed_broken_TestAdmin1.patch, 
 HBASE-451_-_First_draft_support_for_removing_HTD_from_HRI1.patch, 
 HBASE-451_-_Fourth_draft_support_for_removing_HTD_from_HRI.patch, 
 HBASE-451_-_Second_draft_-_Remove_HTD_from_HRI.patch, descriptors.txt, 
 fixtestadmin.txt, pass_htd_on_region_construction.txt


 There is an HRegionInfo for every region in HBase. Currently HRegionInfo also 
 contains the HTableDescriptor (the schema). That means we store the schema n 
 times where n is the number of regions in the table.
 Additionally, for every region of the same table that the region server has 
 open, there is a copy of the schema. Thus it is stored in memory once for 
 each open region.
 If HRegionInfo merely contained the table name the HTableDescriptor could be 
 stored in a separate file and easily found.
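
 The proposed layout can be sketched as follows. This is an illustration only: SchemaSharingSketch, TableSchema and RegionInfo are hypothetical stand-ins for the real classes, and an in-memory map stands in for the separate file mentioned above.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model of regions that refer to their table's schema by name. */
class SchemaSharingSketch {
    /** Stand-in for HTableDescriptor: the table schema. */
    static final class TableSchema {
        final String tableName;
        final String[] families;

        TableSchema(String tableName, String... families) {
            this.tableName = tableName;
            this.families = families;
        }
    }

    /** Stand-in for a slimmed-down HRegionInfo: table name and key range, no schema. */
    static final class RegionInfo {
        final String tableName;
        final String startKey;
        final String endKey;

        RegionInfo(String tableName, String startKey, String endKey) {
            this.tableName = tableName;
            this.startKey = startKey;
            this.endKey = endKey;
        }
    }

    // One schema per table, however many regions the table has.
    private final Map<String, TableSchema> schemas = new HashMap<>();

    void register(TableSchema schema) {
        schemas.put(schema.tableName, schema);
    }

    /** Resolves a region's schema through its table name. */
    TableSchema schemaFor(RegionInfo region) {
        return schemas.get(region.tableName);
    }

    public static void main(String[] args) {
        SchemaSharingSketch catalog = new SchemaSharingSketch();
        catalog.register(new TableSchema("t1", "cf1", "cf2"));
        RegionInfo first = new RegionInfo("t1", "", "m");
        RegionInfo second = new RegionInfo("t1", "m", "");
        // Both regions resolve to the same schema object.
        System.out.println(catalog.schemaFor(first) == catalog.schemaFor(second));
    }
}
```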


[jira] [Commented] (HBASE-4032) HBASE-451 improperly breaks public API HRegionInfo#getTableDesc

2011-07-30 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-4032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13073280#comment-13073280
 ] 

Hudson commented on HBASE-4032:
---

Integrated in HBase-TRUNK #2065 (See 
[https://builds.apache.org/job/HBase-TRUNK/2065/])
HBASE-4032 HBASE-451 improperly breaks public API HRegionInfo#getTableDesc

tedyu : 
Files : 
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/HRegionInfo.java
* /hbase/trunk/CHANGES.txt
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestHRegionInfo.java


 HBASE-451 improperly breaks public API HRegionInfo#getTableDesc
 ---

 Key: HBASE-4032
 URL: https://issues.apache.org/jira/browse/HBASE-4032
 Project: HBase
  Issue Type: Bug
Reporter: Andrew Purtell
Assignee: stack
Priority: Blocker
 Fix For: 0.92.0

 Attachments: 4032-v2.txt, 4032-v3.txt, 4032.txt


 After HBASE-451, HRegionInfo#getTableDesc has been modified to always return 
 {{null}}. 
 One immediate effect is broken unit tests.
 That aside, it is not in the spirit of deprecation to actually break the 
 method before the deprecation cycle has ended; it's a bug.


[jira] [Commented] (HBASE-4148) HFileOutputFormat doesn't fill in TIMERANGE_KEY metadata

2011-07-30 Thread jirapos...@reviews.apache.org (JIRA)

[
https://issues.apache.org/jira/browse/HBASE-4148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13073328#comment-13073328
]

jirapos...@reviews.apache.org commented on HBASE-4148:
--

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/1229/
---

(Updated 2011-07-31 05:52:30.608713)

Review request for hbase and Todd Lipcon.

Changes
---

Updated to address nit.

Summary
---

This addresses bug HBASE-4148.
https://issues.apache.org/jira/browse/HBASE-4148

Diffs (updated)
-

Diff: https://reviews.apache.org/r/1229/diff

Testing
---

Added unit test.

Thanks,

jmhsieh

HFileOutputFormat doesn't fill in TIMERANGE_KEY metadata

Attachments:
0001-HBASE-4148-HFileOutputFormat-doesn-t-fill-in-TIMERAN.patch,
0001-HBASE-4148.-HFileOutputFormat-doesn-t-fill-in-TIMERA.patch

