[
https://issues.apache.org/jira/browse/HBASE-2248?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
ryan rawson updated HBASE-2248:
-------------------------------
Attachment: HBASE-2248-rr-pre-durability4.txt
ok, here is a patch that addresses all the above issues:
- spin fixed by restructuring hlog append
- indexed hbase test failure fixed
- test failures due to compaction fixed
- all comments addressed
To fix indexed hbase, I had to introduce a new notion of optional scanner
creation atomicity, along with pre-flush-commit work. A sub-class can now
define an atomic section that covers both some work of its own (eg: switching
out an index) and the flush commit (where the snapshot is removed and the
hfile is introduced to open scanners). This section is atomic relative to new
scanner creation. This was required to fix race conditions in indexed hbase,
but it also means that indexed hbase is not as fast as it could be, since it
cannot create new scanners during this one critical phase of flush (which
includes re-reading scanner blocks btw).
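A rough sketch of the atomic section described above, for illustration only.
It assumes a read/write lock where scanner creation takes the read side and
the flush commit takes the write side. All class, method and field names here
are hypothetical and are not taken from the patch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical names throughout; this is not HBase's actual store code.
class StoreSketch {
    private final ReentrantReadWriteLock scannerLock = new ReentrantReadWriteLock();
    private List<String> snapshot = new ArrayList<>();      // flushed but not yet committed
    private final List<String> hfiles = new ArrayList<>();  // committed store files

    /** Optional hook: runs inside the atomic section, before the commit. */
    protected void preFlushCommit() { }

    /** What a new scanner would be built from; sub-classes may add state. */
    protected String describeView() {
        return "hfiles=" + hfiles + " snapshot=" + snapshot;
    }

    void takeSnapshot(List<String> edits) {
        scannerLock.writeLock().lock();
        try {
            snapshot = new ArrayList<>(edits);
        } finally {
            scannerLock.writeLock().unlock();
        }
    }

    void commitFlush(String newHfile) {
        scannerLock.writeLock().lock();    // no new scanners while this is held
        try {
            preFlushCommit();              // eg: switch out an index
            snapshot = new ArrayList<>();  // remove the snapshot
            hfiles.add(newHfile);          // introduce the new hfile
        } finally {
            scannerLock.writeLock().unlock();
        }
    }

    String openScanner() {
        scannerLock.readLock().lock();     // scanners can still be created concurrently
        try {
            return describeView();
        } finally {
            scannerLock.readLock().unlock();
        }
    }
}

// A sub-class that does extra work inside the atomic section.
class IndexedStoreSketch extends StoreSketch {
    private int indexVersion = 0;

    @Override
    protected void preFlushCommit() {
        indexVersion++;                    // stands in for swapping the index
    }

    @Override
    protected String describeView() {
        return "index=v" + indexVersion + " " + super.describeView();
    }
}
```

With this shape a new scanner sees either the old index with the snapshot or
the new index with the hfile, never a mix. The cost is the one mentioned
above: openScanner() blocks for as long as commitFlush() holds the write lock.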
> Provide new non-copy mechanism to assure atomic reads in get and scan
> ---------------------------------------------------------------------
>
> Key: HBASE-2248
> URL: https://issues.apache.org/jira/browse/HBASE-2248
> Project: Hadoop HBase
> Issue Type: Bug
> Affects Versions: 0.20.3
> Reporter: Dave Latham
> Assignee: ryan rawson
> Priority: Blocker
> Fix For: 0.20.4
>
> Attachments: HBASE-2248-demonstrate-previous-impl-bugs.patch,
> HBASE-2248-GetsAsScans3.patch, HBASE-2248-rr-alpha3.txt,
> HBASE-2248-rr-pre-durability2.txt, HBASE-2248-rr-pre-durability3.txt,
> HBASE-2248-rr-pre-durability4.txt, hbase-2248.gc, HBASE-2248.patch,
> hbase-2248.txt, profile.png, put_call_graph.png, readownwrites-lost.2.patch,
> readownwrites-lost.patch, Screen shot 2010-02-23 at 10.33.38 AM.png,
> threads.txt
>
>
> HBASE-2037 introduced a new MemStoreScanner which triggers a
> ConcurrentSkipListMap.buildFromSorted clone of the memstore and snapshot when
> starting a scan.
> After upgrading to 0.20.3, we noticed a big slowdown in our use of short
> scans. Some of our data represent a time series. The data is stored in time
> series order, MR jobs often insert/update new data at the end of the series,
> and queries usually have to pick up some or all of the series. These are
> often scans of 0-100 rows at a time. To load one page, we'll observe about
> 20 such scans being triggered concurrently, and they take 2 seconds to
> complete. Doing a thread dump of a region server shows many threads in
> ConcurrentSkipListMap.buildFromSorted which traverses the entire map of key
> values to copy it.
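
To illustrate the cost described in the report, here is a minimal sketch that
contrasts a copy-first short scan with one over a live view of the same map.
The class and method names are hypothetical, and a plain String-keyed
ConcurrentSkipListMap stands in for the memstore. The view is only weakly
consistent, so by itself it does not give the atomic reads this issue asks
for; the sketch only shows where the per-scan copy cost comes from.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentNavigableMap;
import java.util.concurrent.ConcurrentSkipListMap;

// Hypothetical stand-in for a memstore scan; not HBase code.
class ShortScanSketch {

    /** Copy-first scan: clones the whole map, so cost grows with map size. */
    static int scanWithCopy(ConcurrentSkipListMap<String, String> memstore,
                            String startRow, int limit) {
        ConcurrentSkipListMap<String, String> copy = memstore.clone(); // full traversal
        return countRows(copy.tailMap(startRow), limit);
    }

    /** View-based scan: tailMap returns a live view, nothing is copied. */
    static int scanWithView(ConcurrentSkipListMap<String, String> memstore,
                            String startRow, int limit) {
        return countRows(memstore.tailMap(startRow), limit);
    }

    /** Reads at most `limit` entries and returns how many were read. */
    private static int countRows(ConcurrentNavigableMap<String, String> rows, int limit) {
        int seen = 0;
        for (Map.Entry<String, String> e : rows.entrySet()) {
            if (seen == limit) {
                break;
            }
            seen++;
        }
        return seen;
    }
}
```

Both methods return the same rows for a quiescent map, but scanWithCopy pays
for every key in the map even when the scan itself touches only a handful.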