[jira] Commented: (HBASE-2248) Provide new non-copy mechanism to assure atomic reads in get and scan

Andrew Purtell (JIRA) Mon, 12 Apr 2010 08:29:06 -0700

    [ 
https://issues.apache.org/jira/browse/HBASE-2248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12856018#action_12856018
 ]


Andrew Purtell commented on HBASE-2248:
---------------------------------------

From Ryan via email:
{quote}
There is a busy wait loop which attempts to ensure a write completes only when 
it is visible to others. With the log append as part of the "transaction" this 
is breaking down. The solution is to either forgo the busy wait loop (probably 
not a great idea) or restructure the code to do hlog appends first then 
memstore updates.

I'll talk to stack tomorrow and we can figure which route is better... Although 
I'd guess option #2
{quote}
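
For readers following along, here is a minimal sketch of what option #2 could look like: append to the log first, then update the memstore, and keep the busy-wait only around the step that makes the write visible. It is one reading of Ryan's note, not the actual patch. Every name in it (WriteOrderingSketch, put, get, readPoint, and so on) is hypothetical and not taken from the HBase code base, and the list standing in for the hlog is only a placeholder.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentSkipListMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch: all names here are illustrative, not HBase's.
class WriteOrderingSketch {
    // Stand-in for the hlog (write-ahead log).
    private final List<String> log = new ArrayList<>();
    // Stand-in for the memstore: row key -> (write number -> value).
    private final ConcurrentSkipListMap<String, ConcurrentSkipListMap<Long, String>> memstore =
        new ConcurrentSkipListMap<>();
    // Last write number handed out.
    private final AtomicLong nextWriteNumber = new AtomicLong(0);
    // Highest write number that readers are allowed to see.
    private final AtomicLong readPoint = new AtomicLong(0);

    long put(String key, String value) {
        long writeNumber;
        // Step 1: append to the log first, before touching the memstore.
        synchronized (log) {
            writeNumber = nextWriteNumber.incrementAndGet();
            log.add(writeNumber + ":" + key + "=" + value);
        }
        // Step 2: apply the edit to the memstore, tagged with its write number.
        memstore.computeIfAbsent(key, k -> new ConcurrentSkipListMap<>())
                .put(writeNumber, value);
        // Step 3: busy-wait until every earlier write is visible, then publish
        // this one, so put() returns only once the write can be read.
        while (!readPoint.compareAndSet(writeNumber - 1, writeNumber)) {
            Thread.yield();
        }
        return writeNumber;
    }

    String get(String key) {
        ConcurrentSkipListMap<Long, String> versions = memstore.get(key);
        if (versions == null) {
            return null;
        }
        // Readers only see versions at or below the current read point.
        Map.Entry<Long, String> visible = versions.floorEntry(readPoint.get());
        return visible == null ? null : visible.getValue();
    }

    int logSize() {
        synchronized (log) {
            return log.size();
        }
    }

    public static void main(String[] args) {
        WriteOrderingSketch store = new WriteOrderingSketch();
        store.put("row1", "a");
        store.put("row1", "b");
        System.out.println(store.get("row1"));
    }
}
```

In this sketch the slow log append happens before the write enters the visibility queue, so the busy-wait in step 3 only ever waits on in-memory updates of earlier writes.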

> Provide new non-copy mechanism to assure atomic reads in get and scan
> ---------------------------------------------------------------------
>
>                 Key: HBASE-2248
>                 URL: https://issues.apache.org/jira/browse/HBASE-2248
>             Project: Hadoop HBase
>          Issue Type: Bug
>    Affects Versions: 0.20.3
>            Reporter: Dave Latham
>            Priority: Blocker
>             Fix For: 0.20.4
>
>         Attachments: HBASE-2248-demonstrate-previous-impl-bugs.patch, 
> HBASE-2248-GetsAsScans3.patch, HBASE-2248-rr-alpha3.txt, 
> HBASE-2248-rr-pre-durability2.txt, HBASE-2248-rr-pre-durability3.txt, 
> hbase-2248.gc, HBASE-2248.patch, hbase-2248.txt, profile.png, 
> put_call_graph.png, readownwrites-lost.2.patch, readownwrites-lost.patch, 
> Screen shot 2010-02-23 at 10.33.38 AM.png, threads.txt
>
>
> HBASE-2037 introduced a new MemStoreScanner which triggers a 
> ConcurrentSkipListMap.buildFromSorted clone of the memstore and snapshot when 
> starting a scan.
> After upgrading to 0.20.3, we noticed a big slowdown in our use of short 
> scans.  Some of our data represent a time series.   The data is stored in time 
> series order, MR jobs often insert/update new data at the end of the series, 
> and queries usually have to pick up some or all of the series.  These are 
> often scans of 0-100 rows at a time.  To load one page, we'll observe about 
> 20 such scans being triggered concurrently, and they take 2 seconds to 
> complete.  Doing a thread dump of a region server shows many threads in 
> ConcurrentSkipListMap.buildFromSorted which traverses the entire map of key 
> values to copy it.  
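
To illustrate the cost difference described in the issue, here is a small sketch contrasting a full copy of a ConcurrentSkipListMap with a view over the same map. The class and method names (ScanCopySketch, copyForScan, viewForScan) are hypothetical. Note that the view is only weakly consistent, so on its own it does not give the atomic reads this issue asks for; it only shows why a non-copy approach is cheap for short scans.

```java
import java.util.concurrent.ConcurrentNavigableMap;
import java.util.concurrent.ConcurrentSkipListMap;

// Hypothetical sketch: names are illustrative, not HBase's.
class ScanCopySketch {
    // Copying approach: clone() walks every entry of the map, so each scan
    // pays for the whole memstore even if it reads only a few rows.
    static ConcurrentSkipListMap<String, String> copyForScan(
            ConcurrentSkipListMap<String, String> memstore) {
        return memstore.clone();
    }

    // Non-copy approach: tailMap() returns a live view positioned at the
    // start row; nothing is copied, but later writes show through.
    static ConcurrentNavigableMap<String, String> viewForScan(
            ConcurrentSkipListMap<String, String> memstore, String startRow) {
        return memstore.tailMap(startRow, true);
    }

    public static void main(String[] args) {
        ConcurrentSkipListMap<String, String> memstore = new ConcurrentSkipListMap<>();
        memstore.put("r1", "a");
        memstore.put("r2", "b");
        ConcurrentSkipListMap<String, String> copy = copyForScan(memstore);
        ConcurrentNavigableMap<String, String> view = viewForScan(memstore, "r2");
        memstore.put("r3", "c");
        // The copy is frozen at clone time; the view reflects the new row.
        System.out.println(copy.size() + " " + view.size());
    }
}
```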
