[ 
https://issues.apache.org/jira/browse/ACCUMULO-2766?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14026487#comment-14026487
 ] 

Mike Drob commented on ACCUMULO-2766:
-------------------------------------

CI + agitation is a pretty high bar for minimally ensuring that something works 
as intended. I do not know of anybody running those nightly (maybe [~elserj] 
is?).

At this point, I'm not worried about performance; I trust the numbers that you 
posted (would love to reproduce them eventually, but don't have time for it 
yet). What guarantees did the extra locking give us? Anything relating to the 
state of the queue? Are we exposed to a potential concurrent modification?

Initially it looks like we locked to offer work to the syncQueue, but did not 
lock to poll? And now we do not lock for either?
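
For reference, a minimal sketch of what "no lock for either" could look like, 
assuming the queue is a java.util.concurrent.LinkedBlockingQueue (which is 
thread-safe for concurrent add/take/drainTo on its own). The class and method 
names (BatchedSyncSketch, SyncWork, requestSync, syncOnce) are hypothetical and 
are not taken from DfsLogger or the attached patches; the counter stands in for 
the real hsync call. It only illustrates the shape of the idea: writers enqueue 
after writing, and the sync thread drains everything queued so far before 
issuing a single sync.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch; not the actual DfsLogger code.
class BatchedSyncSketch {
  // One pending sync request. The writer waits on the latch until a sync covers it.
  static final class SyncWork {
    final CountDownLatch done = new CountDownLatch(1);
  }

  private final LinkedBlockingQueue<SyncWork> workQueue = new LinkedBlockingQueue<>();
  final AtomicInteger syncCalls = new AtomicInteger();

  // Writer side: called after the write; enqueues without any lock shared
  // with the sync thread, so it never waits behind an in-progress sync.
  SyncWork requestSync() {
    SyncWork work = new SyncWork();
    workQueue.add(work);
    return work;
  }

  // Sync side: block for one request, then drain everything already queued,
  // so a single sync call releases the whole batch.
  int syncOnce() throws InterruptedException {
    List<SyncWork> batch = new ArrayList<>();
    batch.add(workQueue.take());
    workQueue.drainTo(batch);
    syncCalls.incrementAndGet(); // stand-in for the expensive hsync call
    for (SyncWork work : batch) {
      work.done.countDown();
    }
    return batch.size();
  }

  public static void main(String[] args) throws Exception {
    BatchedSyncSketch log = new BatchedSyncSketch();
    List<SyncWork> pending = new ArrayList<>();
    for (int i = 0; i < 3; i++) {
      pending.add(log.requestSync());
    }
    int released = log.syncOnce();
    for (SyncWork work : pending) {
      work.done.await();
    }
    // prints "3 writers released by 1 sync call(s)"
    System.out.println(released + " writers released by " + log.syncCalls.get() + " sync call(s)");
  }
}
```

In this shape the queue itself is safe without an external lock; what the 
sketch does not cover is closing the log, which is presumably what closeLock 
was guarding in the first place.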

Concurrent code can be difficult to understand and to prove correct, so an 
extra test might protect us against changes elsewhere down the line and act as 
useful documentation.
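
A rough sketch of the kind of test meant here, under stated assumptions: it 
exercises a plain LinkedBlockingQueue rather than the real DfsLogger, and the 
names (QueueStressSketch, run) are made up for illustration. Several producers 
add to the queue with no external lock while one consumer polls and drains, and 
the check is that every item is seen exactly once. A real test would need to 
drive the actual logger and its sync thread.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

// Hypothetical stress-test sketch; it does not touch any Accumulo class.
class QueueStressSketch {
  // Returns true if the consumer saw every produced item exactly once.
  static boolean run(int producers, int perProducer) throws InterruptedException {
    LinkedBlockingQueue<Integer> queue = new LinkedBlockingQueue<>();
    int total = producers * perProducer;
    ExecutorService pool = Executors.newFixedThreadPool(producers);
    CountDownLatch start = new CountDownLatch(1);

    for (int p = 0; p < producers; p++) {
      final int base = p * perProducer;
      pool.execute(() -> {
        try {
          start.await(); // release all producers at once to maximize contention
        } catch (InterruptedException e) {
          Thread.currentThread().interrupt();
          return;
        }
        for (int i = 0; i < perProducer; i++) {
          queue.add(base + i); // no external lock around the offer
        }
      });
    }
    start.countDown();

    Set<Integer> seen = new HashSet<>();
    List<Integer> batch = new ArrayList<>();
    int received = 0;
    while (received < total) {
      Integer first = queue.poll(10, TimeUnit.SECONDS); // no external lock around the poll
      if (first == null) {
        break; // timed out, so some items were lost
      }
      batch.clear();
      batch.add(first);
      queue.drainTo(batch);
      received += batch.size();
      seen.addAll(batch);
    }

    pool.shutdown();
    pool.awaitTermination(10, TimeUnit.SECONDS);
    // Equal counts mean nothing was lost; equal set size means nothing was duplicated.
    return received == total && seen.size() == total;
  }

  public static void main(String[] args) throws Exception {
    System.out.println("all items seen exactly once: " + run(4, 1000));
  }
}
```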

> Single walog operation may wait for multiple hsync calls
> --------------------------------------------------------
>
>                 Key: ACCUMULO-2766
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-2766
>             Project: Accumulo
>          Issue Type: Bug
>          Components: tserver
>    Affects Versions: 1.5.0, 1.5.1, 1.6.0
>            Reporter: Keith Turner
>            Assignee: Keith Turner
>            Priority: Critical
>              Labels: performance
>             Fix For: 1.5.2, 1.6.1, 1.7.0
>
>         Attachments: ACCUMULO-2677-1.patch, ACCUMULO-2766-2.patch
>
>
> While looking into slow {{hsync}} calls, I noticed an oddity in the way 
> Accumulo processes syncs.  Specifically the way {{closeLock}} is used in 
> {{DfsLogger}}, it seems like the following situation could occur. 
>  
>  # thread B starts executing DfsLogger.LogSyncingTask.run()
>  # thread 1 enters DfsLogger.logFileData()
>  # thread 1 writes to walog
>  # thread 1 locks _closeLock_ 
>  # thread 1 adds sync work to workQueue
>  # thread 1 unlocks _closeLock_
>  # thread B takes sync work off of workQueue
>  # thread B locks _closeLock_
>  # thread B calls sync
>  # thread 3 enters DfsLogger.logFileData()
>  # thread 3 writes to walog
>  # thread 3 blocks locking _closeLock_
>  # thread 4 enters DfsLogger.logFileData()
>  # thread 4 writes to walog
>  # thread 4 blocks locking _closeLock_
>  # thread B unlocks _closeLock_
>  # thread 4 locks _closeLock_ 
>  # thread 4 adds sync work to workQueue
>  # thread B takes sync work off of workQueue
>  # thread B blocks locking _closeLock_
>  # thread 4 unlocks _closeLock_
>  # thread B locks _closeLock_
>  # thread B calls sync
>  # thread B unlocks _closeLock_
>  # thread 3 locks _closeLock_
>  # thread 3 adds sync work to workQueue
>  # thread 3 unlocks _closeLock_
> In this situation thread 3 unnecessarily has to wait for an extra {{hsync}} 
> call.  Not sure if this situation actually occurs, or if it occurs very 
> frequently.  Looking at the code it seems like it would be nice if sync 
> operations could be queued w/o synchronizing w/ sync operations.



