[ https://issues.apache.org/jira/browse/HDDS-4308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17211269#comment-17211269 ]

Bharat Viswanadham commented on HDDS-4308:
------------------------------------------

I think the better solution here is to make a copy of the volumeArgs object in 
the request before addResponseToDoubleBuffer. Of course, during the copy we 
need to lock the volumeArgs object in case other operations change it.

That alone might not be complete, I believe: if two threads each acquire a 
copy of the object and then update it outside the lock, we have the issue 
again. I think the whole operation should be performed under the volume lock. 
(As we only update in-memory state, it should be quick.) But I agree that it 
might have a performance impact across buckets when key writes happen.
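
Roughly something like the sketch below; lockManager, volumeTable and 
addResponseToDoubleBuffer are simplified stand-ins for the corresponding OM 
pieces, and copyObject is assumed to produce a detached copy:

{code:java}
// Sketch only: names here are placeholders, not the exact OM API.
lockManager.acquireVolumeLock(volumeName);  // serialize quota updates per volume
try {
  OmVolumeArgs cached = volumeTable.getCacheValue(volumeName);
  cached.setBytesUsed(cached.getBytesUsed() - quotaDelta);  // in-memory update
  // Hand a snapshot to the double buffer, never the cached object itself,
  // so transactions applied later cannot mutate what this one persists.
  OmVolumeArgs snapshot = cached.copyObject();
  addResponseToDoubleBuffer(transactionIndex, snapshot);
} finally {
  lockManager.releaseVolumeLock(volumeName);
}
{code}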

Question: in your tests, how much perf impact was observed?

cc [~arp] for any more thoughts on this issue.




> Fix issue with quota update
> ---------------------------
>
>                 Key: HDDS-4308
>                 URL: https://issues.apache.org/jira/browse/HDDS-4308
>             Project: Hadoop Distributed Data Store
>          Issue Type: Bug
>            Reporter: Bharat Viswanadham
>            Priority: Blocker
>
> Currently we get volumeArgs via getCacheValue and put the same object into 
> the doubleBuffer; this can cause an issue.
> Let's take the below scenario:
> Initial VolumeArgs quotaBytes -> 10000
> 1. T1 -> updates VolumeArgs, subtracting 1000, and puts the updated 
> volumeArgs into the DoubleBuffer.
> 2. T2 -> updates VolumeArgs, subtracting 2000, but has not yet been flushed 
> to the double buffer.
> *Now at the end of flushing these transactions, our DB should have 7000 as 
> bytes used.*
> Now T1 is picked up by the double buffer, and when it commits, since the 
> object put into the doubleBuffer is the cached object itself, it flushes the 
> value already updated by T2 and writes bytesUsed as 7000 to the DB.
> Now the OM restarts, and the DB only has transactions up to T1. (We get this 
> info from the TransactionInfo table, 
> https://issues.apache.org/jira/browse/HDDS-3685.)
> Now T2 is replayed, as it was not committed to the DB; 2000 is subtracted 
> again, and the DB ends up with 5000.
> But after T2 the value should be 7000, so the DB is in an incorrect state.
> Issue here:
> 1. Because we use a cached object and put that same cached object into the 
> double buffer, this kind of inconsistency can occur.
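
To make the failure mode above concrete, here is a self-contained toy version 
of the scenario (plain Java; VolumeArgs here is a stand-in, not the OM class):

{code:java}
// Toy reproduction of the shared-object hazard described above, not OM code.
public class QuotaRaceDemo {
  static class VolumeArgs { long bytesUsed; VolumeArgs(long b) { bytesUsed = b; } }

  public static void main(String[] args) {
    VolumeArgs cached = new VolumeArgs(10000);  // shared cache entry

    // T1: subtracts 1000 and enqueues the cached object itself (no copy).
    cached.bytesUsed -= 1000;
    VolumeArgs enqueuedByT1 = cached;           // same reference, not a snapshot

    // T2: subtracts 2000 before T1 has been flushed.
    cached.bytesUsed -= 2000;

    // The double buffer now flushes T1 and persists 7000 (T2's value), while
    // the transaction marker only records T1. Replaying T2 after a restart
    // subtracts 2000 again, leaving 5000 instead of the correct 7000.
    System.out.println("flushed for T1: " + enqueuedByT1.bytesUsed);  // 7000
  }
}
{code}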


