[jira] [Commented] (HDDS-4308) Fix issue with quota update
[ https://issues.apache.org/jira/browse/HDDS-4308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17220693#comment-17220693 ] mingchao zhao commented on HDDS-4308: - ??Can we just make the method OMKeyRequest#getVolumeInfo be thread safe to return a copied object, this should be okay for current issue? This will make the logic more simplified.?? +1 Thanks for [~linyiqun]'s advice, I think this way is feasible. It has little impact on code structure and avoids frequent lock switching. According to this idea, I modified the original PR. > Fix issue with quota update > --- > > Key: HDDS-4308 > URL: https://issues.apache.org/jira/browse/HDDS-4308 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Reporter: Bharat Viswanadham >Assignee: mingchao zhao >Priority: Blocker > Labels: pull-request-available > > Currently volumeArgs using getCacheValue and put the same object in > doubleBuffer, this might cause issue. > Let's take the below scenario: > InitialVolumeArgs quotaBytes -> 1 > 1. T1 -> Update VolumeArgs, and subtracting 1000 and put this updated > volumeArgs to DoubleBuffer. > 2. T2-> Update VolumeArgs, and subtracting 2000 and has not still updated to > double buffer. > *Now at the end of flushing these transactions, our DB should have 7000 as > bytes used.* > Now T1 is picked by double Buffer and when it commits, and as it uses cached > Object put into doubleBuffer, it flushes to DB with the updated value from > T2(As it is a cache object) and update DB with bytesUsed as 7000. > And now OM has restarted, and only DB has transactions till T1. (We get this > info from TransactionInfo > Table(https://issues.apache.org/jira/browse/HDDS-3685) > Now T2 is again replayed, as it is not committed to DB, now DB will be again > subtracted with 2000, and now DB will have 5000. > But after T2, the value should be 7000, so we have DB in an incorrect state. > Issue here: > 1. As we use a cached object and put the same cached object into double > buffer this can cause this kind of issue. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org
[jira] [Commented] (HDDS-4308) Fix issue with quota update
[ https://issues.apache.org/jira/browse/HDDS-4308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17215148#comment-17215148 ] Yiqun Lin commented on HDDS-4308: - {quote}This might not be complete i believe, If 2 threads acquire copy object and if they update outside lock we have issue again. I think the whole operation should be performed under volume lock. (As we update in-memory it should be quick) But i agree that it might have performance impact across buckets when key writes happen. {quote} Use volume lock during bucket operation makes logic a little complex. As current PR change does: 1)acquire bucket lock 2)release bucket lock 3)acquire volume lock update volume usedBytes usage 4)release volume lock 5)acquire bucket lock again (to finish remaining operation) 6)release bucket lock Can we just make the method OMKeyRequest#getVolumeInfo be thread safe to return a copied object, this should be okay for current issue? This will make the logic more simplified. Like: {code:java} public static synchronized OmVolumeArgs getVolumeInfo(OMMetadataManager omMetadataManager, String volume) { return omMetadataManager.getVolumeTable().getCacheValue( new CacheKey<>(omMetadataManager.getVolumeKey(volume))) .getCacheValue().copyObject(); } {code} > Fix issue with quota update > --- > > Key: HDDS-4308 > URL: https://issues.apache.org/jira/browse/HDDS-4308 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Reporter: Bharat Viswanadham >Assignee: mingchao zhao >Priority: Blocker > Labels: pull-request-available > > Currently volumeArgs using getCacheValue and put the same object in > doubleBuffer, this might cause issue. > Let's take the below scenario: > InitialVolumeArgs quotaBytes -> 1 > 1. T1 -> Update VolumeArgs, and subtracting 1000 and put this updated > volumeArgs to DoubleBuffer. > 2. T2-> Update VolumeArgs, and subtracting 2000 and has not still updated to > double buffer. > *Now at the end of flushing these transactions, our DB should have 7000 as > bytes used.* > Now T1 is picked by double Buffer and when it commits, and as it uses cached > Object put into doubleBuffer, it flushes to DB with the updated value from > T2(As it is a cache object) and update DB with bytesUsed as 7000. > And now OM has restarted, and only DB has transactions till T1. (We get this > info from TransactionInfo > Table(https://issues.apache.org/jira/browse/HDDS-3685) > Now T2 is again replayed, as it is not committed to DB, now DB will be again > subtracted with 2000, and now DB will have 5000. > But after T2, the value should be 7000, so we have DB in an incorrect state. > Issue here: > 1. As we use a cached object and put the same cached object into double > buffer this can cause this kind of issue. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org
[jira] [Commented] (HDDS-4308) Fix issue with quota update
[ https://issues.apache.org/jira/browse/HDDS-4308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17214653#comment-17214653 ] mingchao zhao commented on HDDS-4308: - Hi [~bharat] I submitted a [PR|https://github.com/apache/hadoop-ozone/pull/1489] to fix this problem, please help to review it. > Fix issue with quota update > --- > > Key: HDDS-4308 > URL: https://issues.apache.org/jira/browse/HDDS-4308 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Reporter: Bharat Viswanadham >Assignee: mingchao zhao >Priority: Blocker > Labels: pull-request-available > > Currently volumeArgs using getCacheValue and put the same object in > doubleBuffer, this might cause issue. > Let's take the below scenario: > InitialVolumeArgs quotaBytes -> 1 > 1. T1 -> Update VolumeArgs, and subtracting 1000 and put this updated > volumeArgs to DoubleBuffer. > 2. T2-> Update VolumeArgs, and subtracting 2000 and has not still updated to > double buffer. > *Now at the end of flushing these transactions, our DB should have 7000 as > bytes used.* > Now T1 is picked by double Buffer and when it commits, and as it uses cached > Object put into doubleBuffer, it flushes to DB with the updated value from > T2(As it is a cache object) and update DB with bytesUsed as 7000. > And now OM has restarted, and only DB has transactions till T1. (We get this > info from TransactionInfo > Table(https://issues.apache.org/jira/browse/HDDS-3685) > Now T2 is again replayed, as it is not committed to DB, now DB will be again > subtracted with 2000, and now DB will have 5000. > But after T2, the value should be 7000, so we have DB in an incorrect state. > Issue here: > 1. As we use a cached object and put the same cached object into double > buffer this can cause this kind of issue. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org
[jira] [Commented] (HDDS-4308) Fix issue with quota update
[ https://issues.apache.org/jira/browse/HDDS-4308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17212808#comment-17212808 ] mingchao zhao commented on HDDS-4308: - Thanks for [~bharat]'s advice. I will submit PR in this way as soon as possible to solve this bug. > Fix issue with quota update > --- > > Key: HDDS-4308 > URL: https://issues.apache.org/jira/browse/HDDS-4308 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Reporter: Bharat Viswanadham >Priority: Blocker > > Currently volumeArgs using getCacheValue and put the same object in > doubleBuffer, this might cause issue. > Let's take the below scenario: > InitialVolumeArgs quotaBytes -> 1 > 1. T1 -> Update VolumeArgs, and subtracting 1000 and put this updated > volumeArgs to DoubleBuffer. > 2. T2-> Update VolumeArgs, and subtracting 2000 and has not still updated to > double buffer. > *Now at the end of flushing these transactions, our DB should have 7000 as > bytes used.* > Now T1 is picked by double Buffer and when it commits, and as it uses cached > Object put into doubleBuffer, it flushes to DB with the updated value from > T2(As it is a cache object) and update DB with bytesUsed as 7000. > And now OM has restarted, and only DB has transactions till T1. (We get this > info from TransactionInfo > Table(https://issues.apache.org/jira/browse/HDDS-3685) > Now T2 is again replayed, as it is not committed to DB, now DB will be again > subtracted with 2000, and now DB will have 5000. > But after T2, the value should be 7000, so we have DB in an incorrect state. > Issue here: > 1. As we use a cached object and put the same cached object into double > buffer this can cause this kind of issue. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org
[jira] [Commented] (HDDS-4308) Fix issue with quota update
[ https://issues.apache.org/jira/browse/HDDS-4308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17212736#comment-17212736 ] Bharat Viswanadham commented on HDDS-4308: -- {quote}The performance impact of volume lock hasn’t tested before and it may also be within our tolerance(In-memory operations can be really fast). This should be the easiest way to fix this bug by far. {quote} Agreed using volume lock and then doing calculation will solve the correctness issue. If we don't have any smart solution until then we can fix this by using volume lock and update bytes used. > Fix issue with quota update > --- > > Key: HDDS-4308 > URL: https://issues.apache.org/jira/browse/HDDS-4308 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Reporter: Bharat Viswanadham >Priority: Blocker > > Currently volumeArgs using getCacheValue and put the same object in > doubleBuffer, this might cause issue. > Let's take the below scenario: > InitialVolumeArgs quotaBytes -> 1 > 1. T1 -> Update VolumeArgs, and subtracting 1000 and put this updated > volumeArgs to DoubleBuffer. > 2. T2-> Update VolumeArgs, and subtracting 2000 and has not still updated to > double buffer. > *Now at the end of flushing these transactions, our DB should have 7000 as > bytes used.* > Now T1 is picked by double Buffer and when it commits, and as it uses cached > Object put into doubleBuffer, it flushes to DB with the updated value from > T2(As it is a cache object) and update DB with bytesUsed as 7000. > And now OM has restarted, and only DB has transactions till T1. (We get this > info from TransactionInfo > Table(https://issues.apache.org/jira/browse/HDDS-3685) > Now T2 is again replayed, as it is not committed to DB, now DB will be again > subtracted with 2000, and now DB will have 5000. > But after T2, the value should be 7000, so we have DB in an incorrect state. > Issue here: > 1. As we use a cached object and put the same cached object into double > buffer this can cause this kind of issue. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org
[jira] [Commented] (HDDS-4308) Fix issue with quota update
[ https://issues.apache.org/jira/browse/HDDS-4308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17212631#comment-17212631 ] Rui Wang commented on HDDS-4308: [~micahzhao] thanks for this reminder. Will follow up on discussions above to understand necessary steps. > Fix issue with quota update > --- > > Key: HDDS-4308 > URL: https://issues.apache.org/jira/browse/HDDS-4308 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Reporter: Bharat Viswanadham >Priority: Blocker > > Currently volumeArgs using getCacheValue and put the same object in > doubleBuffer, this might cause issue. > Let's take the below scenario: > InitialVolumeArgs quotaBytes -> 1 > 1. T1 -> Update VolumeArgs, and subtracting 1000 and put this updated > volumeArgs to DoubleBuffer. > 2. T2-> Update VolumeArgs, and subtracting 2000 and has not still updated to > double buffer. > *Now at the end of flushing these transactions, our DB should have 7000 as > bytes used.* > Now T1 is picked by double Buffer and when it commits, and as it uses cached > Object put into doubleBuffer, it flushes to DB with the updated value from > T2(As it is a cache object) and update DB with bytesUsed as 7000. > And now OM has restarted, and only DB has transactions till T1. (We get this > info from TransactionInfo > Table(https://issues.apache.org/jira/browse/HDDS-3685) > Now T2 is again replayed, as it is not committed to DB, now DB will be again > subtracted with 2000, and now DB will have 5000. > But after T2, the value should be 7000, so we have DB in an incorrect state. > Issue here: > 1. As we use a cached object and put the same cached object into double > buffer this can cause this kind of issue. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org
[jira] [Commented] (HDDS-4308) Fix issue with quota update
[ https://issues.apache.org/jira/browse/HDDS-4308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17212162#comment-17212162 ] mingchao zhao commented on HDDS-4308: - cc [~amaliujia], HDDS-4272 and HDDS-4277 will also have this problem, need to pay attention to the here. > Fix issue with quota update > --- > > Key: HDDS-4308 > URL: https://issues.apache.org/jira/browse/HDDS-4308 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Reporter: Bharat Viswanadham >Priority: Blocker > > Currently volumeArgs using getCacheValue and put the same object in > doubleBuffer, this might cause issue. > Let's take the below scenario: > InitialVolumeArgs quotaBytes -> 1 > 1. T1 -> Update VolumeArgs, and subtracting 1000 and put this updated > volumeArgs to DoubleBuffer. > 2. T2-> Update VolumeArgs, and subtracting 2000 and has not still updated to > double buffer. > *Now at the end of flushing these transactions, our DB should have 7000 as > bytes used.* > Now T1 is picked by double Buffer and when it commits, and as it uses cached > Object put into doubleBuffer, it flushes to DB with the updated value from > T2(As it is a cache object) and update DB with bytesUsed as 7000. > And now OM has restarted, and only DB has transactions till T1. (We get this > info from TransactionInfo > Table(https://issues.apache.org/jira/browse/HDDS-3685) > Now T2 is again replayed, as it is not committed to DB, now DB will be again > subtracted with 2000, and now DB will have 5000. > But after T2, the value should be 7000, so we have DB in an incorrect state. > Issue here: > 1. As we use a cached object and put the same cached object into double > buffer this can cause this kind of issue. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org
[jira] [Commented] (HDDS-4308) Fix issue with quota update
[ https://issues.apache.org/jira/browse/HDDS-4308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17211707#comment-17211707 ] mingchao zhao commented on HDDS-4308: - Hi [~bharat] Can we keep updating the usedBytes of volumeArgs in memory and persist it in the takeSnapshot(currently, this action is executed periodically)? Just want to make sure it's feasible. I am not familiar with snapshot and ratis, not sure if this is correct. If feasible, I am very willing to discuss here. > Fix issue with quota update > --- > > Key: HDDS-4308 > URL: https://issues.apache.org/jira/browse/HDDS-4308 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Reporter: Bharat Viswanadham >Priority: Blocker > > Currently volumeArgs using getCacheValue and put the same object in > doubleBuffer, this might cause issue. > Let's take the below scenario: > InitialVolumeArgs quotaBytes -> 1 > 1. T1 -> Update VolumeArgs, and subtracting 1000 and put this updated > volumeArgs to DoubleBuffer. > 2. T2-> Update VolumeArgs, and subtracting 2000 and has not still updated to > double buffer. > *Now at the end of flushing these transactions, our DB should have 7000 as > bytes used.* > Now T1 is picked by double Buffer and when it commits, and as it uses cached > Object put into doubleBuffer, it flushes to DB with the updated value from > T2(As it is a cache object) and update DB with bytesUsed as 7000. > And now OM has restarted, and only DB has transactions till T1. (We get this > info from TransactionInfo > Table(https://issues.apache.org/jira/browse/HDDS-3685) > Now T2 is again replayed, as it is not committed to DB, now DB will be again > subtracted with 2000, and now DB will have 5000. > But after T2, the value should be 7000, so we have DB in an incorrect state. > Issue here: > 1. As we use a cached object and put the same cached object into double > buffer this can cause this kind of issue. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org
[jira] [Commented] (HDDS-4308) Fix issue with quota update
[ https://issues.apache.org/jira/browse/HDDS-4308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17211679#comment-17211679 ] mingchao zhao commented on HDDS-4308: - ??Question: With your tests how much perf impact has been observed? ?? Only the existing approach has been [tested before|https://github.com/apache/hadoop-ozone/pull/1296#issuecomment-692424807]. The performance impact of volume lock hasn’t tested before and it may also be within our tolerance. This should be the easiest way to fix this bug by far. > Fix issue with quota update > --- > > Key: HDDS-4308 > URL: https://issues.apache.org/jira/browse/HDDS-4308 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Reporter: Bharat Viswanadham >Priority: Blocker > > Currently volumeArgs using getCacheValue and put the same object in > doubleBuffer, this might cause issue. > Let's take the below scenario: > InitialVolumeArgs quotaBytes -> 1 > 1. T1 -> Update VolumeArgs, and subtracting 1000 and put this updated > volumeArgs to DoubleBuffer. > 2. T2-> Update VolumeArgs, and subtracting 2000 and has not still updated to > double buffer. > *Now at the end of flushing these transactions, our DB should have 7000 as > bytes used.* > Now T1 is picked by double Buffer and when it commits, and as it uses cached > Object put into doubleBuffer, it flushes to DB with the updated value from > T2(As it is a cache object) and update DB with bytesUsed as 7000. > And now OM has restarted, and only DB has transactions till T1. (We get this > info from TransactionInfo > Table(https://issues.apache.org/jira/browse/HDDS-3685) > Now T2 is again replayed, as it is not committed to DB, now DB will be again > subtracted with 2000, and now DB will have 5000. > But after T2, the value should be 7000, so we have DB in an incorrect state. > Issue here: > 1. As we use a cached object and put the same cached object into double > buffer this can cause this kind of issue. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org
[jira] [Commented] (HDDS-4308) Fix issue with quota update
[ https://issues.apache.org/jira/browse/HDDS-4308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17211269#comment-17211269 ] Bharat Viswanadham commented on HDDS-4308: -- I think the better solution here is copy of a new volumeArgs object in the Request before addResponseToDoubleBuffer. Of course, during the copy process, we need to lock the object volumeArgs in case other operations change it. This might not be complete i believe, If 2 threads acquire copy object and if they update outside lock we have issue again. I think the whole operation should be performed under volume lock. (As we update in-memory it should be quick) But i agree that it might have performance impact across buckets when key writes happen. Question: With your tests how much perf impact has been observed? cc [~arp] For any more thoughts on this issue. > Fix issue with quota update > --- > > Key: HDDS-4308 > URL: https://issues.apache.org/jira/browse/HDDS-4308 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Reporter: Bharat Viswanadham >Priority: Blocker > > Currently volumeArgs using getCacheValue and put the same object in > doubleBuffer, this might cause issue. > Let's take the below scenario: > InitialVolumeArgs quotaBytes -> 1 > 1. T1 -> Update VolumeArgs, and subtracting 1000 and put this updated > volumeArgs to DoubleBuffer. > 2. T2-> Update VolumeArgs, and subtracting 2000 and has not still updated to > double buffer. > *Now at the end of flushing these transactions, our DB should have 7000 as > bytes used.* > Now T1 is picked by double Buffer and when it commits, and as it uses cached > Object put into doubleBuffer, it flushes to DB with the updated value from > T2(As it is a cache object) and update DB with bytesUsed as 7000. > And now OM has restarted, and only DB has transactions till T1. (We get this > info from TransactionInfo > Table(https://issues.apache.org/jira/browse/HDDS-3685) > Now T2 is again replayed, as it is not committed to DB, now DB will be again > subtracted with 2000, and now DB will have 5000. > But after T2, the value should be 7000, so we have DB in an incorrect state. > Issue here: > 1. As we use a cached object and put the same cached object into double > buffer this can cause this kind of issue. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org
[jira] [Commented] (HDDS-4308) Fix issue with quota update
[ https://issues.apache.org/jira/browse/HDDS-4308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17211254#comment-17211254 ] Bharat Viswanadham commented on HDDS-4308: -- As mentioned in the scenario this can happen when double buffer flush is not completed and other transaction requests updating bytes used for the same key. When you run with a load this can be seen, as previously we have observed ConcurrentModificatinException because of using the same cache object in the double buffer. HDDS-2322 But here we might not see an error, but here this can happen bytesUsed will be updated wrongly. Coming to the solution, I think we can use read lock and acquire volume Object using Table#get API and update bytes used and submit this object to double buffer. (In this way, we might not see volume lock contention, as we acquire write lock on volume during create/delete, so this might no affect performance. Let me know your thoughts? > Fix issue with quota update > --- > > Key: HDDS-4308 > URL: https://issues.apache.org/jira/browse/HDDS-4308 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Reporter: Bharat Viswanadham >Priority: Blocker > > Currently volumeArgs using getCacheValue and put the same object in > doubleBuffer, this might cause issue. > Let's take the below scenario: > InitialVolumeArgs quotaBytes -> 1 > 1. T1 -> Update VolumeArgs, and subtracting 1000 and put this updated > volumeArgs to DoubleBuffer. > 2. T2-> Update VolumeArgs, and subtracting 2000 and has not still updated to > double buffer. > *Now at the end of flushing these transactions, our DB should have 7000 as > bytes used.* > Now T1 is picked by double Buffer and when it commits, and as it uses cached > Object put into doubleBuffer, it flushes to DB with the updated value from > T2(As it is a cache object) and update DB with bytesUsed as 7000. > And now OM has restarted, and only DB has transactions till T1. (We get this > info from TransactionInfo > Table(https://issues.apache.org/jira/browse/HDDS-3685) > Now T2 is again replayed, as it is not committed to DB, now DB will be again > subtracted with 2000, and now DB will have 5000. > But after T2, the value should be 7000, so we have DB in an incorrect state. > Issue here: > 1. As we use a cached object and put the same cached object into double > buffer this can cause this kind of issue. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org
[jira] [Commented] (HDDS-4308) Fix issue with quota update
[ https://issues.apache.org/jira/browse/HDDS-4308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17210583#comment-17210583 ] mingchao zhao commented on HDDS-4308: - Hi [~bharat] Sorry for didn't recover in time due to the holiday. > Issue here: > 1. As we use a cached object and put the same cached object into double > buffer this can cause this kind of issue. I get your point, because the cache object of volumeArgs in OM is unique. The usedBytes in the unique object may be modified when DB is updated. Using a cache avoids volume locking, which has a minimal performance impact. But I did not notice the problem you mentioned above. I think the better solution here is copy of a new volumeArgs object in the Request before addResponseToDoubleBuffer. Of course, during the copy process, we need to lock the object volumeArgs in case other operations change it. Any other solution here? > Fix issue with quota update > --- > > Key: HDDS-4308 > URL: https://issues.apache.org/jira/browse/HDDS-4308 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Reporter: Bharat Viswanadham >Priority: Blocker > > Currently volumeArgs using getCacheValue and put the same object in > doubleBuffer, this might cause issue. > Let's take the below scenario: > InitialVolumeArgs quotaBytes -> 1 > 1. T1 -> Update VolumeArgs, and subtracting 1000 and put this updated > volumeArgs to DoubleBuffer. > 2. T2-> Update VolumeArgs, and subtracting 2000 and has not still updated to > double buffer. > *Now at the end of flushing these transactions, our DB should have 7000 as > bytes used.* > Now T1 is picked by double Buffer and when it commits, and as it uses cached > Object put into doubleBuffer, it flushes to DB with the updated value from > T2(As it is a cache object) and update DB with bytesUsed as 7000. > And now OM has restarted, and only DB has transactions till T1. (We get this > info from TransactionInfo > Table(https://issues.apache.org/jira/browse/HDDS-3685) > Now T2 is again replayed, as it is not committed to DB, now DB will be again > subtracted with 2000, and now DB will have 5000. > But after T2, the value should be 7000, so we have DB in an incorrect state. > Issue here: > 1. As we use a cached object and put the same cached object into double > buffer this can cause this kind of issue. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org
[jira] [Commented] (HDDS-4308) Fix issue with quota update
[ https://issues.apache.org/jira/browse/HDDS-4308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17208208#comment-17208208 ] Bharat Viswanadham commented on HDDS-4308: -- cc [~micahzhao] For ur comments on this issue. > Fix issue with quota update > --- > > Key: HDDS-4308 > URL: https://issues.apache.org/jira/browse/HDDS-4308 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Reporter: Bharat Viswanadham >Priority: Major > > Currently volumeArgs using getCacheValue and put the same object in > doubleBuffer, this might cause issue. > Let's take the below scenario: > InitialVolumeArgs quotaBytes -> 1 > 1. T1 -> Update VolumeArgs, and subtracting 1000 and put this updated > volumeArgs to DoubleBuffer. > 2. T2-> Update VolumeArgs, and subtracting 2000 and has not still updated to > double buffer. > *Now at the end of flushing these transactions, our DB should have 7000 as > bytes used.* > Now T1 is picked by double Buffer and when it commits, and as it uses cached > Object put into doubleBuffer, it flushes to DB with the processed value from > T2(As it is a cache object) and update DB with bytesUsed as 7000. > And now OM has restarted, and only DB has transactions till T1. (We get this > info from TransactionInfo > Table(https://issues.apache.org/jira/browse/HDDS-3685) > Now T2 is again replayed, as it is not committed to DB, now DB will be again > subtracted with 2000, and now DB will have 5000. > Issue here: > 1. As we use a cached object and put the same cached object into double > buffer this can cause this kind of issue. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org