[jira] [Commented] (HDDS-4308) Fix issue with quota update

2020-10-26 Thread mingchao zhao (Jira)


[ 
https://issues.apache.org/jira/browse/HDDS-4308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17220693#comment-17220693
 ] 

mingchao zhao commented on HDDS-4308:
-

??Can we just make the method OMKeyRequest#getVolumeInfo be thread safe to 
return a copied object, this should be okay for current issue? This will make 
the logic more simplified.??
+1 Thanks for [~linyiqun]'s advice, I think this way is feasible. It has little 
impact on code structure and avoids frequent lock switching. According to this 
idea, I modified the original PR.

> Fix issue with quota update
> ---
>
> Key: HDDS-4308
> URL: https://issues.apache.org/jira/browse/HDDS-4308
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Bharat Viswanadham
>Assignee: mingchao zhao
>Priority: Blocker
>  Labels: pull-request-available
>
> Currently volumeArgs using getCacheValue and put the same object in 
> doubleBuffer, this might cause issue.
> Let's take the below scenario:
> InitialVolumeArgs quotaBytes -> 1
> 1. T1 -> Update VolumeArgs, and subtracting 1000 and put this updated 
> volumeArgs to DoubleBuffer.
> 2. T2-> Update VolumeArgs, and subtracting 2000 and has not still updated to 
> double buffer.
> *Now at the end of flushing these transactions, our DB should have 7000 as 
> bytes used.*
> Now T1 is picked by double Buffer and when it commits, and as it uses cached 
> Object put into doubleBuffer, it flushes to DB with the updated value from 
> T2(As it is a cache object) and update DB with bytesUsed as 7000.
> And now OM has restarted, and only DB has transactions till T1. (We get this 
> info from TransactionInfo 
> Table(https://issues.apache.org/jira/browse/HDDS-3685)
> Now T2 is again replayed, as it is not committed to DB, now DB will be again 
> subtracted with 2000, and now DB will have 5000.
> But after T2, the value should be 7000, so we have DB in an incorrect state.
> Issue here:
> 1. As we use a cached object and put the same cached object into double 
> buffer this can cause this kind of issue. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[jira] [Commented] (HDDS-4308) Fix issue with quota update

2020-10-15 Thread Yiqun Lin (Jira)


[ 
https://issues.apache.org/jira/browse/HDDS-4308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17215148#comment-17215148
 ] 

Yiqun Lin commented on HDDS-4308:
-

{quote}This might not be complete i believe, If 2 threads acquire copy object 
and if they update outside lock we have issue again. I think the whole 
operation should be performed under volume lock. (As we update in-memory it 
should be quick) But i agree that it might have performance impact across 
buckets when key writes happen.
{quote}
Use volume lock during bucket operation makes logic a little complex. 
 As current PR change does:
 1)acquire bucket lock
 2)release bucket lock
 3)acquire volume lock
    update volume usedBytes usage
 4)release volume lock
 5)acquire bucket lock again (to finish remaining operation)
 6)release bucket lock

Can we just make the method OMKeyRequest#getVolumeInfo be thread safe to return 
a copied object, this should be okay for current issue? This will make the 
logic more simplified.
 Like:
{code:java}
  public static synchronized OmVolumeArgs getVolumeInfo(OMMetadataManager 
omMetadataManager,
  String volume) {
return omMetadataManager.getVolumeTable().getCacheValue(
new CacheKey<>(omMetadataManager.getVolumeKey(volume)))
.getCacheValue().copyObject();
  }
{code}

> Fix issue with quota update
> ---
>
> Key: HDDS-4308
> URL: https://issues.apache.org/jira/browse/HDDS-4308
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Bharat Viswanadham
>Assignee: mingchao zhao
>Priority: Blocker
>  Labels: pull-request-available
>
> Currently volumeArgs using getCacheValue and put the same object in 
> doubleBuffer, this might cause issue.
> Let's take the below scenario:
> InitialVolumeArgs quotaBytes -> 1
> 1. T1 -> Update VolumeArgs, and subtracting 1000 and put this updated 
> volumeArgs to DoubleBuffer.
> 2. T2-> Update VolumeArgs, and subtracting 2000 and has not still updated to 
> double buffer.
> *Now at the end of flushing these transactions, our DB should have 7000 as 
> bytes used.*
> Now T1 is picked by double Buffer and when it commits, and as it uses cached 
> Object put into doubleBuffer, it flushes to DB with the updated value from 
> T2(As it is a cache object) and update DB with bytesUsed as 7000.
> And now OM has restarted, and only DB has transactions till T1. (We get this 
> info from TransactionInfo 
> Table(https://issues.apache.org/jira/browse/HDDS-3685)
> Now T2 is again replayed, as it is not committed to DB, now DB will be again 
> subtracted with 2000, and now DB will have 5000.
> But after T2, the value should be 7000, so we have DB in an incorrect state.
> Issue here:
> 1. As we use a cached object and put the same cached object into double 
> buffer this can cause this kind of issue. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[jira] [Commented] (HDDS-4308) Fix issue with quota update

2020-10-15 Thread mingchao zhao (Jira)


[ 
https://issues.apache.org/jira/browse/HDDS-4308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17214653#comment-17214653
 ] 

mingchao zhao commented on HDDS-4308:
-

Hi [~bharat] I submitted a 
[PR|https://github.com/apache/hadoop-ozone/pull/1489] to fix this problem, 
please help to review it.

> Fix issue with quota update
> ---
>
> Key: HDDS-4308
> URL: https://issues.apache.org/jira/browse/HDDS-4308
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Bharat Viswanadham
>Assignee: mingchao zhao
>Priority: Blocker
>  Labels: pull-request-available
>
> Currently volumeArgs using getCacheValue and put the same object in 
> doubleBuffer, this might cause issue.
> Let's take the below scenario:
> InitialVolumeArgs quotaBytes -> 1
> 1. T1 -> Update VolumeArgs, and subtracting 1000 and put this updated 
> volumeArgs to DoubleBuffer.
> 2. T2-> Update VolumeArgs, and subtracting 2000 and has not still updated to 
> double buffer.
> *Now at the end of flushing these transactions, our DB should have 7000 as 
> bytes used.*
> Now T1 is picked by double Buffer and when it commits, and as it uses cached 
> Object put into doubleBuffer, it flushes to DB with the updated value from 
> T2(As it is a cache object) and update DB with bytesUsed as 7000.
> And now OM has restarted, and only DB has transactions till T1. (We get this 
> info from TransactionInfo 
> Table(https://issues.apache.org/jira/browse/HDDS-3685)
> Now T2 is again replayed, as it is not committed to DB, now DB will be again 
> subtracted with 2000, and now DB will have 5000.
> But after T2, the value should be 7000, so we have DB in an incorrect state.
> Issue here:
> 1. As we use a cached object and put the same cached object into double 
> buffer this can cause this kind of issue. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[jira] [Commented] (HDDS-4308) Fix issue with quota update

2020-10-12 Thread mingchao zhao (Jira)


[ 
https://issues.apache.org/jira/browse/HDDS-4308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17212808#comment-17212808
 ] 

mingchao zhao commented on HDDS-4308:
-

Thanks for [~bharat]'s advice. I will submit PR in this way as soon as possible 
to solve this bug.

> Fix issue with quota update
> ---
>
> Key: HDDS-4308
> URL: https://issues.apache.org/jira/browse/HDDS-4308
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Bharat Viswanadham
>Priority: Blocker
>
> Currently volumeArgs using getCacheValue and put the same object in 
> doubleBuffer, this might cause issue.
> Let's take the below scenario:
> InitialVolumeArgs quotaBytes -> 1
> 1. T1 -> Update VolumeArgs, and subtracting 1000 and put this updated 
> volumeArgs to DoubleBuffer.
> 2. T2-> Update VolumeArgs, and subtracting 2000 and has not still updated to 
> double buffer.
> *Now at the end of flushing these transactions, our DB should have 7000 as 
> bytes used.*
> Now T1 is picked by double Buffer and when it commits, and as it uses cached 
> Object put into doubleBuffer, it flushes to DB with the updated value from 
> T2(As it is a cache object) and update DB with bytesUsed as 7000.
> And now OM has restarted, and only DB has transactions till T1. (We get this 
> info from TransactionInfo 
> Table(https://issues.apache.org/jira/browse/HDDS-3685)
> Now T2 is again replayed, as it is not committed to DB, now DB will be again 
> subtracted with 2000, and now DB will have 5000.
> But after T2, the value should be 7000, so we have DB in an incorrect state.
> Issue here:
> 1. As we use a cached object and put the same cached object into double 
> buffer this can cause this kind of issue. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[jira] [Commented] (HDDS-4308) Fix issue with quota update

2020-10-12 Thread Bharat Viswanadham (Jira)


[ 
https://issues.apache.org/jira/browse/HDDS-4308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17212736#comment-17212736
 ] 

Bharat Viswanadham commented on HDDS-4308:
--

{quote}The performance impact of volume lock hasn’t tested before and it may 
also be within our tolerance(In-memory operations can be really fast). This 
should be the easiest way to fix this bug by far.
{quote}
Agreed using volume lock and then doing calculation will solve the correctness 
issue.

If we don't have any smart solution until then we can fix this by using volume 
lock and update bytes used.

> Fix issue with quota update
> ---
>
> Key: HDDS-4308
> URL: https://issues.apache.org/jira/browse/HDDS-4308
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Bharat Viswanadham
>Priority: Blocker
>
> Currently volumeArgs using getCacheValue and put the same object in 
> doubleBuffer, this might cause issue.
> Let's take the below scenario:
> InitialVolumeArgs quotaBytes -> 1
> 1. T1 -> Update VolumeArgs, and subtracting 1000 and put this updated 
> volumeArgs to DoubleBuffer.
> 2. T2-> Update VolumeArgs, and subtracting 2000 and has not still updated to 
> double buffer.
> *Now at the end of flushing these transactions, our DB should have 7000 as 
> bytes used.*
> Now T1 is picked by double Buffer and when it commits, and as it uses cached 
> Object put into doubleBuffer, it flushes to DB with the updated value from 
> T2(As it is a cache object) and update DB with bytesUsed as 7000.
> And now OM has restarted, and only DB has transactions till T1. (We get this 
> info from TransactionInfo 
> Table(https://issues.apache.org/jira/browse/HDDS-3685)
> Now T2 is again replayed, as it is not committed to DB, now DB will be again 
> subtracted with 2000, and now DB will have 5000.
> But after T2, the value should be 7000, so we have DB in an incorrect state.
> Issue here:
> 1. As we use a cached object and put the same cached object into double 
> buffer this can cause this kind of issue. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[jira] [Commented] (HDDS-4308) Fix issue with quota update

2020-10-12 Thread Rui Wang (Jira)


[ 
https://issues.apache.org/jira/browse/HDDS-4308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17212631#comment-17212631
 ] 

Rui Wang commented on HDDS-4308:


[~micahzhao] thanks for this reminder. Will follow up on discussions above to 
understand necessary steps.

> Fix issue with quota update
> ---
>
> Key: HDDS-4308
> URL: https://issues.apache.org/jira/browse/HDDS-4308
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Bharat Viswanadham
>Priority: Blocker
>
> Currently volumeArgs using getCacheValue and put the same object in 
> doubleBuffer, this might cause issue.
> Let's take the below scenario:
> InitialVolumeArgs quotaBytes -> 1
> 1. T1 -> Update VolumeArgs, and subtracting 1000 and put this updated 
> volumeArgs to DoubleBuffer.
> 2. T2-> Update VolumeArgs, and subtracting 2000 and has not still updated to 
> double buffer.
> *Now at the end of flushing these transactions, our DB should have 7000 as 
> bytes used.*
> Now T1 is picked by double Buffer and when it commits, and as it uses cached 
> Object put into doubleBuffer, it flushes to DB with the updated value from 
> T2(As it is a cache object) and update DB with bytesUsed as 7000.
> And now OM has restarted, and only DB has transactions till T1. (We get this 
> info from TransactionInfo 
> Table(https://issues.apache.org/jira/browse/HDDS-3685)
> Now T2 is again replayed, as it is not committed to DB, now DB will be again 
> subtracted with 2000, and now DB will have 5000.
> But after T2, the value should be 7000, so we have DB in an incorrect state.
> Issue here:
> 1. As we use a cached object and put the same cached object into double 
> buffer this can cause this kind of issue. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[jira] [Commented] (HDDS-4308) Fix issue with quota update

2020-10-12 Thread mingchao zhao (Jira)


[ 
https://issues.apache.org/jira/browse/HDDS-4308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17212162#comment-17212162
 ] 

mingchao zhao commented on HDDS-4308:
-

cc [~amaliujia], HDDS-4272 and HDDS-4277 will also have this problem, need to 
pay attention to the 
 here.

> Fix issue with quota update
> ---
>
> Key: HDDS-4308
> URL: https://issues.apache.org/jira/browse/HDDS-4308
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Bharat Viswanadham
>Priority: Blocker
>
> Currently volumeArgs using getCacheValue and put the same object in 
> doubleBuffer, this might cause issue.
> Let's take the below scenario:
> InitialVolumeArgs quotaBytes -> 1
> 1. T1 -> Update VolumeArgs, and subtracting 1000 and put this updated 
> volumeArgs to DoubleBuffer.
> 2. T2-> Update VolumeArgs, and subtracting 2000 and has not still updated to 
> double buffer.
> *Now at the end of flushing these transactions, our DB should have 7000 as 
> bytes used.*
> Now T1 is picked by double Buffer and when it commits, and as it uses cached 
> Object put into doubleBuffer, it flushes to DB with the updated value from 
> T2(As it is a cache object) and update DB with bytesUsed as 7000.
> And now OM has restarted, and only DB has transactions till T1. (We get this 
> info from TransactionInfo 
> Table(https://issues.apache.org/jira/browse/HDDS-3685)
> Now T2 is again replayed, as it is not committed to DB, now DB will be again 
> subtracted with 2000, and now DB will have 5000.
> But after T2, the value should be 7000, so we have DB in an incorrect state.
> Issue here:
> 1. As we use a cached object and put the same cached object into double 
> buffer this can cause this kind of issue. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[jira] [Commented] (HDDS-4308) Fix issue with quota update

2020-10-10 Thread mingchao zhao (Jira)


[ 
https://issues.apache.org/jira/browse/HDDS-4308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17211707#comment-17211707
 ] 

mingchao zhao commented on HDDS-4308:
-

Hi [~bharat] Can we keep updating the usedBytes of volumeArgs in memory and 
persist it in the takeSnapshot(currently, this action is executed 
periodically)? 
Just want to make sure it's feasible. I am not familiar with snapshot and 
ratis, not sure if this is correct. If feasible, I am very willing to discuss 
here.

> Fix issue with quota update
> ---
>
> Key: HDDS-4308
> URL: https://issues.apache.org/jira/browse/HDDS-4308
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Bharat Viswanadham
>Priority: Blocker
>
> Currently volumeArgs using getCacheValue and put the same object in 
> doubleBuffer, this might cause issue.
> Let's take the below scenario:
> InitialVolumeArgs quotaBytes -> 1
> 1. T1 -> Update VolumeArgs, and subtracting 1000 and put this updated 
> volumeArgs to DoubleBuffer.
> 2. T2-> Update VolumeArgs, and subtracting 2000 and has not still updated to 
> double buffer.
> *Now at the end of flushing these transactions, our DB should have 7000 as 
> bytes used.*
> Now T1 is picked by double Buffer and when it commits, and as it uses cached 
> Object put into doubleBuffer, it flushes to DB with the updated value from 
> T2(As it is a cache object) and update DB with bytesUsed as 7000.
> And now OM has restarted, and only DB has transactions till T1. (We get this 
> info from TransactionInfo 
> Table(https://issues.apache.org/jira/browse/HDDS-3685)
> Now T2 is again replayed, as it is not committed to DB, now DB will be again 
> subtracted with 2000, and now DB will have 5000.
> But after T2, the value should be 7000, so we have DB in an incorrect state.
> Issue here:
> 1. As we use a cached object and put the same cached object into double 
> buffer this can cause this kind of issue. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[jira] [Commented] (HDDS-4308) Fix issue with quota update

2020-10-10 Thread mingchao zhao (Jira)


[ 
https://issues.apache.org/jira/browse/HDDS-4308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17211679#comment-17211679
 ] 

mingchao zhao commented on HDDS-4308:
-

??Question: With your tests how much perf impact has been observed? ??
Only the existing approach has been [tested 
before|https://github.com/apache/hadoop-ozone/pull/1296#issuecomment-692424807].
 
The performance impact of volume lock hasn’t tested before and it may also be 
within our tolerance. This should be the easiest way to fix this bug by far.

> Fix issue with quota update
> ---
>
> Key: HDDS-4308
> URL: https://issues.apache.org/jira/browse/HDDS-4308
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Bharat Viswanadham
>Priority: Blocker
>
> Currently volumeArgs using getCacheValue and put the same object in 
> doubleBuffer, this might cause issue.
> Let's take the below scenario:
> InitialVolumeArgs quotaBytes -> 1
> 1. T1 -> Update VolumeArgs, and subtracting 1000 and put this updated 
> volumeArgs to DoubleBuffer.
> 2. T2-> Update VolumeArgs, and subtracting 2000 and has not still updated to 
> double buffer.
> *Now at the end of flushing these transactions, our DB should have 7000 as 
> bytes used.*
> Now T1 is picked by double Buffer and when it commits, and as it uses cached 
> Object put into doubleBuffer, it flushes to DB with the updated value from 
> T2(As it is a cache object) and update DB with bytesUsed as 7000.
> And now OM has restarted, and only DB has transactions till T1. (We get this 
> info from TransactionInfo 
> Table(https://issues.apache.org/jira/browse/HDDS-3685)
> Now T2 is again replayed, as it is not committed to DB, now DB will be again 
> subtracted with 2000, and now DB will have 5000.
> But after T2, the value should be 7000, so we have DB in an incorrect state.
> Issue here:
> 1. As we use a cached object and put the same cached object into double 
> buffer this can cause this kind of issue. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[jira] [Commented] (HDDS-4308) Fix issue with quota update

2020-10-09 Thread Bharat Viswanadham (Jira)


[ 
https://issues.apache.org/jira/browse/HDDS-4308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17211269#comment-17211269
 ] 

Bharat Viswanadham commented on HDDS-4308:
--

I think the better solution here is copy of a new volumeArgs object in the 
Request before addResponseToDoubleBuffer. Of course, during the copy process, 
we need to lock the object volumeArgs in case other operations change it.

This might not be complete i believe, If 2 threads acquire copy object and if 
they update outside lock we have issue again. I think the whole operation 
should be performed under volume lock. (As we update in-memory it should be 
quick) But i agree that it might have performance impact across buckets when 
key writes happen.

Question: With your tests how much perf impact has been observed?

cc [~arp] For any more thoughts on this issue.




> Fix issue with quota update
> ---
>
> Key: HDDS-4308
> URL: https://issues.apache.org/jira/browse/HDDS-4308
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Bharat Viswanadham
>Priority: Blocker
>
> Currently volumeArgs using getCacheValue and put the same object in 
> doubleBuffer, this might cause issue.
> Let's take the below scenario:
> InitialVolumeArgs quotaBytes -> 1
> 1. T1 -> Update VolumeArgs, and subtracting 1000 and put this updated 
> volumeArgs to DoubleBuffer.
> 2. T2-> Update VolumeArgs, and subtracting 2000 and has not still updated to 
> double buffer.
> *Now at the end of flushing these transactions, our DB should have 7000 as 
> bytes used.*
> Now T1 is picked by double Buffer and when it commits, and as it uses cached 
> Object put into doubleBuffer, it flushes to DB with the updated value from 
> T2(As it is a cache object) and update DB with bytesUsed as 7000.
> And now OM has restarted, and only DB has transactions till T1. (We get this 
> info from TransactionInfo 
> Table(https://issues.apache.org/jira/browse/HDDS-3685)
> Now T2 is again replayed, as it is not committed to DB, now DB will be again 
> subtracted with 2000, and now DB will have 5000.
> But after T2, the value should be 7000, so we have DB in an incorrect state.
> Issue here:
> 1. As we use a cached object and put the same cached object into double 
> buffer this can cause this kind of issue. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[jira] [Commented] (HDDS-4308) Fix issue with quota update

2020-10-09 Thread Bharat Viswanadham (Jira)


[ 
https://issues.apache.org/jira/browse/HDDS-4308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17211254#comment-17211254
 ] 

Bharat Viswanadham commented on HDDS-4308:
--

As mentioned in the scenario this can happen when double buffer flush is not 
completed and other transaction requests updating bytes used for the same key.
When you run with a load this can be seen, as previously we have observed 
ConcurrentModificatinException because of using the same cache object in the 
double buffer. HDDS-2322

But here we might not see an error, but here this can happen bytesUsed will be 
updated wrongly.

Coming to the solution, I think we can use read lock and acquire volume Object 
using Table#get API and update bytes used and submit this object to double 
buffer. (In this way, we might not see volume lock contention, as we acquire 
write lock on volume during create/delete, so this might no affect performance.

Let me know your thoughts?



> Fix issue with quota update
> ---
>
> Key: HDDS-4308
> URL: https://issues.apache.org/jira/browse/HDDS-4308
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Bharat Viswanadham
>Priority: Blocker
>
> Currently volumeArgs using getCacheValue and put the same object in 
> doubleBuffer, this might cause issue.
> Let's take the below scenario:
> InitialVolumeArgs quotaBytes -> 1
> 1. T1 -> Update VolumeArgs, and subtracting 1000 and put this updated 
> volumeArgs to DoubleBuffer.
> 2. T2-> Update VolumeArgs, and subtracting 2000 and has not still updated to 
> double buffer.
> *Now at the end of flushing these transactions, our DB should have 7000 as 
> bytes used.*
> Now T1 is picked by double Buffer and when it commits, and as it uses cached 
> Object put into doubleBuffer, it flushes to DB with the updated value from 
> T2(As it is a cache object) and update DB with bytesUsed as 7000.
> And now OM has restarted, and only DB has transactions till T1. (We get this 
> info from TransactionInfo 
> Table(https://issues.apache.org/jira/browse/HDDS-3685)
> Now T2 is again replayed, as it is not committed to DB, now DB will be again 
> subtracted with 2000, and now DB will have 5000.
> But after T2, the value should be 7000, so we have DB in an incorrect state.
> Issue here:
> 1. As we use a cached object and put the same cached object into double 
> buffer this can cause this kind of issue. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[jira] [Commented] (HDDS-4308) Fix issue with quota update

2020-10-08 Thread mingchao zhao (Jira)


[ 
https://issues.apache.org/jira/browse/HDDS-4308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17210583#comment-17210583
 ] 

mingchao zhao commented on HDDS-4308:
-

Hi [~bharat] Sorry for didn't recover in time due to the holiday.

> Issue here:
> 1. As we use a cached object and put the same cached object into double 
> buffer this can cause this kind of issue.

I get your point, because the cache object of volumeArgs in OM is unique. The 
usedBytes in the unique object may be modified when DB is updated. 
Using a cache avoids volume locking, which has a minimal performance impact. 
But I did not  notice the problem you mentioned above.
I think the better solution here is copy of a new volumeArgs object in the 
Request before addResponseToDoubleBuffer. Of course, during the copy process, 
we need to lock the object volumeArgs in case other operations change it.

Any other solution here?



> Fix issue with quota update
> ---
>
> Key: HDDS-4308
> URL: https://issues.apache.org/jira/browse/HDDS-4308
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Bharat Viswanadham
>Priority: Blocker
>
> Currently volumeArgs using getCacheValue and put the same object in 
> doubleBuffer, this might cause issue.
> Let's take the below scenario:
> InitialVolumeArgs quotaBytes -> 1
> 1. T1 -> Update VolumeArgs, and subtracting 1000 and put this updated 
> volumeArgs to DoubleBuffer.
> 2. T2-> Update VolumeArgs, and subtracting 2000 and has not still updated to 
> double buffer.
> *Now at the end of flushing these transactions, our DB should have 7000 as 
> bytes used.*
> Now T1 is picked by double Buffer and when it commits, and as it uses cached 
> Object put into doubleBuffer, it flushes to DB with the updated value from 
> T2(As it is a cache object) and update DB with bytesUsed as 7000.
> And now OM has restarted, and only DB has transactions till T1. (We get this 
> info from TransactionInfo 
> Table(https://issues.apache.org/jira/browse/HDDS-3685)
> Now T2 is again replayed, as it is not committed to DB, now DB will be again 
> subtracted with 2000, and now DB will have 5000.
> But after T2, the value should be 7000, so we have DB in an incorrect state.
> Issue here:
> 1. As we use a cached object and put the same cached object into double 
> buffer this can cause this kind of issue. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[jira] [Commented] (HDDS-4308) Fix issue with quota update

2020-10-05 Thread Bharat Viswanadham (Jira)


[ 
https://issues.apache.org/jira/browse/HDDS-4308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17208208#comment-17208208
 ] 

Bharat Viswanadham commented on HDDS-4308:
--

cc [~micahzhao]
For ur comments on this issue.

> Fix issue with quota update
> ---
>
> Key: HDDS-4308
> URL: https://issues.apache.org/jira/browse/HDDS-4308
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Bharat Viswanadham
>Priority: Major
>
> Currently volumeArgs using getCacheValue and put the same object in 
> doubleBuffer, this might cause issue.
> Let's take the below scenario:
> InitialVolumeArgs quotaBytes -> 1
> 1. T1 -> Update VolumeArgs, and subtracting 1000 and put this updated 
> volumeArgs to DoubleBuffer.
> 2. T2-> Update VolumeArgs, and subtracting 2000 and has not still updated to 
> double buffer.
> *Now at the end of flushing these transactions, our DB should have 7000 as 
> bytes used.*
> Now T1 is picked by double Buffer and when it commits, and as it uses cached 
> Object put into doubleBuffer, it flushes to DB with the processed value from 
> T2(As it is a cache object) and update DB with bytesUsed as 7000.
> And now OM has restarted, and only DB has transactions till T1. (We get this 
> info from TransactionInfo 
> Table(https://issues.apache.org/jira/browse/HDDS-3685)
> Now T2 is again replayed, as it is not committed to DB, now DB will be again 
> subtracted with 2000, and now DB will have 5000.
> Issue here:
> 1. As we use a cached object and put the same cached object into double 
> buffer this can cause this kind of issue. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org