smengcl edited a comment on issue #696: HDDS-3056. Allow users to list volumes 
they have access to, and optionally allow all users to list all volumes
URL: https://github.com/apache/hadoop-ozone/pull/696#issuecomment-611991079
 
 
   I was able to dig a bit into the root cause of the timeout.
   
   It turns out that when a mini Ozone cluster launches for a second time in 
the **same** test class, the OM side of the `setOwner()` call will [add the 
same volume to the owner 
list](https://github.com/apache/hadoop-ozone/blob/80e9f0a7238953e41b06d22f0419f04ab31d4212/hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/request/volume/OMVolumeSetOwnerRequest.java#L156-L158)
 for a second time and **succeed**, which is very weird.
   The result is a **malformed** list in `UserVolumeInfo` for the user; 
see the `prevVolList` variable in the screenshot below:
   <img width="1440" alt="ss1" 
src="https://user-images.githubusercontent.com/50227127/78987769-cc2eb780-7ae3-11ea-9dc7-544b3783c667.png">
   
   This causes `testAclDisabledListAllDisallowed` to get stuck in an 
infinite `it.hasNext()` loop and eventually time out.
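
   A minimal sketch of how a duplicated entry could produce this symptom. 
It assumes the listing walks the owner list by resuming after the last 
returned volume name, which may not be the exact mechanism in the OM. The 
class `OwnerVolumeListSketch` and the methods `addUnchecked`, `addIfAbsent` 
and `walk` are hypothetical stand-ins, not the real OM code:

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a per-user volume list; not the real OM code. */
class OwnerVolumeListSketch {

  /** Appends unconditionally, so a repeated call leaves a duplicate entry. */
  static List<String> addUnchecked(List<String> prevVolList, String volume) {
    List<String> updated = new ArrayList<>(prevVolList);
    updated.add(volume);
    return updated;
  }

  /** Appends only if absent, so a repeated call for the same volume is a no-op. */
  static List<String> addIfAbsent(List<String> prevVolList, String volume) {
    List<String> updated = new ArrayList<>(prevVolList);
    if (!updated.contains(volume)) {
      updated.add(volume);
    }
    return updated;
  }

  /**
   * Walks the list by resuming after the last returned name (assumed behaviour).
   * Returns the number of steps taken, capped at maxSteps so the demo terminates.
   */
  static int walk(List<String> vols, int maxSteps) {
    int steps = 0;
    String last = null;
    while (steps < maxSteps) {
      // indexOf always finds the FIRST match, so a duplicate resets the position.
      int next = (last == null) ? 0 : vols.indexOf(last) + 1;
      if (next >= vols.size()) {
        break;
      }
      last = vols.get(next);
      steps++;
    }
    return steps;
  }

  public static void main(String[] args) {
    List<String> once = addUnchecked(new ArrayList<>(), "vol1");
    List<String> twice = addUnchecked(once, "vol1");
    List<String> guarded = addIfAbsent(once, "vol1");
    System.out.println(twice);                 // [vol1, vol1]
    System.out.println(guarded);               // [vol1]
    System.out.println(walk(twice, 1000));     // 1000 (hits the cap, never finishes)
    System.out.println(walk(guarded, 1000));   // 1
  }
}
```

   With the duplicate present, the walk keeps landing on the second `vol1` 
and only stops because of the step cap; with the guarded add it finishes 
after one step.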
   
   I was able to confirm this by setting a breakpoint inside 
`addVolumeToOwnerList()`.
   
   If I run only `testAclDisabledListAllDisallowed` directly in IntelliJ, the 
test passes. This makes the problem weirder, because I do call the shutdown 
function in `MiniOzoneClusterImpl` to clean up, and it does [delete the temp 
directory for the entire 
cluster](https://github.com/apache/hadoop-ozone/blob/e2ebbf874d5e33565b27a24a02cfb4cee6330ea1/hadoop-ozone/integration-test/src/test/java/org/apache/hadoop/ozone/MiniOzoneClusterImpl.java#L392).
 In theory, this should have taken care of the cleanup.
   
   My questions:
   
   1. Is there some other in-memory cache (`TableCache`) that accidentally 
persists across mini clusters (i.e. is not fully cleaned up in 
`MiniOzoneClusterImpl`)? If so, we just need to fix the test utility.
   
   2. Or could it be that the 
[`userTable`](https://github.com/apache/hadoop-ozone/blob/876bec0130094b24472a7017fdb1fd81a65023bc/hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/OmMetadataManagerImpl.java#L140)
 is flushed by mistake? That would be a major bug (outside the scope of this 
Jira) that should be fixed.
   
   Pinging @bharatviswa504 and @elek for some help.
