[ https://issues.apache.org/jira/browse/GERONIMO-3489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Donald Woods reassigned GERONIMO-3489:
--------------------------------------

    Assignee: Donald Woods

> Deployment problems caused by file deletion failures
> ----------------------------------------------------
>
>                 Key: GERONIMO-3489
>                 URL: https://issues.apache.org/jira/browse/GERONIMO-3489
>             Project: Geronimo
>          Issue Type: Bug
>      Security Level: public(Regular issues)
>          Components: deployment
>    Affects Versions: 2.0.1
>            Reporter: Ted Kirby
>            Assignee: Donald Woods
>             Fix For: 2.0.2, 2.0.x, 2.1
>
>         Attachments: G3489-1.patch, G3489-2.patch
>
>
> File.delete() failures in IOUtil.recursiveDelete() are causing various
> deployment problems. I am opening this JIRA to discuss them and to see how
> the server might handle them better. In all but one case, delete failures
> are not even noted with a log record! Deletion problems are seen in many
> environments and platforms, but they are persistently fatal when using an
> NFS file system for the repository.
> In investigating the problem, I added code to recursiveDelete to retry the
> delete a few times if it fails. I also added code to list the directory
> contents if a directory delete failed, and saw a file named
> .nfs000000002bc43500000053e in the directory. My first attempt at a bypass
> was to retry a failed delete 5 times, sleeping a second before each try.
> This did not work. I added a call to System.gc() before each sleep, and
> this got me past the problem. Interestingly, two retries were required to
> get this to work. In another version, each retry slept a second longer than
> the previous one, and I printed all file names in a directory before trying
> the delete. This worked in most cases, but required the full 5 retries, so
> I suspect System.gc() would have saved time. System.runFinalization() would
> be something else to try.
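
For readers following the thread, the retry experiment described above might look roughly like the sketch below. It is not the code from G3489-1.patch or G3489-2.patch, and it is not Geronimo's IOUtil; the class name DeleteRetrySketch, both method names, and the retry/sleep parameters are hypothetical. It only illustrates the idea of logging a failed delete, listing whatever is left in the directory, calling System.gc(), sleeping, and trying again.

```java
import java.io.File;
import java.util.Arrays;

public class DeleteRetrySketch {

    /** Deletes a file or directory tree, retrying each delete that fails. */
    static boolean recursiveDeleteWithRetry(File root, int maxRetries, long sleepMillis)
            throws InterruptedException {
        File[] children = root.listFiles(); // null for plain files or on I/O error
        if (children != null) {
            for (File child : children) {
                if (!recursiveDeleteWithRetry(child, maxRetries, sleepMillis)) {
                    return false;
                }
            }
        }
        return deleteWithRetry(root, maxRetries, sleepMillis);
    }

    /** Tries File.delete(), then retries up to maxRetries more times. */
    static boolean deleteWithRetry(File file, int maxRetries, long sleepMillis)
            throws InterruptedException {
        for (int attempt = 0; attempt <= maxRetries; attempt++) {
            if (file.delete() || !file.exists()) {
                return true; // deleted, or already gone
            }
            // Always record the failure so it is never silent.
            String[] leftovers = file.list(); // null unless file is a directory
            System.err.println("Delete failed (attempt " + (attempt + 1) + "): " + file
                    + (leftovers != null ? " contains " + Arrays.toString(leftovers) : ""));
            if (attempt < maxRetries) {
                // Assumption: a GC pass may release unclosed streams that keep
                // the file (or an NFS .nfsXXXX placeholder) alive.
                System.gc();
                Thread.sleep(sleepMillis);
            }
        }
        return false;
    }

    public static void main(String[] args) throws Exception {
        File dir = java.nio.file.Files.createTempDirectory("g3489-demo").toFile();
        new File(dir, "a/b").mkdirs();
        System.out.println("deleted: " + recursiveDeleteWithRetry(dir, 5, 1000L));
    }
}
```

Returning a boolean leaves it to the caller (for example an undeploy step) to decide whether to fail the operation when the directory could not be removed. System.gc() is only a hint to the JVM, so this is a workaround for leaked handles rather than a fix for them.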
> RepositoryConfigurationStore.createNewConfigurationDir(Artifact) shows the
> failing end of the deletion problem, with the dreaded
> ConfigurationAlreadyExistsException("Configuration already exists: " +
> configId) exception. I think this message is not good. It should really
> say that the directory already exists. If the file is not deleted on
> undeploy, this failure occurs on a subsequent deploy. What is really bad is
> when the user invokes a redeploy operation and the file delete fails during
> the undeploy step. It is important that undeploy not complete until the
> file goes away.
> From other environments, I am not convinced that all file handles and
> references, and particularly open streams, are being closed on some
> artifacts. This will cause the delete to fail. It may be that the gc()
> calls are cleaning these up, allowing the deletes to work in my case above.
> Another option is that
> RepositoryConfigurationStore.createNewConfigurationDir(Artifact) not throw
> a ConfigurationAlreadyExistsException if the only problem is that an empty
> directory structure exists. The next line creates the directory structure
> anyway.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.