Re: [PR] SOLR-16866: Deleting a non-existent directory should not fail [solr]
HoustonPutman commented on PR #2336: URL: https://github.com/apache/solr/pull/2336#issuecomment-2057545040 Closing in favor of https://github.com/apache/solr/pull/2349 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For additional commands, e-mail: issues-h...@solr.apache.org
Re: [PR] SOLR-16866: Deleting a non-existent directory should not fail [solr]
HoustonPutman closed pull request #2336: SOLR-16866: Deleting a non-existent directory should not fail URL: https://github.com/apache/solr/pull/2336 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For additional commands, e-mail: issues-h...@solr.apache.org
Re: [PR] SOLR-16866: Deleting a non-existent directory should not fail [solr]
dsmiley commented on PR #2336: URL: https://github.com/apache/solr/pull/2336#issuecomment-2050682221 Although I haven't looked at this in as much detail as others, I find Michael's arguments persuasive and would prefer we do #2349 instead. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For additional commands, e-mail: issues-h...@solr.apache.org
Re: [PR] SOLR-16866: Deleting a non-existent directory should not fail [solr]
magibney commented on PR #2336: URL: https://github.com/apache/solr/pull/2336#issuecomment-2012621142 Any thoughts on #2349 ? This issue creates a lot of log noise 😓 >>Maybe the error occurs while it's deleting a child tree, and the rest of the files arent' deleted? >Yeah I expect that'd be the case. Btw I no longer think there's a race condition leaving the file tree partially undeleted. >The stack trace to me looks like it could be there are 2 things doing the delete and its a race condition who gets there first? Otherwise I don't see how we would end up getting this exception. This isn't the top level path right? The stack trace says this is happening while walking the file tree... I think this is very likely a top-level path -- it's "walking the file tree", but it just hasn't gotten beyond the top level yet. In the vast majority of cases the approach taken by this PR should be fine (i.e. would not lead to partial deletion or other failure). Still, if we can get away with handling this in a stricter way I still think that's preferable. In the unlikely event that some other non-solr process really _is_ deleting stuff, we'd want to know about it; and if we swallow this error, we would have no way of knowing there's a problem. I think it's well within our ability to properly order the deletes for all the tracked directories -- the logic was already basically there, it just wasn't working in all cases, and #2349 should fix that. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For additional commands, e-mail: issues-h...@solr.apache.org
Re: [PR] SOLR-16866: Deleting a non-existent directory should not fail [solr]
magibney commented on PR #2336: URL: https://github.com/apache/solr/pull/2336#issuecomment-1985873829 > Maybe the error occurs while it's deleting a child tree, and the rest of the files arent' deleted? Yeah I expect that'd be the case. If we can instead fix the removal-reordering logic, that would probably be best; esp. since some strictness around this would probably be a good thing (if data we think should be there is being deleted by some other means, that's not ideal in general). In this case we're pretty sure we know why it's being deleted, and that it's ok. But I suspect that simply fixing/extending the removal-reordering logic to cover the cases it's currently missing would fix this issue. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For additional commands, e-mail: issues-h...@solr.apache.org
Re: [PR] SOLR-16866: Deleting a non-existent directory should not fail [solr]
magibney commented on PR #2336: URL: https://github.com/apache/solr/pull/2336#issuecomment-1985671846 Looked into this a little bit more, and I suspect the close errors are a symptom of some faulty logic in reordering removals. There's an attempt in `CachingDirectoryFactory.closeCacheValue(CacheValue)` to defer deletion of parent paths by [associating them with subpaths that are also tracked](https://github.com/apache/solr/blob/cebc10d060ff5fa7b9a52660a02a27c2ed6af1f0/solr/core/src/java/org/apache/solr/core/CachingDirectoryFactory.java#L255-L269). There are really two problems: one problem occurs when there are _two_ subpaths that match a given parent -- the parent is arbitrarily added to one or the other child, and thus the parent can be deleted before its subpath (defeating the reordering logic). The other problem (the one expect we're actually hitting most of the time) is that for removals that are deferred until core close, there is no removal reordering [upon iteration of CachingDirectoryFactory.removeEntries](https://github.com/apache/solr/blob/cebc10d060ff5fa7b9a52660a02a27c2ed6af1f0/solr/core/src/java/org/apache/solr/core/CachingDirectoryFactory.java#L213-L220) -- this is in arbitrary `HashMap` iteration order, so parent dirs will often be removed before child dirs. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For additional commands, e-mail: issues-h...@solr.apache.org
Re: [PR] SOLR-16866: Deleting a non-existent directory should not fail [solr]
risdenk commented on PR #2336: URL: https://github.com/apache/solr/pull/2336#issuecomment-1984950183 The stack trace to me looks like it could be there are 2 things doing the delete and its a race condition who gets there first? Otherwise I don't see how we would end up getting this exception. This isn't the top level path right? The stack trace says this is happening while walking the file tree... -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For additional commands, e-mail: issues-h...@solr.apache.org
Re: [PR] SOLR-16866: Deleting a non-existent directory should not fail [solr]
HoustonPutman commented on PR #2336: URL: https://github.com/apache/solr/pull/2336#issuecomment-1984577087 Not sure if this is the end-all-be-all approach. Maybe the error occurs while it's deleting a child tree, and the rest of the files arent' deleted? Here's the stack trace: ``` 2024-03-07 18:59:44.928 ERROR (parallelCoreAdminAPIBaseExecutor-20-thread-6-processing-solr-e2e-1-foo-solrcloud-0.test.solr.org:80_solr e2e-2_shard2_replica_n5 solr-e2e-1-foo-solrcloud-0.test.solr.org-78 move-replicas-foo-solrcloud-0698005595075831 unload) [ x:e2e-2_shard2_replica_n5 t:solr-e2e-1-foo-solrcloud-0.test.solr.org-78] o.a.s.c.CachingDirectoryFactory Error removing directory => java.nio.file.NoSuchFileException: /var/solr/data/e2e-2_shard2_replica_n5/data/index at java.base/sun.nio.fs.UnixException.translateToIOException(Unknown Source) java.nio.file.NoSuchFileException: /var/solr/data/e2e-2_shard2_replica_n5/data/index at sun.nio.fs.UnixException.translateToIOException(Unknown Source) ~[?:?] at sun.nio.fs.UnixException.rethrowAsIOException(Unknown Source) ~[?:?] at sun.nio.fs.UnixException.rethrowAsIOException(Unknown Source) ~[?:?] at sun.nio.fs.UnixFileAttributeViews$Basic.readAttributes(Unknown Source) ~[?:?] at sun.nio.fs.UnixFileSystemProvider.readAttributes(Unknown Source) ~[?:?] at sun.nio.fs.LinuxFileSystemProvider.readAttributes(Unknown Source) ~[?:?] at java.nio.file.Files.readAttributes(Unknown Source) ~[?:?] at java.nio.file.FileTreeWalker.getAttributes(Unknown Source) ~[?:?] at java.nio.file.FileTreeWalker.visit(Unknown Source) ~[?:?] at java.nio.file.FileTreeWalker.walk(Unknown Source) ~[?:?] at java.nio.file.Files.walkFileTree(Unknown Source) ~[?:?] at java.nio.file.Files.walkFileTree(Unknown Source) ~[?:?] at org.apache.commons.io.file.PathUtils.visitFileTree(PathUtils.java:1654) ~[?:?] at org.apache.commons.io.file.PathUtils.lambda$deleteDirectory$0(PathUtils.java:503) ~[?:?] at org.apache.commons.io.file.PathUtils.withPosixFileAttributes(PathUtils.java:1774) ~[?:?] at org.apache.commons.io.file.PathUtils.deleteDirectory(PathUtils.java:502) ~[?:?] at org.apache.commons.io.file.PathUtils.deleteDirectory(PathUtils.java:487) ~[?:?] at org.apache.solr.core.StandardDirectoryFactory.removeDirectory(StandardDirectoryFactory.java:92) ~[?:?] at org.apache.solr.core.CachingDirectoryFactory.close(CachingDirectoryFactory.java:216) ~[?:?] at org.apache.solr.core.SolrCore.doClose(SolrCore.java:1891) ~[?:?] at org.apache.solr.core.SolrCore.close(SolrCore.java:1755) ~[?:?] at org.apache.solr.core.SolrCore.closeAndWait(SolrCore.java:1561) ~[?:?] at org.apache.solr.core.CoreContainer.unload(CoreContainer.java:2170) ~[?:?] at org.apache.solr.handler.admin.CoreAdminOperation.lambda$static$1(CoreAdminOperation.java:128) ~[?:?] at org.apache.solr.handler.admin.CoreAdminOperation.execute(CoreAdminOperation.java:414) ~[?:?] at org.apache.solr.handler.admin.CoreAdminHandler$CallInfo.call(CoreAdminHandler.java:374) ~[?:?] at org.apache.solr.handler.admin.CoreAdminHandler.lambda$handleRequestBody$0(CoreAdminHandler.java:235) ~[?:?] at org.apache.solr.handler.admin.CoreAdminHandler$CoreAdminAsyncTracker.lambda$submitAsyncTask$0(CoreAdminHandler.java:473) ~[?:?] at com.codahale.metrics.InstrumentedExecutorService$InstrumentedRunnable.run(InstrumentedExecutorService.java:212) ~[metrics-core-4.2.20.jar:4.2.20] at org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:294) ~[?:?] at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) ~[?:?] at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) ~[?:?] at java.lang.Thread.run(Unknown Source) [?:?] ``` -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For additional commands, e-mail: issues-h...@solr.apache.org
Re: [PR] SOLR-16866: Deleting a non-existent directory should not fail [solr]
HoustonPutman commented on PR #2336: URL: https://github.com/apache/solr/pull/2336#issuecomment-1984574418 > I'm curious, was there a reason for replacing FileUtils.deleteDirectory() with PathUtils.deleteDirectory()? @magibney https://issues.apache.org/jira/browse/SOLR-16074 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For additional commands, e-mail: issues-h...@solr.apache.org
Re: [PR] SOLR-16866: Deleting a non-existent directory should not fail [solr]
magibney commented on PR #2336: URL: https://github.com/apache/solr/pull/2336#issuecomment-1984545453 I'm curious, was there a reason for replacing `FileUtils.deleteDirectory()` with `PathUtils.deleteDirectory()`? Related: I've been in the CachingDirectoryFactory code recently and I recall there's some logic that's supposed to defer the deletion of parent directories until descendant directories are no longer referenced (have been deleted). This change seems reasonable to me (i.e., if you're asking to delete something, at the end of the day you want it gone; so if it's already gone, that shouldn't throw an exception). Still I'm going to take a look and see whether there might be a problem with the "deletion-deferral" code that's being uncovered by this issue. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For additional commands, e-mail: issues-h...@solr.apache.org
Re: [PR] SOLR-16866: Deleting a non-existent directory should not fail [solr]
HoustonPutman commented on code in PR #2336: URL: https://github.com/apache/solr/pull/2336#discussion_r1516795796 ## solr/core/src/java/org/apache/solr/cli/CreateTool.java: ## @@ -192,7 +191,7 @@ protected void createCore(CommandLine cli, SolrClient solrClient) throws Excepti } } catch (Exception e) { /* create-core failed, cleanup the copied configset before propagating the error. */ - PathUtils.deleteDirectory(coreInstanceDir); + org.apache.solr.util.FileUtils.deleteDirectory(coreInstanceDir); Review Comment: Unfortunately another FileUtils is used in this file as well -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For additional commands, e-mail: issues-h...@solr.apache.org
Re: [PR] SOLR-16866: Deleting a non-existent directory should not fail [solr]
epugh commented on PR #2336: URL: https://github.com/apache/solr/pull/2336#issuecomment-1984396468 I think I saw some problems with the `RunExampleTool` when I was trying to make it so you could run it twice, and it would reset the state on the second run.. weird things happened, so eventually I gave up and just logged a "Your example already exists so you may get werid stuff".. I may go back and take a looksee at it as it may have been related to this... -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For additional commands, e-mail: issues-h...@solr.apache.org
Re: [PR] SOLR-16866: Deleting a non-existent directory should not fail [solr]
epugh commented on code in PR #2336: URL: https://github.com/apache/solr/pull/2336#discussion_r1516792080 ## solr/core/src/java/org/apache/solr/cli/CreateTool.java: ## @@ -192,7 +191,7 @@ protected void createCore(CommandLine cli, SolrClient solrClient) throws Excepti } } catch (Exception e) { /* create-core failed, cleanup the copied configset before propagating the error. */ - PathUtils.deleteDirectory(coreInstanceDir); + org.apache.solr.util.FileUtils.deleteDirectory(coreInstanceDir); Review Comment: Should this be a fully qualified package? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For additional commands, e-mail: issues-h...@solr.apache.org