[ https://issues.apache.org/jira/browse/LUCENE-4190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13407858#comment-13407858 ]
Andi Vajda commented on LUCENE-4190:
------------------------------------

{quote}
1. subdirectories currently are a foreign concept to Directory, we would have to make some serious changes there to support subdirectories.
2. Lucene 3.x and Lucene4-alpha indexes still need to be supported, and we don't want to leave behind baggage when we merge, so the transition would be tricky.
{quote}

The way I imagined this working (short of looking at the code and proposing a patch) was to just append something like "lucene.index" to the directory path given by the user and pretend that's what the user gave it (and create that subdirectory). The code deleting the index files becomes simpler: it's just a recursive delete of that subdirectory Lucene created.

That name, "lucene.index", could in fact be an additional parameter to the Directory class, so that one could pick different names and store multiple indexes under the same parent directory. Having that extra name parameter would make it harder to just use c:\ or c:\tmp as the index directory, which, while apparently silly, is also easy to do accidentally.

Imagine a shell script that puts together an index directory path from, say, some environment variables or command line parameters, and because of a bug there or some human error, some of the inputs making up that path are now missing. The fancy generated path /home/fred/$(indexes) is now shorter all of a sudden, /home/fred gets used instead and is later (partially) wiped. Ouch. No, I didn't just make this up :-)

{quote}
3. the user could also do this on their own right? e.g. we still have the same situation we have currently, where anything in that directory can get deleted by lucene, its just underneath another layer.
{quote}

Of course they could, but the difference is that if Lucene is the one creating a directory, I expect it to more or less control the files in it, whereas if I create a directory myself, that is not the case. I don't expect Lucene to take it over by deleting files in it that it didn't create.
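For illustration only, the subdirectory idea could be sketched roughly as follows. This uses plain java.nio.file rather than Lucene's Directory API, and the class IndexSubdirSketch, its methods and the name check are all hypothetical; none of this comes from the attached patches. It only covers the case of a non-empty index name.

```java
import java.io.IOException;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;

/** Hypothetical helper, not part of Lucene: keeps an index in its own subdirectory. */
final class IndexSubdirSketch {
    static final String DEFAULT_INDEX_NAME = "lucene.index";

    /** Rejects names that would escape or alias the user's directory (an assumption of this sketch). */
    private static void checkName(String indexName) {
        if (indexName.isEmpty() || indexName.equals(".") || indexName.equals("..")
                || indexName.contains("/") || indexName.contains("\\")) {
            throw new IllegalArgumentException("bad index name: " + indexName);
        }
    }

    /** Appends the index name to the user's path and creates that subdirectory. */
    static Path resolveIndexDir(Path userDir, String indexName) throws IOException {
        checkName(indexName);
        Path indexDir = userDir.resolve(indexName);
        Files.createDirectories(indexDir);
        return indexDir;
    }

    /** Recursively deletes only the subdirectory created for the index. */
    static void deleteIndex(Path userDir, String indexName) throws IOException {
        checkName(indexName);
        Path indexDir = userDir.resolve(indexName);
        if (!Files.exists(indexDir)) {
            return;
        }
        Files.walkFileTree(indexDir, new SimpleFileVisitor<Path>() {
            @Override
            public FileVisitResult visitFile(Path file, BasicFileAttributes attrs) throws IOException {
                Files.delete(file);
                return FileVisitResult.CONTINUE;
            }

            @Override
            public FileVisitResult postVisitDirectory(Path dir, IOException exc) throws IOException {
                if (exc != null) {
                    throw exc;
                }
                Files.delete(dir); // directory is empty by now
                return FileVisitResult.CONTINUE;
            }
        });
    }

    public static void main(String[] args) throws IOException {
        Path userDir = Files.createTempDirectory("user-dir");
        Path foreign = Files.createFile(userDir.resolve("important.txt"));
        Path indexDir = resolveIndexDir(userDir, DEFAULT_INDEX_NAME);
        Files.createFile(indexDir.resolve("_0.cfs"));

        deleteIndex(userDir, DEFAULT_INDEX_NAME);
        System.out.println("index dir exists: " + Files.exists(indexDir));
        System.out.println("foreign file exists: " + Files.exists(foreign));
    }
}
```

The point of the sketch is that the delete never looks at anything outside the subdirectory, so a file the user placed next to the index is left alone even if the user-supplied path turns out to be the wrong one.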

With the extra index name parameter, it would be much less likely for Lucene to stomp over something not belonging to it. As for backwards compatibility, one could tolerate, for the next release cycle, that this extra name be empty (which would be equivalent to the situation we have now, bug 4190).

> IndexWriter deletes non-Lucene files
> ------------------------------------
>
>                 Key: LUCENE-4190
>                 URL: https://issues.apache.org/jira/browse/LUCENE-4190
>             Project: Lucene - Java
>          Issue Type: Bug
>            Reporter: Michael McCandless
>            Assignee: Robert Muir
>             Fix For: 4.0, 5.0
>
>         Attachments: LUCENE-4190.patch, LUCENE-4190.patch, LUCENE-4190.patch, LUCENE-4190.patch
>
>
> Carl Austin raised a good issue in a comment on my Lucene 4.0.0 alpha blog post: http://blog.mikemccandless.com/2012/07/lucene-400-alpha-at-long-last.html
> IndexWriter will now (as of 4.0) delete all foreign files from the index directory. We made this change because Codecs are free to write to any files now, so the space of filenames is hard to "bound".
> But if the user accidentally uses the wrong directory (eg c:/) then we will in fact delete important stuff.
> I think we can at least use some simple criteria (must start with _, maybe must fit certain pattern eg _<base36>(_X).Y), so we are much less likely to delete a non-Lucene file....
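
As a rough illustration of the "simple criteria" floated in the issue description, a filename check along the lines of _<base36>(_X).Y might look like the sketch below. The regular expression is one literal reading of that pattern, and the class SegmentFileNameSketch and its method are hypothetical; this is not the logic of the attached patches, and it makes no attempt to cover other files an index may contain.

```java
import java.util.regex.Pattern;

/** Hypothetical filter, not Lucene code: one literal reading of _<base36>(_X).Y. */
final class SegmentFileNameSketch {
    // "_" + base-36 segment name, an optional "_X" suffix, then ".Y" as the extension.
    private static final Pattern SEGMENT_FILE =
            Pattern.compile("_[0-9a-z]+(_[0-9A-Za-z]+)?\\.[0-9A-Za-z]+");

    /** Returns true only if the whole file name fits the pattern. */
    static boolean looksLikeSegmentFile(String fileName) {
        return SEGMENT_FILE.matcher(fileName).matches();
    }

    public static void main(String[] args) {
        String[] names = {"_0.cfs", "_a1_1.del", "notes.txt", "_README"};
        for (String name : names) {
            System.out.println(name + " -> " + looksLikeSegmentFile(name));
        }
    }
}
```

A deleter that consulted such a check before removing a file would skip anything not matching, at the cost of possibly leaving behind index files whose names fall outside the pattern.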