[ https://issues.apache.org/jira/browse/LUCENE-4190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13407033#comment-13407033 ]
Michael McCandless commented on LUCENE-4190:
--------------------------------------------

I agree there is a real danger here if users accidentally point IndexWriter at the wrong directory. This was already found and fixed long ago: LUCENE-385.

But I also don't want to go back to the hairy files() and extensions() we used to require of all codec components.

Yet I think there's a good middle ground: only allow a codec to write to _<seg>.* or _<seg>_*.* files (i.e. the ones created by IndexFileNames). All of our codecs are (should be!) using IndexFileNames.* to compute a file name to write to.

In reality a codec already isn't free to just write to any file, because then it may conflict with another codec doing the same thing. So de facto, codecs already have a "private" namespace, prefixed by _<seg> and further refined by _N (i.e. when there are multiple postings formats in a single codec).

Since a general codec must already obey its private namespace (to not step on other codecs), I think it's fine to enforce it?

> IndexWriter deletes non-Lucene files
> ------------------------------------
>
>                 Key: LUCENE-4190
>                 URL: https://issues.apache.org/jira/browse/LUCENE-4190
>             Project: Lucene - Java
>          Issue Type: Bug
>            Reporter: Michael McCandless
>            Assignee: Robert Muir
>             Fix For: 4.0, 5.0
>
>         Attachments: LUCENE-4190.patch, LUCENE-4190.patch
>
>
> Carl Austin raised a good issue in a comment on my Lucene 4.0.0 alpha blog
> post:
> http://blog.mikemccandless.com/2012/07/lucene-400-alpha-at-long-last.html
> IndexWriter will now (as of 4.0) delete all foreign files from the index
> directory. We made this change because Codecs are free to write to any files
> now, so the space of filenames is hard to "bound".
> But if the user accidentally uses the wrong directory (eg c:/) then we will
> in fact delete important stuff.
> I think we can at least use some simple criteria (must start with _, maybe
> must fit certain pattern eg _<base36>(_X).Y), so we are much less likely to
> delete a non-Lucene file....
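
To make the proposed naming rule concrete, here is a minimal sketch of such a check in Java. The class name, method name, and the exact regular expression are assumptions made for illustration only; they are not taken from the attached patch or from the Lucene API. The pattern simply encodes "_<base36>(_X).Y" as described above: a leading underscore, a lowercase base-36 segment name, an optional _suffix, then a dot and an extension.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of Lucene: decides whether a file name falls
// inside the per-segment "private" namespace discussed in this thread.
final class CodecFileNameCheck {

    // Assumed pattern: "_" + base-36 segment name, optional "_suffix",
    // then "." and a non-empty extension.
    static final Pattern CODEC_FILE = Pattern.compile("_[a-z0-9]+(_[^.]*)?\\..+");

    // Returns true only if the whole name matches the assumed pattern.
    public static boolean looksLikeCodecFile(String name) {
        return CODEC_FILE.matcher(name).matches();
    }

    public static void main(String[] args) {
        String[] names = {"_0.fdt", "_a3_Lucene40_0.tim", "notes.txt", "_private", "segments_2"};
        for (String name : names) {
            System.out.println(name + " -> " + looksLikeCodecFile(name));
        }
    }
}
```

With a check like this, anything in the directory that does not match (for example notes.txt) would be left alone rather than deleted. Files such as segments_2 are deliberately outside this pattern and would need separate handling.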