[ 
https://issues.apache.org/jira/browse/LUCENE-4190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13407858#comment-13407858
 ] 

Andi Vajda commented on LUCENE-4190:
------------------------------------


{quote}
1. subdirectories currently are a foreign concept to Directory, we would have
to make some serious changes there to support subdirectories.
2. Lucene 3.x and Lucene4-alpha indexes still need to be supported, and we
don't want to leave behind baggage when we merge, so the transition would be
tricky.
{quote}

The way I imagined this working (short of looking at the code and proposing a
patch) was to just append something like "lucene.index" to the directory path
given by the user and pretend that's what the user gave it (and create that
subdirectory). Code deleting the index files becomes simpler: it's just a
recursive delete of that subdirectory Lucene created.
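
To make the idea concrete, here is a minimal sketch using plain java.nio.file
rather than Lucene's Directory API. The class IndexSubdirectory and its methods
are hypothetical names made up for illustration; nothing like this exists in
Lucene, and a real patch would have to go through Directory/FSDirectory.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.stream.Stream;

/** Hypothetical helper for illustration only; not part of Lucene. */
final class IndexSubdirectory {
    /** Assumed default name, taken from the suggestion in this comment. */
    static final String DEFAULT_NAME = "lucene.index";

    /** Appends the index name to the user-supplied path and creates that subdirectory. */
    static Path open(Path userPath, String indexName) {
        Path indexDir = userPath.resolve(indexName);
        try {
            return Files.createDirectories(indexDir);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Recursively deletes only the subdirectory created for the index. */
    static void deleteIndex(Path userPath, String indexName) {
        Path indexDir = userPath.resolve(indexName);
        if (!Files.exists(indexDir)) {
            return; // nothing to delete
        }
        try (Stream<Path> walk = Files.walk(indexDir)) {
            // Reverse order visits children before their parent directories.
            walk.sorted(Comparator.reverseOrder()).forEach(p -> {
                try {
                    Files.delete(p);
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

With this layout, files sitting next to the subdirectory in the user's path
are never visited by the delete, since the walk starts at the subdirectory.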

That name, "lucene.index", could in fact be an additional parameter to the
Directory class, so that one could pick different names and store multiple
indexes under the same parent directory.

Having that extra name parameter would make it harder to just use c:\ or
c:\tmp, which, while apparently silly, is also easy to hit accidentally. Imagine
a shell script that puts together an index directory path from, say, some
environment variables or command line parameters, and because of a bug there or
some human error, some of the inputs making up that path are now missing. The
fancy generated path /home/fred/$(indexes) is all of a sudden shorter, so
/home/fred gets used instead and is later (partially) wiped. Ouch. No, I didn't
just make this up :-)

{quote}
3. the user could also do this on their own right? e.g. we still have the same 
situation we have currently, where anything in that directory can get deleted 
by Lucene, it's just underneath another layer.
{quote}

Of course they could, but the difference is that if Lucene is the one creating
a directory, I expect it to more or less control the files in it, whereas if I
create a directory myself, that is not the case. I don't expect Lucene to take
it over by deleting files in it that it didn't create.

With the extra index name parameter, it becomes much less likely that Lucene
stomps on something that doesn't belong to it. As for backwards compatibility,
one could tolerate, for the next release cycle, that this extra name be empty
(which would be equivalent to the situation we have now, bug 4190).
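
A rough sketch of that transition rule, again in plain java.nio.file. The
class IndexLocation and its methods are hypothetical and only illustrate the
proposal: an empty name falls back to the user's directory itself, as today.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

/** Hypothetical helper for illustration only; not part of Lucene. */
final class IndexLocation {
    /**
     * Resolves where index files would live. An empty (or null) name means
     * the legacy layout: the index sits directly in the user's directory.
     */
    static Path resolve(Path userPath, String indexName) {
        if (indexName == null || indexName.isEmpty()) {
            return userPath; // legacy behaviour, tolerated during the transition
        }
        return userPath.resolve(indexName);
    }

    /** True when the index would live directly in the user's directory. */
    static boolean isLegacyLayout(Path userPath, String indexName) {
        return resolve(userPath, indexName).equals(userPath);
    }

    public static void main(String[] args) {
        Path home = Paths.get("/home/fred");
        System.out.println(resolve(home, "lucene.index"));
        System.out.println(isLegacyLayout(home, ""));
    }
}
```

Under this sketch, even if a script accidentally passes /home/fred, any
cleanup with a non-empty name stays confined to /home/fred/lucene.index.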

                
> IndexWriter deletes non-Lucene files
> ------------------------------------
>
>                 Key: LUCENE-4190
>                 URL: https://issues.apache.org/jira/browse/LUCENE-4190
>             Project: Lucene - Java
>          Issue Type: Bug
>            Reporter: Michael McCandless
>            Assignee: Robert Muir
>             Fix For: 4.0, 5.0
>
>         Attachments: LUCENE-4190.patch, LUCENE-4190.patch, LUCENE-4190.patch, 
> LUCENE-4190.patch
>
>
> Carl Austin raised a good issue in a comment on my Lucene 4.0.0 alpha blog 
> post: 
> http://blog.mikemccandless.com/2012/07/lucene-400-alpha-at-long-last.html
> IndexWriter will now (as of 4.0) delete all foreign files from the index 
> directory.  We made this change because Codecs are free to write to any files 
> now, so the space of filenames is hard to "bound".
> But if the user accidentally uses the wrong directory (eg c:/) then we will 
> in fact delete important stuff.
> I think we can at least use some simple criteria (must start with _, maybe 
> must fit certain pattern eg _<base36>(_X).Y), so we are much less likely to 
> delete a non-Lucene file....

