Bernhard Messer wrote:

Dmitry,

Bernhard Messer wrote:

hi,

The CompoundFileReader class contains some code where I can't follow the idea behind it. Maybe somebody else can switch on the light for me so I can see the track. There are 2 public methods which definitely don't work as expected. I know that extending Directory forces one to implement the methods, but in this particular case the implementation is confusing me, and maybe other people too.

   public long fileModified(String name) throws IOException {
       return directory.fileModified(fileName);
   }

   public void touchFile(String name) throws IOException {
       directory.touchFile(fileName);
   }

Looking at the implementation, both methods work on the compound filename itself, regardless of the value of the filename passed in. It would be much more understandable if these methods threw an UnsupportedOperationException. The other way is to change them so that the underlying directory method calls get the real filename passed in and not the compound filename itself.



Well, the reason I did it this way is because I thought this would cause the least amount of disruption to the programs out there that might be using these APIs. You can't really pass the "name" into the directory since it doesn't know about these as individual files. Directory only knows about the compound file.


I'm not sure if this is correct. Looking at the implementation in FSDirectory, for example, every file can be touched, no matter whether it is related to Lucene or not.

Yes, but the usual files that you find in the old-style segment, the ones that the CompoundFileReader and the rest of Lucene know about, are not present on the file system when the compound files are used. So FSDirectory only knows about the compound file, while everything up from the CompoundFileReader still thinks that there are multiple files in a given segment.



To implement fileModified() fully, you could just store timestamps in the file, but then they would just be the same as the timestamp on the overall file, unless there was also touchFile() support. To implement touchFile(), you'd have to open the file in random access mode and update the timestamp field of an individual file. This can certainly be done, but I didn't have a need for it. You could throw the Unsupported exception, but this could force callers to change. Anyway, the compromise I chose was to treat a "touch" on one file as if it were a "touch" on all files for the segment. This works in most usages. The only time this would be a problem is if you implemented some kind of timestamp set/check that depended on files in a segment having different timestamps. This might be important for updating segments, but since this is never done, I'm not sure this is really that useful. Do you have a case in mind where this is proving to be a limitation?
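For illustration only, the random access update described above could look roughly like the following. This is a minimal sketch under an invented layout, not the actual compound file format: it assumes the file starts with a table of 8-byte timestamps, one per entry, and the TimestampTable class and its method names are hypothetical.

```java
import java.io.IOException;
import java.io.RandomAccessFile;

// Hypothetical helper. Assumed layout (NOT the real .cfs format):
// the timestamp of entry i is a big-endian long at byte offset i * 8.
class TimestampTable {

    // "Touch" a single entry by overwriting only its timestamp field.
    static void touchEntry(String path, int index, long time) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(path, "rw")) {
            raf.seek((long) index * Long.BYTES);
            raf.writeLong(time);
        }
    }

    // Read back the timestamp field of a single entry.
    static long entryModified(String path, int index) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(path, "r")) {
            raf.seek((long) index * Long.BYTES);
            return raf.readLong();
        }
    }
}
```

With something like this, entries in one compound file could carry different timestamps, at the cost of rewriting part of the file on every touch.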

I agree with you. I don't see the need for a full implementation of touchFile and fileModified for the internally used compound file parts or any other file. But the way it is implemented now, it does something different from what it appears to do for the user of the API. The idea I had in mind was to implement it in a way that the compound file itself can be touched and its modification time can be read. If the user passes in a filename different from the compound file name, either an UnsupportedOperationException or, even better, an IOException could be thrown.
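A minimal sketch of that idea, assuming a simplified stand-in for Directory. The SimpleDir interface, MemoryDir and StrictCompoundReader are hypothetical names used only for illustration and are not part of Lucene:

```java
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the two Directory methods under discussion.
interface SimpleDir {
    long fileModified(String name) throws IOException;
    void touchFile(String name) throws IOException;
}

// In-memory directory, only here to make the sketch runnable.
class MemoryDir implements SimpleDir {
    private final Map<String, Long> stamps = new HashMap<>();
    private long clock = 0; // logical clock instead of wall time

    public long fileModified(String name) throws IOException {
        Long t = stamps.get(name);
        if (t == null) throw new IOException("no such file: " + name);
        return t;
    }

    public void touchFile(String name) {
        stamps.put(name, ++clock);
    }
}

// Strict variant: only the compound file name itself is accepted.
class StrictCompoundReader implements SimpleDir {
    private final SimpleDir directory;
    private final String fileName; // name of the compound file

    StrictCompoundReader(SimpleDir directory, String fileName) {
        this.directory = directory;
        this.fileName = fileName;
    }

    // Reject any name other than the compound file name.
    private void check(String name) throws IOException {
        if (!fileName.equals(name)) {
            throw new IOException("not supported for sub-file: " + name);
        }
    }

    public long fileModified(String name) throws IOException {
        check(name);
        return directory.fileModified(fileName);
    }

    public void touchFile(String name) throws IOException {
        check(name);
        directory.touchFile(fileName);
    }
}
```

Here a call such as touchFile("_1.fdt") on a reader for "_1.cfs" fails with an IOException instead of silently touching the compound file.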

See above. The user knows about the old-style files only; it does not know about the .cfs file. On the other hand, FSDirectory knows only about the .cfs file and not about the .f1, .f2, .fdt, and so on.


What the implementation is trying to do (unless I'm forgetting something) is to accept the .f1, .f2, etc. names as input and change the timestamp of the .cfs file regardless of which particular segment file was requested. This makes it look like your call resulted in the expected behavior (in that calling fileModified with the same name will give back the same timestamp), but also as if *someone else* had called touchFile on all the other files as well. I think this is not unreasonable and provides the most compatible behavior for the upper layers, short of a full implementation. Does this make sense?
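The behavior described above can be sketched as follows. This is a simplified model and not the Lucene source: FlatDir is a hypothetical in-memory stand-in for a directory that only knows the .cfs file, and DelegatingCompoundReader mirrors the two methods quoted at the top of the thread.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the underlying directory.
class FlatDir {
    private final Map<String, Long> stamps = new HashMap<>();
    private long clock = 0; // logical clock instead of wall time

    long fileModified(String name) {
        return stamps.getOrDefault(name, 0L);
    }

    void touchFile(String name) {
        stamps.put(name, ++clock);
    }
}

// Accepts any sub-file name but always operates on the compound file.
class DelegatingCompoundReader {
    private final FlatDir directory;
    private final String fileName; // name of the compound file

    DelegatingCompoundReader(FlatDir directory, String fileName) {
        this.directory = directory;
        this.fileName = fileName;
    }

    long fileModified(String name) {
        return directory.fileModified(fileName); // "name" is ignored
    }

    void touchFile(String name) {
        directory.touchFile(fileName); // touches the whole segment
    }
}
```

In this model, touching "_1.f1" and then asking for the timestamp of "_1.fdt" returns the same value, because both calls resolve to the single compound file.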


what do you think ?

Bernhard


just a thought ;-)

bernhard



---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]



