[ 
http://issues.apache.org/jira/browse/LUCENE-665?page=comments#action_12431801 ] 
            
Doron Cohen commented on LUCENE-665:
------------------------------------

I think I know which software is causing/exposing this behavior in my 
environment.
This is the SVN client I am using - TortoiseSVN. 

I tried the following sequence:
 1) Ran the test with TortoiseSVN installed - the test generates these 
"access denied" errors (and bypasses them). 
 2) Uninstalled TortoiseSVN (+reboot), ran the test - it passes with no 
"access denied" errors. 
 3) Installed TortoiseSVN again (+reboot), ran the test - same "access denied" 
errors again. 

I am using the most recent stable TortoiseSVN version - 1.3.5 build 6804 - 
32 bit, for svn-1.3.2, downloaded from http://tortoisesvn.tigris.org/.

There is an interesting discussion thread about this type of error on Windows 
platforms in the svn forums - http://svn.haxx.se/dev/archive-2003-10/0136.shtml. 
In that case it was svn itself that suffered from these errors.

It says "...Windows allows applications to "tag-along" to see when a file has 
been written - they will wait for it to close and then do whatever they do, 
usually opening a file descriptor or handle. This would prevent that file from 
being renamed for a brief period..."

TortoiseSVN is a shell extension integrated into Windows explorer. As such, it 
probably demonstrates the "tag-along" behavior described above.

(BTW, it is a great svn client in my opinion)

Here is another excerpt from that discussion thread - 
>>
>> sleep(1) would work, I suppose. ;~) 
>>
> Most of the time, but not all the time. The only way I've made it work 
> well on all the machines I've tried it on is to put it into a sleep(1) 
> and retry loop of at *least* 20 or so attempts. Anything less and it 
> still fails on some machines. That implies it is very dependent on 
> machine speed or something, which means sleep times/retry times are just 
> guessing games at best. 
>
> If I could just get it recreated outside of Subversion and prove it's a 
> Microsoft problem...although it probably still wouldn't get fixed for 
> months at least. 

We don't know that this is a bug in TortoiseSVN.
We cannot rule out other such tag-along applications on users' 
machines.
One cannot seriously expect this Win32 behavior to be fixed.

I guess the question is - is it worthwhile for Lucene to attempt to at least 
reduce the chances of failure in this case? (I say yes :-)
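
To make the idea concrete, here is a minimal sketch of the kind of sleep-and-retry loop discussed above. It is not the code in the attached patches. The class and method names are hypothetical, the 30 ms delay is the value mentioned in the issue description, the limit of 20 attempts is borrowed from the svn thread quoted above, and matching on the word "denied" in the exception message is only an assumption about how an "access denied" failure could be recognized.

```java
import java.io.File;
import java.io.IOException;
import java.util.Locale;

// Hypothetical helper, for illustration only; not part of Lucene or of the patch.
class RetryingFileOps {
    // Assumed values: 20 attempts (from the svn thread), 30 ms delay (from the issue).
    static final int MAX_ATTEMPTS = 20;
    static final long RETRY_DELAY_MS = 30;

    /** Renames a file, sleeping and retrying while the rename keeps failing. */
    static boolean renameWithRetry(File from, File to) throws InterruptedException {
        for (int attempt = 1; attempt <= MAX_ATTEMPTS; attempt++) {
            if (from.renameTo(to)) {
                return true;
            }
            if (attempt < MAX_ATTEMPTS) {
                Thread.sleep(RETRY_DELAY_MS); // give the "tag-along" process time to let go
            }
        }
        return false; // still failing after all attempts
    }

    /** Creates a file, retrying only when the failure looks like "access denied". */
    static boolean createNewFileWithRetry(File file)
            throws IOException, InterruptedException {
        for (int attempt = 1; ; attempt++) {
            try {
                return file.createNewFile();
            } catch (IOException e) {
                // Other errors, or the last attempt, are reported to the caller as usual.
                if (!isAccessDenied(e) || attempt >= MAX_ATTEMPTS) {
                    throw e;
                }
            }
            Thread.sleep(RETRY_DELAY_MS);
        }
    }

    /** Assumption: the platform reports the condition in the exception message. */
    static boolean isAccessDenied(IOException e) {
        String message = e.getMessage();
        return message != null && message.toLowerCase(Locale.ROOT).contains("denied");
    }
}
```

In the normal case each call succeeds on the first attempt and no delay is added; only calls that would otherwise fail pay for the sleeps. As the svn thread points out, the right delay and attempt count are guesses at best.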

> temporary file access denied on Windows
> ---------------------------------------
>
>                 Key: LUCENE-665
>                 URL: http://issues.apache.org/jira/browse/LUCENE-665
>             Project: Lucene - Java
>          Issue Type: Bug
>          Components: Store
>    Affects Versions: 2.0.0
>         Environment: Windows
>            Reporter: Doron Cohen
>         Attachments: FSDirectory_Retry_Logic.patch, 
> FSDirs_Retry_Logic_3.patch, Test_Output.txt, TestInterleavedAddAndRemoves.java
>
>
> When interleaving adds and removes there is frequent opening/closing of 
> readers and writers. 
> I tried to measure performance in such a scenario (for issue 565), but the 
> performance test failed  - the indexing process crashed consistently with 
> file "access denied" errors - "cannot create a lock file" in 
> "lockFile.createNewFile()" and "cannot rename file".
> This is related to:
> - issue 516 (a closed issue: "TestFSDirectory fails on Windows") - 
> http://issues.apache.org/jira/browse/LUCENE-516 
> - user list questions due to file errors:
>   - 
> http://www.nabble.com/OutOfMemory-and-IOException-Access-Denied-errors-tf1649795.html
>   - 
> http://www.nabble.com/running-a-lucene-indexing-app-as-a-windows-service-on-xp%2C-crashing-tf2053536.html
> - discussion on lock-less commits 
> http://www.nabble.com/Lock-less-commits-tf2126935.html
> My test setup is: XP (SP1), JAVA 1.5 - both SUN and IBM SDKs. 
> I noticed that the problem is more frequent when locks are created on one 
> disk and the index on another. Both are NTFS with Windows indexing service 
> enabled. I suspect this indexing service might be related - keeping files 
> busy for a while, but don't know for sure.
> After experimenting with it I conclude that these problems - at least in my 
> scenario - are due to a temporary situation - the FS, or the OS, is 
> *temporarily* holding references to files or folders, which prevents 
> renaming them, deleting them, or creating new files in certain directories. 
> So I added retry logic to FSDirectory for cases where the error was related 
> to "Access Denied". This is the same approach brought in 
> http://www.nabble.com/running-a-lucene-indexing-app-as-a-windows-service-on-xp%2C-crashing-tf2053536.html
>  - there, in addition to the retry, gc() is invoked (I did not gc()). This is 
> based on the *hope* that an access-denied situation would vanish after a 
> small delay, and the retry would succeed.
> I modified FSDirectory this way for "Access Denied" errors when creating a 
> new file or renaming a file.
> This worked fine for me. The performance test that failed before now 
> completes. There should be no performance implications due to this 
> modification, because only the cases that would otherwise wrongly fail now 
> wait a few extra milliseconds and retry.
> I am attaching here a patch - FSDirectory_Retry_Logic.patch - that has these 
> changes to FSDirectory. 
> All "ant test" tests pass with this patch.
> Also attaching a test case that demonstrates the problem - at least on my 
> machine. There are two test cases in that test file - one that works in the 
> system temp directory (like most Lucene tests) and one that creates the index 
> on a different disk. The latter case can only run if the path ("D:" , "tmp") 
> is valid.
> It would be great if people that experienced these problems could try out 
> this patch and comment whether it made any difference for them. 
> If it turns out useful for others as well, including this patch in the code 
> might help to relieve some of those "frustration" user cases.
> A comment on state of proposed patch: 
> - It is not a "ready to deploy" code - it has some debug printing, showing 
> the cases that the "retry logic" actually took place. 
> - I am not sure if current 30ms is the right delay... why not 50ms? 10ms? 
> This is currently defined by a constant.
> - Should a call to gc() be added? (I think not.)
> - Should the retry be attempted also on "non access-denied" exceptions? (I 
> think not).
> - I feel it is somewhat "voodoo programming", but though I don't like it, it 
> seems to work... 
> Attached files:
> 1. TestInterleavedAddAndRemoves.java - the LONG test that fails on XP without 
> the patch and passes with the patch.
> 2. FSDirectory_Retry_Logic.patch
> 3. Test_Output.txt- output of the test with the patch, on my XP. Only the 
> createNewFile() case had to be bypassed in this test, but for another program 
> I also saw the renameFile() being bypassed.
> - Doron
