Re: Unable to hotcopy to a NAS shared directory: E720002
Philip Martin philip.mar...@wandisco.com writes:

>> In the line SVN_ERR(svn_io_remove_file2(shm_name, TRUE, pool));, the TRUE parameter is supposed to suppress file not found errors yet that's the error I'm getting, isn't it?
>
> Yes, it should. It's failing because it's getting error 720002 and that is not an error the code recognises. I don't recognise it either. What sort of network drive is producing that error?

E720002 on Windows means an ERROR_FILE_NOT_FOUND error [1]: that's 2L rebased onto APR_OS_START_SYSERR (720000).

I did a quick peek into the svn_io_remove_file2() function. The fact that we are receiving the error means that we are getting it from one of the two apr_file_remove() calls. That's either after an attempt to remove a readonly attribute or from a retry loop.

The reason why this error is propagated up the stack is that we only examine the 'ignore_enoent' argument after the first apr_file_remove() call. This is racy: if we get an EACCES during the first attempt to remove a file, and the file is simultaneously removed from the disk, the next attempt to remove it would fail with an ENOENT, even with 'ignore_enoent'. I think we should fix this by suppressing ENOENTs from every apr_file_remove() call, not just the first one.

I am not sure why we are receiving an EACCES for the /db/rev-prop-atomics.shm in the destination filesystem. Hotcopy is non-incremental, and revprop caching got disabled in 1.8.11, so the file should not be in the destination. Hence an attempt to blow it away with the cleanup_revprop_namespace() call should be a no-op (I suspect that we could completely drop this call, but that's another story). It might be a NAS-specific behavior, but this is just a random guess.

[1] https://msdn.microsoft.com/en-us/library/windows/desktop/ms681382

Regards,
Evgeny Kotkov
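To make the error-code arithmetic above concrete, here is a minimal sketch in C. It hard-codes the APR offset and the two Windows error numbers discussed in this thread so that it builds without the APR or Windows headers; the helper names `native_os_error` and `describe_status` are hypothetical and exist only for illustration.

```c
/* Values mirrored by hand from APR's apr_errno.h and the Windows SDK so
 * that this sketch compiles without either header. */
#define OS_START_SYSERR          720000  /* APR_OS_START_SYSERR */
#define WIN_ERROR_FILE_NOT_FOUND 2       /* ERROR_FILE_NOT_FOUND */
#define WIN_ERROR_ACCESS_DENIED  5       /* ERROR_ACCESS_DENIED */

/* Hypothetical helper: recover the native OS error from an APR status.
 * Returns -1 when the status is below the system-error range. */
static int native_os_error(int apr_status)
{
    if (apr_status < OS_START_SYSERR)
        return -1;
    return apr_status - OS_START_SYSERR;
}

/* Hypothetical helper: name the two Windows codes seen in this thread. */
static const char *describe_status(int apr_status)
{
    switch (native_os_error(apr_status)) {
    case WIN_ERROR_FILE_NOT_FOUND:
        return "ERROR_FILE_NOT_FOUND";
    case WIN_ERROR_ACCESS_DENIED:
        return "ERROR_ACCESS_DENIED";
    default:
        return "unrecognised";
    }
}
```

Calling `describe_status(720002)` yields the file-not-found name, which is the E720002 reported in the subject line. In real code the conversion would normally go through APR's own macros rather than a hand-rolled subtraction.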
RE: Unable to hotcopy to a NAS shared directory: E720002
-----Original Message-----
From: Philip Martin [mailto:philip.mar...@wandisco.com]
Sent: Wednesday, 21 January 2015 20:17
To: Evgeny Kotkov
Cc: Cory Riddell; Subversion Development
Subject: Re: Unable to hotcopy to a NAS shared directory: E720002

> Evgeny Kotkov evgeny.kot...@visualsvn.com writes:
>
> > The reason why this error is propagated up the stack is that we only examine the 'ignore_enoent' argument after the first apr_file_remove() call. This is racy: if we get an EACCES during the first attempt to remove a file, and the file is simultaneously removed from the disk, the next attempt to remove it would fail with an ENOENT, even with 'ignore_enoent'. I think we should fix this by suppressing ENOENTs from every apr_file_remove() call, not just the first one.
>
> Sounds plausible.
>
> Windows code is tricky. When svn_io_remove_file2() gets EACCES it calls

For something to return access denied on Windows it must exist.

> svn_io_set_file_read_write() passing ignore_enoent. That function has different handling of ignore_enoent as it only checks ENOENT while svn_io_remove_file2() checks both ENOENT and ENOTDIR.
>
> svn_io_set_file_read_write() also doesn't have a WIN32_RETRY_LOOP. Are those differences intentional?

File attributes are typically not involved with locking of the files.

I prefer *not* to loop when in doubt, as bad loops can cause much bigger problems than a forgotten loop. A loop that just waits for something that isn't going to fix itself is just a 12 second delay... Turn yet another delay loop around that in its caller and you are waiting for minutes. Another loop around that and it will be days. (Note that there are some retries in apr!)

We had quite a few bugs in previous versions, where scenarios could cause major lockups caused by retries waiting for the wrong error conditions.

The problem here is that we have a NAS that shows itself as a Windows device, but behaves differently. A typical Windows test run *never* triggers a retry loop for IO errors... nor should it.

The io retry loops are workarounds for externally caused problems. Virus scanners, etc.

In this case: are we really saying that hotcopy should work to a network drive? Even if it doesn't support the proper locking primitives? (We certainly recommend not to use servers on such a setup)

Perhaps the proper recommendation is: hotcopy to a local drive first, and then copy to network storage.

Bert

> --
> Philip Martin | Subversion Committer
> WANdisco // *Non-Stop Data*
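The point about nested retry loops multiplying delays can be illustrated with a small calculation. This is only a sketch under stated assumptions: the 12 second figure comes from the message above, while the count of 100 retries per enclosing loop and the function name `worst_case_seconds` are assumptions chosen for illustration, not values taken from the Subversion or APR sources.

```c
/* Worst-case stall, in seconds, when a retry loop that never succeeds is
 * wrapped in further retry loops.
 *
 * nesting_depth     - 1 for a single loop, 2 for a loop inside a loop, ...
 * inner_seconds     - time one innermost loop spends before giving up
 * retries_per_level - how often each enclosing loop repeats its body
 *                     (assumed value, see the text above) */
static double worst_case_seconds(int nesting_depth,
                                 double inner_seconds,
                                 int retries_per_level)
{
    double total = inner_seconds;

    /* Every additional enclosing loop repeats everything inside it. */
    for (int level = 1; level < nesting_depth; level++)
        total *= retries_per_level;

    return total;
}
```

Under these assumptions a single loop stalls for 12 seconds, two nested loops for 1200 seconds (20 minutes), and three for 120000 seconds (about 33 hours), which is the kind of growth the message warns about.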
Re: Unable to hotcopy to a NAS shared directory: E720002
Evgeny Kotkov evgeny.kot...@visualsvn.com writes:

> The reason why this error is propagated up the stack is that we only examine the 'ignore_enoent' argument after the first apr_file_remove() call. This is racy: if we get an EACCES during the first attempt to remove a file, and the file is simultaneously removed from the disk, the next attempt to remove it would fail with an ENOENT, even with 'ignore_enoent'. I think we should fix this by suppressing ENOENTs from every apr_file_remove() call, not just the first one.

Sounds plausible.

Windows code is tricky. When svn_io_remove_file2() gets EACCES it calls svn_io_set_file_read_write() passing ignore_enoent. That function has different handling of ignore_enoent as it only checks ENOENT while svn_io_remove_file2() checks both ENOENT and ENOTDIR.

svn_io_set_file_read_write() also doesn't have a WIN32_RETRY_LOOP. Are those differences intentional?

--
Philip Martin | Subversion Committer
WANdisco // *Non-Stop Data*
Re: Unable to hotcopy to a NAS shared directory: E720002
Bert Huijben b...@qqmail.nl writes:

>> Windows code is tricky. When svn_io_remove_file2() gets EACCES it calls
>
> For something to return access denied on Windows it must exist.

Yes, the file exists when we try to remove it.

>> svn_io_set_file_read_write() passing ignore_enoent. That function has different handling of ignore_enoent as it only checks ENOENT while svn_io_remove_file2() checks both ENOENT and ENOTDIR.

But it, and the parent directory, may have been removed by the time we call svn_io_set_file_read_write(). So can the apr_file_attrs_set() return ENOTDIR, and should it be ignored?

>> svn_io_set_file_read_write() also doesn't have a WIN32_RETRY_LOOP. Are those differences intentional?
>
> File attributes are typically not involved with locking of the files.

OK, so svn_io_set_file_read_write() doesn't need a retry loop, but does it ignore the right values when ignore_enoent is set?

--
Philip Martin | Subversion Committer
WANdisco // *Non-Stop Data*
Re: Unable to hotcopy to a NAS shared directory: E720002
On 1/21/2015 3:00 PM, Philip Martin wrote:

> Bert Huijben b...@qqmail.nl writes:
>
>>> Windows code is tricky. When svn_io_remove_file2() gets EACCES it calls
>>
>> For something to return access denied on Windows it must exist.
>
> Yes, the file exists when we try to remove it.

The more I dig, the less certain I am of what's going on. I made a small program that does what svn_io_remove_file2() does.

Snippet:

    const char* path_to_nonexistent_file = "Diskstation\\svn\\MyRepo\\db\\rev-prop-atomics.shm";

    /* First removal attempt. */
    apr_status_t apr_err = apr_file_remove(path_to_nonexistent_file, NULL);
    TRACE(_T("apr_err = %d\n"), apr_err);
    TRACE(_T("APR_STATUS_IS_ENOENT = %d\n"), APR_STATUS_IS_ENOENT(apr_err));

    /* Clear the read-only attribute. */
    apr_status_t status = apr_file_attrs_set(path_to_nonexistent_file, 0, APR_FILE_ATTR_READONLY, 0);
    TRACE(_T("status = %d\n"), status);

    /* Second removal attempt. */
    apr_err = apr_file_remove(path_to_nonexistent_file, NULL);
    TRACE(_T("apr_err = %d\n"), apr_err);
    TRACE(_T("APR_STATUS_IS_ENOENT = %d\n"), APR_STATUS_IS_ENOENT(apr_err));

And the output is:

    apr_err = 720005
    APR_STATUS_IS_ENOENT = 0
    status = 720002
    apr_err = 720002
    APR_STATUS_IS_ENOENT = 1

If I run the program a second time, I get the same output again.

So, the first error is an EACCES, then the non-existent file is made writable, then an ENOENT error is generated. It doesn't make sense to me, but I think the bottom line is that the second apr_file_remove call can generate an ENOENT result and if the ignore_enoent parameter is TRUE, then the function should return SVN_NO_ERROR after the second (or third) try.

Cory
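The behaviour described in this thread can be replayed without APR or a NAS. The sketch below is not the Subversion implementation: it uses plain errno values in place of APR status codes, and `fake_remove`, `remove_check_first_only` and `remove_check_every_attempt` are hypothetical names. The fake remover returns EACCES on the first call and ENOENT afterwards, mirroring the 720005 then 720002 sequence in the output above.

```c
#include <errno.h>

/* Hypothetical stand-in for apr_file_remove(): returns 0 on success or an
 * errno value on failure. */
typedef int (*remove_fn)(const char *path);

/* Number of calls made to fake_remove(); reset to 0 before each scenario. */
static int calls;

/* Replays the observed sequence: EACCES first, ENOENT on every later call. */
static int fake_remove(const char *path)
{
    (void)path;
    return (calls++ == 0) ? EACCES : ENOENT;
}

/* Consults ignore_enoent only after the first attempt, as described in the
 * thread. An ENOENT from the retry is propagated to the caller. */
static int remove_check_first_only(const char *path, int ignore_enoent,
                                   remove_fn rm)
{
    int err = rm(path);

    if (err == ENOENT && ignore_enoent)
        return 0;
    if (err == EACCES)
        err = rm(path);  /* retry, notionally after clearing read-only */
    return err;
}

/* Proposed behaviour: suppress ENOENT no matter which attempt produced it. */
static int remove_check_every_attempt(const char *path, int ignore_enoent,
                                      remove_fn rm)
{
    int err = rm(path);

    if (err == EACCES)
        err = rm(path);  /* retry, notionally after clearing read-only */
    if (err == ENOENT && ignore_enoent)
        return 0;
    return err;
}
```

With `calls` reset to 0 and `ignore_enoent` set, the first variant returns ENOENT while the second returns 0, which is the difference the proposed fix is meant to make.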
Re: svn commit: r1653609 - /subversion/trunk/subversion/include/svn_repos.h
On 21.01.2015 18:21, julianf...@apache.org wrote:

> Author: julianfoad
> Date: Wed Jan 21 17:21:45 2015
> New Revision: 1653609
>
> URL: http://svn.apache.org/r1653609
> Log:
> Clarify an unusual 'decoded URL' parameter by renaming it. Fix a problem whereby it was passed to the Ev2 shims (if enabled) as if it were a normal URL.
>
> I have not investigated whether the Ev2 shims problem was real or just theoretical, nor tried to test the Ev2 shims after this fix.

Please be aware that JavaHL depends on the Ev2 shims and expects them to work.

-- Brane