Found the problem: it was the Linux WebDAV driver. Further experimentation from the shell with a WebDAV mount showed flawless operation from OS X across a variety of file sizes and loads, while the Linux WebDAV driver mount (davfs2 version 1.4.6-1ubuntu3) exhibited the same problems I had been seeing from my OSGi thread.
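
A shell check of this kind could be sketched roughly as follows. This is not the script used in these tests: the function name dav_write_check, the file naming, and the default count and size are all assumptions made for illustration. It writes random files into a directory (for example a WebDAV mount point) and compares each one against its source.

```shell
#!/bin/sh
# Sketch only: write COUNT files of SIZE_KB KiB each into a directory, then
# read each one back and compare it with what was written.
# dav_write_check is a hypothetical helper, not part of davfs2 or fusedav.
dav_write_check() {
    target=$1          # directory to test, e.g. a davfs2 or fusedav mount point
    count=${2:-20}     # number of files to write (assumed default)
    size_kb=${3:-64}   # size of each file in KiB (assumed default)
    src=$(mktemp) || return 2
    failures=0
    i=1
    while [ "$i" -le "$count" ]; do
        # Fresh random payload for every file
        dd if=/dev/urandom of="$src" bs=1024 count="$size_kb" 2>/dev/null
        dest="$target/davcheck_$i.bin"
        if ! cp "$src" "$dest"; then
            echo "write failed: $dest"
            failures=$((failures + 1))
        elif ! cmp -s "$src" "$dest"; then
            echo "missing or corrupt: $dest"
            failures=$((failures + 1))
        fi
        i=$((i + 1))
    done
    rm -f "$src"
    echo "checked $count files, $failures failures"
    [ "$failures" -eq 0 ]
}
```

For example, `dav_write_check /mnt/jcr 100 256` would write one hundred 256 KiB files to the mount. Because davfs2 keeps a local cache, an immediate read-back may be served locally; remounting before verifying, or listing the files through a second client, is a stronger test for files that are lost without an error.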

I tried switching from davfs2, e.g.:

    mount -t davfs -o gid=sling,rw,uid=sling,username=admin http://localhost:8090 /mnt/jcr

to fusedav:

    fusedav -p=admin -u=admin http://localhost:8090 /mnt/jcr

and the problems ceased. However, the fusedav driver lacks many of the options provided by davfs2.

A bit more digging turned up a newer davfs2 version based on the new libneon27 WebDAV API [1]. Installing the newer davfs2 package, 1.5.2-1, and disabling locks in /etc/davfs2/davfs2.conf also fixed the problem. This is preferable to the fusedav option, as davfs2 has considerably more functionality and options than fusedav.

WebDAV I/O is now reliable. I have been running load tests for hours now with no errors.

Thanks again for the pointers, Bertrand.

[1] https://launchpad.net/ubuntu/+source/davfs2

-Bruce

From: Bertrand Delacretaz <[email protected]>
Reply-To: users <[email protected]>
Date: Wednesday, December 10, 2014 at 1:10 AM
To: users <[email protected]>
Subject: Re: WebDAV write problems, loses files with no error, no consistency

Hi,

On Wed, Dec 10, 2014 at 8:03 AM, Bruce Edge <[email protected]> wrote:
> I'm not seeing all the files I'm writing out to webdav from a bundle thread and I'm losing files....

To reformulate, IIUC you are running Java code that only works with File objects, and those are actually stored in Sling's JCR repository because your code works on a WebDAV-mounted folder?

If yes, I would stop (in a debugger) when detecting a failure, and examine the Sling repository at this point to see exactly which nodes/files were created and what their state is. I suspect there might be name collisions, which would mean the JCR content is not what you expect. Or worse, concurrency issues, but our WebDAV stuff is fairly stable and well-tested code, so it would be surprising to discover this now.

Doing this processing inside Sling, accessing the data via JCR, is probably more efficient, but you're right that this should also work under WebDAV.

It's hard to debug your code by reading it here; if you can reduce it to the smallest thing that fails, we might be able to help better.

-Bertrand
