On Mon, Mar 5, 2018 at 8:21 PM, Paul Anderson <p...@umich.edu> wrote:
> Hi,
>
> tl;dr summary of below: flock() works, but what does it take to make
> sync()/fsync() work in a 3 node GFS cluster?
>
> I am under the impression that POSIX flock, POSIX
> fcntl(F_SETLK/F_GETLK,...), and POSIX read/write/sync/fsync are all
> supported in cluster operations, such that in theory, SQLite3 should
> be able to atomically lock the file (or a subset of pages), modify
> pages, flush the pages to gluster, then release the lock, and thus
> satisfy the ACID property that SQLite3 appears to try to accomplish
> on a local filesystem.
>
> In a test we wrote that fires off 10 simple concurrent SQL insert,
> read, and update loops, we discovered that we at least need to use
> flock() around the SQLite3 db connection open/update/close to
> protect it.
>
> However, that is not enough. Although from testing it looks like
> flock() works as advertised across gluster-mounted files, sync/fsync
> don't appear to, so we end up getting corruption in the SQLite3 file
> (pragma integrity_check will generally show a bunch of problems
> after a short test).
>
> Is what we're trying to do achievable? We're testing using the
> docker container gluster/gluster-centos as the three servers, with a
> PHP test inside of php-cli using filesystem mounts. If we mount the
> gluster FS via sapk/plugin-gluster into the php-cli containers using
> docker, we seem to have better success sometimes, but I haven't
> figured out why yet.
>
> I did see that I needed to set the server volume parameter
> 'performance.flush-behind off', otherwise it seems that flushes
> won't block as would be needed by SQLite3.

If you are relying on fsync, this shouldn't matter, as fsync makes sure
data is synced to disk.

> Does anyone have any suggestions? Any words of wisdom would be much
> appreciated.

Can you experiment with turning on/off various performance xlators?
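For reference, a minimal sketch of the locking pattern described above
(flock() around the whole SQLite open/update/close cycle), written in
Python rather than the PHP used in the actual test. The side-car lock
file and the paths are assumptions for illustration, not part of the
original test:

```python
import fcntl
import os
import sqlite3
import tempfile

# In the real setup DB_PATH would live on the gluster mount, e.g.
# "/mnt/gluster/test.db"; a temp dir is used here so the sketch runs
# anywhere.  The separate .lock file is an assumption -- it avoids
# taking flock() on the database file SQLite itself has open.
DB_PATH = os.path.join(tempfile.mkdtemp(), "test.db")
LOCK_PATH = DB_PATH + ".lock"

def locked_update(sql, params=()):
    """Serialize the whole open/update/close cycle behind flock()."""
    lock_fd = os.open(LOCK_PATH, os.O_CREAT | os.O_RDWR, 0o644)
    try:
        fcntl.flock(lock_fd, fcntl.LOCK_EX)  # block until exclusive lock
        conn = sqlite3.connect(DB_PATH)
        try:
            conn.execute(sql, params)
            conn.commit()  # commit triggers SQLite's own fsync of db/journal
        finally:
            conn.close()
    finally:
        fcntl.flock(lock_fd, fcntl.LOCK_UN)
        os.close(lock_fd)

locked_update("CREATE TABLE IF NOT EXISTS t (k TEXT, v INTEGER)")
locked_update("INSERT INTO t VALUES (?, ?)", ("a", 1))
```

Whether this is safe on glusterfs still depends on fsync actually
reaching stable storage, which is exactly the open question in this
thread.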
Based on earlier issues, it's likely that there is stale metadata which
might be causing the issue (not necessarily improper fsync behavior). I
would suggest turning off all performance xlators. You can refer to [1]
for a related discussion.

In theory the only perf xlator relevant for fsync is write-behind, and
I am not aware of any issues where fsync is not working. Does the
glusterfs log file have any messages complaining about writes or fsync
failing?

Does your application use O_DIRECT? If yes, please note that you need
to turn the option performance.strict-o-direct on for write-behind to
honour O_DIRECT.

Also, is it possible to identify the nature of the corruption - data or
metadata? A more detailed explanation will help to RCA the issue.

Also, is your application running on a single mount or from multiple
mounts?

Can you collect an strace of your application (strace -ff -T -p <pid>
-o <file>)? If possible, can you also collect a fuse dump using the
option --dump-fuse while mounting glusterfs?

[1] http://lists.gluster.org/pipermail/gluster-users/2018-February/033503.html

> Thanks,
>
> Paul
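To make the "turn off all performance xlators" suggestion concrete, a
sketch of the volume options involved (the volume name gv0 is a
placeholder; check option names against `gluster volume set help` on
your version):

```shell
# Disable the stock client-side performance translators on volume gv0
gluster volume set gv0 performance.write-behind off
gluster volume set gv0 performance.read-ahead off
gluster volume set gv0 performance.io-cache off
gluster volume set gv0 performance.quick-read off
gluster volume set gv0 performance.stat-prefetch off
gluster volume set gv0 performance.open-behind off

# If the application opens files with O_DIRECT, write-behind only
# honours it with this option on
gluster volume set gv0 performance.strict-o-direct on
```

These are config changes on the servers; remount clients afterwards so
the new translator graph takes effect.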
_______________________________________________
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users