> On 08 Mar 2016, at 12:02, Chas Williams <3ch...@gmail.com> wrote: > > On Tue, 2016-03-08 at 10:16 +0100, Denis Lohner wrote: >> Am 08.03.2016 um 04:37 schrieb Benjamin Kaduk: >>>>> There >>>>>>> are many call paths in the cache manager that end up at this function, >>>>>>> most of which are not prepared to properly handle an ERESTARTSYS >>>>>>> return. >>>>>>> Since this status can be returned after some data has already been >>>>>>> written, the correct behavior upon receiving it is far from clear ... >>>>>>> a >>>>>>> path towards a client free of this vector for data corruption may >>>>>>> involve >>>>>>> avoiding the dependence on splice_from_pipe_next() in preference to >>>>>>> adopting all call sites to handle the ERSTARTSYS case. >>>>> >>>>> For the 1.6 release, this seems the best choice of action. The "real" >>>>> fix would likely be difficult to completely test in a timely fashion. >>> That only helps if we know what the replacement would be...I am not a >>> linux VFS expert and do not have any ideas right now. >> >> >> I am not a kernel/driver developer nor a file system developer. So >> please forgive, if the following makes no sense at all. >> >> As far as I understand the issue and the openafs sources, the problem >> arises as afs_linux_storeproc uses the splice api that can return >> ERESTARTSYS as of kernel version 4.4. >> A quick search in the NEWS file and git logs suggests that >> afs_linux_storeproc was introduced in OpenAFS 1.5.69 (2010-01-19) as a >> performance improvement: >> " Linux >> >> * Use splice to speed up storing files." >> >> The original behaviour which uses seperate reads/writes instead of >> splice and that is (still) used on non-linux systems remained in >> afs_GenericStoreProc in src/afs/afs_fetchstore.c . >> >> So my question is: Is it possible to rereplace afs_linux_storeproc with >> afs_GenericStoreProc on linux kernel versions >=4.4 as a temporary >> solution to the issue either in the openafs sources or as a distribution >> specific patch, trading some performance for data integrity? > > That would be the first thing I tried. This code was brought into the > tree on commit 34ffc9cd7d7eed62229704ad0e1d327f076ea7b6. There doesn't seem > to be any additional side effects, so simply not using afs_linux_storeproc > should > still work.
So we'd simply do something like this diff --git a/src/afs/afs_fetchstore.c b/src/afs/afs_fetchstore.c index f65f40c..2630209 100644 --- a/src/afs/afs_fetchstore.c +++ b/src/afs/afs_fetchstore.c @@ -326,7 +326,7 @@ struct storeOps rxfs_storeUfsOps = { .padd = rxfs_storePadd, .close = rxfs_storeClose, .destroy = rxfs_storeDestroy, -#ifdef AFS_LINUX26_ENV +#if defined(AFS_LINUX26_ENV) && defined(LINUX_USE_SPLICE) .storeproc = afs_linux_storeproc #else .storeproc = afs_GenericStoreProc and add a configure test defaulting to off? -- Stephan Wiesand DESY - DV - Platanenallee 6 15738 Zeuthen, Germany _______________________________________________ OpenAFS-info mailing list OpenAFS-info@openafs.org https://lists.openafs.org/mailman/listinfo/openafs-info