On Wed, Jan 8, 2020 at 12:02 PM Simon Baatz <gmbno...@gmail.com> wrote:
> On Tue, Jan 07, 2020 at 04:45:54PM -0500, Brian Bouterse wrote:
> > We had two bugs filed recently [0][1] which suggest that when using the
> > default backend for Pulp, i.e. pulpcore.app.models.storage.FileSystem
> > Pulp should not be "moving" files. This is the default behavior Django
> > gives us, and it destroys data when sync'ed from file:/// for example
> > [1].
>
> I am surprised that file:/// should be supported at all. This means
> that a worker process must be able to access basically any file on the
> system. Is this covered by the (upcoming?) SELinux policy? I would
> expect workers to be more constrained than that. Or do we expect
> the user to label files before importing?

The user would need to label the files ahead of time. The use case is
that a large amount of content is stored on a hard drive which is mounted
on the worker. Many Pulp setups use this as a mass-import method before
switching their syncs back to the CDN.

> > I propose that with 3.1 we fix this bug by switching
> > pulpcore.app.models.storage.FileSystem to leave files in place and
> > either hard-link (same filesystem) or copy (different filesystem).
>
> We should not use hard links:
>
> - On systems with sysctl "fs.protected_hardlinks" enabled (Ubuntu, Fedora),
>   the files would have to be owned by the pulp user to be able to
>   create hard-links at all.
> - On systems with SELinux enabled, these hard links will share
>   the labels.
> - Imported hard links are not protected against modification of the
>   original file.
>
> We could try to reflink the file and fall-back to copy (like "cp
> --reflink=auto" does)

Great idea; let me try doing it this way. I'll link to my POC PR when it's
up. Feel free to share one as well.
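
To make the reflink-then-copy idea concrete, here is a rough sketch in
Python, assuming Linux. It is not the POC and not pulpcore code:
reflink_or_copy is a hypothetical helper name, and the FICLONE value is
the ioctl request number from <linux/fs.h>, hard-coded here because the
Python standard library does not export it. The source file is left in
place in both cases.

```python
import fcntl
import shutil

# Assumption: Linux FICLONE ioctl request number, _IOW(0x94, 9, int),
# as defined in <linux/fs.h>.
FICLONE = 0x40049409


def reflink_or_copy(src, dst):
    """Clone src to dst via reflink if possible, else do a plain copy.

    Returns "reflink" or "copy" so the caller can see which path was taken.
    The source file is never moved or modified.
    """
    with open(src, "rb") as fsrc, open(dst, "wb") as fdst:
        try:
            # Ask the filesystem to share extents between src and dst.
            fcntl.ioctl(fdst.fileno(), FICLONE, fsrc.fileno())
            return "reflink"
        except OSError:
            # Not supported here (e.g. ext4, tmpfs, or a different
            # filesystem than src): fall through to a regular copy.
            pass
    shutil.copyfile(src, dst)
    return "copy"
```

On filesystems with reflink support (e.g. Btrfs, or XFS created with
reflink enabled) the first branch should succeed; everywhere else this
behaves like a normal copy, similar in spirit to "cp --reflink=auto".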
_______________________________________________
Pulp-dev mailing list
Pulp-dev@redhat.com
https://www.redhat.com/mailman/listinfo/pulp-dev