Re: [Pulp-list] Changing working_directory and/or reducing disk utilization during sync

2017-04-03 Thread Christina Plummer
Thanks Michael!  I appreciate the response.
Oracle seems to have their own special way of doing things, for sure.

___
Pulp-list mailing list
Pulp-list@redhat.com
https://www.redhat.com/mailman/listinfo/pulp-list

Re: [Pulp-list] Changing working_directory and/or reducing disk utilization during sync

2017-03-31 Thread Michael Hrivnak
I just looked at the repo, and other.xml is 720MB compressed!!! Wow! I
wonder what's in there.

For comparison, just for fun, I checked RHEL 6.8. The other.xml file there
is under 5MB compressed.

The setting to change where the working directory lives is intended to help
in a scenario where you're using a slow/latent network filesystem in a Pulp
cluster. It allows a worker process to potentially use fast local storage
for transient data. But you do pay a small price for having to eventually
copy some data from that filesystem to the shared one.

Thus on a single-machine deployment, it pays to have /var/cache/pulp on the
same filesystem as /var/lib/pulp.
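
A minimal sketch of how one might check that two directories sit on the same filesystem, assuming both paths already exist. The helper name `same_filesystem` is hypothetical and is not part of Pulp; it only compares the device IDs reported by `os.stat`.

```python
import os


def same_filesystem(path_a, path_b):
    """Return True if both existing paths report the same device ID."""
    # st_dev identifies the device that holds each path; equal values
    # usually mean the same mounted filesystem.
    return os.stat(path_a).st_dev == os.stat(path_b).st_dev


if __name__ == "__main__":
    # The two paths named in this message; adjust for your own layout.
    try:
        print(same_filesystem("/var/cache/pulp", "/var/lib/pulp"))
    except OSError as exc:
        print("could not check:", exc)
```

Some filesystems report distinct device IDs for subvolumes, so treat a False result as a hint to look closer rather than as proof.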

When changing the setting, restarting services is all you need to do,
besides, of course, ensuring that the "apache" user can write to the new
location.
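
A rough sketch of a pre-restart sanity check, under two assumptions that are not confirmed in this thread: that working_directory lives in the [server] section of server.conf, and that it defaults to /var/cache/pulp when unset. Both function names are hypothetical. The write check applies to whichever user runs the script, so it would need to run as "apache" to be meaningful.

```python
import configparser
import os

# Assumed default, based on the /var/cache/pulp path mentioned above.
DEFAULT_WORKING_DIRECTORY = "/var/cache/pulp"


def working_directory_from_text(conf_text):
    """Return the working_directory value found in server.conf-style text."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(conf_text)
    # Assumption: the key sits in a [server] section; fall back to the
    # assumed default when the section or key is absent.
    return parser.get("server", "working_directory",
                      fallback=DEFAULT_WORKING_DIRECTORY)


def writable_by_current_user(path):
    """Return True if path is a directory the current user can write into."""
    return os.path.isdir(path) and os.access(path, os.W_OK | os.X_OK)


if __name__ == "__main__":
    # Hypothetical new location on a larger filesystem.
    sample = "[server]\nworking_directory: /srv/pulp-work\n"
    target = working_directory_from_text(sample)
    print(target, writable_by_current_user(target))
```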

Otherwise, there's no option for reducing Pulp's disk usage during sync. It
has to download that 720MB file, and it does end up storing all of that
data on disk uncompressed temporarily while the sync takes place. In theory
we could modify that workflow to store those temporary data blobs (one for
each RPM) compressed in the working directory. But it's not currently
optimized for gigantic metadata files, and I'm not sure if it would be
worth adding that complexity and overhead for a rare use case.
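
To illustrate why compressed metadata can balloon once it is unpacked, here is a small sketch that measures how much a repetitive XML blob shrinks under gzip. The sample data is made up and far more repetitive than any real other.xml, so the ratio it prints says nothing about the Oracle repo; the function names are hypothetical.

```python
import gzip


def expansion_ratio(raw_bytes):
    """Return uncompressed size divided by gzip-compressed size."""
    compressed = gzip.compress(raw_bytes)
    return len(raw_bytes) / len(compressed)


def sample_changelog_xml(entries):
    """Build a synthetic XML blob made of identical placeholder entries."""
    entry = ('<changelog author="someone@example.com" date="1490000000">'
             '- fix a bug</changelog>\n')
    return ("<otherdata>\n" + entry * entries + "</otherdata>\n").encode("utf-8")


if __name__ == "__main__":
    blob = sample_changelog_xml(10000)
    print("uncompressed bytes:", len(blob))
    print("expansion ratio: %.1f" % expansion_ratio(blob))
```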

Michael


Re: [Pulp-list] Changing working_directory and/or reducing disk utilization during sync

2017-03-29 Thread Christina Plummer
>
> The short answer is that if you need to sync Oracle Linux, sync one distro
> at a time and leave enough space.


Yes, I understand and suspected as much.  My question was primarily about
changing the working_directory setting in server.conf, since it does not
seem to be well documented.

Thanks,
Christina

Re: [Pulp-list] Changing working_directory and/or reducing disk utilization during sync

2017-03-29 Thread Christina Plummer
>
> This may be unrelated to the sync problem - but do you have the export
> distributor configured on that repo?


Hi Mihai,
No, we aren't using the export distributor - but I'll keep that in mind if
we end up needing it later.

Thanks,
Christina

Re: [Pulp-list] Changing working_directory and/or reducing disk utilization during sync

2017-03-29 Thread Mihai Ibanescu
This may be unrelated to the sync problem - but do you have the export
distributor configured on that repo?

It doesn't affect syncing at all, but there is a publish operation at the
end of the sync. As we found out the hard way, the export distributor tries
hard to burn all your CPU while running mkisofs (for my use case we don't
need ISO images). That task may consume some extra space as well (which, as
you point out, will be released).

Mihai


[Pulp-list] Changing working_directory and/or reducing disk utilization during sync

2017-03-29 Thread Christina Plummer
Hello all,

I am running Pulp 2.9.2. We are facing issues with our /var filesystem
filling up when we do our nightly syncs - in particular, when we sync the
Oracle Linux channel:
http://public-yum.oracle.com/repo/OracleLinux/OL6/latest/x86_64/

Syncing this one repo uses 5+ GB of space on /var while the sync is
running.  Usage typically returns to normal once the sync completes
(although if the filesystem completely fills we have had to restart
pulp_workers in order to clear it). We have increased the size of the
filesystem already (currently 8GB), but don't want to keep having to do so.
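
One way to catch this before a nightly sync starts is a free-space check on the filesystem in question. The sketch below is only an illustration: the `has_headroom` helper is hypothetical, and the 5 GiB threshold simply mirrors the usage figure quoted above.

```python
import shutil

GIB = 1024 ** 3


def has_headroom(path, required_bytes):
    """Return True if the filesystem holding path has enough free space."""
    # disk_usage reports total, used and free bytes for that filesystem.
    return shutil.disk_usage(path).free >= required_bytes


if __name__ == "__main__":
    # Threshold taken from the 5+ GB observed during the sync.
    if has_headroom("/var", 5 * GIB):
        print("enough free space on /var to attempt the sync")
    else:
        print("less than 5 GiB free on /var; the sync may fill it")
```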

We found the "working_directory" setting in server.conf, but couldn't find
much documentation about it. Since this is a production system, I wanted to
check with the list first to confirm:
1) Will changing this to a location on a different, larger filesystem
address my issues with /var utilization spikes during repo sync?
2) Are there any special considerations to changing this setting, other
than restarting all the services?  Do I need to copy the subdirectories? Is
a symlink a bad idea? It looks like the SELinux context probably needs to
be set to pulp_var_cache_t.
3) Is there another way to reduce Pulp's utilization during the sync? This
repo seems to be particularly egregious in terms of the massive size of the
uncompressed other.db and filelists.db for some reason.

Thanks,
Christina