Re: [gentoo-user] Distfiles cache setup

2015-09-20 Thread Marc Joliet
On Sunday 20 September 2015 11:07:32 Andrew Savchenko wrote:
>> I regularly run eclean-dist on the mythtv frontends as I still have 32GB
>> SSDs on a couple of them. These are pretty lean as all file shares &
>> mythtv recordings are on the server that is running 24/7.
>>
>> 
>>
>> I figured eclean-dist would wipe out everything that wasn't needed by
>> the machine it was run on, but if all it does is clean stuff that isn't
>> in the tree any longer, that would work too.
>
>This is controllable:
>- eclean-dist cleans distfiles that are no longer in the tree and
>not installed on the system;
>- eclean-dist -d cleans everything not installed on the system.
>
>One can also restrict cleaning by file date (e.g. don't touch files
>newer than a given date) or by file size; fetch-protected files may
>be spared as well. See
>  eclean-dist --help
>for more details.

But keep https://bugs.gentoo.org/show_bug.cgi?id=472020 in mind if you want to 
use "-n" and "-d" together.

-- 
Marc Joliet
--
"People who think they know everything really annoy those of us who know we
don't" - Bjarne Stroustrup




Re: [gentoo-user] Distfiles cache setup

2015-09-20 Thread Andrew Savchenko
On Fri, 18 Sep 2015 17:48:15 -0700 Daniel Frey wrote:
> On 09/18/2015 01:15 PM, Neil Bothwick wrote:
> > How tight is space? eclean-dist only removes distfiles for packages that
> > are no longer in the tree. So you can run it on one system and keep
> > $DISTDIR reasonably trimmed. If you use the --package-names option, it
> > will do as you suggest and only keep files needed by the machine running
> > the command.
> > 
> 
> Thanks for the replies.
> 
> I regularly run eclean-dist on the mythtv frontends as I still have 32GB
> SSDs on a couple of them. These are pretty lean as all file shares &
> mythtv recordings are on the server that is running 24/7.
> 
> I figured eclean-dist would wipe out everything that wasn't needed by
> the machine it was run on, but if all it does is clean stuff that isn't
> in the tree any longer, that would work too.

This is controllable:
- eclean-dist cleans distfiles that are no longer in the tree and
not installed on the system;
- eclean-dist -d cleans everything not installed on the system.

One can also restrict cleaning by file date (e.g. don't touch files
newer than a given date) or by file size; fetch-protected files may be spared
as well. See
  eclean-dist --help
for more details.
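
Something along these lines, as a rough sketch (the two-week limit is
only an example value, and --pretend is there so nothing is removed
until you have reviewed the list):

  # non-destructive: drop only distfiles no longer in the tree,
  # keeping anything newer than two weeks
  eclean-dist --pretend --time-limit=2w

  # destructive: keep only what installed packages need, but spare
  # fetch-restricted files
  eclean-dist --pretend --destructive --fetch-restricted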
 
> The server I'd be running it on has ample space, which is why I was
> debating between http-replicator (thanks for the suggestion, Peter!) and
> just exporting the damn distfiles directory.
> 
> I think I'm going to try exporting it first and see if it does what I
> want; if it works, I'll leave it. :-)

We have a cluster of identical machines. Exporting over NFS works
just fine, though we export not only /usr/portage,
but also /usr/local/portage, /var/lib/layman and /var/cache/edb/dep
(we use the sqlite backend for portage).
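
For reference, a minimal sketch of what such an export can look like
(the 192.168.0.0/24 subnet, the "server" hostname and the read/write
split are assumptions, adjust to taste):

  # server: /etc/exports
  /usr/portage            192.168.0.0/24(ro,no_subtree_check)
  /usr/portage/distfiles  192.168.0.0/24(rw,sync,no_subtree_check)

  # clients: /etc/fstab (NFSv3)
  server:/usr/portage            /usr/portage            nfs  ro,vers=3  0 0
  server:/usr/portage/distfiles  /usr/portage/distfiles  nfs  rw,vers=3  0 0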

Best regards,
Andrew Savchenko




Re: [gentoo-user] Distfiles cache setup

2015-09-19 Thread J. Roeleveld
On Friday, September 18, 2015 10:02:27 AM Daniel Frey wrote:
> Anyone have any suggestions?
> 
> Dan

I actually export the portage tree using NFS as well between all non-mobile 
systems.
Binary packages can be shared as well, as long as all the machines have 
identical CFLAGS, profiles and USE-flags.
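
A rough make.conf sketch of that kind of sharing (the PKGDIR path is an
assumption; the point is that every machine builds into and installs
from the same shared directory):

  # /etc/portage/make.conf on every machine
  FEATURES="buildpkg"
  PKGDIR="/usr/portage/packages"
  EMERGE_DEFAULT_OPTS="--usepkg"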

--
Joost



Re: [gentoo-user] Distfiles cache setup

2015-09-19 Thread J. Roeleveld
On Saturday, September 19, 2015 06:39:38 AM hydra wrote:
> On Fri, Sep 18, 2015 at 7:02 PM, Daniel Frey  wrote:
> You can export distfiles via glusterfs. A single machine holds the data
> while the others can fetch / upload files. Glusterfs needs to be installed
> on each machine and fuse enabled in the kernel.

That's a bit overkill.
NFS (version 3) will easily work and doesn't require anything really special.

--
Joost



Re: [gentoo-user] Distfiles cache setup

2015-09-18 Thread hydra
On Fri, Sep 18, 2015 at 7:02 PM, Daniel Frey  wrote:

> Hi all,
>
> I have been running several Gentoo machines here at my house, and am
> currently up to 7 (or was it 8?) installs.
>
> I have been trying to reduce my resource consumption and set up an rsync
> mirror long ago, so my [acting] server only syncs to the internet and
> all other devices point to it. That part is working fine, I've already
> moved it to the repos.conf configuration.
>
> Whenever I search for running a local distfiles mirror (on this list and
> on the web) it gets a bit murky.
>
> The way I see it is this can be done a couple of ways:
>
> 1. Set up a lighttpd server to serve the distfiles directory.
>
> This has the benefit of being able to sync machines outside my network,
> although I don't know if I'd expose it to the internet.
>
> The major issue I can see with this is that if the file doesn't exist,
> portage will crap out saying it's not available. What I don't know is if
> there's an easy way to "get around" this issue.
>
> 
>
> 2. Export the distfiles directory.
>
> This seems like a somewhat better solution, other than not being able
> to use it outside the LAN. However, cleaning this directory becomes a
> lot less trivial, as tools used to clean it will assume that the current
> machine is the only machine using it and clobber other workstations'
> required distfiles.
>
> I suppose the easiest way to sync is to wipe it completely out and run
> `emerge -fe world` on all machines to rebuild it, but this would be a
> fair bit of work as well.
>
> 
>
> With those two options, neither being perfect, I started to wonder if
> there's a Better Way(tm) to do this.
>
> In the case of a shared distfiles directory, it would be best if something on
> the machine hosting the distfiles monitored which workstation needed
> which file, and only removed a file once no workstation needs it.
> Alas, I don't think a tool such as that exists (although I didn't really
> look that hard.)
>
> Ideally, it would be nice to have some sort of caching proxy that could
> fetch the file as it was needed, but in searching for this I encountered
> so much noise in the search results I gave up for the time being.
>
> Anyone have any suggestions?
>
> Dan
>

You can export distfiles via glusterfs. A single machine holds the data
while the others can fetch / upload files. Glusterfs needs to be installed
on each machine and fuse enabled in the kernel.
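
Roughly like this, as an untested sketch (hostname, brick path and
mount point are placeholders):

  # on the machine holding the data
  gluster volume create distfiles server:/data/distfiles
  gluster volume start distfiles

  # on each of the other machines (FUSE required in the kernel)
  mount -t glusterfs server:/distfiles /usr/portage/distfiles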


Re: [gentoo-user] Distfiles cache setup

2015-09-18 Thread Daniel Frey
On 09/18/2015 01:15 PM, Neil Bothwick wrote:
> How tight is space? eclean-dist only removes distfiles for packages that
> are no longer in the tree. So you can run it on one system and keep
> $DISTDIR reasonably trimmed. If you use the --package-names option, it
> will do as you suggest and only keep files needed by the machine running
> the command.
> 

Thanks for the replies.

I regularly run eclean-dist on the mythtv frontends as I still have 32GB
SSDs on a couple of them. These are pretty lean as all file shares &
mythtv recordings are on the server that is running 24/7.

I figured eclean-dist would wipe out everything that wasn't needed by
the machine it was run on, but if all it does is clean stuff that isn't
in the tree any longer, that would work too.

The server I'd be running it on has ample space, which is why I was
debating between http-replicator (thanks for the suggestion, Peter!) and
just exporting the damn distfiles directory.

I think I'm going to try exporting it first and see if it does what I
want; if it works, I'll leave it. :-)

Dan



Re: [gentoo-user] Distfiles cache setup

2015-09-18 Thread Neil Bothwick
On Fri, 18 Sep 2015 10:02:27 -0700, Daniel Frey wrote:

> 2. Export the distfiles directory.

That's what I do.
 
> This seems to be a bit better of a solution, other than not being able
> to use it outside the LAN.

ZeroTier can take care of that, or a VPN if you feel like doing the work
yourself.

> However, cleaning this directory becomes a
> lot less trivial, as tools used to clean it will assume that the current
> machine is the only machine using it and clobber other workstations'
> required distfiles.

How tight is space? eclean-dist only removes distfiles for packages that
are no longer in the tree. So you can run it on one system and keep
$DISTDIR reasonably trimmed. If you use the --package-names option, it
will do as you suggest and only keep files needed by the machine running
the command.

> I suppose the easiest way to sync is to wipe it completely out and run
> `emerge -fe world` on all machines to rebuild it, but this would be a
> fair bit of work as well.

If you run this on each computer

# list the fetch URIs of every installed package and update the mtime
# of the corresponding distfile (DISTDIR must be set in the environment)
emerge -epf --usepkg=n world | awk '/^[fh]t?tps?:\/\// {print $1}' | sort -u |
while read f; do
    touch --no-create "${DISTDIR}/$(basename ${f})"
done

it will touch each distfile needed by an installed package. Then you can
simply delete all files that haven't been touched for more than a few days
(three in the example below; use a longer limit if you want to keep some
fallback):

find "$DISTDIR" -type f -mtime +3 -exec rm {} +


-- 
Neil Bothwick

Everyone has a photographic memory. Some don't have film.




Re: [gentoo-user] Distfiles cache setup

2015-09-18 Thread Peter Humphrey
On Friday 18 September 2015 10:02:27 Daniel Frey wrote:

> Ideally, it would be nice to have some sort of caching proxy that could
> fetch the file as it was needed, but in searching for this I encountered
> so much noise in the search results I gave up for the time being.
> 
> Anyone have any suggestions?

I use http-replicator for this. It needs a bit of maintenance to keep its 
space use down, but that can be taken care of by a cron script.
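
For the record, a sketch of the client side and of one possible cron
cleanup (port, cache directory and retention period are assumptions;
http-replicator itself is configured on the server):

  # client /etc/portage/make.conf: route portage's fetches through the proxy
  http_proxy="http://server:8080"
  ftp_proxy="http://server:8080"

  # server crontab: prune cached files untouched for 90 days
  0 4 * * 0  find /var/cache/http-replicator -type f -mtime +90 -delete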

-- 
Rgds
Peter




[gentoo-user] Distfiles cache setup

2015-09-18 Thread Daniel Frey
Hi all,

I have been running several Gentoo machines here at my house, and am
currently up to 7 (or was it 8?) installs.

I have been trying to reduce my resource consumption and set up an rsync
mirror long ago, so my [acting] server only syncs to the internet and
all other devices point to it. That part is working fine, I've already
moved it to the repos.conf configuration.

Whenever I search for running a local distfiles mirror (on this list and
on the web) it gets a bit murky.

The way I see it is this can be done a couple of ways:

1. Set up a lighttpd server to serve the distfiles directory.

This has the benefit of being able to sync machines outside my network,
although I don't know if I'd expose it to the internet.

The major issue I can see with this is that if the file doesn't exist,
portage will crap out saying it's not available. What I don't know is if
there's an easy way to "get around" this issue.
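
Concretely, the kind of setup I mean, with placeholder hostname and
paths (portage appends "distfiles/" to each mirror URL, so the parent
directory is what gets served):

  # server: lighttpd.conf fragment
  server.document-root = "/usr/portage"
  server.port          = 80

  # clients: /etc/portage/make.conf, local mirror listed first
  GENTOO_MIRRORS="http://server http://distfiles.gentoo.org"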



2. Export the distfiles directory.

This seems like a somewhat better solution, other than not being able
to use it outside the LAN. However, cleaning this directory becomes a
lot less trivial, as tools used to clean it will assume that the current
machine is the only machine using it and clobber other workstations'
required distfiles.

I suppose the easiest way to sync is to wipe it completely out and run
`emerge -fe world` on all machines to rebuild it, but this would be a
fair bit of work as well.



With those two options, neither being perfect, I started to wonder if
there's a Better Way(tm) to do this.

In the case of a shared distfiles directory, it would be best if something on
the machine hosting the distfiles monitored which workstation needed
which file, and only removed a file once no workstation needs it.
Alas, I don't think a tool such as that exists (although I didn't really
look that hard.)

Ideally, it would be nice to have some sort of caching proxy that could
fetch the file as it was needed, but in searching for this I encountered
so much noise in the search results I gave up for the time being.

Anyone have any suggestions?

Dan