Re: [gentoo-user] Distfiles cache setup
On Sunday 20 September 2015 11:07:32 Andrew Savchenko wrote:
> > I regularly run eclean-dist on the mythtv frontends as I still have
> > 32GB SSDs on a couple of them. These are pretty lean as all file
> > shares & mythtv recordings are on the server that is running 24/7.
> >
> > I figured eclean-dist would wipe out everything that wasn't needed
> > by the machine it was run on, but if all it does is clean stuff that
> > isn't in the tree any longer, that would work too.
>
> This is controllable:
> - eclean-dist cleans what is no longer in the tree and not installed
>   on the system;
> - eclean-dist -d cleans everything not installed on the system.
>
> One can also restrict cleaning by file date (e.g. don't touch files
> newer than a given age) or by file size; fetch-restricted files may
> be spared as well. See
>     eclean-dist --help
> for more details.

But keep https://bugs.gentoo.org/show_bug.cgi?id=472020 in mind if you
want to use "-n" and "-d" together.

-- 
Marc Joliet
--
"People who think they know everything really annoy those of us who
know we don't" - Bjarne Stroustrup
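[For concreteness, the restrictions Andrew describes map onto eclean-dist
options roughly as follows. This is a sketch based on gentoolkit's eclean;
run eclean-dist --help on your own version to confirm the exact option
spellings and how each limit is interpreted.]

```shell
# Dry run first: show what would be deleted without touching anything.
eclean-dist --pretend

# Default behaviour: remove only distfiles that are no longer in the
# tree and not owned by an installed package.
eclean-dist

# Destructive (-d): remove everything not needed by an installed package.
eclean-dist --destructive

# Restrict by age and size, and spare fetch-restricted files
# (--time-limit spares files newer than the given age; check --help
# for the exact semantics of --size-limit on your version):
eclean-dist --destructive --time-limit=6m --size-limit=100M --fetch-restricted
```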
Re: [gentoo-user] Distfiles cache setup
On Fri, 18 Sep 2015 17:48:15 -0700 Daniel Frey wrote:
> On 09/18/2015 01:15 PM, Neil Bothwick wrote:
> > How tight is space? eclean-dist only removes distfiles for packages
> > that are no longer in the tree. So you can run it on one system and
> > keep $DISTDIR reasonably trimmed. If you use the --package-names
> > option, it will do as you suggest and only keep files needed by the
> > machine running the command.
>
> Thanks for the replies.
>
> I regularly run eclean-dist on the mythtv frontends as I still have
> 32GB SSDs on a couple of them. These are pretty lean as all file
> shares & mythtv recordings are on the server that is running 24/7.
>
> I figured eclean-dist would wipe out everything that wasn't needed by
> the machine it was run on, but if all it does is clean stuff that
> isn't in the tree any longer, that would work too.

This is controllable:
- eclean-dist cleans what is no longer in the tree and not installed
  on the system;
- eclean-dist -d cleans everything not installed on the system.

One can also restrict cleaning by file date (e.g. don't touch files
newer than a given age) or by file size; fetch-restricted files may be
spared as well. See
    eclean-dist --help
for more details.

> The server I'd be running it on has ample space, which is why I was
> debating between http-replicator (thanks for the suggestion Peter!)
> and just exporting the damn distfiles directory.
>
> I think I'm going to try exporting it first and see if it does what I
> want; if it works I'll leave it. :-)

We have a cluster of identical machines. Exporting over NFS works just
fine, though we export not only /usr/portage but also
/usr/local/portage, /var/lib/layman and /var/cache/edb/dep (we use the
sqlite backend for portage).

Best regards,
Andrew Savchenko
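[Andrew's setup can be sketched as an /etc/exports fragment like the one
below. Illustrative only: the client subnet and the read-only/read-write
split are assumptions, not from his mail; adjust to taste.]

```
# /etc/exports on the server (illustrative subnet)
/usr/portage        192.168.1.0/24(rw,no_subtree_check)
/usr/local/portage  192.168.1.0/24(ro,no_subtree_check)
/var/lib/layman     192.168.1.0/24(ro,no_subtree_check)
/var/cache/edb/dep  192.168.1.0/24(ro,no_subtree_check)

# A matching /etc/fstab line on a client might look like:
# server:/usr/portage  /usr/portage  nfs  vers=3,rw  0 0
```

/usr/portage is exported read-write here on the assumption that clients
fetch into the shared distfiles directory themselves; a setup where only
the server fetches could export it read-only.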
Re: [gentoo-user] Distfiles cache setup
On Friday, September 18, 2015 10:02:27 AM Daniel Frey wrote:
> Anyone have any suggestions?
>
> Dan

I actually export the portage tree using NFS as well between all
non-mobile systems. Binary packages can be shared as well, as long as
all the machines have identical CFLAGS, profiles and USE flags.

-- 
Joost
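[If you want to try the binary-package sharing Joost describes, the
relevant Portage knobs look roughly like this. A sketch, not his actual
config: the shared path is an assumption, and as he says, it is only
safe when CFLAGS, profile and USE flags match on every machine.]

```
# /etc/portage/make.conf (identical on every machine)
FEATURES="buildpkg"               # save a binary package for each build
PKGDIR="/usr/portage/packages"    # point at the shared NFS export

# Then install from existing binaries where possible:
#   emerge --usepkg world
```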
Re: [gentoo-user] Distfiles cache setup
On Saturday, September 19, 2015 06:39:38 AM hydra wrote:
> On Fri, Sep 18, 2015 at 7:02 PM, Daniel Frey wrote:
> You can export distfiles via glusterfs. A single machine holds the
> data while the others can fetch / upload files. Glusterfs needs to be
> installed on each machine and fuse enabled in the kernel.

That's a bit overkill. NFS (version 3) will easily work and doesn't
require anything really special.

-- 
Joost
Re: [gentoo-user] Distfiles cache setup
On Fri, Sep 18, 2015 at 7:02 PM, Daniel Frey wrote:
> Hi all,
>
> I have been running several Gentoo machines here at my house, and am
> currently up to 7 (or was it 8?) installs.
>
> [...]
>
> With those two options, neither being perfect, it made me wonder if
> there's a Better Way(tm) to do this.
>
> Ideally, it would be nice to have some sort of caching proxy that
> could fetch the file as it was needed, but in searching for this I
> encountered so much noise in the search results I gave up for the
> time being.
>
> Anyone have any suggestions?
>
> Dan

You can export distfiles via glusterfs. A single machine holds the
data while the others can fetch / upload files. Glusterfs needs to be
installed on each machine and fuse enabled in the kernel.
Re: [gentoo-user] Distfiles cache setup
On 09/18/2015 01:15 PM, Neil Bothwick wrote:
> How tight is space? eclean-dist only removes distfiles for packages
> that are no longer in the tree. So you can run it on one system and
> keep $DISTDIR reasonably trimmed. If you use the --package-names
> option, it will do as you suggest and only keep files needed by the
> machine running the command.

Thanks for the replies.

I regularly run eclean-dist on the mythtv frontends as I still have
32GB SSDs on a couple of them. These are pretty lean, as all file
shares & mythtv recordings are on the server that is running 24/7.

I figured eclean-dist would wipe out everything that wasn't needed by
the machine it was run on, but if all it does is clean stuff that
isn't in the tree any longer, that would work too.

The server I'd be running it on has ample space, which is why I was
debating between http-replicator (thanks for the suggestion Peter!)
and just exporting the damn distfiles directory.

I think I'm going to try exporting it first and see if it does what I
want; if it works I'll leave it. :-)

Dan
Re: [gentoo-user] Distfiles cache setup
On Fri, 18 Sep 2015 10:02:27 -0700, Daniel Frey wrote:
> 2. Export the distfiles directory.

That's what I do.

> This seems to be a somewhat better solution, other than not being
> able to use it outside the LAN.

ZeroTier can take care of that, or a VPN if you feel like doing the
work yourself.

> However, cleaning this directory becomes a lot less trivial, as tools
> used to clean it will assume that the current machine is the only
> machine using it and clobber other workstations' required distfiles.

How tight is space? eclean-dist only removes distfiles for packages
that are no longer in the tree. So you can run it on one system and
keep $DISTDIR reasonably trimmed. If you use the --package-names
option, it will do as you suggest and only keep files needed by the
machine running the command.

> I suppose the easiest way to sync is to wipe it completely out and
> run `emerge -fe world` on all machines to rebuild it, but this would
> be a fair bit of work as well.

If you run this on each computer

    emerge -epf --usepkg=n world | \
        awk '/^[fh]t?tps?:\/\// {print $1}' | sort -u | \
        while read f; do
            touch --no-create ${DISTDIR}/$(basename ${f})
        done

it will touch each file needed by an installed package. Then you can
simply delete all files that haven't been touched for a few days (use
a longer window if you want to keep some fallback):

    find $DISTDIR -type f -mtime +3 -exec rm "{}" +

-- 
Neil Bothwick

Everyone has a photographic memory. Some don't have film.
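[Neil's mark-and-sweep approach is easy to try in a sandbox before
pointing it at a real $DISTDIR. The sketch below uses made-up file
names and a temporary directory, and assumes GNU touch/find: it ages
three fake distfiles, "marks" one the way the emerge -epf loop would
on a machine that still needs it, then sweeps everything older than
three days.]

```shell
#!/bin/sh
set -e

# Stand-in for a shared distfiles directory.
DISTDIR=$(mktemp -d)

# Three fake distfiles (names are illustrative only).
touch "$DISTDIR/old-pkg-1.0.tar.gz" \
      "$DISTDIR/stale-lib-2.3.tar.bz2" \
      "$DISTDIR/wanted-4.5.tar.xz"

# Age everything past the cutoff...
touch -d "4 days ago" "$DISTDIR"/*

# ...then mark the one file a machine still needs (this is what the
# per-host emerge -epf | touch loop does for every needed file).
touch --no-create "$DISTDIR/wanted-4.5.tar.xz"

# Sweep: delete anything not touched within the last 3 days.
find "$DISTDIR" -type f -mtime +3 -exec rm "{}" +

# Only the marked file should survive.
remaining=$(ls "$DISTDIR")
echo "$remaining"
rm -r "$DISTDIR"
```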
Re: [gentoo-user] Distfiles cache setup
On Friday 18 September 2015 10:02:27 Daniel Frey wrote:
> Ideally, it would be nice to have some sort of caching proxy that
> could fetch the file as it was needed, but in searching for this I
> encountered so much noise in the search results I gave up for the
> time being.
>
> Anyone have any suggestions?

I use http-replicator for this. It needs a bit of maintenance to keep
its space use down, but that can be taken care of by a cron script.

-- 
Rgds
Peter
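[Peter doesn't say what his cron script does; a minimal sketch of the
kind of cleanup he means might look like the fragment below. The cache
path and the 30-day policy are assumptions, not from his mail; check
your http-replicator configuration for the real cache directory.]

```
# /etc/cron.weekly/http-replicator-clean  (illustrative)
#!/bin/sh
# Drop cached files that haven't been requested in 30 days.
find /var/cache/http-replicator -type f -atime +30 -delete
```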
[gentoo-user] Distfiles cache setup
Hi all,

I have been running several Gentoo machines here at my house, and am
currently up to 7 (or was it 8?) installs.

I have been trying to reduce my resource consumption and set up an
rsync mirror long ago, so my [acting] server is the only machine that
syncs to the internet and all other devices point to it. That part is
working fine; I've already moved it to the repos.conf configuration.

Whenever I search for how to run a local distfiles mirror (on this
list and on the web) it gets a bit murky.

The way I see it, this can be done a couple of ways:

1. Set up a lighttpd server to serve the distfiles directory.

This has the benefit of being able to sync machines outside my
network, although I don't know if I'd expose it to the internet.

The major issue I can see with this is that if a file doesn't exist,
portage will crap out saying it's not available. What I don't know is
whether there's an easy way to get around this issue.

2. Export the distfiles directory.

This seems to be a somewhat better solution, other than not being
usable outside the LAN. However, cleaning this directory becomes a lot
less trivial, as tools used to clean it will assume that the current
machine is the only machine using it and clobber other workstations'
required distfiles.

I suppose the easiest way to sync is to wipe it out completely and run
`emerge -fe world` on all machines to rebuild it, but this would be a
fair bit of work as well.

With those two options, neither being perfect, it made me wonder if
there's a Better Way(tm) to do this.

In the case of a shared distfiles directory, it would be best if
something on the machine hosting the distfiles monitored which
workstation needed which file, and only removed a file when no
workstation still requests it. Alas, I don't think such a tool exists
(although I didn't really look that hard.)

Ideally, it would be nice to have some sort of caching proxy that
could fetch a file as it is needed, but in searching for this I
encountered so much noise in the search results that I gave up for the
time being.

Anyone have any suggestions?

Dan