Re: [gentoo-user] Open Question: The feasibility of a complete portage binhost

2015-01-21 Thread Alec Ten Harmsel

On 01/21/2015 07:47 AM, Sam Bishop wrote:
> So I've been thinking crazy thoughts.
>
> Theoretically it can't be that hard to do a complete package binhost for 
> gentoo.

I love that you qualify this with "theoretically."

>
> To be clear, when I say complete, I'm referring to building all
> versions of all ebuilds marked stable or unstable on amd64, with every
> combination of USE flags.

Every ebuild with every combination of USE flags? This is likely
impossible, and definitely not feasible. With 17000ish ebuilds in the
portage tree and assuming each only has 2 USE flags, this would be
building 17000*2^2 = 68,000 packages. If average build time is 20
seconds (nice server w/ SSD and enough RAM to build in /tmp), it'd take
377ish hours to do an initial build of the tree. I guess this isn't so
bad. Of course, there are outliers like www-client/firefox: 19
non-language USE flags, so 2^19 different firefox permutations at a fast
5 minutes apiece would take roughly 43,700 hours. I haven't looked at
REQUIRED_USE, so there could be fewer than 2^19 valid combinations of
flags; taking it down to 2^10 combinations is only 85 hours or so.
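
A rough sketch of the arithmetic above, in Python. The inputs are the guesses from this message (17,000 ebuilds, 2 flags each, 20 seconds per build, 5 minutes for firefox); the function name `build_hours` is made up for illustration, and every flag is assumed to be an independent on/off switch.

```python
def build_hours(packages: int, flags: int, seconds_each: float) -> float:
    """Serial hours to build every on/off combination of `flags` USE flags
    for `packages` ebuilds, assuming all flags are independent."""
    return packages * 2 ** flags * seconds_each / 3600


tree_hours = build_hours(17_000, 2, 20)       # whole tree, 2 flags per ebuild
firefox_full = build_hours(1, 19, 5 * 60)     # firefox with all 19 flags
firefox_cut = build_hours(1, 10, 5 * 60)      # firefox cut down to 10 flags

print(f"tree: {tree_hours:.0f} h")
print(f"firefox, 2^19 combinations: {firefox_full:.0f} h")
print(f"firefox, 2^10 combinations: {firefox_cut:.0f} h")
```

Changing the per-package flag count shows how quickly the total grows: each extra flag doubles it.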

>
> This pretty much boils down to bytes and bytes of storage + compute
> resources. Both of which are easily available to me. So I began
> pondering and here I am, thinking to myself "is this really all there
> is to it"?

A full CentOS mirror is ~600GB iirc, so you're gonna need a ton of storage.

> Does it really come down to CPU cycles and repeatedly running through
> the following commands for each combination of ebuild, version and use
> flags
>   emerge --emptytree --onlydeps ${name}
>   emerge --emptytree --buildpkgonly --buildpkg ${name}
>
> Obviously running them in a clean environment each time, either by
> chroot or other means.
> Then just storing the giant binhost somewhere suitable such as an AWS
> s3 bucket setup to work via HTTP so the normal tools work fine with
> it.
>

I haven't used binpkgs in a long time, but I think USE on the client
machine has to match the USE of the package being installed. Managing
all of this would be a nightmare unless you wrote your own special
portage server that served up binpkgs in a USE-aware way and a portage
host could request a binpkg with a certain USE.

Theoretically, great idea. I think this would be possible if you had
maybe 3 or 4 different USE combos (e.g. one for servers, one for KDE
client machines, one for gnome clients, etc.).

Alec

P.S. I'm reasonably sure my math is correct, but I would appreciate
corrections.



Re: [gentoo-user] Open Question: The feasibility of a complete portage binhost

2015-01-21 Thread Sam Bishop
On 21 January 2015 at 21:23, Alec Ten Harmsel  wrote:
>
> On 01/21/2015 07:47 AM, Sam Bishop wrote:
>> So I've been thinking crazy thoughts.
>>
>> Theoretically it can't be that hard to do a complete package binhost for 
>> gentoo.
>
> I love that you qualify this with "theoretically."
>

I'm in a position where the cost of these servers may become less than
the cost of paying developers to wait while ebuilds compile. So I'm
having a semi-serious theoretical discussion with myself about the
merits of opening this up to the entire Gentoo community, and a much
more serious theoretical discussion here, right now, with anyone on this
list about just how one would do this.

>>
>> To be clear, when i say complete, Im referring to building, all
>> versions of all ebuilds marked stable or unstable on amd64, with every
>> combination of use flags.
>
> Every ebuild with every combination of USE flags? This is likely
> impossible, and definitely not feasible. With 17000ish ebuilds in the
> portage tree and assuming each only has 2 USE flags, this would be
> building 17000*2^2 = 68,000 packages. If average build time is 20
> seconds (nice server w/ SSD and enough RAM to build in /tmp), it'd take
> 377ish hours to do an initial build of the tree. I guess this isn't so
> bad. Of course, there are outliers like www-client/firefox: 19
> non-language USE flags, so 2^19 different firefox permutations at a fast
> 5 minutes apiece would take 43000 hours. I haven't looked at
> REQUIRED_USE, so there could be less than 2^19 different combinations of
> flags; taking it down to 2^10 combinations is only 85 hours or so.
>

Or... looking at it another way: to do the initial build in under 24
hours, using your time estimate plus a healthy overestimate of
capacity, it would need approximately 20 'nice servers' for a day,
then a much smaller number to keep up with the ongoing work of
building all the new changes.
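
A quick sanity check on the server count, as a Python sketch. It reuses Alec's 17,000 x 2^2 x 20 s estimate and assumes the builds parallelise perfectly across machines, which real dependency chains will not allow; `servers_needed` is a made-up helper name.

```python
import math


def servers_needed(total_hours: float, deadline_hours: float) -> int:
    """Smallest number of identical servers that finishes `total_hours`
    of serial build work within `deadline_hours`, assuming perfect scaling."""
    return math.ceil(total_hours / deadline_hours)


tree_hours = 17_000 * 2 ** 2 * 20 / 3600   # Alec's estimate, about 378 hours
minimum = servers_needed(tree_hours, 24)
print(f"minimum servers for a 24 h initial build: {minimum}")
print(f"headroom with 20 servers: {20 / minimum:.2f}x")
```

On those assumptions the bare minimum is a little under 20 machines, so 20 leaves some slack for slow packages and failed builds.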

>>
>> This pretty much boils down to bytes and bytes of storage + compute
>> resources. Both of which are easily available to me. So I began
>> pondering and here I am, thinking to myself "is this really all there
>> is too it"?
>
> A full CentOS mirror is ~600GB iirc, so you're gonna need a ton of storage.
>

1TB on AWS S3 costs me $30... that's about 20 minutes of developer
time saved to pay back the cost.
At the moment our build pipeline can take over 45 minutes... most of
it is ebuilds compiling, so it won't be hard to speed up with a
binhost.
We're not exactly going to build 'less often', so this does add up.
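
The payback claim can be written out as a small Python sketch. The only inputs are the two figures quoted above ($30 for 1TB, paid back by 20 minutes of developer time); the hourly rate is derived from them rather than stated anywhere in the thread, and `payback_minutes` is a made-up helper name.

```python
def payback_minutes(cost_usd: float, dev_rate_usd_per_hour: float) -> float:
    """Minutes of saved developer time needed to cover `cost_usd`."""
    return cost_usd * 60 / dev_rate_usd_per_hour


storage_cost = 30.0                        # quoted cost of 1TB on S3
implied_rate = storage_cost * 60 / 20      # hourly rate implied by "20 minutes"

print(f"implied developer cost: ${implied_rate:.0f}/hour")
print(f"payback time: {payback_minutes(storage_cost, implied_rate):.0f} minutes")
```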

>> Does it really come down to CPU cycles and repeatedly running through
>> the following commands for each combination of ebuild, version and use
>> flags
>>   emerge --emptytree --onlydeps ${name}
>>   emerge --emptytree --buildpkgonly --buildpkg ${name}
>>
>> Obviously running them in a clean environment each time, either by
>> chroot or other means.
>> Then just storing the giant binhost somewhere suitable such as an AWS
>> s3 bucket setup to work via HTTP so the normal tools work fine with
>> it.
>>
>
> I haven't used binpkgs in a long time, but I think USE on the client
> machine has to match the USE of the package being installed. Managing
> all of this would be a nightmare unless you wrote your own special
> portage server that served up binpkgs in a USE-aware way and a portage
> host could request a binpkg with a certain USE.
>
> Theoretically, great idea. I think this would be possible if you had
> maybe 3 or 4 different USE combos (i.e. one for servers, one for KDE
> client machines, one for gnome clients, etc.).
>
> Alec
>
> P.S. I'm reasonably sure my math is correct, but I would appreciate
> corrections.
>

I don't see why it can't be all the combinations. The issue is
storage, and the storage costs could be a lot lower than expected
given how hard it is to guess. So I would also love to see some
corrected/more accurate estimates, especially any based on
numbers from anyone who has been involved in running a tinderbox.
These aren't exactly numbers many people have sitting around, haha.



Re: [gentoo-user] Open Question: The feasibility of a complete portage binhost

2015-01-21 Thread Rich Freeman
On Wed, Jan 21, 2015 at 9:00 AM, Sam Bishop  wrote:
>
> I don't see why it can't be all the combinations, the issue is
> storage, and the storage costs could be a lot lower than expected
> given how hard it is to guess.

I don't believe that binpkg filenames contain the use flag settings,
and I'm not sure that given multiple copies of a binpkg with different
filenames portage goes through them and figures out which ones are
which.  This isn't an area I have looked into seriously.  However, it
obviously would be a blocker for getting what you propose to work,
even theoretically.

I don't really see the value in having EVERY combination of use flags
on call though.

Practically speaking I doubt this could be done.  You're talking about
a LOT of combinations.

However, I think it would be very useful to have a binpkg repository
all the same.  Perhaps have one for each of a few common profiles with
the default flags.  That alone would be a significant undertaking.

Just about everybody who has talked about running Gentoo in a
datacenter has set up a binpkg repository.  They may very well deviate
from the default USE flags, but for the most part they try to keep
their systems identical.  They would build updates as binpkg, install
to a test system, and after testing deploy them to production and that
would of course go quickly.

I have a script I use to build binpkg nightly for the day's updates.
That lets me review updates and deploy them quickly.  Any rebuilds/etc
still take time, but the bulk of my updates are very fast this way
with minimal time spent staring at the screen.  This would be another
route to take if you really did need highly varied deployments.

-- 
Rich



Re: [gentoo-user] Open Question: The feasibility of a complete portage binhost

2015-01-21 Thread Alan McKinnon
On 21/01/2015 15:23, Alec Ten Harmsel wrote:
> On 01/21/2015 07:47 AM, Sam Bishop wrote:
>> > So I've been thinking crazy thoughts.
>> >
>> > Theoretically it can't be that hard to do a complete package binhost for 
>> > gentoo.
> I love that you qualify this with "theoretically."
> 
>> >
>> > To be clear, when i say complete, Im referring to building, all
>> > versions of all ebuilds marked stable or unstable on amd64, with every
>> > combination of use flags.
> Every ebuild with every combination of USE flags? This is likely
> impossible, and definitely not feasible.



A word: tinderbox

A sentence: flameyes' blog describes just how long it takes to do basic
runs and the difficulties attached


"not feasible" is 100% spot-on correct

-- 
Alan McKinnon
alan.mckin...@gmail.com




Re: [gentoo-user] Open Question: The feasibility of a complete portage binhost

2015-01-21 Thread Rich Freeman
On Wed, Jan 21, 2015 at 10:17 AM, Alan McKinnon  wrote:
> On 21/01/2015 15:23, Alec Ten Harmsel wrote:
>> On 01/21/2015 07:47 AM, Sam Bishop wrote:
>>> > So I've been thinking crazy thoughts.
>>> >
>>> > Theoretically it can't be that hard to do a complete package binhost for 
>>> > gentoo.
>> I love that you qualify this with "theoretically."
>>
>>> >
>>> > To be clear, when i say complete, Im referring to building, all
>>> > versions of all ebuilds marked stable or unstable on amd64, with every
>>> > combination of use flags.
>> Every ebuild with every combination of USE flags? This is likely
>> impossible, and definitely not feasible.
>
> A sentence: flameyes' blog describes just how long it takes to do basic
> runs and the difficulties attached
>

To be fair, this project wouldn't have to deal with all the error
reporting/etc which the tinderbox does have to deal with.  It also
won't be predominantly run in conditions where failures are
anticipated (new system packages, etc).  It also doesn't have to do
tests/etc, though that would obviously be nice.  Obviously it will
still take just as long to build.

Again, I suggest walking before running here.  Try building a binpkg
repository for @world with only kde-meta in the world file on the kde
desktop profile with no other changes other than # jobs/etc (or pick
gnome if you prefer).  See how much effort that takes to get working
(and keep up to date) and use that as a guide for what it will take to
go beyond that.  Just that would be very useful - it would be a great
tool for anybody who manages to break their toolchain or is dealing with
a very stale install.


-- 
Rich



Re: [gentoo-user] Open Question: The feasibility of a complete portage binhost

2015-01-21 Thread Alec Ten Harmsel
I actually had kind of a cool idea while walking to the bus stop this
morning; a JIT Portage server that builds packages on demand. This would
require:

* Writing a portage server
* Patching portage to connect to said server

Basically, `emerge ` would send a message to the server "I need
www-client/firefox-35.0[pulseaudio]". The server would return the
tarball if already built, otherwise build it and then return it. This
would be reasonably complex to implement in practice, but it would let
everybody using the same binhost run their own custom USE flags.
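
To make the idea concrete, here is a minimal in-memory sketch in Python of the "return it if built, otherwise build it" behaviour. Everything in it is hypothetical: `OnDemandBinhost`, `fetch` and `fake_build` are invented names, the build step is a stand-in that returns bytes instead of running emerge, and no real Portage API or network protocol is involved.

```python
from typing import Callable, Dict, FrozenSet, Tuple

Key = Tuple[str, FrozenSet[str]]


class OnDemandBinhost:
    """Caches one built package per (package-version, USE set) pair."""

    def __init__(self, build: Callable[[str, FrozenSet[str]], bytes]) -> None:
        self._build = build
        self._cache: Dict[Key, bytes] = {}
        self.builds = 0  # number of real builds performed so far

    def fetch(self, cpv: str, use: FrozenSet[str]) -> bytes:
        key = (cpv, use)
        if key not in self._cache:
            # Not built yet for this exact USE set: build once and remember it.
            self._cache[key] = self._build(cpv, use)
            self.builds += 1
        return self._cache[key]


def fake_build(cpv: str, use: FrozenSet[str]) -> bytes:
    """Stand-in for a real build; returns a label instead of a tarball."""
    return f"{cpv}[{','.join(sorted(use))}]".encode()


host = OnDemandBinhost(fake_build)
first = host.fetch("www-client/firefox-35.0", frozenset({"pulseaudio"}))
second = host.fetch("www-client/firefox-35.0", frozenset({"pulseaudio"}))
other = host.fetch("www-client/firefox-35.0", frozenset())
print(host.builds)
```

Keying on the whole USE set means two clients with identical flags share a build, while any difference in flags triggers a new one.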

Re more accurate numbers: dev-java/icedtea. Let's pretend building this
takes ~5 minutes (this is faster than my desktop can do it in RAM with 6
hyper-threaded cores). There are 13 USE flags that are configurable if
you're using HotSpot; we'll ignore JamVM and CACAO. On a single server,
this would take nearly a month (28.44 days, exactly).
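
The icedtea figure written out as a Python sketch, using only the numbers above (13 flags, 5 minutes per build) and assuming all 13 flags are independent on/off switches:

```python
flags = 13
minutes_per_build = 5

combinations = 2 ** flags                      # every on/off combination
total_minutes = combinations * minutes_per_build
days = total_minutes / 60 / 24

print(f"{combinations} builds, {days:.2f} days on a single server")
```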

Alec



Re: [gentoo-user] Open Question: The feasibility of a complete portage binhost

2015-01-21 Thread Andreas K. Huettel
On Wednesday, 21 January 2015, 20:36:55, Sam Bishop wrote:
> So I've been thinking crazy thoughts.
> 
> Theoretically it can't be that hard to do a complete package binhost for
> gentoo.
> 
> To be clear, when i say complete, Im referring to building, all
> versions of all ebuilds marked stable or unstable on amd64, with every
> combination of use flags.

Not enough. You will also have to build against every combination of 
dependency subslots.

e.g., different poppler, boost, icu, perl and many more versions...

Which makes the task near impossible.

-- 
Andreas K. Huettel
Gentoo Linux developer
kde, council




Re: [gentoo-user] Open Question: The feasibility of a complete portage binhost

2015-01-21 Thread Alan McKinnon
On 21/01/2015 17:42, Rich Freeman wrote:
> On Wed, Jan 21, 2015 at 10:17 AM, Alan McKinnon wrote:
>> On 21/01/2015 15:23, Alec Ten Harmsel wrote:
>>> On 01/21/2015 07:47 AM, Sam Bishop wrote:
>>>> > So I've been thinking crazy thoughts.
>>>> >
>>>> > Theoretically it can't be that hard to do a complete package binhost for
>>>> > gentoo.
>>> I love that you qualify this with "theoretically."
>>>
>>>> >
>>>> > To be clear, when i say complete, Im referring to building, all
>>>> > versions of all ebuilds marked stable or unstable on amd64, with every
>>>> > combination of use flags.
>>> Every ebuild with every combination of USE flags? This is likely
>>> impossible, and definitely not feasible.
>>
>> A sentence: flameyes' blog describes just how long it takes to do basic
>> runs and the difficulties attached
>>
> 
> To be fair, this project wouldn't have to deal with all the error
> reporting/etc which the tinderbox does have to deal with.  It also
> won't be predominantly run in conditions where failures are
> anticipated (new system packages, etc).  It also doesn't have to do
> tests/etc, though that would obviously be nice.  Obviously it will
> still take just as long to build.


To be equally fair, I was responding to the OP's idea that it is
feasible to do this:

"To be clear, when i say complete, Im referring to building, all
versions of all ebuilds marked stable or unstable on amd64, with every
combination of use flags."

That is well-nigh impossible in any reasonable time frame. How many
packages in the tree? My trusty find command and some guessing tell me
around 18,000, plus 8309 lines in profiles/use.*. I shudder to think how
much compiling that will take.

I mentioned Diego's tinderbox because that's a real-life example of
building everything in a build-host type environment and how long it
takes to compile just one run.

> 
> Again, I suggest walking before running here.  Try building a binpkg
> repository for @world with only kde-meta in the world file on the kde
> desktop profile with no other changes other than # jobs/etc (or pick
> gnome if you prefer).  See how much effort that takes to get working
> (and keep up to date) and use that as a guide for what it will take to
> go beyond that.  Just that would be very useful - it would be a great
> tool for anybody who manages to break their toolchains or dealing with
> a very stale install.


Agreed. I think what would be useful in real life would be binpkgs for
each profile in the tree with default USE for each, done once a week or
once a fortnight. Think in terms of stage3 raised to the next level.
Useful for getting oneself out of a jam - it's quite surprising how many
people have deleted gcc or all versions of python then come here for
advice. Usually they get told to unpack the package from stage3 in a
chroot - recent binpkgs are a cool nice-to-have.



-- 
Alan McKinnon
alan.mckin...@gmail.com




Re: [gentoo-user] Open Question: The feasibility of a complete portage binhost

2015-01-22 Thread Sam Bishop
On 21 January 2015 at 22:44, Rich Freeman  wrote:
> On Wed, Jan 21, 2015 at 9:00 AM, Sam Bishop  wrote:
>>
>> I don't see why it can't be all the combinations, the issue is
>> storage, and the storage costs could be a lot lower than expected
>> given how hard it is to guess.
>
> I don't believe that binpkg filenames contain the use flag settings,
> and I'm not sure that given multiple copies of a binpkg with different
> filenames portage goes through them and figures out which ones are
> which.  This isn't an area I have looked into seriously.  However, it
> obviously would be a blocker for getting what you propose to work,
> even theoretically.
>

I'll quote from the binpkg docs:
>> Next to these, portage will check if the binary package is built using the 
>> same USE flags as expected on the client. If a package is built with a 
>> different USE flag combination, portage will either ignore the binary 
>> package (and use source-based build) or fail, depending on the options 
>> passed on to emerge

So I'm fairly sure that implies they can coexist based on the
directory structure. -
http://wiki.gentoo.org/wiki/Binary_package_guide#The_PKGDIR_layout

One big concern would be having a HUGE Packages metadata file and
making the look up too slow. I'm not sure how big that file could get
before things became an issue.
http://wiki.gentoo.org/wiki/Binary_package_guide#Pulling_packages_from_a_binary_package_host

>
> I don't really see the value in having EVERY combination of use flags
> on call though.
>
> Practically speaking I doubt this could be done.  You're talking about
> a LOT of combinations.
>
> However, I think it would be very useful to have a binpkg repository
> all the same.  Perhaps have one for each of a few common profiles with
> the default flags.  That alone would be a significant undertaking.
>
> Just about everybody who has talked about running Gentoo in a
> datacenter has set up a binpkg repository.  They may very well deviate
> from the default USE flags, but for the most part they try to keep
> their systems identical.  They would build updates as binpkg, install
> to a test system, and after testing deploy them to production and that
> would of course go quickly.
>
> I have a script I use to build binpkg nightly for the day's updates.
> That lets me review updates and deploy them quickly.  Any rebuilds/etc
> still take time, but the bulk of my updates are very fast this way
> with minimal time spent staring at the screen.  This would be another
> route to take if your really did need highly varied deployments.
>
> --
> Rich
>



Re: [gentoo-user] Open Question: The feasibility of a complete portage binhost

2015-01-22 Thread Sam Bishop
On 22 January 2015 at 01:54, Andreas K. Huettel  wrote:
> On Wednesday, 21 January 2015, 20:36:55, Sam Bishop wrote:
>> So I've been thinking crazy thoughts.
>>
>> Theoretically it can't be that hard to do a complete package binhost for
>> gentoo.
>>
>> To be clear, when i say complete, Im referring to building, all
>> versions of all ebuilds marked stable or unstable on amd64, with every
>> combination of use flags.
>
> Not enough. You will also have to build against every combination of
> dependency subslots.
>
> e.g., different poppler, boost, icu, perl and many more versions...
>
> Which makes the task near impossible.
>

Not impossible, just more computationally demanding and requiring more
storage. As I mentioned in another post, it's not the task of building
and storing all these that I think will be the problem. It's whether
portage/emerge can handle this. Is the current implementation of binhost
inadequate to deal with such a massive binhost? Would it require new
utilities or code, or a new version of the binhost metadata format?
These are the kinds of things I feel make it challenging, not the
simple demand for compute and storage. Those are a rather moot point
when S3 is pennies a gigabyte and an AWS spot instance powered compile
farm can be obtained relatively cheaply and, if gifted to the Gentoo
community, even run at discount prices.

> --
> Andreas K. Huettel
> Gentoo Linux developer
> kde, council
>
>



Re: [gentoo-user] Open Question: The feasibility of a complete portage binhost

2015-01-22 Thread Sam Bishop
On 21 January 2015 at 23:53, Alec Ten Harmsel  wrote:
> I actually had kind of a cool idea while walking to the bus stop this
> morning; a JIT Portage server that builds packages on demand. This would
> require:
>
> * Writing a portage server
> * Patching portage to connect to said server
>

Or... before integrating it into portage, it could be a wrapper; let's
call it 'premerge' for the sake of example.
Calling premerge www-client/firefox-35.0[pulseaudio] would unpack the
arguments, work out the relevant metadata (perhaps by parsing the
output of emerge -pv), fetch the binaries from the big storage
pool they live in, put them in the correct place for portage to find
them, and then call portage in such a way that it can find the prebuilt
binary version we just provided. If emerge itself is incapable of
handling such a large collection of prebuilt binary packages, then I'm
likely to explore this route for a while.
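
As a sketch of the first step only (unpacking the argument), here is how a hypothetical 'premerge' might split a request such as www-client/firefox-35.0[pulseaudio] into a package-version string and its USE flags. This is plain Python with a regular expression; `parse_request` is an invented name, the "-flag means disabled" convention is an assumption, and the sketch neither calls emerge nor uses Portage's own atom parser.

```python
import re
from typing import Set, Tuple

# package-version, optionally followed by a [flag,-flag,...] suffix
_REQUEST = re.compile(r"^(?P<cpv>[^\[\]\s]+)(?:\[(?P<use>[^\[\]]*)\])?$")


def parse_request(text: str) -> Tuple[str, Set[str], Set[str]]:
    """Return (cpv, enabled_flags, disabled_flags) for one request string."""
    match = _REQUEST.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognised request: {text!r}")
    enabled: Set[str] = set()
    disabled: Set[str] = set()
    for flag in (match.group("use") or "").split(","):
        flag = flag.strip()
        if not flag:
            continue
        if flag.startswith("-"):
            disabled.add(flag[1:])   # assumed convention: "-flag" is disabled
        else:
            enabled.add(flag)
    return match.group("cpv"), enabled, disabled


print(parse_request("www-client/firefox-35.0[pulseaudio]"))
```

The resulting tuple could then serve as the lookup key into whatever storage pool holds the prebuilt packages.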

> Basically, `emerge ` would send a message to the server "I need
> www-client/firefox-35.0[pulseaudio]". The server would return the
> tarball if already built, otherwise build it and then return it. This
> would be reasonably complex to implement in practice, but it would let
> everybody using the same binhost to run their own custom USE flags.
>
> Re more accurate numbers: dev-java/icedtea. Let's pretend building this
> takes ~5 minutes (this is faster than my desktop can do it in RAM with 6
> hyper-threaded cores). There are 13 USE flags that are configurable if
> you're using HotSpot; we'll ignore JamVM and CACAO. On a single server,
> this would take nearly a month (28.44 days, exactly).
>
> Alec
>



Re: [gentoo-user] Open Question: The feasibility of a complete portage binhost

2015-01-22 Thread Neil Bothwick
On Thu, 22 Jan 2015 16:43:32 +0800, Sam Bishop wrote:

> I'll quote from the binpkg docs:
> >> Next to these, portage will check if the binary package is built
> >> using the same USE flags as expected on the client. If a package is
> >> built with a different USE flag combination, portage will either
> >> ignore the binary package (and use source-based build) or fail,
> >> depending on the options passed on to emerge  
> 
> So I'm fairly sure that implies they can coexist based on the
> directory structure. -
> http://wiki.gentoo.org/wiki/Binary_package_guide#The_PKGDIR_layout

The package name is the same as the ebuild name but with a .tbz2
extension, so how could portage cope with multiple variants with
different USE flags when there is only one name? There can be only one
package per ebuild and either the USE flags match exactly or they do not.

You could get away with this with a limited set of profiles by having a
different $PKGDIR for each profile but to do it with random combinations
would require some sort of middleware to handle the requests and place
the specified packages where portage expects to find them.

I think the check for USE flags is done using the IUSE and USE settings
in the package metadata, so even if a USE flag you don't use is added to
an ebuild, the package will no longer match. ISTR having to hack metadata
in /var/db in the past to avoid a rebuild of *Office.
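
The matching rule as I remember it, written as a Python sketch. This illustrates the behaviour described above, not Portage's actual code: `binpkg_matches` is an invented name, and the exact comparison Portage performs may differ.

```python
from typing import Iterable


def binpkg_matches(built_iuse: Iterable[str], built_use: Iterable[str],
                   wanted_iuse: Iterable[str], wanted_use: Iterable[str]) -> bool:
    """True only if the package was built from an ebuild offering the same
    flags (IUSE) and with the same flags enabled (USE) as the client wants."""
    return (frozenset(built_iuse) == frozenset(wanted_iuse)
            and frozenset(built_use) == frozenset(wanted_use))


# Same ebuild flags, same enabled flags: the binary package is usable.
same = binpkg_matches({"a", "b"}, {"a"}, {"a", "b"}, {"a"})

# The ebuild gained a flag "c" that the client does not even enable:
# under this rule the old binary package no longer matches.
added_flag = binpkg_matches({"a", "b"}, {"a"}, {"a", "b", "c"}, {"a"})

print(same, added_flag)
```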

 
-- 
Neil Bothwick

When companies ship Styrofoam, what do they pack it in?




Re: [gentoo-user] Open Question: The feasibility of a complete portage binhost

2015-01-22 Thread Sam Bishop
On 22 January 2015 at 17:00, Neil Bothwick  wrote:
> On Thu, 22 Jan 2015 16:43:32 +0800, Sam Bishop wrote:
>
>> I'll quote from the binpkg docs:
>> >> Next to these, portage will check if the binary package is built
>> >> using the same USE flags as expected on the client. If a package is
>> >> built with a different USE flag combination, portage will either
>> >> ignore the binary package (and use source-based build) or fail,
>> >> depending on the options passed on to emerge
>>
>> So I'm fairly sure that implies they can coexist based on the
>> directory structure. -
>> http://wiki.gentoo.org/wiki/Binary_package_guide#The_PKGDIR_layout
>
> The package name is the same as the ebuild name but with a .tbz2
> extension, so how could portage cope with multiple variants with
> different USE flags when there is only one name? There can be only one
> package per ebuild and either the USE flags match exactly or they do not.
>
> You could get away with this with a limited set of profiles by having a
> different $PKGDIR for each profile but to do it with random combinations
> would require some sort of middleware to handle the requests and place
> the specified packages where portage expects to find them.
>
> I think the check for USE flags is done using the IUSE and USE settings
> in the package metadata, so even if a USE flag you don't use is added to
> an ebuild, the package will no longer match. ISTR having to hack metadata
> in /var/db in the past to avoid a rebuild of *Office.
>

Thank you kindly, Neil. Your rephrasing of what was right in front of my
face in the docs finally led to the lightbulb going off. Happens to
all of us, I suppose. The PKGDIR layout diagram isn't implying multiple
versions of a single package; it is referring to multiple packages
with a numeric shorthand. So this would require middleware, wrappers,
or improvements to portage to cope with having overlapping packages
like this. Interim functionality could be achieved with separate
binhost directories for each of the baseline profiles with their
default USE flags. Once the infrastructure was stable, work could
be undertaken to build some kind of wrapper or enhancement to
portage.

>
> --
> Neil Bothwick
>
> When companies ship Styrofoam, what do they pack it in?



Re: [gentoo-user] Open Question: The feasibility of a complete portage binhost

2015-01-22 Thread Bruce Schultz


On 22 January 2015 7:20:07 PM AEST, Sam Bishop  wrote:
>On 22 January 2015 at 17:00, Neil Bothwick  wrote:
>> On Thu, 22 Jan 2015 16:43:32 +0800, Sam Bishop wrote:
>>
>>> I'll quote from the binpkg docs:
>>> >> Next to these, portage will check if the binary package is built
>>> >> using the same USE flags as expected on the client. If a package is
>>> >> built with a different USE flag combination, portage will either
>>> >> ignore the binary package (and use source-based build) or fail,
>>> >> depending on the options passed on to emerge
>>>
>>> So I'm fairly sure that implies they can coexist based on the
>>> directory structure. -
>>> http://wiki.gentoo.org/wiki/Binary_package_guide#The_PKGDIR_layout
>>
>> The package name is the same as the ebuild name but with a .tbz2
>> extension, so how could portage cope with multiple variants with
>> different USE flags when there is only one name? There can be only one
>> package per ebuild and either the USE flags match exactly or they do not.
>>
>> You could get away with this with a limited set of profiles by having a
>> different $PKGDIR for each profile but to do it with random combinations
>> would require some sort of middleware to handle the requests and place
>> the specified packages where portage expects to find them.
>>
>> I think the check for USE flags is done using the IUSE and USE settings
>> in the package metadata, so even if a USE flag you don't use is added to
>> an ebuild, the package will no longer match. ISTR having to hack metadata
>> in /var/db in the past to avoid a rebuild of *Office.
>>
>
>Thank you kindly Neil. You rephrasing what was right in front of my
>face in the docs finally lead to the lightbulb going off. Happens to
>all of us I suppose. The pkdir layout diagram isn't implying multiple
>versions of a single package, it is referring to multiple packages
>with a numeric shorthand. So this would require middleware, wrappers,
>or improvements to portage to cope with having overlapping packages
>like this. So interim functionality could be achieved with separate
>bin hosts directories for each of the baseline profiles with their
>default use flags. Once the infrastructure was stable then work could
>be undertaken to build some kind of wrapper, or enhancement to
>portage.

There was a discussion recently on the portage-dev list regarding storing 
multiple versions with different use flags in a pkgdir. There's an open bug in 
bugzilla too, I believe, but I cannot find the reference right now; if I can 
I'll follow up.

I think the summary was that the Packages file is able to index multiple 
versions of a package, but the tooling to create and manage packages needs some 
improvement. (Don't quote me on that though!)


>
>>
>> --
>> Neil Bothwick
>>
>> When companies ship Styrofoam, what do they pack it in?

-- 
:b



Re: [gentoo-user] Open Question: The feasibility of a complete portage binhost

2015-01-22 Thread Bruce Schultz


On 22 January 2015 8:50:29 PM AEST, Bruce Schultz  wrote:
>
>
>On 22 January 2015 7:20:07 PM AEST, Sam Bishop wrote:
>>On 22 January 2015 at 17:00, Neil Bothwick  wrote:
>>> On Thu, 22 Jan 2015 16:43:32 +0800, Sam Bishop wrote:
>>>
>>>> I'll quote from the binpkg docs:
>>>> >> Next to these, portage will check if the binary package is built
>>>> >> using the same USE flags as expected on the client. If a package is
>>>> >> built with a different USE flag combination, portage will either
>>>> >> ignore the binary package (and use source-based build) or fail,
>>>> >> depending on the options passed on to emerge
>>>>
>>>> So I'm fairly sure that implies they can coexist based on the
>>>> directory structure. -
>>>> http://wiki.gentoo.org/wiki/Binary_package_guide#The_PKGDIR_layout
>>>
>>> The package name is the same as the ebuild name but with a .tbz2
>>> extension, so how could portage cope with multiple variants with
>>> different USE flags when there is only one name? There can be only one
>>> package per ebuild and either the USE flags match exactly or they do not.
>>>
>>> You could get away with this with a limited set of profiles by having a
>>> different $PKGDIR for each profile but to do it with random combinations
>>> would require some sort of middleware to handle the requests and place
>>> the specified packages where portage expects to find them.
>>>
>>> I think the check for USE flags is done using the IUSE and USE settings
>>> in the package metadata, so even if a USE flag you don't use is added to
>>> an ebuild, the package will no longer match. ISTR having to hack metadata
>>> in /var/db in the past to avoid a rebuild of *Office.
>>>
>>
>>Thank you kindly Neil. You rephrasing what was right in front of my
>>face in the docs finally lead to the lightbulb going off. Happens to
>>all of us I suppose. The pkdir layout diagram isn't implying multiple
>>versions of a single package, it is referring to multiple packages
>>with a numeric shorthand. So this would require middleware, wrappers,
>>or improvements to portage to cope with having overlapping packages
>>like this. So interim functionality could be achieved with separate
>>bin hosts directories for each of the baseline profiles with their
>>default use flags. Once the infrastructure was stable then work could
>>be undertaken to build some kind of wrapper, or enhancement to
>>portage.
>
>There was a discussion recently on the portage-dev list regarding
>storing multiple versions with different use flags in a pkgdir. There's
>an open bug in bugzilla too, I believe, but I cannot find the reference
>right now; if I can I'll follow up.
>
>I think the summary was that the Packages file is able to index
>multiple versions of a package, but the tooling to create and manage
>packages needs some improvement. (Don't quote me on that though!)

Found it
http://thread.gmane.org/gmane.linux.gentoo.portage.devel/5031
https://bugs.gentoo.org/show_bug.cgi?id=150031


-- 
:b



Re: [gentoo-user] Open Question: The feasibility of a complete portage binhost

2015-01-22 Thread thegeezer
On 22/01/15 15:46, Jc García wrote:
> 2015-01-22 6:11 GMT-06:00 Andreas K. Huettel:
>> On Thursday 22 January 2015, 16:50:45, Sam Bishop wrote:
>>> On 22 January 2015 at 01:54, Andreas K. Huettel wrote:
>>>> On Wednesday 21 January 2015, 20:36:55, Sam Bishop wrote:
>>>>> So I've been thinking crazy thoughts.
>>>>>
>>>>> Theoretically it can't be that hard to do a complete package binhost for
>>>>> gentoo.
>>>>>
>>>>> To be clear, when i say complete, Im referring to building, all
>>>>> versions of all ebuilds marked stable or unstable on amd64, with every
>>>>> combination of use flags.
>>>> Not enough. You will also have to build against every combination of
>>>> dependency subslots.
>>>>
>>>> e.g., different poppler, boost, icu, perl and many more versions...
>>>>
>>>> Which makes the task near impossible.
>>> Not impossible, just more computationally demanding and requiring more
>>> storage.
>> Well, exponential increase is exponential increase.
>>
>> * A libreoffice binary package with debug information has roughly 800Mbyte
>> size
>> * 2 libreoffice versions in the tree
>> * libreoffice links against poppler, icu, boost (among other things)
>> * poppler: 5 subslots, icu (soon) 3 subslots, boost 5 subslots in tree -> 75
>> combinations
>> * libreoffice has 22 useflags and 4 extensions, plus three supported python
>> variants -> 29 switches
>> * REQUIRED_USE limits your combinations, let's conservatively guess 25
>> independent switches -> 2^25=33554432 use combinations
>>
> Based on this:
> If it took 1 minute (being more than optimistic) to build libreoffice:
> 33554432 builds * 1 min = roughly 64 years of building
>
> If one wanted to build all of that in a day, one would need to rent
> about 23302 super fast boxes and keep them heating all day long. That is
> just for libreoffice, leaving the storage problem aside. If we now think
> about firefox, chromium and the webkit packages, I think that makes for a
> good analogy of hell, and a terrible waste of resources.
>

My 2c:
What if, instead of one person doing all the compiling and storage, we
had "cc-emerge", which would stand for "cloud contributor emerge"?
It would be a wrapper / slightly modified emerge that always builds
binary packages, with a postinstall hook that bundles each package
with a "builtwith.ini" holding parsable build details. That matters
because, on top of the slots, the USE flags and the 32/64-bit variants,
you also have pluggable compilers, and even CHOST can differ.
Once the built package is bundled with its builtwith.ini, it would
be sent up to AWS for further analysis.
This would allow some interesting feedback:
1. most popular compilers
2. most common USE flags for packages
3. if no one is using a specific USE flag, then why bother having it in
   the communal binhost?
I'm not saying that folks using something odd should be expunged,
but it would possibly give devs some interesting feedback.
It might also help to streamline tinderboxing, as you could compare your
compiled version with the communal version.
It would also tap into (voluntarily, of course) our collective compiling
time.
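
To make the "builtwith.ini" idea concrete, here is a minimal sketch of what
such a file could contain. Nothing like this exists in portage; the function
name, the section names and the keys are all assumptions made up for
illustration.

```python
import configparser
import io


def render_builtwith(cpv, use_flags, chost, compiler, abi):
    """Render a hypothetical builtwith.ini describing how a package was built."""
    parser = configparser.ConfigParser()
    # What was built: package atom plus the enabled USE flags, sorted so that
    # two identical configurations always produce identical text.
    parser["package"] = {"cpv": cpv, "use": " ".join(sorted(use_flags))}
    # How it was built: the toolchain details that make binaries differ.
    parser["toolchain"] = {"chost": chost, "compiler": compiler, "abi": abi}
    buffer = io.StringIO()
    parser.write(buffer)
    return buffer.getvalue()


if __name__ == "__main__":
    # "app-misc/foo-1.0" is a placeholder atom, not a real package.
    print(render_builtwith("app-misc/foo-1.0", {"ssl", "ipv6"},
                           "x86_64-pc-linux-gnu", "gcc-4.8.3", "amd64"))
```

Sorting the flags keeps the file stable across contributors, which would make
it easier to group uploads that were built the same way.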

With the billions and growing number of Gentoo users ;) we should be
able to crank out a communal binhost.
Once that is there and it can be queried and indexed, we could have some
fun ensuring that all builds are built the same across the board.
We could also then have cc-emerge do a lookup to see if someone else has
already compiled a package and choose to download it, or only download it
if 100 folks have compiled it the same way (same hardware, same USE flags,
etc.) and all 100 checksums are the same.
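
The "only download if 100 folks agree" rule could be sketched roughly as
below. This is not an existing tool: `BuildReport`, `trusted_checksum` and
every field name are hypothetical, and the sketch assumes that identical
inputs produce bit-for-bit identical packages, which the thread does not
establish.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class BuildReport:
    """One contributor's upload; every field name here is hypothetical."""
    cpv: str          # category/package-version, e.g. "app-misc/foo-1.0"
    use: frozenset    # enabled USE flags
    chost: str        # e.g. "x86_64-pc-linux-gnu"
    compiler: str     # e.g. "gcc-4.8.3"
    sha256: str       # checksum of the resulting binary package

    def config_key(self):
        # Builds are only comparable when every input matches.
        return (self.cpv, self.use, self.chost, self.compiler)


def trusted_checksum(reports: Iterable[BuildReport], wanted: tuple,
                     min_reports: int = 100) -> Optional[str]:
    """Return the agreed checksum for `wanted`, or None without consensus."""
    by_config = defaultdict(list)
    for report in reports:
        by_config[report.config_key()].append(report.sha256)
    checksums = by_config.get(wanted, [])
    if len(checksums) < min_reports:
        return None   # not enough independent builds yet
    if len(set(checksums)) != 1:
        return None   # contributors disagree, so trust none of them
    return checksums[0]
```

A client would build locally whenever this returns None, and otherwise fetch
the package and verify it against the returned checksum.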





Re: Re: [gentoo-user] Open Question: The feasibility of a complete portage binhost

2015-01-22 Thread Andreas K. Huettel
On Thursday 22 January 2015, 16:50:45, Sam Bishop wrote:
> On 22 January 2015 at 01:54, Andreas K. Huettel wrote:
> > On Wednesday 21 January 2015, 20:36:55, Sam Bishop wrote:
> >> So I've been thinking crazy thoughts.
> >> 
> >> Theoretically it can't be that hard to do a complete package binhost for
> >> gentoo.
> >> 
> >> To be clear, when i say complete, Im referring to building, all
> >> versions of all ebuilds marked stable or unstable on amd64, with every
> >> combination of use flags.
> > 
> > Not enough. You will also have to build against every combination of
> > dependency subslots.
> > 
> > e.g., different poppler, boost, icu, perl and many more versions...
> > 
> > Which makes the task near impossible.
> 
> Not impossible, just more computationally demanding and requiring more
> storage. 

Well, exponential increase is exponential increase. 

* A libreoffice binary package with debug information has roughly 800Mbyte
  size
* 2 libreoffice versions in the tree
* libreoffice links against poppler, icu, boost (among other things)
* poppler: 5 subslots, icu (soon) 3 subslots, boost 5 subslots in tree
  -> 75 combinations
* libreoffice has 22 useflags and 4 extensions, plus three supported python
  variants -> 29 switches
* REQUIRED_USE limits your combinations, let's conservatively guess 25
  independent switches -> 2^25=33554432 use combinations

Which ends up with roughly 2 Exabyte (2 * 10^9 GByte) of storage for all
packages of a single libreoffice version.
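
As a quick check of the storage estimate, the figures listed above can be
multiplied out. This is only arithmetic on the numbers in this message; the
variable names are made up for the sketch, and 1 Exabyte is taken as
10^9 GByte.

```python
# Figures quoted in the message above.
PACKAGE_SIZE_GB = 0.8          # roughly 800 MByte per package with debug info
VERSIONS = 2                   # libreoffice versions in the tree
SUBSLOT_COMBOS = 5 * 3 * 5     # poppler * icu * boost subslots = 75
USE_COMBOS = 2 ** 25           # 25 independent switches = 33554432


def storage_gb(versions):
    """Total size in GByte of every libreoffice variant for `versions` versions."""
    return PACKAGE_SIZE_GB * versions * SUBSLOT_COMBOS * USE_COMBOS


per_version_eb = storage_gb(1) / 1e9      # 1 Exabyte = 10^9 GByte
both_eb = storage_gb(VERSIONS) / 1e9

print(f"{per_version_eb:.2f} EB for one version")
print(f"{both_eb:.2f} EB for both versions")
```

With these inputs, one version comes to about 2 Exabyte and both versions to
about 4 Exabyte, for libreoffice alone.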

-- 
Andreas K. Huettel
Gentoo Linux developer
kde, council




Re: Re: [gentoo-user] Open Question: The feasibility of a complete portage binhost

2015-01-22 Thread Jc García
2015-01-22 6:11 GMT-06:00 Andreas K. Huettel:
> On Thursday 22 January 2015, 16:50:45, Sam Bishop wrote:
>> On 22 January 2015 at 01:54, Andreas K. Huettel wrote:
>> > On Wednesday 21 January 2015, 20:36:55, Sam Bishop wrote:
>> >> So I've been thinking crazy thoughts.
>> >>
>> >> Theoretically it can't be that hard to do a complete package binhost for
>> >> gentoo.
>> >>
>> >> To be clear, when i say complete, Im referring to building, all
>> >> versions of all ebuilds marked stable or unstable on amd64, with every
>> >> combination of use flags.
>> >
>> > Not enough. You will also have to build against every combination of
>> > dependency subslots.
>> >
>> > e.g., different poppler, boost, icu, perl and many more versions...
>> >
>> > Which makes the task near impossible.
>>
>> Not impossible, just more computationally demanding and requiring more
>> storage.
>
> Well, exponential increase is exponential increase.
>
> * A libreoffice binary package with debug information has roughly 800Mbyte
> size
> * 2 libreoffice versions in the tree
> * libreoffice links against poppler, icu, boost (among other things)
> * poppler: 5 subslots, icu (soon) 3 subslots, boost 5 subslots in tree -> 75
> combinations
> * libreoffice has 22 useflags and 4 extensions, plus three supported python
> variants -> 29 switches
> * REQUIRED_USE limits your combinations, let's conservatively guess 25
> independent switches -> 2^25=33554432 use combinations
>
Based on this:
If it took 1 minute (being more than optimistic) to build libreoffice:
33554432 builds * 1 min = roughly 64 years of building

If one wanted to build all of that in a day, one would need to rent
about 23302 super fast boxes and keep them heating all day long. That is
just for libreoffice, leaving the storage problem aside. If we now think
about firefox, chromium and the webkit packages, I think that makes for a
good analogy of hell, and a terrible waste of resources.
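
The build-time figures can be reproduced with a few lines of arithmetic. This
is a sketch using only the numbers from this message (2^25 USE combinations,
1 minute per build); the variable names are made up, and subslot combinations
and the second libreoffice version are left out, as they are above.

```python
import math

USE_COMBOS = 2 ** 25        # 33554432 builds, from the quoted estimate
MINUTES_PER_BUILD = 1       # the "more than optimistic" figure

total_minutes = USE_COMBOS * MINUTES_PER_BUILD

# Time on a single machine building one package after another.
years = total_minutes / (60 * 24 * 365)

# Machines needed to finish within 24 hours, rounded up to whole boxes.
machines_for_one_day = math.ceil(total_minutes / (60 * 24))

print(f"{years:.1f} years on one machine")
print(f"{machines_for_one_day} machines to finish in one day")
```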