Re: Sparc build failure analysis (was Re: StrongARM tactics)

2005-12-11 Thread Ingo Juergensmann
On Sun, Dec 11, 2005 at 05:30:24AM -0500, Kevin Mark wrote:

> Has anyone ever considered a check in the buildd infrastructure to
> alert someone (buildd admin and/or others) if a build is taking too long
> (e.g. openoffice usually takes 2-3 hours to build and the current
> build has been running for 10+ hours)? It could use a database of
> previous build times, or at least the last build time. That would
> keep a buildd from being tied up by an obvious build issue, so the
> issue gets addressed sooner and buildd throughput improves.

You mean something like the stats on http://www.buildd.net/ - for example:
http://buildd.net/cgi/ptracker.cgi?unstable_pkg=aptitude&searchtype=m68k

I intend to give some sort of "build is overdue" warning once I have enough
data to calculate when a build is overdue ;)
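
To illustrate the kind of check being discussed, here is a minimal Python
sketch. It is not how buildd.net works; the function name is_overdue, the
factor of 3 and the minimum of 3 samples are all assumptions made up for the
example. It flags a build as overdue when its running time exceeds a multiple
of the median of earlier build times, and stays silent when there is too
little history to judge.

```python
from statistics import median


def is_overdue(previous_hours, elapsed_hours, factor=3.0, min_samples=3):
    """Return True if the running build looks overdue.

    previous_hours: durations of earlier builds of the package, in hours.
    elapsed_hours:  how long the current build has been running.
    factor and min_samples are arbitrary illustrative thresholds.
    """
    if len(previous_hours) < min_samples:
        # Not enough history to judge, so do not raise a warning.
        return False
    return elapsed_hours > factor * median(previous_hours)


# Hypothetical history, loosely following the figures in the quoted mail.
history = [2.0, 2.5, 3.0]
print(is_overdue(history, 10.0))
print(is_overdue(history, 4.0))
print(is_overdue([2.0], 10.0))
```

Using the median rather than the mean keeps one abnormally slow earlier build
from inflating the threshold.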

-- 
Ciao...//Fon: 0381-2744150 
  Ingo   \X/ SIP: [EMAIL PROTECTED]

gpg pubkey: http://www.juergensmann.de/ij/public_key.asc




Re: Sparc build failure analysis (was Re: StrongARM tactics)

2005-12-11 Thread Kurt Roeckx
On Sun, Dec 11, 2005 at 05:55:23AM -0800, Steve Langasek wrote:
> 
> > Indeed, for practical buildd maintenance purposes, the distinction is
> > not that important -- though 'Failed' is known not to benefit from a
> > requeue, while 'Building:Maybe-Failed' might or might not; it's unknown.
> > Most archs should have enough surplus buildd power that retrying
> > everything once in a while doesn't hurt.
> 
> > The major benefit, though, is to make it apparent to porters what to look
> > into, without all the 'noise' of maybe-transient failures in between.
> > One could also make sure that the FTBFS bugs are tagged (user-tagged)
> > with [EMAIL PROTECTED] (etc) for example (or [EMAIL PROTECTED] There
> > doesn't exist a [EMAIL PROTECTED] for example...), so that one can get a
> > nice overview of all the porting bugs. It'd make sense to synchronise
> > this across all architectures, so that it is consistent.
> 
> http://lists.debian.org/debian-alpha/2005/12/msg00028.html

I have a long list of bugs affecting amd64, but I haven't started
using usertags for it.

The (FTBFS) bugs I encounter (as buildd admin) are:
- General bugs affecting all arches.
- General bugs affecting 64 bit arches.
- Bugs affecting some arches (like not using -fPIC)
- Bugs only affecting amd64.

And the latter really is the minority of the problems.

Note that this does not cover runtime problems or anything like
that, but they're very similar.

Do we need to have a special usertag for the first kind?  This is
basically something everybody can look at.  The only reason I can think
of that it requires some tag is that it's better than looking at
those that don't have a tag.

So, I'm open for suggestions on how to tag the first 3 of those.
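
One possible way to derive such tags mechanically from the set of
architectures a package fails on is sketched below in Python. Everything in
it is an assumption for illustration: the tag names, the architecture lists
and the function classify are made up, and nothing like this exists in the
BTS as far as this mail is concerned.

```python
# Assumed architecture sets, for illustration only.
ALL_ARCHES = {"alpha", "amd64", "arm", "hppa", "i386", "ia64",
              "m68k", "mips", "mipsel", "powerpc", "s390", "sparc"}
SIXTYFOUR_BIT = {"alpha", "amd64", "ia64"}


def classify(failing, arch="amd64"):
    """Map the set of failing architectures to one of four hypothetical tags."""
    failing = set(failing)
    if failing == {arch}:
        return "only-" + arch      # bug only affecting this arch
    if failing >= ALL_ARCHES:
        return "all-arches"        # general bug affecting all arches
    if failing == SIXTYFOUR_BIT:
        return "64bit"             # general bug affecting 64 bit arches
    return "some-arches"           # e.g. not using -fPIC


print(classify({"amd64"}))
print(classify(ALL_ARCHES))
print(classify({"alpha", "amd64", "ia64"}))
print(classify({"amd64", "hppa"}))
```

The four return values correspond to the four kinds of bugs listed above, so
the first three are the ones that would need a tagging convention.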


Kurt


-- 
To UNSUBSCRIBE, email to [EMAIL PROTECTED]
with a subject of "unsubscribe". Trouble? Contact [EMAIL PROTECTED]



Re: Sparc build failure analysis (was Re: StrongARM tactics)

2005-12-11 Thread Steve Langasek
On Sun, Dec 11, 2005 at 02:38:35PM +0100, Jeroen van Wolffelaar wrote:
> On Sun, Dec 11, 2005 at 12:35:26AM -0800, Steve Langasek wrote:
> > On Sat, Dec 10, 2005 at 06:53:47AM -0800, Blars Blarson wrote:
> > > FAILED

> > But FAILED is an advisory state anyway; it doesn't directly benefit the
> > port, at all, to have the package listed as "Failed", this is just a
> > convenience for folks sifting through the build failures (like the rarely
> > used "confirmed" BTS tag is for maintainers).  In the long-term, one of two
> > things needs to happen with each of these build failures: the package needs
> > to be added to P-a-s so the arch no longer tries to build it, or the package
> > needs to be fixed -- via porter NMU if necessary.

> > So as you have the list of these packages, as a porter you can proceed with
> > figuring out which of the two categories each falls into, and take the
> > necessary action without worrying about the "Failed" state, yes?

> Indeed, for practical buildd maintenance purposes, the distinction is
> not that important -- though 'Failed' is known not to benefit from a
> requeue, while 'Building:Maybe-Failed' might or might not; it's unknown.
> Most archs should have enough surplus buildd power that retrying
> everything once in a while doesn't hurt.

> The major benefit, though, is to make it apparent to porters what to look
> into, without all the 'noise' of maybe-transient failures in between.
> One could also make sure that the FTBFS bugs are tagged (user-tagged)
> with [EMAIL PROTECTED] (etc) for example (or [EMAIL PROTECTED] There
> doesn't exist a [EMAIL PROTECTED] for example...), so that one can get a
> nice overview of all the porting bugs. It'd make sense to synchronise
> this across all architectures, so that it is consistent.

http://lists.debian.org/debian-alpha/2005/12/msg00028.html

Our porters can beat up your porters.

;)

-- 
Steve Langasek   Give me a lever long enough and a Free OS
Debian Developer   to set it on, and I can move the world.
[EMAIL PROTECTED]   http://www.debian.org/




Re: Sparc build failure analysis (was Re: StrongARM tactics)

2005-12-11 Thread Jeroen van Wolffelaar
On Sun, Dec 11, 2005 at 12:35:26AM -0800, Steve Langasek wrote:
> On Sat, Dec 10, 2005 at 06:53:47AM -0800, Blars Blarson wrote:
> > FAILED
> 
> But FAILED is an advisory state anyway; it doesn't directly benefit the
> port, at all, to have the package listed as "Failed", this is just a
> convenience for folks sifting through the build failures (like the rarely
> used "confirmed" BTS tag is for maintainers).  In the long-term, one of two
> things needs to happen with each of these build failures: the package needs
> to be added to P-a-s so the arch no longer tries to build it, or the package
> needs to be fixed -- via porter NMU if necessary.
> 
> So as you have the list of these packages, as a porter you can proceed with
> figuring out which of the two categories each falls into, and take the
> necessary action without worrying about the "Failed" state, yes?

Indeed, for practical buildd maintenance purposes, the distinction is
not that important -- though 'Failed' is known not to benefit from a
requeue, while 'Building:Maybe-Failed' might or might not; it's unknown.
Most archs should have enough surplus buildd power that retrying
everything once in a while doesn't hurt.

The major benefit, though, is to make it apparent to porters what to look
into, without all the 'noise' of maybe-transient failures in between.
One could also make sure that the FTBFS bugs are tagged (user-tagged)
with [EMAIL PROTECTED] (etc) for example (or [EMAIL PROTECTED] There
doesn't exist a [EMAIL PROTECTED] for example...), so that one can get a
nice overview of all the porting bugs. It'd make sense to synchronise
this across all architectures, so that it is consistent.

If that is done, and there would be some way for porters to easily tag
the build failures themselves on what bug they correspond with, or not,
and especially, what failures are new and are yet to be tagged, there'd
be an easy and clear workflow for porters to work on failures. I don't
think there has really been such a defined porter workflow for build
failures, and nobody so far has built/defined one to the best of my
knowledge. And this touches one of the core points Vancouver is intended
to solve: *porters* need to work on *porting*, and actively help
fix actual porting issues in the archive. If creating a better
interface for people to work on this is a part of achieving it, so be
it. I'll see whether I can hack something together for this,
extending buildd.d.o/~jeroen/status.
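
The core of such a workflow could be as simple as the following Python
sketch: given the current build failures and a mapping from failures to bug
numbers, list the failures that nobody has linked to a bug yet. The function
untagged_failures and the data layout are hypothetical, and this is not the
code behind the status page. In the sample data only the qterm entry and its
bug number come from this thread; examplepkg is invented.

```python
def untagged_failures(failures, bug_for_failure):
    """Return the failures that no porter has linked to a bug yet.

    failures:        iterable of (package, version) tuples.
    bug_for_failure: dict mapping (package, version) to a bug number.
    """
    return [f for f in failures if f not in bug_for_failure]


# Hypothetical data; examplepkg does not exist.
failures = [
    ("qterm", "0.4.0pre3-2+b1"),
    ("examplepkg", "1.0-1"),
]
bug_for_failure = {("qterm", "0.4.0pre3-2+b1"): 342381}

for package, version in untagged_failures(failures, bug_for_failure):
    print(package, version)
```

Keying on (package, version) rather than on the package alone means a new
upload that fails again shows up as a fresh, untagged failure.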

--Jeroen

-- 
Jeroen van Wolffelaar
[EMAIL PROTECTED] (also for Jabber & MSN; ICQ: 33944357)
http://Jeroen.A-Eskwadraat.nl





Re: Sparc build failure analysis (was Re: StrongARM tactics)

2005-12-11 Thread Kevin Mark
On Sat, Dec 10, 2005 at 06:53:47AM -0800, Blars Blarson wrote:
> In article <[EMAIL PROTECTED]> [EMAIL PROTECTED] writes:
> >On Tue, Dec 06, 2005 at 05:21:46PM -0800, Blars Blarson wrote:
> >> I can do the analyzing, but what should I do with the results?
> >> [EMAIL PROTECTED] seems to be a black hole.  You'll need to find
> >> someone willing to communicate with access to the buildd queues before
> >> the porters can do anything.
> >
> >I said that deciding which packages should belong in P-a-s is porter work;
> >as is filing bugs on failed packages that shouldn't, providing patches, and
> >doing porter NMUs if necessary.
> 
> Again: what can I do with such a list?  See the list below.
> 
> >If the porters do this effectively, there's really not much need at all for
> >telling the buildd maintainers about transient build failures, because
> >they'll be pretty obvious (and account for the majority of failures, as it
> >should be).
> 
> Just because it is obvious does not mean that the buildd administrator
> does the correct thing.  kq was "uploaded" 51 days ago, trustedqsl was
> "uploaded" 25 days ago, neither is in the archive.
> 
> openoffice.org has been "building" for 8 days; it only took 57 hours
> on my pbuilder, which is slower than any current sparc buildd.  kexi has
> been "building" for 6 days; it took less than 2 hours.
Hi Blars et al.,
Has anyone ever considered a check in the buildd infrastructure to
alert someone (buildd admin and/or others) if a build is taking too long
(e.g. openoffice usually takes 2-3 hours to build and the current
build has been running for 10+ hours)? It could use a database of
previous build times, or at least the last build time. That would
keep a buildd from being tied up by an obvious build issue, so the
issue gets addressed sooner and buildd throughput improves. I'd help,
but I have neither the skill nor access to the buildd infrastructure
(I'm not a DD or a buildd admin), so I try to offer ideas that I feel
are helpful.
Anyway, hope those buildds (and their admins) are humming along smoothly!
Cheers,
Kev




Re: Sparc build failure analysis (was Re: StrongARM tactics)

2005-12-11 Thread Steve Langasek
On Sat, Dec 10, 2005 at 06:53:47AM -0800, Blars Blarson wrote:
> >I said that deciding which packages should belong in P-a-s is porter work;
> >as is filing bugs on failed packages that shouldn't, providing patches, and
> >doing porter NMUs if necessary.

> Again: what can I do with such a list?  See the list below.

Changes to the P-a-s list should be sent to the contacts listed at the top
of this file (http://buildd.debian.org/quinn-diff/Packages-arch-specific).

> >If the porters do this effectively, there's really not much need at all for
> >telling the buildd maintainers about transient build failures, because
> >they'll be pretty obvious (and account for the majority of failures, as it
> >should be).

> Just because it is obvious does not mean that the buildd administrator
> does the correct thing.  kq was "uploaded" 51 days ago, trustedqsl was
> "uploaded" 25 days ago, neither is in the archive.

Well, release-wise, the practical impact here is quite small; for packages
that aren't needed in order to fix RC bugs, for my part I'm quite content to
have the buildd admins manage such give-backs on their own schedule.  Being
more responsive to give-back requests may generally make people happier, but
there's also a context-switch cost associated with such status polling; in
the non-RC cases, does it really hurt anything to *not* have these packages
given back quickly?  If not, isn't a better solution for people to be
understanding of this?

kq and pointless have been given back now, btw (not trustedqsl, which is
"Failed" rather than "Uploaded").

> openoffice.org has been "building" for 8 days; it only took 57 hours
> on my pbuilder, which is slower than any current sparc buildd.  kexi has
> been "building" for 6 days; it took less than 2 hours.

openoffice.org is listed as "building" because the buildd crashed mid-build.
Ryan Murray and the package maintainer discussed this on IRC when it
happened; it was not immediately given back because of concerns over whether
the build might take down a *second* buildd while there was still a
significant backlog due to the c2a transition.  No, this isn't perfectly
transparent; but yes, it should be acceptable.  There's almost never a
reason to fret over builds-gone-missing for at least, say, a week and a
half, which is about how long it would take for the package to be eligible
for testing.  In OOo's case, try adding another couple of weeks to that for
the current c2a transition that it blocks on...

> The sparc buildd maintainer has in the past left transient build
> failures lie for over 6 months.  For the past year he's been requeuing
> all maybe-failed packages every 1-3 months.

Well, a) we now have auto-dep-wait, so this is even less of a problem now
than it might have been before; b) in the general case, it's not much of a
problem.

> REQUEUE
> qterm   0.4.0pre3-2+b1  342381

Consider that this was a bug in a *toolchain* package, so more than a simple
give-back is required:  gcc-4.0 needs to be upgraded on the buildds for this
to work.  Given that this was caused by a stray \ in the source that didn't
belong, it would be advisable for the package maintainer to defensively
correct this as well.

> DEP-WAIT
> galago-sharp         libmono-dev
> dmraid               libklibc-dev
> motv                 libzvbi0 (= 0.2.17-3.0.1)
> qmailadmin           vpopmail-bin
> wvstreams            libxplc0.3.13-dev
> sylpheed-claws-gtk2  libclamv-dev
> digikam              libartsl-dev (>=1.4.2)
> tulip                mesa-common-dev (= 6.2.1-7)
> liferea              libdbus-glib-1-dev
> kwave                kdemultimedia-dev
> ivtools              libace-dev
> fwbuilder            libfwbuilder-dev
> gtksharp2            libmono-dev
> libavg               libavcodeccvs-dev
> gnucash              slib
> memepack             r-base-dev
> r-cran-bayesm        r-base-dev

Um, at least some of these dep-waits are completely wrong; r-base-dev and
slib are arch: all packages, and it's not useful to set a dep-wait on a
package that *is* available.  Could you please revise this list accordingly?

Also, setting an (= $version) dep-wait isn't particularly helpful; sometimes
newer source versions will be uploaded before binaries are uploaded for the
current version.  So please clarify whether these are meant to be >= or
something else.
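
To make the difference concrete, here is a small Python sketch of how an "="
dep-wait can stay unsatisfied forever once a newer version is uploaded, while
">=" clears. It is only an illustration: version_key is a deliberately crude
ordering that compares the integers found in the string and is NOT Debian's
version comparison algorithm, and dep_wait_satisfied is a made-up helper, not
part of wanna-build. The 6.2.1-8 version is hypothetical.

```python
import re


def version_key(version):
    """Crude ordering key: the integers found in the version string.

    This is a simplification and not Debian's version comparison.
    """
    return tuple(int(n) for n in re.findall(r"\d+", version))


def dep_wait_satisfied(relation, wanted, available):
    """Check a dep-wait of the form (relation wanted) against what is available."""
    have, want = version_key(available), version_key(wanted)
    if relation == "=":
        return have == want
    if relation == ">=":
        return have >= want
    raise ValueError("unsupported relation: " + relation)


# The dep-wait names 6.2.1-7, but suppose 6.2.1-8 is what gets uploaded.
print(dep_wait_satisfied("=", "6.2.1-7", "6.2.1-8"))
print(dep_wait_satisfied(">=", "6.2.1-7", "6.2.1-8"))
```

With "=" the package would keep waiting for a binary that may never appear;
with ">=" the newer upload releases it.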

> FAILED

But FAILED is an advisory state anyway; it doesn't directly benefit the
port, at all, to have the package listed as "Failed", this is just a
convenience for folks sifting through the build failures (like the rarely
used "confirmed" BTS tag is for maintainers).  In the long-term, one of two
things needs to happen with each of these build failures: the package needs
to be added to P-a-s so the arch no longer tries to build it, or the package
needs to be fixed -- via porter NMU if necessary.

So as you have the list of these packages, as a porter you can proceed with
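figuring out which of the two categories each falls into, and take the
necessary action without worrying about the "Failed" state, yes?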

Re: Sparc build failure analysis (was Re: StrongARM tactics)

2005-12-10 Thread Christoph Hellwig
On Sat, Dec 10, 2005 at 06:53:47AM -0800, Blars Blarson wrote:
> numactl
>   only supports i386 amd64 ia64
>   appears to assume intel-style stuff, would need major redesign
>   for other architectures

There's nothing Intel-specific in here; rather, it assumes NUMA support
in the kernel.  Obviously this is only the case for architectures that
support NUMA. I've actually tested it on ppc64, although not on Debian.

> libaio

This should build on all architectures.  Currently it needs a tiny bit
of architecture-specific code, but that could be avoided by using proper
syscall macros instead of trying to do its own.

