Re: Automated testing - design and interfaces

2005-11-25 Thread Petter Reinholdtsen

[Daniel Burrows]
> To me, this sounds like the argument that "I don't need seat belts
> or air bags because only bad drivers crash and I'm a good driver".

Or even better, "using seat belts is showing distrust to the
driver". :)





Re: Automated testing - design and interfaces

2005-11-24 Thread Daniel Burrows
On Sat, Nov 19, 2005 at 01:28:34AM -0600, Peter Samuelson <[EMAIL PROTECTED]> 
was heard to say:
> Do you really think packaging mistakes not caught by lintian would be
> caught by the test suite put together by the same goober who made those
> packaging mistakes?

  To me, this sounds like the argument that "I don't need seat belts or
air bags because only bad drivers crash and I'm a good driver".

  Daniel




Re: Automated testing - design and interfaces

2005-11-24 Thread Philip Hands
Ian Jackson wrote:
> The scheme I'm proposing is useful to Debian even if the buildds don't
> get enhanced to run the tests automatically, because package
> maintainer tools can easily be enhanced to do that.  Of course Ubuntu
> will do that testing automatically but Ubuntu apparently has (will
> have) different infrastructure tools.

Also, if the tests are available for use by end-users, we can ask
pre-release testers to run all the tests on packages they install as part
of their installation testing.  A report saying "I tried installing this
set of packages, and not only did I succeed, but also the software all
works on my system" is a lot better than "I managed to install the packages."

What you are proposing seems like a better way of doing package tests than
the debian-test package I cobbled together ages ago (which has since
quietly gathered dust due to lack of effort from me, and lack of interest
from other maintainers) -- if we can get this to the point where the
default is for maintainers to write new tests as part of their bug fixing
procedure, then we'll end up with a comprehensive set of regression tests
without needing people to expend much more effort than was needed to fix
the bug anyway.

Well done Ian :-)

Cheers, Phil.




Re: Automated testing - design and interfaces

2005-11-24 Thread Ian Jackson
Bill Allombert writes ("Re: Automated testing - design and interfaces"):
> My fault, I only wrote half of the sentence.  What I meant is that
> nothing prevents you from running arbitrary tests in your 'debian/rules
> build' target, even if you have to add more Build-Depends.  The tests are
> run each time the package is built.  This practice is encouraged by the
> buildd admins, is performed automatically for all platforms (even
> unofficial ones), and actually prevents buggy packages from being uploaded.

Right.  But it won't find bugs where the problem is in the packaging
rather than in the code.

> OTOH, having a dedicated test network that duplicates the buildd network
> is going to be very hard to set up.

The scheme I'm proposing is useful to Debian even if the buildds don't
get enhanced to run the tests automatically, because package
maintainer tools can easily be enhanced to do that.  Of course Ubuntu
will do that testing automatically but Ubuntu apparently has (will
have) different infrastructure tools.

Ian.





Re: Automated testing - design and interfaces

2005-11-24 Thread Ian Jackson
Peter Samuelson writes ("Re: Automated testing - design and interfaces"):
> Do you really think packaging mistakes not caught by lintian would be
> caught by the test suite put together by the same goober who made those
> packaging mistakes?

The goobers (as you say) who make packaging mistakes are usually also
the people who apply patches (including patches to add tests) without
exercising too much (how shall we say) editorial control.

There have been plenty of times when packaging mistakes not found by
lintian have rendered packages totally broken.  For example, a
nearly-empty package will pass lintian just fine ...

Ian.





Re: Automated testing - design and interfaces

2005-11-21 Thread Lars Wirzenius
On Thu, 2005-11-17 at 18:43 +, Ian Jackson wrote:
> Note: that this is one of two messages on roughly the same topic.
> 
> This message will deal solely with TECHNICAL issues.  If you have some
> technical followup then please go ahead and reply here.  If you have a
> complaint or comment about my or Ubuntu's approach, please reply in
> debian-*project* to my other message, titled `Automated testing -
> politics, information, and Ubuntu's plans'.

My program, piuparts, has been mentioned in this thread, and I feel I
should comment. I started writing a reply, and it grew and grew, and I
don't have the time to finish it until next week at the earliest (I'm
going to be away this week). I will however offer the following brief
remarks and perhaps next week I'll see that my long response isn't
needed at all.

1. I think Ian and Ubuntu are correct in that automatic testing is a
good thing and should be done more. We can squabble over implementation,
and there's a few things buzzing in the back of my brain that want to
claim that Ian's approach needs some tweaking, but until and unless I
figure things out I am not going to stand in the way of progress, even
if it doesn't go quite in the direction my gut feeling says it should
go. If Ian (or others) gets it wrong, we can fix it in the next
iteration. No worries there.

2. If it turns out that having piuparts run the tests is technically a
good idea, I'm all for it.

3. The Debian QA team is somewhat disorganized and understaffed. It
would be great to get a good, active team. For example, I've been
writing and running piuparts pretty much in isolation, and I know at
least one or two other people have also been running it. We should join
forces, and gather more people besides, so that by the time etch freezes
next year, we WILL ALREADY HAVE FOUND ALL THE BUGS!  Hopefully we will have
fixed them as well, but I see QA's primary role as being a tester, a finder
of problems, which the general developership then fixes.

4. As a design principle, if it is possible to have generic tests rather
than have every package add those tests themselves, we should go for
the generic tests.

I'll be happy to continue this discussion next week. If, meanwhile,
everyone else continues it and solves every possible problem, I will be
more than happy. I will be positively espanglished!

-- 
Choose wisely, choose often.





Re: Automated testing - design and interfaces

2005-11-20 Thread Robert Collins
On Thu, 2005-11-17 at 14:36 -0800, Steve Langasek wrote:
> [let's get this over to a technical list like it was supposed to be ;)]

> > Following your exit status based approach you could add to stanzas
> > something like:
> 
> >   Expected-Status: 0
> 
> > I found the above requirement the very minimum for a test interface.
> > What follows is optional (IMHO).
> 
> FWIW, I don't see that there's a clear advantage to having the test harness
> *expect* non-zero exit values (or non-empty output as you also suggested).
> It may make it easier to write tests by putting more of the logic in the
> test harness, but in exchange it makes it harder to debug a test failure
> because the debugger has to figure out how "correct" and "incorrect" are
> defined for each test, instead of just getting into the meat of seeing why
> the test returned non-zero.  Likewise, expecting successful tests to be
> silent means that you can rely on any output being error output that can be
> used for debugging a test failure.

Right. Splitting it into two bits ...

With respect to exit code, there is generally only one way to succeed,
but many ways to fail.  So reserving 0 for 'test succeeded' in ALL cases
makes writing front ends, or running the tests interactively, much
easier.  It's certainly possible to provide a $lang function that can
invert the relationship for you, for 'expected failure' results.  One of
the things about expected failures is their granularity: is a test
expected to fail because 'file FOO is missing', or because 'something
went wrong'?  The former test is best written as an explicit check, where
you invert the sense in the test script.  It's best because this means
that when the behaviour of the failing logic alters - for better or
worse - you get flagged by the test that it has changed.  Whereas a
handwaving 'something's broken' style expected failure really does not
help code maintenance at all.  So while it can be useful in the test
interface to have an explicit code for 'expected failure', I think it is
actually best to just write the test to catch the precise failure, and
report success.
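
A minimal sketch of the "invert the sense in the test script" idea; the
test name and file path are invented purely for illustration:

  #!/bin/sh
  # debian/tests/foo-still-missing (hypothetical): we currently *expect*
  # the package not to ship /usr/share/foo/FOO, so this test only passes
  # while that precise, known failure is still present.
  set -e
  if [ -e /usr/share/foo/FOO ]; then
      # The behaviour has changed (for better or worse): flag it so the
      # expected-failure test gets revisited.
      echo "FOO is now shipped - revisit this expected-failure test" >&2
      exit 1
  fi
  exit 0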

As for silence, yes, noise is generally not helpful, although long-running
test suites can usefully give *some* feedback (a . per 100 tests,
say) to remind people it's still running.

Rob

-- 
GPG key available at: .




Re: Automated testing - design and interfaces

2005-11-18 Thread Peter Samuelson

[Ian Jackson]
> Running the upstream test suite in debian/rules usually isn't the
> answer to packaging mistakes, library mismanglements, and the like.

I have an idea.  What if we had a script that ran dpkg-buildpackage and
then immediately ran lintian and linda if available, to look for common
packaging mistakes?

Wait, that would be debuild and pdebuild.

Do you really think packaging mistakes not caught by lintian would be
caught by the test suite put together by the same goober who made those
packaging mistakes?

And I'm having trouble coming up with the kinds of tests which are (a)
not about packaging mistakes per se, and (b) necessary to perform on a built
package (as opposed to a build tree or install tree).  I mean,
otherwise your test would work fine as part of the upstream test suite,
which you should already be invoking from debian/rules.




Re: Automated testing - design and interfaces

2005-11-18 Thread Anthony Towns
On Fri, Nov 18, 2005 at 12:23:49PM +, Ian Jackson wrote:
> (Note: sorry about my earlier header mixup.  This thread is on the
> wrong list so I'm crossposting this reply to -project and -policy and
> have set Reply-To to point to -policy.  I will also quote more of
> Stefano's message than would usually be sensible, to give readers in
> -policy an easier time.)

This isn't even implemented, so it can hardly be ready to be added to
-policy; I think Steve was right in choosing -devel as the right list...

Cheers,
aj





Re: Automated testing - design and interfaces

2005-11-18 Thread Bill Allombert
On Fri, Nov 18, 2005 at 03:35:25PM +, Ian Jackson wrote:
> Bill Allombert writes ("Re: Automated testing - design and interfaces"):
> > piuparts is a first answer to that: it allows maintainers to check that
> > their package will install, remove and upgrade in a clean environment.
> 
> Yes, and piuparts is a good thing.  But, it doesn't allow the
> maintainer to easily test that their package _works_.  We currently
> rely on the developer doing some ad-hoc testing.  This can easily be
> improved.

piuparts is in a very early development phase and it has already quite
expanded the possibilities for testing Debian.  I am sure there is room
to add new features for yet more testing.
 
> > > > If Ubuntu wants to improve the testing coverage, you could start by
> > > > submitting patches to packages that don't run their test suite in
> > > > debian/rules. That would profit both Debian and Ubuntu and there is a
> > > > lot of work to do there.
> > > 
> > > Running the upstream test suite in debian/rules usually isn't the
> > > answer to packaging mistakes, library mismanglements, and the like.
> > 
> > There is no reason to restrict debian/rules to the upstream test suite?
> 
> Yes, that's true.  But I thought you meant that we should submit
> patches to make packages run the upstream test suite (if any) in
> debian/rules.
> 
> If that's not what you meant I'm afraid I don't follow you.  Can you
> explain what you meant ?

My fault, I only wrote half of the sentence.  What I meant is that
nothing prevents you from running arbitrary tests in your 'debian/rules build'
target, even if you have to add more Build-Depends.  The tests are run each
time the package is built.  This practice is encouraged by the buildd
admins, is performed automatically for all platforms (even unofficial ones),
and actually prevents buggy packages from being uploaded.

You can obviously run the upstream test suite, but you can also write
a new test suite, add it to the diff.gz and run it in 'debian/rules build',
whether it is Debian-specific or not.

Of course, non-Debian-specific test suites should be pushed upstream.

Maybe developers will propose helper packages that provide a simple
test framework or standard tests for libraries, like we have debhelper,
dpatch, cdbs, etc., but at this time the issue is rather to write
more tests than to speculate.

OTOH, having a dedicated test network that duplicates the buildd network
is going to be very hard to set up.

(I would not mind being CCed since I am only subscribed to -policy).

Cheers,
-- 
Bill. <[EMAIL PROTECTED]>

Imagine a large red swirl here.





Re: Automated testing - design and interfaces

2005-11-18 Thread David Nusinow
On Fri, Nov 18, 2005 at 03:35:25PM +, Ian Jackson wrote:
> Bill Allombert writes ("Re: Automated testing - design and interfaces"):
> > On Fri, Nov 18, 2005 at 12:08:25PM +, Ian Jackson wrote:
> > > maintainer didn't really test the package after installing it because
> > > it was too much trouble.  If it can be standardised and automated, and
> > > if a way can be found for Ubuntu to share tests with Debian, then
> > > everyone wins.
> > 
> > piuparts is a first answer to that: it allows maintainers to check that
> > their package will install, remove and upgrade in a clean environment.
> 
> Yes, and piuparts is a good thing.  But, it doesn't allow the
> maintainer to easily test that their package _works_.  We currently
> rely on the developer doing some ad-hoc testing.  This can easily be
> improved.

Rather than invent a whole new test harness, then, how about integrating the
functionality you want into piuparts?

 - David Nusinow





Re: Automated testing - design and interfaces

2005-11-18 Thread Ian Jackson
Bill Allombert writes ("Re: Automated testing - design and interfaces"):
> On Fri, Nov 18, 2005 at 12:08:25PM +, Ian Jackson wrote:
> > maintainer didn't really test the package after installing it because
> > it was too much trouble.  If it can be standardised and automated, and
> > if a way can be found for Ubuntu to share tests with Debian, then
> > everyone wins.
> 
> piuparts is a first answer to that: it allows maintainers to check that
> their package will install, remove and upgrade in a clean environment.

Yes, and piuparts is a good thing.  But, it doesn't allow the
maintainer to easily test that their package _works_.  We currently
rely on the developer doing some ad-hoc testing.  This can easily be
improved.

> > > If Ubuntu wants to improve the testing coverage, you could start by
> > > submitting patches to packages that don't run their test suite in
> > > debian/rules. That would profit both Debian and Ubuntu and there is a
> > > lot of work to do there.
> > 
> > Running the upstream test suite in debian/rules usually isn't the
> > answer to packaging mistakes, library mismanglements, and the like.
> 
> There is no reason to restrict debian/rules to the upstream test suite?

Yes, that's true.  But I thought you meant that we should submit
patches to make packages run the upstream test suite (if any) in
debian/rules.

If that's not what you meant I'm afraid I don't follow you.  Can you
explain what you meant ?

Ian.





Re: Automated testing - design and interfaces

2005-11-18 Thread Bill Allombert
On Fri, Nov 18, 2005 at 12:08:25PM +, Ian Jackson wrote:
> maintainer didn't really test the package after installing it because
> it was too much trouble.  If it can be standardised and automated, and
> if a way can be found for Ubuntu to share tests with Debian, then
> everyone wins.

piuparts is a first answer to that: it allows maintainers to check that
their package will install, remove and upgrade in a clean environment.

> If Ubuntu wants to improve the testing coverage, you could start by
> submitting patches to packages that don't run their test suite in
> debian/rules. That would profit both Debian and Ubuntu and there is a
> lot of work to do there.
> 
> Running the upstream test suite in debian/rules usually isn't the
> answer to packaging mistakes, library mismanglements, and the like.

There is no reason to restrict debian/rules to the upstream test suite?

Cheers,
Bill.





Re: Automated testing - design and interfaces

2005-11-18 Thread Ian Jackson
(Note: sorry about my earlier header mixup.  This thread is on the
wrong list so I'm crossposting this reply to -project and -policy and
have set Reply-To to point to -policy.)

Sven Luther writes ("Re: Automated testing - design and interfaces"):
> How will this interact with stuff like the powerpc32/powerpc64
> biarch situation, where there is a series of tests which can only be
> handled on powerpc64, but not on powerpc32?  I know Ubuntu has only
> powerpc64 machines, so it is not as important to you, but Debian is
> using 32-bit autobuilders, and I guess in both cases people would like
> to test and build the packages on powerpc32 machines too.

I would expect this problem to be dealt with by specifying an
appropriate keyword in the Restrictions field in the test stanza.  Eg,
 Restrictions: cpu32
or
 Restrictions: arch-powerpc
or some such.
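
A sketch of how a test runner might honour such a keyword; the keyword
name, the architecture check and the skip convention below are all
illustrative, not part of any agreed interface:

  # Hypothetical runner fragment: skip a test whose stanza carries a
  # restriction the current machine cannot satisfy.  $restrictions holds
  # the Restrictions: line parsed earlier from debian/tests/control.
  arch=$(dpkg --print-architecture)
  if echo "$restrictions" | grep -qw cpu64 && [ "$arch" != ppc64 ]; then
      echo 'test environment does not support "cpu64" restriction'
      exit 77   # or however the runner chooses to record a skipped test
  fi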

Ian.

PS: Apologies again for my rather abrupt earlier response.





Re: Automated testing - design and interfaces

2005-11-18 Thread Ian Jackson
(Note: sorry about my earlier header mixup.  This thread is on the
wrong list so I'm crossposting this reply to -project and -policy and
have set Reply-To to point to -policy.  I will also quote more of
Stefano's message than would usually be sensible, to give readers in
-policy an easier time.)

Stefano Zacchiroli writes ("Re: Automated testing - design and interfaces"):
> On Thu, Nov 17, 2005 at 06:43:32PM +, Ian Jackson wrote:
> >   This means execute debian/tests/fred, debian/tests/bill, etc.,
> >   each with no arguments, expecting exit status 0 and no stderr. The
> 
> Having been involved in various unit testing packages I found the above
> expectations too constraining.  The first thing you will need after
> requiring all tests not to fail is to be able to distinguish "tests that
> need to succeed" vs "tests that need to fail".  Only the misbehaviour of
> tests wrt their expected result should be reported as test failures.  I
> thus propose the following.
> 
> Following your exit status based approach you could add to stanzas
> something like:
> 
>   Expected-Status: 0

The need for this can be avoided by wrapping the actual test up with
some test-runner script.
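
For instance, a hypothetical wrapper test along these lines would let a
test expect a non-zero exit status without the interface itself knowing
about expected failures (the program name and input file are invented):

  #!/bin/sh
  # debian/tests/rejects-garbage (hypothetical): the harness only sees
  # "exit 0 and no stderr"; the expectation of failure lives in here.
  set -e
  if frobnicator --parse /usr/share/frobnicator/known-bad.input 2>/dev/null
  then
      echo "parser accepted known-bad input" >&2
      exit 1
  fi
  exit 0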

I didn't want to make the _interface_ to the tests the kind of rich
interface a test suite framework has, with arrangements for specifying
expected behaviour, matching the output of programs against regexps,
etc.

Instead, if a package needs those facilities, then the test stanza
would declare a dependency on a package which would provide them.  For
convenience, the source package with the test-runner which interprets
these files would probably also produce a .deb with a few helpful
programs in it, but in general I think this problem is not part of the
_interface_.

The interface should be as simple as we can make it while still being
able to do the job.  Remember that it should be possible, and not
wholly impractical, to reimplement the test runner.
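
To give an idea of how small such a reimplementation could be, here is a
rough, purely illustrative sketch for the simplest case (one stanza, no
Restrictions handling), run from the root of the source package:

  #!/bin/sh
  # Run every test named on "Tests:" lines of debian/tests/control,
  # requiring exit status 0 and an empty stderr, as in the draft spec.
  ctrl=debian/tests/control
  dir=$(sed -n 's/^Tests-Directory: *//p' "$ctrl" | head -n1)
  [ -n "$dir" ] || dir=debian/tests
  rc=0
  for t in $(sed -n 's/^Tests: *//p' "$ctrl"); do
      err=$(mktemp)
      if "./$dir/$t" 2>"$err" && [ ! -s "$err" ]; then
          echo "PASS: $t"
      else
          echo "FAIL: $t"
          rc=1
      fi
      rm -f "$err"
  done
  exit $rc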

> I can imagine tons of different ways of specifying [expected output]

Exactly :-).

Ian.





Re: Automated testing - design and interfaces

2005-11-18 Thread Ian Jackson
Ian Jackson writes ("Automated testing - design and interfaces"):
> Note: that this is one of two messages on roughly the same topic.
> 
> This message will deal solely with TECHNICAL issues.  [...]

Damn!  I must have gotten confused while editing headers and this went
to the wrong list.  My apologies to Sven Luther and Stefano Zacchiroli
for taking them to task for posting to the wrong list when the error
was mine.

Since we've only had a few messages in this thread on -project I think
the best thing would be for me to repost my original message to
-policy.

Ian.





Re: Automated testing - design and interfaces

2005-11-18 Thread Ian Jackson
Stefano Zacchiroli writes ("Re: Automated testing - design and interfaces"):
> Having been involved in various unit testing packages [...]

These are technical, not political, comments and should be on
-policy.  Could I trouble you to repost your message there ?  Or would
you prefer me to do so ?

We should keep the technical discussion in one place, and also of
course we want to avoid cluttering the lists with off-topic traffic.

Thanks,
Ian.





Re: Automated testing - design and interfaces

2005-11-18 Thread Ian Jackson
Bill Allombert writes ("Re: Automated testing - design and interfaces"):
> Currently Debian packages are tested through a lot of channels:
> [stuff]

Right.

> Debian is an organisation which can afford a lot of decentralisation,
> but where centralisation is very expensive [...]

Right.

> Doing the checks in debian/rules is not perfect, but it happens before
> the package is uploaded, is performed for all architectures and, more
> importantly, is done with the current infrastructure.

A big problem is that doing the checks in debian/rules will fail to
spot many obvious packaging mistakes.

> Going toward a centralised testing facility [...]

This is one of the things that Ubuntu will be using it for.  But one
of the things that I expect Debian to use the same facilities for is
to allow a developer to do a test of the package they're about to
upload, _after building and installing it_.

This might, just as one example, help get rid of a lot of the `NMU
broke it totally' problems.  And, of course, there have been occasions when the
maintainer didn't really test the package after installing it because
it was too much trouble.  If it can be standardised and automated, and
if a way can be found for Ubuntu to share tests with Debian, then
everyone wins.

> If Ubuntu wants to improve the testing coverage, you could start by
> submitting patches to packages that don't run their test suite in
> debian/rules. That would profit both Debian and Ubuntu and there is a
> lot of work to do there.

Running the upstream test suite in debian/rules usually isn't the
answer to packaging mistakes, library mismanglements, and the like.

Ian.


-- 
To UNSUBSCRIBE, email to [EMAIL PROTECTED]
with a subject of "unsubscribe". Trouble? Contact [EMAIL PROTECTED]



Re: Automated testing - design and interfaces

2005-11-18 Thread Adeodato Simó
* Ian Jackson [Fri, 18 Nov 2005 11:58:26 +]:

> This is a technical comment and ought to be discussed on -policy,
> rather than -project.

  Note that you posted to the wrong list first.

-- 
Adeodato Simó
EM: dato (at) the-barrel.org | PK: DA6AE621
 
So, irregular/impure/non-elegant syntax doesn't bother me. Shit, I speak
English. Mainly I just want to type less.
-- William Morgan, on [ruby-talk:131589]





Re: Automated testing - design and interfaces

2005-11-18 Thread Ian Jackson
Sven Luther writes ("Re: Automated testing - design and interfaces"):
> How will this interact with stuff like the powerpc32/powerpc64
> biarch situation, where there is a series of tests which can only be
> handled on powerpc64, but not on powerpc32?  I know Ubuntu has only
> powerpc64 machines, so it is not as important to you, but Debian is
> using 32-bit autobuilders, and I guess in both cases people would like
> to test and build the packages on powerpc32 machines too.

This is a technical comment and ought to be discussed on -policy,
rather than -project.  But, I think if you read the draft you'll see
that it's flexible enough to accomodate these kinds of things.

Ian.





Re: Automated testing - design and interfaces

2005-11-17 Thread Sven Luther
On Thu, Nov 17, 2005 at 06:43:32PM +, Ian Jackson wrote:
> Note: that this is one of two messages on roughly the same topic.
> 
> This message will deal solely with TECHNICAL issues.  If you have some
> technical followup then please go ahead and reply here.  If you have a
> complaint or comment about my or Ubuntu's approach, please reply in
> debian-*project* to my other message, titled `Automated testing -
> politics, information, and Ubuntu's plans'.

How will this interact with stuff like the powerpc32/powerpc64 biarch
situation, where there is a series of tests which can only be handled on
powerpc64, but not on powerpc32?  I know Ubuntu has only powerpc64 machines,
so it is not as important to you, but Debian is using 32-bit autobuilders,
and I guess in both cases people would like to test and build the packages
on powerpc32 machines too.

Friendly,

Sven Luther





Re: Automated testing - design and interfaces

2005-11-17 Thread Anthony Towns
Bcc'ed to -project; followups to -devel.

On Thu, Nov 17, 2005 at 06:43:32PM +, Ian Jackson wrote:
> Note that the point is to be able to test the _actual package_, _as
> installed_ (eg on a testbed system).  This is much better than testing
> the package from the source tree during build time because it will
> detect packaging mistakes as well as program source problems, and as we
> know packaging mistakes of one kind or another are one of the main
> causes of problems.

Mostly it's just different -- testing at build time lets you do unit
tests before putting the objects together and stripping them, which
gives you the opportunity to catch other bugs.  One isn't better than the
other, though doing both is better than either alone or neither.

Other useful tests are things like "install all of optional" which
will catch unspecified Conflicts: relations, and "do a partial upgrade
from stable to this package in unstable" which will tell you if some
dependencies aren't strict enough. Looking through Contents-* files will
also let you catch unspecified dependencies.
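
A rough sketch of the partial-upgrade check, assuming a stable chroot
already exists at $CHROOT and using an invented package name; the exact
tooling here is only one possible choice:

  # Install the new unstable package into an otherwise-stable chroot and
  # let apt resolve the fallout; if it wants to remove half the system
  # (or fails outright), some dependency is probably not strict enough.
  sudo cp mypackage_2.0-1_i386.deb "$CHROOT/tmp/"
  sudo chroot "$CHROOT" apt-get update
  sudo chroot "$CHROOT" dpkg -i /tmp/mypackage_2.0-1_i386.deb || true
  sudo chroot "$CHROOT" apt-get -y -f install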

Having multiple machines to do tests can be worthwhile too -- if you want
to test firewalls or that there aren't any listening ports on default
installs, eg.

>   The source package provides a test metadata file debian/tests/control.
>   This is a file containing zero or more RFC822-style
>   stanzas, along these lines:
> Tests: fred bill bongo
> Restrictions: needs-root breaks-computer
>   This means execute debian/tests/fred, debian/tests/bill, etc.,

Seems like:

  debian/tests/bar:
    #!/bin/sh
    # Restrictions: needs-root trashes-system
    # Requires: foo

  foo FAIL: ...
  bar SKIP: foo failed

would make more sense than a separate file describing the tests? Is the
"Depends:" line meant to refer to other Debian packages (and thus be
a lower-level version of Restrictions:) or is it meant to indicate
test interdependencies? If it's meant to be for Debian packages, maybe

  # Restrictions: deb:xvncviewer

might be better.

Note that it's often better to have a single script run many tests, so
you probably want to allow tests to pass back some summary information,
or include the last ten lines of its output or similar. Something like:

  foo FAIL:
FAILURE: testcase 231
FAILURE: testcase 289
FAILURE: testcase 314
3/512 test cases failed
  bar FAIL: (341123 other lines, then:)
xxx


x
Aborted (core dumped)
  baz SKIP: foo failed
  quux PASS

maybe.

>   Any unknown thing in Restrictions, or any unknown field in the
>   RFC822 stanza, causes the tester core to skip the test with a
>   message like `test environment does not support "blames-canada"
>   restriction of test "simpsons"'.

You mean southpark, surely?

>   A basic test could be simply running the binary and checking the
>   result status (or other variants of this). Eventually every
>   package would need to be changed to include at least one test.

These sorts of tests are better done as part of debian/rules, I would've
thought -- the advantage of that is that the problems get caught even
when users rebuild the package themselves, and you don't need to worry
about special test infrastructure like you're talking about for the
simple case.

>   Ideally eventually where possible the upstream regression tests
>   could be massaged so that they test the installed version. Whether
>   this is possible and how best to achieve it has to be decided on a
>   per-package basis.

Having

  Restrictions: package-installed

and

  Restrictions: build-tree

might be worth thinking about so that it's easy to do both sorts of
testing.

>   Even integration tests can be represented like this: if one
>   package's tests Depend on the other's, then they are effectively
>   integration tests. The actual tests can live in whichever package
>   is most convenient.

Going from build/foo-1.0/debian/tests/x to
projects/bar-3.14/debian/tests/y seems difficult.

Adding "deb:otherpkg" or "deb:libotherpkg-dbg" to the Restrictions:
seems more plausible?

Anyway, something that can be run with minimal amounts of setup seems
most likely to be most useful: so running as part of the build without
installing the package, running without anything special installed but the
package being tested and a script that parses the control information,
stuff that can be run on a user's system without root privs and without
trashing the system, etc.

If there's going to be a "debian/rules check" command, "debian/tests/*"
probably should just be a suggested standard, or vice-versa --
minimising the number of required interfaces would likely make things
more flexible. Being able to add upstream tests by nothing more than
symlinking them into debian/tests might be a worthwhile goal, perhaps.

> From: Ian Jackson <[EMAIL PROTECTED]>

> Ian.
> (wearing both my Debian and Ubuntu hats)

Heh.

Cheers,
aj




Re: Automated testing - design and interfaces

2005-11-17 Thread Steve Langasek
[let's get this over to a technical list like it was supposed to be ;)]

On Thu, Nov 17, 2005 at 10:43:34PM +0100, Stefano Zacchiroli wrote:
> On Thu, Nov 17, 2005 at 06:43:32PM +, Ian Jackson wrote:
> >   This means execute debian/tests/fred, debian/tests/bill, etc.,
> >   each with no arguments, expecting exit status 0 and no stderr. The

> Having been involved in various unit testing packages I found the above
> expectations too constraining.  The first thing you will need after
> requiring all tests not to fail is to be able to distinguish "tests that
> need to succeed" vs "tests that need to fail".  Only the misbehaviour of
> tests wrt their expected result should be reported as test failures.  I
> thus propose the following.

> Following your exit status based approach you could add to stanzas
> something like:

>   Expected-Status: 0

> I found the above requirement the very minimum for a test interface.
> What follows is optional (IMHO).

FWIW, I don't see that there's a clear advantage to having the test harness
*expect* non-zero exit values (or non-empty output as you also suggested).
It may make it easier to write tests by putting more of the logic in the
test harness, but in exchange it makes it harder to debug a test failure
because the debugger has to figure out how "correct" and "incorrect" are
defined for each test, instead of just getting into the meat of seeing why
the test returned non-zero.  Likewise, expecting successful tests to be
silent means that you can rely on any output being error output that can be
used for debugging a test failure.
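
As a purely illustrative example of the convention Steve describes (the
command name is invented), a test would stay silent on success and emit
its diagnostics only when something goes wrong:

  #!/bin/sh
  set -e
  # Capture everything; on success print nothing, so any output the
  # harness sees is, by construction, material for debugging a failure.
  if ! output=$(some-binary --self-check 2>&1); then
      echo "self-check failed:" >&2
      echo "$output" >&2
      exit 1
  fi
  exit 0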

-- 
Steve Langasek   Give me a lever long enough and a Free OS
Debian Developer   to set it on, and I can move the world.
[EMAIL PROTECTED]   http://www.debian.org/




Re: Automated testing - design and interfaces

2005-11-17 Thread Stefano Zacchiroli
On Thu, Nov 17, 2005 at 06:43:32PM +, Ian Jackson wrote:
>   This means execute debian/tests/fred, debian/tests/bill, etc.,
>   each with no arguments, expecting exit status 0 and no stderr. The

Having been involved in various unit testing packages I found the above
expectations too constraining.  The first thing you will need after
requiring all tests not to fail is to be able to distinguish "tests that
need to succeed" vs "tests that need to fail".  Only the misbehaviour of
tests wrt their expected result should be reported as test failures.  I
thus propose the following.

Following your exit status based approach you could add to stanzas
something like:

  Expected-Status: 0

I found the above requirement the very minimum for a test interface.
What follows is optional (IMHO).

I don't see the need/usefulness of requiring no stderr.  It seems to me
rather ad hoc; if people want to test output on file descriptors they can
do it via redirections and textual comparison of files.  This brings us to
the next point.

In addition you can think about standardizing a way to compare what a
test prints on its file descriptors.  Indeed, in unit testing frameworks
you often require a function to return a specific value.  In the
Debian/Ubuntu world we can think of having files containing the expected
output for a given file descriptor and enriching stanzas with entries like:

  Expected-Output: tests/stdout.txt (0), tests/stderr.txt (2)

Since we are in the semantic-thing era, textual comparison is no longer
always the appropriate choice, so it may be wise to think from the
beginning about a way of specifying ad-hoc comparison commands (think for
example of the comparison of XML documents).

I can imagine tons of different ways of specifying them:
- triples: 
- triples:  + per-format
  comparison tools in other stanza fields
- ...

Just choose your favorite one.
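
One hypothetical way such a stanza could end up looking, with field names
invented purely for illustration:

  Tests: check-export
  Expected-Status: 0
  Expected-Output: tests/expected.xml (1)
  Output-Compare: tests/compare-xml

where tests/compare-xml would be a small script that, say, canonicalises
both XML files before diffing them, instead of a plain textual comparison.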

My 0.02 EUR.
Cheers.

-- 
Stefano Zacchiroli -*- Computer Science PhD student @ Uny Bologna, Italy
[EMAIL PROTECTED],debian.org,bononia.it} -%- http://www.bononia.it/zack/
If there's any real truth it's that the entire multidimensional infinity
of the Universe is almost certainly being run by a bunch of maniacs. -!-




Re: Automated testing - design and interfaces

2005-11-17 Thread Russ Allbery
Bill Allombert <[EMAIL PROTECTED]> writes:

> Debian is an organisation which can afford a lot of decentralisation,
> but where centralisation is very expensive (debian-admin time, etc.). 
> Doing the checks in debian/rules is not perfect, but it happens before
> the package is uploaded, is performed for all architectures and, more
> importantly, is done with the current infrastructure.

> Going toward a centralised testing facility where packages are checked
> before they go to unstable might be nice but even if the software was
> ready today, it would take ages before we could adapt the infrastructure
> to take advantage of it.

> It might be that Ubuntu is a different organisation with different
> resources and the Debian way is not the most efficient for Ubuntu.

However, there do appear to me to be some obvious technical areas in which
Debian and Ubuntu could cooperate even if Debian isn't in a position to do
centralized testing for the time being.  Standardizing a way to run
post-install tests, for example; a lot of software, particularly software
that doesn't have upstream tests and wasn't designed for testing out of
the build tree, will have to be installed in a chroot before it can be
effectively tested.  That may be as simple as agreeing on some rule names
for debian/rules and possibly an output format.
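
For instance, a hypothetical agreed rule name might look like the fragment
below; the target name, layout and output format are only illustrative
(recipe lines would be tab-indented in the real file):

  # debian/rules fragment (invented "check-installed" target): runs the
  # tests against the *installed* package and emits one PASS/FAIL line
  # per test.
  check-installed:
  	rc=0; for t in debian/tests/*; do \
  		[ -x "$$t" ] || continue; \
  		if "$$t"; then echo "PASS: $$t"; else echo "FAIL: $$t"; rc=1; fi; \
  	done; exit $$rc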

Similarly, a way of enumerating the tests available, standardizing a way
of expressing test status (passed, failed, skipped, possibly various
reasons for skipping), dividing tests into fast and resource-intensive
tests, tagging tests that require an Internet connection... I agree that
many, if not most, Debian packages will not be in a position to *use* this
infrastructure immediately, but provided that Ubuntu has some cases in
mind so that these protocols can be designed in contact with the real
world and not just as a theoretical exercise, hashing out the protocol
details is still useful.  That way, as people slowly add tests, all the
tests work together and one can automate testing at scale.
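
Concretely, that kind of tagging could ride on the stanza format already
being discussed; the keyword names here are invented, just to show the
shape:

  Tests: quick-cli

  Tests: full-regression
  Restrictions: long-running

  Tests: fetch-remote-data
  Restrictions: needs-internet

and a harness report might then be one line per test, e.g. "PASS: quick-cli",
"SKIP: fetch-remote-data (needs-internet)", "FAIL: full-regression".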

This is fairly standard stuff, really; gcc, Perl, and many other large
upstream projects have gone through this exercise to standardize test
formats so that testing can be automated.

> If Ubuntu wants to improve the testing coverage, you could start by
> submitting patches to packages that don't run their test suite in
> debian/rules. That would profit both Debian and Ubuntu and there is a
> lot of work to do there.

Upstream test suites are sometimes problematic, though; in particular,
many upstreams routinely ship releases with failing tests and you have to
be familiar with upstream development to know if that's a real problem or
not.  That being said, certainly if one can run the upstream test suite
without causing such problems, it's a good idea.

-- 
Russ Allbery ([EMAIL PROTECTED])   





Re: Automated testing - design and interfaces

2005-11-17 Thread Bill Allombert
On Thu, Nov 17, 2005 at 06:43:32PM +, Ian Jackson wrote:
> So, what do you think ?

Currently Debian packages are tested through a lot of channels:

-- At build time, if the package runs a test suite in debian/rules,
   which is strongly encouraged.

-- By the maintainer before uploading them.

-- Dependencies are checked by the testing scripts.

-- Packaging standards are checked by lintian.debian.org

-- Some developers try to rebuild the whole distribution from
   scratch to check for FTBFS regression.

-- Now we start to use piuparts to check for maintainer scripts
   behaviour and upgradability.

-- Developers run scripts to check for various issues, ranging from typos
   in descriptions to menu hierarchy imbalance.

-- etc...

Debian is an organisation which can afford a lot of decentralisation,
but where centralisation is very expensive (debian-admin time, etc.). 
Doing the checks in debian/rules is not perfect, but it happens before
the package is uploaded, is performed for all architectures and, more
importantly, is done with the current infrastructure.

Going toward a centralised testing facility where packages are checked
before they go to unstable might be nice but even if the software was
ready today, it would take ages before we could adapt the infrastructure
to take advantage of it.

It might be that Ubuntu is a different organisation with different
resources and the Debian way is not the most efficient for Ubuntu.

If Ubuntu wants to improve the testing coverage, you could start by
submitting patches to packages that don't run their test suite in
debian/rules. That would profit both Debian and Ubuntu and there is a
lot of work to do there.

Cheers,
-- 
Bill. <[EMAIL PROTECTED]>

Imagine a large red swirl here. 





Automated testing - design and interfaces

2005-11-17 Thread Ian Jackson
Note: that this is one of two messages on roughly the same topic.

This message will deal solely with TECHNICAL issues.  If you have some
technical followup then please go ahead and reply here.  If you have a
complaint or comment about my or Ubuntu's approach, please reply in
debian-*project* to my other message, titled `Automated testing -
politics, information, and Ubuntu's plans'.

The first two paragraphs are common to the two messages:


One thing that the Debian universe is lacking is a good way to
automatically test packages.  This makes it hard to spot regressions,
and difficult to be systematic.

Ubuntu are proposing to invent a system to allow us to do automated
regression tests.  The full plan consists of a number of pieces and
can be read here:
  https://wiki.ubuntu.com/AutomatedTesting


If you're interested in this subject I'd appreciate it if you could go
and take a look.  Feel free to make your comments here on -policy, or
any other appropriate channel.

One of the important parts needed here is an interface for packages to
supply tests.  That is, a standard way for a Debian-format package to
enumerate and describe the tests it supplies, and to allow a test
harness to invoke them.

Note that the point is to be able to test the _actual package_, _as
installed_ (eg on a testbed system).  This is much better than testing
the package from the source tree during build time because it will
detect packaging mistakes as well as program source problems, and as we
know packaging mistakes of one kind or another are one of the main
causes of problems.

It seems to me that the right way is for packages to offer their tests in
the source tree.  As I say in the wiki page:

  Other possibilities include a special .deb generated by the source
  (which is a bit strange - what happens to this .deb? - and would make
  it even harder to reuse upstream test suites), or putting them in
  the .deb to be tested (definitely wrong - most people won't want the
  tests and they might be very large) or having them floating about
  separately somewhere (which prevents us from sharing and exchanging
  tests with other parts of the Free Software community).  The source
  package is always available when development is taking place.


To save you looking at the wiki page for the details of the draft
interface spec, I have reproduced it here:

  Tests/metadata
  ...

  The source package provides a test metadata file debian/tests/control.
  This is a file containing zero or more RFC822-style
  stanzas, along these lines:

  Tests: fred bill bongo
  Restrictions: needs-root breaks-computer

  This means execute debian/tests/fred, debian/tests/bill, etc.,
  each with no arguments, expecting exit status 0 and no stderr. The
  cwd is guaranteed to be the root of the source package which will
  have been built (but note that the tests must test the installed
  version).

  If the stanza contains:

  Tests-Directory: path-from-source-root

  then we execute path-from-source-root/fred, path-from-source-root/bar,
  etc. This allows tests to live outside the debian/ metadata
  area, so that they can more palatably be shared with non-Debian
  distributions.

  Any unknown thing in Restrictions, or any unknown field in the
  RFC822 stanza, causes the tester core to skip the test with a
  message like `test environment does not support "blames-canada"
  restriction of test "simpsons"'.

  Additional possibilities:

  Depends: ...
  Tests: filenamepattern*
  Restrictions: modifies-global-data needs-x-display

  etc. - moves complexity from individual packages into central
  tester core.

  A basic test could be simply running the binary and checking the
  result status (or other variants of this). Eventually every
  package would need to be changed to include at least one test.

  Ideally eventually where possible the upstream regression tests
  could be massaged so that they test the installed version. Whether
  this is possible and how best to achieve it has to be decided on a
  per-package basis.

  Even integration tests can be represented like this: if one
  package's tests Depend on the other's, then they are effectively
  integration tests. The actual tests can live in whichever package
  is most convenient.
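
As an illustration of the interface above, a minimal concrete layout for a
package might look like this (the package name and the test itself are
invented, purely as an example):

  debian/tests/control:

      Tests: smoke

  debian/tests/smoke:

      #!/bin/sh
      # Exercise the *installed* binary; exit 0 and keep stderr quiet so
      # the harness counts this as a pass.
      set -e
      frobnicator --version >/dev/null
      echo hello | frobnicator --check-input >/dev/null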


So, what do you think ?


Thanks,
Ian.
(wearing both my Debian and Ubuntu hats)

