See inline.

On Mon, Oct 26, 2015 at 9:01 AM, Will Andrews <w...@firepipe.net> wrote:

> On Mon, Oct 26, 2015 at 9:06 AM, Prakash Surya
> <prakash.su...@delphix.com> wrote:
> > I see the following all fail on illumos master when we run it through the
> > regression suite at Delphix:
> >
> >     cli_root/zfs_property/zfs_written_property_001_pos
> >     cli_root/zfs_mount/zfs_mount_all_001_pos
> >     mdb/mdb_001_pos
> >     rsend/rsend_009_pos
> >     zvol/zvol_swap/zvol_swap_004_pos
> >     rootpool/rootpool_002_neg
> >     acl/nontrivial/zfs_acl_chmod_inherit_003_pos
> >     cli_root/zfs_mount/zfs_mount_009_neg
> >     grow_replicas/grow_replicas_001_pos
> >     rsend/rsend_008_pos
> >     cli_root/zfs_snapshot/zfs_snapshot_009_pos
> >     refreserv/refreserv_004_pos
> >     cli_root/zpool_get/zpool_get_002_pos
>
> Thanks Prakash.
>
> Is there a reason why we don't simply disable these tests until they
> are fixed?  Then we can quickly validate test runs by whether they're
> 100% passing.
>
> Obviously, failing tests should be fixed... but if they are taking
> months/years to fix, then it seems to me the gray line of "not
> important enough to fix right now" has long since been crossed.
>

This is just my personal opinion, but disabling tests that fail is practically equivalent to removing them altogether. I've seen tests disabled in other projects because "they always fail", and then they're forgotten about, never fixed, and never re-enabled.

With that said, I fully understand that if a test has been failing for months/years without being addressed, it's more detrimental to let it linger in a failing state (since it will likely mask "new failures").

The solution we've used at Delphix is a simple Python script that parses the ZFS test suite output, queries our internal bug tracker for each failed test, and reports failure only if it finds a failed test that doesn't have an open bug tracking it. Short of actually fixing the bugs, I like this solution because it still runs the test and reports that it failed, but it keeps that failure from "masking" other failures.
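
For concreteness, here's a minimal sketch of the idea (not the actual Delphix script). It assumes failures show up in the results as lines of the form "FAIL <test-name>", and bug_is_open() is a placeholder for whatever bug-tracker query is actually used:

    #!/usr/bin/env python

    import re
    import sys

    # Assumed log format: one "FAIL <test-name>" line per failing test.
    FAIL_RE = re.compile(r'^\s*FAIL\s+(\S+)')

    def failed_tests(log_path):
        """Yield the name of each test the results file reports as failing."""
        with open(log_path) as log:
            for line in log:
                m = FAIL_RE.match(line)
                if m:
                    yield m.group(1)

    def bug_is_open(test_name):
        """Placeholder: in practice, query the bug tracker for an open
        bug whose subject mentions test_name."""
        known_failures = set()  # e.g. fetched from the tracker
        return test_name in known_failures

    def main(log_path):
        untracked = [t for t in failed_tests(log_path)
                     if not bug_is_open(t)]
        for t in untracked:
            print('untracked failure: %s' % t)
        return 1 if untracked else 0

    if __name__ == '__main__':
        sys.exit(main(sys.argv[1]))

With something like that in place, a run only reports overall failure when a test fails without a corresponding open bug.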

FWIW, I'm slowly working on porting this script over to the OpenZFS automated builds for GitHub pull requests (the builds are still a work in progress, but Matt mentioned them a bit during the Developer Summit last week). See:


https://github.com/prakashsurya/openzfs-build/commit/51e6c17ae6ab5a85a0470ce68a2b1cd0de5f098c

Once I get that integrated into the OpenZFS automated builds, it will hopefully make it easier to distinguish "expected failures" from "new failures introduced by a given patch"; but that will only be helpful for the automated builds kicked off via pull requests.

It might be worth pulling that script (or something equivalent) into the illumos source to make the output of the ZFS test suite easier to consume. It could probably also be made more dynamic, querying the illumos bug tracker the same way we query our internal one.
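
If the script were taught to talk to the illumos bug tracker, the lookup might look something like the following. This is purely a sketch: it assumes the illumos tracker (the Redmine instance at www.illumos.org) exposes the standard Redmine REST API, and that the "contains" subject filter is enabled there; bug_is_open() matches the placeholder in the sketch above.

    import json
    import urllib.parse
    import urllib.request

    def bug_is_open(test_name):
        """Return True if an open illumos bug mentions this test name."""
        # "~" is Redmine's "contains" operator; whether this filter is
        # enabled on illumos.org is an assumption.
        query = urllib.parse.urlencode({
            'subject': '~' + test_name,
            'status_id': 'open',
        })
        url = 'https://www.illumos.org/issues.json?' + query
        with urllib.request.urlopen(url) as resp:
            issues = json.load(resp).get('issues', [])
        return len(issues) > 0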

I think there are many different ways to "fix" this (assuming we don't just go and fix the bugs that cause the test failures); it's just a matter of deciding which way is best, and what the requirements are.


>
> --Will.
>