[racket-dev] DrDr Feature Request

2011-08-08 Thread Vincent St-Amour

I love DrDr, but there's a small thing that annoys me about it.

Some tests are prone to intermittent failures. For example, some
benchmarks need to create a file, and several benchmarks share the
same file, which leads to race conditions. Similarly, some DrRacket
tests sometimes fail for focus reasons.

So, whenever someone pushes, they may get failures from these tests,
then have to go look at the actual errors, and try to figure out if
they actually broke something or not.

(Or, they ignore these failures, which is bad.)

Here are two potential solutions. Let's assume that I just pushed
something, and a test started failing.

- Have DrDr send me email for every push about the broken test for as
  long as it fails. If I get email more than once, it's likely that I
  actually broke something. If I only get email once, the problem went
  away on its own, and was likely an intermittent failure.

- Have the possibility to flag some tests as intermittent (something
  like `drdr:random'), and only report failures for these tests if
  they fail twice in a row. This would reduce the amount of noise,
  since I expect most of these tests to pass most of the time. Actual
  breakage would still be detected, since it's unlikely that such
  failures would go away on their own. Detection would happen one push
  late, but that shouldn't be too much of an issue.

  Or, maybe only notify the pusher after two failures in a row, but
  notify the responsible person right away.
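
A minimal sketch of the second proposal and its variant, in Python rather than in DrDr's own code. Everything named here (`TestInfo`, `Notifier`, the `intermittent` flag standing in for something like `drdr:random') is hypothetical, and the behavior for unflagged tests (email both people on every failing push) is an assumption, not a description of what DrDr does.

```python
from dataclasses import dataclass, field


@dataclass
class TestInfo:
    responsible: str            # person in charge of this test
    intermittent: bool = False  # hypothetical flag, in the spirit of `drdr:random'


@dataclass
class Notifier:
    tests: dict                                   # test path -> TestInfo
    streaks: dict = field(default_factory=dict)   # test path -> (failures in a row, first pusher)

    def record(self, test, passed, pusher):
        """Record one result for one push and return the set of people to email."""
        info = self.tests[test]
        if passed:
            # A pass ends the current run of failures.
            self.streaks.pop(test, None)
            return set()

        count, first_pusher = self.streaks.get(test, (0, pusher))
        count += 1
        self.streaks[test] = (count, first_pusher)

        if not info.intermittent:
            # Assumed default: unflagged tests are reported right away.
            return {first_pusher, info.responsible}

        # Flagged test: the responsible person hears about it immediately,
        # the pusher whose push started the failures only once it repeats.
        recipients = {info.responsible}
        if count >= 2:
            recipients.add(first_pusher)
        return recipients
```

With this policy a single stray failure of a flagged test reaches only its responsible person, while a failure that persists across the next push also reaches whoever pushed when it started, one push late.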

Any thoughts?

Vincent
_
  For list-related administrative tasks:
  http://lists.racket-lang.org/listinfo/dev


Re: [racket-dev] DrDr Feature Request

2011-08-08 Thread Robby Findler
I like the two-times-in-a-row thought.

FWIW, please try to avoid race conditions of the second kind.

I think the drracket test suites are special because they fail
not-so-often and I don't actually know how to fix them.  If either of
those weren't true then I'd say they should just not run in drdr. (So
the race-condition/using the same file thing fails this test.)

Robby


Re: [racket-dev] DrDr Feature Request

2011-08-08 Thread Robby Findler
PS: I'm also happy if this class of tests only emails the responsible
person, and not the pusher.

Robby


Re: [racket-dev] DrDr Feature Request

2011-08-08 Thread Vincent St-Amour
At Mon, 8 Aug 2011 10:59:24 -0500,
Robby Findler wrote:
> FWIW, please try to avoid race conditions of the second kind.

Some of these I can try to fix. But I don't think all intermittent
failures fit in this category.

> I think the drracket test suites are special because they fail
> not-so-often and I don't actually know how to fix them.  If either of
> those weren't true then I'd say they should just not run in drdr. (So
> the race-condition/using the same file thing fails this test.)

Running these tests in DrDr has the benefit of detecting actual
breakage when it happens, so I don't think we should give up on this.

> PS: I'm also happy if this class of tests only emails the responsible
> person, and not the pusher.

I like that, and it's probably simpler to implement.

Vincent


Re: [racket-dev] DrDr Feature Request

2011-08-08 Thread Robby Findler
On Mon, Aug 8, 2011 at 11:05 AM, Vincent St-Amour stamo...@ccs.neu.edu wrote:
> At Mon, 8 Aug 2011 10:59:24 -0500,
> Robby Findler wrote:
> > FWIW, please try to avoid race conditions of the second kind.
>
> Some of these I can try to fix. But I don't think all intermittent
> failures fit in this category.

Right. I'm saying this: if you know you have a race condition and you
know how to fix it, then it is better to fix it than to use a
mechanism like the one you're asking for.
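
One way to fix the shared-file race mentioned at the start of the thread is to give every benchmark run its own scratch file. The sketch below is in Python, not the benchmarks' actual code; `run_benchmark` and its arguments are hypothetical, and it assumes the benchmark only needs some writable file, not a specific path.

```python
import os
import tempfile


def run_benchmark(name, payload):
    """Write `payload` to a scratch file owned by this run only, then read it back."""
    # mkstemp creates a uniquely named file atomically, so two benchmarks
    # running at the same time never end up sharing a path.
    fd, path = tempfile.mkstemp(prefix=f"{name}-", suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as out:
            out.write(payload)
        with open(path) as inp:
            return inp.read()
    finally:
        # Clean up so repeated runs do not leave files behind.
        os.remove(path)
```

Because no two runs touch the same file, the result no longer depends on how concurrent benchmarks happen to interleave.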

Robby


Re: [racket-dev] DrDr Feature Request

2011-08-08 Thread Vincent St-Amour
At Mon, 8 Aug 2011 11:06:30 -0500,
Robby Findler wrote:
> On Mon, Aug 8, 2011 at 11:05 AM, Vincent St-Amour stamo...@ccs.neu.edu wrote:
> > At Mon, 8 Aug 2011 10:59:24 -0500,
> > Robby Findler wrote:
> > > FWIW, please try to avoid race conditions of the second kind.
> >
> > Some of these I can try to fix. But I don't think all intermittent
> > failures fit in this category.
>
> Right. I'm saying this: if you know you have a race condition and you
> know how to fix it, then it is better to fix it than to use a
> mechanism like the one you're asking for.

I agree.

Vincent


Re: [racket-dev] DrDr Feature Request

2011-08-08 Thread Jon Rafkind
Another request: could DrDr process the latest push first? It's a little
annoying to get emails for tests that failed when the latest push fixes
them but DrDr is so far behind. Is there any benefit to testing all the
intermediate pushes?



Re: [racket-dev] DrDr Feature Request

2011-08-08 Thread Robby Findler
This is a rare event (playing catchup like this) so I think it is
probably best if we just let it catch up. Should be just a couple of
more days (maybe a week) by my sketchy guesstimationizing.

Robby


Re: [racket-dev] DrDr Feature Request

2011-08-08 Thread Jon Rafkind
Could DrDr say "This build is not the latest" or "The latest push is
234234"?

On 08/08/2011 11:37 AM, Jay McCarthy wrote:
> It is useful to test all of them to find out when errors start. It
> doesn't do the newest first, because then the calculation of new
> issue wouldn't make any sense, because you wouldn't have the previous
> push's tests.
>
> Jay
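
A minimal sketch of the in-order comparison Jay describes, in Python. The function name and the data layout (a list of push ids paired with failing test names) are assumptions made for illustration, not DrDr's actual representation.

```python
def attribute_failures(pushes):
    """Map each push to the tests that started failing at that push.

    `pushes` is a list of (push_id, failing_tests) pairs in the order
    the pushes were made.
    """
    introduced = {}
    previous = set()
    for push_id, failing in pushes:
        failing = set(failing)
        # A failure is "new" only relative to the push just before it.
        introduced[push_id] = failing - previous
        previous = failing
    return introduced
```

If a later push were tested before its predecessor, `previous` would be missing and every failure in that push would look new, which is the problem with testing the newest push first.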



Re: [racket-dev] DrDr Feature Request

2011-08-08 Thread Jay McCarthy
Your wish is my command.

On Mon, Aug 8, 2011 at 10:00 AM, Robby Findler
ro...@eecs.northwestern.edu wrote:
> PS: I'm also happy if this class of tests only emails the responsible
> person, and not the pusher.




-- 
Jay McCarthy j...@cs.byu.edu
Assistant Professor / Brigham Young University
http://faculty.cs.byu.edu/~jay

The glory of God is Intelligence - DC 93


Re: [racket-dev] DrDr Feature Request

2011-08-08 Thread Jon Rafkind
I noticed this functionality just now.. thanks a lot!

On 08/08/2011 12:38 PM, Jay McCarthy wrote:
> Your wish is my command.
