Re: cron: better error checking of random values

2023-05-06 Thread Todd C . Miller
Now that the random range changes are committed I would like to
revisit this diff.

This fixes two issues with the parsing of random values:

1) Garbage after a random value is now detected.  This fixes a bug
   in the random range parsing that handles the optional final
   number.  For example:

~foo* * * * echo invalid
0~59bar * * * * echo invalid
10~%$!  * * * * echo invalid

  These kind of syntax errors are already detected for normal ranges.

2) Bounds check the high and low numbers in a random range.
   Previously, only the final randomized number was bounds-checked
   (which is usually too late).  The bounds are checked for normal
   ranges in set_element() but in the case of random ranges this
   is too late.  The following invalid entry is now rejected.

0~60  * * * * echo max minute is 59

   Whereas before it would work most (but not all!) of the time.

OK?

 - todd

Index: usr.sbin/cron/entry.c
===
RCS file: /cvs/src/usr.sbin/cron/entry.c,v
retrieving revision 1.54
diff -u -p -u -r1.54 entry.c
--- usr.sbin/cron/entry.c   6 May 2023 23:06:27 -   1.54
+++ usr.sbin/cron/entry.c   6 May 2023 23:36:56 -
@@ -499,12 +499,24 @@ get_range(bitstr_t *bits, int low, int h
/* get the (optional) number following the tilde
 */
ch = get_number(&num2, low, names, ch, file, "/, \t\n");
-   if (ch == EOF)
+   if (ch == EOF) {
+   /* no second number, check for valid terminator
+*/
ch = get_char(file);
+   if (!strchr("/, \t\n", ch)) {
+   unget_char(ch, file);
+   return (EOF);
+   }
+   }
if (ch == EOF || num1 > num2) {
unget_char(ch, file);
return (EOF);
}
+
+   /* we must perform the bounds checking ourselves
+*/
+   if (num1 < low || num2 > high)
+   return (EOF);
 
if (ch == '/') {
/* randomize the step value instead of num1



Re: cron: better error checking of random values

2023-05-05 Thread Mark Jamsek
On 23-05-05 10:41AM, Todd C. Miller wrote:
> On Fri, 05 May 2023 16:13:01 +1000, Mark Jamsek wrote:
> 
> > I found kn's attempted syntax intuitive though; it feels like a natural
> > extension of the existing random and step syntax. I also assumed ~/15
> > would run every 15 minutes starting with a random minute, and since
> > discovering it didn't work like that, I've been carrying a simple patch
> > that allows kn's syntax:
> >
> >   ~/15  random 15 minute intervals in [0, 59]
> >   1~9/10random 10 minute intervals in [1,59]
> 
> It does seem like people prefer the ~/step syntax over my own, which
> is fine.
> 
> However, I'm unsure about the x~y/step syntax.  Personally, I think
> it would be more natural to treat x~y/step like x-y/step and just
> use a random offset in the interval [0,step-1].  Do you have a
> use-case for the way your diff handles this or did it just follow
> naturally from the random value being in num1?
> 
> Also, using the % operator with the random value results in modulo
> bias, which we would like to avoid.  If we use a random offset based
> on the step interval instead there is no need for the modulus.

Both tbh. I primarily wanted to get intervals that would not fall on
common times where I already had jobs running; namely, 0th minutes
(e.g., 0, 30). And I described the second example wrong, so this was
most likely lost in my first response:

> >   1~9/10random 10 minute intervals in [1,59]

The "intervals in [1,59]" is not correct in this example. Rather, this
will result in an initial random value 1 <= n <= 9, so each interval in
this step /10 case will also fall in [N1,N9] (e.g., 7, 17, 27, ...) and
avoid said common times. And the modulo is used to get the first point
in [low,high] from which to start the step loop that fit this
constraint. In other words, I wanted some control over where the random
intervals landed such that being able to avoid ranges was as much
a requirement, in my use case, as random steps.

But I also didn't want to change the code too much so it just followed
naturally. This could be done other ways, too, and I'm not presuming
such a requirement is as important to anyone else.

-- 
Mark Jamsek 
GPG: F2FF 13DE 6A06 C471 CA80  E6E2 2930 DC66 86EE CF68


Re: cron: better error checking of random values

2023-05-05 Thread Todd C . Miller
On Fri, 05 May 2023 16:13:01 +1000, Mark Jamsek wrote:

> I found kn's attempted syntax intuitive though; it feels like a natural
> extension of the existing random and step syntax. I also assumed ~/15
> would run every 15 minutes starting with a random minute, and since
> discovering it didn't work like that, I've been carrying a simple patch
> that allows kn's syntax:
>
>   ~/15random 15 minute intervals in [0, 59]
>   1~9/10  random 10 minute intervals in [1,59]

It does seem like people prefer the ~/step syntax over my own, which
is fine.

However, I'm unsure about the x~y/step syntax.  Personally, I think
it would be more natural to treat x~y/step like x-y/step and just
use a random offset in the interval [0,step-1].  Do you have a
use-case for the way your diff handles this or did it just follow
naturally from the random value being in num1?

Also, using the % operator with the random value results in modulo
bias, which we would like to avoid.  If we use a random offset based
on the step interval instead there is no need for the modulus.

 - todd



Re: cron: better error checking of random values

2023-05-05 Thread Peter Hessler
On 2023 May 05 (Fri) at 16:13:01 +1000 (+1000), Mark Jamsek wrote:
:On 23-05-04 05:40PM, Todd C. Miller wrote:
:> On Thu, 04 May 2023 21:41:26 -, Klemens Nanni wrote:
:> 
:> > On Thu, May 04, 2023 at 03:30:30PM -0600, Todd C. Miller wrote:
:> > > This fixes two issues with the parsing of random values:
:> > > 
:> > > 1) A random value with a step is now rejected.  For example:
:> > > 
:> > > ~/10* * * * echo invalid
:> >
:> > I've ben using ~/10 to randomly distribute four similar tasks so that
:> > they don't start at the same time.
:> >
:> > Is that wrong?
:> 
:> I'm fairly certain that doesn't do what you think it does.  When I
:> tested it "~/10" behaved the same as "~".  The step value is not
:> even parsed.
:

I really dislike "previously accepted (even if behaved differently)"
configs being rejected ...


:todd is correct in that the step value is not parsed with "~/10". We
:recently discovered this when setting up Got mirrors to sync every 15
:minutes. IIRC, Lucas (or op?) asked about syncing each mirror at
:a different 15 minute interval by using the same syntax kn is using.
:
:I found kn's attempted syntax intuitive though; it feels like a natural
:extension of the existing random and step syntax. I also assumed ~/15
:would run every 15 minutes starting with a random minute, and since
:discovering it didn't work like that, I've been carrying a simple patch
:that allows kn's syntax:
:
:  ~/15 random 15 minute intervals in [0, 59]
:  1~9/10   random 10 minute intervals in [1,59]
:

... but I really like this syntax and behaviour.

I haven't had a chance to review the code, but I think this would be a
better direction for us to go.

-peter


:8<
:diff refs/remotes/origin/master refs/heads/master
:commit - e253a7cc21de530da6fcf49c1279258fecade8f4
:commit + 761b09ae46431344766330cc14c958ffca5a3a0a
:blob - ab683b8476a8c862aabc53101b4080959820835a
:blob + 030ab599dcf07eb3e94efe90c118e4e9bea8f6c4
:--- usr.sbin/cron/entry.c
:+++ usr.sbin/cron/entry.c
:@@ -456,10 +456,11 @@ get_range(bitstr_t *bits, int low, int high, const cha
:   /* range = number | number* "~" number* | number "-" number ["/" number]
:*/
: 
:-  int i, num1, num2, num3;
:+  int i, num1, num2, num3, rndstep;
: 
:   num1 = low;
:   num2 = high;
:+  rndstep = 0;
: 
:   if (ch == '*') {
:   /* '*' means [low, high] but can still be modified by /step
:@@ -497,7 +498,7 @@ get_range(bitstr_t *bits, int low, int high, const cha
: 
:   /* get the (optional) number following the tilde
:*/
:-  ch = get_number(&num2, low, names, ch, file, ", \t\n");
:+  ch = get_number(&num2, low, names, ch, file, "/, \t\n");
:   if (ch == EOF)
:   ch = get_char(file);
:   if (ch == EOF || num1 > num2) {
:@@ -509,6 +510,10 @@ get_range(bitstr_t *bits, int low, int high, const cha
:*/
:   num3 = num1;
:   num1 = arc4random_uniform(num2 - num3 + 1) + num3;
:+  if (ch == '/') {
:+  rndstep = 1;
:+  break;
:+  }
:   /* FALLTHROUGH */
:   default:
:   /* not a range, it's a single number.
:@@ -538,6 +543,10 @@ get_range(bitstr_t *bits, int low, int high, const cha
:   ch = get_number(&num3, 0, NULL, ch, file, ", \t\n");
:   if (ch == EOF || num3 == 0)
:   return (EOF);
:+  if (rndstep) {
:+  num1 %= num3;
:+  num2 = high;
:+  }
:   } else {
:   /* no step.  default==1.
:*/
:>8
:
:> It sounds like what you want is the proposed syntax "*/~10"
:> to use a random offset.
:
:But this would be nice too! Anything that enables regular intervals from
:a random offset would satisfy a common enough use case.
:
:-- 
:Mark Jamsek 
:GPG: F2FF 13DE 6A06 C471 CA80  E6E2 2930 DC66 86EE CF68

-- 
Time flies like an arrow, but fruit flies like a banana.



Re: cron: better error checking of random values

2023-05-05 Thread Klemens Nanni
On Thu, May 04, 2023 at 05:40:10PM -0600, Todd C. Miller wrote:
> I'm fairly certain that doesn't do what you think it does.  When I
> tested it "~/10" behaved the same as "~".  The step value is not
> even parsed.

Oh I see, it is actually picking a random minute and ignores steps,
so entries run once an hour:

~/10 * * * *-qs touch /tmp/cron-foo.$(date +\%s)
~/10 * * * *-qs touch /tmp/cron-bar.$(date +\%s)
~/10 * * * *-qs touch /tmp/cron-lol.$(date +\%s)

from earlier gave me:

-rw-r--r--  1 kn  wheel  0 May  5 18:08 /tmp/cron-bar.1683295681
-rw-r--r--  1 kn  wheel  0 May  5 19:08 /tmp/cron-bar.1683299281
-rw-r--r--  1 kn  wheel  0 May  5 18:34 /tmp/cron-foo.1683297241
-rw-r--r--  1 kn  wheel  0 May  5 18:11 /tmp/cron-lol.1683295861
-rw-r--r--  1 kn  wheel  0 May  5 19:11 /tmp/cron-lol.1683299461

> 
> It sounds like what you want is the proposed syntax "*/~10"
> to use a random offset.

Indeed, thanks.



Re: cron: better error checking of random values

2023-05-04 Thread Mark Jamsek
On 23-05-04 05:40PM, Todd C. Miller wrote:
> On Thu, 04 May 2023 21:41:26 -, Klemens Nanni wrote:
> 
> > On Thu, May 04, 2023 at 03:30:30PM -0600, Todd C. Miller wrote:
> > > This fixes two issues with the parsing of random values:
> > > 
> > > 1) A random value with a step is now rejected.  For example:
> > > 
> > > ~/10* * * * echo invalid
> >
> > I've ben using ~/10 to randomly distribute four similar tasks so that
> > they don't start at the same time.
> >
> > Is that wrong?
> 
> I'm fairly certain that doesn't do what you think it does.  When I
> tested it "~/10" behaved the same as "~".  The step value is not
> even parsed.

todd is correct in that the step value is not parsed with "~/10". We
recently discovered this when setting up Got mirrors to sync every 15
minutes. IIRC, Lucas (or op?) asked about syncing each mirror at
a different 15 minute interval by using the same syntax kn is using.

I found kn's attempted syntax intuitive though; it feels like a natural
extension of the existing random and step syntax. I also assumed ~/15
would run every 15 minutes starting with a random minute, and since
discovering it didn't work like that, I've been carrying a simple patch
that allows kn's syntax:

  ~/15  random 15 minute intervals in [0, 59]
  1~9/10random 10 minute intervals in [1,59]

8<
diff refs/remotes/origin/master refs/heads/master
commit - e253a7cc21de530da6fcf49c1279258fecade8f4
commit + 761b09ae46431344766330cc14c958ffca5a3a0a
blob - ab683b8476a8c862aabc53101b4080959820835a
blob + 030ab599dcf07eb3e94efe90c118e4e9bea8f6c4
--- usr.sbin/cron/entry.c
+++ usr.sbin/cron/entry.c
@@ -456,10 +456,11 @@ get_range(bitstr_t *bits, int low, int high, const cha
/* range = number | number* "~" number* | number "-" number ["/" number]
 */
 
-   int i, num1, num2, num3;
+   int i, num1, num2, num3, rndstep;
 
num1 = low;
num2 = high;
+   rndstep = 0;
 
if (ch == '*') {
/* '*' means [low, high] but can still be modified by /step
@@ -497,7 +498,7 @@ get_range(bitstr_t *bits, int low, int high, const cha
 
/* get the (optional) number following the tilde
 */
-   ch = get_number(&num2, low, names, ch, file, ", \t\n");
+   ch = get_number(&num2, low, names, ch, file, "/, \t\n");
if (ch == EOF)
ch = get_char(file);
if (ch == EOF || num1 > num2) {
@@ -509,6 +510,10 @@ get_range(bitstr_t *bits, int low, int high, const cha
 */
num3 = num1;
num1 = arc4random_uniform(num2 - num3 + 1) + num3;
+   if (ch == '/') {
+   rndstep = 1;
+   break;
+   }
/* FALLTHROUGH */
default:
/* not a range, it's a single number.
@@ -538,6 +543,10 @@ get_range(bitstr_t *bits, int low, int high, const cha
ch = get_number(&num3, 0, NULL, ch, file, ", \t\n");
if (ch == EOF || num3 == 0)
return (EOF);
+   if (rndstep) {
+   num1 %= num3;
+   num2 = high;
+   }
} else {
/* no step.  default==1.
 */
>8

> It sounds like what you want is the proposed syntax "*/~10"
> to use a random offset.

But this would be nice too! Anything that enables regular intervals from
a random offset would satisfy a common enough use case.

-- 
Mark Jamsek 
GPG: F2FF 13DE 6A06 C471 CA80  E6E2 2930 DC66 86EE CF68


Re: cron: better error checking of random values

2023-05-04 Thread Todd C . Miller
On Thu, 04 May 2023 21:41:26 -, Klemens Nanni wrote:

> On Thu, May 04, 2023 at 03:30:30PM -0600, Todd C. Miller wrote:
> > This fixes two issues with the parsing of random values:
> > 
> > 1) A random value with a step is now rejected.  For example:
> > 
> > ~/10* * * * echo invalid
>
> I've ben using ~/10 to randomly distribute four similar tasks so that
> they don't start at the same time.
>
> Is that wrong?

I'm fairly certain that doesn't do what you think it does.  When I
tested it "~/10" behaved the same as "~".  The step value is not
even parsed.

It sounds like what you want is the proposed syntax "*/~10"
to use a random offset.
 
 - todd



Re: cron: better error checking of random values

2023-05-04 Thread Klemens Nanni
On Thu, May 04, 2023 at 03:30:30PM -0600, Todd C. Miller wrote:
> This fixes two issues with the parsing of random values:
> 
> 1) A random value with a step is now rejected.  For example:
> 
> ~/10* * * * echo invalid

I've ben using ~/10 to randomly distribute four similar tasks so that
they don't start at the same time.

Is that wrong?

I could use, e.g. 1/10, 3/10, 5/10, 7/10 and 9/10, but ~/10 seems nicer.

> 0~59/10 * * * * echo invalid
> 10~/10  * * * * echo invalid
> ~40/10  * * * * echo invalid
> 
> Previously, the '/' would just be discarded.
> 
> 2) The high and low random bound values are now checked.  Previously,
>only the randomized number was bounds-checked (which is usually
>too late).  This is more consistent with the behavior of ranges
>(low-high).  The following invalid entry is now rejected.
> 
> 0~60  * * * * echo max minute is 59
> 
>Whereas before it would work most (but not all!) of the time.
> 
> OK?
> 
>  - todd
> 
> diff -u -p -u -r1.53 entry.c
> --- usr.sbin/cron/entry.c 21 May 2022 01:21:29 -  1.53
> +++ usr.sbin/cron/entry.c 4 May 2023 21:19:40 -
> @@ -498,12 +498,17 @@ get_range(bitstr_t *bits, int low, int h
>   /* get the (optional) number following the tilde
>*/
>   ch = get_number(&num2, low, names, ch, file, ", \t\n");
> - if (ch == EOF)
> + if (ch == EOF) {
> + /* no second number, check for valid terminator
> +  */
>   ch = get_char(file);
> - if (ch == EOF || num1 > num2) {
> - unget_char(ch, file);
> - return (EOF);
> + if (!strchr(", \t\n", ch)) {
> + unget_char(ch, file);
> + return (EOF);
> + }
>   }
> + if (num1 > num2 || num1 < low || num2 > high)
> + return (EOF);
>  
>   /* get a random number in the interval [num1, num2]
>*/
> 



cron: better error checking of random values

2023-05-04 Thread Todd C . Miller
This fixes two issues with the parsing of random values:

1) A random value with a step is now rejected.  For example:

~/10* * * * echo invalid
0~59/10 * * * * echo invalid
10~/10  * * * * echo invalid
~40/10  * * * * echo invalid

Previously, the '/' would just be discarded.

2) The high and low random bound values are now checked.  Previously,
   only the randomized number was bounds-checked (which is usually
   too late).  This is more consistent with the behavior of ranges
   (low-high).  The following invalid entry is now rejected.

0~60  * * * * echo max minute is 59

   Whereas before it would work most (but not all!) of the time.

OK?

 - todd

diff -u -p -u -r1.53 entry.c
--- usr.sbin/cron/entry.c   21 May 2022 01:21:29 -  1.53
+++ usr.sbin/cron/entry.c   4 May 2023 21:19:40 -
@@ -498,12 +498,17 @@ get_range(bitstr_t *bits, int low, int h
/* get the (optional) number following the tilde
 */
ch = get_number(&num2, low, names, ch, file, ", \t\n");
-   if (ch == EOF)
+   if (ch == EOF) {
+   /* no second number, check for valid terminator
+*/
ch = get_char(file);
-   if (ch == EOF || num1 > num2) {
-   unget_char(ch, file);
-   return (EOF);
+   if (!strchr(", \t\n", ch)) {
+   unget_char(ch, file);
+   return (EOF);
+   }
}
+   if (num1 > num2 || num1 < low || num2 > high)
+   return (EOF);
 
/* get a random number in the interval [num1, num2]
 */