check if message is in a particular sequence?

2021-04-29 Thread Paul Fox
What's the fastest/easiest way to check if a particular message
is a member of a particular sequence?

I thought I'd be able to compare the message number against the output
of "mark -list", but since sequences can be abbreviated with range
notation, that gets complicated quickly.  For example, message 6 is in
the sequence todo, but it's hard to tell from this:
$ mark -list -sequence todo
todo: 1 3 5-8 13 16 46 49 52-53

paul
=--
 paul fox, p...@foxharp.boston.ma.us (arlington, ma, where it's 47.3 degrees)




Re: check if message is in a particular sequence?

2021-04-29 Thread Ken Hornstein
>What's the fastest/easiest way to check if a particular message
>is a member of a particular sequence?
>
>I thought I'd be able to compare the message number against the output
>of "mark -list", but since sequences can be abbreviated with range
>notation, that gets complicated quickly.  For example, message 6 is in
>the sequence todo, but it's hard to tell from this:

You can get the expanded list of messages in a sequence by doing this:

scan -format '%(msg)' sequence-name

I think from there, it's easy.  I do not know of a better way, but it
wouldn't surprise me if someone came up with something better.
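
One way to finish the job from that expanded list, sketched as a shell
function.  The name in_list is made up for illustration, and it only
assumes that scan prints one message number per line with the format
above.

```shell
# in_list MSG: succeed if MSG appears as a whole line on stdin.
# Hypothetical helper, not part of nmh.
in_list() {
    # -F: fixed string, -x: match the whole line, -q: exit status only
    grep -Fqx -- "$1"
}

# Intended use (needs nmh and a sequence named todo):
#   scan -format '%(msg)' todo | in_list 6 && echo "6 is in todo"
```

Matching the whole line matters here: without -x, looking for 6 would
also succeed on 16 or 46.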

--Ken



Re: check if message is in a particular sequence?

2021-04-29 Thread Bob Carragher
On Thu, 29 Apr 2021 12:49:40 -0400 Ken Hornstein  sez:

> >What's the fastest/easiest way to check if a particular message
> >is a member of a particular sequence?
> >
> >I thought I'd be able to compare the message number against the output
> >of "mark -list", but since sequences can be abbreviated with range
> >notation, that gets complicated quickly.  For example, message 6 is in
> >the sequence todo, but it's hard to tell from this:
>
> You can get the expanded list of messages in a sequence by doing this:
>
>   scan -format '%(msg)' sequence-name
>
> I think from there, it's easy.  I do not know of a better way, but it
> wouldn't surprise me if someone came up with something better.

I don't know that it's better, but my first thought was awk(1) to
parse the mark(1mh) sequence and search for the message number in
there.  It saves on filesystem access, but would suffer compared
with Ken's approach in that deleted messages would still be
reported -- which may or may not be what Paul wants.  Also, I'm
making assumptions about mark(1mh)'s output format, which is
wholly unnecessary (and more brittle than) if you use scan(1mh).

Checking for message # 6:

 $ echo 'todo: 1 3 5-8 13 16 46 49 52-53' |
     awk -v MY_MSG=6 -v RS=" " '
       BEGIN {missing=1}
       /^[0-9]+$/ {if (MY_MSG==$0) {missing=0; print "in sequence entry"}}
       /^[0-9]+-[0-9]+$/ {split($0,ss,"-")
         if (MY_MSG>=ss[1] && MY_MSG<=ss[2]) {missing=0; print "in sequence range"}}
       END {if (missing) print "not in sequence"}'
 in sequence range

Checking for message # 49:

 $ echo 'todo: 1 3 5-8 13 16 46 49 52-53' |
     awk -v MY_MSG=49 -v RS=" " '
       BEGIN {missing=1}
       /^[0-9]+$/ {if (MY_MSG==$0) {missing=0; print "in sequence entry"}}
       /^[0-9]+-[0-9]+$/ {split($0,ss,"-")
         if (MY_MSG>=ss[1] && MY_MSG<=ss[2]) {missing=0; print "in sequence range"}}
       END {if (missing) print "not in sequence"}'
 in sequence entry

Checking for (out-of-sequence) message # 99:

 $ echo 'todo: 1 3 5-8 13 16 46 49 52-53' |
     awk -v MY_MSG=99 -v RS=" " '
       BEGIN {missing=1}
       /^[0-9]+$/ {if (MY_MSG==$0) {missing=0; print "in sequence entry"}}
       /^[0-9]+-[0-9]+$/ {split($0,ss,"-")
         if (MY_MSG>=ss[1] && MY_MSG<=ss[2]) {missing=0; print "in sequence range"}}
       END {if (missing) print "not in sequence"}'
 not in sequence
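
For scripting, the same idea can be wrapped in a function that reports
membership through its exit status.  This is only a sketch: the name
seq_contains is made up, and it assumes the "name: n n-m ..." layout
shown in the mark -list output above.  It uses awk's default record and
field splitting rather than RS=" ", so the last entry on a line needs
no special handling.

```shell
# seq_contains MSG: read "name: 1 3 5-8 ..." lines (mark -list style) on
# stdin; exit 0 if MSG is listed, either singly or inside a range.
# The function name and interface are illustrative, not part of nmh.
seq_contains() {
    awk -v msg="$1" '
        {
            for (i = 1; i <= NF; i++) {
                # Skip words such as "todo:" or "(private):".
                if ($i !~ /^[0-9]+(-[0-9]+)?$/) continue
                n = split($i, r, "-")
                lo = r[1] + 0
                hi = (n == 2 ? r[2] : r[1]) + 0
                if (lo <= msg + 0 && msg + 0 <= hi) found = 1
            }
        }
        END { exit !found }
    '
}

# Intended use (needs nmh):
#   mark -list -sequence todo | seq_contains 6 && echo "in sequence"
```

The explicit "+ 0" forces numeric comparison, so 9 is not treated as
greater than 10 the way it would be if the values were compared as
strings.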

Bob



Re: check if message is in a particular sequence?

2021-04-29 Thread Paul Fox
ken wrote:
 > >What's the fastest/easiest way to check if a particular message
 > >is a member of a particular sequence?
 > >
 > >I thought I'd be able to compare the message number against the output
 > >of "mark -list", but since sequences can be abbreviated with range
 > >notation, that gets complicated quickly.  For example, message 6 is in
 > >the sequence todo, but it's hard to tell from this:
 > 
 > You can get the expanded list of messages in a sequence by doing this:
 > 
 >  scan -format '%(msg)' sequence-name
 > 
 > I think from there, it's easy.  I do not know of a better way, but it
 > wouldn't surprise me if someone came up with something better.

Thanks.  Seems...  overkill...  to have to fire up a parsing language,
but it will definitely work.

Bob's solution falls in the same (mostly good :-) category.

It also seems like mark(1) could do it.  Currently it ignores its message
args if -list is given.  It could be enhanced so that if message
args are given, then -list would only output sequence membership for
the given args.

...time passes...

I have a partial patch for mark which does what I described.  Don't
know whether it's worth it or not.

paul
=--
paul fox, p...@foxharp.boston.ma.us (arlington, ma, where it's 47.5 degrees)




Re: check if message is in a particular sequence?

2021-04-29 Thread Ken Hornstein
>Thanks.  Seems...  overkill...  to have to fire up a parsing language,
>but it will definitely work.

Remind me ... how many parsing languages get fired up for a random Unix
command on a modern Unix system, today? :-)

But, okay, that's not really fair.  The mh-format language is, I think,
pretty lightweight in terms of embeddable languages.  Probably more
problematic is I think scan(1) opens the message even though it
doesn't need anything from it.

>It also seems like mark(1) could do it.  Currently it ignores its
>message args if -list is given.  It could be enhanced so that if
>message args are given, then -list would only output sequence
>membership for the given args.
>
>...time passes...
>
>I have a partial patch for mark which does what I described.  Don't
>know whether it's worth it or not.

I think that would make sense to me; I cannot think of a reason why that
wouldn't generally be useful.

--Ken



Re: check if message is in a particular sequence?

2021-04-30 Thread Paul Fox
ken wrote:
 > >Thanks.  Seems...  overkill...  to have to fire up a parsing language,
 > >but it will definitely work.
 > 
 > Remind me ... how many parsing languages get fired up for a random Unix
 > command on a modern Unix system, today? :-)

Touché!

 > >It also seems like mark(1) could do it.  Currently it ignores its
 > >message args if -list is given.  It could be enhanced so that if
 > >message args are given, then -list would only output sequence
 > >membership for the given args.
 > >
 > >...time passes...
 > >
 > >I have a partial patch for mark which does what I described.  Don't
 > >know whether it's worth it or not.
 > 
 > I think that would make sense to me; I cannot think of a reason why that
 > wouldn't generally be useful.

I have a patch, including man page and test script changes.  The
change is somewhat backward-INcompatible, in that previously, "mark
-list" ignored any supplied "msgs".  Now they make a difference.

$ mark -list
cur: 1
odd: 1 3 5 7 9
even: 2 4 6 8 10

$ mark -seq even -list
even: 2 4 6 8 10

$ mark -seq even -list 2-6
even: 2 4 6 <-- previously output was "even: 2 4 6 8 10"

$ mark -seq odd -list 2-6
odd: 3 5 <-- previously output was "odd: 1 3 5 7 9"

$ mark -list 2-6 <-- previously behaved as "mark -list", above
odd: 3 5
even: 2 4 6

$ mark -list 1-4 <-- previously behaved as "mark -list", above
cur: 1
odd: 1 3
even: 2 4

From the new man page:
The -list switch tells mark to list the messages associated with
sequences specified by a -sequence switch, or with any sequences if no
-sequence switch is present.  If the sequence is private, this will be
indicated.  If msgs are specified, then only the sequence memberships
for the given messages are shown, either for all sequences, or just
for those named by -sequence switches.  The -zero switch does not
affect the operation of -list.

Are we in any kind of waiting period for the next release?

Any objections if I commit to master?

paul
=--
paul fox, p...@foxharp.boston.ma.us (arlington, ma, where it's 57.2 degrees)




Re: check if message is in a particular sequence?

2021-04-30 Thread David Levine
Paul wrote:

> I have a patch, including man page and test script changes.  The
> change is somewhat backward-INcompatible, in that previously, "mark
> -list" ignored any supplied "msgs".  Now they make a difference.

That almost seems like a bug fix to me.  Anyway, we're not frozen for
the new release, so I think it would be fine to add it.

David



Re: check if message is in a particular sequence?

2021-04-30 Thread Ralph Corderoy
Hi Paul,

> Ken wrote:
> > scan -format '%(msg)' sequence-name
>
> Thanks.  Seems...  overkill...  to have to fire up a parsing language,

Haven't heard fgrep(1) called that before.  :-)

$ scan -format '%(msg)' last:10 | fgrep -qx 42; echo $?
1
$ scan -format '%(msg)' last:10 | fgrep -qx 96910; echo $?
0

Ken points out each message is opened by scan(1) even though the read
headers aren't used.  I thought

pick -list last:10

may avoid that, but it also opens each file.  I suppose that means the
code which assembles the bytecode to do the formatting should set a flag
whenever it needs something from the email and if that's still clear
when the bytecode is run then the file need not be read.  Or, open the
file and read the headers lazily on first access.

-- 
Cheers, Ralph.



Re: check if message is in a particular sequence?

2021-04-30 Thread Ralph Corderoy
Hi Paul,

> I thought I'd be able to compare the message number against the output
> of "mark -list", but since sequences can be abbreviated with range
> notation, that gets complicated quickly.  For example, message 6 is in
> the sequence todo, but it's hard to tell from this:
> $ mark -list -sequence todo
> todo: 1 3 5-8 13 16 46 49 52-53

This will expand the message numbers printed by mark(1) making a check
by something like ‘fgrep -qx 42’ trivial.

mark -list -seq public -seq private -seq notexist |
sed 's/.*: //' |
awk '{
for (n = 1; n <= NF; n++) {
c = split($n, w, /-/)
a = w[1]; b = c == 1 ? w[1] : w[2]
for (m = a; m <= b; m++) print m
}
}'

(My ~/bin/toseq happens to do roughly the opposite, compressing
sequential numbers into a range.  :-)
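
For the curious, the compressing direction can be sketched in a few
lines of awk.  This is not the toseq script mentioned above, whose
contents are not shown in this thread; compress_ranges is a made-up
name, and it assumes its input is ascending message numbers, one per
line.

```shell
# compress_ranges: read ascending message numbers, one per line, and print
# them on a single line with runs collapsed to "first-last".
# Hypothetical helper for illustration only.
compress_ranges() {
    awk '
        # Append the run start..prev to the output line.
        function emit_run() {
            if (start == "") return
            out = out (out == "" ? "" : " ") (start == prev ? start : start "-" prev)
        }
        /^[0-9]+$/ {
            n = $0 + 0
            if (start != "" && n == prev + 1) { prev = n; next }
            emit_run()
            start = prev = n
        }
        END { emit_run(); print out }
    '
}
```

Like mark, this sketch writes a run of just two messages as a range, so
9 and 10 come out as 9-10.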

-- 
Cheers, Ralph



Re: check if message is in a particular sequence?

2021-04-30 Thread Ralph Corderoy
Hi Bob,

> but would suffer compared with Ken's approach in that deleted messages
> would still be reported

I don't think they would; mark(1) doesn't output removed messages,
whether by rmm(1) or through the filesystem.

$ p -seq bob last:3
3 hits
$ mark -list -s bob
bob: 96916-96918
$ rmm 96917
$ mark -list -s bob
bob: 96916 96918
$ rm `mhpath 96918`
$ mark -list -s bob
bob: 96916
$

-- 
Cheers, Ralph.



Re: check if message is in a particular sequence?

2021-05-01 Thread Ralph Corderoy
Hi Paul,

> $ mark -list
> cur: 1
> odd: 1 3 5 7 9
> even: 2 4 6 8 10
>
> $ mark -seq even -list
> even: 2 4 6 8 10
>
> $ mark -seq even -list 2-6
> even: 2 4 6   <-- previously output was "even: 2 4 6 8 10"
>
> $ mark -seq odd -list 2-6
> odd: 3 5  <-- previously output was "odd: 1 3 5 7 9"

Up to here seems fine, assuming ‘2-6’ can also be ‘3 5 2 4 6’ or
‘3 5 even:3’.  IOW, all the things I could normally scan(1), etc.

> $ mark -list 2-6 <-- previously behaved as "mark -list", above
> odd: 3 5
> even: 2 4 6

I would have expected an extra line,

$ mark -list 2-6
 +  cur: 
odd: 3 5
even: 2 4 6

because the messages given are being intersected with the normal
‘mark -list’ output you showed at the start above.  IOW, if no messages
are given then the default is ‘all’.  This seems more orthogonal to me
and means a script can give multiple sequences and expect one line for
each in the output in the order the sequences were stated; there's no
need to parse the ‘foo:’ or ‘bar (private):’ to identify the sequence
involved.

> $ mark -list 1-4 <-- previously behaved as "mark -list", above
> cur: 1
> odd: 1 3
> even: 2 4

An example not given here would be empty sequences, i.e. ones which
don't exist.  Currently:

$ mark -l -s cur -s foo -s bar -s xyzzy
cur: 96894
foo: 
bar: 97036
xyzzy: 
$ 

Still showing empty sequences with the new intersection would again be
less surprising and simpler to explain.

$ mark -l -s cur -s foo -s bar -s xyzzy notcur
cur: 
foo: 
bar: 97036
xyzzy: 
$ 

BTW, ‘first’, etc., aren't sequences, as we know.

$ p -seq first 42
pick: sequence name is reserved: first

Yet,

$ mark -l -s first -s cur -s last -s foo -s bar -s xyzzy
first: 
cur: 96894
last: 
foo: 
bar: 97036
xyzzy: 
$ 

mark(1) doesn't complain and I'd expect it to as pick does.

How does this new functionality help your original need?  Were you
thinking ‘mark -l -s foo 42’ would either be silent or not depending if
42 were in foo?  If so, what parsing language were you cranking up to
check.  ;-)

Finally, when I've wanted this functionality in the past, I've wondered
if a new pick(1) test would be the way.  Perhaps ‘-msg’ to match
mh-format(5)'s ‘msg’ function.

pick -msg 42 foo

The exit status would be sufficient to tell if 42 was in sequence foo.

Or if I want to know if any of sequence foo are in bar, xyzzy, or the
last few messages then it would be nice if ‘-msg’s parameter could be
more than a single message number.

pick -msg foo bar xyzzy last:42

Really, all this brings us back to needing a nice set-based consistent
algebra which all commands take.  :-)  Completely made up, without much
consideration:

forw subject:nmh \( !address:paul / mime-type:image/jpeg \)

Mercurial, the CVS, Subversion, ... thing, has a couple of notations
which are interesting for identifying files and revisions.  The former
has predicate functions, and the latter has operators covering ancestry
because revisions form a tree, much like emails in a thread.

Specifying file sets  https://manned.org/hg.1#head14
Specifying revisions  https://manned.org/hg.1#head24

-- 
Cheers, Ralph.



Re: check if message is in a particular sequence?

2021-05-01 Thread Bob Carragher
Ah, thanks, Ralph!  So, if in one's use case one typically makes
use of the output of mark(1mh) immediately, then one is fine, as
it'll check for the message files' current statuses.  Or, at
least if one is really careful about it.

That is, until you're using Paul's enhancement to mark(1mh).  B-)

Bob

On Fri, 30 Apr 2021 22:23:11 +0100 Ralph Corderoy  sez:

> Hi Bob,
>
> > but would suffer compared with Ken's approach in that deleted messages
> > would still be reported
>
> I don't think they would; mark(1) doesn't output removed messages,
> whether by rmm(1) or through the filesystem.
>
> $ p -seq bob last:3
> 3 hits
> $ mark -list -s bob
> bob: 96916-96918
> $ rmm 96917
> $ mark -list -s bob
> bob: 96916 96918
> $ rm `mhpath 96918`
> $ mark -list -s bob
> bob: 96916
> $
>
> -- 
> Cheers, Ralph.



Re: check if message is in a particular sequence?

2021-05-01 Thread Paul Fox
ralph wrote:
 > ...a lot...

I knew I was being wise to wait until you chimed in.  It will take me
a day or two to get to this.

paul
=--
paul fox, p...@foxharp.boston.ma.us (arlington, ma, where it's 40.1 degrees)




Re: check if message is in a particular sequence?

2021-05-01 Thread Ralph Corderoy
Hi Bob,

> Ah, thanks, Ralph!

Thanks to you for reminding me of awk's RS which, coupled with knowing
the only words starting with a positive number are a message range,
shortens my earlier sed and awk to

$ mark -list -seq public -seq private -seq notexist
public: 1-3 42
private (private): 3141 97057-97059
notexist: 
$
$ mark -list -seq public -seq private -seq notexist |
> gawk -v RS=' |\n' -F - '
> $0+0 {u = NF==2 ? $2 : $1; for (n = $1; n <= u; n++) print n}
> '
1
2
3
42
3141
97057
97058
97059
$

-- 
Cheers, Ralph.



Re: check if message is in a particular sequence?

2021-05-01 Thread Ralph Corderoy
Hi,

I wrote:
> $ mark -list -seq public -seq private -seq notexist
> public: 1-3 42
> private (private): 3141 97057-97059
> notexist: 
> $
> $ mark -list -seq public -seq private -seq notexist |
> > gawk -v RS=' |\n' -F - '
> > $0+0 {u = NF==2 ? $2 : $1; for (n = $1; n <= u; n++) print n}
> > '
> 1
> 2
> 3
> 42
> 3141
> 97057
> 97058
> 97059
> $

Silly me.  No need for gawk's regexp RS as the ‘42\nprivate’ which
arrives with just the POSIX RS=' ' always has something to discard after
the linefeed so there's no need to split it off into its own record.
Thus, the above simplifies further, with the coercing of $1, into

mark -list -seq public -seq private -seq notexist |
awk -v RS=' ' -F - '
$0+0 {u = (NF==2 ? $2 : $1) + 0; for (n = $1+0; n <= u; n++) print n}
'

-- 
Cheers, Ralph.



Re: check if message is in a particular sequence?

2021-05-01 Thread Ralph Corderoy
I'll shut up soon.

> mark -list -seq public -seq private -seq notexist |
> awk -v RS=' ' -F - '
>   $0+0 {u = (NF==2 ? $2 : $1) + 0; for (n = $1+0; n <= u; n++) print n}
> '

I thought I may as well create this and see if it gets used.

$ cat ~/bin/mhinseq
#! /bin/sh

# Successfully exit only if the message is in the sequence.
# usage: mhinseq seq 42

mark -list -sequence "${1?}" |
awk -v RS=' ' -F - -v msg="${2?}" '
$0+0 &&
((NF == 1 && $1+0 == msg) ||
 (NF == 2 && $1+0 <= msg && msg <= $2+0)) {f=1; exit}
END {exit !f}
'
$ 

-- 
Cheers, Ralph.



Re: check if message is in a particular sequence?

2021-05-01 Thread Paul Fox
ralph wrote:
 > Hi Paul,
 > 
 > > $ mark -list
 > > cur: 1
 > > odd: 1 3 5 7 9
 > > even: 2 4 6 8 10
 > >
 > > $ mark -seq even -list
 > > even: 2 4 6 8 10
 > >
 > > $ mark -seq even -list 2-6
 > > even: 2 4 6 <-- previously output was "even: 2 4 6 8 10"
 > >
 > > $ mark -seq odd -list 2-6
 > > odd: 3 5   <-- previously output was "odd: 1 3 5 7 9"
 > 
 > Up to here seems fine, assuming ‘2-6’ can also be ‘3 5 2 4 6’ or
 > ‘3 5 even:3’.  IOW, all the things I could normally scan(1), etc.

Yes.

 > 
 > > $ mark -list 2-6 <-- previously behaved as "mark -list", above
 > > odd: 3 5
 > > even: 2 4 6
 > 
 > I would have expected an extra line,
 > 
 > $ mark -list 2-6
 >  +  cur: 
 > odd: 3 5
 > even: 2 4 6
 > 
 > because the messages given are being intersected with the normal
 > ‘mark -list’ output you showed at the start above.  IOW, if no messages

The current behavior matches my requirements, and the (new)
description in the man page describes it.  I wasn't thinking
of it as an intersection, but a membership listing:
   "If msgs are specified, then only the sequence memberships for
   the given messages are shown, either for all sequences, or
   just for those named by -sequence switches."

 > are given then the default is ‘all’.  This seems more orthogonal to me
 > and means a script can give multiple sequences and expect one line for
 > each in the output in the order the sequences were stated; there's no
 > need to parse the ‘foo:’ or ‘bar (private):’ to identify the sequence
 > involved.

I understand your need.  How about if adding "-zero" caused sequences
in which the named messages aren't members to be listed as well? 
I.e., "include sequences with 'zero' results".  The -zero switch is
already overused by delete (to mean, "invert"), so I don't think this
is too big a leap.  New (additional) man description:
   "Normally sequences in which none of the given msgs are members
   are suppressed in the output.  The -zero switch will cause all
   sequences mentioned on the command line to be listed,
   whether or not they include any of the specified messages."

 > An example not given here would be empty sequences, i.e. ones which
 > don't exist.  Currently:
 > 
 > $ mark -l -s cur -s foo -s bar -s xyzzy
 > cur: 96894
 > foo: 
 > bar: 97036
 > xyzzy: 

Is this actually the desired behavior?  Shouldn't mark instead complain
with "mark: no such sequence as xyzzy"?

I hadn't realized that mark was currently silent about this, and my
patch is not silent like that, when messages are provided:
$ mark -l -s odd -s xyzzy 
odd: 1 3 5 7 9
xyzzy: 
$ mark -l -s odd -s xyzzy 1
odd: 1
mark: no such sequence as xyzzy

The behaviors should clearly match.  I think I'd prefer the error, but
you can try to convince me..

 > ...
 > 
 > BTW, ‘first’, etc., aren't sequences, as we know.
 > 
 > $ p -seq first 42
 > pick: sequence name is reserved: first
 > 
 > Yet,
 > 
 > $ mark -l -s first -s cur -s last -s foo -s bar -s xyzzy
 > first: 
 > cur: 96894
 > last: 
 > foo: 
 > bar: 97036
 > xyzzy: 
 > $ 
 > 
 > mark(1) doesn't complain and I'd expect it to as pick does.

I agree that it should be an error.  And again, it seems that if
"first" is in error, then "xyzzy" should also be considered an error.

 > 
 > How does this new functionality help your original need?  Were you
 > thinking ‘mark -l -s foo 42’ would either be silent or not depending if
 > 42 were in foo?  If so, what parsing language were you cranking up to
 > check.  ;-)

Obviously, the same one I'm already running:
if [ "$(mark -list -sequence foo 42)" ]
then
...

 > 
 > Finally, when I've wanted this functionality in the past, I've wondered
 > if a new pick(1) test would be the way.  Perhaps ‘-msg’ to match
 > mh-format(5)'s ‘msg’ function.
 > 
 > pick -msg 42 foo
 > 

Perhaps I should have started there.  But the man page for mark came
so close to describing what I wanted that I actually tried it,
assuming/hoping it would already work.

paul

 > The exit status would be sufficient to tell if 42 was in sequence foo.
 > 
 > Or if I want to know if any of sequence foo are in bar, xyzzy, or the
 > last few messages then it would be nice if ‘-msg’s parameter could be
 > more than a single message number.
 > 
 > pick -msg foo bar xyzzy last:42
 > 
 > Really, all this brings us back to needing a nice set-based consistent
 > algebra which all commands take.  :-)  Completely made up, without much
 > consideration:
 > 
 > forw subject:nmh \( !address:paul / mime-type:image/jpeg \)
 > 
 > Mercurial, the CVS, Subversion, ... thing, has a couple of notations
 > which are interesting for identifying files and revisions.  The former
 > has predicate functions, and the latter has operators covering ancestry
 > because revisions form a tree, much like emails in a thread.

Re: check if message is in a particular sequence?

2021-05-01 Thread Ralph Corderoy
Hi Paul,

> > > $ mark -list 2-6 <-- previously behaved as "mark -list", above
> > > odd: 3 5
> > > even: 2 4 6
> > 
> > I would have expected an extra line,
> > 
> > $ mark -list 2-6
> >  +  cur: 
> > odd: 3 5
> > even: 2 4 6
> > 
> > because the messages given are being intersected with the normal
> > ‘mark -list’ output you showed at the start above.
>
> The current behavior matches my requirements, and the (new)
> description in the man page describes it.

Naturally.  :-)

> I wasn't thinking of it as an intersection, but a membership listing:
>
>"If msgs are specified, then only the sequence memberships for the
> given messages are shown, either for all sequences, or just for
> those named by -sequence switches."
>
> > IOW, if no messages are given then the default is ‘all’.  This seems
> > more orthogonal to me and means a script can give multiple sequences
> > and expect one line for each in the output in the order the
> > sequences were stated; there's no need to parse the ‘foo:’ or ‘bar
> > (private):’ to identify the sequence involved.
>
> I understand your need.  How about if adding "-zero" caused sequences
> in which the named messages aren't members to be listed as well?
> I.e., "include sequences with 'zero' results".  The -zero switch is
> already overused by delete (to mean, "invert"), so I don't think this
> is too big a leap.  New (additional) man description:
>
>"Normally sequences in which none of the given msgs are members are
> suppressed in the output.  The -zero switch will cause all
> sequences mentioned on the command line to be listed, whether or
> not they include any of the specified messages."

My suggested behaviour:

- Using -list shows sequences and the messages in them.
- If any -sequences are given then only those sequences are listed
  rather than all sequences.
- If any messages are given then only those messages will be shown
  rather than all messages.

I don't need to describe the interaction of those two cases because they
combine in the natural manner; they're orthogonal to each other.
(The code may well echo that lack of intertwining.)
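
The message filter in the third point can be mimicked on plain text,
which may make the intended output easier to picture.  What follows is
a sketch under assumptions, not how mark(1) is implemented:
filter_listing is a made-up name, it reads mark -list style lines on
stdin, and unlike mark it prints members one by one instead of
compressing them into ranges.

```shell
# filter_listing [MSG...]: read mark -list style lines on stdin and, for
# each sequence, print only the members that are among the MSG arguments.
# With no arguments every member is kept (the default is "all").
# Sequences with no matching members are still printed, with nothing
# after the colon.  Hypothetical helper, not part of nmh.
filter_listing() {
    awk -v wanted="$*" '
        BEGIN {
            n = split(wanted, w, " ")
            for (i = 1; i <= n; i++) want[w[i] + 0] = 1
        }
        {
            colon = index($0, ":")
            line = substr($0, 1, colon)           # "name:" or "name (private):"
            m = split(substr($0, colon + 1), item, " ")
            for (i = 1; i <= m; i++) {
                k = split(item[i], r, "-")        # "5-8" -> 5 and 8
                lo = r[1] + 0
                hi = (k == 2 ? r[2] : r[1]) + 0
                for (j = lo; j <= hi; j++)
                    if (n == 0 || (j in want)) line = line " " j
            }
            print line
        }
    '
}
```

Because every input line produces exactly one output line, a script
gets the sequences back in the order it asked for them and does not
have to parse the names.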

You're suggesting not listing sequences with zero messages.  The
existing -zero/-nozero option's default is -nozero but mark's existing
behaviour is to list sequences with zero messages so the meaning doesn't
match.  (And the overloading of -zero for -delete's use has always
seemed confusing and non-obvious to me.  :-)

A new -empty/-noempty could state whether to list empty sequences.
Default -empty to keep the existing behaviour.

> > An example not given here would be empty sequences, i.e. ones which
> > don't exist.  Currently:
> > 
> > $ mark -l -s cur -s foo -s bar -s xyzzy
> > cur: 96894
> > foo: 
> > bar: 97036
> > xyzzy: 
>
> Is this actually the desired behavior?  Shouldn't mark instead complain
> with "mark: no such sequence as xyzzy"?

That would be annoying if I want to know which sequences are empty.
Would mark(1) state that on stderr and stop or keep going on stdout for
non-empty sequences?  Does stderr get just the first one which doesn't
exist, or all of them?  Just the first means a script would have to
adjust and re-run until stderr is empty as well as capturing both stdout
and stderr.

(It's an annoying artefact of MH's implementation that an empty sequence
is deleted and thus indistinguishable from a typo'd sequence.  It's like
not having the number zero.)

> I hadn't realized that mark was currently silent about this, and my
> patch is not silent like that, when messages are provided:
>
> $ mark -l -s odd -s xyzzy 
> odd: 1 3 5 7 9
> xyzzy: 
> $ mark -l -s odd -s xyzzy 1
> odd: 1
> mark: no such sequence as xyzzy
>
> The behaviors should clearly match.

Yes.

> I think I'd prefer the error, but you can try to convince me..

You'd be changing existing behaviour.

>  > BTW, ‘first’, etc., aren't sequences, as we know.
>  > 
>  > $ p -seq first 42
>  > pick: sequence name is reserved: first
>  > 
>  > Yet,
>  > 
>  > $ mark -l -s first -s cur -s last -s foo -s bar -s xyzzy
>  > first: 
>  > cur: 96894
>  > last: 
>  > foo: 
>  > bar: 97036
>  > xyzzy: 
>  > $ 
>  > 
>  > mark(1) doesn't complain and I'd expect it to as pick does.
>
> I agree that it should be an error.  And again, it seems that if
> "first" is in error, then "xyzzy" should also be considered an error.

I don't think so because ‘first’ is a reserved word, as the error says,
whereas ‘xyzzy’ isn't.  (Though surely we can think of some meaning and
make it so.  :-)

-- 
Cheers, Ralph.



Re: check if message is in a particular sequence?

2021-05-01 Thread Paul Fox
Okay -- I'm on board.

How's this?  The commands are:
uip/mark -l 
uip/mark -l -noempty
uip/mark -l -s big -s cur
uip/mark -l -s big -s cur -noempty
uip/mark -l -s big -s cur -s xyzzy
uip/mark -l -s big -s cur -s xyzzy -noempty
uip/mark -l 9
uip/mark -l 9 -noempty
uip/mark -l -s big -s cur 9
uip/mark -l -s big -s cur 9 -noempty
uip/mark -l -s big -s cur -s xyzzy 9
uip/mark -l -s big -s cur -s xyzzy 9 -noempty

$ uip/mark -l
cur: 
odd: 1 3 5 7 9
even: 2 4 6 8 10
big: 8-10

$ uip/mark -l -noempty
odd: 1 3 5 7 9
even: 2 4 6 8 10
big: 8-10

$ uip/mark -l -s big -s cur
big: 8-10
cur: 

$ uip/mark -l -s big -s cur -noempty
big: 8-10

$ uip/mark -l -s big -s cur -s xyzzy
big: 8-10
cur: 
xyzzy: 

$ uip/mark -l -s big -s cur -s xyzzy -noempty
big: 8-10

$ uip/mark -l 9
cur: 
odd: 9
even: 
big: 9

$ uip/mark -l 9 -noempty
odd: 9
big: 9

$ uip/mark -l -s big -s cur 9
big: 9
cur: 

$ uip/mark -l -s big -s cur 9 -noempty
big: 9

$ uip/mark -l -s big -s cur -s xyzzy 9
big: 9
cur: 
xyzzy: 

$ uip/mark -l -s big -s cur -s xyzzy 9 -noempty
big: 9



=--
paul fox, p...@foxharp.boston.ma.us (arlington, ma, where it's 55.0 degrees)




Re: check if message is in a particular sequence?

2021-05-01 Thread Ralph Corderoy
Hi Paul,

> How's this?  The commands are:
>   uip/mark -l 
>   uip/mark -l -noempty
>   uip/mark -l -s big -s cur
>   uip/mark -l -s big -s cur -noempty
>   uip/mark -l -s big -s cur -s xyzzy
>   uip/mark -l -s big -s cur -s xyzzy -noempty
>   uip/mark -l 9
>   uip/mark -l 9 -noempty
>   uip/mark -l -s big -s cur 9
>   uip/mark -l -s big -s cur 9 -noempty
>   uip/mark -l -s big -s cur -s xyzzy 9
>   uip/mark -l -s big -s cur -s xyzzy 9 -noempty

I think the output looks spot on and the tests cover the twelve ways of
combining 3 × 2 × 2 settings.

Sequence: all existing existing-and-non-existing
Message: all stated
Empty: empty noempty

What if ‘uip/mark -l 42’ is done where 42 doesn't exist?  I think that
should complain 42 doesn't exist to be consistent with a mark to add to
a sequence, or a scan of it, and you probably get that behaviour ‘for
free’, but I'm just checking.

Sorry for the extra labour, I realise it's easier to just heckle and lob
empty beer bottles.  :-)

-- 
Cheers, Ralph.



Re: check if message is in a particular sequence?

2021-05-01 Thread Ken Hornstein
>> but would suffer compared with Ken's approach in that deleted messages
>> would still be reported
>
>I don't think they would; mark(1) doesn't output removed messages,
>whether by rmm(1) or through the filesystem.
>[...]

Right, any nmh program which calls folder_read() (which is basically
almost all of them) performs a readdir() on the whole folder and reads in
the sequences; as part of that it cleans up any sequence entries that
refer to missing files.

Well, let me amend that statement.  _Internally_ the sequence list is
cleaned up.  But the cleaned-up sequence doesn't get written unless you
use a program that modifies any sequence (specifically, it has to set
the SEQMOD bit on the folder).  So if you are directly reading the
sequence file it is possible that it will refer to a message that
does not exist _IF_ you delete messages from a folder without going
through nmh.  But no nmh utility should say that a nonexistent message
is a member of a sequence (with the caveat that it COULD happen if you
delete a message after a nmh program calls folder_read()).
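
For anyone who does read the sequence file directly, the stale entries
can be spotted by checking for the message files themselves, since an
MH folder keeps each message in a file named by its number.  The helper
below is a sketch with a made-up name, missing_messages; it takes the
folder directory as an argument and already expanded message numbers on
stdin.

```shell
# missing_messages DIR: read message numbers, one per line, on stdin and
# print those that have no corresponding file in the folder directory DIR.
# Hypothetical helper, not part of nmh.
missing_messages() {
    dir=$1
    while IFS= read -r n; do
        [ -e "$dir/$n" ] || printf '%s\n' "$n"
    done
}

# Intended use (needs nmh for mhpath; numbers.txt is a placeholder for a
# file of expanded message numbers):
#   missing_messages "$(mhpath +inbox)" < numbers.txt
```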

--Ken



Re: check if message is in a particular sequence?

2021-05-01 Thread Paul Fox
ralph wrote:
 > I think the output looks spot on and the tests cover the twelve ways of
 > combining 3 × 2 × 2 settings.

:-)   That's why I wrote the commands first, and automated running
them.  I kept forgetting to try some combination or other.
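
Generating the twelve command lines mechanically is one way to avoid
missing a combination.  The loop below is a sketch, not the script
actually used for the patch; list_cases is a made-up name, and the
sequence names and message number are simply the ones from the examples
in this thread.

```shell
# list_cases: print one mark command line per combination of message
# selection, sequence selection and -noempty (2 x 3 x 2 = 12 lines).
list_cases() {
    for msgs in "" "9"; do
        for seqs in "" "-s big -s cur" "-s big -s cur -s xyzzy"; do
            for empty in "" "-noempty"; do
                # Unquoted on purpose: empty parts vanish, words are re-split.
                echo uip/mark -l $seqs $msgs $empty
            done
        done
    done
}
```

Replacing echo with a real invocation would run each case in turn
instead of printing it.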

 > What if ‘uip/mark -l 42’ is done where 42 doesn't exist?  I think that

$ mark -l 9 42
mark: message 42 doesn't exist

No output except the error message.

 > 
 > Sorry for the extra labour, I realise it's easier to just heckle and lob
 > empty beer bottles.  :-)

No worries!  It's important to get it right.  As a bonus, somehow the
man page paragraph for -list has shrunk a bit, which I think means the
options are cleaner, or I'm thinking about them more cleanly.

   The -list switch tells mark to list all sequences and the
   messages associated with them.  The output can be limited to
   just certain sequences (with -sequence switches) and/or
   messages (with msgs arguments).  The -noempty switch will
   suppress sequences which would be listed as empty, either
   because they actually are empty, or because they don't include
   any messages specified by msgs.  The -zero switch does not
   affect the operation of -list.
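
With the switches as described here, the original "is this message in
that sequence" question might come down to checking whether any listed
sequence still has members.  The helper below is only a sketch:
has_members is a made-up name, it works on text from stdin so it can be
tried without the patched mark, and whether a given nmh release has
these switches should be checked against its mark(1).

```shell
# has_members: succeed if any "name: ..." line on stdin lists at least one
# message after the colon.  Hypothetical helper, not part of nmh.
has_members() {
    awk '
        {
            # Look only at what follows the first colon, so digits in a
            # sequence name do not count.
            rest = substr($0, index($0, ":") + 1)
            if (rest ~ /[0-9]/) found = 1
        }
        END { exit !found }
    '
}

# Intended use, assuming the patched mark from this thread is installed:
#   if mark -list -sequence foo 42 | has_members; then ...; fi
```

Since it inspects what follows the colon, the check gives the same
answer whether or not -noempty is used.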

paul
=--
paul fox, p...@foxharp.boston.ma.us (arlington, ma, where it's 58.6 degrees)




Re: check if message is in a particular sequence?

2021-05-02 Thread Paul Fox
One more thing.

mark has always compressed its output:  "3 4 5 6" becomes "3-6".

It still does so after my changes.  While we were talking about
using existing commands to solve my "check if a message is in
a sequence" problem, it became clear that the range compression
was an impediment.  Now that I've added the ability to explicitly
ask "is message 4 part of sequence foo", range compression isn't
really an issue, but it feels like we should be able to control it.
I'd kind of like to add -terse/-noterse:

$ uip/mark -l
big: 9-10
lots: 7-12

$ uip/mark -l -noterse
big: 9 10
lots: 7 8 9 10 11 12

Thoughts?
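
For reference, the range compression under discussion can be sketched
outside nmh.  This is only an illustration, not mark's actual code:
compress_ranges is a hypothetical helper that reads ascending message
numbers on stdin and collapses each consecutive run to lo-hi.  Like
mark, it writes an adjacent pair such as 9 10 as 9-10.

```shell
# compress_ranges: hypothetical helper, not part of nmh.
# Reads whitespace-separated, ascending message numbers on stdin and
# prints them on one line with consecutive runs collapsed to "lo-hi".
compress_ranges() {
    awk '
        # Append the pending run [lo, hi] to the output line.
        function flush() {
            if (lo == "") return
            sep = (out == "") ? "" : " "
            item = (lo == hi) ? lo : lo "-" hi
            out = out sep item
        }
        {
            for (i = 1; i <= NF; i++) {
                n = $i + 0
                # Extend the current run if n follows on directly.
                if (lo != "" && n == hi + 1) { hi = n; continue }
                flush(); lo = hi = n
            }
        }
        END { flush(); print out }
    '
}
```

With this sketch, "echo 7 8 9 10 11 12 | compress_ranges" prints 7-12,
and leaving the numbers untouched corresponds to the proposed
uncompressed output.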

(I'd also love to fix the old bug that causes "9 10" to be displayed as
"9-10", but I probably shouldn't.  Someone probably relies on it.)

paul
=--
paul fox, p...@foxharp.boston.ma.us (arlington, ma, where it's 64.6 degrees)




Re: check if message is in a particular sequence?

2021-05-02 Thread David Levine
Paul wrote:

> I'd kind of like to add -terse/-noterse:

> Thoughts?

I like it.

> (I'd also love to fix the old bug that causes "9 10" to be displayed as
> "9-10", but I probably shouldn't.  Someone probably relies on it.)

Agreed.

David



Re: check if message is in a particular sequence?

2021-05-02 Thread Ken Hornstein
>> I'd kind of like to add -terse/-noterse:
>
>> Thoughts?
>
>I like it.

I confess I don't LOVE -terse/-noterse, but I cannot think of anything
better right now; I say go for it.

--Ken



Re: check if message is in a particular sequence?

2021-05-02 Thread Paul Fox
ken wrote:
 > >> I'd kind of like to add -terse/-noterse:
 > >
 > >> Thoughts?
 > >
 > >I like it.
 > 
 > I confess I don't LOVE -terse/-noterse, but I cannot think of anything
 > better right now; I say go for it.

I thought of using "curt", and "nocurt", but that didn't seem right
either.  ;-)

But maybe you're not fond of the concept, rather than just the name?

paul
=--
paul fox, p...@foxharp.boston.ma.us (arlington, ma, where it's 67.5 degrees)




Re: check if message is in a particular sequence?

2021-05-02 Thread Ralph Corderoy
Hi Paul,

> it became clear that the range compression was an impediment.  Now
> that I've added the ability to explicitly ask "is message 4 part of
> sequence foo", range compression isn't really an issue, but it feels
> like we should be able to control it.  I'd kind of like to add
> -terse/-noterse:

-range/-norange.  :-)

> (I'd also love to fix the old bug that causes "9 10" to be displayed as
> "9-10", but I probably shouldn't.  Someone probably relies on it.)

I wasn't aware it was a bug, more an easy way to spot singletons;
messages without an adjacent neighbour in the sequence.
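
If the hyphen is read as "has a neighbour", the singletons can be
picked out of mark's output mechanically.  A rough sketch, assuming the
"name: 1 3 5-8" output format shown earlier in the thread; singletons
is a hypothetical helper that reads that text on stdin:

```shell
# singletons: hypothetical helper, not part of nmh.
# Reads "mark -list" style text on stdin and prints, one per line, the
# message numbers that are not part of any "lo-hi" range.
singletons() {
    # One word per line, then keep only the words that are pure digits;
    # this drops the "name:" label and every "lo-hi" range.
    tr ' ' '\n' | grep -E '^[0-9]+$'
}
```

This only works because adjacent pairs are written as ranges; with the
pair printed as "9 10" the two messages would be indistinguishable
from singletons.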

-- 
Cheers, Ralph.



Re: check if message is in a particular sequence?

2021-05-02 Thread Paul Fox
ralph wrote:
 > Hi Paul,
 > 
 > > it became clear that the range compression was an impediment.  Now
 > > that I've added the ability to explicitly ask "is message 4 part of
 > > sequence foo", range compression isn't really an issue, but it feels
 > > like we should be able to control it.  I'd kind of like to add
 > > -terse/-noterse:
 > 
 > -range/-norange.  :-)

I thought of that too, as a close second to -terse/-noterse.  And it's
more specific.  I'd be happy with that, if others prefer it.

 > 
 > > (I'd also love to fix the old bug that causes "9 10" to be displayed as
 > > "9-10", but I probably shouldn't.  Someone probably relies on it.)
 > 
 > I wasn't aware it was a bug, more an easy way to spot singletons;
 > messages without an adjacent neighbour in the sequence.

Thanks.  That might make me feel better about it.  I'd never write an
adjacent pair as a range myself, so I find it jarring, and it makes it
look like there's something in between.  It's hard to believe it wasn't
intentional, though, given that it's a two line fix.

paul
=--
paul fox, p...@foxharp.boston.ma.us (arlington, ma, where it's 67.5 degrees)




Re: check if message is in a particular sequence?

2021-05-02 Thread Ken Hornstein
> > I confess I don't LOVE -terse/-noterse, but I cannot think of anything
> > better right now; I say go for it.
>
>I thought of using "curt", and "nocurt", but that didn't seem right
>either.  ;-)
>
>But maybe you're not fond of the concept, rather than just the name?

Oh, no, the CONCEPT is fine, I just didn't love the name.  I like
-range/-norange slightly better, but I still don't find it great.
Something like -compress/-nocompress MIGHT be better, but again, I'm
not loving it.

I am only offering feedback on this because you asked for it.  Really,
I think the important thing is to get the feature in there.  Any of the
suggested names I could live with; just pick whatever you think is best.

--Ken



Re: check if message is in a particular sequence?

2021-05-02 Thread Ralph Corderoy
Hi Ken,

> Oh, no, the CONCEPT is fine, I just didn't love the name.  I like
> -range/-norange slightly better, but I still don't find it great.
> Something like -compress/-nocompress MIGHT be better, but again, I'm
> not loving it.
>
> I am only offering feedback on this because you asked for it.  Really,
> I think the important thing is to get the feature in there.  Any of
> the suggested names I could live with; just pick whatever you think is
> best.

Once it escapes into a release, it's hard to cull or replace.  :-)

I went for -range because it's controlling the printing as ‘lo-hi’ and
mh-sequence(5) already says things like

A message range is specified as “name1-name2” or “name:n”, where
`name', `name1' and `name2' are message names, and `n' is an
integer.

The “reserved” message name “all” is a shorthand for the message
range “first-last”.

Sequence File Format
A contiguous range of messages can be represented as
“lownum-highnum”.

It's nice to keep to as small a range of specialist vocabulary as
possible.  One of the more confusing aspects of understanding something
new is when multiple terms refer to the same or overlapping thing.

-- 
Cheers, Ralph.



Re: check if message is in a particular sequence?

2021-05-02 Thread Paul Fox
ken wrote:
 > >
 > >But maybe you're not fond of the concept, rather than just the name?
 > 
 > Oh, no, the CONCEPT is fine, I just didn't love the name.  I like
 > -range/-norange slightly better, but I still don't find it great.
 > Something like -compress/-nocompress MIGHT be better, but again, I'm
 > not loving it.

Maybe shrink/noshrink?

I'll let all these sit for a day, then commit something.

paul
=--
paul fox, p...@foxharp.boston.ma.us (arlington, ma, where it's 72.9 degrees)




Re: check if message is in a particular sequence?

2021-05-02 Thread Conrad Hughes
> -terse/-noterse
> -curt/-nocurt
> -range/-norange
> -compress/-nocompress

Probably not helping, but I see the word "compact" applied to number
ranges like this in a few places on the net.

Conrad



Re: check if message is in a particular sequence?

2021-05-02 Thread David Levine
Paul wrote:

> Maybe shrink/noshrink?

I prefer [no]range.  It seems to me to be the most relevant and specific.

David



Re: check if message is in a particular sequence?

2021-05-02 Thread Ken Hornstein
>> Maybe shrink/noshrink?
>
>I prefer [no]range.  It seems to me to be the most relevant and specific.

I find myself ultimately persuaded by Ralph's argument, so also put me
down for [no]range.

--Ken



Re: check if message is in a particular sequence?

2021-05-02 Thread Laura Creighton
In a message of Sun, 02 May 2021 14:11:44 -0400, Ken Hornstein writes:
>> > I confess I don't LOVE -terse/-noterse, but I cannot think of anything
>> > better right now; I say go for it.
>>
>>I thought of using "curt", and "nocurt", but that didn't seem right
>>either.  ;-)
>>
>>But maybe you're not fond of the concept, rather than just the name?
>
>Oh, no, the CONCEPT is fine, I just didn't love the name.  I like
>-range/-norange slightly better, but I still don't find it great.
>Something like -compress/-nocompress MIGHT be better, but again, I'm
>not loving it.

-noexpand/-expand ??

Laura




Re: check if message is in a particular sequence?

2021-05-02 Thread Bob Carragher
I don't know about others, Ralph, but I would prefer you don't
shut up soon.  B-)  This is how I learn (or am reminded) about
all kinds of features of awk/sed/bash/etc. -- in addition to NMH
stuff I never knew since I'm such a "basic" user -- and I really
appreciate the (perhaps unintended) lessons!  B-)

Bob

> From: Ralph Corderoy 
> To:   nmh-workers@nongnu.org
> Date: Sat, 01 May 2021 12:26:06 +0100
> Subject:  Re: check if message is in a particular sequence?
> 
> Hi Bob,
> 
> > Ah, thanks, Ralph!
> 
> Thanks to you for reminding me of awk's RS which, coupled with knowing
> the only words starting with a positive number are a message range,
> shortens my earlier sed and awk to
> 
> $ mark -list -seq public -seq private -seq notexist
> public: 1-3 42
> private (private): 3141 97057-97059
> notexist: 
> $
> $ mark -list -seq public -seq private -seq notexist |
> > gawk -v RS=' |\n' -F - '
> > $0+0 {u = NF==2 ? $2 : $1; for (n = $1; n <= u; n++) print n}
> > '
> 1
> 2
> 3
> 42
> 3141
> 97057
> 97058
> 97059
> $
> 
> -- 
> Cheers, Ralph.



> From: Ralph Corderoy 
> To:   nmh-workers@nongnu.org
> Date: Sat, 01 May 2021 12:43:27 +0100
> Subject:  Re: check if message is in a particular sequence?
> 
> Hi,
> 
> I wrote:
> > $ mark -list -seq public -seq private -seq notexist
> > public: 1-3 42
> > private (private): 3141 97057-97059
> > notexist: 
> > $
> > $ mark -list -seq public -seq private -seq notexist |
> > > gawk -v RS=' |\n' -F - '
> > > $0+0 {u = NF==2 ? $2 : $1; for (n = $1; n <= u; n++) print n}
> > > '
> > 1
> > 2
> > 3
> > 42
> > 3141
> > 97057
> > 97058
> > 97059
> > $
> 
> Silly me.  No need for gawk's regexp RS as the ‘42\nprivate’ which
> arrives with just the POSIX RS=' ' always has something to discard after
> the linefeed so there's no need to split it off into its own record.
> Thus, the above simplifies further, with the coercing of $1, into
> 
> mark -list -seq public -seq private -seq notexist |
> awk -v RS=' ' -F - '
>   $0+0 {u = NF==2 ? $2 : $1; for (n = $1+0; n <= u; n++) print n}
> '
> 
> -- 
> Cheers, Ralph.



> From: Ralph Corderoy 
> To:   nmh-workers@nongnu.org
> Date: Sat, 01 May 2021 12:59:16 +0100
> Subject:  Re: check if message is in a particular sequence?
> 
> I'll shut up soon.
> 
> > mark -list -seq public -seq private -seq notexist |
> > awk -v RS=' ' -F - '
> > $0+0 {u = NF==2 ? $2 : $1; for (n = $1+0; n <= u; n++) print n}
> > '
> 
> I thought I may as well create this and see if it gets used.
> 
> $ cat ~/bin/mhinseq
> #! /bin/sh
> 
> # Successfully exit only if the message is in the sequence.
> # usage: mhinseq seq 42
> 
> mark -list -sequence "${1?}" |
> awk -v RS=' ' -F - -v msg="${2?}" '
>   $0+0 &&
>   ((NF == 1 && $1+0 == msg) ||
>    (NF == 2 && $1 <= msg && msg <= $2)) {f=1; exit}
>   END {exit !f}
>   '
> $ 
> 
> -- 
> Cheers, Ralph.
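
The membership test in mhinseq can be tried without an nmh folder by
feeding it canned "mark -list" output.  The following is a sketch along
the same lines, not Ralph's script itself: inseq is a hypothetical
function name, it reads the mark output on stdin rather than running
mark, and it coerces every operand with +0 so the comparisons are
numeric even when the last word carries the trailing newline.

```shell
# inseq MSG: hypothetical helper, not part of nmh.
# Exit 0 only if message number MSG appears in the "mark -list" style
# text on stdin, either as a single number or inside a "lo-hi" range.
inseq() {
    awk -v RS=' ' -F - -v msg="${1?}" '
        # $0+0 is non-zero only for words that start with a number,
        # which skips the "name:" label.
        $0+0 && ((NF == 1 && $1+0 == msg+0) ||
                 (NF == 2 && $1+0 <= msg+0 && msg+0 <= $2+0)) { f = 1; exit }
        END { exit !f }
    '
}

# Message 6 from the original question is inside the range 5-8.
printf 'todo: 1 3 5-8 13 16 46 49 52-53\n' | inseq 6 && echo "6 is in todo"
```

The exit in the main rule still runs the END rule, so the exit status
is decided in one place.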



Re: check if message is in a particular sequence?

2021-05-03 Thread Paul Fox
laura wrote:
 > 
 > -noexpand/-expand ??
 > 

You almost had me with that one, and it pointed out that I'd only been
looking at synonyms for compress, or terse, forgetting completely that
an antonym could work just as well.

In the end, though, the specificity of -range/-norange won out.

I've pushed all of the mark(1) changes I've been working on.

paul
=--
paul fox, p...@foxharp.boston.ma.us (arlington, ma, where it's 55.6 degrees)




Re: check if message is in a particular sequence?

2021-05-03 Thread Valdis Klētnieks
On Sun, 02 May 2021 19:23:06 +0100, Ralph Corderoy said:

> I went for -range because it's controlling the printing as ‘lo-hi’ and
> mh-sequence(5) already says things like
>
> A message range is specified as “name1-name2” or “name:n”, where
> `name', `name1' and `name2' are message names, and `n' is an
> integer.

So -range means "make it into a range if possible" and -norange doesn't.
I'm sold. :)

And yes, being able to say -norange is handy if you're doing something where
you're grabbing a range and then feeding it to a shell's "for i in $foo".  (Over
the decades that's annoyed me - a big reason I don't use mark more is because I'm
usually using that to feed something else that doesn't grok "14-19".)
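
For versions of mark without -norange, the expansion needed for such a
loop can be done in plain sh.  A minimal sketch, assuming the words are
either single numbers or lo-hi ranges; expand_ranges is a hypothetical
helper, not an nmh command:

```shell
# expand_ranges WORD...: hypothetical helper, not part of nmh.
# Prints one message number per line, expanding each "lo-hi" word.
expand_ranges() {
    for word in "$@"; do
        case $word in
            *-*)
                lo=${word%-*}    # text before the hyphen
                hi=${word#*-}    # text after the hyphen
                while [ "$lo" -le "$hi" ]; do
                    printf '%s\n' "$lo"
                    lo=$((lo + 1))
                done
                ;;
            *)
                printf '%s\n' "$word"
                ;;
        esac
    done
}

# The unquoted substitution is deliberate: one loop pass per number.
for i in $(expand_ranges 14-19); do
    printf 'message %s\n' "$i"
done
```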




Re: check if message is in a particular sequence?

2021-05-04 Thread Ralph Corderoy
Hi Valdis,

> So -range means "make it into a range if possible" and -norange doesn't.

Yes.  -showrange would be more explicit with the new -empty being
-showempty to match but then the common -show prefix hampers nmh's
shortest-unique-prefix option-parsing.  Given we're MH users, I thought
we could handle thinking ‘I want empty’ ⇒ -empty.

-- 
Cheers, Ralph.