Re: [hackers] [PATCH] Update macros in arg.h to not cause hidden side effects to argv or rely on previously defined variables.

2024-08-28 Thread Mattias Andrée
ARGBEGIN...ARGEND is used to parse AND consume the options (and it also
consumes argv[0]), leaving the operands in argv for the code below ARGEND.
That's why it mutates argv and argc. With the patch, code below ARGEND has no idea
which elements in argv are options and which elements are operands.
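
To make the "consume" part concrete, here is a minimal sketch of the same
idea outside of arg.h. It is not the macro itself: `skip_options` is a name
made up for this example, and it only skips options without interpreting
them. The point is that the caller's argc and argv end up describing the
operands only.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, not part of arg.h: skip argv[0] and every
 * leading option argument, the way ARGBEGIN...ARGEND does, and
 * return how many option arguments were consumed.
 */
static int
skip_options(int *argc, char ***argv)
{
	int n = 0;

	/* consume argv[0] (the program name) */
	(*argv)++, (*argc)--;

	/* an option argument starts with '-' and is not just "-" */
	while ((*argv)[0] && (*argv)[0][0] == '-' && (*argv)[0][1]) {
		int end = !strcmp((*argv)[0], "--");

		(*argv)++, (*argc)--;
		if (end)
			break; /* "--" ends the options and is consumed too */
		n++;
	}

	return n;
}
```

After the call, argv[0] is the first operand and argc counts operands only,
which is what the code below ARGEND relies on.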


On Wed, 28 Aug 2024 18:44:53 -0300
sebastien peterson boudreau  wrote:

>  I mainly wanted to gauge interest in this change by sending it. I was
>  looking at the macros in arg.h and comparing to the POSIX `getopt`, but
>  what I disliked is that `ARGBEGIN` mutates `argv` and relies on the
>  variable `argv0` being previously defined. Of course, this is also an
>  issue with `getopt` using `optarg` IMO.
> 
>  This patch changes the macro to define a new variable, `_argv`,
>  although the name is _certainly_ subject to change (especially
>  considering we already have `argv_`. I just wanted to make the change
>  quick and I figured others could bike-shed about the name). This allows
>  the original `argv` argument to `main` to go unchanged. The only other
>  change this requires is that `usage` take a string argument to print,
>  which can be easily passed when called from `main`.
> 
>  I also went ahead and formatted the macro to be a bit easier to
>  read just to make making the changes easier on myself. It looks like
>  the original author used tabs to vertically align, rather than spaces (as
>  specified in the coding style), so the formatting breaks if you have a
>  different tabwidth than them. I believe the formatting I used should
>  look equally as correct, regardless of tabwidth.
> 
> ---
>  arg.h | 65 +--
>  x.c   | 25 +++
>  2 files changed, 44 insertions(+), 46 deletions(-)
> 
> diff --git a/arg.h b/arg.h
> index a22e019..0726d0a 100644
> --- a/arg.h
> +++ b/arg.h
> @@ -6,45 +6,44 @@
>  #ifndef ARG_H__
>  #define ARG_H__
>  
> -extern char *argv0;
> -
>  /* use main(int argc, char *argv[]) */
> -#define ARGBEGIN for (argv0 = *argv, argv++, argc--;\
> - argv[0] && argv[0][0] == '-'\
> - && argv[0][1];\
> - argc--, argv++) {\
> - char argc_;\
> - char **argv_;\
> - int brk_;\
> - if (argv[0][1] == '-' && argv[0][2] == '\0') {\
> - argv++;\
> - argc--;\
> - break;\
> - }\
> - int i_;\
> - for (i_ = 1, brk_ = 0, argv_ = argv;\
> - argv[0][i_] && !brk_;\
> - i_++) {\
> - if (argv_ != argv)\
> - break;\
> - argc_ = argv[0][i_];\
> - switch (argc_)
> -
> -#define ARGEND   }\
> - }
> +#define ARGBEGIN \
> + char **_argv;\
> + argc--;\
> + for (_argv = argv+1;\
> + _argv[0] && _argv[0][0] == '-' && _argv[0][1];\
> + argc--, _argv++) {\
> + char argc_;\
> + char **argv_;\
> + int brk_;\
> + if (_argv[0][1] == '-' && _argv[0][2] == '\0') {\
> + _argv++;\
> + argc--;\
> + break;\
> + }\
> + int i_;\
> + for (i_ = 1, brk_ = 0, argv_ = _argv; _argv[0][i_] && !brk_; i_++) {\
> + if (argv_ != _argv)\
> + break;\
> + argc_ = _argv[0][i_];\
> + switch (argc_)
> +
> +#define ARGEND \
> + }\
> + }
>  
>  #define ARGC()   argc_
>  
> -#define EARGF(x) ((argv[0][i_+1] == '\0' && argv[1] == NULL)?\
> +#define EARGF(x) ((_argv[0][i_+1] == '\0' && _argv[1] == NULL)?\
>   ((x), abort(), (char *)0) :\
> - (brk_ = 1, (argv[0][i_+1] != '\0')?\
> - (&argv[0][i_+1]) :\
> - (argc--, argv++, argv[0])))
> + (brk_ = 1, (_argv[0][i_+1] != '\0')?\
> + (&_argv[0][i_+1]) :\
> + (argc--, _argv++, _argv[0])))
>  
> -#define ARGF()   ((argv[0][i_+1] == '\0' && argv[1] == NULL)?\
> +#define ARGF()   ((_argv[0][i_+1] == '\0' && _argv[1] == NULL)?\
>   (char *)0 :\
> - (brk_ = 1, (argv[0][i_+1] != '\0')?\
> - (&argv[0][i_+1]) :\
> - (

Re: [hackers] [libgrapheme] Use (size_t)(-1) instead of SIZE_MAX and fix style || Laslo Hunhold

2022-07-31 Thread Mattias Andrée
Why wouldn't SIZE_MAX be the maximum of size_t?

On Sun, 31 Jul 2022 11:47:40 +0200
 wrote:

> commit 25d89e6e460e68329e7a3f388fe3e150a8f5474a
> Author: Laslo Hunhold 
> AuthorDate: Sun Jul 31 11:46:48 2022 +0200
> Commit: Laslo Hunhold 
> CommitDate: Sun Jul 31 11:46:48 2022 +0200
> 
> Use (size_t)(-1) instead of SIZE_MAX and fix style
> 
> SIZE_MAX cannot be relied upon to fully reflect size_t's size. A
> portable idiom is to simply cast -1 to size_t to get the type's maximum
> value.
> 
> Signed-off-by: Laslo Hunhold 
> 
> diff --git a/gen/util.c b/gen/util.c
> index cefcee7..bfe0dbf 100644
> --- a/gen/util.c
> +++ b/gen/util.c
> @@ -34,11 +34,12 @@ struct break_test_payload
>  static void *
>  reallocate_array(void *p, size_t len, size_t size)
>  {
> - if (len > 0 && size > SIZE_MAX/len) {
> + if (len > 0 && size > (size_t)(-1) / len) {
>   errno = ENOMEM;
>   return NULL;
>   }
> - return realloc(p, len*size);
> +
> + return realloc(p, len * size);
>  }
>  
>  int
> 




Re: [hackers] [libgrapheme] Add reallocarray implementation || robert

2022-07-31 Thread Mattias Andrée
On Sun, 31 Jul 2022 11:47:40 +0200
 wrote:

> commit bdf42537c5792f6beb0360517ff378834cfd8a68
> Author: robert 
> AuthorDate: Sat Jul 30 14:29:05 2022 -0700
> Commit: Laslo Hunhold 
> CommitDate: Sun Jul 31 11:41:08 2022 +0200
> 
> Add reallocarray implementation
> 
> reallocarray is nonstandard and glibc declares it only when _GNU_SOURCE
> is defined. Without this patch or _GNU_SOURCE (for glibc < 2.29) defined,
> you get a segfault from reallocarray being implicitly declared with the
> wrong signature.
> 
> Signed-off-by: Laslo Hunhold 
> 
> diff --git a/gen/util.c b/gen/util.c
> index d234ddd..c97a1ea 100644
> --- a/gen/util.c
> +++ b/gen/util.c
> @@ -31,6 +31,16 @@ struct break_test_payload
>   size_t *testlen;
>  };
>  
> +static void *
> +reallocarray(void *p, size_t len, size_t size)
> +{
> + if (len > 0 && size > SIZE_MAX/len) {

I think

if (size && len > SIZE_MAX / size) {

would be a little nicer: the compiler has a better chance of optimising
it to simply `len > SOME_CONSTANT` (or of removing the if-statement
completely should `size` be 0), and I think it is also more intuitive,
as you think in terms of whether the memory can fit enough elements,
not whether it can fit large enough elements.

> + errno = ENOMEM;
> + return NULL;
> + }
> + return realloc(p, len*size);
> +}
> +
>  int
>  hextocp(const char *str, size_t len, uint_least32_t *cp)
>  {
> 
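
For reference, the suggested form of the check can be sketched as a
complete function. The name `checked_reallocarray` is made up for this
example so that it cannot clash with a libc-provided reallocarray; the
body is otherwise the function from the patch with the condition swapped.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Sketch of an overflow-checked array reallocation. The name is
 * hypothetical; only the condition differs from the patch above.
 */
static void *
checked_reallocarray(void *p, size_t len, size_t size)
{
	/* for size > 0, len * size overflows exactly when len > SIZE_MAX / size */
	if (size && len > SIZE_MAX / size) {
		errno = ENOMEM;
		return NULL;
	}

	return realloc(p, len * size);
}
```

When `size` is a compile-time constant such as sizeof(int), SIZE_MAX / size
is a constant too, which is what gives the compiler the chance to reduce
the test to a single comparison.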




Re: [hackers] [libgrapheme] Use SIZE_MAX instead of (size_t)-1 || Laslo Hunhold

2021-12-18 Thread Mattias Andrée
It appears you are correct, I've been tricked by some tool that
checked for undefined behaviour during runtime (don't remember
which, it was a website that forced it upon the user). Casting
a signed value X to an N-bit unsigned integer shall result in
the number that is congruent with X modulo 2^N. So, 2^N + X for
negative numbers.
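
A small sketch of that rule, with N = 8 so the numbers are easy to check
by hand. The helper names are made up for this example, and it assumes the
implementation provides uint8_t (an optional type, though practically
universal).

```c
#include <stddef.h>
#include <stdint.h>

/* Convert a signed value to an 8-bit unsigned integer (N = 8). */
static uint8_t
to_u8(int x)
{
	/* well defined: the result is congruent with x modulo 2^8 */
	return (uint8_t)x;
}

/* Returns 1 if (size_t)-1 and SIZE_MAX are the same value. */
static int
size_max_matches(void)
{
	return (size_t)-1 == SIZE_MAX;
}
```

So to_u8(-1) is 255 and to_u8(-3) is 253, that is 2^8 + X, and the same
rule makes (size_t)-1 the largest value of size_t.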

And yes, signed overflow is undefined, despite some LinkedIn
Learning course I took claiming otherwise (it even claimed that
C always used two's complement). (And no, LinkedIn Learning is
not worth your money, whatever it may cost; my employer pays
for it.)


On Sat, 18 Dec 2021 15:07:30 -0500
Ethan Sommer  wrote:

> On Sat, Dec 18, 2021 at 3:02 PM Mattias Andrée  wrote:
> 
> > (size_t)-1 is also undefined behaviour.  
> 
> 
> It isn't, wrap-around with unsigned types is defined, it's only signed
> overflow that isn't.
> 
> 
> > On Sat, 18 Dec 2021 20:21:42 +0100
> >  wrote:
> >  
> > > commit cb7e9c00899ae0ed57a84991308b7f880f4ddef6
> > > Author: Laslo Hunhold 
> > > AuthorDate: Sat Dec 18 20:21:04 2021 +0100
> > > Commit: Laslo Hunhold 
> > > CommitDate: Sat Dec 18 20:21:04 2021 +0100
> > >
> > > Use SIZE_MAX instead of (size_t)-1
> > >
> > > This makes a bit clearer what we mean, and given the library is C99
> > > we can rely on this constant to exist.
> > >
> > > Signed-off-by: Laslo Hunhold 
> > >
> > > diff --git a/man/grapheme_decode_utf8.3 b/man/grapheme_decode_utf8.3
> > > index 26e3afb..d5c7c9d 100644
> > > --- a/man/grapheme_decode_utf8.3
> > > +++ b/man/grapheme_decode_utf8.3
> > > @@ -31,8 +31,8 @@ Given NUL has a unique 1 byte representation, it is  
> > safe to operate on  
> > >  NUL-terminated strings by setting
> > >  .Va len
> > >  to
> > > -.Dv (size_t)-1
> > > -and terminating when
> > > +.Dv SIZE_MAX
> > > +(stdint.h is already included by grapheme.h) and terminating when
> > >  .Va cp
> > >  is 0 (see
> > >  .Sx EXAMPLES
> > > @@ -87,7 +87,7 @@ print_cps_nul_terminated(const char *str)
> > >   uint_least32_t cp;
> > >
> > >   for (off = 0; (ret = grapheme_decode_utf8(str + off,
> > > -   (size_t)-1, &cp)) > 0 &&
> > > +   SIZE_MAX, &cp)) > 0 &&
> > >cp != 0; off += ret) {
> > >   printf("%"PRIxLEAST32"\\n", cp);
> > >   }
> > > diff --git a/src/character.c b/src/character.c
> > > index 015b4e0..8f1143f 100644
> > > --- a/src/character.c
> > > +++ b/src/character.c
> > > @@ -197,19 +197,19 @@ grapheme_next_character_break(const char *str)
> > >* miss it, even if the previous UTF-8 sequence terminates
> > >* unexpectedly, as it would either act as an unexpected byte,
> > >* saved for later, or as a null byte itself, that we can catch.
> > > -  * We pass (size_t)-1 to the length, as we will never read beyond
> > > +  * We pass SIZE_MAX to the length, as we will never read beyond
> > >* the null byte for the reasons given above.
> > >*/
> > >
> > >   /* get first codepoint */
> > > - len += grapheme_decode_utf8(str, (size_t)-1, &cp0);
> > > + len += grapheme_decode_utf8(str, SIZE_MAX, &cp0);
> > >   if (cp0 == GRAPHEME_INVALID_CODEPOINT) {
> > >   return len;
> > >   }
> > >
> > >   while (cp0 != 0) {
> > >   /* get next codepoint */
> > > - ret = grapheme_decode_utf8(str + len, (size_t)-1, &cp1);
> > > + ret = grapheme_decode_utf8(str + len, SIZE_MAX, &cp1);
> > >
> > >   if (cp1 == GRAPHEME_INVALID_CODEPOINT ||
> > >   grapheme_is_character_break(cp0, cp1, &state)) {
> > >  
> >
> >
> >  




Re: [hackers] [libgrapheme] Use SIZE_MAX instead of (size_t)-1 || Laslo Hunhold

2021-12-18 Thread Mattias Andrée
(size_t)-1 is also undefined behaviour.

On Sat, 18 Dec 2021 20:21:42 +0100
 wrote:

> commit cb7e9c00899ae0ed57a84991308b7f880f4ddef6
> Author: Laslo Hunhold 
> AuthorDate: Sat Dec 18 20:21:04 2021 +0100
> Commit: Laslo Hunhold 
> CommitDate: Sat Dec 18 20:21:04 2021 +0100
> 
> Use SIZE_MAX instead of (size_t)-1
> 
> This makes a bit clearer what we mean, and given the library is C99
> we can rely on this constant to exist.
> 
> Signed-off-by: Laslo Hunhold 
> 
> diff --git a/man/grapheme_decode_utf8.3 b/man/grapheme_decode_utf8.3
> index 26e3afb..d5c7c9d 100644
> --- a/man/grapheme_decode_utf8.3
> +++ b/man/grapheme_decode_utf8.3
> @@ -31,8 +31,8 @@ Given NUL has a unique 1 byte representation, it is safe to operate on
>  NUL-terminated strings by setting
>  .Va len
>  to
> -.Dv (size_t)-1
> -and terminating when
> +.Dv SIZE_MAX
> +(stdint.h is already included by grapheme.h) and terminating when
>  .Va cp
>  is 0 (see
>  .Sx EXAMPLES
> @@ -87,7 +87,7 @@ print_cps_nul_terminated(const char *str)
>   uint_least32_t cp;
>  
>   for (off = 0; (ret = grapheme_decode_utf8(str + off,
> -   (size_t)-1, &cp)) > 0 &&
> +   SIZE_MAX, &cp)) > 0 &&
>cp != 0; off += ret) {
>   printf("%"PRIxLEAST32"\\n", cp);
>   }
> diff --git a/src/character.c b/src/character.c
> index 015b4e0..8f1143f 100644
> --- a/src/character.c
> +++ b/src/character.c
> @@ -197,19 +197,19 @@ grapheme_next_character_break(const char *str)
>* miss it, even if the previous UTF-8 sequence terminates
>* unexpectedly, as it would either act as an unexpected byte,
>* saved for later, or as a null byte itself, that we can catch.
> -  * We pass (size_t)-1 to the length, as we will never read beyond
> +  * We pass SIZE_MAX to the length, as we will never read beyond
>* the null byte for the reasons given above.
>*/
>  
>   /* get first codepoint */
> - len += grapheme_decode_utf8(str, (size_t)-1, &cp0);
> + len += grapheme_decode_utf8(str, SIZE_MAX, &cp0);
>   if (cp0 == GRAPHEME_INVALID_CODEPOINT) {
>   return len;
>   }
>  
>   while (cp0 != 0) {
>   /* get next codepoint */
> - ret = grapheme_decode_utf8(str + len, (size_t)-1, &cp1);
> + ret = grapheme_decode_utf8(str + len, SIZE_MAX, &cp1);
>  
>   if (cp1 == GRAPHEME_INVALID_CODEPOINT ||
>   grapheme_is_character_break(cp0, cp1, &state)) {
> 




Re: [hackers] [libgrapheme] Rename API functions to improve readability || Laslo Hunhold

2021-12-18 Thread Mattias Andrée
I would prefer the “libgrapheme_” prefix, so that it
is obvious that the functions belong to the libgrapheme
library.


Regards,
Mattias Andrée


On Sat, 18 Dec 2021 19:54:57 +0100
 wrote:

> commit 4483b44e8444d4a57bcbb31dbe9eac3e6b80c1ad
> Author: Laslo Hunhold 
> AuthorDate: Sat Dec 18 19:49:34 2021 +0100
> Commit: Laslo Hunhold 
> CommitDate: Sat Dec 18 19:49:34 2021 +0100
> 
> Rename API functions to improve readability
> 
> I thought about how to address the fact that "isbreak" and "nextbreak"
> kind of breaks the snake case, but "grapheme_character_is_break" sounds
> convoluted.
> 
> The solution is to loosen the naming a bit and not require the
> "component" (in this case "character") to immediately follow the
> "grapheme_" prefix. Instead, the "is" and "next" keywords are brought
> to the front, which improves the readability substantially and the
> functions are well-grouped into "is" and "next" functions.
> 
> Analogously, it makes more sense to "decode_utf8" than "utf8_decode",
> so this was changed as well, including going back to
> GRAPHEME_INVALID_CODEPOINT, which just rolls off the tongue better.
> 
> Signed-off-by: Laslo Hunhold 
> 
> diff --git a/Makefile b/Makefile
> index 8f6d694..cdda874 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -25,10 +25,10 @@ TEST =\
>   test/utf8-encode\
>  
>  MAN3 =\
> - man/lg_grapheme_isbreak.3\
> - man/lg_grapheme_nextbreak.3\
> - man/lg_utf8_decode.3\
> - man/lg_utf8_encode.3\
> + man/grapheme_decode_utf8.3\
> + man/grapheme_encode_utf8.3\
> + man/grapheme_is_character_break.3\
> + man/grapheme_next_character_break.3\
>  
>  MAN7 = man/libgrapheme.7
>  
> diff --git a/grapheme.h b/grapheme.h
> index b9c381c..ea8a02d 100644
> --- a/grapheme.h
> +++ b/grapheme.h
> @@ -17,13 +17,13 @@ typedef struct grapheme_internal_segmentation_state {
>   uint_least16_t flags;
>  } GRAPHEME_STATE;
>  
> -#define GRAPHEME_CODEPOINT_INVALID UINT32_C(0xFFFD)
> +#define GRAPHEME_INVALID_CODEPOINT UINT32_C(0xFFFD)
>  
> -size_t grapheme_character_nextbreak(const char *);
> +size_t grapheme_next_character_break(const char *);
>  
> -bool grapheme_character_isbreak(uint_least32_t, uint_least32_t, GRAPHEME_STATE *);
> +bool grapheme_is_character_break(uint_least32_t, uint_least32_t, GRAPHEME_STATE *);
>  
> -size_t grapheme_utf8_decode(const char *, size_t, uint_least32_t *);
> -size_t grapheme_utf8_encode(uint_least32_t, char *, size_t);
> +size_t grapheme_decode_utf8(const char *, size_t, uint_least32_t *);
> +size_t grapheme_encode_utf8(uint_least32_t, char *, size_t);
>  
>  #endif /* GRAPHEME_H */
> diff --git a/man/grapheme_utf8_decode.3 b/man/grapheme_decode_utf8.3
> similarity index 84%
> rename from man/grapheme_utf8_decode.3
> rename to man/grapheme_decode_utf8.3
> index 6a1f5c2..26e3afb 100644
> --- a/man/grapheme_utf8_decode.3
> +++ b/man/grapheme_decode_utf8.3
> @@ -1,16 +1,16 @@
>  .Dd 2021-12-17
> -.Dt GRAPHEME_UTF8_DECODE 3
> +.Dt GRAPHEME_DECODE_UTF8 3
>  .Os suckless.org
>  .Sh NAME
> -.Nm grapheme_utf8_decode
> +.Nm grapheme_decode_utf8
>  .Nd decode first codepoint in UTF-8-encoded string
>  .Sh SYNOPSIS
>  .In grapheme.h
>  .Ft size_t
> -.Fn grapheme_utf8_decode "const char *str" "size_t len" "uint_least32_t *cp"
> +.Fn grapheme_decode_utf8 "const char *str" "size_t len" "uint_least32_t *cp"
>  .Sh DESCRIPTION
>  The
> -.Fn grapheme_utf8_decode
> +.Fn grapheme_decode_utf8
>  function decodes the next codepoint in the UTF-8-encoded string
>  .Va str
>  of length
> @@ -18,7 +18,7 @@ of length
>  If the UTF-8-sequence is invalid (overlong encoding, unexpected byte,
>  string ends unexpectedly, empty string, etc.) the decoding is stopped
>  at the last processed byte and the decoded codepoint set to
> -.Dv GRAPHEME_CODEPOINT_INVALID.
> +.Dv GRAPHEME_INVALID_CODEPOINT.
>  .Pp
>  If
>  .Va cp
> @@ -39,7 +39,7 @@ is 0 (see
>  for an example).
>  .Sh RETURN VALUES
>  The
> -.Fn grapheme_utf8_decode
> +.Fn grapheme_decode_utf8
>  function returns the number of processed bytes and 0 if
>  .Va str
>  is
> @@ -65,7 +65,7 @@ print_cps(const char *str, size_t len)
>   uint_least32_t cp;
>  
>   for (off = 0; off < len; off += ret) {
> - if ((ret = grapheme_utf8_decode(str + off,
> + if ((ret = grapheme_decode_utf8(str + off,
>   len - 

Re: [hackers] A very unconventional unit testing library

2021-08-11 Thread Mattias Andrée
Hi Thomas,

On Wed, 11 Aug 2021 13:42:25 +0200
Thomas Oltmann  wrote:

> Hi Andrée,
> 
> Am Mi., 11. Aug. 2021 um 13:12 Uhr schrieb Mattias Andrée :
> > This looks like a very neat test framework. I would however like
> > the file name to be printed in addition to line number as the
> > test can cover multiple files: the file with the tests and files
> > with utility functions, or replacements for standard functions,
> > that the tests use.  
> 
> Thus far, my tests were simply pushing their filenames onto the hierarchy,
> which was good enough for my purposes.

That's a very good solution, I didn't think of that possibility.

> But you're right, it would be more flexible to always have the
> filename be part of the error message.
> 
> On a related note, I'm also not quite satisfied with the way dh_cuts
> is currently recovering from failures.
> Right now, after a failed assert, it will just keep going, reporting
> all further errors.
> This is sometimes useful, but when you're running tests in a loop, it
> can potentially result in a lot of messages.
> I think it might be best if I change it so that dh_cuts automatically
> aborts if any asserts fail.

Yes, you definitely want to abort as tests can assume that previous
tests were successful, and you don't want too much noise if it fails
in a large loop.

You could let the user choose what code to execute on failure, with
`exit(1);` as the default. This would let the user perform a long jump to
the next test that doesn't depend on the failed code, or even do nothing
and just carry on.
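
A rough sketch of what I mean, using setjmp/longjmp. None of these names
exist in dh_cuts; `on_fail`, `check`, `skip_to_next_test` and `run_tests`
are all made up for the illustration.

```c
#include <setjmp.h>
#include <stdio.h>
#include <stdlib.h>

/* All names below are hypothetical; this is not the dh_cuts API. */
static void default_on_fail(void) { exit(1); }

static void (*on_fail)(void) = default_on_fail; /* user-replaceable */
static jmp_buf next_test;
static int failures;

static void
check(int cond, const char *expr)
{
	if (cond)
		return;
	failures++;
	fprintf(stderr, "failed: %s\n", expr);
	on_fail(); /* may exit, long jump, or simply return */
}

static void skip_to_next_test(void) { longjmp(next_test, 1); }

/* Runs two independent tests; returns 1 if the second one completed. */
static int
run_tests(void)
{
	volatile int second_done = 0;

	on_fail = skip_to_next_test;

	if (!setjmp(next_test)) {
		check(1 == 2, "1 == 2"); /* fails and jumps out of this test */
		check(0, "never evaluated");
	}

	if (!setjmp(next_test)) {
		check(2 == 2, "2 == 2");
		second_done = 1;
	}

	return second_done;
}
```

With the default handler the first failure ends the program; with an empty
handler `check` just returns and the test carries on.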

> You can still recover from the abort and keep going by encapsulating
> the test in a dh_branch() macro,
> so this shouldn't pose much of an issue.
> 
> Best regards,
>   Thomas
> 

Regards,
Mattias Andrée



Re: [hackers] A very unconventional unit testing library

2021-08-11 Thread Mattias Andrée
Hi Thomas,

This looks like a very neat test framework. I would however like
the file name to be printed in addition to line number as the
test can cover multiple files: the file with the tests and files
with utility functions, or replacements for standard functions,
that the tests use.
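
Something along these lines would do. This is only a sketch with made-up
names (`format_location`, `REPORT`), not code from dh_cuts. Because the
macro is expanded at the call site, `__FILE__` and `__LINE__` name the file
and line of the failing check even when the macro lives in a shared header.

```c
#include <stdio.h>

/* Hypothetical reporting helper; not taken from dh_cuts. */
static int
format_location(char *buf, size_t size, const char *file, int line,
                const char *expr)
{
	return snprintf(buf, size, "%s:%d: assertion failed: %s",
	                file, line, expr);
}

/* buf must be a char array (not a pointer), since sizeof is applied to it */
#define REPORT(buf, expr) \
	format_location((buf), sizeof(buf), __FILE__, __LINE__, #expr)
```

A check in a utility file would then report the utility file's name rather
than the name of the file containing the tests.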


Regards,
Mattias Andrée


On Tue, 10 Aug 2021 23:45:43 +0200
Thomas Oltmann  wrote:

> Hi everybody!
> 
> Quite some time ago, at slcon19, someone was really interested in my
> very unconventional way of unit testing C code,
> and wanted me to write a post about it on the mailing list. I'm a
> couple years late, but here it is anyway:
> 
> 
> I've tried multiple well-known C unit testing frameworks in the past,
> but they all suffered from the same issues:
> 
> - They're quite large and complicated because they pack lots of
> features for huge projects with huge development teams.
>   For small projects like mine (or pretty much all suckless projects)
> most of the offered features are complete overkill.
> 
> - they force you to organize your tests into a fixed hierarchy of
> 'test suite' > 'test case' > 'test' or the like.
>   This is a bad fit for any project but those of one specific size class.
> 
> - they make debugging really hard, because they tend to manage the
> overall control flow themselves,
>   spawning all sorts of threads and child processes, so that merely
> attaching a debugger can be a pain.
> 
> - they love to spam info logs, often making it difficult to see
> whether anything went wrong or not.
>   On a side note, some of them have this absurd notion that failing a
> couple tests is not a big deal ?!
> 
> 
> So I wrote my own library, which I called 'dh_cuts' - "Dynamic
> Hierarchy C Unit Testing System".
> It is shamefully trivial, but served me well so far.
> 
> The central idea is to replace the fixed hierarchy of tests nested in
> suites, as found in other frameworks,
> with a naming hierarchy that is completely dynamic at runtime. This is
> realized as follows:
> When you want your code to enter a new nested level in the hierarchy,
> you call dh_push(""),
> and when you want to leave it again, you call dh_pop().
> 
> An example:
> 
> void test_the_flux_capacitor() {
> dh_push("flux capacitor");
> ...
> dh_assert(condition_1 == true);
> ...
> dh_push("turning some knobs");
> ...
> dh_assert(condition_2 == true);
> ...
> dh_pop();
> 
> dh_pop();
> }
> 
> void run_scifi_testsuite() {
> dh_push("sci-fi");
> test_the_flux_capacitor();
> dh_pop();
> }
> 
> int main() {
> dh_init(stderr);
> run_scifi_testsuite();
> }
> 
> Per default, if all asserts are passed, this program will print nothing.
> But suppose the test fails because condition_2 is false. In that case, dh_cuts
> will print a trace of the part of the hierarchy where the error occurred:
> 
> └ sci-fi
> ․․└ flux capacitor
> └ turning some knobs
> ․․└ triggered assert in line 013: condition_2 == true   ← FAIL
> 
> This system is great in terms of debuggability, because your tests can convey
> as much diagnostic information via dh_push() as you see fit.
> dh_push() even accepts the same formatted messages as printf(), so you can
> insert things like iteration counts into the hierarchy, and they won't clutter
> the output because they're only shown for asserts that fail:
> 
> void monte_carlo_test() {
> dh_push("monte carlo");
> for (int n = 0; n < 100; n++) {
> dh_push("iteration %d", n);
> /* perform the n-th round of randomized testing */
> ...
> dh_pop();
> }
> dh_pop();
> }
> 
> 
> There's a handful of other features to dh_cuts that make it more practical:
> 
> - You can use the macro dh_branch like this:
> dh_branch(
> do_some_stuff();
> more *stuff = ...;
> ...
> )
>  to sandbox the code in the parentheses, meaning if that code crashes,
> then code following the dh_branch() macro should still be able to execute.
> 
> - There's multiple different dh_assert() variations for convenience as
> well as dh_throw() to unconditionally fail with a custom error
> message.
> 
> - dh_summarize() can be used to print a one-line summary of executed
> vs failed checks.
> 
> - If you don't want the output to include fancy Unicode sequences, you
> can define
>   DH_OPTION_ASCII_ONLY as 1 before including dh_cuts.h.
> 
> - The entire thing is just a tiny single header library, so you can
> simply copy-paste it if you want, no dependency management necessary.
> 
> 
> If you're interested, you can find the code here:
> https://github.com/tomolt/dh_cuts/blob/master/dh_cuts.h
> As a more complete example, I've also attached some testing code for a
> basic hashtable implementation to this post.
> Quite frankly, I don't expect anybody else to start using it, but I
> thought people here might be interested in the idea.
> 
> Cheers,
>   Thomas Oltmann
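
The push/pop naming hierarchy described above could be sketched roughly as
follows. This is not how dh_cuts is implemented; `push_name`, `pop_name`,
`format_path` and the fixed limits are assumptions made for the example.
It only shows the core idea: a stack of printf-formatted names that is
rendered when something fails.

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

#define MAX_DEPTH 16 /* levels that are stored; deeper ones are only counted */
#define MAX_NAME  64

static char names[MAX_DEPTH][MAX_NAME];
static int depth;

/* Enter a new level; accepts printf-style formatting. */
static void
push_name(const char *fmt, ...)
{
	va_list ap;

	if (depth < MAX_DEPTH) {
		va_start(ap, fmt);
		vsnprintf(names[depth], MAX_NAME, fmt, ap);
		va_end(ap);
	}
	depth++; /* always counted, so push and pop stay balanced */
}

/* Leave the current level. */
static void
pop_name(void)
{
	if (depth > 0)
		depth--;
}

/* Render the current hierarchy as "a > b > c" into buf. */
static void
format_path(char *buf, size_t size)
{
	size_t used = 0;
	int i, n;

	if (!size)
		return;
	buf[0] = '\0';
	for (i = 0; i < depth && i < MAX_DEPTH && used < size; i++) {
		n = snprintf(buf + used, size - used, "%s%s",
		             i ? " > " : "", names[i]);
		if (n < 0)
			break;
		used += (size_t)n; /* on truncation this ends the loop */
	}
}
```

Since names are only rendered on failure, pushing something like
"iteration %d" inside a loop costs one vsnprintf per iteration and prints
nothing while the tests pass.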




Re: [hackers] setsid: add optional -f to force fork()

2020-07-14 Thread Mattias Andrée
Hi,

Is there any reason you would want to force it?


Regards,
Mattias Andrée


On Tue, 14 Jul 2020 10:15:43 +0200
Hiltjo Posthuma  wrote:

> Hi,
> 
> The below patch adds an -f flag to force fork(2)ing and creating a new process.
> 
> 
> From a75ef384c11b64732dd6a3adc9249ba6beb8a67e Mon Sep 17 00:00:00 2001
> From: Hiltjo Posthuma 
> Date: Tue, 14 Jul 2020 10:11:43 +0200
> Subject: [PATCH] setsid: add optional -f to force fork()
> 
> ---
>  setsid.1 | 3 ++-
>  setsid.c | 9 +++--
>  2 files changed, 9 insertions(+), 3 deletions(-)
> 
> diff --git a/setsid.1 b/setsid.1
> index d43bcfc..4df6439 100644
> --- a/setsid.1
> +++ b/setsid.1
> @@ -1,4 +1,4 @@
> -.Dd 2015-10-08
> +.Dd 2020-07-14
>  .Dt SETSID 1
>  .Os sbase
>  .Sh NAME
> @@ -6,6 +6,7 @@
>  .Nd run a command in a new session
>  .Sh SYNOPSIS
>  .Nm
> +.Op Fl f
>  .Ar cmd
>  .Op Ar arg ...
>  .Sh DESCRIPTION
> diff --git a/setsid.c b/setsid.c
> index 28d3442..3355b40 100644
> --- a/setsid.c
> +++ b/setsid.c
> @@ -4,10 +4,12 @@
>  
>  #include "util.h"
>  
> +static int fflag = 0;
> +
>  static void
>  usage(void)
>  {
> - eprintf("usage: %s cmd [arg ...]\n", argv0);
> + eprintf("usage: %s [-f] cmd [arg ...]\n", argv0);
>  }
>  
>  int
> @@ -16,6 +18,9 @@ main(int argc, char *argv[])
>   int savederrno;
>  
>   ARGBEGIN {
> + case 'f':
> + fflag = 1;
> + break;
>   default:
>   usage();
>   } ARGEND
> @@ -23,7 +28,7 @@ main(int argc, char *argv[])
>   if (!argc)
>   usage();
>  
> - if (getpgrp() == getpid()) {
> + if (fflag || getpgrp() == getpid()) {
>   switch (fork()) {
>   case -1:
>   eprintf("fork:");
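
For context on the condition being changed: setsid(2) fails with EPERM when
the caller is a process group leader, which is why the existing code forks
when getpgrp() == getpid(). A child created by fork(2) is never a group
leader, so setsid() can succeed there. The sketch below restates that; the
helper names `should_fork` and `setsid_in_child` are made up and are not
part of sbase.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper mirroring the condition in the patch. */
static int
should_fork(int fflag, pid_t pgrp, pid_t pid)
{
	/* a process group leader cannot call setsid(2) itself */
	return fflag || pgrp == pid;
}

/* Fork and call setsid() in the child; returns 0 if setsid() succeeded. */
static int
setsid_in_child(void)
{
	int status;
	pid_t pid = fork();

	if (pid < 0)
		return -1;
	if (pid == 0)
		_exit(setsid() < 0); /* child: exit status 0 on success */
	if (waitpid(pid, &status, 0) < 0)
		return -1;

	return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

With -f the fork happens even when it is not required for setsid() to
succeed, so the command always runs in a new process.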




[hackers] A (much) simpler syscall tracer

2020-06-01 Thread Mattias Andrée
Hi everyone!

For the last few days I've been writing an alternative implementation
of strace(1):

https://github.com/maandree/sctrace

It is currently only implemented for x86-64 Linux.

It is ready for use, but you may find that it, as of yet, does not
provide a lot of information for every system call (there are a lot
of system calls, so it will take some time). Apart from that,
everything is finished, except that there are two flags that may be
useful to add: one to print detailed signal information, and one to
truncate strings included in the output (so that, for example, cp(1)
on a 1G file doesn't flood your terminal with uninteresting binary
data from the file, and you can more easily find the system calls).
And then, maybe some code cleanup is needed.

Unlike strace(1), this program will not allow you to modify syscall
results, filter traces, or give you timestamps, and will not give
you ugly output, and it returns syscall results as is without
changing negative error codes to -1.
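
To show what "as is" means here: on Linux, a failing system call returns a
negative errno value in the range -4095 to -1 from the kernel, and the libc
wrapper turns that into -1 and sets errno. The sketch below is only an
illustration of that conversion; `libc_style_result` is a made-up name and
is not code from sctrace.

```c
#include <errno.h>

/*
 * Convert a raw Linux system call return value to the libc
 * convention: errors become -1 with errno set, anything else is
 * passed through unchanged.
 */
static long
libc_style_result(long raw)
{
	if (raw < 0 && raw >= -4095) {
		errno = (int)-raw;
		return -1;
	}

	return raw;
}
```

A tracer that skips this step shows the raw value, for example -2 for
ENOENT, instead of -1.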

Is there any interest in this project?

Otherwise, I will stop development here and only occasionally add
some improvements (like syscall formatters) and only accept patches
(like syscall formatters; they are easy to add, hint, hint), and
continue on with my next project, which was the reason I made this
in the first place: a fakeroot(1)-like utility for package managers
and static linking.


Best regards,
Mattias Andrée



Re: [hackers] [libgrapheme][PATCH] Simplify cp_decode and be more strict

2020-05-28 Thread Mattias Andrée
On Thu, 28 May 2020 19:33:12 +0200
Laslo Hunhold  wrote:

> On Thu, 28 May 2020 16:49:22 +0200
> Mattias Andrée  wrote:
> 
> Dear Mattias,
> 
> > > > So would you recommend an explicit cast to uint32_t, i.e.
> > > > 
> > > >(uint32_t)1 << 16
> > > > 
> > > > to overcome this?  
> > > 
> > > Yes.
> > 
> > Don't forget about 0x10, where you need UINT32_C(0x10).
> > Casting doesn't work here unless you also add the L suffix.  
> 
> thanks for your input on this. I've fixed these problems and also
> simplified the mask-operation as you've suggested.
> 
> With best regards
> 
> Laslo
> 

Looks good. But I wouldn't call the mask operation change
an optimisation (even though it is); then I would have
suggested `*cp = s[0] ^ lut[off].lower`. :)
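
For anyone wondering why all the variants discussed in this thread are
interchangeable, here is a small self-check. The table is an assumption
made for the example: it uses the standard UTF-8 lead-byte ranges with the
field names `lower` and `upper` from the thread, and is not copied from
libgrapheme.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical table: UTF-8 lead-byte ranges by sequence length. */
struct range {
	uint8_t lower;
	uint8_t upper;
};

static const struct range lut[] = {
	{ 0x00, 0x7F }, /* 0xxxxxxx */
	{ 0xC0, 0xDF }, /* 110xxxxx */
	{ 0xE0, 0xEF }, /* 1110xxxx */
	{ 0xF0, 0xF7 }, /* 11110xxx */
};

/*
 * Returns 1 if masking, subtracting, XOR-ing and AND-ing with the
 * complement all extract the same payload bits for every lead byte
 * inside its range.
 */
static int
payload_forms_agree(void)
{
	size_t off;
	unsigned b, mask, sub;

	for (off = 0; off < sizeof(lut) / sizeof(*lut); off++) {
		mask = (unsigned)(lut[off].upper - lut[off].lower);
		for (b = lut[off].lower; b <= lut[off].upper; b++) {
			sub = b - lut[off].lower;
			if ((b & mask) != sub ||
			    (b ^ lut[off].lower) != sub ||
			    (b & ~(unsigned)lut[off].lower) != sub)
				return 0;
		}
	}

	return 1;
}
```

They agree because every bit set in `lower` is also set in any byte within
the range, so subtracting, XOR-ing and masking all clear exactly those
bits. This only holds after the range check has passed.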



Re: [hackers] [libgrapheme][PATCH] Simplify cp_decode and be more strict

2020-05-28 Thread Mattias Andrée
On Thu, 28 May 2020 16:45:20 +0200
Mattias Andrée  wrote:

> Hi Laslo,
> 
> On Thu, 28 May 2020 15:53:32 +0200
> Laslo Hunhold  wrote:
> 
> > On Thu, 28 May 2020 13:48:18 +0200
> > Mattias Andrée  wrote:
> > 
> > Dear Mattias,
> >   
> > > Looks good, and I especially like the simplification it brings for
> > > using partially loaded strings.
> > 
> > I'm glad to hear that. Thanks!
> >   
> > > However, I have three minor comments:
> > > 
> > > I preferred `lut[off].mask` over `(lut[off].upper - lut[off].lower)`.
> > > It is clearer what it means, and storing the mask in `lut` doesn't
> > > even increase its size since it is padded anyway because `mincp` is
> > > (at least on x86-64 and i386) aligned to 4 bytes. An alternative,
> > > is to use `~lut[off].lower` which I think is clearer than
> > > `(lut[off].upper - lut[off].lower)`, but again, I prefer
> > > `lut[off].mask`. You could also write
> > >   *cp = s[0] - lut[off].lower;
> > > I think this alternative is about as clear as using `lut[off].mask`.
> > 
> > I was first wary of this way, because it would be problematic if s[0] <
> > lut[off].lower, but because we check this beforehand this is possible.
> > I'll note it and add it later.
> >   
> > > In POSIX (but not Linux) `1 << 16` can be either 0, 1, or 2¹⁶,
> > > since `1` is an `int` which minimum width is 16, not 32. Similarly,
> > > `0x10` could overflow to 0x.
> > 
> > So would you recommend an explicit cast to uint32_t, i.e.
> > 
> >(uint32_t)1 << 16
> > 
> > to overcome this?  
> 
> Yes.

Don't forget about 0x10, where you need UINT32_C(0x10).
Casting doesn't work here unless you also add the L suffix.

> 
> >   
> > > I think `(s[i] & 0xC0) != 0x80` is clearer than `!BETWEEN(s[i], 0x80,
> > > 0xBF)`, but since you changed this I assume you disagree.
> > 
> > I don't disagree either way. The comment I added above is sufficient in
> > terms of readability. I'm not a big fan of micro-optimizations and
> > prefer higher "readability". Both solutions are readable enough, given
> > a proper comment, but I just went with the "BETWEEN"-approach as it is
> > similar to how we check it earlier.
> > 
> > With best regards
> > 
> > Laslo
> > 
> > PS: No need to CC me, I am subscribed to the list. :P
> >   
> 
> Sorry, I pressed reply to all instead of reply to list
> without really looking, I need to remove the former option.
> 
> 
> Regards,
> Mattias Andrée
> 




Re: [hackers] [libgrapheme][PATCH] Simplify cp_decode and be more strict

2020-05-28 Thread Mattias Andrée
Hi Laslo,

On Thu, 28 May 2020 15:53:32 +0200
Laslo Hunhold  wrote:

> On Thu, 28 May 2020 13:48:18 +0200
> Mattias Andrée  wrote:
> 
> Dear Mattias,
> 
> > Looks good, and I especially like the simplification it brings for
> > using partially loaded strings.  
> 
> I'm glad to hear that. Thanks!
> 
> > However, I have three minor comments:
> > 
> > I preferred `lut[off].mask` over `(lut[off].upper - lut[off].lower)`.
> > It is clearer what it means, and storing the mask in `lut` doesn't
> > even increase its size since it is padded anyway because `mincp` is
> > (at least on x86-64 and i386) aligned to 4 bytes. An alternative,
> > is to use `~lut[off].lower` which I think is clearer than
> > `(lut[off].upper - lut[off].lower)`, but again, I prefer
> > `lut[off].mask`. You could also write
> > *cp = s[0] - lut[off].lower;
> > I think this alternative is about as clear as using `lut[off].mask`.  
> 
> I was first wary of this way, because it would be problematic if s[0] <
> lut[off].lower, but because we check this beforehand this is possible.
> I'll note it and add it later.
> 
> > In POSIX (but not Linux) `1 << 16` can be either 0, 1, or 2¹⁶,
> > since `1` is an `int` which minimum width is 16, not 32. Similarly,
> > `0x10` could overflow to 0x.  
> 
> So would you recommend an explicit cast to uint32_t, i.e.
> 
>(uint32_t)1 << 16
> 
> to overcome this?

Yes.
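
Spelled out as a sketch (the function names are made up for this example):
with a 16-bit `int`, `1 << 16` shifts by the full width of the left
operand, which is undefined. Making the left operand a `uint32_t` first
gives the shift a 32-bit type to work in.

```c
#include <stdint.h>

/* Bit 16 via an explicit cast of the left operand. */
static uint32_t
bit16_cast(void)
{
	return (uint32_t)1 << 16;
}

/* Bit 16 via the constant macro from <stdint.h>. */
static uint32_t
bit16_macro(void)
{
	return UINT32_C(1) << 16;
}
```

Both give 65536 regardless of the width of `int`.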

> 
> > I think `(s[i] & 0xC0) != 0x80` is clearer than `!BETWEEN(s[i], 0x80,
> > 0xBF)`, but since you changed this I assume you disagree.  
> 
> I don't disagree either way. The comment I added above is sufficient in
> terms of readability. I'm not a big fan of micro-optimizations and
> prefer higher "readability". Both solutions are readable enough, given
> a proper comment, but I just went with the "BETWEEN"-approach as it is
> similar to how we check it earlier.
> 
> With best regards
> 
> Laslo
> 
> PS: No need to CC me, I am subscribed to the list. :P
> 

Sorry, I pressed reply to all instead of reply to list
without really looking, I need to remove the former option.


Regards,
Mattias Andrée



Re: [hackers] [libgrapheme][PATCH] Simplify cp_decode and be more strict

2020-05-28 Thread Mattias Andrée
Hi Laslo,

Looks good, and I especially like the simplification it brings for
using partially loaded strings. However, I have three minor comments:

I preferred `lut[off].mask` over `(lut[off].upper - lut[off].lower)`.
It is clearer what it means, and storing the mask in `lut` doesn't even
increase its size since it is padded anyway because `mincp` is
(at least on x86-64 and i386) aligned to 4 bytes. An alternative,
is to use `~lut[off].lower` which I think is clearer than
`(lut[off].upper - lut[off].lower)`, but again, I prefer `lut[off].mask`.
You could also write
*cp = s[0] - lut[off].lower;
I think this alternative is about as clear as using `lut[off].mask`.

In POSIX (but not Linux) `1 << 16` can be either 0, 1, or 2¹⁶,
since `1` is an `int` whose minimum width is 16, not 32. Similarly,
`0x10FFFF` could overflow to 0xFFFF.

I think `(s[i] & 0xC0) != 0x80` is clearer than `!BETWEEN(s[i], 0x80, 0xBF)`,
but since you changed this I assume you disagree.
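
For reference, the two ways of testing for a UTF-8 continuation byte mentioned above accept exactly the same bytes. A small sketch, with hypothetical function names and a locally defined `BETWEEN` macro (parenthesised here, which is an assumption and not necessarily how the library defines it):

```c
#include <assert.h>
#include <stdint.h>

#define BETWEEN(c, l, u) ((c) >= (l) && (c) <= (u))

/* Mask form: a continuation byte has the bit pattern 10xxxxxx. */
int
is_cont_mask(uint8_t b)
{
	return (b & 0xC0) == 0x80;
}

/* Range form: the same set of bytes written as the interval 0x80..0xBF. */
int
is_cont_range(uint8_t b)
{
	return BETWEEN(b, 0x80, 0xBF);
}
```

Since both forms are equivalent, the choice between them is purely one of readability.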


Regards,
Mattias Andrée


On Thu, 28 May 2020 13:07:21 +0200
Laslo Hunhold  wrote:

> On Wed, 27 May 2020 15:22:35 +0200
> Mattias Andrée  wrote:
> 
> Dear Mattias,
> 
> >   
> 
> I refactored the decoder on a deeper level to improve the readability.
> It should be more or less clear from the code what is happening, and it
> was my fault back then that I wrote it more as "write-only" code to
> save time.
> 
> Please let me know if you find any issues with the refactored code (it
> was basically rewritten).
> 
> With best regards
> 
> Laslo
> 




[hackers] Re: [libgrapheme][PATCH] Simplify cp_decode and be more strict

2020-05-27 Thread Mattias Andrée
I missed something that I will fix later, but there are three options for what
to do:

grapheme_len() assumes cp_decode() returns 0 at the end of the string, whereas
this change will return 1 (it is counter-intuitive that a UTF-8 decoder would
say that the NUL character is 0 bytes long, as it is indeed a character, and
one which you may want to support (I did when I rewrote the decoder for another
project)).

grapheme_len.3 is sparse on details, but in the example it checks for the NUL
byte before calling grapheme_len() rather than checking if grapheme_len()
returns 0 (when started at the NUL byte).

So option 1 is to make cp_decode return 0 on NUL. I really don't like this
option, as it eliminates the support for the NUL character, and you may want
to return NUL characters without any special handling. And it doesn't simplify
anything for the user even if he wanted to end at NUL.

Option 2 is as option 1, but also change the man page to check if
grapheme_len() returns 0. This is a little better as it would document this
feature.

Option 3 is to change grapheme_len() to support the NUL character. I think this
is the best option, as it would add support for the NUL character without the
user doing anything special, and it barely complicates things. The change
needed in grapheme_len() would be to compare `cp0` instead of `ret` against 0,
after running `len += ret`, in the first call to `cp_decode`. This handling of
NUL in grapheme_len() would still be needed to ensure that it does not read
outside the string.
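
A rough sketch of the idea behind option 3, using hypothetical stand-ins rather than the real cp_decode() and grapheme_len(): the decoder always consumes at least one byte, NUL included, and the caller finds the end of the string by testing the decoded code point instead of the returned length. `toy_decode` only understands ASCII, which is an assumption made to keep the example short.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the decoder: ASCII only. Like the patched
 * decoder, it reports NUL as a one-byte code point, never as 0 bytes. */
size_t
toy_decode(const uint8_t *s, uint32_t *cp)
{
	*cp = (s[0] < 0x80) ? s[0] : UINT32_C(0xFFFD);
	return 1;
}

/* Count the code points before the terminating NUL. The end of the string
 * is detected from the decoded value, after the length has been added. */
size_t
toy_count(const char *str)
{
	size_t n = 0, len = 0;
	uint32_t cp;

	for (;;) {
		len += toy_decode((const uint8_t *)str + len, &cp);
		if (cp == 0)
			break; /* stop here so we never read past the string */
		n++;
	}
	return n;
}
```

With this shape the decoder needs no special case for NUL, and only the string-oriented caller knows that NUL terminates the input.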


Regards,
Mattias Andrée

On Wed, 27 May 2020 15:22:35 +0200
Mattias Andrée  wrote:

> No special cases for NUL and ASCII.
> 
> Simplified check of overencoded codepoints:
> lowest non-overencoded codepoints are stored in the lookup table
> 
> Do not accept surrogate code points.
> From RFC 3629:
>   The definition of UTF-8 prohibits encoding character
>   numbers between U+D800 and U+DFFF, [...]
> 
> Do not accept plane 17 and above:
> From RFC 3629:
>   Restricted the range of characters to 0000-10FFFF
>   (the UTF-16 accessible range).
> ---
>  src/codepoint.c | 73 +++--
>  1 file changed, 22 insertions(+), 51 deletions(-)
> 
> diff --git a/src/codepoint.c b/src/codepoint.c
> index 0b63184..6cb0d4f 100644
> --- a/src/codepoint.c
> +++ b/src/codepoint.c
> @@ -8,73 +8,44 @@
>  size_t
>  cp_decode(const uint8_t *str, Codepoint *p)
>  {
> - size_t off, j, k, l;
> + size_t rank, i, len;
>  
>   struct {
>   uint8_t lower;
>   uint8_t upper;
>   uint8_t mask;
> - uint8_t bits;
> + Codepoint lowest;
>   } lookup[] = {
> - { 0x00, 0x7F, 0xFF,  7 }, /* 00000000 - 01111111, 01111111 */
> - { 0xC0, 0xDF, 0x1F, 11 }, /* 11000000 - 11011111, 00011111 */
> - { 0xE0, 0xEF, 0x0F, 16 }, /* 11100000 - 11101111, 00001111 */
> - { 0xF0, 0xF7, 0x07, 21 }, /* 11110000 - 11110111, 00000111 */
> - { 0xF8, 0xFB, 0x03, 26 }, /* 11111000 - 11111011, 00000011 */
> - { 0xFC, 0xFD, 0x01, 31 }, /* 11111100 - 11111101, 00000001 */
> + { 0x00, 0x7F, 0xFF, UINT32_C(0x000000) },
> + { 0xC0, 0xDF, 0x1F, UINT32_C(0x000080) },
> + { 0xE0, 0xEF, 0x0F, UINT32_C(0x000800) },
> + { 0xF0, 0xF7, 0x07, UINT32_C(0x010000) }
>   };
>  
> - /* empty string */
> - if (str[0] == '\0') {
> - *p = 0;
> - return 0;
> - }
> -
> - /* find out in which ranges str[0] is */
> - for (off = 0; off < LEN(lookup); off++) {
> - if (BETWEEN(str[0], lookup[off].lower, lookup[off].upper)) {
> - *p = str[0] & lookup[off].mask;
> + for (rank = 0; rank < LEN(lookup); rank++)
> + if (BETWEEN(str[0], lookup[rank].lower, lookup[rank].upper))
>   break;
> - }
> - }
> - if (off == 0) {
> - /* ASCII */
> - return 1;
> - } else if (off == LEN(lookup)) {
> - /* not in ranges */
> + if (rank == LEN(lookup)) {
> + /* Out of range */
>   *p = CP_INVALID;
>   return 1;
>   }
>  
> - /* off denotes the number of upcoming expected code units */
> - for (j = 0; j < off; j++) {
> - if (str[j] == '\0') {
> - *p = CP_INVALID;
> - return j;
> - }
> - if ((str[1 + j] & 0xC0) != 0x80) {
> + *p = (Codepoint)(str[0] & lookup[rank].mask);
> + len = rank + 1;
> + for (i = 1; i < len; i++) {
> + if ((str[i] & 0xC0) != 0x80) {

[hackers] [libgrapheme][PATCH] Simplify cp_decode and be more strict

2020-05-27 Thread Mattias Andrée
No special cases for NUL and ASCII.

Simplified check of overencoded codepoints:
lowest non-overencoded codepoints are stored in the lookup table

Do not accept surrogate code points.
From RFC 3629:
The definition of UTF-8 prohibits encoding character
numbers between U+D800 and U+DFFF, [...]

Do not accept plane 17 and above:
From RFC 3629:
Restricted the range of characters to 0000-10FFFF
(the UTF-16 accessible range).
---
 src/codepoint.c | 73 +++--
 1 file changed, 22 insertions(+), 51 deletions(-)

diff --git a/src/codepoint.c b/src/codepoint.c
index 0b63184..6cb0d4f 100644
--- a/src/codepoint.c
+++ b/src/codepoint.c
@@ -8,73 +8,44 @@
 size_t
 cp_decode(const uint8_t *str, Codepoint *p)
 {
-   size_t off, j, k, l;
+   size_t rank, i, len;
 
struct {
uint8_t lower;
uint8_t upper;
uint8_t mask;
-   uint8_t bits;
+   Codepoint lowest;
} lookup[] = {
-   { 0x00, 0x7F, 0xFF,  7 }, /* 00000000 - 01111111, 01111111 */
-   { 0xC0, 0xDF, 0x1F, 11 }, /* 11000000 - 11011111, 00011111 */
-   { 0xE0, 0xEF, 0x0F, 16 }, /* 11100000 - 11101111, 00001111 */
-   { 0xF0, 0xF7, 0x07, 21 }, /* 11110000 - 11110111, 00000111 */
-   { 0xF8, 0xFB, 0x03, 26 }, /* 11111000 - 11111011, 00000011 */
-   { 0xFC, 0xFD, 0x01, 31 }, /* 11111100 - 11111101, 00000001 */
+   { 0x00, 0x7F, 0xFF, UINT32_C(0x000000) },
+   { 0xC0, 0xDF, 0x1F, UINT32_C(0x000080) },
+   { 0xE0, 0xEF, 0x0F, UINT32_C(0x000800) },
+   { 0xF0, 0xF7, 0x07, UINT32_C(0x010000) }
};
 
-   /* empty string */
-   if (str[0] == '\0') {
-   *p = 0;
-   return 0;
-   }
-
-   /* find out in which ranges str[0] is */
-   for (off = 0; off < LEN(lookup); off++) {
-   if (BETWEEN(str[0], lookup[off].lower, lookup[off].upper)) {
-   *p = str[0] & lookup[off].mask;
+   for (rank = 0; rank < LEN(lookup); rank++)
+   if (BETWEEN(str[0], lookup[rank].lower, lookup[rank].upper))
break;
-   }
-   }
-   if (off == 0) {
-   /* ASCII */
-   return 1;
-   } else if (off == LEN(lookup)) {
-   /* not in ranges */
+   if (rank == LEN(lookup)) {
+   /* Out of range */
*p = CP_INVALID;
return 1;
}
 
-   /* off denotes the number of upcoming expected code units */
-   for (j = 0; j < off; j++) {
-   if (str[j] == '\0') {
-   *p = CP_INVALID;
-   return j;
-   }
-   if ((str[1 + j] & 0xC0) != 0x80) {
+   *p = (Codepoint)(str[0] & lookup[rank].mask);
+   len = rank + 1;
+   for (i = 1; i < len; i++) {
+   if ((str[i] & 0xC0) != 0x80) {
+   /* Not continuation of character */
*p = CP_INVALID;
-   return 1 + j;
+   return 1;
}
-   *p <<= 6;
-   *p |= str[1 + j] & 0x3F; /* 00111111 */
+   *p = (*p << 6) | (str[i] & 0x3F);
}
 
-   if (*p == 0) {
-   if (off != 0) {
-   /* overencoded NUL */
-   *p = CP_INVALID;
-   }
-   } else {
-   /* determine effective bytes */
-   for (k = 0; ((*p << k) & (1 << 31)) == 0; k++)
-   ;
-   for (l = 0; l < off; l++) {
-   if ((32 - k) <= lookup[l].bits) {
-   *p = CP_INVALID;
-   }
-   }
+   if (*p < lookup[rank].lowest || BETWEEN(*p, 0xD800, 0xDFFF) || *p > UINT32_C(0x10FFFF)) {
+   /* Overencoded, surrogate, or out of range */
+   *p = CP_INVALID;
+   return 1;
}
-
-   return 1 + j;
+   return len;
 }
-- 
2.26.2
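
For readers who want to try the approach outside the diff, here is a self-contained sketch of the same table-driven decoding. It is an illustration under assumptions, not the committed code: `sketch_decode`, `lut` and `SKETCH_INVALID` are hypothetical names, and the limits are taken from the RFC 3629 rules quoted in the commit message (no overlong forms, no surrogates, nothing above U+10FFFF).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_INVALID UINT32_C(0xFFFD)
#define LEN(x) (sizeof(x) / sizeof(*(x)))

/* Decode one code point from a NUL-terminated byte string. Returns the
 * number of bytes consumed, which is always at least 1. */
size_t
sketch_decode(const uint8_t *s, uint32_t *cp)
{
	/* lead byte range, payload mask, lowest non-overlong code point */
	static const struct {
		uint8_t lower, upper, mask;
		uint32_t lowest;
	} lut[] = {
		{ 0x00, 0x7F, 0xFF, UINT32_C(0x000000) },
		{ 0xC0, 0xDF, 0x1F, UINT32_C(0x000080) },
		{ 0xE0, 0xEF, 0x0F, UINT32_C(0x000800) },
		{ 0xF0, 0xF7, 0x07, UINT32_C(0x010000) },
	};
	size_t rank, i, len;

	for (rank = 0; rank < LEN(lut); rank++)
		if (s[0] >= lut[rank].lower && s[0] <= lut[rank].upper)
			break;
	if (rank == LEN(lut)) {
		*cp = SKETCH_INVALID; /* not a valid lead byte */
		return 1;
	}

	*cp = (uint32_t)(s[0] & lut[rank].mask);
	len = rank + 1;
	for (i = 1; i < len; i++) {
		if ((s[i] & 0xC0) != 0x80) {
			*cp = SKETCH_INVALID; /* not a continuation byte */
			return 1;
		}
		*cp = (*cp << 6) | (uint32_t)(s[i] & 0x3F);
	}

	if (*cp < lut[rank].lowest || (*cp >= 0xD800 && *cp <= 0xDFFF) ||
	    *cp > UINT32_C(0x10FFFF)) {
		*cp = SKETCH_INVALID; /* overlong, surrogate, or too large */
		return 1;
	}
	return len;
}
```

Because a NUL byte fails the continuation test, the loop stops at the terminator and never reads beyond the end of a truncated sequence.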




Re: [hackers] [libgrapheme][PATCH] Expose cp_decode and boundary (both renamed) to the user

2020-05-09 Thread Mattias Andrée
On Sat, 9 May 2020 07:25:41 +0200
Laslo Hunhold  wrote:

> On Thu, 7 May 2020 18:32:23 +0200
> Mattias Andrée  wrote:
> 
> Dear Mattias,
> 
> > Perhaps, but I wouldn't expect anyone who doesn't understand
> > the difference to even use libgrapheme. But you could also state in
> > grapheme.h and the man pages that all functions except grapheme_len
> > are low-level functions.  
> 
> that could work.
> 
> > Not a goal, but a positive side-effect of exposing the boundary
> > test function.  
> 
> I agree with that, it has a positive side-effect.
> 
> > > The reason I'm conflicted with this change is that there's no
> > > guarantee the grapheme-cluster-boundary algorithm won't get changed
> > > again. It already has been in Unicode 12.0, which made it suddenly
> > > require a state to be carried with it, and there's no guarantee it
> > > won't get even crazier, making it almost infeasible to expose more
> > > than a "gclen()"-function to the user.
> > 
> > How about
> > 
> > typedef struct grapheme_state GRAPHEME_STATE;
> > 
> > /* Hidden from the user { */
> > struct grapheme_state {
> > uint32_t cp0;
> > int state;
> > };
> > /* } */
> > 
> > int grapheme_boundary(uint32_t cp1, GRAPHEME_STATE *);
> > 
> > GRAPHEME_STATE *grapheme_create_state(void);
> > 
> > /* Just in case the state in the future
> >  * would require dynamic allocation */
> > void grapheme_free_state(GRAPHEME_STATE *);
> > 
> > grapheme_boundary() would reset the state each time
> > a boundary is found, so a reset function would only be
> > useful to avoid a new allocation if the grapheme cluster
> > identification process is aborted and started anew for
> > a new text. Since this would be very rare, no reset
> > function is needed.
> > 
> > The only future I can see where this wouldn't be sufficient
> > is if a cluster break (or non-break) could be retroactively
> > inserted where the algorithm already stated that there
> > was no break (or was a break). This would be so bizarre, I
> > cannot imagine this would ever be the case.  
> 
> I don't like this change, because it destroys reentrancy, which is very
> important for multithreaded applications, and complicates things
> unnecessarily.

malloc(3) and free(3) are thread-safe, so there shouldn't be any problem:

GRAPHEME_STATE *state = grapheme_create_state();
... = grapheme_boundary(..., state);
grapheme_free_state(state);
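
To make the reentrancy argument concrete, here is a sketch of the proposed opaque-state API. The declarations follow the proposal quoted above, but the function bodies are entirely hypothetical: in particular the break rule is a toy (no break before U+0300..U+036F combining marks) and is not the Unicode algorithm. What it shows is that every caller owns its own state object, so no global data is shared between threads.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

typedef struct grapheme_state GRAPHEME_STATE;

/* Would be hidden from the user in a real library. */
struct grapheme_state {
	uint32_t cp0; /* previous code point */
	int state;    /* algorithm flags, unused by the toy rule below */
};

/* Returns a zeroed state, or NULL if allocation fails. */
GRAPHEME_STATE *
grapheme_create_state(void)
{
	return calloc(1, sizeof(GRAPHEME_STATE));
}

void
grapheme_free_state(GRAPHEME_STATE *st)
{
	free(st);
}

/* Toy rule, NOT the Unicode algorithm: report a break before every code
 * point except a combining diacritical mark (U+0300..U+036F). */
int
grapheme_boundary(uint32_t cp1, GRAPHEME_STATE *st)
{
	int brk = !(cp1 >= 0x0300 && cp1 <= 0x036F);

	st->cp0 = cp1;
	if (brk)
		st->state = 0; /* reset on every boundary, as proposed */
	return brk;
}
```

Two threads can each call grapheme_create_state() and feed their own text through grapheme_boundary() without interfering with one another.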

> However, I think we should just risk it and assume that further
> versions of the Unicode-Grapheme-Boundary-algorithm will only rely on
> such a state.

I agree.

> 
> > > [...]
> > > What do you think?
> > 
> > I don't see the point of including grapheme_cp_encode(), however
> > I'm not opposed to making a larger UTF-8/Unicode library, rather
> > I think it would be nice to have one place for all my Unicode
> > needs, especially if I otherwise would have a handful of libraries
> > that all have their own UTF-8 decoding functions that all have to be
> > linked.  
> 
> Yes, I agree with that. There are lots of bad and unsafe
> UTF-de/encoders out there and the one in libgrapheme is actually pretty
> fast and safe (e.g. no overencoded nul, proper error-handling, etc.).
> It would be no bloat to expose it outside, as it runs "in the
> background" anyway. It's more of a debate on the "purity" of
> libgrapheme, but when including the boundary function, offering a way
> to read codepoints from a char-array makes a lot of sense.
> 
> With best regards
> 
> Laslo
> 




Re: [hackers] [libgrapheme][PATCH] Expose cp_decode and boundary (both renamed) to the user

2020-05-07 Thread Mattias Andrée
Hi Laslo,

On Thu, 7 May 2020 15:15:24 +0200
Laslo Hunhold  wrote:

> On Sat, 25 Apr 2020 23:03:56 +0200
> Mattias Andrée  wrote:
> 
> Dear Mattias,
> 
> > cp_decode has been renamed to grapheme_decode  
> 
> I personally don't like that change. It's already difficult enough to
> understand the differences between codepoints and grapheme clusters,
> and changing the name of the function to that really complicates it
> more and might cause misunderstandings.

Perhaps, but I wouldn't expect anyone who doesn't understand
the difference to even use libgrapheme. But you could also state in
grapheme.h and the man pages that all functions except grapheme_len
are low-level functions.

> 
> This also contradicts the below goal of providing a solution for
> non-UTF-8-text, which I honestly don't care about. There's no logical
> reason to encode text into anything other than UTF-8 and the best
> approach is to just re-encode everything into UTF-8.

Not a goal, but a positive side-effect of exposing the boundary
test function.

> 
> It's a tough call to decide if we want to turn libgrapheme into a
> general purpose UTF-8-library.

I don't mind hiding the decode function, but since the library already
uses it, I thought it would be nice not to have to implement your own
when using the library.

> 
> > and boundary has been renamed to grapheme_boundary.  
> 
> I'm a bit conflicted with this change, though I would probably expose
> it as grapheme_boundary(uint32_t, uint32_t, int *) if I chose to
> include it. Sure, we do use Codepoint internally as a typedef, but for a
> "public" API I prefer not to have them if possible.

I would as well, I just didn't want to change too much.

> 
> The reason I'm conflicted with this change is that there's no guarantee
> the grapheme-cluster-boundary algorithm won't get changed again. It already
> has been in Unicode 12.0, which made it suddenly require a state to be
> carried with it, and there's no guarantee it won't get even crazier,
> making it almost infeasible to expose more than a "gclen()"-function to
> the user.

How about

typedef struct grapheme_state GRAPHEME_STATE;

/* Hidden from the user { */
struct grapheme_state {
uint32_t cp0;
int state;
};
/* } */

int grapheme_boundary(uint32_t cp1, GRAPHEME_STATE *);

GRAPHEME_STATE *grapheme_create_state(void);

/* Just in case the state in the future
 * would require dynamic allocation */
void grapheme_free_state(GRAPHEME_STATE *);

grapheme_boundary() would reset the state each time
a boundary is found, so a reset function would only be
useful to avoid a new allocation if the grapheme cluster
identification process is aborted and started anew for
a new text. Since this would be very rare, no reset
function is needed.

The only future I can see where this wouldn't be sufficient
is if a cluster break (or non-break) could be retroactively
inserted where the algorithm already stated that there
was no break (or was a break). This would be so bizarre, I
cannot imagine this would ever be the case.

> 
> The current implementation can store 32 states and uses 2 of them for
> the algorithm. In this regard, we still have some headroom.
> 
> > The purpose of this is to allow faster text rendering
> > where both individual code points and grapheme cluster
> > boundaries are of interest, but it also (1) makes it
> > easy to do online processing of large documents (the user
> > does not need to search for spaces, but only know an
> > upper limit for how long an encoding is needed to encode
> > any codepoint) and (2) makes the library easy to use
> > with non-UTF-8 text.  
> 
> As I said above, I don't care about non-UTF-8-text and anything
> non-UTF-8 is either an internal representation (e.g. in Java with
> UTF-16LE) or ancient.
> 
> I see how the stateful function might be useful though for a
> byte-per-byte reading of a file, or something else.
> 
> > This change also eliminates all unnamespaced, non-static
> > functions that are not exposed to the user.  
> 
> This is a very good point! I'll try to solve that in an own coding
> session on the existing code. Thanks for your patch! I will have to
> think about this more. What do you think about the following
> API-overview integrating above changes? This would also include some
> UTF-8 functionality.
> 
> size_t grapheme_cp_decode(uint32_t*, char *, size_t)
>Decode the UTF-8 sequence into a codepoint from the array of given
>length, returning the number of bytes consumed or zero if there was
>an error (which also sets the codepo

[hackers] [libgrapheme][PATCH] Expose cp_decode and boundary (both renamed) to the user

2020-04-25 Thread Mattias Andrée
cp_decode has been renamed to grapheme_decode and
boundary has been renamed to grapheme_boundary.

The purpose of this is to allow faster text rendering
where both individual code points and grapheme cluster
boundaries are of interest, but it also (1) makes it
easy to do online processing of large documents (the user
does not need to search for spaces, but only know an
upper limit for how long an encoding is needed to encode
any codepoint) and (2) makes the library easy to use
with non-UTF-8 text.

This change also eliminates all unnamespaced, non-static
functions that are not exposed to the user.
---
 Makefile| 12 ++--
 grapheme.h  |  7 +++
 src/boundary.h  | 11 ---
 src/boundary_body.c |  5 ++---
 src/codepoint.c |  5 +++--
 src/codepoint.h | 14 --
 src/grapheme.c  |  9 -
 src/test_body.c |  4 ++--
 8 files changed, 24 insertions(+), 43 deletions(-)
 delete mode 100644 src/boundary.h
 delete mode 100644 src/codepoint.h

diff --git a/Makefile b/Makefile
index 5d52598..6b964ed 100644
--- a/Makefile
+++ b/Makefile
@@ -19,10 +19,10 @@ all: libgrapheme.a libgrapheme.so $(BIN)
 
 src/test: src/test.o $(REQ:=.o)
 
-src/boundary.o: src/boundary.c config.mk src/codepoint.h src/boundary.h
-src/codepoint.o: src/codepoint.c config.mk src/codepoint.h
-src/grapheme.o: src/grapheme.c config.mk src/codepoint.h src/boundary.h
-src/test.o: src/test.c config.mk src/codepoint.h src/boundary.h
+src/boundary.o: src/boundary.c config.mk grapheme.h
+src/codepoint.o: src/codepoint.c config.mk grapheme.h
+src/grapheme.o: src/grapheme.c config.mk grapheme.h
+src/test.o: src/test.c config.mk grapheme.h
 
 .o:
$(CC) -o $@ $(LDFLAGS) $< $(REQ:=.o)
@@ -42,7 +42,7 @@ test:
 
 src/boundary.c: data/gbt.awk $(GBP) data/emo.awk $(EMO) src/boundary_body.c
printf "/* Automatically generated by gbp.awk and emo.awk */\n" > $@
-   printf "#include \"codepoint.h\"\n" >> $@
+   printf "#include \"../grapheme.h\"\n" >> $@
awk -f data/gbp.awk $(GBP) >> $@
awk -f data/emo.awk $(EMO) >> $@
printf "\n" >> $@
@@ -51,7 +51,7 @@ src/boundary.c: data/gbt.awk $(GBP) data/emo.awk $(EMO) src/boundary_body.c
 src/test.c: data/gbt.awk $(GBT) src/test_body.c
printf "/* Automatically generated by gbt.awk */\n" > $@
printf "#include \n\n" >> $@
-   printf "#include \"codepoint.h\"\n\n" >> $@
+   printf "#include \"../grapheme.h\"\n\n" >> $@
awk -f data/gbt.awk $(GBT) >> $@
printf "\n" >> $@
cat src/test_body.c >> $@
diff --git a/grapheme.h b/grapheme.h
index dae667b..21e73fe 100644
--- a/grapheme.h
+++ b/grapheme.h
@@ -3,7 +3,14 @@
 #define GRAPHEME_H
 
 #include 
+#include 
+
+typedef uint32_t Codepoint;
+
+#define CP_INVALID 0xFFFD
 
 size_t grapheme_len(const char *);
+size_t grapheme_decode(const char *, Codepoint *);
+int grapheme_boundary(Codepoint, Codepoint, int *);
 
 #endif /* GRAPHEME_H */
diff --git a/src/boundary.h b/src/boundary.h
deleted file mode 100644
index 77d0054..0000000
--- a/src/boundary.h
+++ /dev/null
@@ -1,11 +0,0 @@
-/* See LICENSE file for copyright and license details. */
-#ifndef BOUNDARY_H
-#define BOUNDARY_H
-
-#include 
-
-#include "codepoint.h"
-
-int boundary(Codepoint, Codepoint, int *);
-
-#endif /* BOUNDARY_H */
diff --git a/src/boundary_body.c b/src/boundary_body.c
index 3a160cf..c6dd133 100644
--- a/src/boundary_body.c
+++ b/src/boundary_body.c
@@ -2,8 +2,7 @@
 #include 
 #include 
 
-#include "codepoint.h"
-#include "boundary.h"
+#include "../grapheme.h"
 
 #define LEN(x) (sizeof(x) / sizeof(*x))
 
@@ -119,7 +118,7 @@ is(Codepoint cp[2], char (*props)[2], int index, enum property p)
 #define IS(I, PROP) (is(cp, props, I, PROP))
 
 int
-boundary(Codepoint cp0, Codepoint cp1, int *state)
+grapheme_boundary(Codepoint cp0, Codepoint cp1, int *state)
 {
Codepoint cp[2] = { cp0, cp1 };
char props[NUM_PROPS][2];
diff --git a/src/codepoint.c b/src/codepoint.c
index 0b63184..ed00987 100644
--- a/src/codepoint.c
+++ b/src/codepoint.c
@@ -1,13 +1,14 @@
 /* See LICENSE file for copyright and license details. */
-#include "codepoint.h"
+#include "../grapheme.h"
 #include 
 
 #define BETWEEN(c, l, u) (c >= l && c <= u)
 #define LEN(x) (sizeof(x) / sizeof(*x))
 
 size_t
-cp_decode(const uint8_t *str, Codepoint *p)
+grapheme_decode(const char *str_, Codepoint *p)
 {
+   const uint8_t *str = (const uint8_t *)str_;
size_t off, j, k, l;
 
struct {
diff --git a/src/codepoint.h b/src/codepoint.h
deleted file mode 100644
index dedc2f4..0000000
--- a/src/codepoint.h
+++ /dev/null
@@ -1,14 +0,0 @@
-/* See LICENSE file for copyright and license details. */
-#ifndef CODEPOINT_H
-#define CODEPOINT_H
-
-#include 
-#include 
-
-typedef uint32_t Codepoint;
-
-#define CP_INVALID 0xFFFD
-
-size_t cp_decode(const uint8_t *, Codepoint *);
-
-#endif /* CODEPOINT_H */
diff --git a/src/grapheme.c b/src/grapheme.c
index 4ff917f..445fd8a 100644
--- a/src/

Re: [hackers] Unsubscribe

2020-01-26 Thread Mattias Andrée
:'(


To unsubscribe, send an e-mail to hackers+unsubscr...@suckless.org.
Almost all mailing-lists have a List-Unsubscribe header in the e-mails,
which you can look at if you are not sure how to unsubscribe.



On Sun, 26 Jan 2020 09:42:05 -0700
Jason Eastman  wrote:

> Unsubscribe
> 
> 




[hackers] [ubase][PATCH] df: Fix value in Avail column

2019-11-12 Thread Mattias Andrée
From the df(1) POSIX man page:

   <total space>
     [...] <space used>, <space free>, plus any space reserved
     by the system not normally available to a user.

   <space free>
     [...] for the creation of new files by unprivileged users, [...]

Note, Capacity remains unchanged because it shall be

   <space used>/(<space used> + <space free>)

and it is made explicitly clear that <space used> + <space free> shall not be
<total space>:

   [...] This percentage may be greater than 100 if <space free> is less
   than zero [...]

Signed-off-by: Mattias Andrée 
---
 df.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/df.c b/df.c
index 4d595d6..245cca3 100644
--- a/df.c
+++ b/df.c
@@ -65,8 +65,8 @@ mnt_show(const char *fsname, const char *dir)
 
bs = s.f_frsize / blksize;
total = s.f_blocks * bs;
-   avail = s.f_bfree * bs;
-   used = total - avail;
+   avail = s.f_bavail * bs;
+   used = total - s.f_bfree * bs;
 
if (used + avail) {
capacity = (used * 100) / (used + avail);
-- 
2.24.0
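
As a worked illustration of the patched arithmetic, here is a small sketch with hypothetical names (`struct df_row`, `df_compute`). The inputs stand for the `f_blocks`, `f_bfree` and `f_bavail` fields of `struct statvfs` and the block-size multiplier `bs`. Returning a capacity of 0 for an empty file system and truncating the percentage are assumptions, since the rest of df.c is not shown in the hunk.

```c
#include <assert.h>
#include <stdint.h>

struct df_row {
	uintmax_t total;    /* all blocks, including reserved ones */
	uintmax_t used;     /* total minus blocks free to anyone */
	uintmax_t avail;    /* blocks free to unprivileged users */
	uintmax_t capacity; /* percent of normally available space in use */
};

struct df_row
df_compute(uintmax_t blocks, uintmax_t bfree, uintmax_t bavail, uintmax_t bs)
{
	struct df_row r;

	r.total = blocks * bs;
	r.avail = bavail * bs;
	r.used = r.total - bfree * bs;
	/* used + avail excludes the reserved blocks, so it can be
	 * smaller than total */
	r.capacity = (r.used + r.avail) ? (r.used * 100) / (r.used + r.avail) : 0;
	return r;
}
```

For example, with 1000 blocks, 300 free and 250 available to unprivileged users, Used is 700 and Avail is 250, so the denominator is 950 rather than the 1000 total blocks.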




[hackers] [ubase][PATCH v2] Add vmstat

2019-11-05 Thread Mattias Andrée
Signed-off-by: Mattias Andrée 
---
 LICENSE  |   1 +
 Makefile |   1 +
 TODO |   2 +-
 vmstat.c | 322 +++
 4 files changed, 325 insertions(+), 1 deletion(-)
 create mode 100644 vmstat.c

diff --git a/LICENSE b/LICENSE
index 76cf9ea..8b8e1ee 100644
--- a/LICENSE
+++ b/LICENSE
@@ -32,3 +32,4 @@ Authors/contributors include:
 © 2014 Roberto E. Vargas Caballero 
 © 2014 Jan Tatje 
 © 2015 Risto Salminen 
+© 2019 Mattias Andrée 
diff --git a/Makefile b/Makefile
index b526421..e1dab9b 100644
--- a/Makefile
+++ b/Makefile
@@ -86,6 +86,7 @@ BIN = \
umount\
unshare   \
uptime\
+   vmstat\
vtallow   \
watch \
who
diff --git a/TODO b/TODO
index 21f5c20..6dd69d3 100644
--- a/TODO
+++ b/TODO
@@ -27,7 +27,7 @@ tabs
 taskset
 top
 tput
-vmstat
+vmstat [-sdDpS]
 
 Misc
 
diff --git a/vmstat.c b/vmstat.c
new file mode 100644
index 0000000..34208a4
--- /dev/null
+++ b/vmstat.c
@@ -0,0 +1,322 @@
+/* See LICENSE file for copyright and license details. */
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "util.h"
+
+struct vm {
+   intmax_t cpu_user;
+   intmax_t cpu_nice;
+   intmax_t cpu_system;
+   intmax_t cpu_idle;
+   intmax_t cpu_iowait;
+   intmax_t cpu_irq;
+   intmax_t cpu_softirq;
+   intmax_t cpu_steal;
+   intmax_t cpu_guest;
+   intmax_t cpu_guest_nice;
+   intmax_t cpu_unknown;
+   intmax_t page_in;
+   intmax_t page_out;
+   intmax_t swap_in;
+   intmax_t swap_out;
+   intmax_t intr;
+   intmax_t ctxt_switches;
+   intmax_t processes;
+   intmax_t proc_running;
+   intmax_t proc_blocked;
+   intmax_t mem_free;
+   intmax_t buffers;
+   intmax_t cached;
+   intmax_t active;
+   intmax_t inactive;
+   intmax_t swap_total;
+   intmax_t swap_free;
+   intmax_t sreclaimable;
+};
+
+static intmax_t kb_per_page;
+static intmax_t hz;
+
+static int
+stpstarts(char *str, const char *head, char **end)
+{
+   size_t n = strlen(head);
+   if (!strncmp(str, head, n)) {
+   *end = &str[n];
+   return 1;
+   }
+   return 0;
+}
+
+static intmax_t
+read_ints(const char *s, intmax_t *arr, size_t n)
+{
+   const char *beginning = s;
+   intmax_t rest = 0;
+   size_t tmp;
+   int negative;
+
+   memset(arr, 0, n * sizeof(*arr));
+
+   for (; n--; arr++) {
+   while (*s && !isdigit(*s))
+   s++;
+   negative = (s != beginning && s[-1] == '-');
+   while (isdigit(*s))
+   *arr = *arr * 10 + (*s++ - '0');
+   if (negative)
+   *arr = -*arr;
+   }
+
+   for (; *s; rest += tmp) {
+   tmp = 0;
+   while (*s && !isdigit(*s))
+   s++;
+   negative = (s != beginning && s[-1] == '-');
+   while (isdigit(*s))
+   tmp = tmp * 10 + (*s++ - '0');
+   if (negative)
+   tmp = -tmp;
+   }
+
+   return rest;
+}
+
+static void
+load_vm(struct vm *s)
+{
+   static intmax_t debt = 0;
+
+   int have_page = 0, have_swap = 0;
+   char *line = NULL, *p;
+   size_t size = 0;
+   ssize_t len;
+   FILE *fp;
+
+   memset(s, 0, sizeof(*s));
+
+   fp = fopen("/proc/stat", "r");
+   if (!fp)
+   eprintf("fopen /proc/stat:");
+   while ((len = getline(&line, &size, fp)) >= 0) {
+   if (stpstarts(line, "cpu ", &p)) {
+   s->cpu_unknown = read_ints(p, &s->cpu_user, 10);
+   } else if (stpstarts(line, "page ", &p)) {
+   read_ints(p, &s->page_in, 2);
+   have_page = 1;
+   } else if (stpstarts(line, "swap ", &p)) {
+   read_ints(p, &s->swap_in, 2);
+   have_swap = 1;
+   } else if (stpstarts(line, "intr ", &p)) {
+   read_ints(p, &s->intr, 1);
+   } else if (stpstarts(line, "ctxt ", &p)) {
+   read_ints(p, &s->ctxt_switches, 1);
+   } else if (stpstarts(line, "processes ", &p)) {
+   read_ints(p, &s->processes, 1);
+   } else if (stpstarts(line, "proc_running ", &p)) {
+   read_ints(p, &s->proc_running, 1);
+   } else if (stpstarts(line, "proc_blocked ", &p)) {
+   read_ints(p, &s->proc_bl

Re: [hackers] [ubase][PATCH] Add vmstat

2019-11-05 Thread Mattias Andrée
On Tue, 5 Nov 2019 12:03:04 -0800
Michael Forney  wrote:

> On 2019-11-05, Mattias Andrée  wrote:
> > On Sat, 2 Nov 2019 12:33:58 -0700
> > Michael Forney  wrote:
> >  
> >> I've never used vmstat before, but this looks pretty good overall and
> >> seems to work well.
> >>
> >> On 2019-10-05, Mattias Andrée  wrote:  
> >> > +goto beginning;
> >> > +for (; argc && (argc < 2 || i < count); i++) {  
> >>
> >> Why not just set count = 1 when argc < 2?  
> >
> > Because that would stop the loop after 1 iteration.
> > If argc == 1, the loop should be infinite.  
> 
> Oh, right.
> 
> > An alternative that would work is:
> >
> > for (;;) {
> > load_vm(&vm[i & 1]);
> > print_vm(&vm[i & 1], &vm[~i & 1], active_mem, timestamp, one_header ? !i : (i % 50 == 0));
> > i++;
> > if (!argc || (argc == 2 && i == count))
> > break;
> > clock_nanosleep(CLOCK_MONOTONIC, 0, &delay, NULL);
> > }  
> 
> FWIW, I like this approach.

I will make a new version with that approach then.



[hackers] [ubase][PATCH] vmstat.c: fix typo in comment

2019-11-05 Thread Mattias Andrée
Signed-off-by: Mattias Andrée 
---
 vmstat.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/vmstat.c b/vmstat.c
index d97ee10..4c809bb 100644
--- a/vmstat.c
+++ b/vmstat.c
@@ -282,7 +282,7 @@ main(int argc, char *argv[])
timestamp = 1;
break;
case 'w':
-   /* Ignored for compatibility (allow output to be wider then 80 columns) */
+   /* Ignored for compatibility (allow output to be wider than 80 columns) */
break;
default:
usage();
-- 
2.23.0




Re: [hackers] [ubase][PATCH] Add vmstat

2019-11-05 Thread Mattias Andrée
On Sat, 2 Nov 2019 12:33:58 -0700
Michael Forney  wrote:

> I've never used vmstat before, but this looks pretty good overall and
> seems to work well.
> 
> On 2019-10-05, Mattias Andrée  wrote:
> > +   goto beginning;
> > +   for (; argc && (argc < 2 || i < count); i++) {  
> 
> Why not just set count = 1 when argc < 2?

Because that would stop the loop after 1 iteration.
If argc == 1, the loop should be infinite.

> 
> > +   clock_nanosleep(CLOCK_MONOTONIC, 0, &delay, NULL);
> > +   beginning:
> > +   load_vm(&vm[i & 1]);
> > +   print_vm(&vm[i & 1], &vm[~i & 1], active_mem, timestamp, one_header ? !i : (i % 50 == 0));
> > +   }  
> 
> I think it might be a bit clearer to re-arrange the loop body to avoid
> the labels and goto.
> 
> Something like
> 
>   count = argc == 2 ? atoll(argv[1]) : 1;
>   for (;;) {
>   load_vm(&vm[i & 1]);
>   print_vm(&vm[i & 1], &vm[~i & 1], active_mem, timestamp, one_header ? !i : (i % 50 == 0));
>   if (++i == count)
>   break;
>   clock_nanosleep(CLOCK_MONOTONIC, 0, &delay, NULL);
>   }
> 

This does not work as expected if argc == 1,
it would just run it once rather than forever.

An alternative that would work is:

for (;;) {
	load_vm(&vm[i & 1]);
	print_vm(&vm[i & 1], &vm[~i & 1], active_mem, timestamp, one_header ? !i : (i % 50 == 0));
	i++;
	if (!argc || (argc == 2 && i == count))
		break;
	clock_nanosleep(CLOCK_MONOTONIC, 0, &delay, NULL);
}

or (I prefer this one less):

load_vm(&vm[i & 1]);
print_vm(&vm[i & 1], &vm[~i & 1], active_mem, timestamp, one_header ? !i : (i % 50 == 0));
for (i++; argc && (argc < 2 || i < count); i++) {
	clock_nanosleep(CLOCK_MONOTONIC, 0, &delay, NULL);
	load_vm(&vm[i & 1]);
	print_vm(&vm[i & 1], &vm[~i & 1], active_mem, timestamp, one_header ? !i : (i % 50 == 0));
}

(or this `argc &&` broken out as an if-statement)

I prefer the original version.
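
To check the stop rule of the first alternative without running vmstat, the condition can be pulled out into a function. This is only a sketch with hypothetical names (`should_stop`, `count_reports`, `struct sample`), and the `limit` parameter is an assumption that exists solely so the "run forever" case terminates in a test. It also shows the two-slot buffer trick: `vm[i & 1]` is the current sample and `vm[~i & 1]` the previous one.

```c
#include <assert.h>

struct sample {
	long long value;
};

/* Stop rule after ARGEND: no operands means one report, a delay alone
 * means run forever, a delay plus a count means `count` reports. */
int
should_stop(int argc, long long i, long long count)
{
	return !argc || (argc == 2 && i == count);
}

/* Hypothetical driver: returns the number of reports that would be
 * printed, capped at `limit`. */
long long
count_reports(int argc, long long count, long long limit)
{
	struct sample vm[2] = { { 0 }, { 0 } };
	long long i = 0, delta;

	while (i < limit) {
		vm[i & 1].value = i * 10;                       /* "load" */
		delta = vm[i & 1].value - vm[~i & 1].value;     /* "print" */
		(void)delta;
		i++;
		if (should_stop(argc, i, count))
			break;
		/* the real loop sleeps here */
	}
	return i;
}
```

Placing the test between the report and the sleep is what avoids a pointless final delay before exiting.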



[hackers] [ubase][PATCH] Add vmstat

2019-10-05 Thread Mattias Andrée
Signed-off-by: Mattias Andrée 
---
 Makefile |   1 +
 TODO |   2 +-
 vmstat.c | 321 +++
 3 files changed, 323 insertions(+), 1 deletion(-)
 create mode 100644 vmstat.c

diff --git a/Makefile b/Makefile
index b526421..e1dab9b 100644
--- a/Makefile
+++ b/Makefile
@@ -86,6 +86,7 @@ BIN = \
umount\
unshare   \
uptime\
+   vmstat\
vtallow   \
watch \
who
diff --git a/TODO b/TODO
index 21f5c20..6dd69d3 100644
--- a/TODO
+++ b/TODO
@@ -27,7 +27,7 @@ tabs
 taskset
 top
 tput
-vmstat
+vmstat [-sdDpS]
 
 Misc
 
diff --git a/vmstat.c b/vmstat.c
new file mode 100644
index 0000000..d97ee10
--- /dev/null
+++ b/vmstat.c
@@ -0,0 +1,321 @@
+/* See LICENSE file for copyright and license details. */
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "util.h"
+
+struct vm {
+   intmax_t cpu_user;
+   intmax_t cpu_nice;
+   intmax_t cpu_system;
+   intmax_t cpu_idle;
+   intmax_t cpu_iowait;
+   intmax_t cpu_irq;
+   intmax_t cpu_softirq;
+   intmax_t cpu_steal;
+   intmax_t cpu_guest;
+   intmax_t cpu_guest_nice;
+   intmax_t cpu_unknown;
+   intmax_t page_in;
+   intmax_t page_out;
+   intmax_t swap_in;
+   intmax_t swap_out;
+   intmax_t intr;
+   intmax_t ctxt_switches;
+   intmax_t processes;
+   intmax_t proc_running;
+   intmax_t proc_blocked;
+   intmax_t mem_free;
+   intmax_t buffers;
+   intmax_t cached;
+   intmax_t active;
+   intmax_t inactive;
+   intmax_t swap_total;
+   intmax_t swap_free;
+   intmax_t sreclaimable;
+};
+
+static intmax_t kb_per_page;
+static intmax_t hz;
+
+static int
+stpstarts(char *str, const char *head, char **end)
+{
+   size_t n = strlen(head);
+   if (!strncmp(str, head, n)) {
+   *end = &str[n];
+   return 1;
+   }
+   return 0;
+}
+
+static intmax_t
+read_ints(const char *s, intmax_t *arr, size_t n)
+{
+   const char *beginning = s;
+   intmax_t rest = 0;
+   size_t tmp;
+   int negative;
+
+   memset(arr, 0, n * sizeof(*arr));
+
+   for (; n--; arr++) {
+   while (*s && !isdigit(*s))
+   s++;
+   negative = (s != beginning && s[-1] == '-');
+   while (isdigit(*s))
+   *arr = *arr * 10 + (*s++ - '0');
+   if (negative)
+   *arr = -*arr;
+   }
+
+   for (; *s; rest += tmp) {
+   tmp = 0;
+   while (*s && !isdigit(*s))
+   s++;
+   negative = (s != beginning && s[-1] == '-');
+   while (isdigit(*s))
+   tmp = tmp * 10 + (*s++ - '0');
+   if (negative)
+   tmp = -tmp;
+   }
+
+   return rest;
+}
+
+static void
+load_vm(struct vm *s)
+{
+   static intmax_t debt = 0;
+
+   int have_page = 0, have_swap = 0;
+   char *line = NULL, *p;
+   size_t size = 0;
+   ssize_t len;
+   FILE *fp;
+
+   memset(s, 0, sizeof(*s));
+
+   fp = fopen("/proc/stat", "r");
+   if (!fp)
+   eprintf("fopen /proc/stat:");
+   while ((len = getline(&line, &size, fp)) >= 0) {
+   if (stpstarts(line, "cpu ", &p)) {
+   s->cpu_unknown = read_ints(p, &s->cpu_user, 10);
+   } else if (stpstarts(line, "page ", &p)) {
+   read_ints(p, &s->page_in, 2);
+   have_page = 1;
+   } else if (stpstarts(line, "swap ", &p)) {
+   read_ints(p, &s->swap_in, 2);
+   have_swap = 1;
+   } else if (stpstarts(line, "intr ", &p)) {
+   read_ints(p, &s->intr, 1);
+   } else if (stpstarts(line, "ctxt ", &p)) {
+   read_ints(p, &s->ctxt_switches, 1);
+   } else if (stpstarts(line, "processes ", &p)) {
+   read_ints(p, &s->processes, 1);
+   } else if (stpstarts(line, "proc_running ", &p)) {
+   read_ints(p, &s->proc_running, 1);
+   } else if (stpstarts(line, "proc_blocked ", &p)) {
+   read_ints(p, &s->proc_blocked, 1);
+   }
+   }
+   if (ferror(fp))
+   eprintf("getline /proc/stat:");
+   fclose(fp);
+
+   if (!have_page || !have_swap) {
+   fp = fopen("/proc/vmstat", "r"

Re: [hackers] [sbase][PATCH] which: check AT_EACCESS

2019-07-29 Thread Mattias Andrée
On Mon, 29 Jul 2019 18:46:25 -0700
Michael Forney  wrote:

> On 2019-07-27, Mattias Andrée  wrote:
> > A file is executable only if the effective user
> > has permission to execute it. The real user's
> > permissions do not matter.  
> 
> Thanks for the patch, but doesn't this only make a difference if the
> `which` binary itself is setuid? If not, can you provide an example
> that is fixed by this patch?
> 
> I looked at a few other implementations and they just use access(3),
> which behaves like faccessat(3) with no flags.
> 

setuid is inherited (according to POSIX, exec(3) never changes the
effective user unless the new program has the setuid flag and it is not
blocked by the ST_NOSUID mount option). However, I cannot think of a
real-world scenario where this would matter; it would only matter if the
user has a program similar to sudo that changes only the effective user.



[hackers] [sbase][PATCH] which: check AT_EACCESS

2019-07-27 Thread Mattias Andrée
A file is executable only if the effective user
has permission to execute it. The real user's
permissions do not matter.

Signed-off-by: Mattias Andrée 
---
 which.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/which.c b/which.c
index cc93361..42ed095 100644
--- a/which.c
+++ b/which.c
@@ -20,7 +20,7 @@ canexec(int fd, const char *name)
 
if (fstatat(fd, name, &st, 0) < 0 || !S_ISREG(st.st_mode))
return 0;
-   return faccessat(fd, name, X_OK, 0) == 0;
+   return faccessat(fd, name, X_OK, AT_EACCESS) == 0;
 }
 
 static int
-- 
2.22.0




[hackers] [sbase][PATCH] which: remove unnecessary third parameter in O_RDONLY call to open(3)

2019-07-27 Thread Mattias Andrée
Signed-off-by: Mattias Andrée 
---
 which.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/which.c b/which.c
index cc93361..3f9af6e 100644
--- a/which.c
+++ b/which.c
@@ -43,7 +43,7 @@ which(const char *path, const char *name)
if (ptr[i] != ':' && ptr[i] != '\0')
continue;
ptr[i] = '\0';
-   if ((dirfd = open(p, O_RDONLY, 0)) >= 0) {
+   if ((dirfd = open(p, O_RDONLY)) >= 0) {
if (canexec(dirfd, name)) {
found = 1;
fputs(p, stdout);
-- 
2.22.0




Re: [hackers] [sbase][PATCH v2] Add tests for some utilities

2018-08-01 Thread Mattias Andrée
On Wed, 1 Aug 2018 22:07:33 +0200
Mattias Andrée  wrote:

> On Wed, 1 Aug 2018 21:16:26 +0200
> Silvan Jegen  wrote:
> 
> > On Wed, Aug 01, 2018 at 07:53:18PM +0200, Mattias Andrée wrote:  
> > > Thank you for your time!
> > 
> > Thank you for all your work! :P  
> 
> Hi again Silvan,
> 
> > 
> >   
> > > The common code is 590 lines of code, including:
> > > 
> > > * 102 lines of code related to identifying the error when the
> > >   test fails.
> > > 
> > > * 14 lines of code for properly killing processes on failure,
> > >   abortion, and when a test case hangs.
> > > 
> > > * 32 lines of code, plus 13 of the lines counted above,
> > >   for supporting concurrent tests.
> > > 
> > > This leaves 442 lines for the fundamental stuff and a few
> > > lines to support printing all errors rather than just the
> > > first error.
> > > 
> > > Some notes on your sh code (since you wrote “crappy and
> > > probably non-portable”, you are probably aware of this,
> > > but just in case):
> > > 
> > > * `source` is a Bash synonym for the portable `.` command.
> > 
> > Yeah, that sounds familiar.
> >  
> >   
> > > * `echo` is unportable and `printf` should be used instead.
> > 
> > Didn't know that echo was not portable. Thought it was just a builtin
> > that should work the same everywhere. It's probably the flags that are
> > the issue...  
> 
> It's worse than that, nothing about echo is portable:
> 
> - You don't know how backslashes are interpreted.
> 
> - You don't know whether you get a new line at the end.
> 
> - You don't know if -e is supported.
> 
> - You don't know if -n is supported.
> 
> > 
> >   
> > > * I have never seen `&>` before, so my guess is that it
> > >   is a Bashism.
> > 
> > Yeah, seems likely. It's supposed to redirect both stderr and stdout. "sh"
> > did not complain about it but that doesn't mean much...
> > 
> >   
> > > * It looks like whichever file descriptor is not being
> > >   inspected by `check_output` is printed to the inherited
> > >   file descriptor rather than to /dev/null.
> > 
> > Printing behaviour of the tests should be looked at and fixed for sure.
> > 
> >   
> > > * I think it would be better to have something like:
> > > 
> > >   set -e
> > > 
> > >   run "test name" "./dirname //"
> > >   check_stderr ""
> > >   check_stdout / || check_stdout //
> > >   check_exit 0
> > > 
> > > Your sh code, with check_exit added, covers most current tests.
> > > However, every time the usage output is modified it has to
> > > be changed in the test case, which I guess is acceptable but
> > > undesirable. The tests that are currently implemented
> > 
> > I think that is working as intended. It's a behaviour change in the code
> > and should result in an error (unless we decide that we don't want to
> > test the usage output of course).
> > 
> >   
> > > and which need to be handled specially are:
> > > 
> > > * sleep:
> > >   This can be done with sh. With some adaption to the sh
> > >   code, tests can also be done in parallel as it is in
> > >   the C code.
> > > 
> > > * test:
> > >   test takes a lot of time to test, which is why multiple
> > >   tests are run in parallel in the C code. Like tty(1),
> > >   this test also requires the creation of terminals, but
> > >   it also requires the creation of sockets.
> > > 
> > > * time:
> > >   Requires setitimer(3) and pause(3).
> > > 
> > > * tty:
> > >   This test requires the creation of terminals.
> > > 
> > > * uname:
> > >   Most of uname can be tested in ed, however, the output  
> 
> s/ed/sh/ # I guess you understood that, but I cannot stand not correcting it.
> 
> > >   of uname with only one flag applied requires the uname
> > >   system call.
> > > 
> > > * whoami:
> > >   The user name can be retrieved via $LOGNAME according
> > >   to POSIX, however this requires that your login program
> > >   actually sets it. Additionally (and this should be added
> > >   to the test) when whoami is called from a program with
> > > 

Re: [hackers] [sbase][PATCH v2] Add tests for some utilities

2018-08-01 Thread Mattias Andrée
f               ps              pwd             pwdx
readahead       readlink        renice          respawn
rev             rm              rmdir           rmmod
sed             seq*            setsid          sha1sum*
sha224sum*      sha256sum*      sha384sum*      sha512-224sum*
sha512-256sum*  sha512sum*      sort            split
sponge          stat            strings         stty
su              swaplabel       swapoff         swapon
switch_root     sync            sysctl          tail
tar             tee             tftp            touch
tr              truncate        tsort           umount
uniq*           unshare         uptime          uudecode
uuencode        vtallow         watch           wc
which           who             xargs

I think we should look into what their tests will require, maybe
even write some tests with an optional test framework, especially
for those that cannot be trivially tested in sh. We should
also identify which of them cannot be automatically tested
without a virtual machine, and write them down in the README,
possibly with a list of features to test. Once this is done, we
can have a more informed discussion.

For the moment I see two options: 1) write the tests mostly
in sh and write special utilities in C where required, and
2) use the C code, but look for ways to improve it.

As I see it, the most complex parts of the C code are:

* start_process:
It's probably enough to split out some code to
separate functions: the `if (in->flags & DATA)` block
and the dup2–close loops at the end.

* wait_process:
Instead of reading all file descriptors as fast as
possible, they could probably be read in order.

* check_output_test:
It's probably enough to add a few short comments
and improve variable names.

* print_failure:
It's probably enough to add some empty lines and a
few short comments.

The other parts are pretty straightforward.
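
As an illustration of the kind of helper the dup2–close step in
start_process could be split into, here is a minimal sketch. The name
move_fd is hypothetical and the code is not taken from test-common.c;
it only shows the usual pattern of installing a descriptor at a fixed
number and then closing the original.

```c
#include <unistd.h>

/*
 * Hypothetical helper: make `to` refer to the file that `from`
 * refers to, then close `from`.  Returns 0 on success, -1 on error.
 */
static int
move_fd(int from, int to)
{
	if (from == to)
		return 0; /* already in place, nothing to close */
	if (dup2(from, to) < 0)
		return -1;
	return close(from);
}
```

A child process would typically call this for each of stdin, stdout
and stderr right before exec(3).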


Regards,
Mattias Andrée



On Wed, 1 Aug 2018 16:36:35 +0200
Silvan Jegen  wrote:

> Hi Mattias!
> 
> On Wed, Jul 11, 2018 at 09:39:23PM +0200, Mattias Andrée wrote:
> > The following utilities are tested:
> > - basename(1)
> > - dirname(1)
> > - echo(1)
> > - false(1)
> > - link(1)
> > - printenv(1)
> > - sleep(1)
> > - test(1)
> > - time(1)
> > - true(1)
> > - tty(1)
> > - uname(1)
> > - unexpand(1)
> > - unlink(1)
> > - whoami(1)
> > - yes(1)
> > 
> > Some tests contain "#ifdef TODO"; these tests currently
> > fail, but there are patches submitted for most of them.
> > There are no patches submitted for fixing the
> > "#ifdef TODO"s in expand.test.c and unexpand.test.c.
> > 
> > Signed-off-by: Mattias Andrée   
> 
> Sorry for not getting around to looking at this earlier.
> 
> I definitely think we should have unit tests for sbase (and other
> projects?) as soon as possible. What concerns me with your approach is
> that we have about 700 lines of C code in testing-common.{c,h} of which
> I feel quite a bit could be dropped.
> 
> I have written some (crappy and probably non-portable) shell script
> functions to check the stdout and stderr of a process. It's about 40
> lines. I also converted your tests for dirname to use these functions
> (both files attached. The test coverage is not exactly the same but
> relatively similar).
> 
> I wonder if we couldn't use some cleaned-up version of the shell script
> functions for the easy test cases that only check stdout and stderr
> output and your custom C code for the more specialised test cases (like
> 'tty').
> 
> What do you think?
> 
> 
> Cheers,
> 
> Silvan
> 
> 
> > ---
> >  Makefile|  45 -
> >  basename.test.c |  68 +++
> >  dirname.test.c  |  55 ++
> >  echo.test.c |  51 ++
> >  expand.test.c   |  92 ++
> >  false.test.c|  32 
> >  link.test.c |  58 ++
> >  printenv.test.c |  79 
> >  sleep.test.c|  53 ++
> >  test-common.c   | 560 
> > 
> >  test-common.h   | 219 ++
> >  test.test.c | 408 +
> >  time.test.c | 218 ++
> >  true.test.c |  31 
> >  tty.test.c  |  44 +
> >  uname.test.c| 283 
> >  unexpand.test.c |  97 ++
> >  unlink.test.c   |  56 ++
> >  whoami.test.c   |  38 
> >  yes.test.c  | 131 +
> >  20 files changed, 2614 insertions(+), 4 deletions(-)
> >
> > [snip]  





[hackers] [sbase][PATCH v2] Add tests for some utilities

2018-07-11 Thread Mattias Andrée
The following utilities are tested:
- basename(1)
- dirname(1)
- echo(1)
- false(1)
- link(1)
- printenv(1)
- sleep(1)
- test(1)
- time(1)
- true(1)
- tty(1)
- uname(1)
- unexpand(1)
- unlink(1)
- whoami(1)
- yes(1)

Some tests contain "#ifdef TODO"; these tests currently
fail, but there are patches submitted for most of them.
There are no patches submitted for fixing the
"#ifdef TODO"s in expand.test.c and unexpand.test.c.

Signed-off-by: Mattias Andrée 
---
 Makefile|  45 -
 basename.test.c |  68 +++
 dirname.test.c  |  55 ++
 echo.test.c |  51 ++
 expand.test.c   |  92 ++
 false.test.c|  32 
 link.test.c |  58 ++
 printenv.test.c |  79 
 sleep.test.c|  53 ++
 test-common.c   | 560 
 test-common.h   | 219 ++
 test.test.c | 408 +
 time.test.c | 218 ++
 true.test.c |  31 
 tty.test.c  |  44 +
 uname.test.c| 283 
 unexpand.test.c |  97 ++
 unlink.test.c   |  56 ++
 whoami.test.c   |  38 
 yes.test.c  | 131 +
 20 files changed, 2614 insertions(+), 4 deletions(-)
 create mode 100644 basename.test.c
 create mode 100644 dirname.test.c
 create mode 100644 echo.test.c
 create mode 100644 expand.test.c
 create mode 100644 false.test.c
 create mode 100644 link.test.c
 create mode 100644 printenv.test.c
 create mode 100644 sleep.test.c
 create mode 100644 test-common.c
 create mode 100644 test-common.h
 create mode 100644 test.test.c
 create mode 100644 time.test.c
 create mode 100644 true.test.c
 create mode 100644 tty.test.c
 create mode 100644 uname.test.c
 create mode 100644 unexpand.test.c
 create mode 100644 unlink.test.c
 create mode 100644 whoami.test.c
 create mode 100644 yes.test.c

diff --git a/Makefile b/Makefile
index 0e421e7..b83058f 100644
--- a/Makefile
+++ b/Makefile
@@ -1,7 +1,7 @@
 include config.mk
 
 .SUFFIXES:
-.SUFFIXES: .o .c
+.SUFFIXES: .test .test.o .o .c
 
 HDR =\
arg.h\
@@ -19,7 +19,8 @@ HDR =\
sha512-256.h\
text.h\
utf.h\
-   util.h
+   util.h\
+   test-common.h
 
 LIBUTF = libutf.a
 LIBUTFSRC =\
@@ -181,9 +182,28 @@ BIN =\
xinstall\
yes
 
+TEST =\
+   basename.test\
+   dirname.test\
+   echo.test\
+   expand.test\
+   false.test\
+   link.test\
+   printenv.test\
+   sleep.test\
+   test.test\
+   time.test\
+   true.test\
+   tty.test\
+   uname.test\
+   unexpand.test\
+   unlink.test\
+   whoami.test\
+   yes.test
+
 LIBUTFOBJ = $(LIBUTFSRC:.c=.o)
 LIBUTILOBJ = $(LIBUTILSRC:.c=.o)
-OBJ = $(BIN:=.o) $(LIBUTFOBJ) $(LIBUTILOBJ)
+OBJ = $(BIN:=.o) $(TEST:=.o) test-common.o $(LIBUTFOBJ) $(LIBUTILOBJ)
 SRC = $(BIN:=.c)
 MAN = $(BIN:=.1)
 
@@ -193,12 +213,17 @@ $(BIN): $(LIB) $(@:=.o)
 
 $(OBJ): $(HDR) config.mk
 
+$(TEST): $(@:=.o) test-common.o
+
 .o:
$(CC) $(LDFLAGS) -o $@ $< $(LIB)
 
 .c.o:
$(CC) $(CFLAGS) $(CPPFLAGS) -o $@ -c $<
 
+.test.o.test:
+   $(CC) $(LDFLAGS) -o $@ $< test-common.o
+
 $(LIBUTF): $(LIBUTFOBJ)
$(AR) rc $@ $?
$(RANLIB) $@
@@ -212,6 +237,17 @@ getconf.o: getconf.h
 getconf.h: getconf.sh
./getconf.sh > $@
 
+check: $(TEST) $(BIN)
+   @set -e;\
+   echo './sleep.test &' ; ./sleep.test & sleep_pid=$$!;\
+   for f in $(TEST); do\
+   if test "$$f" != sleep.test; then\
+   echo ./$$f; ./$$f;\
+   fi;\
+   done;\
+   echo 'wait';\
+   wait $$sleep_pid
+
 install: all
mkdir -p $(DESTDIR)$(PREFIX)/bin
cp -f $(BIN) $(DESTDIR)$(PREFIX)/bin
@@ -271,7 +307,8 @@ sbase-box-uninstall: uninstall
cd $(DESTDIR)$(PREFIX)/bin && rm -f sbase-box
 
 clean:
-   rm -f $(BIN) $(OBJ) $(LIB) sbase-box sbase-$(VERSION).tar.gz
+   rm -f $(BIN) $(TEST) $(OBJ) $(LIB) sbase-box sbase-$(VERSION).tar.gz
rm -f getconf.h
+   rm -rf testdir-*/
 
 .PHONY: all install uninstall dist sbase-box sbase-box-install 
sbase-box-uninstall clean
diff --git a/basename.test.c b/basename.test.c
new file mode 100644
index 000..bb22153
--- /dev/null
+++ b/basename.test.c
@@ -0,0 +1,68 @@
+/* See LICENSE file for copyright and license details. */
+#include "test-common.h"
+
+static struct Case {
+   const char *path;
+   const char *suffix;
+   const char *basename;
+} cases[] = {
+   {"/",  NULL,  "/\n"},
+   {"///",NULL,  "/\n"},
+   {"x/", NULL,  "x\n"},
+   {"x//",NULL,  "x\n"},
+   {"a/b/c",  NULL,  "c\n"},
+   {"/a", NULL,  "a\n"},
+   {"a/b/c

[hackers] [sbase][PATCH] uname: check that no operands are specified

2018-07-11 Thread Mattias Andrée
Signed-off-by: Mattias Andrée 
---
 uname.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/uname.c b/uname.c
index dfc979e..122c172 100644
--- a/uname.c
+++ b/uname.c
@@ -40,6 +40,9 @@ main(int argc, char *argv[])
usage();
} ARGEND
 
+   if (argc)
+   usage();
+
if (uname(&u) < 0)
eprintf("uname:");
 
-- 
2.11.1




[hackers] [sbase][PATCH v2] Support -- in all utilities except echo(1)

2018-07-08 Thread Mattias Andrée
In POSIX-2017 it was clarified via the documentation for
basename(1) and dirname(1) that all programs should support
-- unless specified otherwise.

Signed-off-by: Mattias Andrée 
---
 chroot.c   |  5 -
 cksum.c| 11 ++-
 expr.c | 11 ++-
 hostname.c |  5 -
 link.c |  5 -
 logname.c  |  5 -
 nohup.c|  5 -
 printenv.c | 11 ++-
 printf.c   | 12 
 setsid.c   |  5 -
 sleep.c|  5 -
 sponge.c   |  5 -
 sync.c |  5 -
 tsort.c|  2 +-
 tty.c  |  5 -
 unlink.c   |  5 -
 whoami.c   |  5 -
 yes.c  | 11 ++-
 18 files changed, 97 insertions(+), 21 deletions(-)

diff --git a/chroot.c b/chroot.c
index 22bc62e..45f2dc7 100644
--- a/chroot.c
+++ b/chroot.c
@@ -17,7 +17,10 @@ main(int argc, char *argv[])
char *shell[] = { "/bin/sh", "-i", NULL }, *aux, *cmd;
int savederrno;
 
-   argv0 = *argv, argv0 ? (argc--, argv++) : (void *)0;
+   ARGBEGIN {
+   default:
+   usage();
+   } ARGEND
 
if (!argc)
usage();
diff --git a/cksum.c b/cksum.c
index 4e7dce6..50107b2 100644
--- a/cksum.c
+++ b/cksum.c
@@ -92,12 +92,21 @@ cksum(int fd, const char *s)
putchar('\n');
 }
 
+static void
+usage(void)
+{
+   eprintf("usage: %s [file ...]\n", argv0);
+}
+
 int
 main(int argc, char *argv[])
 {
int fd;
 
-   argv0 = *argv, argv0 ? (argc--, argv++) : (void *)0;
+   ARGBEGIN {
+   default:
+   usage();
+   } ARGEND
 
if (!argc) {
cksum(0, NULL);
diff --git a/expr.c b/expr.c
index d9c758d..ea3c58b 100644
--- a/expr.c
+++ b/expr.c
@@ -252,12 +252,21 @@ parse(char *expr[], int numexpr)
return (valp->str && *valp->str) || valp->num;
 }
 
+static void
+usage(void)
+{
+   eprintf("usage: %s expression\n", argv0);
+}
+
 int
 main(int argc, char *argv[])
 {
int ret;
 
-   argv0 = *argv, argv0 ? (argc--, argv++) : (void *)0;
+   ARGBEGIN {
+   default:
+   usage();
+   } ARGEND
 
ret = !parse(argv, argc);
 
diff --git a/hostname.c b/hostname.c
index 495d40d..2532ec8 100644
--- a/hostname.c
+++ b/hostname.c
@@ -16,7 +16,10 @@ main(int argc, char *argv[])
 {
char host[HOST_NAME_MAX + 1];
 
-   argv0 = *argv, argv0 ? (argc--, argv++) : (void *)0;
+   ARGBEGIN {
+   default:
+   usage();
+   } ARGEND
 
if (!argc) {
if (gethostname(host, sizeof(host)) < 0)
diff --git a/link.c b/link.c
index a260136..7cee4d0 100644
--- a/link.c
+++ b/link.c
@@ -12,7 +12,10 @@ usage(void)
 int
 main(int argc, char *argv[])
 {
-   argv0 = *argv, argv0 ? (argc--, argv++) : (void *)0;
+   ARGBEGIN {
+   default:
+   usage();
+   } ARGEND
 
if (argc != 2)
usage();
diff --git a/logname.c b/logname.c
index 8eb8eea..12908f5 100644
--- a/logname.c
+++ b/logname.c
@@ -15,7 +15,10 @@ main(int argc, char *argv[])
 {
char *login;
 
-   argv0 = *argv, argv0 ? (argc--, argv++) : (void *)0;
+   ARGBEGIN {
+   default:
+   usage();
+   } ARGEND
 
if (argc)
usage();
diff --git a/nohup.c b/nohup.c
index c75ea45..2825c5d 100644
--- a/nohup.c
+++ b/nohup.c
@@ -19,7 +19,10 @@ main(int argc, char *argv[])
 {
int fd, savederrno;
 
-   argv0 = *argv, argv0 ? (argc--, argv++) : (void *)0;
+   ARGBEGIN {
+   default:
+   usage();
+   } ARGEND
 
if (!argc)
usage();
diff --git a/printenv.c b/printenv.c
index 7ff5393..19b5b7d 100644
--- a/printenv.c
+++ b/printenv.c
@@ -6,13 +6,22 @@
 
 extern char **environ;
 
+static void
+usage(void)
+{
+   eprintf("usage: %s [var ...]\n", argv0);
+}
+
 int
 main(int argc, char *argv[])
 {
char *var;
int ret = 0;
 
-   argv0 = *argv, argv0 ? (argc--, argv++) : (void *)0;
+   ARGBEGIN {
+   default:
+   usage();
+   } ARGEND
 
if (!argc) {
for (; *environ; environ++)
diff --git a/printf.c b/printf.c
index 4bc645b..094240a 100644
--- a/printf.c
+++ b/printf.c
@@ -25,11 +25,15 @@ main(int argc, char *argv[])
int cooldown = 0, width, precision, ret = 0;
char *format, *tmp, *arg, *fmt, flag;
 
-   argv0 = argv[0];
-   if (argc < 2)
+   ARGBEGIN {
+   default:
usage();
+   } ARGEND
 
-   format = argv[1];
+   if (argc < 1)
+   usage();
+
+   format = argv[0];
if ((tmp = strstr(format, "\\c"))) {
*tmp = 0;
cooldown = 1;
@@ -38,7 +42,7 @@ main(int argc, char *argv[])
if (formatlen == 0)
return 0;
lastargi = 0;
-   for (i = 0, argi = 2; !cooldown || i < formatlen; i++, i = cooldown ? i 
: (i % formatlen)) 

[hackers] [sbase][PATCH] Support -- in all utilities except echo(1)

2018-07-08 Thread Mattias Andrée
In POSIX-2017 it was clarified via the documentation for
basename(1) and dirname(1) that all programs should support
-- unless specified otherwise.

Signed-off-by: Mattias Andrée 
---
 arg.h  | 13 +
 basename.c |  5 +
 chroot.c   |  2 +-
 cksum.c|  8 +++-
 dirname.c  |  5 +
 expr.c |  8 +++-
 hostname.c |  2 +-
 link.c |  2 +-
 logname.c  |  2 +-
 nohup.c|  2 +-
 printenv.c |  8 +++-
 printf.c   |  9 +
 rev.c  |  5 +
 setsid.c   |  2 +-
 sleep.c|  2 +-
 sponge.c   |  2 +-
 sync.c |  2 +-
 tsort.c|  5 +
 tty.c  |  2 +-
 unlink.c   |  2 +-
 whoami.c   |  2 +-
 yes.c  |  8 +++-
 22 files changed, 62 insertions(+), 36 deletions(-)
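
For readers who find the macro hard to follow, the ENOFLAGS logic
added to arg.h in this patch can be read as the function sketched
below. The function name noflags, the pointer parameters, and the -1
return value (used where the macro evaluates its argument, typically
usage()) are assumptions made for this example; only the control flow
is taken from the macro.

```c
#include <stddef.h>

/*
 * Function form of the ENOFLAGS logic.  `prog` stands in for the
 * global argv0.  Returns 0 on success and -1 where the macro would
 * evaluate (x).
 */
static int
noflags(int *argc, char ***argv, char **prog)
{
	/* consume the program name */
	*prog = *(*argv)++;
	(*argc)--;
	if (*argc && (*argv)[0][0] == '-') {
		if ((*argv)[0][1] == '-' && !(*argv)[0][2]) {
			/* exactly "--": consume it, operands follow */
			(*argv)++;
			(*argc)--;
		} else {
			/* anything else starting with '-' is rejected */
			return -1;
		}
	}
	return 0;
}
```

Note that with this logic a lone "-" is also rejected, since it starts
with '-' but is not "--".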

diff --git a/arg.h b/arg.h
index 0b23c53..e8b84ba 100644
--- a/arg.h
+++ b/arg.h
@@ -62,4 +62,17 @@ extern char *argv0;
 
 #define LNGARG()   &argv[0][0]
 
+#define ENOFLAGS(x)do {\
+   argv0 = *argv++;\
+   argc--;\
+   if (argc && argv[0][0] == '-') {\
+   if (argv[0][1] == '-' && !argv[0][2]) {\
+   argv++;\
+   argc--;\
+   } else {\
+   (x);\
+   }\
+   }\
+   } while (0)
+
 #endif
diff --git a/basename.c b/basename.c
index 94a2848..de41d86 100644
--- a/basename.c
+++ b/basename.c
@@ -17,10 +17,7 @@ main(int argc, char *argv[])
ssize_t off;
char *p;
 
-   ARGBEGIN {
-   default:
-   usage();
-   } ARGEND
+   ENOFLAGS(usage());
 
if (argc != 1 && argc != 2)
usage();
diff --git a/chroot.c b/chroot.c
index 22bc62e..e3d4c3e 100644
--- a/chroot.c
+++ b/chroot.c
@@ -17,7 +17,7 @@ main(int argc, char *argv[])
char *shell[] = { "/bin/sh", "-i", NULL }, *aux, *cmd;
int savederrno;
 
-   argv0 = *argv, argv0 ? (argc--, argv++) : (void *)0;
+   ENOFLAGS(usage());
 
if (!argc)
usage();
diff --git a/cksum.c b/cksum.c
index 4e7dce6..68c8fc6 100644
--- a/cksum.c
+++ b/cksum.c
@@ -92,12 +92,18 @@ cksum(int fd, const char *s)
putchar('\n');
 }
 
+static void
+usage(void)
+{
+   eprintf("usage: %s [file ...]\n", argv0);
+}
+
 int
 main(int argc, char *argv[])
 {
int fd;
 
-   argv0 = *argv, argv0 ? (argc--, argv++) : (void *)0;
+   ENOFLAGS(usage());
 
if (!argc) {
cksum(0, NULL);
diff --git a/dirname.c b/dirname.c
index 45e1a7e..fcc12e1 100644
--- a/dirname.c
+++ b/dirname.c
@@ -13,10 +13,7 @@ usage(void)
 int
 main(int argc, char *argv[])
 {
-   ARGBEGIN {
-   default:
-   usage();
-   } ARGEND
+   ENOFLAGS(usage());
 
if (argc != 1)
usage();
diff --git a/expr.c b/expr.c
index d9c758d..b9bed3f 100644
--- a/expr.c
+++ b/expr.c
@@ -252,12 +252,18 @@ parse(char *expr[], int numexpr)
return (valp->str && *valp->str) || valp->num;
 }
 
+static void
+usage(void)
+{
+   eprintf("usage: %s expression\n", argv0);
+}
+
 int
 main(int argc, char *argv[])
 {
int ret;
 
-   argv0 = *argv, argv0 ? (argc--, argv++) : (void *)0;
+   ENOFLAGS(usage());
 
ret = !parse(argv, argc);
 
diff --git a/hostname.c b/hostname.c
index 495d40d..cd66893 100644
--- a/hostname.c
+++ b/hostname.c
@@ -16,7 +16,7 @@ main(int argc, char *argv[])
 {
char host[HOST_NAME_MAX + 1];
 
-   argv0 = *argv, argv0 ? (argc--, argv++) : (void *)0;
+   ENOFLAGS(usage());
 
if (!argc) {
if (gethostname(host, sizeof(host)) < 0)
diff --git a/link.c b/link.c
index a260136..cf76f32 100644
--- a/link.c
+++ b/link.c
@@ -12,7 +12,7 @@ usage(void)
 int
 main(int argc, char *argv[])
 {
-   argv0 = *argv, argv0 ? (argc--, argv++) : (void *)0;
+   ENOFLAGS(usage());
 
if (argc != 2)
usage();
diff --git a/logname.c b/logname.c
index 8eb8eea..04a2aaa 100644
--- a/logname.c
+++ b/logname.c
@@ -15,7 +15,7 @@ main(int argc, char *argv[])
 {
char *login;
 
-   argv0 = *argv, argv0 ? (argc--, argv++) : (void *)0;
+   ENOFLAGS(usage());
 
if (argc)
usage();
diff --git a/nohup.c b/nohup.c
index c75ea45..436c072 100644
--- a/nohup.c
+++ b/nohup.c
@@ -19,7 +19,7 @@ main(int argc, char *argv[])
 {
int fd, savederrno;
 
-   argv0 = *argv, argv0 ? (argc--, argv++) : (void *)0;
+   ENOFLAGS(usage());
 
if (!argc)
usage();
diff --git a/printenv.c b/printenv.c
index 7ff5393..7ba0e47 100644
--- a/printenv.c
+++ b/printenv.c
@@ -6,13 +6,19 @@
 
 extern char **environ;
 
+static void
+u

Re: [hackers] [sbase][PATCH] Add test framework with a test for tty(1)

2018-07-08 Thread Mattias Andrée
On Sun, 8 Jul 2018 10:06:14 +0100
Dimitris Papastamos  wrote:

> On Sun, Jul 08, 2018 at 01:12:59AM +0200, Mattias Andrée wrote:
> > On Sat, 7 Jul 2018 23:29:53 +0100
> > Dimitris Papastamos  wrote:
> >   
> > > On Sun, Jul 08, 2018 at 12:12:08AM +0200, Mattias Andrée wrote:  
> > > > On Sat, 7 Jul 2018 22:55:28 +0100
> > > > Dimitris Papastamos  wrote:
> > > > 
> > > > > This is too intrusive, what's wrong with using a shell script to test
> > > > > the commands?
> > > > > 
> > > > > The test framework is more complicated than most sbase commands.
> > > > > 
> > > > > It would have been nice to discuss this in advance before writing a
> > > > > 1000 line patch that might not get merged.
> > > > > 
> > > > 
> > > > Writing all tests in shell isn't the best idea I think.
> > > > This framework makes it easy to write tests and it will
> > > > tell you everything you need to know to figure out what
> > > > failed. I believe that in the end this will reduce the
> > > > amount of test code. There are things that are difficult
> > > > to do in shell: for example creating a terminal, which is
> > > > needed to test tty(1). Look for example at the tests in
> > > > https://github.com/maandree/base-util-tests, they aren't
> > > > that nice, and they even require bash(1), it would be
> > > > even worse with portable sh(1).
> > > 
> > > so what you are saying is that your shell code sucks so
> > > it is better done in C?
> > > 
> > > sorry im not buying it, this is overkill
> > >   
> > 
> > No, I'm saying that if you want to do it in sh(1) you will
> > need more tests, or to make really crappy helper functions, and
> > you will also need to write special utilities in C just to
> > test some utilities that cannot be tested entirely in sh(1).
> > 
> > The C code contains:
> > 
> > *   a way to print where in a loop a test fails.
> > This will be useful for example when adding test
> > cases for the *sum utilities,
> > 
> > *   a way to make tests asynchronously, this could be removed
> > but will be useful for testing sleep(1), so it does not
> > take too long,
> > 
> > *   a simple way to measure and test how long a test took,
> > 
> > *   a set of tiny functions to declare how to program shall
> > be started,
> > 
> > *   a way to create sockets and TTYs. The socket part can
> > probably be removed but it would be useful for testing
> > different file types. Support for regular files should
> > probably be added. TTYs are required to testing tty(1),
> > 
> > *   ways to test the exit status with support for both normal
> > exit and kill by signal,
> > 
> > *   ways to check the output to stdout and stderr. In sh(1),
> > test(1) and grep(1) could be used,
> > 
> > *   some code to make to test cases smaller,
> > 
> > *   a function to spawn a process with the requested files
> > and input, and
> > 
> > *   a function to read a process's output and wait for it.
> > 
> > I wouldn't say that it's complex, its just a few functions, and
> > 2 or 3 of them is are bit long, but not particularly long. I
> > think this is worth it for the consistence, short tests and
> > easily readable tests, a convenient way to locate the failing
> > test and what's actually wrong, and to not have extra utilities
> > (also written in C) to do the parts of the tests that cannot be
> > done in C. I'm not sure how mean that the code is complicated,
> > it's just long.  
> 
> https://cvsweb.openbsd.org/cgi-bin/cvsweb/src/regress/usr.bin/
> 
> whatever you do, this test code should be in a different repo
> like sbase-regress or similar.
> 

I will reduce the code in test-common.[ch].

If the tests are in a separate repository, package maintainers
need to download both. Do you really think this is a good idea?
What's the advantage of making it a separate repository?




Re: [hackers] [sbase][PATCH] Add test framework with a test for tty(1)

2018-07-07 Thread Mattias Andrée
On Sat, 7 Jul 2018 23:29:53 +0100
Dimitris Papastamos  wrote:

> On Sun, Jul 08, 2018 at 12:12:08AM +0200, Mattias Andrée wrote:
> > On Sat, 7 Jul 2018 22:55:28 +0100
> > Dimitris Papastamos  wrote:
> >   
> > > This is too intrusive, what's wrong with using a shell script to test
> > > the commands?
> > > 
> > > The test framework is more complicated than most sbase commands.
> > > 
> > > It would have been nice to discuss this in advance before writing a
> > > 1000 line patch that might not get merged.
> > >   
> > 
> > Writing all tests in shell isn't the best idea, I think.
> > This framework makes it easy to write tests and it will
> > tell you everything you need to know to figure out what
> > failed. I believe that in the end this will reduce the
> > amount of test code. There are things that are difficult
> > to do in shell: for example, creating a terminal, which is
> > needed to test tty(1). Look for example at the tests in
> > https://github.com/maandree/base-util-tests, they aren't
> > that nice, and they even require bash(1); it would be
> > even worse with portable sh(1).
> 
> so what you are saying is that your shell code sucks so
> it is better done in C?
> 
> sorry im not buying it, this is overkill
> 

No, I'm saying that if you want to do it in sh(1) you will
need more tests, or make really crappy helper functions, and
you will also need to write special utilities in C just to
test some utilities that cannot be tested entirely in sh(1).

The C code contains:

*   a way to print where in a loop a test fails;
    this will be useful, for example, when adding test
    cases for the *sum utilities,

*   a way to run tests asynchronously; this could be removed
    but will be useful for testing sleep(1), so it does not
    take too long,

*   a simple way to measure and test how long a test took,

*   a set of tiny functions to declare how the program shall
    be started,

*   a way to create sockets and TTYs. The socket part can
    probably be removed but it would be useful for testing
    different file types. Support for regular files should
    probably be added. TTYs are required for testing tty(1),

*   ways to test the exit status with support for both normal
    exit and kill by signal,

*   ways to check the output to stdout and stderr. In sh(1),
    test(1) and grep(1) could be used,

*   some code to make the test cases smaller,

*   a function to spawn a process with the requested files
    and input, and

*   a function to read a process's output and wait for it.

I wouldn't say that it's complex; it's just a few functions, and
2 or 3 of them are a bit long, but not particularly long. I
think this is worth it for the consistency, short and
easily readable tests, a convenient way to locate the failing
test and what's actually wrong, and to not need extra utilities
(also written in C) to do the parts of the tests that cannot be
done in sh(1). I'm not sure what you mean by the code being
complicated; it's just long.
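
As a rough illustration of the last few items in the list above
(spawning a process, reading its output, and checking how it exited),
here is a minimal sketch in C. It is not the code from test-common.c:
the name run_capture and its interface are hypothetical, and it only
captures stdout.

```c
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper, not part of test-common.c: run argv[0] with its
 * stdout redirected to a pipe, store up to size-1 bytes of the output
 * in buf (NUL-terminated), and return the wait(2) status, or -1 on
 * failure.
 */
static int
run_capture(char *const argv[], char *buf, size_t size)
{
	char scratch[512];
	int fds[2], status;
	size_t len = 0;
	ssize_t r;
	pid_t pid;

	if (size == 0 || pipe(fds) < 0)
		return -1;

	pid = fork();
	if (pid < 0) {
		close(fds[0]);
		close(fds[1]);
		return -1;
	}
	if (pid == 0) {
		/* child: make the pipe's write end stdout, then exec */
		close(fds[0]);
		if (fds[1] != STDOUT_FILENO) {
			if (dup2(fds[1], STDOUT_FILENO) < 0)
				_exit(127);
			close(fds[1]);
		}
		execvp(argv[0], argv);
		_exit(127);
	}

	/* parent: read until the buffer is full or the child closes stdout */
	close(fds[1]);
	while (len < size - 1 &&
	       (r = read(fds[0], buf + len, size - 1 - len)) > 0)
		len += (size_t)r;
	buf[len] = '\0';

	/* discard whatever did not fit, so the child cannot block on write */
	while (read(fds[0], scratch, sizeof(scratch)) > 0)
		;
	close(fds[0]);

	if (waitpid(pid, &status, 0) < 0)
		return -1;
	return status;
}
```

A test could then use WIFEXITED(status) and WEXITSTATUS(status) to
check a normal exit, or WIFSIGNALED(status) and WTERMSIG(status) to
check a kill by signal, and strcmp(3) to check the captured output.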




Re: [hackers] [sbase][PATCH] Add test framework with a test for tty(1)

2018-07-07 Thread Mattias Andrée
On Sat, 7 Jul 2018 22:55:28 +0100
Dimitris Papastamos  wrote:

> This is too intrusive, what's wrong with using a shell script to test
> the commands?
> 
> The test framework is more complicated than most sbase commands.
> 
> It would have been nice to discuss this in advance before writing a
> 1000 line patch that might not get merged.
> 

Writing all tests in shell isn't the best idea, I think.
This framework makes it easy to write tests and it will
tell you everything you need to know to figure out what
failed. I believe that in the end this will reduce the
amount of test code. There are things that are difficult
to do in shell: for example, creating a terminal, which is
needed to test tty(1). Look for example at the tests in
https://github.com/maandree/base-util-tests, they aren't
that nice, and they even require bash(1); it would be
even worse with portable sh(1).
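
To illustrate the terminal case: a test for tty(1) needs a file
descriptor that refers to a terminal, and a C helper can get one from
a pseudoterminal. The following is a minimal sketch, assuming a system
that provides the POSIX pseudoterminal functions; open_pty is a
hypothetical name, not a function from the patch.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 600 /* for posix_openpt(3), grantpt(3), ptsname(3) */
#endif
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Hypothetical helper: open a pseudoterminal pair. On success, store
 * the master and slave descriptors and return 0. On failure, return -1
 * and leave both descriptors closed.
 */
static int
open_pty(int *master, int *slave)
{
	char *name;

	*slave = -1;
	*master = posix_openpt(O_RDWR | O_NOCTTY);
	if (*master < 0)
		return -1;

	/* the slave side must be granted and unlocked before it is opened */
	if (grantpt(*master) < 0 || unlockpt(*master) < 0 ||
	    !(name = ptsname(*master)) ||
	    (*slave = open(name, O_RDWR | O_NOCTTY)) < 0) {
		close(*master);
		*master = -1;
		return -1;
	}
	return 0;
}
```

The slave descriptor can then be installed as the child's stdin before
exec, so that the utility under test sees a terminal; isatty(3) on the
slave descriptor should report a terminal.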




[hackers] [sbase][PATCH] Add test framework with a test for tty(1)

2018-07-07 Thread Mattias Andrée
Signed-off-by: Mattias Andrée 
---
 Makefile  |  20 +-
 test-common.c | 823 ++
 test-common.h | 190 ++
 tty.test.c|  26 ++
 4 files changed, 1055 insertions(+), 4 deletions(-)
 create mode 100644 test-common.c
 create mode 100644 test-common.h
 create mode 100644 tty.test.c

diff --git a/Makefile b/Makefile
index 0e421e7..005cf13 100644
--- a/Makefile
+++ b/Makefile
@@ -1,7 +1,7 @@
 include config.mk
 
 .SUFFIXES:
-.SUFFIXES: .o .c
+.SUFFIXES: .test .test.o .o .c
 
 HDR =\
arg.h\
@@ -19,7 +19,8 @@ HDR =\
sha512-256.h\
text.h\
utf.h\
-   util.h
+   util.h\
+   test-common.h
 
 LIBUTF = libutf.a
 LIBUTFSRC =\
@@ -181,9 +182,12 @@ BIN =\
xinstall\
yes
 
+TEST =\
+   tty.test
+
 LIBUTFOBJ = $(LIBUTFSRC:.c=.o)
 LIBUTILOBJ = $(LIBUTILSRC:.c=.o)
-OBJ = $(BIN:=.o) $(LIBUTFOBJ) $(LIBUTILOBJ)
+OBJ = $(BIN:=.o) $(TEST:=.o) test-common.o $(LIBUTFOBJ) $(LIBUTILOBJ)
 SRC = $(BIN:=.c)
 MAN = $(BIN:=.1)
 
@@ -193,12 +197,17 @@ $(BIN): $(LIB) $(@:=.o)
 
 $(OBJ): $(HDR) config.mk
 
+$(TEST): $(@:=.o) test-common.o
+
 .o:
$(CC) $(LDFLAGS) -o $@ $< $(LIB)
 
 .c.o:
$(CC) $(CFLAGS) $(CPPFLAGS) -o $@ -c $<
 
+.test.o.test:
+   $(CC) $(LDFLAGS) -o $@ $< test-common.o
+
 $(LIBUTF): $(LIBUTFOBJ)
$(AR) rc $@ $?
$(RANLIB) $@
@@ -212,6 +221,9 @@ getconf.o: getconf.h
 getconf.h: getconf.sh
./getconf.sh > $@
 
+check: $(TEST) $(BIN)
+   @set -e; for f in $(TEST); do echo ./$$f; ./$$f; done
+
 install: all
mkdir -p $(DESTDIR)$(PREFIX)/bin
cp -f $(BIN) $(DESTDIR)$(PREFIX)/bin
@@ -271,7 +283,7 @@ sbase-box-uninstall: uninstall
cd $(DESTDIR)$(PREFIX)/bin && rm -f sbase-box
 
 clean:
-   rm -f $(BIN) $(OBJ) $(LIB) sbase-box sbase-$(VERSION).tar.gz
+   rm -f $(BIN) $(TEST) $(OBJ) $(LIB) sbase-box sbase-$(VERSION).tar.gz
rm -f getconf.h
 
 .PHONY: all install uninstall dist sbase-box sbase-box-install sbase-box-uninstall clean
diff --git a/test-common.c b/test-common.c
new file mode 100644
index 000..458b094
--- /dev/null
+++ b/test-common.c
@@ -0,0 +1,823 @@
+/* See LICENSE file for copyright and license details. */
+#include "test-common.h"
+
+struct Counter {
+   const char *name;
+   size_t value;
+};
+
+const char *test_file = NULL;
+int test_line = 0;
+int main_ret = 0;
+int timeout = 10;
+int pdeath_sig = SIGINT;
+void (*atfork)(void) = NULL;
+
+static struct Counter counters[16];
+static size_t ncounters = 0;
+static pid_t async_pids[1024];
+static size_t async_npids = 0;
+
+static void
+eperror(const char *prefix)
+{
+   perror(prefix);
+   fflush(stderr);
+   exit(64);
+}
+
+struct Process *
+stdin_text(struct Process *proc, char *s)
+{
+   proc->input[0].data = s;
+   proc->input[0].flags &= ~IO_STREAM_BINARY;
+   return proc;
+}
+
+struct Process *
+stdin_bin(struct Process *proc, char *s, size_t n)
+{
+   proc->input[0].data = s;
+   proc->input[0].len = n;
+   proc->input[0].flags |= IO_STREAM_BINARY;
+   return proc;
+}
+
+struct Process *
+stdin_fds(struct Process *proc, int input_fd, int output_fd)
+{
+   proc->input[0].flags &= ~IO_STREAM_DATA;
+   proc->input[0].input_fd = input_fd;
+   proc->input[0].output_fd = output_fd;
+   return proc;
+}
+
+struct Process *
+stdin_type(struct Process *proc, int type)
+{
+   proc->input[0].flags &= ~IO_STREAM_CREATE_MASK;
+   proc->input[0].flags |= type;
+   return proc;
+}
+
+struct Process *
+stdout_fds(struct Process *proc, int input_fd, int output_fd)
+{
+   proc->output[0].flags &= ~IO_STREAM_DATA;
+   proc->output[0].input_fd = input_fd;
+   proc->output[0].output_fd = output_fd;
+   return proc;
+}
+
+struct Process *
+stdout_type(struct Process *proc, int type)
+{
+   proc->output[0].flags &= ~IO_STREAM_CREATE_MASK;
+   proc->output[0].flags |= type;
+   return proc;
+}
+
+struct Process *
+stderr_fds(struct Process *proc, int input_fd, int output_fd)
+{
+   proc->output[1].flags &= ~IO_STREAM_DATA;
+   proc->output[1].input_fd = input_fd;
+   proc->output[1].output_fd = output_fd;
+   return proc;
+}
+
+struct Process *
+stderr_type(struct Process *proc, int type)
+{
+   proc->output[1].flags &= ~IO_STREAM_CREATE_MASK;
+   proc->output[1].flags |= type;
+   return proc;
+}
+
+struct Process *
+set_preexec(struct Process *proc, void (*preexec)(struct Process *))
+{
+   proc->preexec = preexec;
+   return proc;
+}
+
+struct Process *
+set_async(struct Process *proc)
+{
+   proc->flags |= PROCESS_ASYNC;
+   return proc;
+}
+
+struct Process *
+set_setsid(struct Process *proc)
+{
+   proc->flags |= PROCESS_SETSID;
+   return proc;
+}
+
+void
+push_counter(const char *name)

[hackers] [PATCH] tty: fix exit value on error from 1 to 2

2018-07-07 Thread Mattias Andrée
Signed-off-by: Mattias Andrée 
---
 tty.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/tty.c b/tty.c
index a57cb8e..7ffb04a 100644
--- a/tty.c
+++ b/tty.c
@@ -7,7 +7,7 @@
 static void
 usage(void)
 {
-   eprintf("usage: %s\n", argv0);
+   enprintf(2, "usage: %s\n", argv0);
 }
 
 int
@@ -23,5 +23,6 @@ main(int argc, char *argv[])
tty = ttyname(STDIN_FILENO);
puts(tty ? tty : "not a tty");
 
-   return fshut(stdout, "") || !tty;
+   enfshut(2, stdout, "");
+   return !tty;
 }
-- 
2.11.1




[hackers] [sbase][PATCH] basename: support --

2018-07-06 Thread Mattias Andrée
POSIX-2017 clarifies that -- and normal option parsing must be supported.
See EXAMPLES in basename(1p).

Signed-off-by: Mattias Andrée 
---
 basename.c | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/basename.c b/basename.c
index d211799..94a2848 100644
--- a/basename.c
+++ b/basename.c
@@ -17,7 +17,10 @@ main(int argc, char *argv[])
ssize_t off;
char *p;
 
-   argv0 = *argv, argv0 ? (argc--, argv++) : (void *)0;
+   ARGBEGIN {
+   default:
+   usage();
+   } ARGEND
 
if (argc != 1 && argc != 2)
usage();
-- 
2.11.1
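
For a utility that takes no options, the effect of the ARGBEGIN block
above on the argument list can be sketched as a plain function. This
illustrates the intended behaviour only, it is not the arg.h macro
itself, and first_operand is a hypothetical name.

```c
#include <string.h>

/*
 * Hypothetical helper: return the index of the first operand in argv,
 * or -1 if an option is given (where the real utility calls usage()).
 * A lone "-" is an operand; "--" ends option parsing and is consumed.
 */
static int
first_operand(int argc, char *argv[])
{
	int i = argc > 0 ? 1 : 0; /* skip argv[0] if present */

	if (i < argc && argv[i][0] == '-' && argv[i][1] != '\0') {
		if (strcmp(argv[i], "--") != 0)
			return -1; /* no options are supported */
		i++; /* consume the "--" delimiter */
	}
	return i;
}
```

With this, `basename -- -x` treats -x as the operand, while
`basename -x` is rejected as an unknown option.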




[hackers] [sbase][PATCH] dirname: support --

2018-07-06 Thread Mattias Andrée
POSIX-2017 clarifies that -- and normal option parsing must be supported.
See EXAMPLES in basename(1p)

Signed-off-by: Mattias Andrée 
---
 dirname.c | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/dirname.c b/dirname.c
index 8392bc0..45e1a7e 100644
--- a/dirname.c
+++ b/dirname.c
@@ -13,7 +13,10 @@ usage(void)
 int
 main(int argc, char *argv[])
 {
-   argv0 = *argv, argv0 ? (argc--, argv++) : (void *)0;
+   ARGBEGIN {
+   default:
+   usage();
+   } ARGEND
 
if (argc != 1)
usage();
-- 
2.11.1




[hackers] [ubase][PATCH] ps: fix argv0 position in usage line

2018-06-11 Thread Mattias Andrée
Signed-off-by: Mattias Andrée 
---
 ps.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/ps.c b/ps.c
index 114983b..00405a5 100644
--- a/ps.c
+++ b/ps.c
@@ -145,7 +145,7 @@ psr(const char *file)
 static void
 usage(void)
 {
-   eprintf("usage: [-aAdef] %s\n", argv0);
+   eprintf("usage: %s [-aAdef]\n", argv0);
 }
 
 int
-- 
2.11.1




Re: [hackers] [blind] update todo: tee is too slow || Mattias Andrée

2017-12-12 Thread Mattias Andrée
You will be glad to know I will be removing `float` support. It didn't
provide the performance boost I was hoping for.

On Wed, 13 Dec 2017 08:31:21 +0100
Mattias Andrée  wrote:

> I'm not convinced, but since you guys disagree I will not implement this.
> 
> On Wed, 13 Dec 2017 00:45:18 +
> Richard Ipsum  wrote:
> 
> > I lack community standing to really comment on this.
> > 
> > That being said, as a casual observer it seems fairly obvious to me
> > that what you're proposing is totally anathema to suckless philosophy.
> > 
> > On Tue, Dec 12, 2017 at 02:49:34PM +, Mattias Andrée wrote:  
> > > > It may look insane on the surface. I am not committed to the changes
> > > > listed in the TODO; most of them are just ideas that I will have to evaluate
> > > > later to see whether they are worth implementing, although I'm pretty
> > > > convinced most of them are good.
> > > 
> > > From: isabella parakiss [izaber...@gmail.com]
> > > Sent: 12 December 2017 13:31
> > > To: hackers mail list
> > > Subject: Re: [hackers] [blind] update todo: tee is too slow || Mattias 
> > > Andrée
> > > 
> > > don't you realize how insane the whole thing is?
> > > 
> > > On 12/12/17, Mattias Andrée  wrote:
> > > > > Perhaps I should clarify that (1) the goal would be to have blind-tee
> > > > > (and blind-cat if that is implemented) use already existing functions
> > > > > that send data between two files (and have these functions use splice
> > > > > when possible), so it would only use tee explicitly, not splice, and (2)
> > > > > a 1 hour long blind video with the resolution 1920x1080@30 is 6.5TB
> > > > > large, so we are talking about a massive amount of data that is being
> > > > > sent between processes. In total, a 1 hour long video could require
> > > > > hundreds of terabytes being sent between processes. Double that if it is a
> > > > > 60 fps video (which you will actually find on the Internet nowadays),
> > > > > multiply that by 4 for a 4K video, and double that again for a stereoscopic
> > > > > video (you can find all of these things on Blu-ray videos). So we could
> > > > > be talking about a couple of petabytes of data for a professional video,
> > > > > and 100TB for an amateur video, not just a couple of gigabytes. Try to
> > > > > copy that amount of data with dd (in parallel of course), and you might
> > > > > find that even halving that time[1] would be nice, even if most of the time
> > > > > in the rendering process is usually spent on rendering effects, stacking
> > > > > videos, and transcoding.
> > > >
> > > > [1] 42 hours per petabyte.
> > > >
> > > > 
> > > > From: Mattias Andrée [maand...@kth.se]
> > > > Sent: 12 December 2017 11:32
> > > > To: hackers mail list
> > > > Subject: RE: [hackers] [blind] update todo: tee is too slow || Mattias
> > > > Andrée
> > > >
> > > > > When I rendered a video, tee used 100% CPU while the other process was
> > > > > basically at 0, for more than 50% of the rendering time. I ran it multiple
> > > > > times to verify that this was correct. An alternative solution would be
> > > > > to use sockets, but that would require changes to the shell. Optimising
> > > > > tee seems like the sensible alternative. Besides, normally when you use
> > > > > tools like tee and cat, you expect them to finish within milliseconds, even
> > > > > for larger files; here we are talking about time intervals between seconds
> > > > > and a few hours, with most if not all CPUs at 90% to 100%, so optimisations
> > > > > do not hurt, especially not when as significant as using tee and splice. It
> > > > > may be unfortunate to have to use both tee–splice and read–write (which
> > > > > is required both for platforms not supporting tee and splice, and for file
> > > > > types not supporting them), but considering how little complexity this adds
> > > > > (it's not much more than a duplication of a function), I think it is a
> > > > > trade-off worth considering.
> > > >
> > > > 
> > > > Fr

Re: [hackers] [blind] update todo: tee is too slow || Mattias Andrée

2017-12-12 Thread Mattias Andrée
I'm not convinced, but since you guys disagree I will not implement this.

On Wed, 13 Dec 2017 00:45:18 +
Richard Ipsum  wrote:

> I lack community standing to really comment on this.
> 
> That being said, as a casual observer it seems fairly obvious to me
> that what you're proposing is totally anathema to suckless philosophy.
> 
> On Tue, Dec 12, 2017 at 02:49:34PM +, Mattias Andrée wrote:
> > It may look insane on the surface. I am not committed to the changes
> > listed in the TODO; most of them are just ideas that I will have to evaluate
> > later to see whether they are worth implementing, although I'm pretty
> > convinced most of them are good.
> > 
> > From: isabella parakiss [izaber...@gmail.com]
> > Sent: 12 December 2017 13:31
> > To: hackers mail list
> > Subject: Re: [hackers] [blind] update todo: tee is too slow || Mattias 
> > Andrée
> > 
> > don't you realize how insane the whole thing is?
> > 
> > On 12/12/17, Mattias Andrée  wrote:  
> > > Perhaps I should clarify that (1) the goal would be to have blind-tee
> > > (and blind-cat if that is implemented) use already existing functions
> > > that send data between two files (and have these functions use splice
> > > when possible), so it would only use tee explicitly, not splice, and (2)
> > > a 1 hour long blind video with the resolution 1920x1080@30 is 6.5TB
> > > large, so we are talking about a massive amount of data that is being
> > > sent between processes. In total, a 1 hour long video could require
> > > hundreds of terabytes being sent between processes. Double that if it is a
> > > 60 fps video (which you will actually find on the Internet nowadays),
> > > multiply that by 4 for a 4K video, and double that again for a stereoscopic
> > > video (you can find all of these things on Blu-ray videos). So we could
> > > be talking about a couple of petabytes of data for a professional video,
> > > and 100TB for an amateur video, not just a couple of gigabytes. Try to
> > > copy that amount of data with dd (in parallel of course), and you might
> > > find that even halving that time[1] would be nice, even if most of the time
> > > in the rendering process is usually spent on rendering effects, stacking
> > > videos, and transcoding.
> > >
> > > [1] 42 hours per petabyte.
> > >
> > > 
> > > From: Mattias Andrée [maand...@kth.se]
> > > Sent: 12 December 2017 11:32
> > > To: hackers mail list
> > > Subject: RE: [hackers] [blind] update todo: tee is too slow || Mattias
> > > Andrée
> > >
> > > When I rendered a video, tee used 100% CPU while the other process was
> > > basically at 0, for more than 50% of the rendering time. I ran it multiple
> > > times to verify that this was correct. An alternative solution would be
> > > to use sockets, but that would require changes to the shell. Optimising
> > > tee seems like the sensible alternative. Besides, normally when you use
> > > tools like tee and cat, you expect them to finish within milliseconds, even
> > > for larger files; here we are talking about time intervals between seconds
> > > and a few hours, with most if not all CPUs at 90% to 100%, so optimisations
> > > do not hurt, especially not when as significant as using tee and splice. It
> > > may be unfortunate to have to use both tee–splice and read–write (which
> > > is required both for platforms not supporting tee and splice, and for file
> > > types not supporting them), but considering how little complexity this adds
> > > (it's not much more than a duplication of a function), I think it is a
> > > trade-off worth considering.
> > >
> > > 
> > > From: isabella parakiss [izaber...@gmail.com]
> > > Sent: 12 December 2017 10:58
> > > To: hackers mail list
> > > Subject: Re: [hackers] [blind] update todo: tee is too slow || Mattias
> > > Andrée
> > >
> > > bullshit
> > >
> > > On 12/4/17, g...@suckless.org  wrote:  
> > >> commit d8aa45da86d1128149fd7ab6ac3725bf8e88a1b1
> > >> Author: Mattias Andrée 
> > >> AuthorDate: Mon Dec 4 22:35:59 2017 +0100
> > >> Commit: Mattias Andrée 
> > >> CommitDate: Mon Dec 4 22:35:59 2017 +0100
> > >>
> > >> update todo: tee is too sl

RE: [hackers] [blind] update todo: tee is too slow || Mattias Andrée

2017-12-12 Thread Mattias Andrée
It may look insane on the surface. I am not committed to the changes
listed in the TODO; most of them are just ideas that I will have to evaluate
later to see whether they are worth implementing, although I'm pretty
convinced most of them are good.

From: isabella parakiss [izaber...@gmail.com]
Sent: 12 December 2017 13:31
To: hackers mail list
Subject: Re: [hackers] [blind] update todo: tee is too slow || Mattias Andrée

don't you realize how insane the whole thing is?

On 12/12/17, Mattias Andrée  wrote:
> Perhaps I should clarify that (1) the goal would be to have blind-tee
> (and blind-cat if that is implemented) use already existing functions
> that send data between two files (and have these functions use splice
> when possible), so it would only use tee explicitly, not splice, and (2)
> a 1 hour long blind video with the resolution 1920x1080@30 is 6.5TB
> large, so we are talking about a massive amount of data that is being
> sent between processes. In total, a 1 hour long video could require
> hundreds of terabytes being sent between processes. Double that if it is a
> 60 fps video (which you will actually find on the Internet nowadays),
> multiply that by 4 for a 4K video, and double that again for a stereoscopic
> video (you can find all of these things on Blu-ray videos). So we could
> be talking about a couple of petabytes of data for a professional video,
> and 100TB for an amateur video, not just a couple of gigabytes. Try to
> copy that amount of data with dd (in parallel of course), and you might
> find that even halving that time[1] would be nice, even if most of the time
> in the rendering process is usually spent on rendering effects, stacking
> videos, and transcoding.
>
> [1] 42 hours per petabyte.
>
> 
> From: Mattias Andrée [maand...@kth.se]
> Sent: 12 December 2017 11:32
> To: hackers mail list
> Subject: RE: [hackers] [blind] update todo: tee is too slow || Mattias
> Andrée
>
> When I rendered a video, tee used 100% CPU while the other process was
> basically at 0, for more than 50% of the rendering time. I ran it multiple
> times to verify that this was correct. An alternative solution would be
> to use sockets, but that would require changes to the shell. Optimising
> tee seems like the sensible alternative. Besides, normally when you use
> tools like tee and cat, you expect them to finish within milliseconds, even
> for larger files; here we are talking about time intervals between seconds
> and a few hours, with most if not all CPUs at 90% to 100%, so optimisations
> do not hurt, especially not when as significant as using tee and splice. It
> may be unfortunate to have to use both tee–splice and read–write (which
> is required both for platforms not supporting tee and splice, and for file
> types not supporting them), but considering how little complexity this adds
> (it's not much more than a duplication of a function), I think it is a
> trade-off worth considering.
>
> 
> From: isabella parakiss [izaber...@gmail.com]
> Sent: 12 December 2017 10:58
> To: hackers mail list
> Subject: Re: [hackers] [blind] update todo: tee is too slow || Mattias
> Andrée
>
> bullshit
>
> On 12/4/17, g...@suckless.org  wrote:
>> commit d8aa45da86d1128149fd7ab6ac3725bf8e88a1b1
>> Author: Mattias Andrée 
>> AuthorDate: Mon Dec 4 22:35:59 2017 +0100
>> Commit: Mattias Andrée 
>> CommitDate: Mon Dec 4 22:35:59 2017 +0100
>>
>> update todo: tee is too slow
>>
>> Signed-off-by: Mattias Andrée 
>>
>> diff --git a/TODO b/TODO
>> index b0bbde7..408e942 100644
>> --- a/TODO
>> +++ b/TODO
>> @@ -1,3 +1,7 @@
>> +blind-tee (and tee(1)) is too slow (bottleneck) and must be reimplemented
>> +using tee(2) and splice(2). cat(1) may also be too slow, if this is the
>> +case, add blind-splice that just copies stdin to stdout using splice(2).
>> +
>>  blind-transform  affine transformation by matrix multiplication, -[xy] for tiling, -s for
>>   improve quality on downscaling (pixels' neighbours must not change)
>>  blind-apply-map  remap pixels (distortion) using the X and Y values, -[xy] for tiling, -s for
>>
>>
>
>
>
>




RE: [hackers] [blind] update todo: tee is too slow || Mattias Andrée

2017-12-12 Thread Mattias Andrée
Perhaps I should clarify that (1) the goal would be to have blind-tee
(and blind-cat if that is implemented) use already existing functions
that send data between two files (and have these functions use splice
when possible), so it would only use tee explicitly, not splice, and (2)
a 1 hour long blind video with the resolution 1920x1080@30 is 6.5TB
large, so we are talking about a massive amount of data that is being
sent between processes. In total, a 1 hour long video could require
hundreds of terabytes being sent between processes. Double that if it is a
60 fps video (which you will actually find on the Internet nowadays),
multiply that by 4 for a 4K video, and double that again for a stereoscopic
video (you can find all of these things on Blu-ray videos). So we could
be talking about a couple of petabytes of data for a professional video,
and 100TB for an amateur video, not just a couple of gigabytes. Try to
copy that amount of data with dd (in parallel of course), and you might
find that even halving that time[1] would be nice, even if most of the time
in the rendering process is usually spent on rendering effects, stacking
videos, and transcoding.

[1] 42 hours per petabyte.
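
The 6.5TB figure can be checked with a short calculation. The sketch
below assumes 4 channels per pixel, each stored as an 8-byte double,
with no compression; those two numbers are assumptions and are not
stated in this thread. raw_video_bytes is a hypothetical name.

```c
#include <stdint.h>

/*
 * Size in bytes of uncompressed video, assuming every pixel is stored
 * as `channels` values of `bytes_per_value` bytes each.
 */
static uint64_t
raw_video_bytes(uint64_t width, uint64_t height, uint64_t fps,
                uint64_t seconds, uint64_t channels,
                uint64_t bytes_per_value)
{
	uint64_t frame = width * height * channels * bytes_per_value;

	return frame * fps * seconds;
}
```

Under those assumptions, raw_video_bytes(1920, 1080, 30, 3600, 4, 8)
is 7166361600000 bytes, which is about 6.5 TiB. Footnote [1], 42 hours
per petabyte, corresponds to a sustained rate of roughly 6.6 GB/s.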


From: Mattias Andrée [maand...@kth.se]
Sent: 12 December 2017 11:32
To: hackers mail list
Subject: RE: [hackers] [blind] update todo: tee is too slow || Mattias Andrée

When I rendered a video, tee used 100% CPU while the other process was
basically at 0, for more than 50% of the rendering time. I ran it multiple
times to verify that this was correct. An alternative solution would be
to use sockets, but that would require changes to the shell. Optimising
tee seems like the sensible alternative. Besides, normally when you use
tools like tee and cat, you expect them to finish within milliseconds, even
for larger files; here we are talking about time intervals between seconds
and a few hours, with most if not all CPUs at 90% to 100%, so optimisations
do not hurt, especially not when as significant as using tee and splice. It
may be unfortunate to have to use both tee–splice and read–write (which
is required both for platforms not supporting tee and splice, and for file
types not supporting them), but considering how little complexity this adds
(it's not much more than a duplication of a function), I think it is a
trade-off worth considering.


From: isabella parakiss [izaber...@gmail.com]
Sent: 12 December 2017 10:58
To: hackers mail list
Subject: Re: [hackers] [blind] update todo: tee is too slow || Mattias Andrée

bullshit

On 12/4/17, g...@suckless.org  wrote:
> commit d8aa45da86d1128149fd7ab6ac3725bf8e88a1b1
> Author:     Mattias Andrée 
> AuthorDate: Mon Dec 4 22:35:59 2017 +0100
> Commit:     Mattias Andrée 
> CommitDate: Mon Dec 4 22:35:59 2017 +0100
>
> update todo: tee is too slow
>
>     Signed-off-by: Mattias Andrée 
>
> diff --git a/TODO b/TODO
> index b0bbde7..408e942 100644
> --- a/TODO
> +++ b/TODO
> @@ -1,3 +1,7 @@
> +blind-tee (and tee(1)) is too slow (bottleneck) and must be reimplemented
> +using tee(2) and splice(2). cat(1) may also be too slow, if this is the
> +case, add blind-splice that just copies stdin to stdout using splice(2).
> +
>  blind-transform  affine transformation by matrix multiplication, -[xy] for tiling, -s for
>   improve quality on downscaling (pixels' neighbours must not change)
>  blind-apply-map  remap pixels (distortion) using the X and Y values, -[xy] for tiling, -s for
>
>





RE: [hackers] [blind] update todo: tee is too slow || Mattias Andrée

2017-12-12 Thread Mattias Andrée
When I rendered a video, tee used 100% CPU while the other process was
basically at 0, for more than 50% of the rendering time. I ran it multiple
times to verify that this was correct. An alternative solution would be
to use sockets, but that would require changes to the shell. Optimising
tee seems like the sensible alternative. Besides, normally when you use
tools like tee and cat, you expect them to finish within milliseconds, even
for larger files; here we are talking about time intervals between seconds
and a few hours, with most if not all CPUs at 90% to 100%, so optimisations
do not hurt, especially not when as significant as using tee and splice. It
may be unfortunate to have to use both tee–splice and read–write (which
is required both for platforms not supporting tee and splice, and for file
types not supporting them), but considering how little complexity this adds
(it's not much more than a duplication of a function), I think it is a
trade-off worth considering.
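
The duplication described above (a splice path plus a read–write
fallback) could look roughly like the following sketch. It assumes
Linux, it is not code from blind, and copy_fd and copy_rw are
hypothetical names. tee(2) is not shown.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* for splice(2), which is Linux-specific */
#endif
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Portable path: read(2)/write(2) loop. Returns 0 at EOF, -1 on error. */
static int
copy_rw(int in, int out)
{
	char buf[65536];
	ssize_t r, w, off;

	while ((r = read(in, buf, sizeof(buf))) > 0) {
		/* write(2) may be partial, so loop until the chunk is out */
		for (off = 0; off < r; off += w)
			if ((w = write(out, buf + off, (size_t)(r - off))) < 0)
				return -1;
	}
	return r < 0 ? -1 : 0;
}

/*
 * Copy everything from in to out. Tries splice(2) first, which avoids
 * copying through user space but needs at least one end to be a pipe,
 * and falls back to copy_rw() when splice is not supported for these
 * descriptors. Returns 0 at EOF, -1 on error.
 */
static int
copy_fd(int in, int out)
{
	ssize_t n;

	for (;;) {
		n = splice(in, NULL, out, NULL, 65536, SPLICE_F_MORE);
		if (n > 0)
			continue;
		if (n == 0)
			return 0;
		if (errno == EINTR)
			continue;
		if (errno == EINVAL || errno == ENOSYS)
			return copy_rw(in, out);
		return -1;
	}
}
```

The fallback picks up wherever splice stopped, since splice only
consumes the bytes it actually transferred, so the two paths can share
one call site.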


From: isabella parakiss [izaber...@gmail.com]
Sent: 12 December 2017 10:58
To: hackers mail list
Subject: Re: [hackers] [blind] update todo: tee is too slow || Mattias Andrée

bullshit

On 12/4/17, g...@suckless.org  wrote:
> commit d8aa45da86d1128149fd7ab6ac3725bf8e88a1b1
> Author:     Mattias Andrée 
> AuthorDate: Mon Dec 4 22:35:59 2017 +0100
> Commit:     Mattias Andrée 
> CommitDate: Mon Dec 4 22:35:59 2017 +0100
>
> update todo: tee is too slow
>
>     Signed-off-by: Mattias Andrée 
>
> diff --git a/TODO b/TODO
> index b0bbde7..408e942 100644
> --- a/TODO
> +++ b/TODO
> @@ -1,3 +1,7 @@
> +blind-tee (and tee(1)) is too slow (bottleneck) and must be reimplemented
> +using tee(2) and splice(2). cat(1) may also be too slow, if this is the
> +case, add blind-splice that just copies stdin to stdout using splice(2).
> +
>  blind-transform  affine transformation by matrix multiplication, -[xy] for tiling, -s for
>   improve quality on downscaling (pixels' neighbours must not change)
>  blind-apply-map  remap pixels (distortion) using the X and Y values, -[xy] for tiling, -s for
>
>




[hackers] [sbase][PATCH v2] Add patch(1)

2017-09-30 Thread Mattias Andrée
Signed-off-by: Mattias Andrée 
---
 Makefile   |2 +
 README |1 +
 TODO   |1 -
 libutil/asprintf.c |   74 +++
 libutil/getlines.c |   17 +-
 patch.1|  248 +++
 patch.c| 1850 
 text.h |4 +-
 util.h |5 +
 9 files changed, 2195 insertions(+), 7 deletions(-)
 create mode 100644 libutil/asprintf.c
 create mode 100644 patch.1
 create mode 100644 patch.c

diff --git a/Makefile b/Makefile
index 1c39fef..014db74 100644
--- a/Makefile
+++ b/Makefile
@@ -45,6 +45,7 @@ LIBUTFSRC =\
 
 LIBUTIL = libutil.a
 LIBUTILSRC =\
+   libutil/asprintf.c\
libutil/concat.c\
libutil/cp.c\
libutil/crypt.c\
@@ -132,6 +133,7 @@ BIN =\
nohup\
od\
paste\
+   patch\
pathchk\
printenv\
printf\
diff --git a/README b/README
index da2e500..6c94f2f 100644
--- a/README
+++ b/README
@@ -59,6 +59,7 @@ The following tools are implemented:
 0#*|o nl  .
 0=*|o nohup   .
 0=*|o od  .
+0=patch   .
 0#* o pathchk .
  #*|o paste   .
 0=*|x printenv.
diff --git a/TODO b/TODO
index 5edb8a3..fe2344e 100644
--- a/TODO
+++ b/TODO
@@ -8,7 +8,6 @@ awk
 bc
 diff
 ed manpage
-patch
 stty
 
 If you are looking for some work to do on sbase, another option is to
diff --git a/libutil/asprintf.c b/libutil/asprintf.c
new file mode 100644
index 000..929ed09
--- /dev/null
+++ b/libutil/asprintf.c
@@ -0,0 +1,74 @@
+/* See LICENSE file for copyright and license details. */
+#include 
+#include 
+#include 
+
+#include "../util.h"
+
+static int xenvasprintf(int, char **, const char *, va_list);
+
+int
+asprintf(char **strp, const char *fmt, ...)
+{
+   va_list ap;
+   int ret;
+
+   va_start(ap, fmt);
+   ret = xenvasprintf(-1, strp, fmt, ap);
+   va_end(ap);
+
+   return ret;
+}
+
+int
+easprintf(char **strp, const char *fmt, ...)
+{
+   va_list ap;
+   int ret;
+
+   va_start(ap, fmt);
+   ret = xenvasprintf(1, strp, fmt, ap);
+   va_end(ap);
+
+   return ret;
+}
+
+int
+enasprintf(int status, char **strp, const char *fmt, ...)
+{
+   va_list ap;
+   int ret;
+
+   va_start(ap, fmt);
+   ret = xenvasprintf(status, strp, fmt, ap);
+   va_end(ap);
+
+   return ret;
+}
+
+int
+xenvasprintf(int status, char **strp, const char *fmt, va_list ap)
+{
+   int ret;
+   va_list ap2;
+
+   va_copy(ap2, ap);
+   ret = vsnprintf(0, 0, fmt, ap2);
+   va_end(ap2);
+   if (ret < 0) {
+   if (status >= 0)
+   enprintf(status, "vsnprintf:");
+   *strp = 0;
+   return -1;
+   }
+
+   *strp = malloc(ret + 1);
+   if (!*strp) {
+   if (status >= 0)
+   enprintf(status, "malloc:");
+   return -1;
+   }
+
+   vsprintf(*strp, fmt, ap);
+   return ret;
+}
diff --git a/libutil/getlines.c b/libutil/getlines.c
index b912769..9af7684 100644
--- a/libutil/getlines.c
+++ b/libutil/getlines.c
@@ -7,7 +7,7 @@
 #include "../util.h"
 
 void
-getlines(FILE *fp, struct linebuf *b)
+ngetlines(int status, FILE *fp, struct linebuf *b)
 {
char *line = NULL;
size_t size = 0, linelen = 0;
@@ -16,17 +16,24 @@ getlines(FILE *fp, struct linebuf *b)
while ((len = getline(&line, &size, fp)) > 0) {
if (++b->nlines > b->capacity) {
b->capacity += 512;
-   b->lines = erealloc(b->lines, b->capacity * 
sizeof(*b->lines));
+   b->lines = enrealloc(status, b->lines, b->capacity * 
sizeof(*b->lines));
}
linelen = len;
-   b->lines[b->nlines - 1].data = memcpy(emalloc(linelen + 1), 
line, linelen + 1);
+   b->lines[b->nlines - 1].data = memcpy(enmalloc(status, linelen 
+ 1), line, linelen + 1);
b->lines[b->nlines - 1].len = linelen;
}
free(line);
-   if (b->lines && b->nlines && linelen && b->lines[b->nlines - 
1].data[linelen - 1] != '\n') {
-   b->lines[b->nlines - 1].data = erealloc(b->lines[b->nlines - 
1].data, linelen + 2);
+   b->nolf = b->lines && b->nlines && linelen && b->lines[b->nlines - 
1].data[linelen - 1] != '\n';
+   if (b->nolf) {
+   b->lines[b->nlines - 1].data = enrealloc(status, 
b->lines[b->nlines - 1].data, linelen + 2);
b->lines[b->nlines - 1].data[linelen] = '\n';
b->lines[b->nlines - 1].data[linelen + 1] = '\0';
b->lines[b->nlines - 1].len++;
  

Re: [hackers] [PATCH][sbase] Add patch(1)

2017-09-24 Thread Mattias Andrée
On Sun, 24 Sep 2017 19:24:10 +0200
Silvan Jegen  wrote:

> Heyho
> 
> On Sun, Sep 24, 2017 at 06:28:57PM +0200, Mattias Andrée wrote:
> > On Sun, 24 Sep 2017 14:08:41 +0200
> > Silvan Jegen  wrote:
> >   
> > > > +
> > > > +   if (!new->len)
> > > > +   for (i = 0; i < old->len; i++)
> > > > +   if (old->lines[i].data[-2] != '-')
> > > 
> > > I think according to the standard, refering to data[-2] invokes undefined
> > > behaviour, since at this time, data points to the beginning of the array. 
> > >  
> > 
> > I'm not sure whether it is defined or undefined; I would think that it
> > defined, but that adding integers larger than INTPTR_MAX is undefined.
> > I will change to `*(data - 2)` as this is clearly defined.  
> 
> I was referring to
> https://stackoverflow.com/questions/3473675/are-negative-array-indexes-allowed-in-c/3473686#3473686
> . `*(data -2) is equivalent to 'data[-2]' but since 'data' doesn't point
> to the second element of the array, I don't think this is valid.

Hi!

I think there has been some misunderstanding here,
and that we agree that `a[-b]` in itself is not
invalid; the question is whether the dereferenced
address is valid. I understand why this looks
incorrect, but `old->lines->data[0]` does not
actually refer to the first character of a line
in the patchfile, but rather to the first character
of the part of the line that is content of the file
that the hunk patches. For example, if the patchfile
contains the line "- abc", `old->lines->data[0]` is
`a`, not `-`, because the "- " part belongs to the
annotations in the hunk.

This should probably be clarified, but you can see
this happening just above this code.

I will look at your other comments later.
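
To illustrate the point, here is a minimal sketch. It is not the code
from patch.c: `struct line_view`, `view_skip_prefix` and `view_annot`
are hypothetical names used only for this example. When a pointer has
been advanced past a two-character annotation prefix, indexing it with
-2 reaches back into the same array, so the access stays within the
object.

```c
#include <stddef.h>

/* Hypothetical view into a hunk line; not the struct used in patch.c. */
struct line_view {
	char *data;   /* points past the annotation prefix */
	size_t len;   /* length of the content after the prefix */
};

/* Make a view of raw (rawlen >= skip) that skips the first skip bytes. */
struct line_view
view_skip_prefix(char *raw, size_t rawlen, size_t skip)
{
	struct line_view v;

	v.data = raw + skip;
	v.len = rawlen - skip;
	return v;
}

/*
 * Return the annotation character of the line. This is only valid
 * because data was produced by view_skip_prefix() with skip >= 2,
 * so data - 2 still points into the original array.
 */
char
view_annot(const struct line_view *v)
{
	return v->data[-2];
}
```

For the line "- abc", a view made with skip set to 2 has `data[0]`
equal to `a` and `view_annot()` returns `-`. The same call on a view
made with skip set to 0 would read outside the array, which is the
case Silvan was worried about.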




Re: [hackers] [sbase][PATCH] patch: improvments suggested by Silvan

2017-09-24 Thread Mattias Andrée
On Sun, 24 Sep 2017 11:12:35 -0700
Michael Forney  wrote:

> Hi Mattias,
> 
> Instead of sending these patches on top of your original patch, can
> you send amended versions (v2, v3, etc)? You can use `git format-patch
> -v 2` to make this clear in the subject.
> 
> I think that this would make it easier to review and keep track of your patch.
> 
> Thanks!
> 

Hi Michael!

I thought it would be easier to see the changes I make.
I think an amended version makes more sense when it's
time to merge.




[hackers] [sbase][PATCH] patch: improvments suggested by Silvan

2017-09-24 Thread Mattias Andrée
Signed-off-by: Mattias Andrée 
---
 patch.c | 182 +---
 1 file changed, 95 insertions(+), 87 deletions(-)

diff --git a/patch.c b/patch.c
index c118dc9..caf34be 100644
--- a/patch.c
+++ b/patch.c
@@ -41,7 +41,7 @@
 #define linecpy2mem(d, s)  (memcpy(d, (s).data, (s).len + 1))
 #define missinglf(l)   ((l).len && (l).data[(l).len - 1] != '\n')
 #define fwriteline(f, l)   (fwrite((l).data, 1, (l).len, f))
-#define enmemdup(f, s, n)  ((n) ? memcpy(enmalloc(f, n), s, n) : 0)
+#define enmemdup(f, s, n)  ((n) ? memcpy(enmalloc(f, n), s, n) : NULL)
 
 enum { REJECTED = 1, FAILURE = 2 };
 enum applicability { APPLICABLE, APPLIED, INAPPLICABLE };
@@ -104,10 +104,10 @@ struct patched_file {
 };
 
 static enum format specified_format = GUESS;
-static const char *patchfile = 0;
-static char *rejectfile = 0;
-static const char *outfile = 0;
-static char *apply_patches_to = 0;
+static const char *patchfile = NULL;
+static char *rejectfile = NULL;
+static const char *outfile = NULL;
+static char *apply_patches_to = NULL;
 static size_t pflag = SIZE_MAX;
 static int bflag = 0;
 static int fflag = 0;
@@ -115,16 +115,16 @@ static int lflag = 0;
 static int Rflag = 0;
 static int Nflag = 0;
 static int Uflag = 0;
-static char *dflag = 0;
+static char *dflag = NULL;
 static int rejected = 0;
-static struct patched_file *prevpatch = 0;
+static struct patched_file *prevpatch = NULL;
 static size_t prevpatchn = 0;
-static struct patched_file *prevout = 0;
+static struct patched_file *prevout = NULL;
 static size_t prevoutn = 0;
 static char stdin_dash[sizeof("-")];
 static char stdout_dash[sizeof("-")];
-static char *ifdef = 0;
-static char *ifndef = 0;
+static char *ifdef = NULL;
+static char *ifndef = NULL;
 
 static void
 usage(void)
@@ -134,7 +134,7 @@ usage(void)
 }
 
 static void
-load_lines(const char *path, struct file_data *out, int skip_lf, int orig)
+load_lines(struct file_data *out, const char *path, int skip_lf, int orig)
 {
FILE *f;
struct linebuf b = EMPTY_LINEBUF;
@@ -165,12 +165,12 @@ static char *
 ask(const char *instruction)
 {
FILE *f;
-   char *answer = 0;
+   char *answer = NULL;
size_t size = 0;
ssize_t n;
 
if (fflag)
-   return 0;
+   return NULL;
 
if (!(f = fopen("/dev/tty", "r+")))
enprintf(FAILURE, "fopen /dev/tty:");
@@ -180,13 +180,13 @@ ask(const char *instruction)
fflush(stdout);
 
if ((n = getline(&answer, &size, f)) <= 0) {
-   answer = 0;
+   answer = NULL;
} else {
n -= (answer[n - 1] == '\n');
-   answer[n] = 0;
+   answer[n] = '\0';
if (!*answer) {
free(answer);
-   answer = 0;
+   answer = NULL;
}
}
 
@@ -202,13 +202,13 @@ adjust_filename(const char *filename)
const char *stripped = filename;
char *rc;
 
-   if (p == filename || p[-1] == '/')
-   return 0;
+   if (p == filename || *(p - 1) == '/')
+   return NULL;
 
for (; strips && (p = strchr(stripped, '/')); strips--)
for (stripped = p; *stripped == '/'; stripped++);
if (strips && pflag != SIZE_MAX)
-   return 0;
+   return NULL;
 
if (dflag && *stripped != '/')
enasprintf(FAILURE, &rc, "%s/%s", dflag, stripped);
@@ -283,7 +283,7 @@ unquote(char *str)
}
}
 
-   str[w] = 0;
+   str[w] = '\0';
return 0;
 }
 
@@ -302,7 +302,7 @@ parse_diff_line(char *str, char **old, char **new)
char *s = strchr(str, '\0');
int ret = 0;
 
-   *new = 0;
+   *new = NULL;
if (s == str)
return -1;
 
@@ -313,7 +313,7 @@ again:
goto found;
} else {
while (--s != str && s - 1 != str)
-   if (s[-1] == ' ' && s[0] == '"')
+   if (*(s - 1) == ' ' && s[0] == '"')
goto found;
}
 
@@ -348,19 +348,24 @@ ask_for_filename(struct patchset *patchset)
 {
size_t i;
char *answer;
+   int missing = 0;
 
-   for (i = 0; i < patchset->npatches; i++)
-   if (!patchset->patches[i].path)
-   goto found_unset;
-   return;
+   for (i = 0; i < patchset->npatches; i++) {
+   if (!patchset->patches[i].path) {
+   missing = 1;
+   break;
+   }
+   }
+
+   if (!mi

Re: [hackers] [PATCH][sbase] Add patch(1)

2017-09-24 Thread Mattias Andrée
On Sun, 24 Sep 2017 14:08:41 +0200
Silvan Jegen  wrote:

> Heyho Mattias!
> 
> I had a look at the patch. It's a lot of code (still only about 1/3 of
> GNU's patch size though) and it was rather hard for me to follow so more
> review should be done. Find my comments below.
> 
> On Sun, Sep 03, 2017 at 07:13:20PM +0200, Mattias Andrée wrote:
> > +static void
> > +save_file_cpp(FILE *f, struct file_data *file)
> > +{
> > +   size_t i, j, n;
> > +   char annot = ' ';
> > +
> > +   for (i = 0; i <= file->n; i++) {
> > +   if ((n = file->d[i].nold)) {  
> 
> In other places you iterate with "i < file->n" (see save_file below for
> example) so I think this may be an off-by-one error.

There is an if-statement after this clause that breaks the loop if
`i == file->`, so this should be correct. I'll add blank lines
around that if-statement to make it clearer.
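
The loop shape described here can be sketched as follows: every index,
including the one past the last line, may have something to emit before
it, and the final pass stops before touching a line that does not exist.
This is a simplified illustration with hypothetical names
(`emit_with_removals`, `nold`), not the actual save_file_cpp.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical sketch: lines[] holds n lines, and nold[] holds n + 1
 * counts, where nold[i] is the number of removed lines that sat before
 * position i (nold[n] covers removals at the end of the file).
 * Returns the total number of removed lines reported.
 */
size_t
emit_with_removals(FILE *f, const char *const *lines, const size_t *nold, size_t n)
{
	size_t i, total = 0;

	for (i = 0; i <= n; i++) {      /* note: <=, not < */
		if (nold[i]) {
			fprintf(f, "[%zu removed]\n", nold[i]);
			total += nold[i];
		}

		if (i == n)             /* there is no line at index n */
			break;

		fprintf(f, "%s\n", lines[i]);
	}

	return total;
}
```

With `i < n` as the loop condition, removals recorded at the very end
of the file would never be written, which is why the bound and the
early break belong together.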

> 
> 
> > +   fprintf(f, "%s\n", annot == '+' ? "#else" : ifndef);
> > +   for (j = 0; j < n; j++) {
> > +   fwriteline(f, file->d[i].old[j]);
> > +   if (missinglf(file->d[i].old[j]))
> > +   fprintf(f, "\n");
> > +   }
> > +   annot = '-';
> > +   }
> > +   if (i == file->n)
> > +   break;
> > +   if (annot == '-')
> > +   fprintf(f, "%s\n", file->d[i].new ? "#else" : "#endif");
> > +   else if (annot == ' ' && file->d[i].new)
> > +   fprintf(f, "%s\n", ifdef);
> > +   else if (annot == '+' && !file->d[i].new)
> > +   fprintf(f, "#endif\n");
> > +   fwriteline(f, file->d[i].line);
> > +   if ((i + 1 < file->n || file->d[i].new) && 
> > missinglf(file->d[i].line))
> > +   fprintf(f, "\n");
> > +   annot = file->d[i].new ? '+' : ' ';
> > +   }
> > +   if (annot != ' ')
> > +   fprintf(f, "#endif\n");
> > +}
> > +
> > +static void
> > +parse_hunk_copied(struct hunk *hunk, struct parsed_hunk *parsed)
> > +{
> > +   struct hunk_content *old = &parsed->old, *new = &parsed->new;
> > +   size_t i = 0, a, b;
> > +   char *p;
> > +
> > +   free(hunk->head->data);
> > +
> > +   old->lines = enmalloc(FAILURE, hunk->nlines * sizeof(*old->lines));
> > +   new->lines = enmalloc(FAILURE, hunk->nlines * sizeof(*new->lines));
> > +   parsed->annot = enmalloc(FAILURE, hunk->nlines + 1);
> > +
> > +   p = hunk->lines[i++].data + 4;
> > +   old->start = strtoul(p, &p, 10);
> > +   old->len = 0;
> > +
> > +   for (; hunk->lines[i].data[1] == ' '; i++)
> > +   subline(old->lines + old->len++, hunk->lines + i, 2);
> > +
> > +   p = hunk->lines[i++].data + 4;
> > +   new->start = strtoul(p, &p, 10);
> > +   new->len = 0;
> > +
> > +   if (old->len) {
> > +   for (; i < hunk->nlines; i++)
> > +   subline(new->lines + new->len++, hunk->lines + i, 2);
> > +   } else {
> > +   for (; i < hunk->nlines; i++) {
> > +   subline(new->lines + new->len++, hunk->lines + i, 2);
> > +   if (hunk->lines[i].data[0] != '+')
> > +   subline(old->lines + old->len++, hunk->lines + 
> > i, 2);
> > +   }
> > +   }  
> 
> I think this if-else block can be rewritten like this.
> 
>   for (; i < hunk->nlines; i++) {
>   subline(new->lines + new->len++, hunk->lines + i, 2);
>   if (old->len == 0 && hunk->lines[i].data[0] != '+')
>   subline(old->lines + old->len++, hunk->lines + i, 2);
>   }

I will use `!old->len` instead of `old->len == 0`.

> 
> 
> > +
> > +   if (!new->len)
> > +   for (i = 0; i < old->len; i++)
> > +   if (old->lines[i].data[-2] != '-')  
> 
> I think according to the standard, refering to data[-2] invokes undefined
> behaviour, since at this time, data points to the beginning of the array.

I&

[hackers] [sbase][PATCH v2] Documentation and whitespace improvements to patch(1)

2017-09-11 Thread Mattias Andrée
Signed-off-by: Mattias Andrée 
---
 patch.1 | 70 -
 patch.c |  9 -
 2 files changed, 38 insertions(+), 41 deletions(-)

diff --git a/patch.1 b/patch.1
index df2bf63..ba1cb35 100644
--- a/patch.1
+++ b/patch.1
@@ -22,37 +22,34 @@
 .Sh DESCRIPTION
 .Nm
 applies patches to files from difference listings
-produces by
+produced by
 .Xr diff 1 .
 .Pp
 .Nm
 will skip any garbage unless the
 .Ar patchfile
 consists entirely of garbage.
-Garbage is any data that does not conform to any
-of the supported difference listings formats.
+Garbage is any data that does not conform to
+the supported difference listing formats.
 .Nm
-supprts the all difference listings formats
+supports all difference listing formats
 specified in
 .Xr diff 1p
-except
+except for the
 .Fl f
 flag in
 .Xr diff 1p .
 .Pp
 .Nm
-shall, unless the
-.Ar file
-is specify, figure out from mentions of filenames
-in the
-.Ar patchfile
-which files to patch. As an extension to the
+patch shall figure out which files to patch from the
+mentions of filenames in the patch, unless the files
+are specified explicitly. As an extension to the
 standard, this implementation of
 .Nm
 can determine the filename by looking at the
 \fIdiff\fP-lines that are produced by
 .Xr diff 1
-when comparing directories. If however, the
+when comparing directories. However, if the
 .Ar file
 is specified, all patches in the
 .Ar patchfile
@@ -61,7 +58,7 @@ shall be applied to that
 .Sh OPTIONS
 .Bl -tag -width Ds
 .It Fl b
-Back up files before the first time they a patch
+Back up files before the first time that a patch
 is applied to them. The backups will have the
 suffix \fI.orig\fP.
 .It Fl c
@@ -105,7 +102,7 @@ can be \fI1\fP or \fI0\fP to use
 
 (swap those lines if
 .Ar define
-is \fI0\fP) instead of 
+is \fI0\fP) instead of
 .Bd -literal -offset left
 #ifdef \fIdefine\fP
 #ifndef \fIdefine\fP
@@ -128,12 +125,13 @@ and
 are swapped.
 
 .Nm
-does not guarantee that a patch C source code file
-will be at least a syntactically correct after patching
-as before patching. Despite this being implied by
-the standard. The syntactically correctness can be
-broken when edits are made on lines splitted using
-line continuation, made in comments, or span
+does not guarantee that a patched C source
+code file will be at least as syntactically
+correct after patching as it was before,
+despite this being implied by the standard.
+The syntactic correctness can be broken
+when edits are made on lines split using line
+continuation, made in comments, or spanning
 CPP conditional directives.
 .It Fl e
 Treat anything that is not conforming to the
@@ -145,13 +143,13 @@ Read the
 .Ar patchfile
 instead of standard output.
 .It Fl l
-Any sequnce of whitespace, of at least length 1,
-in the input file file shall match any sequnce
-of whitespace, of at least length 1 in the
+Any sequence of whitespace of at least length 1
+in the input file shall match any sequence
+of whitespace of at least length 1 in the
 difference script when testing if lines match.
-Additionally any whitespace at the beginning of
+Additionally, any whitespace at the beginning of
 a line or at the end of a line is ignored when
-matching lines, the former case is an extension
+matching lines. The former case is an extension
 of the standard.
 .It Fl n
 Treat anything that is not conforming to the
@@ -160,7 +158,7 @@ normal format as garbage.
 Ignore already applied hunks. POSIX specifies
 that already applied patches shall be ignored
 if this flag is used. A hunk is a contiguous
-portion of a patch. A patch is a signal
+portion of a patch. A patch is a single
 file-comparison output from
 .Xr diff 1 .
 .It Fl o Ar outfile
@@ -169,13 +167,13 @@ Store resulting files from patches to
 instead of to the patched file itself.
 If the patchfile patches multiple files,
 the results are concatenated. If a patchfile
-patches a file multiple times. Intermediary
+patches a file multiple times, intermediary
 results are also stored.
 
 As an extension to the standard, you may use
 non-regular files such as \fI/dev/stdout\fP
 and \fI/dev/null\fP. \fI/dev/null\fP can be
-used to preform a dryrun.
+used to perform a dryrun.
 .It Fl p Ar num
 Remove the first
 .Ar num
@@ -183,8 +181,8 @@ components from filenames that appear in the
 patchfile. Any leading / is regarded as the
 first component. If
 .Ar num
-is 0, the entire filename is used. If this flag
-is not used, only the basename is used.
+is 0, the entire filename is used. Without
+this flag only basename is used.
 .It Fl r Ar rejectfile
 Save rejected hunks to
 .Ar rejectfile
@@ -210,10 +208,10 @@ default even for unified context patches.
 .El
 .Sh NOTES
 Files that become empty as a result of a patch
-are not remove.
+are not removed.
 .Pp
 Symbolic links are treated as regular files,
-provided that they lead to regular files.
+provided that they link to regular files.
 .Pp
 Timestamps that appear

[hackers] [PATCH][sbase] Documentation and whitespace improvements to patch(1)

2017-09-11 Thread Mattias Andrée
Signed-off-by: Mattias Andrée 
---
 patch.1 | 70 -
 patch.c |  9 -
 2 files changed, 38 insertions(+), 41 deletions(-)

diff --git a/patch.1 b/patch.1
index df2bf63..ba1cb35 100644
--- a/patch.1
+++ b/patch.1
@@ -22,37 +22,34 @@
 .Sh DESCRIPTION
 .Nm
 applies patches to files from difference listings
-produces by
+produced by
 .Xr diff 1 .
 .Pp
 .Nm
 will skip any garbage unless the
 .Ar patchfile
 consists entirely of garbage.
-Garbage is any data that does not conform to any
-of the supported difference listings formats.
+Garbage is any data that does not conform to
+the supported difference listing formats.
 .Nm
-supprts the all difference listings formats
+supports all difference listing formats
 specified in
 .Xr diff 1p
-except
+except for the
 .Fl f
 flag in
 .Xr diff 1p .
 .Pp
 .Nm
-shall, unless the
-.Ar file
-is specify, figure out from mentions of filenames
-in the
-.Ar patchfile
-which files to patch. As an extension to the
+patch shall figure out which files to patch from the
+mentions of filenames in the patch, unless the files
+are specified explicitly. As an extension to the
 standard, this implementation of
 .Nm
 can determine the filename by looking at the
 \fIdiff\fP-lines that are produced by
 .Xr diff 1
-when comparing directories. If however, the
+when comparing directories. However, if the
 .Ar file
 is specified, all patches in the
 .Ar patchfile
@@ -61,7 +58,7 @@ shall be applied to that
 .Sh OPTIONS
 .Bl -tag -width Ds
 .It Fl b
-Back up files before the first time they a patch
+Back up files before the first time that a patch
 is applied to them. The backups will have the
 suffix \fI.orig\fP.
 .It Fl c
@@ -105,7 +102,7 @@ can be \fI1\fP or \fI0\fP to use
 
 (swap those lines if
 .Ar define
-is \fI0\fP) instead of 
+is \fI0\fP) instead of
 .Bd -literal -offset left
 #ifdef \fIdefine\fP
 #ifndef \fIdefine\fP
@@ -128,12 +125,13 @@ and
 are swapped.
 
 .Nm
-does not guarantee that a patch C source code file
-will be at least a syntactically correct after patching
-as before patching. Despite this being implied by
-the standard. The syntactically correctness can be
-broken when edits are made on lines splitted using
-line continuation, made in comments, or span
+does not guarantee that a patched C source
+code file will be at least as syntactically
+correct after patching as it was before,
+despite this being implied by the standard.
+The syntactic correctness can be broken
+when edits are made on lines split using line
+continuation, made in comments, or spanning
 CPP conditional directives.
 .It Fl e
 Treat anything that is not conforming to the
@@ -145,13 +143,13 @@ Read the
 .Ar patchfile
 instead of standard output.
 .It Fl l
-Any sequnce of whitespace, of at least length 1,
-in the input file file shall match any sequnce
-of whitespace, of at least length 1 in the
+Any sequence of whitespace of at least length 1
+in the input file shall match any sequence
+of whitespace of at least length 1 in the
 difference script when testing if lines match.
-Additionally any whitespace at the beginning of
+Additionally, any whitespace at the beginning of
 a line or at the end of a line is ignored when
-matching lines, the former case is an extension
+matching lines. The former case is an extension
 of the standard.
 .It Fl n
 Treat anything that is not conforming to the
@@ -160,7 +158,7 @@ normal format as garbage.
 Ignore already applied hunks. POSIX specifies
 that already applied patches shall be ignored
 if this flag is used. A hunk is a contiguous
-portion of a patch. A patch is a signal
+portion of a patch. A patch is a single
 file-comparison output from
 .Xr diff 1 .
 .It Fl o Ar outfile
@@ -169,13 +167,13 @@ Store resulting files from patches to
 instead of to the patched file itself.
 If the patchfile patches multiple files,
 the results are concatenated. If a patchfile
-patches a file multiple times. Intermediary
+patches a file multiple times, intermediary
 results are also stored.
 
 As an extension to the standard, you may use
 non-regular files such as \fI/dev/stdout\fP
 and \fI/dev/null\fP. \fI/dev/null\fP can be
-used to preform a dryrun.
+used to perform a dryrun.
 .It Fl p Ar num
 Remove the first
 .Ar num
@@ -183,8 +181,8 @@ components from filenames that appear in the
 patchfile. Any leading / is regarded as the
 first component. If
 .Ar num
-is 0, the entire filename is used. If this flag
-is not used, only the basename is used.
+is 0, the entire filename is used. Without
+this flag only basename is used.
 .It Fl r Ar rejectfile
 Save rejected hunks to
 .Ar rejectfile
@@ -210,10 +208,10 @@ default even for unified context patches.
 .El
 .Sh NOTES
 Files that become empty as a result of a patch
-are not remove.
+are not removed.
 .Pp
 Symbolic links are treated as regular files,
-provided that they lead to regular files.
+provided that they link to regular files.
 .Pp
 Timestamps that appear

Re: [hackers] [PATCH][sbase] Add patch(1)

2017-09-11 Thread Mattias Andrée
On Mon, 11 Sep 2017 20:09:33 +0200
Silvan Jegen  wrote:

>> +when comparing directories. If however, the  
> 
> There should probably be an additional comma like this:
> 
> "If, however, the file..."

I think “However, if the file...” is better.

> > +portion of a patch. A patch is a signal
> > +file-comparison output from  
> 
> Not sure what a "signal file-comparison output" is... is this official
> POSIX/patch terminology?

s/signal/single/, so a patch file includes
a patchset, which is a number of patches, one
per file in the patch file.

>> +Symbolic links are treated as regular files,
>> +provided that they lead to regular files.  
> 
> maybe s/lead/link/ ?

Not sure, perhaps “link” sounds more natural,
but to me, “link” suggests a single step and
not the final step when following the link.
However, I will add this change.




[hackers] [PATCH][sbase] Add patch(1)

2017-09-03 Thread Mattias Andrée
Signed-off-by: Mattias Andrée 
---
 Makefile   |2 +
 README |1 +
 TODO   |1 -
 libutil/asprintf.c |   74 +++
 libutil/getlines.c |   17 +-
 patch.1|  250 +++
 patch.c| 1835 
 text.h |4 +-
 util.h |5 +
 9 files changed, 2182 insertions(+), 7 deletions(-)
 create mode 100644 libutil/asprintf.c
 create mode 100644 patch.1
 create mode 100644 patch.c

diff --git a/Makefile b/Makefile
index 1c39fef..014db74 100644
--- a/Makefile
+++ b/Makefile
@@ -45,6 +45,7 @@ LIBUTFSRC =\
 
 LIBUTIL = libutil.a
 LIBUTILSRC =\
+   libutil/asprintf.c\
libutil/concat.c\
libutil/cp.c\
libutil/crypt.c\
@@ -132,6 +133,7 @@ BIN =\
nohup\
od\
paste\
+   patch\
pathchk\
printenv\
printf\
diff --git a/README b/README
index da2e500..6c94f2f 100644
--- a/README
+++ b/README
@@ -59,6 +59,7 @@ The following tools are implemented:
 0#*|o nl  .
 0=*|o nohup   .
 0=*|o od  .
+0=patch   .
 0#* o pathchk .
  #*|o paste   .
 0=*|x printenv.
diff --git a/TODO b/TODO
index 5edb8a3..fe2344e 100644
--- a/TODO
+++ b/TODO
@@ -8,7 +8,6 @@ awk
 bc
 diff
 ed manpage
-patch
 stty
 
 If you are looking for some work to do on sbase, another option is to
diff --git a/libutil/asprintf.c b/libutil/asprintf.c
new file mode 100644
index 000..929ed09
--- /dev/null
+++ b/libutil/asprintf.c
@@ -0,0 +1,74 @@
+/* See LICENSE file for copyright and license details. */
+#include 
+#include 
+#include 
+
+#include "../util.h"
+
+static int xenvasprintf(int, char **, const char *, va_list);
+
+int
+asprintf(char **strp, const char *fmt, ...)
+{
+   va_list ap;
+   int ret;
+
+   va_start(ap, fmt);
+   ret = xenvasprintf(-1, strp, fmt, ap);
+   va_end(ap);
+
+   return ret;
+}
+
+int
+easprintf(char **strp, const char *fmt, ...)
+{
+   va_list ap;
+   int ret;
+
+   va_start(ap, fmt);
+   ret = xenvasprintf(1, strp, fmt, ap);
+   va_end(ap);
+
+   return ret;
+}
+
+int
+enasprintf(int status, char **strp, const char *fmt, ...)
+{
+   va_list ap;
+   int ret;
+
+   va_start(ap, fmt);
+   ret = xenvasprintf(status, strp, fmt, ap);
+   va_end(ap);
+
+   return ret;
+}
+
+int
+xenvasprintf(int status, char **strp, const char *fmt, va_list ap)
+{
+   int ret;
+   va_list ap2;
+
+   va_copy(ap2, ap);
+   ret = vsnprintf(0, 0, fmt, ap2);
+   va_end(ap2);
+   if (ret < 0) {
+   if (status >= 0)
+   enprintf(status, "vsnprintf:");
+   *strp = 0;
+   return -1;
+   }
+
+   *strp = malloc(ret + 1);
+   if (!*strp) {
+   if (status >= 0)
+   enprintf(status, "malloc:");
+   return -1;
+   }
+
+   vsprintf(*strp, fmt, ap);
+   return ret;
+}
diff --git a/libutil/getlines.c b/libutil/getlines.c
index b912769..9af7684 100644
--- a/libutil/getlines.c
+++ b/libutil/getlines.c
@@ -7,7 +7,7 @@
 #include "../util.h"
 
 void
-getlines(FILE *fp, struct linebuf *b)
+ngetlines(int status, FILE *fp, struct linebuf *b)
 {
char *line = NULL;
size_t size = 0, linelen = 0;
@@ -16,17 +16,24 @@ getlines(FILE *fp, struct linebuf *b)
while ((len = getline(&line, &size, fp)) > 0) {
if (++b->nlines > b->capacity) {
b->capacity += 512;
-   b->lines = erealloc(b->lines, b->capacity * 
sizeof(*b->lines));
+   b->lines = enrealloc(status, b->lines, b->capacity * 
sizeof(*b->lines));
}
linelen = len;
-   b->lines[b->nlines - 1].data = memcpy(emalloc(linelen + 1), 
line, linelen + 1);
+   b->lines[b->nlines - 1].data = memcpy(enmalloc(status, linelen 
+ 1), line, linelen + 1);
b->lines[b->nlines - 1].len = linelen;
}
free(line);
-   if (b->lines && b->nlines && linelen && b->lines[b->nlines - 
1].data[linelen - 1] != '\n') {
-   b->lines[b->nlines - 1].data = erealloc(b->lines[b->nlines - 
1].data, linelen + 2);
+   b->nolf = b->lines && b->nlines && linelen && b->lines[b->nlines - 
1].data[linelen - 1] != '\n';
+   if (b->nolf) {
+   b->lines[b->nlines - 1].data = enrealloc(status, 
b->lines[b->nlines - 1].data, linelen + 2);
b->lines[b->nlines - 1].data[linelen] = '\n';
b->lines[b->nlines - 1].data[linelen + 1] = '\0';
b->lines[b->nlines - 1].len++;
  

Re: [hackers] [sbase][PATCH] concat: read(2): handle EINTR

2017-07-23 Thread Mattias Andrée
I don't think the #ifdef is necessary: EINTR is defined in POSIX,
and POSIX specifies that EINTR can be returned by read(3). Additionally,
if a check is wanted, checking whether EINTR is defined would be better.

However, there is no need to check for EINTR unless the program
catches signals. Are there any tools in sbase that both use concat
and catch signals?
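
For reference, a retry without any platform check could look like the
sketch below. `read_retry` is a hypothetical helper, not something in
libutil, and this is only one way to structure it.

```c
#include <errno.h>
#include <unistd.h>

/*
 * Hypothetical helper: like read(2), but retries when the call is
 * interrupted by a signal before any data was read. EINTR is
 * specified by POSIX, so no #ifdef __linux__ is needed.
 */
ssize_t
read_retry(int fd, void *buf, size_t len)
{
	ssize_t n;

	do {
		n = read(fd, buf, len);
	} while (n < 0 && errno == EINTR);

	return n;
}
```

The loop in concat() could then call this in place of read(2) and keep
its original shape. As noted above, this only matters for programs
that actually catch signals.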


On Sun, 23 Jul 2017 10:13:50 +0100
Richard Ipsum  wrote:

> On Linux we may receive EINTR if the read call is interrupted
> before any data is read, if this is the case we can try to read
> again.
> ---
>  libutil/concat.c | 22 +-
>  1 file changed, 17 insertions(+), 5 deletions(-)
> 
> diff --git a/libutil/concat.c b/libutil/concat.c
> index 2e9aa52..5cbadd2 100644
> --- a/libutil/concat.c
> +++ b/libutil/concat.c
> @@ -1,4 +1,5 @@
>  /* See LICENSE file for copyright and license details. */
> +#include 
>  #include 
>  
>  #include "../util.h"
> @@ -9,15 +10,26 @@ concat(int f1, const char *s1, int f2, const char *s2)
>   char buf[BUFSIZ];
>   ssize_t n;
>  
> - while ((n = read(f1, buf, sizeof(buf))) > 0) {
> + for (;;) {
> + n = read(f1, buf, sizeof(buf));
> +
> + if (n == 0)
> + break;
> +
> + if (n < 0) {
> +#ifdef __linux__
> + if (errno == EINTR)
> + continue;
> +#endif
> + weprintf("read %s:", s1);
> + return -1;
> + }
> +
>   if (writeall(f2, buf, n) < 0) {
>   weprintf("write %s:", s2);
>   return -2;
>   }
>   }
> - if (n < 0) {
> - weprintf("read %s:", s1);
> - return -1;
> - }
> +
>   return 0;
>  }





Re: [hackers] [PATCH v3][sbase] libutil/unescape.c: simplify and add \E

2017-02-06 Thread Mattias Andrée
On Mon, 06 Feb 2017 15:50:04 -0800
evan.ga...@gmail.com (Evan Gates) wrote:

> Mattias Andrée  wrote:
> 
> > On Mon, 06 Feb 2017 15:05:32 -0800
> > evan.ga...@gmail.com (Evan Gates) wrote:
> >   
> > > Mattias Andrée  wrote:
> > >   
> > > > +   } else if (escapes[*r & 255]) {
> > > > +   *w++ = escapes[*r++ &
> > > > 255];
> > > 
> > > Why do you & 255 here? I think a cast to unsigned char
> > > would accomplish what you are trying to do and be more
> > > correct (as char can default to either signed or
> > > unsigned). Although I may misunderstand what is going
> > > on here.  
> > 
> > Yes. I used &255 because it's does clutter as much.  
> 
> OK. I'm going to change to a cast to be on the safe side
> due to the "implementation-defined and undefined aspects"
> mentioned in C99 6.5p4:
> 
> Some operators (the unary operator ~, and the binary
> operators <<, >>, &, ^, and |, collectively described as
> bitwise operators) are required to have operands that
> have integer type. These operators yield values that
> depend on the internal representations of integers, and
> have implementation-defined and undefined aspects for
> signed types.

Sure.
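
A small sketch of the cast variant, assuming 8-bit char. The function
name `escape_value` is hypothetical and the table is a shortened
version of the one in the patch.

```c
/*
 * Hypothetical lookup, mirroring the escapes[] table in the patch:
 * returns the character that the escape letter c stands for, or 0
 * if c is not a known escape.
 */
char
escape_value(char c)
{
	static const char escapes[256] = {
		['\\'] = '\\',
		['n'] = '\n',
		['t'] = '\t',
		['e'] = 033
	};

	/* char may be signed; the cast keeps the index within 0..255 */
	return escapes[(unsigned char)c];
}
```

Without the cast, a byte such as 0xE9 could become a negative index
on platforms where char is signed.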

> 
> > > I think this is clearer, even though it adds a
> > > conditional:
> > > 
> > > q = q * 16 + isdigit(*r) ? *r - '0' : tolower(*r) -
> > > 'a' + 10;  
> > 
> > I think
> > 
> > if (isdigit(*r))
> > q = q * 16 + (*r - '0');
> > else
> > q = q * 16 + (tolower(*r) - 'a') + 10;
> > 
> > is clearer, or at least brackets around the ternary.  
> 
> You're right, the line was getting a bit long and
> convoluted with the ternary, bad habit of mine.
> 
> Another thought just came to mind, isoctal is a reserved
> name. 7.26p1:
> 
> The following names are grouped under individual headers
> for convenience. All external names described below are
> reserved no matter what headers are included by the
> program.
> 
> 7.26.2p1:
> 
> Function names that begin with either is or to, and a
> lowercase letter may be added to the declarations in the
>  header.
> 
> I don't know if it's worth being anal about that,
> especially as we have already limited ourselves to C99 do
> we need to worry about future directions? For now I've
> changed isoctal() to is_odigit() but I'm open to
> discussion.

Sure, I didn't think about that. However, I think we can
be pretty confident that if isoctal is added, it will do
the same thing as that macro, so it will not make a difference.
That said, isodigit is probably what such a function would
be called, since there already is isxdigit, so why not
change it to is_odigit or ODIGIT.

> 
> Updated patch for consideration:
> 
> - 8< - 8< - 8< -
> From 30fd43d7f3b8716054eb9867c835aadc423f652c Mon Sep 17 00:00:00 2001
> From: =?UTF-8?q?Mattias=20Andr=C3=A9e?=
> Date: Sun, 5 Feb 2017 00:44:35 +0100
> Subject: [PATCH] libutil/unescape.c: simplify and add \E
> MIME-Version: 1.0
> Content-Type: text/plain; charset=UTF-8
> Content-Transfer-Encoding: 8bit
> 
> Signed-off-by: Mattias Andrée 
> ---
>  libutil/unescape.c | 101 ++---
>  1 file changed, 42 insertions(+), 59 deletions(-)
> 
> diff --git a/libutil/unescape.c b/libutil/unescape.c
> index d1503e6..7523db3 100644
> --- a/libutil/unescape.c
> +++ b/libutil/unescape.c
> @@ -1,74 +1,57 @@
>  /* See LICENSE file for copyright and license details. */
> +#include 
>  #include 
>  
>  #include "../util.h"
>  
> +#define is_odigit(c)  ('0' <= c && c <= '7')
> +
>  size_t
>  unescape(char *s)
>  {
> - size_t len, i, off, m, factor, q;
> -
> - len = strlen(s);
> + static const char escapes[256] = {
> + ['"'] = '"',
> + ['\''] = '\'',
> + ['\\'] = '\\',
> + ['a'] = '\a',
> + ['b'] = '\b',
> + ['E'] = 033,
> + ['e'] = 033,
> + ['f'] = '\f',
> + ['n'] = '\n',
> + ['r'] = '\r',
> + ['t'] = '\t',
> + ['v'] = '\v'
>

Re: [hackers] [PATCH v3][sbase] libutil/unescape.c: simplify and add \E

2017-02-06 Thread Mattias Andrée
On Mon, 06 Feb 2017 15:05:32 -0800
evan.ga...@gmail.com (Evan Gates) wrote:

> Mattias Andrée  wrote:
> 
> > +   } else if (escapes[*r & 255]) {
> > +   *w++ = escapes[*r++ & 255];  
> 
> Why do you & 255 here? I think a cast to unsigned char
> would accomplish what you are trying to do and be more
> correct (as char can default to either signed or
> unsigned). Although I may misunderstand what is going on
> here.

Yes. I used &255 because it doesn't clutter as much.

> 
> > +   q = q * 8 + (*r & 7);  
> 
> I think this is clearer:
> 
> q = q * 8 + (*r - '0');

Go for it!

> 
> > +   *w++ = q > 255 ? 255 : q;  
> 
> Probably use MIN(q, 255) instead of the explicit ternary.

Yeah, I forgot that we have a MIN macro, so I didn't change that part.

> 
> > +   for (q = 0, m = 2; m &&
> > isxdigit(*r); m--, r++)
> > +   q = q * 16 + (*r & 15)
> > + 9 * !!isalpha(*r);  
> 
> I think this is clearer, even though it adds a
> conditional:
> 
> q = q * 16 + isdigit(*r) ? *r - '0' : tolower(*r) - 'a' +
> 10;

I think

if (isdigit(*r))
q = q * 16 + (*r - '0');
else
q = q * 16 + (tolower(*r) - 'a') + 10;

is clearer, or at least brackets around the ternary.

> 
> Great work, much much simpler implementation. Once I
> understand the &255 and we agree on the best solution for
> these few lines I'll go ahead and push this.
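
For readers following the &255 question: the table lookup can also be
written with a cast, as suggested above. The following is only a minimal
sketch, not the code from the patch; escape_table and lookup_escape are
hypothetical names, and the table holds just a few entries for brevity.

```c
#include <stddef.h>

/* Hypothetical table, in the spirit of escapes[] from the patch. */
static const char escape_table[256] = {
	['n'] = '\n',
	['t'] = '\t',
	['e'] = 033,
	['E'] = 033
};

/*
 * Return the replacement for the character following a backslash,
 * or 0 if there is none. The cast maps every possible char value to
 * 0..255 regardless of whether plain char is signed, so the index
 * can never be negative.
 */
static char
lookup_escape(char c)
{
	return escape_table[(unsigned char)c];
}
```

With this, lookup_escape('n') yields '\n', and a byte with the high bit
set simply selects an empty slot in the upper half of the table.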





Re: [hackers] [PATCH][sbase] libutil/unescape.c: add \E and simplify \x

2017-02-06 Thread Mattias Andrée
On Mon, 6 Feb 2017 13:29:54 -0800
Evan Gates  wrote:

> On Sat, Feb 4, 2017 at 1:32 PM, Mattias Andrée
>  wrote:
> > @@ -39,10 +39,8 @@ unescape(char *s)
> > off += m - i - 1;
> > for (--m, q = 0, factor = 1; m
> > > i + 1; m--) {
> > -   if (s[m] >= '0' && s[m]
> > <= '9')
> > -   q += (s[m] -
> > '0') * factor;
> > -   else if (s[m] >= 'A' &&
> > s[m] <= 'F')
> > -   q += ((s[m] -
> > 'A') + 10) * factor;
> > -   else if (s[m] >= 'a' &&
> > s[m] <= 'f')
> > -   q += ((s[m] -
> > 'a') + 10) * factor;
> > +   if (isdigit(s[m]))
> > +   q += (s[m] &
> > 15) * factor;
> > +   else
> > +   q += ((s[m] &
> > 15) + 9) * factor; factor *= 16;
> > }
> > --
> > 2.11.0
> >
> >  
> 
> I think this would be clearer as:
> 
> if (isdigit(s[m]))
> q += (s[m] - '0') * factor;
> else
> q += (tolower(s[m]) - 'a' + 10) * factor;
> 
> But it is just a personal style preference. Or, if we do
> keep the bit twiddling, I highly recommend using hex
> instead of decimal, e.g. (s[m] & 0xf).
> 
> Thoughts?

I don't really have a preference.
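
To make the two styles discussed above easy to compare, here is a small
sketch of both conversions side by side. It assumes an ASCII-compatible
character set and that the caller has already checked isxdigit(); the
function names hexval_arith and hexval_bits are made up for this example.

```c
#include <ctype.h>

/* Arithmetic style: subtract the base character of each range. */
static int
hexval_arith(int c)
{
	return isdigit(c) ? c - '0' : tolower(c) - 'a' + 10;
}

/*
 * Bit-twiddling style: in ASCII the low nibble of '0'..'9' is 0..9 and
 * the low nibble of 'a'..'f' and 'A'..'F' is 1..6, so letters need +9.
 * isalpha() only promises a nonzero result, hence the !! to get 0 or 1.
 */
static int
hexval_bits(int c)
{
	return (c & 0xf) + 9 * !!isalpha(c);
}
```

Both functions agree for every hexadecimal digit in ASCII; which one
reads better is a matter of taste, as noted above.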




[hackers] [sbase][PATCH v4] Add man(1) and manpp(1)

2017-02-05 Thread Mattias Andrée
manpp(1) preprocesses a man page, which can be done
when installing man pages. If this is done and man(1)
is configured to not preprocess files, less work will
be involved in displaying man pages. It can also
be used to preprocess man pages that need special
preprocessing, namely chem(1) and grap(1), or to remove
conflicting preprocessing (which requires that man(1)
is configured to not preprocess installed man pages).

Signed-off-by: Mattias Andrée 
---
 Makefile |   6 +
 README   |   2 +
 config.def.h |  21 ++
 man.1| 157 ++
 man.c| 687 +++
 manpp.1  |  39 
 manpp.c  | 120 +++
 7 files changed, 1032 insertions(+)
 create mode 100644 config.def.h
 create mode 100644 man.1
 create mode 100644 man.c
 create mode 100644 manpp.1
 create mode 100644 manpp.c

diff --git a/Makefile b/Makefile
index 9ec9990..9deed7a 100644
--- a/Makefile
+++ b/Makefile
@@ -6,6 +6,7 @@ include config.mk
 HDR =\
arg.h\
compat.h\
+   config.h\
crypt.h\
fs.h\
md5.h\
@@ -121,6 +122,8 @@ BIN =\
logger\
logname\
ls\
+   man\
+   manpp\
md5sum\
mkdir\
mkfifo\
@@ -192,6 +195,9 @@ $(BIN): $(LIB) $(@:=.o)
 
 $(OBJ): $(HDR) config.mk
 
+config.h:
+   cp config.def.h $@
+
 .o:
$(CC) $(LDFLAGS) -o $@ $< $(LIB)
 
diff --git a/README b/README
index da2e500..14f3343 100644
--- a/README
+++ b/README
@@ -50,6 +50,8 @@ The following tools are implemented:
 0=*|o logger  .
 0=*|o logname .
 0#* o ls  (-C, -k, -m, -p, -s, -x)
+0#* o man (-k)
+0=* x manpp   .
 0=*|x md5sum  .
 0=*|o mkdir   .
 0=*|o mkfifo  .
diff --git a/config.def.h b/config.def.h
new file mode 100644
index 000..d7c3da2
--- /dev/null
+++ b/config.def.h
@@ -0,0 +1,21 @@
+/* See LICENSE file for copyright and license details. */
+
+#define GROFF "/usr/bin/groff"
+
+#define GUNZIP "exec /usr/bin/gunzip"
+#define UNCOMPRESS "exec /usr/bin/uncompress"
+#define BUNZIP2"exec /usr/bin/bunzip2"
+#define UNXZ   "exec /usr/bin/unxz"
+#define UNLZMA "exec /usr/bin/unlzma"
+#define LUNZIP "exec /usr/bin/lzip -d"
+#define UNLZ4  "exec /usr/bin/unlz4"
+
+#define MANPATHTRANS "/=/usr"
+#define MANPATH "/usr/local/share/man:/usr/share/man"
+#define MANSECT "3p:3:0:2:1:8:n:l:5:4:6:7:x:9:a:b:c:d:e:f:g:h:i:j:k:l:m:o:p:q:r:s:t:u:v:w:u:y:z"
+#define MANBINSECT "1:8"
+#define MANETCSECT "5"
+#define MANCOMP "gz="GUNZIP":z="GUNZIP":Z="UNCOMPRESS":bz2="BUNZIP2":xz="UNXZ":lzma="UNLZMA":lz="LUNZIP":lz4="UNLZ4
+#define MANPAGER "exec /usr/bin/less"
+
+#define MANPAGES_ARE_PREPROCESSED 0
diff --git a/man.1 b/man.1
new file mode 100644
index 000..970c52d
--- /dev/null
+++ b/man.1
@@ -0,0 +1,157 @@
+.Dd 2017-02-06
+.Dt MAN 1
+.Os sbase
+.Sh NAME
+.Nm man
+.Nd display online reference manuals
+.Sh SYNOPSIS
+.Nm
+.Op Fl 7 | Fl 8
+.Op Ar section
+.Ar name
+.Op Ar subcommand ...
+.Sh DESCRIPTION
+.Nm
+displays the manual with the selected
+.Ar name .
+.Ar name
+is usually a command, C function or system call.
+All
+.Ar subcommand
+are appended to
+.Ar name
+with \fB-\fP joining the arguments.
+Multiple manuals may share the same name; in this case
+.Ar section
+can be used to specify which of them to display.
+.P
+The table below shows common
+.Ar section
+numbers.
+.Bd -literal -offset left
+1  Executable commands, as implemented
+1p Executable commands, as specified by POSIX
+2  System calls
+3  Library calls, as implemented
+3p Library calls, as specified by POSIX
+4  Special files
+5  Configuration files, file formats and conventions
+6  Entertainment
+7  Miscellanea
+8  System administration commands
+9  Kernel routines
+0  Header files, as implemented
+0p Header files, as specified by POSIX
+l  Tcl/Tk commands
+n  SQL commands
+.Ed
+.Sh OPTIONS
+.Bl -tag -width Ds
+.It Fl 7
+Convert output to ASCII.
+.It Fl 8
+Allow UTF-8 output.
+.El
+.Sh ENVIRONMENT VARIABLES
+.Bl -tag -width Ds
+.It Ev MANPATH
+Colon-separated list of directories where manuals are stored.
+Each listed directory shall contain directories for each
+section rather than the manuals themselves.
+.It Ev MANPATHTRANS
+Colon-separated list of directory remappings used
+to find the manual when command is given by its
+path. Each entry is of the format
+.Va bindir Ns Cm = Ns Va mandir .
+If the parent directory of the directory
+.Ar name
+matches
+.Va bindir ,
+.Va mandir
+is used as
+.Ar MANPATH .
+.It Ev MANSECT
+Colon-separated list of
+.Ar section
+numbers. If a manual is found under two or more
+sections, the first named section in
+.Ev MANSECT
+is selected.
+.It Ev MANBINSECT

[hackers] [PATCH 2][sbase] libutil/unescape.c: add \u and \U; and correct and update printf.1

2017-02-04 Thread Mattias Andrée
Signed-off-by: Mattias Andrée 
---
 libutil/unescape.c | 40 +++-
 printf.1   | 11 ++-
 2 files changed, 41 insertions(+), 10 deletions(-)

diff --git a/libutil/unescape.c b/libutil/unescape.c
index bed2c61..5845dd4 100644
--- a/libutil/unescape.c
+++ b/libutil/unescape.c
@@ -7,6 +7,32 @@
 #define isoctal(c)  ('0' <= c && c <= '7')
 
 size_t
+utf8encode(size_t cp, char *out)
+{
+   char head = 0, headmask = (char)0x80, buf[7] = {0, 0, 0, 0, 0, 0, 0};
+   size_t n = 0;
+
+   if (cp < 0x80)
+   return *out = (char)cp, 1;
+   if (cp > 0x7fffUL)
+   eprintf("invalid code point %X\n", cp);
+   while (cp) {
+   buf[6 - n] |= (char)0x80;
+   buf[6 - ++n] |= cp & 0x3f;
+   cp >>= 6;
+   head |= headmask;
+   headmask >>= 1;
+   }
+   if (buf[6 - n] & (head | headmask)) {
+   buf[6 - n] |= (char)0x80;
+   n++, head |= headmask;
+   }
+   buf[6 - n] |= head;
+   memcpy(out, buf + 6 - n, n);
+   return n;
+}
+
+size_t
 unescape(char *s)
 {
static const char escapes[256] = {
@@ -23,8 +49,9 @@ unescape(char *s)
['t'] = '\t',
['v'] = '\v'
};
+   static const char hexlen[256] = {['x'] = 2, ['u'] = 4, ['U'] = 8};
size_t m, q;
-   char *r, *w;
+   char *r, *w, hex;
 
for (r = w = s; *r;) {
if (*r != '\\') {
@@ -40,11 +67,14 @@ unescape(char *s)
for (q = 0, m = 4; m && isoctal(*r); m--, r++)
q = q * 8 + (*r & 7);
*w++ = q > 255 ? 255 : q;
-   } else if (*r == 'x' && isxdigit(r[1])) {
-   r++;
-   for (q = 0, m = 2; m && isxdigit(*r); m--, r++)
+   } else if (hexlen[*r & 255] && isxdigit(r[1])) {
+   m = hexlen[(hex = *r++) & 255];
+   for (q = 0; m && isxdigit(*r); m--, r++)
q = q * 16 + (*r & 15) + 9 * !!isalpha(*r);
-   *w++ = q;
+   if (hex == 'x')
+   *w++ = q;
+   else
+   w += utf8encode(q, w);
} else {
eprintf("invalid escape sequence '\\%c'\n", *r);
}
diff --git a/printf.1 b/printf.1
index 78ffb1e..00fa850 100644
--- a/printf.1
+++ b/printf.1
@@ -1,4 +1,4 @@
-.Dd 2015-10-08
+.Dd 2017-02-04
 .Dt PRINTF 1
 .Os sbase
 .Sh NAME
@@ -17,9 +17,9 @@ using each
 until drained.
 .Pp
 .Nm
-interprets the standard escape sequences \e\e, \e', \e", \ea, \eb, \ee,
-\ef, \en, \er, \et, \ev, \exH[H], \eO[OOO], the sequence \ec, which
-terminates further output if it's found inside
+interprets the standard escape sequences \e\e, \e', \e", \ea, \eb,
+\ef, \en, \er, \et, \ev, \exH[H], \eO[OOO], the sequences, \ee, \eE, \euH[HHH],
+\eUH[HHH], and \ec, which terminates further output if it's found inside
 .Ar format
 or a %b format string, the format specification %b for an unescaped string and all C
 .Xr printf 3
@@ -31,4 +31,5 @@ utility is compliant with the
 .St -p1003.1-2013
 specification.
 .Pp
-The possibility of specifying 4-digit octals is an extension to that specification.
+The escape sequences \ee, \eE, \euH[HHH], \eUH[HHH], \exH[H] and possibility of
+specifying 4-digit octals is an extension to that specification.
-- 
2.11.0
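
As background for the \u and \U escapes: the code point has to be turned
into a UTF-8 byte sequence before it is written out. Below is a minimal
standalone sketch of that conversion. It is not the utf8encode() from the
patch; it follows the modern definition of UTF-8 (at most 4 bytes, no
surrogates), and the name utf8_encode_sketch is made up for this example.

```c
#include <stddef.h>

/*
 * Encode cp as UTF-8 into out, which must have room for 4 bytes.
 * Returns the number of bytes written, or 0 if cp is a surrogate
 * or lies beyond U+10FFFF.
 */
static size_t
utf8_encode_sketch(unsigned long cp, unsigned char *out)
{
	if (cp < 0x80) {
		out[0] = (unsigned char)cp;
		return 1;
	}
	if (cp < 0x800) {
		out[0] = (unsigned char)(0xC0 | (cp >> 6));
		out[1] = (unsigned char)(0x80 | (cp & 0x3F));
		return 2;
	}
	if (cp < 0x10000) {
		if (cp >= 0xD800 && cp <= 0xDFFF)
			return 0; /* surrogates are not valid scalar values */
		out[0] = (unsigned char)(0xE0 | (cp >> 12));
		out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
		out[2] = (unsigned char)(0x80 | (cp & 0x3F));
		return 3;
	}
	if (cp < 0x110000) {
		out[0] = (unsigned char)(0xF0 | (cp >> 18));
		out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
		out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
		out[3] = (unsigned char)(0x80 | (cp & 0x3F));
		return 4;
	}
	return 0;
}
```

For example, U+00E9 becomes the two bytes C3 A9 and U+20AC becomes the
three bytes E2 82 AC.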




[hackers] [PATCH v3][sbase] libutil/unescape.c: simplify and add \E

2017-02-04 Thread Mattias Andrée
Signed-off-by: Mattias Andrée 
---
 libutil/unescape.c | 98 ++
 1 file changed, 39 insertions(+), 59 deletions(-)

diff --git a/libutil/unescape.c b/libutil/unescape.c
index d1503e6..bed2c61 100644
--- a/libutil/unescape.c
+++ b/libutil/unescape.c
@@ -1,74 +1,54 @@
 /* See LICENSE file for copyright and license details. */
+#include 
 #include 
 
 #include "../util.h"
 
+#define isoctal(c)  ('0' <= c && c <= '7')
+
 size_t
 unescape(char *s)
 {
-   size_t len, i, off, m, factor, q;
-
-   len = strlen(s);
+   static const char escapes[256] = {
+   ['"'] = '"',
+   ['\''] = '\'',
+   ['\\'] = '\\',
+   ['a'] = '\a',
+   ['b'] = '\b',
+   ['E'] = 033,
+   ['e'] = 033,
+   ['f'] = '\f',
+   ['n'] = '\n',
+   ['r'] = '\r',
+   ['t'] = '\t',
+   ['v'] = '\v'
+   };
+   size_t m, q;
+   char *r, *w;
 
-   for (i = 0; i < len; i++) {
-   if (s[i] != '\\')
+   for (r = w = s; *r;) {
+   if (*r != '\\') {
+   *w++ = *r++;
continue;
-   off = 0;
-
-   switch (s[i + 1]) {
-   case '\\': s[i] = '\\'; off++; break;
-   case '\'': s[i] = '\'', off++; break;
-   case '"':  s[i] =  '"', off++; break;
-   case 'a':  s[i] = '\a'; off++; break;
-   case 'b':  s[i] = '\b'; off++; break;
-   case 'e':  s[i] =  033; off++; break;
-   case 'f':  s[i] = '\f'; off++; break;
-   case 'n':  s[i] = '\n'; off++; break;
-   case 'r':  s[i] = '\r'; off++; break;
-   case 't':  s[i] = '\t'; off++; break;
-   case 'v':  s[i] = '\v'; off++; break;
-   case 'x':
-   /* "\xH[H]" hexadecimal escape */
-   for (m = i + 2; m < i + 1 + 3 && m < len; m++)
-   if ((s[m] < '0' && s[m] > '9') &&
-   (s[m] < 'A' && s[m] > 'F') &&
-   (s[m] < 'a' && s[m] > 'f'))
-   break;
-   if (m == i + 2)
-   eprintf("invalid escape sequence '\\%c'\n", s[i + 1]);
-   off += m - i - 1;
-   for (--m, q = 0, factor = 1; m > i + 1; m--) {
-   if (s[m] >= '0' && s[m] <= '9')
-   q += (s[m] - '0') * factor;
-   else if (s[m] >= 'A' && s[m] <= 'F')
-   q += ((s[m] - 'A') + 10) * factor;
-   else if (s[m] >= 'a' && s[m] <= 'f')
-   q += ((s[m] - 'a') + 10) * factor;
-   factor *= 16;
-   }
-   s[i] = q;
-   break;
-   case '\0':
+   }
+   r++;
+   if (!*r) {
eprintf("null escape sequence\n");
-   default:
-   /* "\O[OOO]" octal escape */
-   for (m = i + 1; m < i + 1 + 4 && m < len; m++)
-   if (s[m] < '0' || s[m] > '7')
-   break;
-   if (m == i + 1)
-   eprintf("invalid escape sequence '\\%c'\n", s[i + 1]);
-   off += m - i - 1;
-   for (--m, q = 0, factor = 1; m > i; m--) {
-   q += (s[m] - '0') * factor;
-   factor *= 8;
-   }
-   s[i] = (q > 255) ? 255 : q;
+   } else if (escapes[*r & 255]) {
+   *w++ = escapes[*r++ & 255];
+   } else if (isoctal(*r)) {
+   for (q = 0, m = 4; m && isoctal(*r); m--, r++)
+   q = q * 8 + (*r & 7);
+   *w++ = q > 255 ? 255 : q;
+   } else if (*r == 'x' && isxdigit(r[1])) {
+   r++;
+   for (q = 0, m = 2; m && isxdigit(*r); m--, r++)
+   q = q * 16 + (*r & 15) + 9 * !!isalpha(*r);
+   *w++ = q;
+   } else {
+   eprintf("invalid escape sequence '\\%c'\n", *r);
}
-
-   for (m = i + 1; m <= len - off; m++)
-   s[m] = s[m + off];
-   len -= off;
}
 
-   return len;
+   return w - s;
 }
-- 
2.11.0




[hackers] [PATCH v2][sbase] libutil/unescape.c: simplify and add \E

2017-02-04 Thread Mattias Andrée
Signed-off-by: Mattias Andrée 
---
 libutil/unescape.c | 98 ++
 1 file changed, 39 insertions(+), 59 deletions(-)

diff --git a/libutil/unescape.c b/libutil/unescape.c
index d1503e6..c0155ea 100644
--- a/libutil/unescape.c
+++ b/libutil/unescape.c
@@ -1,74 +1,54 @@
 /* See LICENSE file for copyright and license details. */
+#include 
 #include 
 
 #include "../util.h"
 
+#define isoctal(c)  ('0' <= c && c <= '7')
+
 size_t
 unescape(char *s)
 {
-   size_t len, i, off, m, factor, q;
-
-   len = strlen(s);
+   static const char escapes[256] = {
+   ['"'] = '"',
+   ['\''] = '\'',
+   ['\\'] = '\\',
+   ['a'] = '\a',
+   ['b'] = '\b',
+   ['E'] = 033,
+   ['e'] = 033,
+   ['f'] = '\f',
+   ['n'] = '\n',
+   ['r'] = '\r',
+   ['t'] = '\t',
+   ['v'] = '\v'
+   };
+   size_t m, q;
+   char *r, *w;
 
-   for (i = 0; i < len; i++) {
-   if (s[i] != '\\')
+   for (r = w = s; *r;) {
+   if (*r != '\\') {
+   *w++ = *r++;
continue;
-   off = 0;
-
-   switch (s[i + 1]) {
-   case '\\': s[i] = '\\'; off++; break;
-   case '\'': s[i] = '\'', off++; break;
-   case '"':  s[i] =  '"', off++; break;
-   case 'a':  s[i] = '\a'; off++; break;
-   case 'b':  s[i] = '\b'; off++; break;
-   case 'e':  s[i] =  033; off++; break;
-   case 'f':  s[i] = '\f'; off++; break;
-   case 'n':  s[i] = '\n'; off++; break;
-   case 'r':  s[i] = '\r'; off++; break;
-   case 't':  s[i] = '\t'; off++; break;
-   case 'v':  s[i] = '\v'; off++; break;
-   case 'x':
-   /* "\xH[H]" hexadecimal escape */
-   for (m = i + 2; m < i + 1 + 3 && m < len; m++)
-   if ((s[m] < '0' && s[m] > '9') &&
-   (s[m] < 'A' && s[m] > 'F') &&
-   (s[m] < 'a' && s[m] > 'f'))
-   break;
-   if (m == i + 2)
-   eprintf("invalid escape sequence '\\%c'\n", s[i + 1]);
-   off += m - i - 1;
-   for (--m, q = 0, factor = 1; m > i + 1; m--) {
-   if (s[m] >= '0' && s[m] <= '9')
-   q += (s[m] - '0') * factor;
-   else if (s[m] >= 'A' && s[m] <= 'F')
-   q += ((s[m] - 'A') + 10) * factor;
-   else if (s[m] >= 'a' && s[m] <= 'f')
-   q += ((s[m] - 'a') + 10) * factor;
-   factor *= 16;
-   }
-   s[i] = q;
-   break;
-   case '\0':
+   }
+   r++;
+   if (!*r) {
eprintf("null escape sequence\n");
-   default:
-   /* "\O[OOO]" octal escape */
-   for (m = i + 1; m < i + 1 + 4 && m < len; m++)
-   if (s[m] < '0' || s[m] > '7')
-   break;
-   if (m == i + 1)
-   eprintf("invalid escape sequence '\\%c'\n", s[i + 1]);
-   off += m - i - 1;
-   for (--m, q = 0, factor = 1; m > i; m--) {
-   q += (s[m] - '0') * factor;
-   factor *= 8;
-   }
-   s[i] = (q > 255) ? 255 : q;
+   } else if (escapes[*r & 255]) {
+   *w++ = escapes[*r++ & 255];
+   } else if (isoctal(*r)) {
+   for (q = 0, m = 4; m && isoctal(*r); m--, r++)
+   q = q * 8 + (*r & 7);
+   *w++ = q > 255 ? 255 : q;
+   } else if (*r == 'x' && isxdigit(r[1])) {
+   r++;
+   for (q = 0, m = 2; m && isxdigit(*r); m--, r++)
+   q = q * 16 + (*r & 15) + 9 * isalpha(*r);
+   *w++ = q;
+   } else {
+   eprintf("invalid escape sequence '\\%c'\n", *r);
}
-
-   for (m = i + 1; m <= len - off; m++)
-   s[m] = s[m + off];
-   len -= off;
}
 
-   return len;
+   return w - s;
 }
-- 
2.11.0




[hackers] [PATCH][sbase] libutil/unescape.c: add \E and simplify \x

2017-02-04 Thread Mattias Andrée
Signed-off-by: Mattias Andrée 
---
 libutil/unescape.c | 20 +---
 1 file changed, 9 insertions(+), 11 deletions(-)

diff --git a/libutil/unescape.c b/libutil/unescape.c
index d1503e6..aae15cf 100644
--- a/libutil/unescape.c
+++ b/libutil/unescape.c
@@ -1,3 +1,4 @@
 /* See LICENSE file for copyright and license details. */
+#include 
 #include 
 
@@ -18,8 +19,9 @@ unescape(char *s)
switch (s[i + 1]) {
case '\\': s[i] = '\\'; off++; break;
-   case '\'': s[i] = '\'', off++; break;
-   case '"':  s[i] =  '"', off++; break;
+   case '\'': s[i] = '\''; off++; break;
+   case '"':  s[i] =  '"'; off++; break;
case 'a':  s[i] = '\a'; off++; break;
case 'b':  s[i] = '\b'; off++; break;
+   case 'E':
case 'e':  s[i] =  033; off++; break;
case 'f':  s[i] = '\f'; off++; break;
@@ -31,7 +33,5 @@ unescape(char *s)
/* "\xH[H]" hexadecimal escape */
for (m = i + 2; m < i + 1 + 3 && m < len; m++)
-   if ((s[m] < '0' && s[m] > '9') &&
-   (s[m] < 'A' && s[m] > 'F') &&
-   (s[m] < 'a' && s[m] > 'f'))
+   if (!isxdigit(s[m]))
break;
if (m == i + 2)
@@ -39,10 +39,8 @@ unescape(char *s)
off += m - i - 1;
for (--m, q = 0, factor = 1; m > i + 1; m--) {
-   if (s[m] >= '0' && s[m] <= '9')
-   q += (s[m] - '0') * factor;
-   else if (s[m] >= 'A' && s[m] <= 'F')
-   q += ((s[m] - 'A') + 10) * factor;
-   else if (s[m] >= 'a' && s[m] <= 'f')
-   q += ((s[m] - 'a') + 10) * factor;
+   if (isdigit(s[m]))
+   q += (s[m] & 15) * factor;
+   else
+   q += ((s[m] & 15) + 9) * factor;
factor *= 16;
}
-- 
2.11.0




[hackers] [PATCH][sbase] libutil/unescape.c: only print argv0 once on error

2017-02-04 Thread Mattias Andrée
Signed-off-by: Mattias Andrée 
---
 libutil/unescape.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/libutil/unescape.c b/libutil/unescape.c
index 90a62c3..d1503e6 100644
--- a/libutil/unescape.c
+++ b/libutil/unescape.c
@@ -35,7 +35,7 @@ unescape(char *s)
(s[m] < 'a' && s[m] > 'f'))
break;
if (m == i + 2)
-   eprintf("%s: invalid escape sequence '\\%c'\n", argv0, s[i + 1]);
+   eprintf("invalid escape sequence '\\%c'\n", s[i + 1]);
off += m - i - 1;
for (--m, q = 0, factor = 1; m > i + 1; m--) {
if (s[m] >= '0' && s[m] <= '9')
@@ -49,14 +49,14 @@ unescape(char *s)
s[i] = q;
break;
case '\0':
-   eprintf("%s: null escape sequence\n", argv0);
+   eprintf("null escape sequence\n");
default:
/* "\O[OOO]" octal escape */
for (m = i + 1; m < i + 1 + 4 && m < len; m++)
if (s[m] < '0' || s[m] > '7')
break;
if (m == i + 1)
-   eprintf("%s: invalid escape sequence '\\%c'\n", argv0, s[i + 1]);
+   eprintf("invalid escape sequence '\\%c'\n", s[i + 1]);
off += m - i - 1;
for (--m, q = 0, factor = 1; m > i; m--) {
q += (s[m] - '0') * factor;
-- 
2.11.0




[hackers] [PATCH v2][ubase] Add ul(1)

2017-02-03 Thread Mattias Andrée
This command will be used by the pager to format output
from groff, which is necessary for man pages.

I also added the option to remove formatting, which can
be used by programs that want to process the output of
a program but don't want to bother with the formatting.

ul(1) in util-linux has the options -i and -t (alias: -T);
these are left out because I don't think they are useful.

-t is completely unnecessary because you can just
use env(1) to set $TERM.

-i could be useful if you needed plain text but want
some way to know how it was intended to be formatted.
If anyone needs this it could be added in the future,
however, I think that should be done in a separate
tool.

Features that will be implemented in the future, but are
not implemented at the moment (not in util-linux either):

Recognition of ANSI escape sequences.

Remapping ANSI escape sequences. less(1) can
do this, but it would be better to have it in ul(1).

Support for combining diacritical marks.

Support for moving combining diacritical marks to
be in front of the character they combine with. This
is useful because terminals may otherwise not correctly
display text that uses combining diacritical marks.

(maybe) Add support to display control characters
(cat -v, but with highlighting), which I found to be
extremely useful. Of course, unlike cat -v, it should
not mess up non-ASCII characters.

ul(1) should also, in the future, not use hardcoded ANSI
escape sequences.

Signed-off-by: Mattias Andrée 
---
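
As an illustration of the "remove formatting" option mentioned above:
nroff-style output marks bold and underline by overstriking with
backspace (for example "_\bf" for an underlined f). The sketch below is
not taken from ul.c. It is a byte-oriented simplification that assumes
single-byte characters, and strip_overstrike is a hypothetical name.

```c
#include <stddef.h>

/*
 * Remove backspace overstrikes in place: each '\b' discards the
 * character written before it, so only the last character printed
 * in each cell survives. Returns the new length of the string.
 */
static size_t
strip_overstrike(char *s)
{
	char *r, *w;

	for (r = w = s; *r; r++) {
		if (*r == '\b') {
			if (w > s)
				w--; /* drop the overstruck character */
		} else {
			*w++ = *r;
		}
	}
	*w = '\0';
	return (size_t)(w - s);
}
```

Applied to "_\bf_\bo_\bo" this leaves "foo". Handling multi-byte
characters would need rune-aware reading, which is what libutf is for.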
 Makefile  |  35 ++-
 libutf/Makefile   |   6 +
 libutf/fgetrune.c |  36 +++
 libutf/fputrune.c |  27 ++
 libutf/isalnumrune.c  |   9 +
 libutf/isalpharune.c  | 718 ++
 libutf/isblankrune.c  |   9 +
 libutf/iscntrlrune.c  |  18 ++
 libutf/isdigitrune.c  |  70 +
 libutf/isgraphrune.c  |   9 +
 libutf/isprintrune.c  |  10 +
 libutf/ispunctrune.c  |   9 +
 libutf/isspacerune.c  |  31 +++
 libutf/istitlerune.c  |  31 +++
 libutf/isxdigitrune.c |   9 +
 libutf/lowerrune.c| 334 +++
 libutf/mkrunetype.awk | 240 +
 libutf/rune.c | 148 +++
 libutf/runetype.c |  41 +++
 libutf/runetype.h |  26 ++
 libutf/upperrune.c| 251 ++
 libutf/utf.c  | 129 +
 libutf/utftorunestr.c |  13 +
 libutil/fshut.c   |  43 +++
 ul.1  |  39 +++
 ul.c  | 392 +++
 utf.h |  67 +
 util.h|   7 +
 28 files changed, 2755 insertions(+), 2 deletions(-)
 create mode 100644 libutf/Makefile
 create mode 100644 libutf/fgetrune.c
 create mode 100644 libutf/fputrune.c
 create mode 100644 libutf/isalnumrune.c
 create mode 100644 libutf/isalpharune.c
 create mode 100644 libutf/isblankrune.c
 create mode 100644 libutf/iscntrlrune.c
 create mode 100644 libutf/isdigitrune.c
 create mode 100644 libutf/isgraphrune.c
 create mode 100644 libutf/isprintrune.c
 create mode 100644 libutf/ispunctrune.c
 create mode 100644 libutf/isspacerune.c
 create mode 100644 libutf/istitlerune.c
 create mode 100644 libutf/isxdigitrune.c
 create mode 100644 libutf/lowerrune.c
 create mode 100644 libutf/mkrunetype.awk
 create mode 100644 libutf/rune.c
 create mode 100644 libutf/runetype.c
 create mode 100644 libutf/runetype.h
 create mode 100644 libutf/upperrune.c
 create mode 100644 libutf/utf.c
 create mode 100644 libutf/utftorunestr.c
 create mode 100644 libutil/fshut.c
 create mode 100644 ul.1
 create mode 100644 ul.c
 create mode 100644 utf.h

diff --git a/Makefile b/Makefile
index 453607c..3b6d219 100644
--- a/Makefile
+++ b/Makefile
@@ -12,8 +12,31 @@ HDR = \
reboot.h \
rtc.h\
text.h   \
+   utf.h\
util.h
 
+LIBUTF = libutf.a
+LIBUTFSRC =\
+   libutf/fgetrune.c\
+   libutf/fputrune.c\
+   libutf/isalnumrune.c\
+   libutf/isalpharune.c\
+   libutf/isblankrune.c\
+   libutf/iscntrlrune.c\
+   libutf/isdigitrune.c\
+   libutf/isgraphrune.c\
+   libutf/isprintrune.c\
+   libutf/ispunctrune.c\
+   libutf/isspacerune.c\
+   libutf/istitlerune.c\
+   libutf/isxdigitrune.c\
+   libutf/lowerrune.c\
+   libutf/rune.c\
+   libutf/runetype.c\
+   libutf/upperrune.c\
+   libutf/utf.c\
+   libutf/utftorunestr.c
+
 LIBUTIL = libutil.a
 LIBUTILSRC = \
libutil/agetcwd.c\
@@ -25,6 +48,7 @@ LIBUTILSRC = \
libutil/estrtol.c\
libutil/estrtoul.c   \
libutil/explicit_bzero.c \
+   libutil/fshut.c  \
libutil/passwd.c \
libutil/proc.c   \
libutil/putword.c\
@@ -34,7 +58,7 @@ LIBUTILSRC = \
libutil/strtonum.c   \
libutil/tty.c
 
-LIB = $(LIBUTIL)
+LIB = $(LIBUTF) $(LIBUTIL)
 
 BIN = \
chvt  

[hackers] [PATCH][ubase] Add ul(1)

2017-02-03 Thread Mattias Andrée
This command will be used by the pager to format output
from groff, which is necessary for man pages.

I also added the option to remove formatting, which can
be used by programs that want to process the output of
a program but don't want to bother with the formatting.

ul(1) in util-linux has the options -i and -t (alias: -T);
these are left out because I don't think they are useful.

-t is completely unnecessary because you can just
use env(1) to set $TERM.

-i could be useful if you needed plain text but want
some way to know how it was intended to be formatted.
If anyone needs this it could be added in the future,
however, I think that should be done in a separate
tool.

Features that will be implemented in the future, but are
not implemented at the moment (not in util-linux either):

Recognition of ANSI escape sequences.

Remapping ANSI escape sequences. less(1) can
do this, but it would be better to have it in ul(1).

Support for combining diacritical marks.

Support for moving combining diacritical marks to
be in front of the character they combine with. This
is useful because terminals may otherwise not correctly
display text that uses combining diacritical marks.

(maybe) Add support to display control characters
(cat -v, but with highlighting), which I found to be
extremely useful. Of course, unlike cat -v, it should
not mess up non-ASCII characters.

ul(1) should also, in the future, not use hardcoded ANSI
escape sequences.

Signed-off-by: Mattias Andrée 
---
 Makefile  |  35 ++-
 libutf/Makefile   |   6 +
 libutf/fgetrune.c |  36 +++
 libutf/fputrune.c |  27 ++
 libutf/isalnumrune.c  |   9 +
 libutf/isalpharune.c  | 718 ++
 libutf/isblankrune.c  |   9 +
 libutf/iscntrlrune.c  |  18 ++
 libutf/isdigitrune.c  |  70 +
 libutf/isgraphrune.c  |   9 +
 libutf/isprintrune.c  |  10 +
 libutf/ispunctrune.c  |   9 +
 libutf/isspacerune.c  |  31 +++
 libutf/istitlerune.c  |  31 +++
 libutf/isxdigitrune.c |   9 +
 libutf/lowerrune.c| 334 +++
 libutf/mkrunetype.awk | 240 +
 libutf/rune.c | 148 +++
 libutf/runetype.c |  41 +++
 libutf/runetype.h |  26 ++
 libutf/upperrune.c| 251 ++
 libutf/utf.c  | 129 +
 libutf/utftorunestr.c |  13 +
 libutil/fshut.c   |  43 +++
 ul.1  |  39 +++
 ul.c  | 384 +++
 utf.h |  67 +
 util.h|   7 +
 28 files changed, 2747 insertions(+), 2 deletions(-)
 create mode 100644 libutf/Makefile
 create mode 100644 libutf/fgetrune.c
 create mode 100644 libutf/fputrune.c
 create mode 100644 libutf/isalnumrune.c
 create mode 100644 libutf/isalpharune.c
 create mode 100644 libutf/isblankrune.c
 create mode 100644 libutf/iscntrlrune.c
 create mode 100644 libutf/isdigitrune.c
 create mode 100644 libutf/isgraphrune.c
 create mode 100644 libutf/isprintrune.c
 create mode 100644 libutf/ispunctrune.c
 create mode 100644 libutf/isspacerune.c
 create mode 100644 libutf/istitlerune.c
 create mode 100644 libutf/isxdigitrune.c
 create mode 100644 libutf/lowerrune.c
 create mode 100644 libutf/mkrunetype.awk
 create mode 100644 libutf/rune.c
 create mode 100644 libutf/runetype.c
 create mode 100644 libutf/runetype.h
 create mode 100644 libutf/upperrune.c
 create mode 100644 libutf/utf.c
 create mode 100644 libutf/utftorunestr.c
 create mode 100644 libutil/fshut.c
 create mode 100644 ul.1
 create mode 100644 ul.c
 create mode 100644 utf.h

diff --git a/Makefile b/Makefile
index 453607c..3b6d219 100644
--- a/Makefile
+++ b/Makefile
@@ -12,8 +12,31 @@ HDR = \
reboot.h \
rtc.h\
text.h   \
+   utf.h\
util.h
 
+LIBUTF = libutf.a
+LIBUTFSRC =\
+   libutf/fgetrune.c\
+   libutf/fputrune.c\
+   libutf/isalnumrune.c\
+   libutf/isalpharune.c\
+   libutf/isblankrune.c\
+   libutf/iscntrlrune.c\
+   libutf/isdigitrune.c\
+   libutf/isgraphrune.c\
+   libutf/isprintrune.c\
+   libutf/ispunctrune.c\
+   libutf/isspacerune.c\
+   libutf/istitlerune.c\
+   libutf/isxdigitrune.c\
+   libutf/lowerrune.c\
+   libutf/rune.c\
+   libutf/runetype.c\
+   libutf/upperrune.c\
+   libutf/utf.c\
+   libutf/utftorunestr.c
+
 LIBUTIL = libutil.a
 LIBUTILSRC = \
libutil/agetcwd.c\
@@ -25,6 +48,7 @@ LIBUTILSRC = \
libutil/estrtol.c\
libutil/estrtoul.c   \
libutil/explicit_bzero.c \
+   libutil/fshut.c  \
libutil/passwd.c \
libutil/proc.c   \
libutil/putword.c\
@@ -34,7 +58,7 @@ LIBUTILSRC = \
libutil/strtonum.c   \
libutil/tty.c
 
-LIB = $(LIBUTIL)
+LIB = $(LIBUTF) $(LIBUTIL)
 
 BIN = \
chvt  

Re: [hackers] [PATCH v2][sbase] Add man(1)

2017-02-02 Thread Mattias Andrée
On Thu, 2 Feb 2017 12:53:08 -0800
Michael Forney  wrote:

> On 2/1/17, Mattias Andrée  wrote:
> > I feel that it is kind of a given that man(1) is needed
> > (why else are we writing man pages) however, it can be
> > argued in which project to put it. The man-db
> > implementation everyone's using is just way too
> > overkill.  
> 
> mdocml is quite good and includes a man(1)
> implementation. It supports mdoc(7) and man(7), and does
> not depend on groff.
> 

Just like man-db, mdocml has stuff we probably can do without,
so I would still suggest adding my patch, but if its roff
implementation is less sucky than groff, that part could be
used instead of groff.

maandree




[hackers] [PATCH v3][sbase] Add man(1)

2017-02-01 Thread Mattias Andrée
Signed-off-by: Mattias Andrée 
---
 Makefile |   5 +
 README   |   1 +
 config.def.h |  23 ++
 man.1| 156 +
 man.c| 726 +++
 5 files changed, 911 insertions(+)
 create mode 100644 config.def.h
 create mode 100644 man.1
 create mode 100644 man.c

diff --git a/Makefile b/Makefile
index 9ec9990..1e507f9 100644
--- a/Makefile
+++ b/Makefile
@@ -6,6 +6,7 @@ include config.mk
 HDR =\
arg.h\
compat.h\
+   config.h\
crypt.h\
fs.h\
md5.h\
@@ -121,6 +122,7 @@ BIN =\
logger\
logname\
ls\
+   man\
md5sum\
mkdir\
mkfifo\
@@ -192,6 +194,9 @@ $(BIN): $(LIB) $(@:=.o)
 
 $(OBJ): $(HDR) config.mk
 
+config.h:
+   cp config.def.h $@
+
 .o:
$(CC) $(LDFLAGS) -o $@ $< $(LIB)
 
diff --git a/README b/README
index da2e500..e518796 100644
--- a/README
+++ b/README
@@ -50,6 +50,7 @@ The following tools are implemented:
 0=*|o logger  .
 0=*|o logname .
 0#* o ls  (-C, -k, -m, -p, -s, -x)
+0#* o man (-k)
 0=*|x md5sum  .
 0=*|o mkdir   .
 0=*|o mkfifo  .
diff --git a/config.def.h b/config.def.h
new file mode 100644
index 000..fe5b8f8
--- /dev/null
+++ b/config.def.h
@@ -0,0 +1,23 @@
+/* See LICENSE file for copyright and license details. */
+
+#define GROFF "/usr/bin/groff"
+#define PRECONV "/usr/bin/preconv"
+
+#define GUNZIP     "exec /usr/bin/gunzip"
+#define UNCOMPRESS "exec /usr/bin/uncompress"
+#define BUNZIP2    "exec /usr/bin/bunzip2"
+#define UNXZ       "exec /usr/bin/unxz"
+#define UNLZMA     "exec /usr/bin/unlzma"
+#define LUNZIP     "exec /usr/bin/lzip -d"
+#define UNLZ4      "exec /usr/bin/unlz4"
+
+#define MANPATHTRANS "/=/usr"
+#define MANPATH "/usr/local/share/man:/usr/share/man"
+#define MANSECT "3p:3:0:2:1:8:n:l:5:4:6:7:x:9:a:b:c:d:e:f:g:h:i:j:k:l:m:o:p:q:r:s:t:u:v:w:u:y:z"
+#define MANBINSECT "1:8"
+#define MANETCSECT "5"
+#define MANCOMP "gz="GUNZIP":z="GUNZIP":Z="UNCOMPRESS":bz2="BUNZIP2":xz="UNXZ":lzma="UNLZMA":lz="LUNZIP":lz4="UNLZ4
+#define MANPAGER "exec /usr/bin/less"
+
+#define MANPAGES_ARE_PRECONVED 0
+#define MANPAGES_ARE_PREPROCESSED 0
diff --git a/man.1 b/man.1
new file mode 100644
index 000..da801c7
--- /dev/null
+++ b/man.1
@@ -0,0 +1,156 @@
+.Dd 2017-02-01
+.Dt MAN 1
+.Os sbase
+.Sh NAME
+.Nm man
+.Nd display online reference manuals
+.Sh SYNOPSIS
+.Nm
+.Op Fl 7 | Fl 8
+.Op Ar section
+.Ar name
+.Op Ar subcommand ...
+.Sh DESCRIPTION
+.Nm
+displays the manual with the selected
+.Ar name .
+.Ar name
+is usually a command, C function or system call.
+All
+.Ar subcommand
+are appended to
+.Ar name
+with \fB-\fP joining the arguments.
+Multiple manuals have the same name, in this case
+.Ar section
+can be used to specify which of them to display.
+.P
+The table below shows common
+.Ar section
+numbers.
+.Bd -literal -offset left
+1  Executable commands, as implemented
+1p Executable commands, as specified by POSIX
+2  System calls
+3  Library calls, as implemented
+3p Library calls, as specified by POSIX
+4  Special files
+5  Configuration files, file formats and conventions
+6  Entertainment
+7  Miscellanea
+8  System administration commands
+9  Kernel routines
+0  Header files, as implemented
+0p Header files, as specified by POSIX
+l  Tcl/Tk commands
+n  SQL commands
+.Ed
+.Sh OPTIONS
+.Bl -tag -width Ds
+.It Fl 7
+Convert output to ASCII.
+.It Fl 8
+Allow UTF-8 output.
+.El
+.Sh ENVIRONMENT VARIABLES
+.Bl -tag -width Ds
+.It Ev MANPATH
+Colon-separated list of where manual are stored.
+Each listed directory shall directions for each
+section rather than the manuals themself.
+.It Ev MANPATHTRANS
+Colon-separated list of directory remappings used
+to find the manual when command is given by its
+path. Each entry is of the format
+.Va bindir Ns Cm = Ns Va mandir .
+If the parent directory of the directory
+.Ar name
+matches
+.Va bindir ,
+.Va mandir
+is used as
+.Ar MANPATH .
+.It Ev MANSECT
+Colon-separated list of
+.Ar section
+numbers. If a manual is found under two or more
+sections, the first named section in
+.Ev MANSECT
+is selected.
+.It Ev MANBINSECT
+Similar to
+.Ev MANSECT ,
+but is used when
+.Ar name
+is the path to an executable file.
+.It Ev MANETCSECT
+Similar to
+.Ev MANSECT ,
+but is used when
+.Ar name
+is the path to file in
+.Pa /etc .
+.It Ev MANCOMP
+Colon-separated list of compressions. Each entry
+is formated as
+.Va extension Ns Cm = Ns Va command ,
+where
+.Va extension
+is the typical file name extension used on files
+compressed such that
+.Nm sh Fl c Li ' Ns Va command Ns Li '
+decompresses the files.
+Thi

Re: [hackers] [PATCH v2][sbase] Add man(1)

2017-02-01 Thread Mattias Andrée
On Wed, 1 Feb 2017 10:35:10 +0100
Laslo Hunhold  wrote:

> On Wed,  1 Feb 2017 10:28:06 +0100
> Mattias Andrée  wrote:
> 
> Hey Mattias,
> 
> > diff --git a/config.def.h b/config.def.h
> > new file mode 100644
> > index 000..14e1440
> > --- /dev/null
> > +++ b/config.def.h
> > @@ -0,0 +1,19 @@
> > +/* See LICENSE file for copyright and license details.
> > */ +
> > +#define GROFF "/usr/bin/groff"
> > +
> > +#define GUNZIP "exec /usr/bin/gunzip"
> > +#define UNCOMPRESS "exec /usr/bin/uncompress"
> > +#define BUNZIP2"exec /usr/bin/bunzip2"
> > +#define UNXZ   "exec /usr/bin/unxz"
> > +#define UNLZMA "exec /usr/bin/unlzma"
> > +#define LUNZIP "exec /usr/bin/lzip -d"
> > +#define UNLZ4  "exec /usr/bin/unlz4"
> > +
> > +#define MANPATHTRANS "/=/usr"
> > +#define MANPATH "/usr/local/share/man:/usr/share/man"
> > +#define MANSECT
> > "3p:3:0:2:1:8:n:l:5:4:6:7:x:9:a:b:c:d:e:f:g:h:i:j:k:l:m:o:p:q:r:s:t:u:v:w:u:y:z"
> > +#define MANBINSECT "1:8" +#define MANETCSECT "5"
> > +#define MANCOMP
> > "gz="GUNZIP":z="GUNZIP":Z="UNCOMPRESS":bz2="BUNZIP2":xz="UNXZ":lzma="UNLZMA":lz="LUNZIP":lz4="UNLZ4
> > +#define MANPAGER "exec /usr/bin/less" diff --git
> > a/man.1 b/man.1  
> 
> isn't adding man(1) a bit of an overkill? I mean, no
> offense, but you make a lot of assumptions about the
> environment there (e.g. the PATH, which you just
> "overwrite" as "/usr/bin").
> 
> Maybe you all discussed this on IRC, but as I already said
> Mattias, I really am impressed every time you propose
> changes like these. Not everybody could churn out such a
> program. However, in case it is rejected, I fear you will be
> discouraged from working on sbase or other suckless
> projects, especially given the magnitude of man(1) and
> other endeavours. Feel free to start a thread here on the
> ml, asking about the need for such tools before starting
> work on them.
> 
> With best regards
> 
> Laslo
> 

I feel that it is kind of a given that man(1) is needed (why
else are we writing man pages); however, it can be argued in
which project to put it. The man-db implementation everyone's
using is just way overkill.

If the user doesn't use /usr/bin, he can just change that in
config.h, or even remove the path from it and replace exec[lv]
with exec[lv]p. However, I specified the path as a (paranoid)
security measure. Apart from that, the only assumptions are
the path to the man pages, which is required, and /etc, which
is standard enough, and man(1) can cope with this being incorrect.

I did think about asking this time, but I reckoned that if you
don't see any need for a man(1) at suckless at all, I'll just
upload it as a personal project, because I definitely want a
simple implementation myself.

I skipped apropos(1) (man -k) and whatis(1), because I never
use them, so this implementation doesn't have a cache of all
man pages, which makes it very simple, and the difficult parts
are handled by less(1) and groff(1) (which sadly also need a
replacement, but I will probably not do that myself, at least
not this year).




[hackers] [PATCH v2][sbase] Add man(1)

2017-02-01 Thread Mattias Andrée
Signed-off-by: Mattias Andrée 
---
 Makefile |   5 +
 README   |   1 +
 config.def.h |  19 ++
 man.1        | 154 ++
 man.c        | 677 +++
 5 files changed, 856 insertions(+)
 create mode 100644 config.def.h
 create mode 100644 man.1
 create mode 100644 man.c

diff --git a/Makefile b/Makefile
index 9ec9990..1e507f9 100644
--- a/Makefile
+++ b/Makefile
@@ -6,6 +6,7 @@ include config.mk
 HDR =\
 	arg.h\
 	compat.h\
+	config.h\
 	crypt.h\
 	fs.h\
 	md5.h\
@@ -121,6 +122,7 @@ BIN =\
 	logger\
 	logname\
 	ls\
+	man\
 	md5sum\
 	mkdir\
 	mkfifo\
@@ -192,6 +194,9 @@ $(BIN): $(LIB) $(@:=.o)
 
 $(OBJ): $(HDR) config.mk
 
+config.h:
+	cp config.def.h $@
+
 .o:
 	$(CC) $(LDFLAGS) -o $@ $< $(LIB)
 
diff --git a/README b/README
index da2e500..e518796 100644
--- a/README
+++ b/README
@@ -50,6 +50,7 @@ The following tools are implemented:
 0=*|o logger  .
 0=*|o logname .
 0#* o ls  (-C, -k, -m, -p, -s, -x)
+0#* o man (-k)
 0=*|x md5sum  .
 0=*|o mkdir   .
 0=*|o mkfifo  .
diff --git a/config.def.h b/config.def.h
new file mode 100644
index 000..14e1440
--- /dev/null
+++ b/config.def.h
@@ -0,0 +1,19 @@
+/* See LICENSE file for copyright and license details. */
+
+#define GROFF "/usr/bin/groff"
+
+#define GUNZIP     "exec /usr/bin/gunzip"
+#define UNCOMPRESS "exec /usr/bin/uncompress"
+#define BUNZIP2    "exec /usr/bin/bunzip2"
+#define UNXZ       "exec /usr/bin/unxz"
+#define UNLZMA     "exec /usr/bin/unlzma"
+#define LUNZIP     "exec /usr/bin/lzip -d"
+#define UNLZ4      "exec /usr/bin/unlz4"
+
+#define MANPATHTRANS "/=/usr"
+#define MANPATH "/usr/local/share/man:/usr/share/man"
+#define MANSECT "3p:3:0:2:1:8:n:l:5:4:6:7:x:9:a:b:c:d:e:f:g:h:i:j:k:l:m:o:p:q:r:s:t:u:v:w:u:y:z"
+#define MANBINSECT "1:8"
+#define MANETCSECT "5"
+#define MANCOMP "gz="GUNZIP":z="GUNZIP":Z="UNCOMPRESS":bz2="BUNZIP2":xz="UNXZ":lzma="UNLZMA":lz="LUNZIP":lz4="UNLZ4
+#define MANPAGER "exec /usr/bin/less"
diff --git a/man.1 b/man.1
new file mode 100644
index 000..9985b2b
--- /dev/null
+++ b/man.1
@@ -0,0 +1,154 @@
+.Dd 2017-02-01
+.Dt MAN 1
+.Os sbase
+.Sh NAME
+.Nm man
+.Nd display online reference manuals
+.Sh SYNOPSIS
+.Nm
+.Op Fl 7 | Fl 8
+.Op Ar section
+.Ar name
+.Op Ar subcommand ...
+.Sh DESCRIPTION
+.Nm
+displays the manual with the selected
+.Ar name .
+.Ar name
+is usually a command, C function or system call.
+All
+.Ar subcommand
+are appended to
+.Ar name
+with \fB-\fP joining the arguments.
+Multiple manuals have the same name, in this case
+.Ar section
+can be used to specify which of them to display.
+.P
+The table below shows common
+.Ar section
+numbers.
+.Bd -literal -offset left
+1  Executable commands, as implemented
+1p Executable commands, as specified by POSIX
+2  System calls
+3  Library calls, as implemented
+3p Library calls, as specified by POSIX
+4  Special files
+5  Configuration files, file formats and conventions
+6  Entertainment
+7  Miscellanea
+8  System administration commands
+9  Kernel routines
+0  Header files, as implemented
+0p Header files, as specified by POSIX
+l  Tcl/Tk commands
+n  SQL commands
+.Ed
+.Sh OPTIONS
+.Bl -tag -width Ds
+.It Fl 7
+Convert output to ASCII.
+.It Fl 8
+Allow to UTF-8.
+.El
+.Sh ENVIRONMENT VARIABLES
+.Bl -tag -width Ds
+.It Ev MANPATH
+Colon-separated list of where manual are stored.
+Each listed directory shall directions for each
+section rather than the manuals themself.
+.It Ev MANPATHTRANS
+Colon-separated list of directory remappings used
+to find the manual when command is given by its
+path. Each entry is of the format
+.Va bindir Ns Cm = Ns Va mandir .
+If the parent directory of the directory
+.Ar name
+matches
+.Va bindir ,
+.Va mandir
+is used as
+.Ar MANPATH .
+.It Ev MANSECT
+Colon-separated list of
+.Ar section
+numbers. If a manual is found under two or more
+sections, the first named section in
+.Ev MANSECT
+is selected.
+.It Ev MANBINSECT
+Similar to
+.Ev MANSECT ,
+but is used when
+.Ar name
+is the path to an executable file.
+.It Ev MANETCSECT ,
+but is used when
+.Ar name
+is the path to file in
+.Pa /etc .
+.It Ev MANCOMP
+Colon-separated list of compressions. Each entry
+is formated as
+.Va extension Ns Cm = Ns Va command ,
+where
+.Va extension
+is the typical file name extension used on files
+compressed such that
+.Nm sh Fl c Li ' Ns Va command Ns Li '
+decompresses the files.
+This variable is ignored when running as root.
+.It Ev MANPAGER
+The pager to used. By default
+.Xr less 1
+is used.
+.It Ev PAGER
+Used instead of
+.

[hackers] [PATCH][sbase] Add man(1)

2017-02-01 Thread Mattias Andrée
Signed-off-by: Mattias Andrée 
---
 Makefile |   5 +
 README   |   1 +
 config.def.h |  17 ++
 man.1| 154 ++
 man.c| 677 +++
 5 files changed, 854 insertions(+)
 create mode 100644 config.def.h
 create mode 100644 man.1
 create mode 100644 man.c

diff --git a/Makefile b/Makefile
index 9ec9990..1e507f9 100644
--- a/Makefile
+++ b/Makefile
@@ -6,6 +6,7 @@ include config.mk
 HDR =\
 	arg.h\
 	compat.h\
+	config.h\
 	crypt.h\
 	fs.h\
 	md5.h\
@@ -121,6 +122,7 @@ BIN =\
 	logger\
 	logname\
 	ls\
+	man\
 	md5sum\
 	mkdir\
 	mkfifo\
@@ -192,6 +194,9 @@ $(BIN): $(LIB) $(@:=.o)
 
 $(OBJ): $(HDR) config.mk
 
+config.h:
+	cp config.def.h $@
+
 .o:
 	$(CC) $(LDFLAGS) -o $@ $< $(LIB)
 
diff --git a/README b/README
index da2e500..e518796 100644
--- a/README
+++ b/README
@@ -50,6 +50,7 @@ The following tools are implemented:
 0=*|o logger  .
 0=*|o logname .
 0#* o ls  (-C, -k, -m, -p, -s, -x)
+0#* o man (-k)
 0=*|x md5sum  .
 0=*|o mkdir   .
 0=*|o mkfifo  .
diff --git a/config.def.h b/config.def.h
new file mode 100644
index 000..503fd2d
--- /dev/null
+++ b/config.def.h
@@ -0,0 +1,17 @@
+/* See LICENSE file for copyright and license details. */
+
+#define GUNZIP     "exec /usr/bin/gunzip"
+#define UNCOMPRESS "exec /usr/bin/uncompress"
+#define BUNZIP2    "exec /usr/bin/bunzip2"
+#define UNXZ       "exec /usr/bin/unxz"
+#define UNLZMA     "exec /usr/bin/unlzma"
+#define LUNZIP     "exec /usr/bin/lzip -d"
+#define UNLZ4      "exec /usr/bin/unlz4"
+
+#define MANPATHTRANS "/=/usr"
+#define MANPATH "/usr/local/share/man:/usr/share/man"
+#define MANSECT "3p:3:0:2:1:8:n:l:5:4:6:7:x:9:a:b:c:d:e:f:g:h:i:j:k:l:m:o:p:q:r:s:t:u:v:w:u:y:z"
+#define MANBINSECT "1:8"
+#define MANETCSECT "5"
+#define MANCOMP "gz="GUNZIP":z="GUNZIP":Z="UNCOMPRESS":bz2="BUNZIP2":xz="UNXZ":lzma="UNLZMA":lz="LUNZIP":lz4="UNLZ4
+#define MANPAGER "exec /usr/bin/less"
diff --git a/man.1 b/man.1
new file mode 100644
index 000..9985b2b
--- /dev/null
+++ b/man.1
@@ -0,0 +1,154 @@
+.Dd 2017-02-01
+.Dt MAN 1
+.Os sbase
+.Sh NAME
+.Nm man
+.Nd display online reference manuals
+.Sh SYNOPSIS
+.Nm
+.Op Fl 7 | Fl 8
+.Op Ar section
+.Ar name
+.Op Ar subcommand ...
+.Sh DESCRIPTION
+.Nm
+displays the manual with the selected
+.Ar name .
+.Ar name
+is usually a command, C function or system call.
+All
+.Ar subcommand
+are appended to
+.Ar name
+with \fB-\fP joining the arguments.
+Multiple manuals have the same name, in this case
+.Ar section
+can be used to specify which of them to display.
+.P
+The table below shows common
+.Ar section
+numbers.
+.Bd -literal -offset left
+1  Executable commands, as implemented
+1p Executable commands, as specified by POSIX
+2  System calls
+3  Library calls, as implemented
+3p Library calls, as specified by POSIX
+4  Special files
+5  Configuration files, file formats and conventions
+6  Entertainment
+7  Miscellanea
+8  System administration commands
+9  Kernel routines
+0  Header files, as implemented
+0p Header files, as specified by POSIX
+l  Tcl/Tk commands
+n  SQL commands
+.Ed
+.Sh OPTIONS
+.Bl -tag -width Ds
+.It Fl 7
+Convert output to ASCII.
+.It Fl 8
+Allow to UTF-8.
+.El
+.Sh ENVIRONMENT VARIABLES
+.Bl -tag -width Ds
+.It Ev MANPATH
+Colon-separated list of where manual are stored.
+Each listed directory shall directions for each
+section rather than the manuals themself.
+.It Ev MANPATHTRANS
+Colon-separated list of directory remappings used
+to find the manual when command is given by its
+path. Each entry is of the format
+.Va bindir Ns Cm = Ns Va mandir .
+If the parent directory of the directory
+.Ar name
+matches
+.Va bindir ,
+.Va mandir
+is used as
+.Ar MANPATH .
+.It Ev MANSECT
+Colon-separated list of
+.Ar section
+numbers. If a manual is found under two or more
+sections, the first named section in
+.Ev MANSECT
+is selected.
+.It Ev MANBINSECT
+Similar to
+.Ev MANSECT ,
+but is used when
+.Ar name
+is the path to an executable file.
+.It Ev MANETCSECT ,
+but is used when
+.Ar name
+is the path to file in
+.Pa /etc .
+.It Ev MANCOMP
+Colon-separated list of compressions. Each entry
+is formated as
+.Va extension Ns Cm = Ns Va command ,
+where
+.Va extension
+is the typical file name extension used on files
+compressed such that
+.Nm sh Fl c Li ' Ns Va command Ns Li '
+decompresses the files.
+This variable is ignored when running as root.
+.It Ev MANPAGER
+The pager to used. By default
+.Xr less 1
+is used.
+.It Ev PAGER
+Used instead of
+.Ev MANPAGER
+if
+.Ev MANPAGER
+is not set.

[hackers] [PATCH][sbase] cp.1: source and dest are not optional

2017-01-29 Thread Mattias Andrée
Signed-off-by: Mattias Andrée 
---
 cp.1 | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/cp.1 b/cp.1
index 54126e2..f74127d 100644
--- a/cp.1
+++ b/cp.1
@@ -11,8 +11,8 @@
 .Fl R
 .Op Fl H | L | P
 .Oc
-.Op Ar source ...
-.Op Ar dest
+.Ar source ...
+.Ar dest
 .Sh DESCRIPTION
 .Nm
 copies
-- 
2.11.0




[hackers] [PATCH][sbase] getconf: fail if any other flag than -v is used

2017-01-26 Thread Mattias Andrée
Signed-off-by: Mattias Andrée 
---
 getconf.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/getconf.c b/getconf.c
index e611659..d927f2d 100644
--- a/getconf.c
+++ b/getconf.c
@@ -33,6 +33,9 @@ main(int argc, char *argv[])
 		/* ignore */
 		EARGF(usage());
 		break;
+	default:
+		usage();
+		break;
 	} ARGEND
 
 	if (argc == 1) {
-- 
2.11.0




[hackers] [PATCH][farbfeld] Add ff2pam(1): convert farbfeld to 16-bit RGBA Portable Arbitrary Map

2017-01-08 Thread Mattias Andrée
Signed-off-by: Mattias Andrée 
---
 Makefile |  2 +-
 ff2pam.1 | 41 +
 ff2pam.c | 70 
 3 files changed, 112 insertions(+), 1 deletion(-)
 create mode 100644 ff2pam.1
 create mode 100644 ff2pam.c

diff --git a/Makefile b/Makefile
index 2177d25..e2b5746 100644
--- a/Makefile
+++ b/Makefile
@@ -2,7 +2,7 @@
 # See LICENSE file for copyright and license details
 include config.mk
 
-BIN = png2ff ff2png jpg2ff ff2jpg ff2ppm
+BIN = png2ff ff2png jpg2ff ff2jpg ff2ppm ff2pam
 SCRIPTS = 2ff
 SRC = ${BIN:=.c}
 HDR = arg.h
diff --git a/ff2pam.1 b/ff2pam.1
new file mode 100644
index 000..3224acd
--- /dev/null
+++ b/ff2pam.1
@@ -0,0 +1,41 @@
+.Dd 2017-01-09
+.Dt FF2PAM 1
+.Os suckless.org
+.Sh NAME
+.Nm ff2pam
+.Nd convert farbfeld to PAM
+.Sh SYNOPSIS
+.Nm
+.Sh DESCRIPTION
+.Nm
+reads a
+.Xr farbfeld 5
+image from stdin, converts it to a 16-bit RGBA PAM image and
+writes the result to stdout.
+.Pp
+In case of an error
+.Nm
+writes a diagnostic message to stderr.
+.Sh EXIT STATUS
+.Bl -tag -width Ds
+.It 0
+Image processed successfully.
+.It 1
+An error occurred.
+.El
+.Sh EXAMPLES
+$
+.Nm
+< image.ff > image.pam
+.Pp
+$ bunzip2 < image.ff.bz2 |
+.Nm
+> image.pam
+.Sh SEE ALSO
+.Xr 2ff 1 ,
+.Xr bunzip2 1 ,
+.Xr bzip2 1 ,
+.Xr png2ff 1 ,
+.Xr farbfeld 5
+.Sh AUTHORS
+.An Mattias Andrée Aq Mt maand...@kth.se
diff --git a/ff2pam.c b/ff2pam.c
new file mode 100644
index 000..b244c68
--- /dev/null
+++ b/ff2pam.c
@@ -0,0 +1,70 @@
+/* See LICENSE file for copyright and license details. */
+#include <arpa/inet.h>
+
+#include <errno.h>
+#include <stdint.h>
+#include <stdio.h>
+#include <string.h>
+#include <unistd.h>
+
+static char *argv0;
+
+int
+main(int argc, char *argv[])
+{
+	uint32_t hdr[4], width, height;
+	char buf[BUFSIZ];
+	ssize_t r, p, n;
+
+	argv0 = argv[0], argc--, argv++;
+
+	if (argc) {
+		fprintf(stderr, "usage: %s\n", argv0);
+		return 1;
+	}
+
+	/* header */
+	if (read(STDIN_FILENO, hdr, sizeof(hdr)) != sizeof(hdr))
+		goto readerr;
+	if (memcmp("farbfeld", hdr, sizeof("farbfeld") - 1)) {
+		fprintf(stderr, "%s: invalid magic value\n", argv0);
+		return 1;
+	}
+	width = ntohl(hdr[2]);
+	height = ntohl(hdr[3]);
+
+	/* write header */
+	printf("P7\n"
+	       "WIDTH %lu\n"
+	       "HEIGHT %lu\n"
+	       "DEPTH 4\n" /* actually the number of channel */
+	       "MAXVAL 65535\n"
+	       "TUPLTYPE RGB_ALPHA\n"
+	       "ENDHDR\n",
+	       (unsigned long int)width, (unsigned long int)height);
+	fflush(stdout);
+	if (ferror(stdout))
+		goto writeerr;
+
+	/* write image */
+	for (;;) {
+		n = read(STDIN_FILENO, buf, sizeof(buf));
+		if (n < 0)
+			goto readerr;
+		if (!n)
+			break;
+		for (p = 0; p < n; p += r) {
+			r = write(STDOUT_FILENO, buf + p, n - p);
+			if (r < 0)
+				goto writeerr;
+		}
+	}
+
+	return 0;
+readerr:
+	fprintf(stderr, "%s: read: %s\n", argv0, strerror(errno));
+	return 1;
+writeerr:
+	fprintf(stderr, "%s: write: %s\n", argv0, strerror(errno));
+	return 1;
+}
-- 
2.11.0
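
For readers who want to see the header handling in isolation, here is a
minimal sketch of decoding the 16-byte farbfeld header (8 bytes of magic
followed by big-endian 32-bit width and height). The function name
parse_ff_header is hypothetical and not part of the patch; it decodes the
bytes by hand rather than using ntohl(), which is an alternative to what
the patch does, not a description of it.

```c
#include <stdint.h>
#include <string.h>

/* Decode a big-endian 32-bit value from four bytes. */
static uint32_t
be32(const unsigned char *p)
{
	return (uint32_t)p[0] << 24 | (uint32_t)p[1] << 16 |
	       (uint32_t)p[2] << 8  | (uint32_t)p[3];
}

/*
 * Hypothetical helper: returns 0 and fills in width and height if hdr
 * starts with the "farbfeld" magic, or -1 otherwise.
 */
static int
parse_ff_header(const unsigned char hdr[16], uint32_t *width, uint32_t *height)
{
	if (memcmp(hdr, "farbfeld", 8))
		return -1;
	*width = be32(hdr + 8);
	*height = be32(hdr + 12);
	return 0;
}
```

Decoding byte by byte avoids depending on <arpa/inet.h> and on the
alignment of the buffer, at the cost of a few more lines.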




Re: [hackers] [sbase][PATCH] grep: remove = flag from readme

2016-12-27 Thread Mattias Andrée
On Tue, 27 Dec 2016 09:26:34 -0800
Evan Gates  wrote:

> On Tue, Dec 27, 2016 at 3:09 AM, Laslo Hunhold
>  wrote:
> > On Wed, 30 Mar 2016 19:01:16 +0200
> > Mattias Andrée  wrote:
> >
> > Hey Mattias,
> >  
> >>   $ echo äö | ./grep [å]
> >>   äö
> >>
> >> This is not what one expects from
> >> a program that supports UTF-8.  
> >
> > as a general note, we may think about adding a
> > setlocale() when we access the regex-engine. What do
> > you guys think?
> >
> > Cheers
> >
> > Laslo
> >
> > --
> > Laslo Hunhold 
> >  
> 
> Given the two options of using setlocale() or writing our
> own regex engine, I think using setlocale() is the less
> sucky solution. If we want to revisit it in the future we
> can but it'll give us working tools now.
> 

We still need a new regex engine to support NUL bytes,
but perhaps that can be circumvented. We also need a
faster engine; currently both musl's and glibc's are too
slow for any serious grepping.
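
A minimal sketch of the setlocale() idea from the quoted discussion,
assuming the POSIX <regex.h> interface. The helper names
use_environment_locale and matches are hypothetical and not sbase code.
Without a locale set, a libc may treat the pattern byte by byte, so the
UTF-8 bytes of “å” inside a bracket expression can match the first byte
of “ä”. Whether selecting a UTF-8 locale changes that depends on the libc
and on the locales available at run time, so no output is claimed here.

```c
#include <locale.h>
#include <regex.h>
#include <stddef.h>

/* Adopt the character encoding from the environment (LANG, LC_*). */
static void
use_environment_locale(void)
{
	setlocale(LC_CTYPE, "");
}

/*
 * Returns 1 if text matches the basic regular expression pattern,
 * 0 if it does not, and -1 if the pattern cannot be compiled.
 */
static int
matches(const char *pattern, const char *text)
{
	regex_t re;
	int r;

	if (regcomp(&re, pattern, REG_NOSUB))
		return -1;
	r = regexec(&re, text, 0, NULL, 0);
	regfree(&re);
	return r == 0;
}
```

A tool would call use_environment_locale() once at start-up, before the
first regcomp().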




Re: [hackers] [sbase][PATCH] Add base64(1), base32(1), and base16(1)

2016-12-27 Thread Mattias Andrée
On Tue, 27 Dec 2016 12:01:01 +0100
Laslo Hunhold  wrote:

> On Tue, 29 Mar 2016 18:59:24 +0200
> Mattias Andrée  wrote:
> 
> Hey Mattias,
> 
> what justifies the existence of base16(1), base32(1) and
> base64(1)? I think there need to be good reasons, given
> it would add almost 500 lines of code to maintain to
> sbase.
> 
> I think I am not the only one here who really appreciates
> your work on sbase. So, maybe in the future, if you plan
> on working on something, maybe you could post your ideas
> here, so you are not wasting time on things that get
> rejected anyway.
> 
> Cheers
> 
> Laslo
> 

I cannot remember right now, but I think it's necessary
for sbase (but that doesn't make it less code to maintain).

But I guess base64(1) can be used for obfuscation or
escaping binary data in text files.




Re: [hackers] [sbase][PATCH] Add nologin(8) (from ubase) and simplify it

2016-12-27 Thread Mattias Andrée
On Tue, 27 Dec 2016 12:14:58 +0100
Laslo Hunhold  wrote:

> On Mon,  4 Apr 2016 13:22:05 +0200
> Mattias Andrée  wrote:
> 
> Hey Mattias,
> 
> >  
> 
> do we really need the option to specify an "input file"?
> Even the shadow-utils do not support an input file.

util-linux supports /etc/nologin.txt, but it doesn't
look like other implementations do. I can't think of
any use case that cannot be solved better with
existing tools, so it can be removed.

> 
> Cheers
> 
> Laslo
> 





Re: [hackers] [sbase][PATCH] Add tac(1)

2016-12-27 Thread Mattias Andrée
On Tue, 27 Dec 2016 16:07:01 +0100
Laslo Hunhold  wrote:

> On Tue, 27 Dec 2016 16:05:19 +0100
> Mattias Andrée  wrote:
> 
> Hey Mattias,
>  
> > I actually use tac(1) a lot, but I can't think of
> > anything I have used it for right now. However, it can
> > be used for reversing the output after sort(1), however
> > POSIX specifies -r for sort(1p) which does this, but
> > chances are your common user is not aware of this.  
> 
> but sort(1) requires sorted input, whereas tac(1) can
> operate on any input.

sort(1) sorts its input. And yes, tac(1) is more
flexible, but that was the only use case I could think
of off the top of my head. But now I remember that I
have used it a number of times to reverse the output
of find(1) to get directories listed after the files
within those directories.

> 
> > I can't see the rationale for adding this behaviour to
> > tail(1). If it is added to tail, the flag would do two
> > things instead of one thing: reversing the output, and
> > output the entire file. It would make more sense to add
> > it to cat(1), perhaps you meant to write “cat”. I would
> > think that this is a good idea, but since tac(1)
> > already exists and -r for cat(1) doesn't, I think it is
> > better to go with tac(1), but I'm flexible.  
> 
> What do the others think?
> 
> Cheers
> 
> Laslo
> 





Re: [hackers] [ubase] [PATCH] install: ignore -s

2016-12-27 Thread Mattias Andrée
On Tue, 27 Dec 2016 13:56:07 +0100
Laslo Hunhold  wrote:

> On Sat,  3 Dec 2016 12:51:14 +0100
> Mattias Andrée  wrote:
> 
> Hey Mattias,
> 
> > The -s flag previously called strip(1) on the installed
> > file. This patch changes install(1)'s behaviour to
> > ignore -s.
> > 
> > Many makefiles use the -s flag, so it has to be
> > recognised for compatibility, however it does not have
> > to do anything because symbols do not negatively affect
> > the functionality of binaries.
> > 
> > Ignoring -s has the added benefit that users do not
> > need to edit makefiles if they want the symbols that
> > are useful for debugging. If the user wants to strip
> > away symbols, it can be done manually or automatically
> > by the package manager.  
> 
> thanks, applied! I favor just not documenting the s-flag
> in xinstall.1, but understand both sides of the argument.
> We have many examples in sbase, like sort -m, but also as
> we can see install -c and others.
> 
> Maybe we can discuss this here. Another alternative would
> be to put that in a NOTES section. I really don't want to
> see the manpages filled up with ignored options, making
> them harder to read. On the other hand, a manpage should
> reflect the code and no matter how we put it, we do
> handle the c and s flags specially here.
> 
> Cheers
> 
> Laslo
> 

What about adding an IGNORED OPTIONS section after OPTIONS,
where it is stated what the ignored options were intended
to do, and perhaps a rationale for why it is ignored? This
way users can be sure that it is safe to use a flag and
that we will not add a behaviour to it in the future that
they did not foresee.




Re: [hackers] [sbase][PATCH] Add tac(1)

2016-12-27 Thread Mattias Andrée
On Tue, 27 Dec 2016 11:29:58 +0100
Laslo Hunhold  wrote:

> On Sat, 26 Mar 2016 12:08:28 +0100
> Mattias Andrée  wrote:
> 
> Hey Mattias,
> 
> > Signed-off-by: Mattias Andrée 
> > ---
> >  Makefile           |  1 +
> >  README             |  1 +
> >  libutil/getlines.c |  3 ++-
> >  tac.1              | 20 
> >  tac.c              | 69 ++
> >  text.h             |  3 ++-  
> 
> wouldn't it be better to add the r-flag to tail(1)? I
> personally don't see tac(1) at all in the wild, but prove
> me wrong in case I'm missing something.

I actually use tac(1) a lot, but I can't think of anything
I have used it for right now. It can be used for reversing
the output of sort(1); POSIX does specify -r for sort(1p),
which does this, but chances are your common user is not
aware of it.

I can't see the rationale for adding this behaviour to tail(1).
If it is added to tail, the flag would do two things instead
of one: reverse the output, and output the entire file.
It would make more sense to add it to cat(1); perhaps you
meant to write “cat”. I would think that this is a good idea,
but since tac(1) already exists and -r for cat(1) doesn't, I
think it is better to go with tac(1), but I'm flexible.

> 
> Cheers
> 
> Laslo
> 





Re: [hackers] [sbase][PATCH] basename, dirname, printf: recognise -- and fail if options are used.

2016-12-27 Thread Mattias Andrée
On Tue, 27 Dec 2016 15:30:18 +0100
Laslo Hunhold  wrote:

> On Tue, 27 Dec 2016 15:26:09 +0100
> Mattias Andrée  wrote:
> 
> Hey Mattias,
> 
> > POSIX explicitly states that echo(1) shall treat “--”
> > as a string operand and not as a delimiter. My guess
> > is that this is for historical reasons, much like the
> > existence of echo(1) itself.  
> 
> ah now I remember, thanks. You know, sbase takes the
> liberty of moving away from Posix where consistency is a
> concern. In this case I think our result from a long
> debate was to say: Well, we just treat "--" as a normal
> string for tools that do not accept any flags.
> 
> For instance, if echo does not interpret --, why doesn't
> printf also follow echo in this behaviour?
> 
> Cheers
> 
> Laslo
> 

Okay, I personally do not agree with this and see echo(1)
as an abomination (it treats any unrecognised flags as
strings), but if you had a debate on it, keep to what you
agreed to.




Re: [hackers] [sbase][PATCH] Improvements to sleep(1):

2016-12-27 Thread Mattias Andrée
On Tue, 27 Dec 2016 11:45:39 +0100
Laslo Hunhold  wrote:

> On Sat, 26 Mar 2016 18:31:57 +0100
> Mattias Andrée  wrote:
> 
> Hey Mattias,
> 
> > - Add support for floating-point numbers,
> >   which you will find in other implementations.  
> 
> I do not favor this aspect, as Dimitris already
> established well.
> 
> > - Add support for suffixes [s|m|h|d], as
> >   found in GNU sleep, which is very useful.  
> 
> This is the only thing I could remotely live with, but
> this would require us to hand-fiddle with strtod() and
> have a "lookup-table" for residuums and multiplication
> factors (in seconds), e.g.
> 
>   "s" -> 1
>   "m" -> 60
>   "h" -> 60 * 60
>   "d" -> 24 * 60 * 60
> 
> and then do a
> 
>   for (... i ...) {
>           if (!strcmp(endptr, lookuptable[i].str)) {
>                   sleeptime *= lookuptable[i].fac;
>           }
>   }
> 
> However, anything involving longer "waiting" times than 1
> hour should be done with cron(1). I assume that a naïve
> programmer would get the idea of having a rc-script that
> just idles for 24 hours in an infinite loop, and having
> flags in sleep encouraging this is not helpful to break
> this premise.
> 
> > - Add support for multiple arguments. Not
> >   really necessary, but brings compatibility
> >   at barely any cost.  
> 
> What is the purpose of this aspect?
> 
> Cheers
> 
> Laslo
> 

I have withdrawn this proposal.

But the idea of having multiple arguments is that you
can write for example “1h 30m”. Of course, this is not
important, as you can just as well write “90m”, and in
cases where the calculations are that simple you can
use expr(1), but implementing it didn't really complicate
anything and it brings compatibility.

Waits shorter than 1s are useful in things like rc-scripts
that need to wait for a short period. And longer waits
are useful in cases such as setting up a simple
alarm if you want to take a short rest, or if you have
a lot of things running in different terminals (because
you want to see the output of each) but you don't want
all of them to run at the same time.
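
As an illustration of the suffix and multiple-argument handling discussed
in this thread, here is a rough sketch built on strtod() and a lookup
table of multiplication factors. It is not the withdrawn patch: the names
parse_duration and total_duration are hypothetical, and the table layout
only mirrors the one suggested in the quoted message.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Assumed lookup table: suffix -> multiplication factor in seconds. */
static const struct {
	const char *str;
	double fac;
} lookuptable[] = {
	{ "",  1 },
	{ "s", 1 },
	{ "m", 60 },
	{ "h", 60 * 60 },
	{ "d", 24 * 60 * 60 },
};

/*
 * Parse one operand such as "90m" or "1.5h".
 * Returns the duration in seconds, or -1 if the operand is malformed.
 */
static double
parse_duration(const char *arg)
{
	char *endptr;
	double sleeptime;
	size_t i;

	errno = 0;
	sleeptime = strtod(arg, &endptr);
	/* !(x >= 0) also rejects NaN, which compares false to everything */
	if (errno || endptr == arg || !(sleeptime >= 0))
		return -1;

	for (i = 0; i < sizeof(lookuptable) / sizeof(*lookuptable); i++)
		if (!strcmp(endptr, lookuptable[i].str))
			return sleeptime * lookuptable[i].fac;

	return -1; /* unknown suffix */
}

/* Sum several operands, so that "1h" "30m" behaves like "90m". */
static double
total_duration(int argc, char *argv[])
{
	double total = 0, t;
	int i;

	for (i = 0; i < argc; i++) {
		if ((t = parse_duration(argv[i])) < 0)
			return -1;
		total += t;
	}
	return total;
}
```

A caller would pass the operands left after option parsing to
total_duration() and then sleep for the returned number of seconds, for
example with nanosleep(2) to keep sub-second precision.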




Re: [hackers] [ubase][PATCH] Add setsid(1) from sbase with -c added

2016-12-27 Thread Mattias Andrée
On Tue, 27 Dec 2016 12:05:37 +0100
Laslo Hunhold  wrote:

> On Tue, 29 Mar 2016 20:39:37 +0200
> Mattias Andrée  wrote:
> 
> Hey Mattias,
> 
> >  
> 
> can you give some motivation for adding the c-flag?

My sh-based getty replacement requires it.

> 
> Cheers
> 
> Laslo
> 





Re: [hackers] [sbase][PATCH] test: add -E

2016-12-27 Thread Mattias Andrée
It is not used elsewhere, and I have learned that shells
implement their own version of test(1) because they
add flags that can only be implemented inside the shell,
so adding new functionality to test(1) is meaningless
even if the functionality is useful.

On Tue, 27 Dec 2016 11:52:14 +0100
Laslo Hunhold  wrote:

> On Mon, 28 Mar 2016 23:50:42 +0200
> Mattias Andrée  wrote:
> 
> Hey Mattias,
> 
> > This is a non-standard extension.
> > 
> > It tests whether a file exists, is a directory, and is
> > empty. Tests against non-existing files,
> > non-directories and non-empty directories cause exit
> > status 1.
> > 
> > It is a sane alternative to writing
> > 
> >   test -z "$(ls -A "$1")"  
> 
> I think as with any discussion about adding nonstandard
> extensions, the relevant question is whether this is
> widely used out there. Can you give some examples where it
> is used?
> 
> Cheers
> 
> Laslo
> 
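
For reference, the check described in the quoted proposal (the file
exists, is a directory, and is empty) can be sketched with
opendir(3)/readdir(3) roughly as follows. This is not the code from the
patch, which is not shown in this message; the function name
is_empty_dir is hypothetical.

```c
#include <dirent.h>
#include <string.h>

/*
 * Returns 1 if path names an existing, empty directory, 0 otherwise
 * (including when path does not exist or is not a directory).
 * A readdir() error is not distinguished from end-of-directory here.
 */
static int
is_empty_dir(const char *path)
{
	DIR *dir;
	struct dirent *ent;
	int empty = 1;

	if (!(dir = opendir(path)))
		return 0; /* ENOENT, ENOTDIR, EACCES, ... */

	while ((ent = readdir(dir))) {
		/* "." and ".." do not count as directory content */
		if (strcmp(ent->d_name, ".") && strcmp(ent->d_name, "..")) {
			empty = 0;
			break;
		}
	}
	closedir(dir);
	return empty;
}
```

Unlike test -z "$(ls -A "$1")", this stops at the first entry it finds
and does not report a missing directory as empty.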





Re: [hackers] [sbase][PATCH] basename, dirname, printf: recognise -- and fail if options are used.

2016-12-27 Thread Mattias Andrée
On Tue, 27 Dec 2016 15:21:11 +0100
Laslo Hunhold  wrote:

> On Tue, 27 Dec 2016 15:14:44 +0100
> Mattias Andrée  wrote:
> 
> Hey Mattias,
> 
> > POSIX says “--” should be supported unless stated
> > otherwise. I interpret the standard as saying that this
> > also applies to utilities that do not take any flags. And
> > to be on the safe side I think it is a good idea to
> > support “--” for two reasons: (1) existing scripts may
> > require it, and (2) if POSIX adds a flag to the utility
> > in the future we must support “--” now, otherwise
> > compatibility will be broken.  
> 
> so what about "echo --"?
> 
> Cheers
> 
> Laslo
> 

POSIX explicitly states that echo(1) shall treat “--”
as a string operand and not as a delimiter. My guess
is that this is for historical reasons, much like the
existence of echo(1) itself.




Re: [hackers] [sbase][PATCH] basename, dirname, printf: recognise -- and fail if options are used.

2016-12-27 Thread Mattias Andrée
On Tue, 27 Dec 2016 11:26:07 +0100
Laslo Hunhold  wrote:

> On Fri, 25 Mar 2016 21:31:37 +0100
> Mattias Andrée  wrote:
> 
> Hey Mattias,
> 
> >  basename.c |  5 -
> >  dirname.c  |  6 +-
> >  printf.c   | 12 
> >  3 files changed, 17 insertions(+), 6 deletions(-)  
> 
> I do not support this patch, as "--" only makes sense for
> tools that actually take flags.
> I pity that GNU basename(1), dirname(1) and printf(1)
> take flags, however, see great value in being able to do
> 
>   $ printf --
> 
> and get what you asked for instead of some error message
> 
>   printf: usage: printf [-v var] format [arguments]
> 
> Cheers
> 
> Laslo
> 

POSIX says “--” should be supported unless stated otherwise.
I interpret the standard as saying that this also applies to utilities
that do not take any flags. And to be on the safe side I think
it is a good idea to support “--” for two reasons: (1) existing
scripts may require it, and (2) if POSIX adds a flag to the
utility in the future we must support “--” now, otherwise
compatibility will be broken.
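
As a rough illustration of the behaviour argued for here (accept a leading “--”, reject anything that looks like an option), a utility without flags could locate its first operand along these lines. This is a sketch only, not the code from the patch; `first_operand` is a hypothetical name, and treating a lone “-” as an operand is an assumption of this example.

```c
#include <string.h>

/*
 * Hypothetical helper (not from the patch): returns the index of the
 * first operand in argv, or -1 if an option-like argument is found.
 */
int
first_operand(int argc, char *argv[])
{
	int i = 1; /* skip argv[0] */

	if (i < argc && !strcmp(argv[i], "--"))
		return i + 1; /* delimiter: everything after it is an operand */
	if (i < argc && argv[i][0] == '-' && argv[i][1] != '\0')
		return -1; /* looks like an option, and none are supported */
	return i; /* includes a lone "-", treated as an operand here */
}
```

A caller would print its usage message when the result is -1 and otherwise continue with `argv + result` and `argc - result`.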




Re: [hackers] [sbase][PATCH] Add shuf(1)

2016-12-27 Thread Mattias Andrée
On Tue, 27 Dec 2016 11:32:32 +0100
Laslo Hunhold  wrote:

> On Sat, 26 Mar 2016 13:50:47 +0100
> Mattias Andrée  wrote:
> 
> Hey Mattias,
> 
> > +static int
> > +random_byte_file(void)
> > +{
> > +   unsigned char r;
> > +   ssize_t n = read(source, &r, 1);
> > +   if (n < 0)
> > +   eprintf("read %s:", sflag);
> > +   if (!n)
> > +   eprintf("read %s: end of file
> > reached\n");
> > +   return r;
> > +}
> > +
> > +static int
> > +random_byte_libc(void)
> > +{
> > +   double r;
> > +   r = rand();
> > +   r /= (double)RAND_MAX + 1;
> > +   r *= 256;
> > +   return (int)r;
> > +}  
> 
> is there a good reason for the existence of shuf(1)?
> Also, we may want to think about using more solid
> interfaces for randomness (like arc4random()) and remove
> the "file-source" altogether.
> 
> Cheers
> 
> Laslo
> 

Hi Laslo!

No, we don't really need shuf(1) in sbase, but I think we
should have a suckless implementation available; it can be
a useful utility. I have a few more utilities I find useful,
but I haven't bothered to set up a repository yet. I tried
to start a discussion with Dimitris some time ago, but I
didn't get a response. I think it might be a good idea to
have sextra for portable utilities and uextra for unportable
utilities. If you have any other suggestions I would like
to hear them.

For sextra I have written base16(1), base32(1), base64(1),
prune(1), which recursively removes empty directories, rev(1)
and shuf(1). For uextra I have written fsize(1), which
prints the size of any regular file or block device (other
utilities do not print the size of block devices, so it can
be quite burdensome to find out how large one is), printenvx(1),
which is like printenv(1) but for other processes, and shred(1),
and I'm working on rescue(1), which is similar to ddrescue(1).

I'm not sure that arc4random() is portable, but my
understanding is that each bit in the output of rand()
has the same entropy in modern libc implementations, and
that is all that is needed in my opinion. I don't know
whether it is a good idea to include reading random data
from files, so it should probably be removed.
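
To make the rand() option concrete, here is a minimal sketch of shuffling lines with nothing but rand(). Note that this is a different technique from the byte-based helpers in the quoted patch: it is a Fisher-Yates shuffle that draws indices by rejection sampling, so that the modulo step does not favour small values. The names `rand_below` and `shuffle` are hypothetical, and the sketch assumes n is at most RAND_MAX.

```c
#include <stdlib.h>

/*
 * Hypothetical helper: uniform integer in [0, n) drawn from rand().
 * Assumes 0 < n <= RAND_MAX.
 */
static size_t
rand_below(size_t n)
{
	/* rand() yields RAND_MAX + 1 distinct values */
	unsigned long range = (unsigned long)RAND_MAX + 1;
	/* largest multiple of n that is <= range */
	unsigned long limit = range - range % n;
	unsigned long r;

	do
		r = (unsigned long)rand();
	while (r >= limit); /* reject the biased tail */

	return r % n;
}

/* Fisher-Yates shuffle of an array of n line pointers, in place. */
void
shuffle(const char **lines, size_t n)
{
	size_t i, j;
	const char *tmp;

	for (i = n; i > 1; i--) {
		j = rand_below(i);
		tmp = lines[i - 1];
		lines[i - 1] = lines[j];
		lines[j] = tmp;
	}
}
```

The caller is expected to seed the generator with srand() first. Whether rand() is good enough for shuf(1) is exactly the open question in this thread; the sketch only shows that the mapping from rand() to an index can be made unbiased.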


Mattias Andrée





Re: [hackers] [sbase] [PATCH] install: ignore -s

2016-12-03 Thread Mattias Andrée
On Sat, 3 Dec 2016 13:44:38 +0100
Quentin Rameau  wrote:

> > diff --git a/xinstall.1 b/xinstall.1
> > index 1a727d3..bda4d42 100644
> > --- a/xinstall.1
> > +++ b/xinstall.1
> > @@ -12,7 +12,7 @@
> >  .Po
> >  .Fl d Ar dir ...
> >  |
> > -.Op Fl sD
> > +.Op Fl D
> >  .Po
> >  .Fl t Ar dest
> >  .Ar source ...
> > @@ -62,10 +62,6 @@ is copied with
> >  Change the installed files' owner to
> >  .Ar owner .
> >  This may be a user name or a user identifier.
> > -.It Fl s
> > -Remove unnecessary symbols using
> > -.Xr strip 1 .
> > -Failure to strip a file does not imply failure to
> > install the file. .It Fl t Ar dest
> >  Copy files into the directory
> >  .Ar dest .  
> 
> Wouldn't it be wiser to document that this flag isn't
> unsupported but rather ignored?
> 

You're right, I just did the same thing as with -c, but
-c should be listed as ignored in the manpage too.




[hackers] [ubase] [PATCH] install: ignore -s

2016-12-03 Thread Mattias Andrée
The -s flag previously called strip(1) on the installed file.
This patch changes install(1)'s behaviour to ignore -s.

Many makefiles use the -s flag, so it has to be recognised for
compatibility. However, it does not have to do anything because
symbols do not negatively affect the functionality of binaries.

Ignoring -s has the added benefit that the user does not need
to edit makefiles if they want the symbols that are useful for
debugging. If the user wants to strip away symbols, it can be
done manually or automatically by the package manager.

Signed-off-by: Mattias Andrée 
---
 xinstall.1 |  9 ++---
 xinstall.c | 24 ++--
 2 files changed, 4 insertions(+), 29 deletions(-)

diff --git a/xinstall.1 b/xinstall.1
index 1a727d3..bda4d42 100644
--- a/xinstall.1
+++ b/xinstall.1
@@ -12,7 +12,7 @@
 .Po
 .Fl d Ar dir ...
 |
-.Op Fl sD
+.Op Fl D
 .Po
 .Fl t Ar dest
 .Ar source ...
@@ -62,10 +62,6 @@ is copied with
 Change the installed files' owner to
 .Ar owner .
 This may be a user name or a user identifier.
-.It Fl s
-Remove unnecessary symbols using
-.Xr strip 1 .
-Failure to strip a file does not imply failure to install the file.
 .It Fl t Ar dest
 Copy files into the directory
 .Ar dest .
@@ -79,8 +75,7 @@ notation is used, the base mode is .
 .Xr chmod 1 ,
 .Xr chown 1 ,
 .Xr cp 1 ,
-.Xr mkdir 1 ,
-.Xr strip 1
+.Xr mkdir 1
 .Sh STANDARDS
 The
 .Nm
diff --git a/xinstall.c b/xinstall.c
index bf921fb..e91e703 100644
--- a/xinstall.c
+++ b/xinstall.c
@@ -13,7 +13,6 @@
 #include "text.h"
 
 static int Dflag = 0;
-static int sflag = 0;
 static gid_t group;
 static uid_t owner;
 static mode_t mode = 0755;
@@ -41,22 +40,6 @@ make_dirs(char *dir, int was_missing)
 	make_dir(dir, was_missing);
 }
 
-static void
-strip(const char *filename)
-{
-	pid_t pid = fork();
-	switch (pid) {
-	case -1:
-		eprintf("fork:");
-	case 0:
-		execlp("strip", "strip", "--", filename, (char *)0);
-		eprintf("exec: strip:");
-	default:
-		waitpid(pid, NULL, 0);
-		break;
-	}
-}
-
 static int
 install(const char *s1, const char *s2, int depth)
 {
@@ -125,9 +108,6 @@ install(const char *s1, const char *s2, int depth)
 			eprintf("fclose %s:", s2);
 		if (fclose(f1) == EOF)
 			eprintf("fclose %s:", s1);
-
-		if (sflag)
-			strip(s2);
 	}
 
 	if (lchown(s2, owner, group) < 0)
@@ -166,7 +146,7 @@ main(int argc, char *argv[])
 		Dflag = 1;
 		break;
 	case 's':
-		sflag = 1;
+		/* no-op for compatibility */
 		break;
 	case 'g':
 		gflag = EARGF(usage());
@@ -184,7 +164,7 @@ main(int argc, char *argv[])
 		usage();
 	} ARGEND
 
-	if (argc < 1 + (!tflag & !dflag) || dflag & (Dflag | sflag | !!tflag))
+	if (argc < 1 + (!tflag & !dflag) || dflag & (Dflag | !!tflag))
 		usage();
 
 	if (gflag) {
-- 
2.10.2




[hackers] Re: [sbase] [PATCH] install: ignore -s

2016-12-03 Thread Mattias Andrée
Whoops, I wrote [ubase] instead of [sbase] by mistake.

On Sat,  3 Dec 2016 12:51:14 +0100
Mattias Andrée  wrote:

> The -s flag previously called strip(1) on the installed
> file. This patch changes install(1)'s behaviour to ignore
> -s.
> 
> Many makefiles use the -s flag, so it has to be
> recognised for compatibility. However, it does not have to
> do anything because symbols do not negatively affect the
> functionality of binaries.
> 
> Ignoring -s has the added benefit that the user does not
> need to edit makefiles if they want the symbols that are
> useful for debugging. If the user wants to strip away
> symbols, it can be done manually or automatically by the
> package manager.
> 
> Signed-off-by: Mattias Andrée 
> ---
>  xinstall.1 |  9 ++---
>  xinstall.c | 24 ++--
>  2 files changed, 4 insertions(+), 29 deletions(-)
> 
> diff --git a/xinstall.1 b/xinstall.1
> index 1a727d3..bda4d42 100644
> --- a/xinstall.1
> +++ b/xinstall.1
> @@ -12,7 +12,7 @@
>  .Po
>  .Fl d Ar dir ...
>  |
> -.Op Fl sD
> +.Op Fl D
>  .Po
>  .Fl t Ar dest
>  .Ar source ...
> @@ -62,10 +62,6 @@ is copied with
>  Change the installed files' owner to
>  .Ar owner .
>  This may be a user name or a user identifier.
> -.It Fl s
> -Remove unnecessary symbols using
> -.Xr strip 1 .
> -Failure to strip a file does not imply failure to
> install the file. .It Fl t Ar dest
>  Copy files into the directory
>  .Ar dest .
> @@ -79,8 +75,7 @@ notation is used, the base mode is .
>  .Xr chmod 1 ,
>  .Xr chown 1 ,
>  .Xr cp 1 ,
> -.Xr mkdir 1 ,
> -.Xr strip 1
> +.Xr mkdir 1
>  .Sh STANDARDS
>  The
>  .Nm
> diff --git a/xinstall.c b/xinstall.c
> index bf921fb..e91e703 100644
> --- a/xinstall.c
> +++ b/xinstall.c
> @@ -13,7 +13,6 @@
>  #include "text.h"
>  
>  static int Dflag = 0;
> -static int sflag = 0;
>  static gid_t group;
>  static uid_t owner;
>  static mode_t mode = 0755;
> @@ -41,22 +40,6 @@ make_dirs(char *dir, int was_missing)
>   make_dir(dir, was_missing);
>  }
>  
> -static void
> -strip(const char *filename)
> -{
> - pid_t pid = fork();
> - switch (pid) {
> - case -1:
> - eprintf("fork:");
> - case 0:
> - execlp("strip", "strip", "--", filename,
> (char *)0);
> - eprintf("exec: strip:");
> - default:
> - waitpid(pid, NULL, 0);
> - break;
> - }
> -}
> -
>  static int
>  install(const char *s1, const char *s2, int depth)
>  {
> @@ -125,9 +108,6 @@ install(const char *s1, const char
> *s2, int depth) eprintf("fclose %s:", s2);
>   if (fclose(f1) == EOF)
>   eprintf("fclose %s:", s1);
> -
> - if (sflag)
> - strip(s2);
>   }
>  
>   if (lchown(s2, owner, group) < 0)
> @@ -166,7 +146,7 @@ main(int argc, char *argv[])
>   Dflag = 1;
>   break;
>   case 's':
> - sflag = 1;
> + /* no-op for compatibility */
>   break;
>   case 'g':
>   gflag = EARGF(usage());
> @@ -184,7 +164,7 @@ main(int argc, char *argv[])
>   usage();
>   } ARGEND
>  
> - if (argc < 1 + (!tflag & !dflag) || dflag &
> (Dflag | sflag | !!tflag))
> + if (argc < 1 + (!tflag & !dflag) || dflag &
> (Dflag | !!tflag)) usage();
>  
>   if (gflag) {





Re: [hackers] [sbase] [PATCH] xinstall: Fix broken memmove with -t

2016-12-02 Thread Mattias Andrée
On Fri, 2 Dec 2016 13:54:22 +0100
Anselm R Garbe  wrote:

> Hi there,
> 
> On 2 December 2016 at 13:34, Mattias Andrée
>  wrote:
> > If it's actually need, the package could call strip on
> > the binaries that fail, or the developer can call strip
> > explicitly. If it's even possible symbols could be
> > problem, it's so rare that it wouldn't be much of a
> > headache to deal with.  
> 
> IMHO install has always been a bad idea and a good
> example of violating the Unix philosophy. Hence my
> suggestion would be to ban it completely from sbase.

I would love to see install banned, but I think it's
impractical. Virtually all makefiles would have to be modified,
and that is going to be a lot of work for any distro
that chooses to use sbase and wants to have a lot of
packages available for its users.

> 
> -Anselm
> 





Re: [hackers] [sbase] [PATCH] xinstall: Fix broken memmove with -t

2016-12-02 Thread Mattias Andrée
On Fri, 2 Dec 2016 13:22:16 +0100
Laslo Hunhold  wrote:

> On Thu,  1 Dec 2016 22:50:20 -0800
> Michael Forney  wrote:
> 
> Hey Michael,
> 
> > memmove moves a number of bytes, not pointers, so if
> > you passed a number of arguments that is larger than
> > the pointer byte size, you could end up crashing or
> > skipping the install of a file and installing another
> > twice.  
> 
> well-observed, nice find!
> 
> > Also, argv was never decreased to match the moved
> > arguments, so the -t parameter was added in the NULL
> > argv slot.  
> 
> > -   memmove(argv - 1, argv, argc);
> > +   argv = memmove(argv - 1, argv, argc *
> > sizeof(*argv));  
> 
> I got to admit that this piece of code is really ugly to
> begin with. We _must not_ use memmove here as we invoke
> undefined behaviour, given the two memory regions overlap.
> Also, it's really bad style to call the value "tflag",
> given it's not an int but actually a char pointer to the
> name of the target folder, so "tflag" should rather be
> called "target". Same applies to the other values.
> 
> I am wondering if we even need this. I mean, we already
> "consume" the target directory in ARGBEGIN ... ARGEND and
> thus are only left with sources in argv.
> 
> Moreover, I generally question the existence of some
> flags for install (1), like -s to strip symbols. Do we
> really need this?

-s needs to exist because people actually use it in makefiles.
But it could be made into a dummy flag; that might even be
preferable, because then you don't have to remove the use of -s
from a makefile if you want the package you install to retain
symbols.

> Especially with non-standardized tools
> like install(1), we need to be careful not to swallow the
> waste that has accumulated over the years. The usage of
> this tool has become so complicated that using it
> properly becomes harder and harder with the number of
> options growing. What are your thoughts? Is the -s flag
> direly needed for some applications? Should we just
> silently ignore it?

If it's actually needed, the package could call strip on
the binaries that fail, or the developer can call strip
explicitly. If it's even possible that symbols could be a
problem, it's so rare that it wouldn't be much of a headache
to deal with.

> 
> Cheers
> 
> Laslo
> 





[hackers] [PATCH][ubase] respawn: reopen the fifo at end of line, and use read-only

2016-09-25 Thread Mattias Andrée
Signed-off-by: Mattias Andrée 
---
 respawn.c | 35 +--
 1 file changed, 21 insertions(+), 14 deletions(-)

diff --git a/respawn.c b/respawn.c
index 77670f5..7fdd1ec 100644
--- a/respawn.c
+++ b/respawn.c
@@ -1,5 +1,4 @@
 /* See LICENSE file for copyright and license details. */
-#include 
 #include 
 #include 
 #include 
@@ -7,6 +6,7 @@
 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -37,9 +37,9 @@ main(int argc, char *argv[])
 	pid_t pid;
 	char buf[BUFSIZ];
 	int savederrno;
-	int fd;
 	ssize_t n;
-	fd_set rdfd;
+	struct pollfd pollset[1];
+	int polln;
 
 	ARGBEGIN {
 	case 'd':
@@ -63,26 +63,33 @@ main(int argc, char *argv[])
 	signal(SIGTERM, sigterm);
 
 	if (fifo) {
-		/* TODO: we should use O_RDONLY and re-open the fd on EOF */
-		fd = open(fifo, O_RDWR | O_NONBLOCK);
-		if (fd < 0)
+		pollset->fd = open(fifo, O_RDONLY | O_NONBLOCK);
+		if (pollset->fd < 0)
 			eprintf("open %s:", fifo);
+		pollset->events = POLLIN;
 	}
 
 	while (1) {
 		if (fifo) {
-			FD_ZERO(&rdfd);
-			FD_SET(fd, &rdfd);
-			n = select(fd + 1, &rdfd, NULL, NULL, NULL);
-			if (n < 0)
-				eprintf("select:");
-			if (n == 0 || FD_ISSET(fd, &rdfd) == 0)
-				continue;
-			while ((n = read(fd, buf, sizeof(buf))) > 0)
+			pollset->revents = 0;
+			polln = poll(pollset, 1, -1);
+			if (polln <= 0) {
+				if (polln == 0 || errno == EAGAIN)
+					continue;
+				eprintf("poll:");
+			}
+			while ((n = read(pollset->fd, buf, sizeof(buf))) > 0)
 				;
 			if (n < 0)
 				if (errno != EAGAIN)
 					eprintf("read %s:", fifo);
+			if (n == 0) {
+				close(pollset->fd);
+				pollset->fd = open(fifo, O_RDONLY | O_NONBLOCK);
+				if (pollset->fd < 0)
+					eprintf("open %s:", fifo);
+				pollset->events = POLLIN;
+			}
 		}
 		pid = fork();
 		if (pid < 0)
-- 
2.10.0
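
For readers who want the idea without the diff context, the approach in this patch (open the fifo read-only and non-blocking, wait with poll(2), drain it, and reopen it once read(2) reports end of file) can be sketched as a standalone function. This is an illustration rather than the patched respawn.c: the function name `wait_and_drain`, the buffer size, and the retry on EINTR are assumptions of this sketch.

```c
#include <errno.h>
#include <fcntl.h>
#include <poll.h>
#include <unistd.h>

/*
 * Hypothetical helper: block until the fifo behind *fdp is readable,
 * discard everything in it, and reopen it if all writers have gone
 * away (read returns 0). Returns 0 on success, -1 on error.
 */
int
wait_and_drain(const char *path, int *fdp)
{
	struct pollfd pfd;
	char buf[4096];
	ssize_t n;

	pfd.fd = *fdp;
	pfd.events = POLLIN;
	pfd.revents = 0;

	/* retry if interrupted; any other failure is fatal */
	while (poll(&pfd, 1, -1) < 0)
		if (errno != EINTR && errno != EAGAIN)
			return -1;

	/* discard whatever was written */
	while ((n = read(*fdp, buf, sizeof(buf))) > 0)
		;
	if (n < 0 && errno != EAGAIN)
		return -1;

	if (n == 0) {
		/* end of file: every writer closed, so start afresh */
		close(*fdp);
		*fdp = open(path, O_RDONLY | O_NONBLOCK);
		if (*fdp < 0)
			return -1;
	}
	return 0;
}
```

The old code kept the fifo open with O_RDWR so that it never saw end of file; the patch resolves the TODO it removes by opening read-only and reopening the descriptor on EOF.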




Re: [hackers] [suckless.org][PATCH] libzahl does not seem WIP anymore (1.0 was hit some time ago).

2016-09-19 Thread Mattias Andrée
1.0 marked it as usable, however there is still a lot
of work to be done for it to be an acceptable replacement
for most users. This is why it is still marked WIP: so
people don't try it, think it's too slow and otherwise
problematic, and never come back to see if it is worth
using.

On Tue, 20 Sep 2016 00:40:54 +0200
pranomostro  wrote:

> ---
>  suckless.org/sucks/index.md | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/suckless.org/sucks/index.md
> b/suckless.org/sucks/index.md index 9e7135c..bf2be33
> 100644 --- a/suckless.org/sucks/index.md
> +++ b/suckless.org/sucks/index.md
> @@ -26,7 +26,7 @@ possible to avoid them. If you use
> them, consider looking for alternatives.
>  * [GMP][3] - GNU's bignum/arbitrary precision library.
> Quite bloated, slow and [calls abort() on failed
> malloc][4] 
> -  Alternatives: [libtommath][5], [TomsFastMath][6],
> [MPI][7], [libzahl][11] (WIP), [hebimath][12] (WIP)
> +  Alternatives: [libtommath][5], [TomsFastMath][6],
> [MPI][7], [libzahl][11], [hebimath][12] (WIP) 
>  
>  [1]: http://library.gnome.org/devel/glib/





Re: [hackers] [swerc] local suckless.org copyright notice || Anselm R Garbe

2016-06-13 Thread Mattias Andrée
On Mon, 13 Jun 2016 16:35:34 +0200
FRIGN  wrote:

> On Mon, 13 Jun 2016 14:17:01 +0200 (CEST)
> g...@suckless.org wrote:
> 
> > -   © 2006-2013 suckless.org community |  > href="http://garbe.us/Contact";>Impressum
> > +   © 2006-2015 suckless.org community |  > href="http://garbe.us/Contact";>Impressum  
> 
> Isn't it 2016?
> 

Not when he made the commit.
But now it is.




Re: [hackers] [libzahl] Switch to ISC license. || Mattias Andrée

2016-06-02 Thread Mattias Andrée
On Thu, 2 Jun 2016 22:20:42 +0200
FRIGN  wrote:

> On Thu, 2 Jun 2016 22:01:47 +0200
> Mattias Andrée  wrote:
> 
> Hey Mattias,
> 
> > Do you think anyone have choked on pastries before and
> > sued? Do most countries actually require that you state
> > that your are not responsible for damages causes by
> > something you are warning about? The warning is enough
> > here, or at least I have never seen a disclaimer, only
> > a warning label, and we don't even have translation for
> > ‘disclaimer’.  
> 
> it's not about countries, but commercial law. If you hit
> a judge who is not very smart when it comes to computers
> you can have a bad day and actually get sued for such
> damages. Why take the risk?

Of course you should avoid the risk. Therefore, disclaimers
will be necessary until everyone agrees otherwise.

> 
> > Nice one, but I think the FreeBSD project does too much
> > GPL-bashing. GPL is a good license, at least if you
> > value free software higher than open source.  
> 
> GPL is not about freedom, it's about control. There are

Well, like free software, it is about who should be in
control. Free software, and therefore the GPL, is about
removing control from the developer and putting it into the
user's hands. They are of course welcome to disagree with
this philosophy, but bashing it every chance you get...

> hundreds of examples where companies contributed back
> to non-GPL projects because they are happily using it
> internally (closed) but still value the open source
> character of it.
> The GPL still has the mindset of evil corporations of
> the 90's. There are still evil players today, but
> everbody has to agree that the open source contributions
> of numerous companies cannot be ignored.
> The level of control the GPL forces on you, even if you
> want to write open source software, is insane and
> ridiculous. If you look at it closer, the GPL has
> the characteristics of cancer or a parasite.
> Additionally, by publishing your software under the GPL,
> most companies would not use your code and actually
> write their own version (which is most likely worse and
> full of bugs). This leads to the situation of many
> people actually having a really bad time with software,
> because the software they buy is actually full of
> horrible horrible code that could be avoided.
> 
> In theory, the wonderland the GPL proposes "works".
> In reality, nowadays, it doesn't make a lot of sense
> any more.
> Richard Stallman used to be right. The companies of
> the 90's were not yet accustomed to an Open Source
> environment, but nowadays, his radical claims are
> just borderline insane. He for instance calls
> OpenBSD a non-free distribution, because they link
> to non-free software in the ports tree.

I don't know how the OpenBSD ports tree looks, but I
imagine that they do warn users that the software they
are about to install is non-free. Parabola GNU/Linux
has non-free software in the repository, but they do
warn you, and Parabola is endorsed by the FSF. This
does make sense if you believe in free software.

> Keep in mind that OpenBSD is completely blob-free,
> which is only achieved in the Linux-camp by obscure
> distributions nobody uses.
> 
> So, the net-gain is this: The super-radical position
> of the FSF actually does more damage than it brings
> good, as people will never use the obscure FSF-
> distributions. They won't listen to rms's ramblings
> and songs either, because he still has not understood
> the changes the market has undergone in the last decade.
> 
> We generally have to ask ourselves the question if
> we really should ramp up on the FSF anti-propaganda.
> The FSF has the biggest funding of all Open Source
> non-profit organizations (afaik), but what do they
> achieve relative to their size?
> I sometimes regret imagining what the OpenBSD
> foundation would do with all that money. They actually
> write useful software and make a positive impact.
> The FSF's initiatives, especially in regard to
> Gender mainstreaming and other marxist ideologies
> is, to say it likely, a long reach to computer
> science.
> 
> Just food for thought, please don't start a discussion
> here about this. I don't care abour your opinion that
> much anyway.

But you didn't have to write all this. I do not
disagree with you. I merely wanted to state
that the GPL does achieve its goal, a goal there is
nothing wrong with, even if it is a bit obsolescent.
The FreeBSD project also suffers from a delusion
of superiority and a compulsion to bash other
projects they believe themselves to be better than.

> 
> Cheers
> 
> FRIGN
> 




