Hi,

our locale(1) implementation is intentionally simplistic
and implements only a subset of this POSIX specification:

https://pubs.opengroup.org/onlinepubs/9699919799/utilities/locale.html

However, one feature is missing that is actually useful and arguably
also well-placed inside the locale(1) utility.  If you want to know
from within a C program which character encoding is actually being
used (as opposed to which one the user requested), you can use the
nl_langinfo(3) function.  But i'm not aware of a possibility to ask
the same from within a sh(1) program.
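
For illustration, here is a minimal sketch of the C side of that query.
The helper name effective_codeset() is hypothetical and not part of the
patch below; it only assumes the standard setlocale(3) and
nl_langinfo(3) interfaces.

```c
#include <langinfo.h>
#include <locale.h>

/*
 * Hypothetical helper, not part of the patch: report the character
 * encoding actually in effect after selecting the LC_CTYPE locale
 * "name".  Passing "" selects whatever the environment requests.
 */
static const char *
effective_codeset(const char *name)
{
	/*
	 * If the name is not supported, setlocale(3) returns NULL
	 * and the locale is left unchanged, so the answer below
	 * reflects what is really in use, not what was requested.
	 */
	(void)setlocale(LC_CTYPE, name);
	return nl_langinfo(CODESET);
}
```

With LC_CTYPE=en_US.UTF-8 in the environment, effective_codeset("")
should return the same string that "locale charmap" prints in the
tests below.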

POSIX says that "locale charmap" should answer that question.

In the next release of textproc/groff, that feature of locale(1)
will be used in the test suite, and it seems reasonable to do so.

So, here is a very simple patch to support the "charmap" argument.

Testing:

   $ export LC_CTYPE=en_US.UTF-8
   $ locale
  LANG=
  LC_COLLATE="C"
  LC_CTYPE=en_US.UTF-8
  LC_MONETARY="C"
  LC_NUMERIC="C"
  LC_TIME="C"
  LC_MESSAGES="C"
  LC_ALL=

   $ locale -a | wc
      68      68     794
   $ locale -m
  UTF-8
   $ locale charmap
  UTF-8
   $ LC_ALL=C locale charmap
  US-ASCII
   $ LC_ALL=POSIX locale charmap
  US-ASCII

   $ LC_ALL=NonSense locale charmap
  US-ASCII
   $ locale -x
  locale: unknown option -- x
  usage: locale [-a | -m | charmap]
   $ locale nonsense
  usage: locale [-a | -m | charmap]
   $ locale -am 
  usage: locale [-a | -m | charmap]
   $ locale -a charmap
  usage: locale [-a | -m | charmap]
   $ locale -m charmap
  usage: locale [-a | -m | charmap]
   $ locale charmap nonsense
  usage: locale [-a | -m | charmap]

OK?
  Ingo


P.S.
It would be trivial to also support the POSIX -k option, as in
   $ locale -k charmap
  charmap="UTF-8"
but that doesn't actually feel useful and i'm not aware of anything
that might want to use it, so KISS and let's proceed one step at a time.
Supporting "name" arguments other than "charmap" would make little
sense on OpenBSD, nor would the -c option.
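
Just to illustrate how little -k would take, here is a sketch of the
output format, assuming the keyword="value" form shown above.  The
helper format_keyword() is hypothetical and not part of this patch.

```c
#include <stdio.h>

/*
 * Hypothetical helper, not part of the patch: write keyword="value"
 * into buf, as in the "locale -k charmap" example above.
 * Returns what snprintf(3) returns, i.e. the length of the full
 * string, even if it had to be truncated to fit into buf.
 */
static int
format_keyword(char *buf, size_t len, const char *keyword,
    const char *value)
{
	return snprintf(buf, len, "%s=\"%s\"", keyword, value);
}
```

A caller would pass "charmap" and the result of nl_langinfo(CODESET)
and print the buffer followed by a newline.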


Index: locale.1
===================================================================
RCS file: /cvs/src/usr.bin/locale/locale.1,v
retrieving revision 1.7
diff -u -p -r1.7 locale.1
--- locale.1    26 Oct 2016 01:00:27 -0000      1.7
+++ locale.1    16 Apr 2020 19:04:25 -0000
@@ -1,6 +1,6 @@
 .\" $OpenBSD: locale.1,v 1.7 2016/10/26 01:00:27 schwarze Exp $
 .\"
-.\" Copyright 2016 Ingo Schwarze <schwa...@openbsd.org>
+.\" Copyright 2016, 2020 Ingo Schwarze <schwa...@openbsd.org>
 .\" Copyright 2013 Stefan Sperling <s...@openbsd.org>
 .\"
 .\" Permission to use, copy, modify, and distribute this software for any
@@ -23,7 +23,7 @@
 .Nd character encoding and localization conventions
 .Sh SYNOPSIS
 .Nm locale
-.Op Fl a | Fl m
+.Op Fl a | Fl m | Cm charmap
 .Sh DESCRIPTION
 If the
 .Nm
@@ -31,7 +31,7 @@ utility is invoked without any arguments
 configuration is shown.
 .Pp
 The options are as follows:
-.Bl -tag -width Ds
+.Bl -tag -width charmap
 .It Fl a
 Display a list of supported locales.
 .It Fl m
@@ -39,6 +39,11 @@ Display a list of supported character en
 On
 .Ox ,
 this always returns UTF-8 only.
+.It Cm charmap
+Display the currently selected character encoding.
+On
+.Ox ,
+this returns either US-ASCII or UTF-8.
 .El
 .Pp
 A locale is a set of environment variables telling programs which
Index: locale.c
===================================================================
RCS file: /cvs/src/usr.bin/locale/locale.c,v
retrieving revision 1.12
diff -u -p -r1.12 locale.c
--- locale.c    5 Feb 2016 12:59:12 -0000       1.12
+++ locale.c    16 Apr 2020 19:04:25 -0000
@@ -16,6 +16,7 @@
  */
 
 #include <err.h>
+#include <langinfo.h>
 #include <locale.h>
 #include <stdio.h>
 #include <stdlib.h>
@@ -169,7 +170,7 @@ show_locales(void)
 static void
 usage(void)
 {
-       fprintf(stderr, "usage: %s [-a | -m]\n", __progname);
+       fprintf(stderr, "usage: %s [-a | -m | charmap]\n", __progname);
        exit(1);
 }
 
@@ -203,12 +204,16 @@ main(int argc, char *argv[])
        argc -= optind;
        argv += optind;
 
-       if (argc != 0 || (aflag && mflag))
+       if (aflag + mflag + argc > 1)
                usage();
        else if (aflag)
                show_locales();
        else if (mflag)
                printf("UTF-8\n");
+       else if (strcmp(*argv, "charmap") == 0)
+               printf("%s\n", nl_langinfo(CODESET));
+       else
+               usage();
 
        return 0;
 }
