rs doesn't print nicely aligned columns with UTF-8 input.
There are a few ways to handle this; here's just one.

Note that the source is riddled with code like:
                if (maxlen < p - *ep)   /* update maxlen */
                        maxlen = p - *ep;

I'm very scared to try counting chars vs bytes upfront in such code. However,
the code that prints spaces to pad the output is much simpler.
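
To make the bytes-vs-chars gap concrete, here's a rough sketch. It is not taken from rs.c: u8_chars() and u8_extra() are made-up names, and it assumes valid UTF-8 where every character takes one column. It compares what byte arithmetic like `p - *ep` measures with the number of characters actually shown:

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, not part of rs.c: count UTF-8 characters by
 * skipping continuation bytes (10xxxxxx).  Assumes valid UTF-8.
 */
size_t
u8_chars(const char *s)
{
	size_t n = 0;

	for (; *s != '\0'; s++)
		if (((unsigned char)*s & 0xc0) != 0x80)
			n++;
	return n;
}

/*
 * Bytes that take no column of their own, i.e. how far a byte-based
 * width overshoots the displayed width for this string.
 */
size_t
u8_extra(const char *s)
{
	return strlen(s) - u8_chars(s);
}
```

For "na\xc3\xafve" (naïve), strlen() gives 6 while u8_chars() gives 5, so a width computed from bytes is off by one for that entry.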

I'm not settled on this approach, but wanted people to look at this and compare
it to other ways of doing things.

Index: rs.c
===================================================================
RCS file: /cvs/src/usr.bin/rs/rs.c,v
retrieving revision 1.27
diff -u -p -r1.27 rs.c
--- rs.c        9 Oct 2015 01:37:08 -0000       1.27
+++ rs.c        23 Oct 2015 11:40:11 -0000
@@ -198,6 +198,12 @@ putfile(void)
        }
 }
 
+int
+isu8cont(unsigned char c)
+{
+       return ((c & (0x80 | 0x40)) == 0x80);
+}
+
 void
 prints(char *s, int col)
 {
@@ -210,8 +216,11 @@ prints(char *s, int col)
        if (flags & RIGHTADJUST)
                while (n-- > 0)
                        putchar(osep);
-       for (p = s; *p; p++)
+       for (p = s; *p; p++) {
+               if (isu8cont(*p))
+                       n++;
                putchar(*p);
+       }
        while (n-- > 0)
                putchar(osep);
 }
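
For anyone who wants to poke at the idea outside of rs, here's a minimal standalone sketch of the same padding trick. pad_field() is a hypothetical stand-in for the left-adjusted path of prints() that writes into a buffer instead of calling putchar(), and the byte-based starting value of n is my assumption about how the padding is computed. It only compensates for continuation bytes, so double-width and combining characters would still misalign:

```c
#include <stddef.h>
#include <string.h>

/* Same test as isu8cont() in the patch: true for bytes 10xxxxxx. */
static int
isu8cont(unsigned char c)
{
	return ((c & (0x80 | 0x40)) == 0x80);
}

/*
 * Hypothetical stand-in for the left-adjusted path of prints():
 * copy s into dst, then pad with sep up to col columns.
 * Returns the number of bytes stored (excluding the NUL),
 * or -1 if dst is too small.
 */
int
pad_field(char *dst, size_t dstsz, const char *s, int col, char sep)
{
	int n = col - (int)strlen(s);	/* assumed: padding from bytes */
	size_t i = 0;
	const char *p;

	if (dstsz == 0)
		return -1;
	for (p = s; *p != '\0'; p++) {
		if (isu8cont((unsigned char)*p))
			n++;		/* byte took no column: pad one more */
		if (i + 1 >= dstsz)
			return -1;
		dst[i++] = *p;
	}
	while (n-- > 0) {
		if (i + 1 >= dstsz)
			return -1;
		dst[i++] = sep;
	}
	dst[i] = '\0';
	return (int)i;
}
```

With col set to 8, "na\xc3\xafve" comes out as 6 bytes of text plus 3 separators (5 characters + 3 = 8 columns), while plain "abc" gets the usual 5 separators.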
