[dev] [st] [PATCH] Fix techo handling of control and multibyte characters.

2014-04-22 Thread noname
techo compares signed char to '\x20'. Any character with code less then
'\x20' is treated as control character.  This way characters with MSB
set to 1 are considered control characters too.

Also this patch makes techo display DEL character as ^?.

To reprocuce the bug, enable echo mode using printf '\e[12l',
then type DEL character or any non-ASCII character.
---
 st.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/st.c b/st.c
index 019f53c..b6cb01c 100644
--- a/st.c
+++ b/st.c
@@ -2316,9 +2316,9 @@ techo(char *buf, int len) {
for(; len > 0; buf++, len--) {
char c = *buf;
 
-   if(c < '\x20') { /* control code */
+   if(c < 0x20u || c == 0177) { /* control code */
if(c != '\n' && c != '\r' && c != '\t') {
-   c |= '\x40';
+   c ^= '\x40';
tputc("^", 1);
}
tputc(&c, 1);
-- 
1.8.4




Re: [dev] [st] [PATCH] Fix techo handling of control and multibyte characters.

2014-04-23 Thread Roberto E. Vargas Caballero
Applied with a small modification. Thanks

-- 
Roberto E. Vargas Caballero



Re: [dev] [st] [PATCH] Fix techo handling of control and multibyte characters.

2014-04-23 Thread noname
On Wed, Apr 23, 2014 at 08:26:32PM +0200, Roberto E. Vargas Caballero wrote:
> Applied with a small modification. Thanks

u suffix was here for a purpose.  The patch has comment about multibyte
characters.  As 0x20u is unsigned, c was converted to unsigned char
before comparison so multibyte characters were not considered control
characters.

Try enabling echo mode and typing non-ASCII characters (add some
non-latin layout or use compose key or copy and paste '®').



Re: [dev] [st] [PATCH] Fix techo handling of control and multibyte characters.

2014-04-23 Thread Roberto E. Vargas Caballero
Hello,

> u suffix was here for a purpose.  The patch has comment about multibyte
> characters.  As 0x20u is unsigned, c was converted to unsigned char
> before comparison so multibyte characters were not considered control
> characters.

I realized it after send the patch. I prefer use uchar instead of
suffix u. I known that gcc will complaint with a warning about pointer
to data with different signess, but I think we are the programmers
and not gcc, so using uchar is the way.

Thinking a bit more about this, I don't know why we are using -Wall,
because it moves people to follow the GNU criteria, that is the
more suck criteria I know (I think it is because they come from the
LISP world, and they want to see your code formatted as LISP, and
force you to put a lot of unneded parenthesis ;).

Regards,

-- 
Roberto E. Vargas Caballero



Re: [dev] [st] [PATCH] Fix techo handling of control and multibyte characters.

2014-04-24 Thread Christoph Lohmann
Greetings.

On Thu, 24 Apr 2014 18:20:20 +0200 "Roberto E. Vargas Caballero" 
 wrote:
> Thinking a bit more about this, I don't know why we are using -Wall,
> because it moves people to follow the GNU criteria, that is the
> more suck criteria I know (I think it is because they come from the
> LISP world, and they want to see your code formatted as LISP, and
> force you to put a lot of unneded parenthesis ;).

We  use  ‐Wall  to show all warnings. If the compiler complains, fix the
warning.  Easy. At least this keeps the code to a certain style. St  was
written  from  the  beginning in the current style and I kept it. For my
projects I moved towards the Linux kernel style.


Sincerely,

Christoph Lohmann




Re: [dev] [st] [PATCH] Fix techo handling of control and multibyte characters.

2014-04-25 Thread noname
On Thu, Apr 24, 2014 at 07:29:35AM +0200, Roberto E. Vargas Caballero wrote:
> Hello,
> 
> > u suffix was here for a purpose.  The patch has comment about multibyte
> > characters.  As 0x20u is unsigned, c was converted to unsigned char
> > before comparison so multibyte characters were not considered control
> > characters.
> 
> I realized it after send the patch. I prefer use uchar instead of
> suffix u. I known that gcc will complaint with a warning about pointer
> to data with different signess, but I think we are the programmers
> and not gcc, so using uchar is the way.

We can modify it as follows:
if(BETWEEN(c, 0x00, 0x1f) || c == 0x7f) { /* control code */

I also thought about using incntrl().  It should work the same way
everywhere but it is still possible to break it with locales.



Re: [dev] [st] [PATCH] Fix techo handling of control and multibyte characters.

2014-04-25 Thread Roberto E. Vargas Caballero
Hi,

> We  use  ‐Wall  to show all warnings. If the compiler complains, fix the
> warning.  Easy. At least this keeps the code to a certain style. St  was

The problem is Wall change from one system to another (this is something
OpenBSD users know with strcpy calls...), and it doesn't force how to
remove the warning, so at the end I think style is not really improved
with Wall (and guys, some of the warning are really, really stupid).

> projects I moved towards the Linux kernel style.

I moved to Linux kernel style some years ago, but I am beginning to break
some rules, basically the typedef rule. I accept typedef of struct always
that the new type begins with upper case.


-- 
Roberto E. Vargas Caballero



Re: [dev] [st] [PATCH] Fix techo handling of control and multibyte characters.

2014-04-25 Thread Christoph Lohmann
Greetings.

On Fri, 25 Apr 2014 16:51:15 +0200 "Roberto E. Vargas Caballero" 
 wrote:
> Hi,
> 
> > We  use  ‐Wall  to show all warnings. If the compiler complains, fix the
> > warning.  Easy. At least this keeps the code to a certain style. St  was
> 
> The problem is Wall change from one system to another (this is something
> OpenBSD users know with strcpy calls...), and it doesn't force how to
> remove the warning, so at the end I think style is not really improved
> with Wall (and guys, some of the warning are really, really stupid).

There is no solution to this but to either fix the platforms or the com‐
piler.  St does not have to adapt.


Sincerely,

Christoph Lohmann




Re: [dev] [st] [PATCH] Fix techo handling of control and multibyte characters.

2014-04-25 Thread Roberto E. Vargas Caballero
> We can modify it as follows:
>   if(BETWEEN(c, 0x00, 0x1f) || c == 0x7f) { /* control code */

I like this solution because make explicit what we want to do. With
the u suffix is a bit obscure and can generate some errors.

> I also thought about using incntrl().  It should work the same way
> everywhere but it is still possible to break it with locales.

Yeah, I have also thought about it, but there are 8 bit version of
the control characters [1]. Since we are emulating a piece of hardware
we should avoid the functions of ctype.

If you send a patch with the version with BETWEEN I will apply it.

Regards,


-- 
Roberto E. Vargas Caballero



Re: [dev] [st] [PATCH] Fix techo handling of control and multibyte characters.

2014-04-28 Thread Roberto E. Vargas Caballero
>> The problem is Wall change from one system to another (this is something 
>> OpenBSD users know with strcpy calls...), and it doesn't force how to 
>> remove the warning, so at the end I think style is not really improved 
>> with Wall (and guys, some of the warning are really, really stupid). 

> There is no solution to this but to either fix the platforms or the com‐ 
> piler. St does not have to adapt. 

I agree with you, but I think is we want to follow a style, we have to
write a style guide, and not follow the Wall of the compiler, that
can change, and that generate a lot of incorrect warnings. For example,
after my last patch, compiling in linux you do not get any warning,
but if you compile in OpenBSD you get this output:

In file included from /usr/X11R6/include/X11/Xlib.h:47,
 from st.c:25:
/usr/X11R6/include/X11/Xfuncproto.h:144:24: warning: ISO C does not permit 
named variadic macros
st.c: In function 'techo':
st.c:2303: warning: comparison is always false due to limited range of data type
st.c:2303: warning: comparison is always true due to limited range of data type
CC -o st
/usr/X11R6/lib/libX11.so.16.0: warning: strcpy() is almost always misused, 
please use strlcpy()
/usr/X11R6/lib/libX11.so.16.0: warning: strcat() is almost always misused, 
please use strlcat()
/usr/X11R6/lib/libX11.so.16.0: warning: sprintf() is often misused, please use 
snprintf()

What I should do here? Fix these warnings that are not shown in linux?, and
if I don't fix them, how do I know that a warning if generated only in
OpenBSD and it is not generated in Linux?, I have to compile in all
the possible systems and fix all the warning that every compiler
give me?. If we want to follow warning as a style guide we should
explititly put what warnings we want to honour and not rely in
Wall.

Another problem related to the style is you only can know what style is
used guessing it from the source, but in the source you can find
some things like:

if(term.c.x+width < term.col) {
tmoveto(term.c.x+width, term.c.y);
} else {
term.c.state |= CURSOR_WRAPNEXT;
}

but you also can find:

if(xwrite(cmdfd, s, n) == -1)
die("write error on tty: %s\n", strerror(errno));

So, is it mandatory parenthesis in one line if? I don't know it.

Regards,

-- 
Roberto E. Vargas Caballero



Re: [dev] [st] [PATCH] Fix techo handling of control and multibyte characters.

2014-04-28 Thread Christoph Lohmann
Greetings.

On Mon, 28 Apr 2014 18:05:54 +0200 "Roberto E. Vargas Caballero" 
 wrote:
> >> The problem is Wall change from one system to another (this is something 
> >> OpenBSD users know with strcpy calls...), and it doesn't force how to 
> >> remove the warning, so at the end I think style is not really improved 
> >> with Wall (and guys, some of the warning are really, really stupid). 
> 
> > There is no solution to this but to either fix the platforms or the com‐ 
> > piler. St does not have to adapt. 
> 
> I agree with you, but I think is we want to follow a style, we have to
> write a style guide, and not follow the Wall of the compiler, that
> can change, and that generate a lot of incorrect warnings. For example,
> after my last patch, compiling in linux you do not get any warning,
> but if you compile in OpenBSD you get this output:
> 
> In file included from /usr/X11R6/include/X11/Xlib.h:47,
>  from st.c:25:
> /usr/X11R6/include/X11/Xfuncproto.h:144:24: warning: ISO C does not permit 
> named variadic macros
> st.c: In function 'techo':
> st.c:2303: warning: comparison is always false due to limited range of data 
> type
> st.c:2303: warning: comparison is always true due to limited range of data 
> type
> CC -o st
> /usr/X11R6/lib/libX11.so.16.0: warning: strcpy() is almost always misused, 
> please use strlcpy()
> /usr/X11R6/lib/libX11.so.16.0: warning: strcat() is almost always misused, 
> please use strlcat()
> /usr/X11R6/lib/libX11.so.16.0: warning: sprintf() is often misused, please 
> use snprintf()
> 
> What I should do here? Fix these warnings that are not shown in linux?, and
> if I don't fix them, how do I know that a warning if generated only in
> OpenBSD and it is not generated in Linux?, I have to compile in all
> the possible systems and fix all the warning that every compiler
> give me?. If we want to follow warning as a style guide we should
> explititly put what warnings we want to honour and not rely in
> Wall.

That’s  too  vague.  We  don’t live in 1990 anymore, where people didn’t
have POSIX around. The tips given by OpenBSD are useful  and  should  be
fixed.   That’s  why  they  are  warnings. Of course st is not responsi‐
ble to fix problems with X11 headers.

People  still  need  to  report warnings, if they should be fixed. St is
written for its authors. Users have to be programmers. If  we  lose  the
fun and this gets too corporate the chance of introducing bloat gets too
high.

St is a small project. Every warning can be discussed. I don’t use Open‐
BSD, so I won’t fix these warnings, if they are not reported.

Btw.,  the  bloat addition to st is increasing. Do we really need all of
those features to reach the original goal to handle terminal  emulation?
Alternative  charsets?  Why?  I  know you need much of that crap to have
compatibility to old machines you are using. Why don’t  you  throw  them
away?

> Another problem related to the style is you only can know what style is
> used guessing it from the source, but in the source you can find
> some things like:
> 
> if(term.c.x+width < term.col) {
> tmoveto(term.c.x+width, term.c.y);
> } else {
> term.c.state |= CURSOR_WRAPNEXT;
> }
> 
> but you also can find:
> 
>   if(xwrite(cmdfd, s, n) == -1)
> die("write error on tty: %s\n", strerror(errno));
> 
> So, is it mandatory parenthesis in one line if? I don't know it.

It’s mandatory if you have else.


Sincerely,

Christoph Lohmann




Re: [dev] [st] [PATCH] Fix techo handling of control and multibyte characters.

2014-04-28 Thread Roberto E. Vargas Caballero
> > In file included from /usr/X11R6/include/X11/Xlib.h:47,
> >  from st.c:25:
> > /usr/X11R6/include/X11/Xfuncproto.h:144:24: warning: ISO C does not permit 
> > named variadic macros
> > st.c: In function 'techo':
> > st.c:2303: warning: comparison is always false due to limited range of data 
> > type
> > st.c:2303: warning: comparison is always true due to limited range of data 
> > type
> > CC -o st
> > /usr/X11R6/lib/libX11.so.16.0: warning: strcpy() is almost always misused, 
> > please use strlcpy()
> > /usr/X11R6/lib/libX11.so.16.0: warning: strcat() is almost always misused, 
> > please use strlcat()
> > /usr/X11R6/lib/libX11.so.16.0: warning: sprintf() is often misused, please 
> > use snprintf()
> > 
> 

> That’s  too  vague.  We  don’t live in 1990 anymore, where people didn’t
> have POSIX around. The tips given by OpenBSD are useful  and  should  be
> fixed.   That’s  why  they  are  warnings. 

Well, I strongly don't agree in some of this warnings, specially
in strcpy/strlcpy (please guys, I don't want to begin a flame war
about strcpy/strlcpy, it's only my personal opinion), and this is
the reason why I don't fix them. In the case of "comparision is
always...", is due to a macro that is build in order to work for
signed and unsigned numbers, so the only thing compiler must do
here is remove the unneded code and shut up. Another person who
agrees with me is Ken Thompson, who wrote in "A new compiler"
that conditional compilation should be replaced by "if" with
contstant values that compiler must remove in compile time.

> Alternative  charsets?  Why?  I  know you need much of that crap to have
> compatibility to old machines you are using. Why don’t  you  throw  them
> away?

Well, the number of lines related to alternative charsets is less of
20 lines of code, so I don't think it can be called bloat. Alternative
charset are needed because there are terminfo entries that use them
(ac, smacs, rmacs) and they are used for graphical characteres, although
since st understand unicode, graphical characters could be sent directly
using their unicode values.  After reviewing the code, we also can remove
some of the lines related to charset with this patch:

diff --git a/st.c b/st.c
index 5198749..0d1b674 100644
--- a/st.c
+++ b/st.c
@@ -94,12 +94,11 @@ enum glyph_attribute {
ATTR_REVERSE   = 1,
ATTR_UNDERLINE = 2,
ATTR_BOLD  = 4,
-   ATTR_GFX   = 8,
-   ATTR_ITALIC= 16,
-   ATTR_BLINK = 32,
-   ATTR_WRAP  = 64,
-   ATTR_WIDE  = 128,
-   ATTR_WDUMMY= 256,
+   ATTR_ITALIC= 8,
+   ATTR_BLINK = 16,
+   ATTR_WRAP  = 32,
+   ATTR_WIDE  = 64,
+   ATTR_WDUMMY= 128,
 };
 
 enum cursor_movement {
@@ -396,7 +395,6 @@ static void techo(char *, int);
 static bool tcontrolcode(uchar );
 static void tdectest(char );
 static int32_t tdefcolor(int *, int *, int);
-static void tselcs(void);
 static void tdeftran(char);
 static inline bool match(uint, uint);
 static void ttynew(void);
@@ -1535,7 +1533,7 @@ tsetchar(char *c, Glyph *attr, int x, int y) {
/*
 * The table is proudly stolen from rxvt.
 */
-   if(attr->mode & ATTR_GFX) {
+   if(term.trantbl[term.charset] == CS_GRAPHIC0) {
if(BETWEEN(c[0], 0x41, 0x7e) && vt100_0[c[0] - 0x41]) {
c = vt100_0[c[0] - 0x41];
}
@@ -2317,9 +2315,7 @@ void
 tdeftran(char ascii) {
char c, (*bp)[2];
static char tbl[][2] = {
-   {'0', CS_GRAPHIC0}, {'1', CS_GRAPHIC1}, {'A', CS_UK},
-   {'B', CS_USA},  {'<', CS_MULTI},{'K', CS_GER},
-   {'5', CS_FIN},  {'C', CS_FIN},
+   {'0', CS_GRAPHIC0}, {'B', CS_USA},
{0, 0}
};
 
@@ -2332,13 +2328,6 @@ tdeftran(char ascii) {
term.trantbl[term.icharset] = (*bp)[1];
 }
 
-void
-tselcs(void) {
-   MODBIT(term.c.attr.mode,
-  term.trantbl[term.charset] == CS_GRAPHIC0,
-  ATTR_GFX);
-}
-
 bool
 tcontrolcode(uchar ascii) {
static char question[UTF_SIZ] = "?";
@@ -2377,11 +2366,9 @@ tcontrolcode(uchar ascii) {
return 1;
case '\016': /* SO */
term.charset = 0;
-   tselcs();
break;
case '\017': /* SI */
term.charset = 1;
-   tselcs();
break;
case '\032': /* SUB */
tsetchar(question, &term.c.attr, term.c.x, term.c.y);
@@ -2506,7 +2493,6 @@ tputc(char *c, int len) {
return;
} else if(term.esc & ESC_ALTCHARSET) {
tdeftran(ascii);
-   tselcs();
} else if(term.esc & ESC_TEST) {
tdectest(ascii);
} else {
@@ -2593,7 +2579,7 @@ tputc(char *c, int len) {
/*
 * Display control codes only if we are in graphic 

Re: [dev] [st] [PATCH] Fix techo handling of control and multibyte characters.

2014-04-29 Thread Christoph Lohmann
Greetings.

On Tue, 29 Apr 2014 18:12:23 +0200 "Roberto E. Vargas Caballero" 
 wrote:
> > > In file included from /usr/X11R6/include/X11/Xlib.h:47,
> > >  from st.c:25:
> > > /usr/X11R6/include/X11/Xfuncproto.h:144:24: warning: ISO C does not 
> > > permit named variadic macros
> > > st.c: In function 'techo':
> > > st.c:2303: warning: comparison is always false due to limited range of 
> > > data type
> > > st.c:2303: warning: comparison is always true due to limited range of 
> > > data type
> > > CC -o st
> > > /usr/X11R6/lib/libX11.so.16.0: warning: strcpy() is almost always 
> > > misused, please use strlcpy()
> > > /usr/X11R6/lib/libX11.so.16.0: warning: strcat() is almost always 
> > > misused, please use strlcat()
> > > /usr/X11R6/lib/libX11.so.16.0: warning: sprintf() is often misused, 
> > > please use snprintf()
> > > 
> > 
> 
> > That’s  too  vague.  We  don’t live in 1990 anymore, where people didn’t
> > have POSIX around. The tips given by OpenBSD are useful  and  should  be
> > fixed.   That’s  why  they  are  warnings. 
> 
> Well, I strongly don't agree in some of this warnings, specially
> in strcpy/strlcpy (please guys, I don't want to begin a flame war
> about strcpy/strlcpy, it's only my personal opinion), and this is
> the reason why I don't fix them. In the case of "comparision is
> always...", is due to a macro that is build in order to work for
> signed and unsigned numbers, so the only thing compiler must do
> here is remove the unneded code and shut up. Another person who
> agrees with me is Ken Thompson, who wrote in "A new compiler"
> that conditional compilation should be replaced by "if" with
> contstant values that compiler must remove in compile time.

I  will fix all warnings I encounter on my machines. I can use strlcpy()
and strcpy() correctly. sprintf() is of course unsafe.

> > Alternative  charsets?  Why?  I  know you need much of that crap to have
> > compatibility to old machines you are using. Why don’t  you  throw  them
> > away?
> 
> Well, the number of lines related to alternative charsets is less of
> 20 lines of code, so I don't think it can be called bloat. Alternative
> charset are needed because there are terminfo entries that use them
> (ac, smacs, rmacs) and they are used for graphical characteres, although
> since st understand unicode, graphical characters could be sent directly
> using their unicode values.  After reviewing the code, we also can remove
> some of the lines related to charset with this patch:

That  stuff  does  nothing.  But  as seen in many other parts of st, the
source is a bit of a documentary of the vtXXX days, with  all  kinds  of
not  implemented  controls codes. I think it can stay. I asked the ques‐
tions about bloat to review feature additions a bit more, due to st  be‐
ing nearly complete.


Sincerely,

Christoph Lohmann




Re: [dev] [st] [PATCH] Fix techo handling of control and multibyte characters.

2014-05-02 Thread Markus Wichmann
On Fri, Apr 25, 2014 at 04:42:09PM +0200, Roberto E. Vargas Caballero wrote:
> The problem is Wall change from one system to another (this is something
> OpenBSD users know with strcpy calls...), and it doesn't force how to
> remove the warning, so at the end I think style is not really improved
> with Wall (and guys, some of the warning are really, really stupid).
> 

That's one of the problems with C and gcc: Too many messages fall into
the same category of "warning". gcc warns you about a comparison between
signed and unsigned variables even in places where that can't bite you,
like in

int s;
unsigned u;
/* ... */
if (s < 0 || s < u)

In this case the "s < u" is only evaluated if s's value is representable
in unsigned. gcc will warn anyway.

At the same time "implicit declaration of function X" is just a warning,
too. And it can crash your program not to declare a function. If a
function is supposed to return a pointer, and you don't declare it at
the call site, even if you cast the result to remove the "assignment
makes pointer from integer without cast" warning, and you are on x86-64,
and the pointer returned is bigger than INT_MAX (e.g. on the stack),
then the result's top 32 bits will get ignored, the lower 32 bits will
get a sign extension and your program will segfault while trying to
access kernel space. Which is very annoying when the program in question
is "login", which you yourself bothed the day before. This may have
happened to me.

And then there are the bizarre cases, like "variable X may not be
initialized": That is an incredibly helpful warning that is only
produced when gcc is optimizing (I guess it's a byproduct of the flow
analysis, which is part of the optimizer). However, when developing I
usually have the optimizer off so I can debug. And this warning is
removed in newer versions of gcc entirely. But not the command line
option for it. Nice! I'm switching to clang!

> I moved to Linux kernel style some years ago, but I am beginning to break
> some rules, basically the typedef rule. I accept typedef of struct always
> that the new type begins with upper case.
> 

I never typedef my structs. At least for private works. Professionally I
have to abide by the company rules (which are the kind of pussy-footing
rules you'd expect from a bunch of engineers trying to do programming,
like "don't use the ternary operator, it's hard to understand" or "don't
return from anywhere but the final statement of a function" that are
supposed to make the source code easier to read but completely miss that
they still allow 100-line structs and 3000-line functions), but in
private I think C desperately needs namespaces so I'm not going to
destroy the only namespaces C has.

Sorry this became quite a rant and a lot off topic.

Ciao,
Markus