Re: Anybody want to fix ash's backspace line wrap?

Rob Landley Thu, 14 May 2009 13:00:39 -0700

On Thursday 14 May 2009 09:39:17 Denys Vlasenko wrote:
> On Thu, May 14, 2009 at 12:24 AM, Rob Landley <[email protected]> wrote:
> > Wanna see a fun bug?  Try this:
> >
> >  echo -n blahblahblahblahblah
> >
> > Now type off the right edge of your terminal, and then hit backspace to
> > try to go back up.
> >
> > The problem is that ash assumes that its shell prompt always starts at
> > the left edge of the screen, then adds the length of the prompt it prints
> > to figure out where the cursor is.  If you didn't start at the left edge
> > of the screen, this doesn't work.
>
> Yes. bash suffers from the same problem.


Being better than the gnu tools has been a goal of mine all along.  That's why 
I upgraded busybox mount years ago so you didn't need to specify "-o loop" 
and just just autodetected the need for it...

> > The reason it needs to keep track of this is that unix terminals suck.
> > Specifically, backspace won't automatically back you up to the previous
> > line, the only way to do multiline interactive editing is to keep track
> > of your current cursor position and compare it to the width of your
> > screen (which you also have to know) in order know when to cursor _up_
> > and jump to the right edge of the screen.
> >
> > However, if your shell prompt didn't start at the left edge of the screen
> > (which happens surprisingly often), the resulting line editing is
> > completely horked once you go past the right edge of the screen.
> >  (Command history becomes a lot less fun to use, too.)
> >
> > It's easy enough to fix once you know what the problem is and decide to
> > address it.  We just haven't yet...
>
> How to address it?

Query the current cursor position at the start of an interactive shell prompt, 
which is very similar to querying the width and height o the screen.

The ansi escape mentioned previously consists of "save position", "jump to 
lower right", "query current cursor position", "restore saved position".  If 
you just send the query without the save/jump/restore, you should get current 
cursor position back.

In theory, there should also be a way to do it through the same pty querying 
mechanism that's giving us the width and height of the pty, but I haven't 
played with that much and it won't work across a serial console (including 
qemu's -nographic mode, where I personally tend to hit this).

> >  echo -e "\e[s\e[999C\e[999B\e[6n\e[u"
> >
> > That should trigger your ansi-compliant console to send back:
> >
> >  \e[LINES;COLUMNSR
>
> bbox has similar code in resize applet.
>
> http://git.busybox.net/busybox/tree/console-tools/resize.c
>
> Does it need robustify'ing btw?

There's a fundamental problem that applies to command shell prompts that 
doesn't apply quite so much to a resize applet, which is that people tend to 
type ahead in command shells, so there's often input pending before you even 
get around to displaying the prompt.  Thus your escape sequence reply isn't 
necessarily going to be the first thing in the input buffer.

The shell's command line editing and vi already have escape parsing logic, so 
it's relatively easy to fix this in two parts there.  (The escape reply 
should always come in as a single unit without interspersed characters, we 
just don't know _when_ we'll get it, or how long the delay might be.  But 
there could always be unrelated data before/after it, coming in 
asynchronously.)

A more general fix is to extend get_terminal_width_height() with the ANSI 
escape and parsing code so that it tries harder when it can't get useful info 
from the tty querying ioctl.  But this is fiddlier, because it can't always 
blindly output the sequence and hope for the best.

For example, the "ls" command calls get_terminal_width_height() but when 
running that in a pipe adding spurious escape output isn't a good thing.  
However, most of that could be cleaned up by grabbing the code from the tty 
applet to identify our controlling tty (and whether we have one) and write 
the probe string directly to the tty instead of to stdout, and simply not 
probe if we haven't got a tty.  (I just checked and the tty command does 
output /dev/ttyS0 or /dev/console when it's your controlling tty, even though 
neither knows your screen size.  The only time we _don't_ have a tty is when 
we're in a pipe.)

I suspect ls already does a little of this sort of thing for colors, but I 
haven't checked...

> Try running eval `resize`. Does it (a) work at all? and
> (b) does it set LINES and COLUMNS?

Setting columns and lines is a good thing, but A) it's a separate issue from 
detecting cursor position, B) I think it's something the shell should do 
automatically.  (It already does so in the pty case, just not for serial 
consoles.)

The more I think about it, the more the general fix to 
get_terminal_width_height() seems like the way to go, because there are 
actually _three_ race conditions with querying cursor position.

The first is of course that the user can press a key at any time, so the 
escape sequence reply you get may not be the first data in the input.  
Robustifying get_terminal_width_height() means adding at least an optional 
way to receive back all the other input that came in before the escape reply, 
so you can process it.  (Note that this could include other unrelated 
escapes, like "cursor up" for command history, so we can't just stop at the 
first escape sequence, it has to be the _right_ one.  Again, there's existing 
escape parsing logic in busybox...)

The second race condition is that the standard vi or shell escape parsing 
logic will print characters out as it receives them, and this moves the 
cursor position.  So when we _do_ get the cursor position escape back, 
presumably that's the cursor position at the point in the data stream 
_before_ we printed the extra characters that were pending in our input... 
but we've lost that context.  Seems like it's best not to print any output 
between sending the probe and receiving the reply (or timing out waiting for 
the reply), and then we can parse the other input afterwards.

The third race condition is that if you're querying both screen size _and_ 
cursor position, you get back the same reply string and it means two 
different things depending on context.  (This is only a race condition if you 
alter the vi/shell escape parsing logic and respond to it asynchornously in a 
different context than the probe was sent from.  Yeah you could have a global 
variable indicate which one you were waiting for but that's ugly.)

So yeah, leaning towards a get_terminal_width_height() upgrade that can return 
arbitrary "extra" data it got while waiting.

A tty_ungets() that actually _worked_ would be kind of nice, but Unix has 
never had that, and implementing it would be kind of brain-bending anyway.  
(And no, FILE * doesn't count.)

> Anyway. This trick can be used for (a) determining screen size
> by moving cursor to the end (ESC [ 999 ; 999 H) and then
> asking "where is the cursor now?", which solves "serial console
> does not know its size" problem,
> and (b) determining current cursor position by just asking
> "where is the cursor now?", which solves "shell after echo -n blabla"
> problem. Which problem of these two do you want to solve?

Both, but they're separate problems.

> > So you'd then want to teach your ansi escape sequence parsing logic (the
> > stuff currently handling cursor up/down/left/right and so on) to
> > understand esc[YY;XXR escapes  and use them to set $LINES and $COLUMNS
> > respectively.
>
> Can be done. But parser would be confused by the two different usages.
> What do you want to do, (a) or (b)?

Both, but in a different place as described above.

> > Note you probably only want to send that escape sequence if there _isn't_
> > already input pending.
>
> If terminal is not so dumb and it never intermixes this sequence
> with real user input, we can just sit and wait for that code to show up
> anytime.

All escape sequences an xterm sends us will be in a block without any other 
data interspersed between them.  The probe response string telling us are 
current cursor position (which is all the width/height probe string is 
actually doing, we just moved the cursor first) is no different than "cursor 
up" in that regard.  The xterm won't start sending us the next input event 
until it's finished sending us this one.

Alas, "real user input" is asynchronous so the xterm could handle keypresses 
before it gets our probe string and have some pending right after it finishes 
with our probe string, so the block of data we read could have other data 
before and after the escape sequence we're looking for, and that includes 
other escape sequences.  You don't want to discard other data read at the 
beginning, and don't want to read extra data past the end.

The really _fun_ part is where the user hits cursor up followed by capital R.  
If our escape parser doesn't understand the sequences it's seeing, then it 
gets esc[AR and goes "right, esc through R, _this_ is the bit I need to 
parse"...  Which eats unrelated data and returns nonsense.

Perhaps libbb/read_key.c could be modified to do what we want?  Hmmm...  
Except I don't like sucking in that much code for something like "ls" that 
isn't going to use it.  (Yeah, it's shared, but nobody guaranteed we were 
going to build "vi" or a shell in this busybox...)

> > (Has the escape sequence parsing logic I tweaked in
> > the VI code last year been genericized yet?)
>
> Yes.
>
> http://git.busybox.net/busybox/tree/libbb/read_key.c

Coolness.

> > The reason to do this is it'll drill through to your xterm even through a
> > serial console, and return the width and height of the console even when
> > the remote system busybox has NO clue what your console size is because
> > that information lives on a separate computer.  The shell could then set
> > LINES and COLUMNS based on that for the next command it runs, although it
> > should remember that they weren't set when the shell was originally run
> > so it shouldn't trust those values, and should probably re-query it every
> > prompt in that case because we don't get WINCH signals through a serial
> > console either...
>
> Basic sanity testing would be necessary in the parser, yes.

I don't see where I said that, but if you say so...

> We do not want to find ourself in e.g.  -5x999999999 window.

Don't write the parser to understand negative numbers?

Are you assuming the xterm is broken and sending crazy escape sequences, or 
are you assuming someone is trying to feed in specially crafted escape 
sequences to hack busybox to make it segfault or go out of memory?

Look at the probe sequence; we set our position to 999 by 999.  If the 
window's larger than that, we won't get the correct size anyway.  (Although 
from a UI perspective such a terminal would be a bit unwieldy for humans to 
use even with a monitor the size of your wall, so I'm not too worried.)

So if you really want a sanity test, if either coordinate is <1 or >999 then 
somebody's feeding in hand-crafted data to hack us.

Also, we're not _setting_ window size.  Just querying the current one.

Rob
-- 
Latency is more important than throughput. It's that simple. - Linus Torvalds
_______________________________________________
busybox mailing list
[email protected]
http://lists.busybox.net/mailman/listinfo/busybox

Re: Anybody want to fix ash's backspace line wrap?

Reply via email to