Re: delete ligature support for Arabic "la" from the less(1) command line

2019-09-09 Thread Ali Farzanrad
Hi Ingo,

Thanks for your effort on Unicode support.  I hope my feedback as a
native Persian speaker will be helpful.

Ingo Schwarze wrote:
> If i understand correctly, xterm(1) does indeed have that problem.
> I prepared a test file that contains, in this order,
> 
>  - some Latin characters
>  - the Arabic word "la" ("no"), i.e. first LAM, then ALEF
>  - some more Latin characters
>  - the Arabic word "al" ("the"), i.e. first ALEF, then LAM
>  - some final Latin characters
> 
> And indeed, xterm(1) does not respect the writing direction of the
> individual words.  When cat(1)'ing the file to stdout, both xterm(1)
> and konsole(1) show all the words from left to right, but *inside*
> each word, konsole(1) uses the correct writing direction: right to
> left for Arabic and left to right for Latin.  For example, in the
> Arabic word "al", konsole(1) correctly shows the ALEF right of the
> LAM, whereas xterm(1) wrongly shows the ALEF left of the LAM.
> 

There are many rules.  Each character has a direction of its own.
For example, English letters are LTR (left-to-right) and Arabic /
Persian letters are RTL (right-to-left), but some characters, such as
symbols, have no direction of their own and take it from their
neighbours.  For example, when you write:

'A' '+' 'B'

It should be displayed as is (here '+' behaves as LTR), but when you write:

'A' ALEF '+' LAM 'B'

The '+' should be displayed on the left side of ALEF (here '+' behaves
as RTL):

'A' LAM '+' ALEF 'B'

I think you need to detect all maximal non-LTR substrings (those that
don't start or end with a symbol) inside LTR strings to render them
correctly.  There are also RTL / LTR control characters in Unicode that
override this behaviour.
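
As a rough sketch of the rule described above (not a real bidi
implementation, and not code from xterm(1) or less(1)), the following C
fragment reverses each maximal non-LTR run that starts and ends with an
RTL character.  The classifier is a deliberate simplification and an
assumption of this sketch: ASCII letters count as LTR, the basic Arabic
block counts as RTL, and everything else (digits included) is treated as
having no direction.  The names classify() and reorder_visual() are
hypothetical, and control characters are ignored entirely.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum dir { DIR_LTR, DIR_RTL, DIR_NEUTRAL };

/*
 * Toy classifier (assumption): ASCII letters are LTR, the basic
 * Arabic block U+0600..U+06FF is RTL, everything else is neutral.
 */
static enum dir
classify(uint32_t cp)
{
	if ((cp >= 'A' && cp <= 'Z') || (cp >= 'a' && cp <= 'z'))
		return DIR_LTR;
	if (cp >= 0x0600 && cp <= 0x06FF)
		return DIR_RTL;
	return DIR_NEUTRAL;
}

/*
 * Reverse, in place, every maximal non-LTR run that starts and ends
 * with an RTL character.  Neutrals at the edges of a run stay put.
 */
static void
reorder_visual(uint32_t *cp, size_t n)
{
	size_t i = 0;

	assert(cp != NULL || n == 0);
	while (i < n) {
		size_t start, end, j;

		if (classify(cp[i]) != DIR_RTL) {
			i++;
			continue;
		}
		/* Extend to the next LTR character, remembering the last RTL one. */
		start = end = i;
		for (j = i; j < n && classify(cp[j]) != DIR_LTR; j++)
			if (classify(cp[j]) == DIR_RTL)
				end = j;
		/* Reverse cp[start..end] to get visual order. */
		while (start < end) {
			uint32_t tmp = cp[start];

			cp[start++] = cp[end];
			cp[end--] = tmp;
		}
		i = j;
	}
}
```

Applied to the sequence 'A' ALEF '+' LAM 'B', this yields
'A' LAM '+' ALEF 'B', while 'A' '+' 'B' is left untouched.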

> I'm not entirely sure this has much to do with ligatures, though.
> What matters for building ligatures is only the logical ordering,
> the ordering in *time* so to speak, i.e. what comes before and what
> comes after.  LAM before ALEF has to become the ligature glyph "la",
> whereas ALEF before LAM remains two glyphs.  Technically, the
> question of ordering in space, whether glyphs are painted onto the
> screen right to left or left to right, only comes into play after
> characters have already been combined into glyphs.
> 
> Actually, now that you bring up the topic, i see another situation
> where less(1) causes an issue.  Let's use konsole(1) and not xterm(1)
> such that we get the correct writing direction, and let's put the
> word "al" onto the screen.  No ligature here, so that part of the
> topic is suspended for a moment.  Now let's slowly scroll right in
> one-column steps.  All is fine as long as the word "al" is completely
> visible on screen.  But when the final letter LAM of "al" is in the
> last (leftmost) column of the screen and you scroll right one more
> column, something weird happens, even in konsole(1).  You would
> expect the final letter LAM to scroll off screen first and the initial
> letter ALEF to remain on the screen for a little longer.  Instead,
> less(1) incorrectly thinks the *initial* letter of the word scrolls
> off screen first, and it tells konsole(1) to display the ALEF in the
> leftmost column of the screen while the LAM just went off-screen.
> That looks weird because there is no word in that text beginning
> with ALEF.
> 

It's a difficult problem.  You need to consider all maximal non-LTR
substrings, and all LTR / RTL modifiers.  Also consider a file with long
RTL lines: users prefer to see the beginning of lines (in all languages,
readers read from the start), so less(1) should display the right-most
part of each line, and when the user scrolls the text to the right,
less(1) should display the left side of each line.
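
One way to picture this, purely as an illustrative sketch: treat a line
as len visual columns (0 is the leftmost) and compute which columns are
visible after scrolling shift columns away from the start of the line.
For an LTR line the start is the left edge; for an RTL line it is the
right edge.  The struct and function names are hypothetical and are not
taken from less(1), which has no such notion of per-line direction.

```c
#include <assert.h>
#include <stddef.h>

struct window {
	size_t first;	/* first visible visual column, 0 = leftmost */
	size_t end;	/* one past the last visible visual column */
};

/*
 * len:   width of the line in display columns
 * cols:  width of the screen in display columns
 * shift: columns scrolled away from the start of the line
 * rtl:   nonzero if the line starts at its right edge
 */
static struct window
visible_window(size_t len, size_t cols, size_t shift, int rtl)
{
	struct window w = { 0, 0 };
	size_t avail;

	assert(cols > 0);
	if (shift >= len)
		return w;		/* scrolled past the whole line */
	avail = len - shift;
	if (avail > cols)
		avail = cols;
	if (rtl) {
		/* Anchor the window at the right edge of the line. */
		w.end = len - shift;
		w.first = w.end - avail;
	} else {
		/* Anchor the window at the left edge of the line. */
		w.first = shift;
		w.end = shift + avail;
	}
	return w;
}
```

For a 100-column RTL line on an 80-column screen, the unscrolled window
covers visual columns 20 to 99, so the reader sees the start of the
line first; scrolling then reveals the columns further to the left.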

I think that if xterm had a complete RTL mode with swapped right and
left keys, it might solve many problems.  In your example in an RTL
xterm, there would be no right scroll (because of the swapped keys), and
when you scroll less(1) to the left, less(1) would correctly scroll the
initial letter off screen.  Of course this will not work on complex
mixed RTL / LTR texts, but it solves the problem in the most common
situations.

> This means that being able to properly view Arabic or Farsi text
> with the default OpenBSD terminal emulator and parser would require
> 
>  1. bidi support in xterm(1)
>     to render Farsi words with the correct writing direction
>  2. ligature support in xterm(1)
>     to correctly connect letters
>  3. bidi support in less(1)
>     to correctly scroll parts of words on and off screen, horizontally

Given the previous example (a file with long RTL lines), I don't
agree with adding bidi support to less(1).

>  4. ligature support in less(1)
>     for correct columnation
> 
> As far as i understand, you are saying that the extremely fragmentary
> support for item 4 which we happen to have right now is not really
> useful without items 1-3, and even when using konsole(1), which does
> have items 1 and 2, implementing item 3 before item 4 would make
> sense because item 3 is more important.
> 
> So my understanding is that you are not objecting to the patch because

Re: delete ligature support for Arabic "la" from the less(1) command line

2019-09-01 Thread Ingo Schwarze
Hello Mohammadreza,

Mohammadreza Abdollahzadeh wrote on Sun, Sep 01, 2019 at 09:40:16AM +0430:

> Persian is my native language and I think that the major problem that
> all RTL (Right-To-Left) languages like Persian and Arabic currently suffer
> from is the lack of BiDi (bidirectionality) support in console and terminal
> environments like xterm(1).  KDE konsole(1) supports bidi and that's why it
> shows ligatures correctly.
> I think any attempt to fix such problems must first start with adding bidi
> support to xterm and other terminal environments.

Thank you for your feedback!

If i understand correctly, xterm(1) does indeed have that problem.
I prepared a test file that contains, in this order,

 - some Latin characters
 - the Arabic word "la" ("no"), i.e. first LAM, then ALEF
 - some more Latin characters
 - the Arabic word "al" ("the"), i.e. first ALEF, then LAM
 - some final Latin characters

And indeed, xterm(1) does not respect the writing direction of the
individual words.  When cat(1)'ing the file to stdout, both xterm(1)
and konsole(1) show all the words from left to right, but *inside*
each word, konsole(1) uses the correct writing direction: right to
left for Arabic and left to right for Latin.  For example, in the
Arabic word "al", konsole(1) correctly shows the ALEF right of the
LAM, whereas xterm(1) wrongly shows the ALEF left of the LAM.

I'm not entirely sure this has much to do with ligatures, though.
What matters for building ligatures is only the logical ordering,
the ordering in *time* so to speak, i.e. what comes before and what
comes after.  LAM before ALEF has to become the ligature glyph "la",
whereas ALEF before LAM remains two glyphs.  Technically, the
question of ordering in space, whether glyphs are painted onto the
screen right to left or left to right, only comes into play after
characters have already been combined into glyphs.

Actually, now that you bring up the topic, i see another situation
where less(1) causes an issue.  Let's use konsole(1) and not xterm(1)
such that we get the correct writing direction, and let's put the
word "al" onto the screen.  No ligature here, so that part of the
topic is suspended for a moment.  Now let's slowly scroll right in
one-column steps.  All is fine as long as the word "al" is completely
visible on screen.  But when the final letter LAM of "al" is in the
last (leftmost) column of the screen and you scroll right one more
column, something weird happens, even in konsole(1).  You would
expect the final letter LAM to scroll off screen first and the initial
letter ALEF to remain on the screen for a little longer.  Instead,
less(1) incorrectly thinks the *initial* letter of the word scrolls
off screen first, and it tells konsole(1) to display the ALEF in the
leftmost column of the screen while the LAM just went off-screen.
That looks weird because there is no word in that text beginning
with ALEF.

This means that being able to properly view Arabic or Farsi text
with the default OpenBSD terminal emulator and parser would require

 1. bidi support in xterm(1)
    to render Farsi words with the correct writing direction
 2. ligature support in xterm(1)
    to correctly connect letters
 3. bidi support in less(1)
    to correctly scroll parts of words on and off screen, horizontally
 4. ligature support in less(1)
    for correct columnation

As far as i understand, you are saying that the extremely fragmentary
support for item 4 which we happen to have right now is not really
useful without items 1-3, and even when using konsole(1), which does
have items 1 and 2, implementing item 3 before item 4 would make
sense because item 3 is more important.

So my understanding is that you are not objecting to the patch because
the fragmentary support for item 4 is practically useless in isolation.


The following is not related to this patch, but i think it makes
sense to mention it here: regarding the future, i think items 1 and
3 are much easier to support than items 2 and 4 because bidi support,
if i understand correctly, only needs one bit of information per
character because it only needs to know whether the character is
part of a right to left or left to right script, so the complexity
on the libc level, where we want complexity least of all places,
is comparable to other boolean character properties like those
listed in the iswalnum(3) manual page.  Realistically, though,
bidi support would still be a large project, and i don't think it
makes sense to tackle it any time soon.

Ligature support feels much worse than bidi support because the
mapping required is not merely character -> boolean but (character +
character) -> character, which is more complicated than even the
(character + character) -> -1/0/+1 mapping required for collation
support - and we decided that we don't want collation support in
libc because it would cause excessive complexity.  Admittedly,
collations are strongly locale-dependent, while i'm not sure ligatures
are locale-depe

Re: delete ligature support for Arabic "la" from the less(1) command line

2019-08-31 Thread Mohammadreza Abdollahzadeh
Hi Ingo,
Persian is my native language and I think that the major problem that
all RTL (Right-To-Left) languages like Persian and Arabic currently suffer
from is the lack of BiDi (bidirectionality) support in console and terminal
environments like xterm(1).  KDE konsole(1) supports bidi and that's why it
shows ligatures correctly.
I think any attempt to fix such problems must first start with adding bidi
support to xterm and other terminal environments.

best regards.



Re: delete ligature support for Arabic "la" from the less(1) command line

2019-08-31 Thread Evan Silberman
Ingo Schwarze wrote:
> I have no idea how many of those work in konsole(1) - but i'm sure
> none of those, except the four LAM WITH ALEF discussed here, work
> with less(1), so i think support for LAM WITH ALEF provided no value
> in the first place.  The way it is implemented, with an ad-hoc table
> inside less(1) of character combinations that form ligatures, is
> just wrong and not sustainable by any stretch of the imagination,
> i think.
> 
> On top of that, how characters combine in Arabic is strongly context
> dependent; even the syllable "la" forms a different ligature depending
> on whether it is isolated or at the end of a longer word, and none
> of the context dependencies are implemented in less(1) anyway.
> 
> And finally, people say the situation in many Indian languages is
> even more dire than in Arabic, so what our less(1) tries to do is
> almost certainly completely useless for those languages, even if
> we would expand the ad-hoc table.
> 
> So, i propose to delete support for combining characters into
> ligatures from our less(1): at this point, it is only used for
> typing at the less prompt anyway (and not for the file displayed),
> only for Arabic, and only for the single ligature "la".  If we ever
> want better ligature support in the future, i think we would have
> to make a fresh start anyway - and i think there are many other
> things to do before that.

I did less practical research than you did when I looked at this bit of
code but your conclusions match mine: this is an attempt at an
implementation of a tiny subset of the vastly complex problem of digital
typesetting of the Arabic alef-bet. Keeping the code is probably worse
than no solution at all, because (as you noted) it's the wrong
implementation in the wrong place and "improving" it by adding more
combination rules here would be a mistake.

--Evan Silberman



delete ligature support for Arabic "la" from the less(1) command line

2019-08-31 Thread Ingo Schwarze
Hi,

i have to admit that i am neither able to speak nor to write nor
to understand the Arabic language nor the Arabic script, but here
is my current, probably incomplete understanding of what our less(1)
program is trying to do with Arabic ligatures.

If somebody is reading this who is able to read and write Arabic
or an Indian language heavily using ligatures, feedback is highly
welcome.

Arabic is a cursive script, which means that when writing Arabic,
characters do not map 1:1 to glyphs.  Instead, there are rules about
how adjacent characters attach to each other, forming ligatures.

As an extremely simple example, consider the Arabic adverb "la",
which means the same as the English adverb "no".  It consists of
the two letters U+0644 LAM and U+0627 ALEF, the LAM appearing before
(i.e. to the right of) the ALEF.  However, you do not write both
letters separately.  Instead, the ALEF leans forward (to the left)
and attaches to the LAM, forming the glyph U+FEFB, ARABIC LIGATURE
LAM WITH ALEF ISOLATED FORM.  When displayed in a fixed width font,
that ligature only occupies a single display column just like any
other Arabic or Latin glyph.  The LAM WITH ALEF glyph is not a
double-width glyph like Japanese or Chinese characters typically
are.

So, when this happens, you have four bytes of UTF-8 forming two
Unicode characters, and *together*, these two characters occupy
only one single display column.
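
The byte count is easy to check with a small UTF-8 encoder.  The
following is a self-contained sketch written for this illustration (the
function name utf8_encode() is hypothetical and is not the routine used
in less(1)): U+0644 LAM and U+0627 ALEF each need two bytes, while the
precomposed U+FEFB needs three.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Encode one code point as UTF-8 into buf (room for 4 bytes needed).
 * Returns the number of bytes written, or 0 if cp cannot be encoded.
 */
static size_t
utf8_encode(uint32_t cp, unsigned char *buf)
{
	assert(buf != NULL);
	if (cp < 0x80) {
		buf[0] = (unsigned char)cp;
		return 1;
	}
	if (cp < 0x800) {
		buf[0] = (unsigned char)(0xC0 | (cp >> 6));
		buf[1] = (unsigned char)(0x80 | (cp & 0x3F));
		return 2;
	}
	if (cp >= 0xD800 && cp <= 0xDFFF)
		return 0;		/* surrogates are not characters */
	if (cp < 0x10000) {
		buf[0] = (unsigned char)(0xE0 | (cp >> 12));
		buf[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
		buf[2] = (unsigned char)(0x80 | (cp & 0x3F));
		return 3;
	}
	if (cp < 0x110000) {
		buf[0] = (unsigned char)(0xF0 | (cp >> 18));
		buf[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
		buf[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
		buf[3] = (unsigned char)(0x80 | (cp & 0x3F));
		return 4;
	}
	return 0;			/* beyond U+10FFFF */
}
```

LAM encodes as the bytes D9 84 and ALEF as D8 A7, which gives the four
bytes mentioned above for a single display column.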

Note that in the default configuration, our xterm(1) is not able
to display Arabic characters at all.  But even when you run
  xterm -fa arabic
or
  xterm -fa fixed
which uses FreeType support instead of the default X toolkit font
support, such that xterm(1) does become able to display single
Arabic characters, it still displays the word "la" incorrectly,
failing to generate the required ligature and instead displaying
the two characters LAM and ALEF separately.

So i installed konsole-18.12.0p1 for testing (which pulls in
ridiculous amounts of dependencies, dozens of them, but oh well,
i guess support for advanced Unicode features isn't trivial).
The konsole(1) program does display the word "la" correctly, as a
ligature.

Now, running less(1) inside konsole(1), i found that columnation
is already subtly broken.  As long as the "la" ligature is visible
on screen, all is fine.  Now scroll to the right until the "la"
appears in the first screen column.  Then scroll one more column
to the right by pressing "1 RIGHTARROW".  Now you see *half* the
ligature, i.e. an isolated ALEF, in the first column of the screen,
even though the Arabic word does not contain an isolated ALEF.
Besides, we just attempted to scroll the "la" off screen, so the
ALEF now appears in the column one to the right of where the "la"
should actually be, and all the rest of the line is shifted one
column to the right, too, so columnation is now off by one.
Scrolling back left, columnation recovers to correct display.

I strongly suspect i broke that during my previous UTF-8 cleanup
work on less(1).

However, LAM WITH ALEF is literally the only ligature that less(1)
supports, together with three variations (with MADDA above, with
HAMZA above, and with HAMZA below).  But there are hundreds of
ligatures in Arabic, see

  https://www.unicode.org/charts/PDF/UFB50.pdf
  https://www.unicode.org/charts/PDF/UFE70.pdf

I have no idea how many of those work in konsole(1) - but i'm sure
none of those, except the four LAM WITH ALEF discussed here, work
with less(1), so i think support for LAM WITH ALEF provided no value
in the first place.  The way it is implemented, with an ad-hoc table
inside less(1) of character combinations that form ligatures, is
just wrong and not sustainable by any stretch of the imagination,
i think.

On top of that, how characters combine in Arabic is strongly context
dependent; even the syllable "la" forms a different ligature depending
on whether it is isolated or at the end of a longer word, and none
of the context dependencies are implemented in less(1) anyway.
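
To make the shape of the problem concrete, here is a hypothetical lookup
for just the four LAM WITH ALEF pairs, including the isolated versus
final distinction.  It is written for this illustration and is not the
table from less(1); the names lam_alef_table and lam_alef_ligature()
are made up.  The code points are those of the Arabic Presentation
Forms-B block.  Even this tiny table needs a context flag, which hints
at how quickly an ad-hoc approach grows.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define LAM	0x0644

struct lam_alef {
	uint32_t alef;		/* second character of the pair */
	uint32_t isolated;	/* ligature when LAM starts the cluster */
	uint32_t final;		/* ligature when LAM joins a preceding letter */
};

static const struct lam_alef lam_alef_table[] = {
	{ 0x0622, 0xFEF5, 0xFEF6 },	/* ALEF WITH MADDA ABOVE */
	{ 0x0623, 0xFEF7, 0xFEF8 },	/* ALEF WITH HAMZA ABOVE */
	{ 0x0625, 0xFEF9, 0xFEFA },	/* ALEF WITH HAMZA BELOW */
	{ 0x0627, 0xFEFB, 0xFEFC },	/* ALEF */
};

/*
 * Return the presentation-form ligature for the pair (first, second),
 * or 0 if the pair does not form one of the four ligatures.
 * joins_previous must be 1 if the LAM connects to a letter before it.
 */
static uint32_t
lam_alef_ligature(uint32_t first, uint32_t second, int joins_previous)
{
	size_t i;

	assert(joins_previous == 0 || joins_previous == 1);
	if (first != LAM)
		return 0;
	for (i = 0; i < sizeof(lam_alef_table) / sizeof(lam_alef_table[0]); i++)
		if (lam_alef_table[i].alef == second)
			return joins_previous ?
			    lam_alef_table[i].final :
			    lam_alef_table[i].isolated;
	return 0;
}
```

Note that the order matters: LAM followed by ALEF yields U+FEFB (or
U+FEFC in final position), while ALEF followed by LAM yields nothing.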

And finally, people say the situation in many Indian languages is
even more dire than in Arabic, so what our less(1) tries to do is
almost certainly completely useless for those languages, even if
we would expand the ad-hoc table.

So, i propose to delete support for combining characters into
ligatures from our less(1): at this point, it is only used for
typing at the less prompt anyway (and not for the file displayed),
only for Arabic, and only for the single ligature "la".  If we ever
want better ligature support in the future, i think we would have
to make a fresh start anyway - and i think there are many other
things to do before that.

Note that this only removes support for combining characters into
ligatures that can also stand on their own; support for purely
combining accents like U+0300 COMBINING GRAVE ACCENT and U+3099
COMBINING KATAKANA-HIRAGANA VOICED SOUND MARK remains intact.

OK?
  Ingo


Index: charset.c
=

Re: less(1): `!' command

2017-12-23 Thread Jeremie Courreges-Anglas
On Fri, Dec 22 2017, Stuart Henderson wrote:
> On 2017/12/22 19:47, Nicholas Marriott wrote:
>> I don't think we should bring ! back.
>> 
>> I wanted to remove v and | (and some other stuff) shortly afterwards, but
>> several people objected.
>> 
>> I did suggest having a lightweight less in base for most people and adding
>> the full upstream less to ports for the stuff we don't want to maintain
>> (like we do for eg libevent) but other people didn't like that idea.
>
> less(1) can already be made more lightweight by setting LESSSECURE=1.
> (I quite like this even without the reduced pledge, my biggest annoyance
> with less is when I accidentally press 'v').
>
> Any opinions on switching the default?

Makes sense to me, I can live without the 's' command.  ok jca@

> Index: main.c
> ===
> RCS file: /cvs/src/usr.bin/less/main.c,v
> retrieving revision 1.35
> diff -u -p -u -1 -2 -r1.35 main.c
> --- main.c  17 Sep 2016 15:06:41 -  1.35
> +++ main.c  22 Dec 2017 22:19:04 -
> @@ -87,17 +87,17 @@ main(int argc, char *argv[])
>  
> - secure = 0;
> + secure = 1;
>   s = lgetenv("LESSSECURE");
> - if (s != NULL && *s != '\0')
> - secure = 1;
> + if (s != NULL && strcmp(s, "0") == 0)
> + secure = 0;
>  
>   if (secure) {
>   if (pledge("stdio rpath wpath tty", NULL) == -1) {
>   perror("pledge");
>   exit(1);
>   }
>   } else {
>   if (pledge("stdio rpath wpath cpath fattr proc exec tty", NULL) == -1) {
>   perror("pledge");
>   exit(1);
>   }
>   }
> Index: less.1
> ===
> RCS file: /cvs/src/usr.bin/less/less.1,v
> retrieving revision 1.52
> diff -u -p -r1.52 less.1
> --- less.1  24 Oct 2016 13:46:58 -  1.52
> +++ less.1  22 Dec 2017 22:17:28 -
> @@ -1674,9 +1674,7 @@ differences in invocation syntax, the
>  .Ev LESSEDIT
>  variable can be changed to modify this default.
>  .Sh SECURITY
> -When the environment variable
> -.Ev LESSSECURE
> -is set to 1,
> +Normally,
>  .Nm
>  runs in a "secure" mode.
>  This means these features are disabled:
> @@ -1698,6 +1696,10 @@ Metacharacters in filenames, such as "*"
>  .It " "
>  Filename completion (TAB, ^L).
>  .El
> +.Pp
> +To enable these features, set the environment variable
> +.Ev LESSSECURE
> +to 0.
>  .Sh COMPATIBILITY WITH MORE
>  If the environment variable
>  .Ev LESS_IS_MORE
>

-- 
jca | PGP : 0x1524E7EE / 5135 92C1 AD36 5293 2BDF  DDCC 0DFA 74AE 1524 E7EE



Re: less(1): `!' command

2017-12-23 Thread kshe
On Fri, 22 Dec 2017 22:21:12 +, Stuart Henderson wrote:
> On 2017/12/22 19:47, Nicholas Marriott wrote:
> > I don't think we should bring ! back.
> >
> > I wanted to remove v and | (and some other stuff) shortly afterwards, but
> > several people objected.
> >
> > I did suggest having a lightweight less in base for most people and adding
> > the full upstream less to ports for the stuff we don't want to maintain
> > (like we do for eg libevent) but other people didn't like that idea.
>
> less(1) can already be made more lightweight by setting LESSSECURE=1.
> (I quite like this even without the reduced pledge, my biggest annoyance
> with less is when I accidentally press 'v').
>
> Any opinions on switching the default?

I thought about that possibility too, and I mostly agree with the idea
as I also run less(1) in secure mode very often, but it is nevertheless
quite irrelevant to the original concern, which is that, when one
chooses not to run less(1) in secure mode, whether that mode is the
default one or not, it is inconsistent, for multiple reasons, to have
removed the `!' command, but not `v' nor `|'.

Until some form of agreement can be reached on that issue, I have
reverted the removal of `!' in my personal tree, so I still pay the
exact same price as everybody else ("proc exec"), but at least I now get
something useful out of that.

Regards,

kshe



Re: less(1): `!' command

2017-12-23 Thread Alexander Hall


On December 22, 2017 11:21:12 PM GMT+01:00, Stuart Henderson wrote:
>On 2017/12/22 19:47, Nicholas Marriott wrote:
>> I don't think we should bring ! back.
>> 
>> I wanted to remove v and | (and some other stuff) shortly afterwards, but
>> several people objected.
>> 
>> I did suggest having a lightweight less in base for most people and adding
>> the full upstream less to ports for the stuff we don't want to maintain
>> (like we do for eg libevent) but other people didn't like that idea.
>
>less(1) can already be made more lightweight by setting LESSSECURE=1.
>(I quite like this even without the reduced pledge, my biggest annoyance
>with less is when I accidentally press 'v').
>
>Any opinions on switching the default?

An interesting twist on this is that if someone is currently (mistakenly)
using LESSSECURE=0, e.g. so as not to have their system "less secure",
they currently achieve the intended goal, while after this change they
would not.
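
The difference can be seen by isolating the two tests from the main.c
diff into small functions.  This is only a sketch: the function names
secure_before() and secure_after() are made up for illustration, and the
argument stands for whatever lgetenv("LESSSECURE") returns.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Before the diff: any non-empty LESSSECURE value enables secure mode. */
static int
secure_before(const char *s)
{
	return s != NULL && *s != '\0';
}

/* After the diff: secure mode is on unless LESSSECURE is exactly "0". */
static int
secure_after(const char *s)
{
	return !(s != NULL && strcmp(s, "0") == 0);
}
```

With LESSSECURE=0, secure_before() returns 1 and secure_after() returns
0, so the same setting flips from secure to non-secure.  An unset or
empty variable flips the other way, from non-secure to secure.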

Not sure if misconfigured systems are our main focus, but given the name 
"less", the aforementioned mistake doesn't strike me as totally unreasonable.

/Alexander

>Index: main.c
>===
>RCS file: /cvs/src/usr.bin/less/main.c,v
>retrieving revision 1.35
>diff -u -p -u -1 -2 -r1.35 main.c
>--- main.c 17 Sep 2016 15:06:41 -  1.35
>+++ main.c 22 Dec 2017 22:19:04 -
>@@ -87,17 +87,17 @@ main(int argc, char *argv[])
> 
>-  secure = 0;
>+  secure = 1;
>   s = lgetenv("LESSSECURE");
>-  if (s != NULL && *s != '\0')
>-  secure = 1;
>+  if (s != NULL && strcmp(s, "0") == 0)
>+  secure = 0;
> 
>   if (secure) {
>   if (pledge("stdio rpath wpath tty", NULL) == -1) {
>   perror("pledge");
>   exit(1);
>   }
>   } else {
>   if (pledge("stdio rpath wpath cpath fattr proc exec tty", NULL) == -1) {
>   perror("pledge");
>   exit(1);
>   }
>   }
>Index: less.1
>===
>RCS file: /cvs/src/usr.bin/less/less.1,v
>retrieving revision 1.52
>diff -u -p -r1.52 less.1
>--- less.1 24 Oct 2016 13:46:58 -  1.52
>+++ less.1 22 Dec 2017 22:17:28 -
>@@ -1674,9 +1674,7 @@ differences in invocation syntax, the
> .Ev LESSEDIT
> variable can be changed to modify this default.
> .Sh SECURITY
>-When the environment variable
>-.Ev LESSSECURE
>-is set to 1,
>+Normally,
> .Nm
> runs in a "secure" mode.
> This means these features are disabled:
>@@ -1698,6 +1696,10 @@ Metacharacters in filenames, such as "*"
> .It " "
> Filename completion (TAB, ^L).
> .El
>+.Pp
>+To enable these features, set the environment variable
>+.Ev LESSSECURE
>+to 0.
> .Sh COMPATIBILITY WITH MORE
> If the environment variable
> .Ev LESS_IS_MORE



Re: less(1): `!' command

2017-12-22 Thread Stuart Henderson
On 2017/12/22 19:47, Nicholas Marriott wrote:
> I don't think we should bring ! back.
> 
> I wanted to remove v and | (and some other stuff) shortly afterwards, but
> several people objected.
> 
> I did suggest having a lightweight less in base for most people and adding
> the full upstream less to ports for the stuff we don't want to maintain
> (like we do for eg libevent) but other people didn't like that idea.

less(1) can already be made more lightweight by setting LESSSECURE=1.
(I quite like this even without the reduced pledge, my biggest annoyance
with less is when I accidentally press 'v').

Any opinions on switching the default?

Index: main.c
===
RCS file: /cvs/src/usr.bin/less/main.c,v
retrieving revision 1.35
diff -u -p -u -1 -2 -r1.35 main.c
--- main.c  17 Sep 2016 15:06:41 -  1.35
+++ main.c  22 Dec 2017 22:19:04 -
@@ -87,17 +87,17 @@ main(int argc, char *argv[])
 
-   secure = 0;
+   secure = 1;
    s = lgetenv("LESSSECURE");
-   if (s != NULL && *s != '\0')
-       secure = 1;
+   if (s != NULL && strcmp(s, "0") == 0)
+       secure = 0;
 
    if (secure) {
        if (pledge("stdio rpath wpath tty", NULL) == -1) {
            perror("pledge");
            exit(1);
        }
    } else {
        if (pledge("stdio rpath wpath cpath fattr proc exec tty", NULL) == -1) {
            perror("pledge");
            exit(1);
        }
    }
Index: less.1
===
RCS file: /cvs/src/usr.bin/less/less.1,v
retrieving revision 1.52
diff -u -p -r1.52 less.1
--- less.1  24 Oct 2016 13:46:58 -  1.52
+++ less.1  22 Dec 2017 22:17:28 -
@@ -1674,9 +1674,7 @@ differences in invocation syntax, the
 .Ev LESSEDIT
 variable can be changed to modify this default.
 .Sh SECURITY
-When the environment variable
-.Ev LESSSECURE
-is set to 1,
+Normally,
 .Nm
 runs in a "secure" mode.
 This means these features are disabled:
@@ -1698,6 +1696,10 @@ Metacharacters in filenames, such as "*"
 .It " "
 Filename completion (TAB, ^L).
 .El
+.Pp
+To enable these features, set the environment variable
+.Ev LESSSECURE
+to 0.
 .Sh COMPATIBILITY WITH MORE
 If the environment variable
 .Ev LESS_IS_MORE



Re: less(1): `!' command

2017-12-22 Thread Nicholas Marriott
I don't think we should bring ! back.

I wanted to remove v and | (and some other stuff) shortly afterwards, but
several people objected.

I did suggest having a lightweight less in base for most people and adding
the full upstream less to ports for the stuff we don't want to maintain
(like we do for eg libevent) but other people didn't like that idea.



On 17 December 2017 at 15:48, kshe wrote:

> On Sat, 16 Dec 2017 21:52:44 +, Theo de Raadt wrote:
> > > On Sat, 16 Dec 2017 19:39:27 +, Theo de Raadt wrote:
> > > > > On Sat, 16 Dec 2017 18:13:16 +, Jiri B wrote:
> > > > > > On Sat, Dec 16, 2017 at 04:55:44PM +, kshe wrote:
> > > > > > > Hi,
> > > > > > >
> > > > > > > Would a patch to bring back the `!' command to less(1) be accepted?  The
> > > > > > > commit message for its removal explains that ^Z should be used instead,
> > > > > > > but that obviously does not work if less(1) is run from something else
> > > > > > > than an interactive shell, for example when reading manual pages from a
> > > > > > > vi(1) instance spawned directly by `xterm -e vi' in a window manager or
> > > > > > > by `neww vi' in a tmux(1) session.
> > > > > >
> > > > > > Why should less be able to spawn other programs? This would undermine
> > > > > > all pledge work.
> > > > >
> > > > > Because of at least `v' and `|', less(1) already is able to invoke
> > > > > arbitrary programs, and accordingly needs the "proc exec" promise, so
> > > > > bringing `!' back would not change anything from a security perspective
> > > > > (otherwise, I would obviously not have made such a proposition).
> > > > >
> > > > > In fact, technically, what I want to do is still currently possible:
> > > > > from any less(1) instance, one may use `v' to invoke vi(1), and then use
> > > > > vi(1)'s own `!' command as desired.  So the functionality of `!' is
> > > > > still there; it was only made more difficult to reach for no apparent
> > > > > reason.
> > > >
> > > > No apparent reason?
> > > >
> > > > Good you have an opinion.  I have a different opinion: We should look
> > > > for rarely used functionality and gut it.
> > >
> > > I completely agree, and I also completely agree with the rest of what
> > > you said.  However, in this particular case, the functionality of `!' is
> > > still fully (albeit indirectly) accessible, as shown above, and this is
> > > why its deletion, when not immediately followed by that of `|' and `v',
> > > made little sense to me.
> >
> > Oh, so you don't agree.  Or do you.  I can't tell.  You haven't made up
> > your mind enough to have a final position?
>
> In the case of less(1), the underlying functionality of `!' (invoking
> arbitrary programs) has not been removed at all, as `!' itself was only
> one way amongst others of doing that.  Therefore, I would have preferred
> that such an endeavour be conducted in steps at least as large as a
> pledge(2) category.  You may say this is absolutist, but, in the end,
> users might actually be more inclined to accept such removals if they
> come with, and thus are justified by, a real and immediate security
> benefit, like stricter pledge(2) promises, rather than some vague
> theoretical explanation about the global state of their software
> environment.
>
> > [...]
> >
> > > May I go ahead and prepare a patch to remove "proc exec" entirely?
> >
> > Sure you could try, and see who freaks out.  Exactly what the plan was
> > all along.
>
> The minimal diff below does that.  If it is accepted, further cleanups
> would need to follow (in particular, removing a few unused variables and
> functions), and of course the manual would also need some adjustments.
>
> Index: cmd.h
> ===
> RCS file: /cvs/src/usr.bin/less/cmd.h,v
> retrieving revision 1.10
> diff -u -p -r1.10 cmd.h
> --- cmd.h   6 Nov 2015 15:58:01 -   1.10
> +++ cmd.h   17 Dec 2017 12:23:00 -
> @@ -42,12 +42,12 @@
>  #define A_FF_LINE   29
>  #define A_BF_LINE   30
>  #define A_VERSION   31
> -#define A_VISUAL    32
> +/* 32 unused */
>  #define A_F_WINDOW  33
>  #define A_B_WINDOW  34
>  #define A_F_BRACKET 35
>  #define A_B_BRACKET 36
> -#define A_PIPE      37
> +/* 37 unused */
>  #define A_INDEX_FILE    38
>  #define A_UNDO_SEARCH   39
>  #define A_FF_SCREEN 40
> Index: command.c
> ===
> RCS file: /cvs/src/usr.bin/less/command.c,v
> retrieving revision 1.31
> diff -u -p -r1.31 command.c
> --- command.c   12 Jan 2017 20:32:01 -  1.31
> +++ command.c   17 Dec 2017 12:23:00 -
> @@ -241,12 +241,6 @@ exec_mca(void)
> /* If tag structure is loaded then clean it up. */
> cleantag

Re: less(1): `!' command

2017-12-17 Thread kshe
On Sat, 16 Dec 2017 21:52:44 +, Theo de Raadt wrote:
> > On Sat, 16 Dec 2017 19:39:27 +, Theo de Raadt wrote:
> > > > On Sat, 16 Dec 2017 18:13:16 +, Jiri B wrote:
> > > > > On Sat, Dec 16, 2017 at 04:55:44PM +, kshe wrote:
> > > > > > Hi,
> > > > > >
> > > > > > Would a patch to bring back the `!' command to less(1) be accepted?  The
> > > > > > commit message for its removal explains that ^Z should be used instead,
> > > > > > but that obviously does not work if less(1) is run from something else
> > > > > > than an interactive shell, for example when reading manual pages from a
> > > > > > vi(1) instance spawned directly by `xterm -e vi' in a window manager or
> > > > > > by `neww vi' in a tmux(1) session.
> > > > >
> > > > > Why should less be able to spawn other programs? This would undermine
> > > > > all pledge work.
> > > >
> > > > Because of at least `v' and `|', less(1) already is able to invoke
> > > > arbitrary programs, and accordingly needs the "proc exec" promise, so
> > > > bringing `!' back would not change anything from a security perspective
> > > > (otherwise, I would obviously not have made such a proposition).
> > > >
> > > > In fact, technically, what I want to do is still currently possible:
> > > > from any less(1) instance, one may use `v' to invoke vi(1), and then use
> > > > vi(1)'s own `!' command as desired.  So the functionality of `!' is
> > > > still there; it was only made more difficult to reach for no apparent
> > > > reason.
> > >
> > > No apparent reason?
> > >
> > > Good you have an opinion.  I have a different opinion: We should look
> > > for rarely used functionality and gut it.
> >
> > I completely agree, and I also completely agree with the rest of what
> > you said.  However, in this particular case, the functionality of `!' is
> > still fully (albeit indirectly) accessible, as shown above, and this is
> > why its deletion, when not immediately followed by that of `|' and `v',
> > made little sense for me.
>
> Oh, so you don't agree.  Or do you.  I can't tell.  You haven't made up
> your mind enough to have a final position?

In the case of less(1), the underlying functionality of `!' (invoking
arbitrary programs) has not been removed at all, as `!' itself was only
one way amongst others of doing that.  Therefore, I would have preferred
that such an endeavour be conducted in steps at least as large as a
pledge(2) category.  You may say this is absolutist, but, in the end,
users might actually be more inclined to accept such removals if they
come with, and thus are justified by, a real and immediate security
benefit, like stricter pledge(2) promises, rather than some vague
theoretical explanation about the global state of their software
environment.

> [...]
>
> > May I go ahead and prepare a patch to remove "proc exec" entirely?
>
> Sure you could try, and see who freaks out.  Exactly what the plan was
> all along.

The minimal diff below does that.  If it is accepted, further cleanups
would need to follow (in particular, removing a few unused variables and
functions), and of course the manual would also need some adjustments.

Index: cmd.h
===
RCS file: /cvs/src/usr.bin/less/cmd.h,v
retrieving revision 1.10
diff -u -p -r1.10 cmd.h
--- cmd.h   6 Nov 2015 15:58:01 -   1.10
+++ cmd.h   17 Dec 2017 12:23:00 -
@@ -42,12 +42,12 @@
 #define	A_FF_LINE		29
 #define	A_BF_LINE		30
 #define	A_VERSION		31
-#define	A_VISUAL		32
+/* 32 unused */
 #define	A_F_WINDOW		33
 #define	A_B_WINDOW		34
 #define	A_F_BRACKET		35
 #define	A_B_BRACKET		36
-#define	A_PIPE			37
+/* 37 unused */
 #define	A_INDEX_FILE		38
 #define	A_UNDO_SEARCH		39
 #define	A_FF_SCREEN		40
Index: command.c
===
RCS file: /cvs/src/usr.bin/less/command.c,v
retrieving revision 1.31
diff -u -p -r1.31 command.c
--- command.c   12 Jan 2017 20:32:01 -  1.31
+++ command.c   17 Dec 2017 12:23:00 -
@@ -241,12 +241,6 @@ exec_mca(void)
 		/* If tag structure is loaded then clean it up. */
 		cleantags();
 		break;
-	case A_PIPE:
-		if (secure)
-			break;
-		(void) pipe_mark(pipec, cbuf);
-		error("|done", NULL);
-		break;
 	}
 }
 
@@ -1396,35 +1390,6 @@ again:
 			c = getcc();
 			goto again;
 
-		case A_VISUAL:
-			/*
-			 * Invoke an editor on the input file.
-			 */
-			if (secure) {
-

Re: less(1): `!' command

2017-12-16 Thread Theo de Raadt
> On Sat, 16 Dec 2017 19:39:27 +, Theo de Raadt wrote:
> > > On Sat, 16 Dec 2017 18:13:16 +, Jiri B wrote:
> > > > On Sat, Dec 16, 2017 at 04:55:44PM +, kshe wrote:
> > > > > Hi,
> > > > >
> > > > > Would a patch to bring back the `!' command to less(1) be accepted?
> > > > > The commit message for its removal explains that ^Z should be used
> > > > > instead, but that obviously does not work if less(1) is run from
> > > > > something else than an interactive shell, for example when reading
> > > > > manual pages from a vi(1) instance spawned directly by `xterm -e vi'
> > > > > in a window manager or by `neww vi' in a tmux(1) session.
> > > >
> > > > Why should less be able to spawn other programs? This would undermine
> > > > all pledge work.
> > >
> > > Because of at least `v' and `|', less(1) already is able to invoke
> > > arbitrary programs, and accordingly needs the "proc exec" promise, so
> > > bringing `!' back would not change anything from a security perspective
> > > (otherwise, I would obviously not have made such a proposition).
> > >
> > > In fact, technically, what I want to do is still currently possible:
> > > from any less(1) instance, one may use `v' to invoke vi(1), and then use
> > > vi(1)'s own `!' command as desired.  So the functionality of `!' is
> > > still there; it was only made more difficult to reach for no apparent
> > > reason.
> >
> > No apparent reason?
> >
> > Good you have an opinion.  I have a different opinion: We should look
> > for rarely used functionality and gut it.
> 
> I completely agree, and I also completely agree with the rest of what
> you said.  However, in this particular case, the functionality of `!' is
> still fully (albeit indirectly) accessible, as shown above, and this is
> why its deletion, when not immediately followed by that of `|' and `v',
> made little sense for me.

Oh, so you don't agree.  Or do you.  I can't tell.  You haven't made up
your mind enough to have a final position?

> Either the commands that require "proc exec" should all be removed along
> with that promise, or `!' should be brought back without any pledge(2)
> modifications.

That is pretty absolutist.

The universe is not always consistent, and neither is OpenBSD.

The final decisions haven't been made yet, because we haven't gauged
the usage patterns.

> But currently it really feels like a big waste (for both
> parties) to request such high privileges, and then to do almost nothing
> useful with them.

Request?  pledge isn't a "request" system.  It is a second specification
by the program of the maximum it believes it will use, and therefore it
is a hard brake.  At the moment the feature set still needs "proc exec".
So the specification isn't a waste, it is accurate.
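
To make the "specification" point concrete, here is a minimal sketch, not
the actual less(1) code.  Only pledge(2) itself and the promise names
"stdio", "rpath", "tty", "proc" and "exec" are real; the helper names
pager_promises(), has_promise() and restrict_self() and the exact promise
lists are assumptions made up for illustration.  The idea is that the
program states its upper bound once, and that bound can only shrink when
the spawning commands are gone.

```c
#include <string.h>
#ifdef __OpenBSD__
#include <unistd.h>
#endif

/*
 * Hypothetical helper: pick the promise string for a pager.
 * The lists are illustrative only, not the ones less(1) really uses.
 */
static const char *
pager_promises(int can_spawn)
{
	return can_spawn ? "stdio rpath tty proc exec" : "stdio rpath tty";
}

/* Return 1 if the space-separated list contains the promise `name`. */
static int
has_promise(const char *list, const char *name)
{
	size_t n = strlen(name);
	const char *p = list;

	if (n == 0)
		return 0;
	while ((p = strstr(p, name)) != NULL) {
		int start_ok = (p == list || p[-1] == ' ');
		int end_ok = (p[n] == '\0' || p[n] == ' ');

		if (start_ok && end_ok)
			return 1;
		p += n;
	}
	return 0;
}

/*
 * Declare the upper bound early in the program.  Returns 0 on success
 * and -1 on error.  Where pledge(2) does not exist this is a no-op, so
 * the sketch still builds on other systems.
 */
static int
restrict_self(int can_spawn)
{
#ifdef __OpenBSD__
	return pledge(pager_promises(can_spawn), NULL);
#else
	(void)can_spawn;
	return 0;
#endif
}
```

A program would call restrict_self() once near the start of main(); as
long as any command that forks or execs remains, the list has to keep
"proc exec".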

> If the plan really was to get rid of all such commands eventually, what
> exactly is preventing that from happening now?  

The plan was to get rid of ! in a few commands, then later get rid of
a few more of them, and see where we end up.  With such plans, we
don't always do everything in one step, because then it is too easy to
get embroiled in just that one battle and forget about the other things
which also need doing.  Also it is impossible to ask the community
because petty fights result and provide inaccurate usage assessments.

There are many other things to do.  As a result, our universe is not
always consistent.  This is an example.

> May I go ahead and prepare a patch to remove "proc exec" entirely?

Sure you could try, and see who freaks out.  Exactly what the plan was
all along.



Re: less(1): `!' command

2017-12-16 Thread kshe
On Sat, 16 Dec 2017 19:39:27 +, Theo de Raadt wrote:
> > On Sat, 16 Dec 2017 18:13:16 +, Jiri B wrote:
> > > On Sat, Dec 16, 2017 at 04:55:44PM +, kshe wrote:
> > > > Hi,
> > > >
> > > > Would a patch to bring back the `!' command to less(1) be accepted?  The
> > > > commit message for its removal explains that ^Z should be used instead,
> > > > but that obviously does not work if less(1) is run from something else
> > > > than an interactive shell, for example when reading manual pages from a
> > > > vi(1) instance spawned directly by `xterm -e vi' in a window manager or
> > > > by `neww vi' in a tmux(1) session.
> > >
> > > Why should less be able to spawn other programs? This would undermine
> > > all pledge work.
> >
> > Because of at least `v' and `|', less(1) already is able to invoke
> > arbitrary programs, and accordingly needs the "proc exec" promise, so
> > bringing `!' back would not change anything from a security perspective
> > (otherwise, I would obviously not have made such a proposition).
> >
> > In fact, technically, what I want to do is still currently possible:
> > from any less(1) instance, one may use `v' to invoke vi(1), and then use
> > vi(1)'s own `!' command as desired.  So the functionality of `!' is
> > still there; it was only made more difficult to reach for no apparent
> > reason.
>
> No apparent reason?
>
> Good you have an opinion.  I have a different opinion: We should look
> for rarely used functionality and gut it.

I completely agree, and I also completely agree with the rest of what
you said.  However, in this particular case, the functionality of `!' is
still fully (albeit indirectly) accessible, as shown above, and this is
why its deletion, when not immediately followed by that of `|' and `v',
made little sense for me.

Either the commands that require "proc exec" should all be removed along
with that promise, or `!' should be brought back without any pledge(2)
modifications.  But currently it really feels like a big waste (for both
parties) to request such high privileges, and then to do almost nothing
useful with them.

If the plan really was to get rid of all such commands eventually, what
exactly is preventing that from happening now?  May I go ahead and
prepare a patch to remove "proc exec" entirely?

Regards,

kshe



Re: less(1): `!' command

2017-12-16 Thread Theo de Raadt
> > Would a patch to bring back the `!' command to less(1) be accepted?  The
> > commit message for its removal explains that ^Z should be used instead,
> > but that obviously does not work if less(1) is run from something else
> > than an interactive shell, for example when reading manual pages from a
> > vi(1) instance spawned directly by `xterm -e vi' in a window manager or
> > by `neww vi' in a tmux(1) session.
> 
> Why should less be able to spawn other programs? This would undermine
> all pledge work.

It does not undermine any pledge work at all.

The strategy is reduction of "many programs have ways to break out to
full system call operation, but why?"

Fixing all of these concerns won't happen in a day.  We are boiling this
frog slowly.



Re: less(1): `!' command

2017-12-16 Thread Theo de Raadt
> On Sat, 16 Dec 2017 18:13:16 +, Jiri B wrote:
> > On Sat, Dec 16, 2017 at 04:55:44PM +, kshe wrote:
> > > Hi,
> > >
> > > Would a patch to bring back the `!' command to less(1) be accepted?  The
> > > commit message for its removal explains that ^Z should be used instead,
> > > but that obviously does not work if less(1) is run from something else
> > > than an interactive shell, for example when reading manual pages from a
> > > vi(1) instance spawned directly by `xterm -e vi' in a window manager or
> > > by `neww vi' in a tmux(1) session.
> >
> > Why should less be able to spawn other programs? This would undermine
> > all pledge work.
> 
> Because of at least `v' and `|', less(1) already is able to invoke
> arbitrary programs, and accordingly needs the "proc exec" promise, so
> bringing `!' back would not change anything from a security perspective
> (otherwise, I would obviously not have made such a proposition).
> 
> In fact, technically, what I want to do is still currently possible:
> from any less(1) instance, one may use `v' to invoke vi(1), and then use
> vi(1)'s own `!' command as desired.  So the functionality of `!' is
> still there; it was only made more difficult to reach for no apparent
> reason.

No apparent reason?

Good you have an opinion.  I have a different opinion: We should look
for rarely used functionality and gut it.  Over the last 40 years
people have felt a desire to add all possible features and options to
all commands, and no one ever considered the impact of having all
programs able to reach all system calls, and that these features are
being installed in all program operating environments.  Then someone
adds less(1) to a script which requires security, and just like that
it has none.

The entire environment is poisoned, and people are pushed to jump to
other environments which aren't poisoned in this way, until enough
people arrive there, the feature explosion happens there also
resulting in "reach all the system calls", and we're stuck in the same
rut again.

I don't think all programs should be able to run all other programs.

As a result I support the idea of trying to find the things people
don't actually use, and removing them incrementally.  '|' should be on
the list next.

But you don't.  Luckily you have other choices.

Are you prepared to die on this hill that less must support '!'?  If
so, there's that FreeBSD hill over there..



Re: less(1): `!' command

2017-12-16 Thread kshe
On Sat, 16 Dec 2017 18:13:16 +, Jiri B wrote:
> On Sat, Dec 16, 2017 at 04:55:44PM +, kshe wrote:
> > Hi,
> >
> > Would a patch to bring back the `!' command to less(1) be accepted?  The
> > commit message for its removal explains that ^Z should be used instead,
> > but that obviously does not work if less(1) is run from something else
> > than an interactive shell, for example when reading manual pages from a
> > vi(1) instance spawned directly by `xterm -e vi' in a window manager or
> > by `neww vi' in a tmux(1) session.
>
> Why should less be able to spawn other programs? This would undermine
> all pledge work.

Because of at least `v' and `|', less(1) already is able to invoke
arbitrary programs, and accordingly needs the "proc exec" promise, so
bringing `!' back would not change anything from a security perspective
(otherwise, I would obviously not have made such a proposition).

In fact, technically, what I want to do is still currently possible:
from any less(1) instance, one may use `v' to invoke vi(1), and then use
vi(1)'s own `!' command as desired.  So the functionality of `!' is
still there; it was only made more difficult to reach for no apparent
reason.
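
As a rough illustration of why commands like `v' and `|' tie a program to
"proc exec", the sketch below shows the usual fork-and-exec pattern for
running an external program on a file.  It is not taken from less(1):
spawn_and_wait() is a hypothetical helper, and the comments about which
promise covers which call follow my reading of pledge(2).

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <errno.h>
#include <unistd.h>

/*
 * Hypothetical helper: run `prog file' and wait for it to finish.
 * Returns the child's exit status, 127 if the exec failed, or -1 if
 * the child could not be created or did not exit normally.
 */
static int
spawn_and_wait(const char *prog, const char *file)
{
	pid_t pid;
	int status;

	pid = fork();			/* covered by "proc" */
	if (pid == -1)
		return -1;
	if (pid == 0) {
		/* Child: replace ourselves with the requested program. */
		execlp(prog, prog, file, (char *)NULL);	/* covered by "exec" */
		_exit(127);		/* only reached if exec failed */
	}
	/* Parent: wait for the child, retrying if a signal interrupts us. */
	while (waitpid(pid, &status, 0) == -1) {
		if (errno != EINTR)
			return -1;
	}
	return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Any command built on this pattern can run an arbitrary program, which is
why removing only one such command leaves the required promises unchanged.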

Regards,

kshe



Re: less(1): `!' command

2017-12-16 Thread Jiri B
On Sat, Dec 16, 2017 at 04:55:44PM +, kshe wrote:
> Hi,
> 
> Would a patch to bring back the `!' command to less(1) be accepted?  The
> commit message for its removal explains that ^Z should be used instead,
> but that obviously does not work if less(1) is run from something else
> than an interactive shell, for example when reading manual pages from a
> vi(1) instance spawned directly by `xterm -e vi' in a window manager or
> by `neww vi' in a tmux(1) session.

Why should less be able to spawn other programs? This would undermine
all pledge work.

IIUC your vi scenario, you are not spawning 'vi' from less but the other
way around. That should work.

j.



less(1): `!' command

2017-12-16 Thread kshe
Hi,

Would a patch to bring back the `!' command to less(1) be accepted?  The
commit message for its removal explains that ^Z should be used instead,
but that obviously does not work if less(1) is run from something else
than an interactive shell, for example when reading manual pages from a
vi(1) instance spawned directly by `xterm -e vi' in a window manager or
by `neww vi' in a tmux(1) session.

If not, then at least documentation for this command should be removed
properly (I cannot provide a diff as this file contains raw backspace
characters):

$ cd /usr/src/usr.bin/less/
$ printf '99d\nwq\n' | ed - less.hlp

Regards,

kshe