[dev] [vis] dw near the end of the line
Recently, the vim mailing list discussed whether vim behaves correctly when 2dw is used on the first of three one-word lines (it turns out that the behavior is not correct according to POSIX, but is shared with traditional vi). At that time, I wasn't able to build vis to see what it does. I've since figured out my build problem and tested vis's behavior in this situation. When you delete the last word of a line in vis with the dw command, it always deletes the newline and all following spaces and newlines (i.e. it places the content of the next non-blank line on the current one). This behavior differs from most other vi clones, matching only elvis-tiny. Is this behavior intended?
[dev] Re: st: keys and infocmp
Greg Reagle writes: > Hello. If there are any man pages or articles or FAQs about this topic > that would be good to read, please refer to them. > > Running Xubuntu 12.04 and the latest st on a ThinkPad laptop, these are > the results I get, correlated with the results of infocmp. I got the > output from the keys by running cat and hitting the keys. > > st, TERM is st-256color > > | home | end | insert | delete | up | down | left | right | > | ^[[H | ^[[4~ | ^[[4h | ^[[P | ^[[A | ^[[B | ^[[D | ^[[C | > | home | kc1, kend | smir | dch1 | cuu1 | | | cuf1 | > > Why do the escape sequences produced by down and left arrow keys have no > match in infocmp? Why does home key not produce khome (\E[1~) escape > sequence? The character sequences in the terminfo entry are meant to match those which are sent when keypad mode (tput smkx) is enabled. Try "tput smkx; cat". I note that you called out down and left; I suspect this is because you've incorrectly matched up and right against cuu1 and cuf1, which are control sequences (sent to the terminal), not input sequences (sent by the keys). You should be looking exclusively at terminfo strings whose name begins with "k" (khome, kend, kich1, kdch1, kcuu1, kcud1, kcub1, kcuf1).
[dev] Re: Bug in join.c
Mattias Andrée writes: > I think this patch should be included. But I don't see > how it is of substance. It will never occur with two's > complement or ones' complement. Only, signed magnitude > representation. Any sensible C compiler for POSIX > systems will only use two's complement; otherwise > int[0-9]*_t cannot be implemented. I had assumed that comparing an unsigned value with a negative number produced an unconditionally false result, rather than converting one operand to the type of the other. Maybe that's because I've gotten too used to non-C languages that don't have fixed-size integers. Sorry for the confusion.
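To make the conversion concrete, here is a minimal sketch (the function name is made up for illustration and is not from join.c). When a size_t is compared with -1, the -1 is converted to size_t first, so the comparison is against SIZE_MAX rather than being unconditionally false.

```c
#include <stddef.h>

/* Hypothetical helper: mimics a check where the getline() result was
 * stored in a size_t and then compared with -1. The -1 is converted to
 * size_t (which yields SIZE_MAX) before the comparison happens; the
 * explicit cast below only spells out what `len == -1` does implicitly. */
int looks_like_error(size_t len)
{
	return len == (size_t)-1;
}
```

So looks_like_error(SIZE_MAX) is true, which is why the original check happened to work even though the type was wrong.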
[dev] Bug in join.c
I was going through sbase checking with -Wall -Wextra -pedantic -Werror, and among a bunch of noise errors relating to signed/unsigned comparisons, I found one with actual substance: the result of getline is being converted to size_t before comparing to -1 to check for error. diff --git a/join.c b/join.c index 1a08927..6828cf4 100644 --- a/join.c +++ b/join.c @@ -261,7 +261,8 @@ static int addtospan(struct span *sp, FILE *fp, int reset) { char *newl = NULL; - size_t len, size = 0; + ssize_t len; + size_t size = 0; if ((len = getline(&newl, &size, fp)) == -1) { if (ferror(fp)) I also couldn't quite figure out if this line of tail.c is correct or not. n = MIN(llabs(estrtonum(numstr, LLONG_MIN + 1, MIN(LLONG_MAX, SIZE_MAX))), SIZE_MAX);
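For reference, the pattern the join.c patch restores can be sketched like this. It is only an illustration of keeping the getline() result in an ssize_t; count_lines is a hypothetical name and does not exist in sbase.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>

/* Hypothetical example: counts the lines in fp, returning -1 on a read
 * error. The getline() result is kept in an ssize_t so that the -1
 * end-of-input/error value is compared in its own signed type. */
long count_lines(FILE *fp)
{
	char *line = NULL;
	size_t size = 0;   /* buffer capacity, managed by getline() */
	ssize_t len;       /* line length, or -1 */
	long n = 0;

	while ((len = getline(&line, &size, fp)) != -1)
		n++;
	free(line);
	return ferror(fp) ? -1 : n;
}
```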
[dev] Re: [sbase] Portability
Dimitris Papastamos writes: > sbase should only contain code that runs on POSIX systems (with some > minor exceptions) and fallback implementations for non-standardized > interfaces that can be implemented portably on top of POSIX interfaces. So there's no place for fallback implementations _of_ POSIX interfaces on top of either older POSIX interfaces or non-standard ones? Anyway, here's a patch for some data type issues that came up - more to do with compiling with all warnings, though the fact that clock_t is unsigned on OSX helped catch one of them. diff --git a/du.c b/du.c index 41e4380..3dc3545 100644 --- a/du.c +++ b/du.c @@ -25,7 +25,7 @@ printpath(off_t n, const char *path) if (hflag) printf("%s\t%s\n", humansize(n * blksize), path); else - printf("%ju\t%s\n", n, path); + printf("%jd\t%s\n", (intmax_t)n, path); } static off_t diff --git a/split.c b/split.c index f15e925..ee24556 100644 --- a/split.c +++ b/split.c @@ -48,7 +48,7 @@ int main(int argc, char *argv[]) { FILE *in = stdin, *out = NULL; - size_t size = 1000, n; + off_t size = 1000, n; int ret = 0, ch, plen, slen = 2, always = 0; char name[NAME_MAX + 1], *prefix = "x", *file = NULL; @@ -69,7 +69,7 @@ main(int argc, char *argv[]) break; case 'l': always = 0; - size = estrtonum(EARGF(usage()), 1, MIN(LLONG_MAX, SIZE_MAX)); + size = estrtonum(EARGF(usage()), 1, MIN(LLONG_MAX, OFF_MAX)); break; default: usage(); diff --git a/time.c b/time.c index 4af0352..60a8c8d 100644 --- a/time.c +++ b/time.c @@ -36,7 +36,7 @@ main(int argc, char *argv[]) if ((ticks = sysconf(_SC_CLK_TCK)) <= 0) eprintf("sysconf _SC_CLK_TCK:"); - if ((r0 = times(&tms)) < 0) + if ((r0 = times(&tms)) == (clock_t)-1) eprintf("times:"); switch ((pid = fork())) { @@ -52,7 +52,7 @@ main(int argc, char *argv[]) } waitpid(pid, &status, 0); - if ((r1 = times(&tms)) < 0) + if ((r1 = times(&tms)) == (clock_t)-1) eprintf("times:"); if (WIFSIGNALED(status)) {
[dev] [sbase] Portability
I downloaded and built sbase for my OSX system to test the cal program, and noticed (and fixed locally) several issues. Before posting any patches, I wanted to ask - philosophically speaking, how much effort should sbase put towards supporting systems that don't support the latest-and-greatest POSIX functions? Three functions were missing (utimensat, clock_gettime, and fmemopen), and fmemopen in particular required an extensive implementation, which I found online (https://github.com/NimbusKit/memorymapping) rather than writing myself. Also, if these are added should they go in libutil or a new "libcompat"?
[dev] Re: [farbfeld] announce
FRIGN writes: > I guess a better way to do that would be to use greyscale-farbfeld > files There doesn't appear to be such a thing, unless you mean just have R=G=B and A=65535. Which, to me, seems to suck about as much as using ASCII for a header that can be parsed with fscanf. I think it'd be more elegant to _only_ have a "grayscale" format, and store RGBA images as a quartet of these.
[dev] Re: [farbfeld] announce
Andrew Gwozdziewycz writes: > Well, for one, it's a binary encoding, not ASCII. I'm not sure why that makes it better, unless you meant for space consumption (which I suppose is somehow very important for uncompressed raster image formats) in which case you're ignoring the fact that PPM has a format where only the header is ASCII.
[dev] Re: [farbfeld] announce
FRIGN writes: > Hello fellow hackers, > > I'm very glad to announce farbfeld to the public, a lossless image > format as a successor to "imagefile" with a better name and some > format-changes reflecting experiences I made since imagefile has > been released. (snip description of format) How is this better than PPM?
[dev] Re: a suckless hex editor
Greg Reagle writes: > I agree that it is a "poor man's" hex editor. I am having fun with it, even > if > it is a toy. I don't have the desire to write a sophisticated hex editor > (besides they already exist). > > I like that the small shell script can turn any editor into a hex editor. > BTW, > if od is replaced with hexdump -C or xxd or GNU od -tx1z, then the ascii will > be in the dump too. It being in the dump isn't really "enough" - in a real hex editor, you can make changes on the ASCII side and expect them to be reflected in the hex side (and ultimately the binary file), whereas using xxd [etc] means the ASCII side is static and is ignored when read back in. This does have its place, though... It's basically an editor-portable version of the recipe that vim provides for using xxd to "edit" binary files. Which is itself a compelling enough use case for xxd to be included with vim in the first place (as far as I know xxd has no other vim-related purpose). But it's not a hex editor.
Re: [dev] st: selecting text affects both primary and clipboard
On Fri, Feb 20, 2015, at 12:38, sta...@cs.tu-berlin.de wrote: > * k...@shike2.com 2015-02-20 17:39 > > I agree here, it shouldn't modify the CLIPBOARD selection. Sometimes > > it is good to have different things in both selections. If nobody complains > > about it I will apply your patch. > > I'd leave it as is, in order not to break scripts which expect to read > something from CLIPBOARD. > > In other programs, you might have the choice to send something to > selection or CLIPBOARD by different means. In st, however, you don't. > Thus, the current behaviour seems to me more consistent and intuitive. Another reason to leave it as-is: in other applications it is reasonable to select text for some purpose other than copying it (e.g. to delete it or replace it), and people will not want their clipboard obliterated in that case. In a terminal emulator, however, the only thing you can do with selected text is copy it. PuTTY on MS Windows puts selected text immediately in the clipboard and apparently no-one has ever objected to this behavior - if anyone had I'm sure they would have added it to the dozens of configurable options it already has.
Re: [dev] surf questions
On Thu, Jan 22, 2015, at 16:47, Raphaël Proust wrote: > When you have a vertical line in a text, indicating where the > character you type will appear: it's called a caret. Or, more relevantly to a (mostly) read-only application like a web browser, to enable you to precisely position it with the keyboard to begin a selection for copying. The arrow keys move the caret rather than scrolling the page.
Re: [dev] [sbase] [PATCH-UPDATE] Rewrite tr(1) in a sane way
On Sat, Jan 10, 2015, at 19:11, Ian D. Scott wrote: > On Sat, Jan 10, 2015 at 06:56:45PM -0500, random...@fastmail.us wrote: > Actually, ẞ, capital of ß, was added in Unicode 5.1. There are probably > other letters with this issue, however. My main point was that you have to be careful that the two classes list their members in matching order, so that each character lines up with its counterpart; nothing otherwise guarantees that. The main problem with a naive interpretation is that ß puts everything after it off by one.
Re: [dev] [sbase] [PATCH-UPDATE] Rewrite tr(1) in a sane way
On Sat, Jan 10, 2015, at 16:47, Markus Wichmann wrote: > You wanted to be Unicode compatible, right? Because in that case I > expect [:alpha:] to be the class of all characters in General Category L > (that is, Lu, Ll, Lt, Lm, or Lo). That includes a few more characters > than just A-Z and a-z. And I don't see you add any other character to > that class later. Note that translating between [:upper:] and [:lower:] requires using the toupper and tolower mapping, rather than just dumping the character classes (since otherwise you'll run into there being something like ß that is in [:lower:] and has no counterpart in [:upper:], or they're in a different order)
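A minimal sketch of what I mean by using the mapping rather than the class order. This is a byte-at-a-time version using toupper() in the current locale, so it only illustrates the idea; a Unicode-aware tr would presumably do the same thing with towupper() from <wctype.h> on decoded characters. The function name is hypothetical.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: upper-cases buf in place by asking toupper() for
 * each character's counterpart, instead of pairing the n-th member of
 * [:lower:] with the n-th member of [:upper:]. A character without an
 * uppercase counterpart is simply left unchanged. */
void map_lower_to_upper(char *buf, size_t len)
{
	size_t i;

	for (i = 0; i < len; i++)
		buf[i] = (char)toupper((unsigned char)buf[i]);
}
```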
Re: [dev] [sbase] [PATCH] Rewrite tr(1) in a sane way
On Fri, Jan 9, 2015, at 18:39, FRIGN wrote: > C3B6 is 'ö' and makes sense to allow specifying it as \50102 (in the pure > UTF-8-sense of course, nothing to do with collating). Why would someone want to use the decimal value of the UTF-8 bytes, rather than the Unicode code point? Why are you using decimal for a syntax that _universally_ means octal? UTF-8 is an encoding of Unicode. No-one actually thinks of the character as being "C3B6" - it's 00F6, even if it happens to be encoded as C3 B6 or F6 00 or whatever. Nobody thinks of UTF-8 sequences as a single integer unit. The sensible thing to do would be to extend the syntax with \u00F6 (and \U0001 for non-BMP characters), the way many other languages have done it. This also avoids repeating the mistake of variable-length escapes - \u is exactly 4 digits, and \U is exactly 8. > Well, probably I misunderstood the matter. Sometimes this stuff gets > above my head. ;) > At the end of the day, you want software to work as expected: > > GNU tr: > $ echo ελληνική | tr [α-ω] [Α-Ω] > ® > > our tr: > $ echo ελληνικη | ./tr [α-ω] [Α-Ω] > ΕΛΛΗΝΙΚΗ And that's fine. Actually, I think POSIX _requires_ it to work the way yours does, and GNU fails to comply. As a data point, OSX and FreeBSD both work the same way as sbase for this test case. GNU actually has a history of being behind the curve on UTF-8/multibyte characters, so it's not a great example of "what POSIX requires". Cut is another notable command with the same problem.
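A rough sketch of the fixed-length escape idea, in case it helps. None of this is from the tr patch; parse_u_escape and utf8_encode are hypothetical names, and the encoder does not reject surrogates.

```c
#include <stddef.h>

/* Value of one hex digit, or -1 if c is not a hex digit. */
static int hexval(int c)
{
	if (c >= '0' && c <= '9') return c - '0';
	if (c >= 'a' && c <= 'f') return c - 'a' + 10;
	if (c >= 'A' && c <= 'F') return c - 'A' + 10;
	return -1;
}

/* Parses a backslash-u escape with exactly 4 hex digits, or a
 * backslash-U escape with exactly 8. Stores the code point in *cp and
 * returns the number of characters consumed, or 0 if s does not start
 * with a complete escape. */
size_t parse_u_escape(const char *s, unsigned long *cp)
{
	size_t i, ndigits;
	unsigned long v = 0;

	if (s[0] != '\\' || (s[1] != 'u' && s[1] != 'U'))
		return 0;
	ndigits = (s[1] == 'u') ? 4 : 8;
	for (i = 0; i < ndigits; i++) {
		int d = hexval((unsigned char)s[2 + i]);
		if (d < 0)   /* also stops at the terminating NUL */
			return 0;
		v = (v << 4) | (unsigned long)d;
	}
	*cp = v;
	return 2 + ndigits;
}

/* Encodes cp as UTF-8 into out; returns the byte count, or 0 if cp is
 * above U+10FFFF. */
size_t utf8_encode(unsigned long cp, unsigned char out[4])
{
	if (cp < 0x80) {
		out[0] = (unsigned char)cp;
		return 1;
	}
	if (cp < 0x800) {
		out[0] = (unsigned char)(0xC0 | (cp >> 6));
		out[1] = (unsigned char)(0x80 | (cp & 0x3F));
		return 2;
	}
	if (cp < 0x10000) {
		out[0] = (unsigned char)(0xE0 | (cp >> 12));
		out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
		out[2] = (unsigned char)(0x80 | (cp & 0x3F));
		return 3;
	}
	if (cp < 0x110000) {
		out[0] = (unsigned char)(0xF0 | (cp >> 18));
		out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
		out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
		out[3] = (unsigned char)(0x80 | (cp & 0x3F));
		return 4;
	}
	return 0;
}
```

With this, the escape for U+00F6 parses to the code point 0xF6, which encodes to the two bytes C3 B6; the user writes the code point and never has to think about the encoded bytes.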
Re: [dev] [sbase] [PATCH] Rewrite tr(1) in a sane way
On Fri, Jan 9, 2015, at 18:08, FRIGN wrote: > > This is madness. If you want the bytes to be collated, I don't see where you're getting that either of us want the bytes to be collated. I don't even know what you mean by "collated", since collating is not what tr does, except when ordering ranges. > you just write the > literal \50102. Even if octal values could be more than three digits, I have no idea what you think 50102 is. Its decimal value is 20546. Its hex value is 0x5042. I have no idea what it has to do with character U+00F6 whose UTF-8 representation is 0xC3 0xB6. I just realized what you're doing, 0xC3B6 has the _decimal_ value 50102, I have no idea why you would think _that_ is a representation people would want to use. If you're so pro-unicode, make it accept \u00F6 - that's a valid extension. But reusing the syntax POSIX uses for three-digit octal literals, for arbitrarily long decimal literals that aren't even unicode code points, makes no sense at all. In what universe is that intuitive? > POSIX often is a solution to a problem that doesn't exist > in the first place when you just use UTF-8. > > > They have nothing to do with UTF-8. > > That's exactly the point. Collating elements are depending on the current > locale which is too much of a mess to deal with. Huh? > So when the Spanish "ll" collates before "m" and after "l" in a given > locale, we don't give a fuck. > So please give me the point why you are torturing me with this > information. Because collating elements are the thing POSIX forbids which you appear to have _misinterpreted_ as forbidding multibyte characters. Otherwise I have _no idea_ what in POSIX you interpret as preventing reasonable behavior with UTF-8 multibyte characters. > I stated that I did not implement collating elements into this tr(1) at > the beginning and that it's a POSIX-nightmare to do so, bringing harm > to anybody who is interested in a consistent, usable tool. 
tl;dr: Collating elements = POSIX forbids them = You don't want them anyway. Multibyte characters = POSIX allows/requires them = You like them too. What is the problem? I don't know what you want to do that you think POSIX doesn't allow.
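The arithmetic behind the 50102 confusion can be checked with a couple of strtol() calls. This is just an illustration; the helper names are made up.

```c
#include <stdlib.h>

/* Reads a digit string the way a POSIX-style backslash escape would
 * (octal). */
long as_octal(const char *digits)
{
	return strtol(digits, NULL, 8);
}

/* Reads the same digit string as a decimal number. */
long as_decimal(const char *digits)
{
	return strtol(digits, NULL, 10);
}
```

as_octal("50102") is 20546 (0x5042), while as_decimal("50102") is 0xC3B6, i.e. the two UTF-8 bytes of U+00F6 glued together into one integer.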
Re: [dev] [sbase] [PATCH] Rewrite tr(1) in a sane way
On Fri, Jan 9, 2015, at 17:48, FRIGN wrote: > Did you read what I said? I explicitly went away from POSIX in this > regard, > because no human would write ""tr '\303\266o' 'o\303\266'". POSIX doesn't require people to write it, it just requires that it works. POSIX has no problem with also allowing a literally typed multibyte character to refer to itself. It's basically saying that if someone _does_ write '\303\266o' 'o\303\266', you have to treat it the same as öo oö, and not as the individual bytes. > The reason why POSIX prohibits collating elements is only because they > are > inhibited by their own overload of different character sets and locales. > Given assuming a UTF-8-locale is a very sane way to go (see Plan 9), this > limit can easily be thrown off and makes life easier. I don't think you're understanding the difference between multi-character collating elements and multibyte characters. Multi-character collating elements are things like "ch" in some Spanish locales. They have nothing to do with UTF-8.
Re: [dev] [sbase] [PATCH] Rewrite tr(1) in a sane way
On Fri, Jan 9, 2015, at 16:44, Nick wrote: > Quoth FRIGN: > > - UTF-8: not allowed in POSIX, but in my opinion a must. This > > finally allows you to work with UTF-8 streams without > > problems or unexpected behaviour. > > I fully agree (unsurprisingly). Anything that relies on the POSIX > behaviour to do weird things involving multibyte characters is > insane. Er... http://pubs.opengroup.org/onlinepubs/009696899/utilities/tr.html has very little mention of the issue one way or another, but does use the term "characters" rather than "bytes" in all relevant places, and talks about "multi-byte characters" in a tone that suggests they should be supported properly when LC_CTYPE has them. The only _questionable_ bits are some of the language surrounding the use of octal sequences: For single characters: "Multi-byte characters require multiple, concatenated escape sequences of this type, including the leading '\' for each byte." I read this as meaning that multi-byte characters are supported, and in fact that "tr '\303\266o' 'o\303\266'" means that \303\266 [two escape sequences representing one multi-byte character] and o will be swapped - and that it is not possible to specify multibyte characters with octal values in a dash-separated range specification (but they can be included as literals). Or, is it possible that FRIGN misinterpreted the prohibition on "multi-character collating elements"?
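Here is how I read that requirement, as a sketch: expand the octal escapes to bytes first, and only then split the result into characters. The function names are hypothetical, and the character counter assumes well-formed UTF-8 instead of going through the locale (a real implementation would presumably use mbrtowc()).

```c
#include <stddef.h>

/* Expands backslash-octal escapes (1 to 3 digits) in src into raw bytes
 * in dst; everything else is copied as-is. Returns the number of bytes
 * written. dst must have room for at least strlen(src) bytes. */
size_t expand_octal(const char *src, unsigned char *dst)
{
	size_t n = 0;

	while (*src) {
		if (src[0] == '\\' && src[1] >= '0' && src[1] <= '7') {
			unsigned v = 0;
			int digits = 0;

			src++;   /* skip the backslash */
			while (digits < 3 && *src >= '0' && *src <= '7') {
				v = v * 8 + (unsigned)(*src - '0');
				src++;
				digits++;
			}
			dst[n++] = (unsigned char)v;
		} else {
			dst[n++] = (unsigned char)*src++;
		}
	}
	return n;
}

/* Counts characters in well-formed UTF-8 by skipping continuation
 * bytes (those of the form 10xxxxxx). */
size_t utf8_count(const unsigned char *s, size_t len)
{
	size_t i, n = 0;

	for (i = 0; i < len; i++)
		if ((s[i] & 0xC0) != 0x80)
			n++;
	return n;
}
```

The operand '\303\266o' expands to three bytes (C3 B6 6F) but counts as two characters, which is the "two escape sequences, one character" reading.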
Re: [dev] problem report for sbase/cal
On Mon, Dec 15, 2014, at 11:47, Greg Reagle wrote: > January 2015 is supposed to start on a Thursday. January 2014 started on a Wednesday - maybe it's worth investigating whether cal -3 that spans two years isn't using the correct year for some of the months.
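To illustrate the kind of bug I have in mind, here is a sketch of shifting a month with the year carried along, plus a weekday function to check the two dates mentioned. The names are hypothetical and this is not the sbase cal code; the weekday formula is the well-known Sakamoto method.

```c
/* Day of the week for a Gregorian date, 0 = Sunday (Sakamoto's method).
 * m is 1..12, d is 1..31. */
int weekday(int y, int m, int d)
{
	static const int t[] = {0, 3, 2, 5, 0, 3, 5, 1, 4, 6, 2, 4};

	if (m < 3)
		y--;
	return (y + y / 4 - y / 100 + y / 400 + t[m - 1] + d) % 7;
}

/* Moves (*year, *month) by delta months, carrying into the year.
 * month is 1..12; assumes the resulting year stays positive. */
void shift_month(int *year, int *month, int delta)
{
	int idx = *year * 12 + (*month - 1) + delta;

	*year = idx / 12;
	*month = idx % 12 + 1;
}
```

The month before January 2015 has to come out as December 2014, not December 2015, and weekday(2015, 1, 1) is 4 (Thursday) while weekday(2014, 1, 1) is 3 (Wednesday). A January that starts on Wednesday is what you would get if the old year were reused.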
Re: [dev] Object-Oriented C for interface safety?
On Thu, Nov 27, 2014, at 07:27, koneu wrote: > Greetings. > > The two things that really make OO languages worthwhile in my opinion > are polymorphism and inheritance. Doing polymorphism and data/code > hiding in C is easy enough with const function pointers. You can just > define public interfaces in their own header like > > struct interface { > void * const this; > int (* const get_foo)(void *this); > void (* const set_foo)(void *this, int foo); > char * (* const get_bar)(void *this); > void (* const set_bar)(void *this, char *bar); > }; > > and implement them in "classes" like > > struct class { > int foo; > char *bar; > }; In general when this is done in real life, you do it the other way around, so you only need one copy of the interface structure per class.
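Roughly what I mean by "the other way around", as a sketch with made-up names: the function pointers live in one const table per class, and each object only carries a pointer to that table.

```c
/* Interface: one table of function pointers, shared by every object of
 * a class (all names here are hypothetical). */
struct foo_ops {
	int  (*get_foo)(const void *self);
	void (*set_foo)(void *self, int foo);
};

/* First "class": stores the value as given. */
struct plain {
	const struct foo_ops *ops;   /* must be the first member */
	int foo;
};

static int plain_get(const void *self)
{
	return ((const struct plain *)self)->foo;
}

static void plain_set(void *self, int foo)
{
	((struct plain *)self)->foo = foo;
}

static const struct foo_ops plain_ops = { plain_get, plain_set };

void plain_init(struct plain *p)
{
	p->ops = &plain_ops;
	p->foo = 0;
}

/* Second "class" with the same interface: reports twice what was set. */
struct doubled {
	const struct foo_ops *ops;   /* must be the first member */
	int foo;
};

static int doubled_get(const void *self)
{
	return 2 * ((const struct doubled *)self)->foo;
}

static void doubled_set(void *self, int foo)
{
	((struct doubled *)self)->foo = foo;
}

static const struct foo_ops doubled_ops = { doubled_get, doubled_set };

void doubled_init(struct doubled *d)
{
	d->ops = &doubled_ops;
	d->foo = 0;
}

/* Generic code only relies on the ops pointer at the start of the
 * object, so it works for either class. */
int get_foo(const void *obj)
{
	const struct foo_ops *const *ops = obj;

	return (*ops)->get_foo(obj);
}

void set_foo(void *obj, int foo)
{
	const struct foo_ops *const *ops = obj;

	(*ops)->set_foo(obj, foo);
}
```

Each object costs one extra pointer regardless of how many methods the interface has, instead of a full set of function pointers per instance.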
Re: [dev] why avoid install?
On Thu, Nov 20, 2014, at 14:40, Markus Wichmann wrote: > Not always. One thing that reliably gets on people's nerves here is > shared libraries. And those aren't protected with that ETXTBSY thing. > > The reason is that the MAP_DENYWRITE flag became the irrecoverable > source of a DoS attack and had to be removed from the syscall. It can > still be used in the kernel, which is why overwriting a running binary > will fail, but it can't be used in userspace (or rather, is ignored), Why not give ld-linux.so a capability that allows it? Wait, no, that wouldn't solve it for dlopen(). Why not allow it for files that have execute permission? What are the details of the DoS attack?
Re: [dev] [sbase] style
On Wed, Nov 19, 2014, at 16:44, k...@shike2.com wrote: > > C90, or any version of standard C, does not have a concept of "system > > headers", other than giving implementations permission to place their > > own implementation-defined files in places searched by #include > > . > > At this point I was talking about POSIX of course. C90 doesn't give > implementations permission to place their own implementation-defined. > If your program relays on that, and include some ot these > implementation headers, then your program is not C90 compliant, > and the behaviour is undefined (from C90 point of view, not from > POSIX point of view). Er, by "permission" I meant it doesn't make the _implementation_ non-compliant. And implementation-defined is not the same as undefined. > - Each header declares and defines only those > identifiers listed in its associated section: If the header includes > another header then it will break this rule. I think this is meant as a statement that strictly conforming programs may not rely on them defining anything else. Most of these identifiers are reserved, and a strictly conforming program therefore cannot do anything with them without including the header they are documented as being defined in.
Re: [dev] [sbase] style
On Wed, Nov 19, 2014, at 13:51, k...@shike2.com wrote: > > > system headers should come first, then a newline, then libc headers > > then a newline then local headers. > > > I usually do just the inverse, first libc headers and later system > headers. > > > the libc headers are guaranteed to work regardless of the order of > > inclusion but need to come after the system headers. From what I > > Are you sure about that?. I know that C90 guarantees that any > standard header will not include any other standard header (althought > it is sad that a lot of compilers ignore this rule), but I have never > read anything about dependences between standard and system headers. C90, or any version of standard C, does not have a concept of "system headers", other than giving implementations permission to place their own implementation-defined files in places searched by #include . POSIX does not, as far as I can tell, allow systems to require headers to be included in any certain order. I have no idea what the categories "system headers" and "libc headers" refer to in the post you are replying to, what operating system he is using (certainly not POSIX - I think when I saw the post I got the vague impression he was talking about Plan 9), or which category the standard C headers or POSIX headers might fall into. There are such order-dependencies on some non-POSIX unix systems (I once had to move sys/types.h above socket.h to get a program to compile on 2.11BSD), and it may or may not make sense to order headers in line with those as a matter of tradition. In general, both standards require all headers to declare, for example, any typedefs that are present in the signature, without implying the inclusion of any other header that also defines the same types, and leaving it up to the implementation to determine how to accomplish this. 
For example, unistd.h cannot require that sys/types.h be included first just because it uses off_t which is also found in sys/types.h; the author of the header files has to figure out how to make them both define off_t without any conflict if both are included. I couldn't find the guarantee you mentioned, that one header shall not include another header, and I can't think of how doing so would affect the behavior of any strictly conforming program.
Re: [dev] why avoid install?
On Wed, Nov 19, 2014, at 09:55, Dimitris Papastamos wrote: > Regarding your question on cp -f then the answer is not quite. > > cp -f will try to unlink the destination if it fails to open it for > whatever > reason. And if the target is running and writing to a running binary is a problem, opening it will fail with [ETXTBSY], meaning it will be unlinked. You can argue about whether that is the purpose or something else (permission errors within a directory you own) is the purpose, but it will certainly solve that problem.
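As a sketch of the behavior being described (this is not the sbase cp code, and open_dest_force is a made-up name): try to open the destination, and if that fails for any reason, unlink it and create a fresh file.

```c
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: opens dst for writing the way "cp -f" is
 * described above. If the first open fails (for example with ETXTBSY
 * because dst is a running binary, or EACCES), dst is unlinked and a
 * new file is created in its place. Returns a file descriptor, or -1
 * with errno set by the second open(). */
int open_dest_force(const char *dst, mode_t mode)
{
	int fd = open(dst, O_WRONLY | O_TRUNC);

	if (fd < 0) {
		unlink(dst);   /* result ignored; the open below reports failure */
		fd = open(dst, O_WRONLY | O_CREAT | O_EXCL, mode);
	}
	return fd;
}
```

The running process keeps its old inode, so the new file can be written without disturbing it.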
Re: [dev] Patches to fix . for insert and change commands
On Tue, Nov 18, 2014, at 17:59, Stephen Paul Weber wrote: > I've written up patches to make it so that I, a, A, s, ce, etc can be > repeated properly with . -- not sure if I'm doing this the Right Way, > but > it seems to work in my tests. Feedback appreciated. Patches attached. Haven't looked at your patch, but vim stores the inserted keystrokes (not text - it'll happily let you repeat an inserted sequence of backspaces that deleted over the beginning of the insertion region, arrows that moved the cursor, etc) in a read-only register named with the period character. Pasting it with ^R. or ^A in insert-mode plays back the keystrokes and adds them to the text which will be in the register the next time you leave insert mode. I don't know offhand if this register is used for the . command or not.
Re: [dev] fsbm
On Fri, Nov 7, 2014, at 05:11, Dimitris Papastamos wrote: > It is generally unlikely that the string has been validated to > be an integer before getting to atoi(). With atoi() you cannot > distinguish between an invalid integer and 0. > > Generally speaking, it should never be used. What if you don't care?
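For the case where you do care, the usual alternative looks something like this. It is a sketch with a hypothetical name, using only strtol() from the standard library.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical helper: parses s as a base-10 int. Returns 0 and stores
 * the value in *out on success; returns -1 if s has no digits, has
 * trailing junk, or is out of range for int. Unlike atoi(), this
 * distinguishes "0" from "abc". */
int parse_int(const char *s, int *out)
{
	char *end;
	long v;

	errno = 0;
	v = strtol(s, &end, 10);
	if (end == s || *end != '\0')
		return -1;   /* nothing parsed, or garbage after the number */
	if (errno == ERANGE || v < INT_MIN || v > INT_MAX)
		return -1;   /* does not fit in an int */
	*out = (int)v;
	return 0;
}
```

atoi("abc") and atoi("0") both return 0; parse_int() accepts the second and rejects the first.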
Re: [dev] fsbm
On Fri, Nov 7, 2014, at 02:03, k...@shike2.com wrote: > I disagree, check the size before of calling strcpy. If you want to > avoid security risk you also have to check the output of strlcpy > to detect truncations, so you don't win anything. In both cases > you have to add a comparision, so it is better to use strcpy that > is standard. There are numerous scenarios where an overflow has security implications but a truncation does not. For example, if an attacker can supply any string, they could supply the shorter one to begin with, and therefore don't benefit from truncation.
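The two approaches being compared, as a sketch. Both function names are hypothetical; copy_truncating only imitates the strlcpy() convention of returning the source length and is not strlcpy itself.

```c
#include <string.h>

/* Check-then-strcpy: copies src only if it fits, NUL included.
 * Returns 0 on success, -1 (leaving dst untouched) if src is too long. */
int copy_checked(char *dst, size_t dstsize, const char *src)
{
	if (strlen(src) >= dstsize)
		return -1;
	strcpy(dst, src);
	return 0;
}

/* Truncating copy: copies as much as fits and always NUL-terminates
 * (when dstsize > 0). Returns strlen(src), so a caller that cares can
 * detect truncation by checking for a result >= dstsize, and a caller
 * that does not care can ignore the result without risking an overflow. */
size_t copy_truncating(char *dst, size_t dstsize, const char *src)
{
	size_t len = strlen(src);

	if (dstsize > 0) {
		size_t n = len < dstsize - 1 ? len : dstsize - 1;

		memcpy(dst, src, n);
		dst[n] = '\0';
	}
	return len;
}
```

The difference is what happens when the check is forgotten: a bare strcpy() overflows, while the truncating copy just yields a shorter string.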
Re: [dev] c++-style comments [was fsbm]
On Thu, Nov 6, 2014, at 16:47, Sylvain BERTRAND wrote: > Linus T. does let closed source modules live (even so the GNU GPLv2 gives > legal > power to open the code, or block binary blob distribution, like what > happens > with mpeg video or 3D texture compression), There's a significant amount of debate over what constitutes an "arm's length" interaction between two pieces of code and what makes them effectively a single piece of code. GNU takes the position that sharing the same address space in any way is the latter, and that normal interaction through files/pipes/sockets is the former (because it would be politically inconvenient for them to push too far) so long as it's not a specially defined protocol that only exists for that single pair of programs. The kernel people, as far as I know, take the position that sharing the same address space is okay so long as they only use certain approved APIs intended for use by modules - and that userspace-kernel interaction via normal system calls is always okay. None of this has been examined by a court.
Re: [dev] c++-style comments [was fsbm]
On Thu, Nov 6, 2014, at 12:34, Louis Santillan wrote: > In a color syntax highlighting editor, doSomething(); takes on normal > highlighting when enabled, and takes on comment colored highlighting > when > disabled. Visually, that's slightly improved over something like > >#ifdef DEBUG >doSomething(); >#endif In the editor *I* use, it has comment colored highlighting for #if 0, and for the #else of #if 1, and the same for anything with #if 0 && and #if 1 ||.
Re: [dev] [sbase] [PATCH 1/2] Fix symbolic mode parsing in parsemode
On Sun, Nov 2, 2014, at 17:24, Michael Forney wrote: > I found quite a lot of bugs, so I ended up pretty much rewriting as I > followed the spec¹. How about +X? I noticed there were no test cases for that. +X acts like +x provided either the file is a directory or the file already has at least one execute bit set. The function doesn't seem to be able to know if the file is a directory. Same for =X, and -X is identical to -x.
Re: [dev] [sbase] [PATCH 1/4] tar: Don't crash when get{pw,gr}uid fails
On Sat, Nov 1, 2014, at 18:01, Michael Forney wrote: > It looks like GNU tar does¹, but BSD tar uses the string > representation of the UID/GID. > > ¹ http://git.savannah.gnu.org/cgit/tar.git/tree/src/names.c#n66 I didn't think to look at a modern BSD (the relevant function is name_uid in pax/cache.c). Either way, any tar should be able to cope with either output, assuming no system has the pathological case of a user account with a numeric name different from its uid, but a blank string seems to be more POSIX-correct. There's another tar (possibly actually called bsdtar) in contrib/libarchive that I couldn't make heads or tails of (it uses some kind of modular design and I couldn't find the real implementation of everything)
Re: [dev] [sbase] [PATCH 1/4] tar: Don't crash when get{pw,gr}uid fails
On Sat, Nov 1, 2014, at 16:57, Dimitris Papastamos wrote: > On Sat, Nov 01, 2014 at 08:36:37PM +, Michael Forney wrote: > > - snprintf(h->uname, sizeof h->uname, "%s", pw->pw_name); > > - snprintf(h->gname, sizeof h->gname, "%s", gr->gr_name); > > + snprintf(h->uname, sizeof h->uname, "%s", pw ? pw->pw_name : ""); > > + snprintf(h->gname, sizeof h->gname, "%s", gr ? gr->gr_name : ""); > > The patches look good, thanks! > > Just a small clarification on this one, do other tar implementations > do the same here? Yes. I looked at heirloom (both tar and cpio), 4.4BSD pax, GNU tar, and star. Heirloom prints an error, none of the rest do, and all seem to put in an empty string (or do nothing and the field is initialized earlier with null bytes).
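The behavior in the patch, pulled out into a stand-alone sketch. fill_uname is a made-up name; the point is only that a failed lookup (a NULL struct passwd pointer) results in an empty name field instead of a crash.

```c
#include <pwd.h>
#include <stdio.h>

/* Hypothetical helper: fills buf with the user name from pw, or with an
 * empty string when the lookup failed and pw is NULL. snprintf() keeps
 * the result within size bytes and NUL-terminated. */
void fill_uname(char *buf, size_t size, const struct passwd *pw)
{
	snprintf(buf, size, "%s", pw ? pw->pw_name : "");
}
```

A caller would pass the result of getpwuid() directly, e.g. fill_uname(name, sizeof name, getpwuid(uid)), without checking it first.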
Re: [dev] [PATCH] [st] Use inverted defaultbg/fg for selection when bg/fg are the same
On Mon, Oct 27, 2014, at 10:54, Martti Kühne wrote: > This may sound trivial, but. > How about you paste it somewhere else? Requires having another window already open that can accept arbitrary text (and not attempt to execute it as commands).
Re: [dev] [PATCH] [st] Use inverted defaultbg/fg for selection when bg/fg are the same
On Mon, Oct 27, 2014, at 08:20, FRIGN wrote: > There's simply no reason to break consistency for some quirky irc-gag. But there's no compelling reason in the first place to visualize selection by inverting the colors. If you want "consistency" it can be achieved by having an actual selection color pair, or by _always_ using the default colors, but that's bikeshed painting. The reason why someone might have same-fg-as-bg on their screen is beside the point - having no way to make that text visible is a usability issue.
Re: [dev] SGI Irix look (4Dwm)
On Wed, Oct 22, 2014, at 14:01, Peter Hofmann wrote: > I'm pretty sure that most people on the list will agree on this being > just plain crazy. :-) It's a hack, it's ugly and it's anything but > suckless. > > I won't go into further detail. This causes many, many problems. The > only reason why dwm-vain still has these kind of borders is that I don't > find the time to either turn dwm into a reparenting WM or write a > reparenting WM from scratch. Why not just draw the title in a separate window? This is how I always assumed a suckless window manager with title bars would act.
Re: [dev] [st][PATCH] Add support for utmp in st (DISREGARD LAST)
Sorry I accidentally hit shift-enter and apparently that makes my email client send. On Mon, Oct 13, 2014, at 14:38, k...@shike2.com wrote: > But, why do you think is better DELETE than BACKSPACE? Because that is the character sent by the key in this position with this expected function (i.e. the What ascii codes are supposed they should send? (Home sends Home, not > Find). What is Home? { XK_Home, ShiftMask, "\033[2J", 0, -1, 0}, { XK_Home, ShiftMask, "\033[1;2H", 0, +1, 0}, { XK_Home, XK_ANY_MOD, "\033[H",0, -1, 0}, { XK_Home, XK_ANY_MOD, "\033[1~", 0, +1, 0}, { XK_End, ControlMask,"\033[J", -1,0, 0}, { XK_End, ControlMask,"\033[1;5F",+1,0, 0}, { XK_End, ShiftMask, "\033[K", -1,0, 0}, { XK_End, ShiftMask, "\033[1;2F",+1,0, 0}, { XK_End, XK_ANY_MOD, "\033[4~", 0,0, 0}, Find is ESC[1~, Select is ESC[4~. No other codes here are from DEC terminals.
Re: [dev] [st][PATCH] Add support for utmp in st
On Mon, Oct 13, 2014, at 14:38, k...@shike2.com wrote:
> > On Sun, Oct 12, 2014, at 14:32, k...@shike2.com wrote:
> > That doesn't mean that the question of what the default should be is not
> > worth discussing.
>
> Default configuration was discussed here some time ago, and suckless
> developers agreed with current configuration. Both options, Backspace
> generates BACKSPACE and Backspace generates DELETE have advantages and
> problems, and usually emulators have some way of changing between them.
> Xterm uses 3 resources for it: backarrowKeyIsErase, backarrowKey and
> ptyInitialErase, and has an option in the mouse menu: Backarrow.
> Putty has an option to select BACKSPACE or DELETE. But for example,
> vt(1) in Plan 9 always generates BACKSPACE, and it is not configurable.
>
> If the user wants another configuration the suckless way is config.h.
>
> But, why do you think is better DELETE than BACKSPACE?
>
> > The fact that you claim that matching the key codes to the labels on the
> > keys are so important, yet you send the key codes associated with the
> > VT220 Prior/Next/Find/Select keys when the user presses
> > PageUp/PageDown/Home/End.
>
> What ascii codes are supposed they should send? (Home sends Home, not
> Find). What is Home?
>
> > That is not from the section where COLUMNS and LINES are defined (scroll
> > further down the page). The table which the sentence you pasted is
> > attached to also includes other variables that are _definitely_ defined
> > by the standard, like TZ and HOME. LINES and COLUMNS are, for that
> > matter, defined in the same section that TERM is defined in.
>
> I just have seen them. You are right, they are defined by the standard,
> but the standard doesn't define how they must be updated (this was
> the part I knew, shells are not forced to set them).
>
> > I think we
> > have been talking at cross purposes, though... I was not saying,
> > precisely, that it was the terminal emulator's responsibility to _set_
> > them, merely to ensure they are _not_ set to values inherited from a
> > different terminal, which you appeared to be rejecting.
>
> Ok, I understand you now. Yes I agree in this point with you, and st
> already unsets LINES, COLUMNS and TERMCAP. If the user needs them he
> must set them in some way (maybe in his profile if he runs a login
> shell), or use some program that sets them. As we already have said,
> it is imposible for the terminal to set them each time the size is
> changed.
>
> We could say something similar for TERM, but it is impossible for the
> system to put this variable in a terminal emulator (system take it
> fomr /etc/ttys or /etc/inittab in real terminals), so I think terminal
> must set it.
>
> For the original problem, the incorrect setting of SHELL to utmp
> this is the patch:
>
> diff --git a/st.c b/st.c
> index c61b90a..bcf96e9 100644
> --- a/st.c
> +++ b/st.c
> @@ -1146,7 +1146,7 @@ die(const char *errstr, ...) {
>
>  void
>  execsh(void) {
> -	char **args, *sh;
> +	char **args, *sh, *prog;
>  	const struct passwd *pw;
>  	char buf[sizeof(long) * 8 + 1];
>
> @@ -1158,13 +1158,15 @@ execsh(void) {
>  			die("who are you?\n");
>  	}
>
> -	if (utmp)
> -		sh = utmp;
> -	else if (pw->pw_shell[0])
> -		sh = pw->pw_shell;
> +	sh = (pw->pw_shell[0]) ? pw->pw_shell : shell;
> +	if(opt_cmd)
> +		prog = opt_cmd[0];
> +	else if(utmp)
> +		prog = utmp;
>  	else
> -		sh = shell;
> -	args = (opt_cmd) ? opt_cmd : (char *[]){sh, NULL};
> +		prog = sh;
> +	args = (opt_cmd) ? opt_cmd : (char *[]) {prog, NULL};
> +
>  	snprintf(buf, sizeof(buf), "%lu", xw.win);
>
>  	unsetenv("COLUMNS");
> @@ -1172,7 +1174,7 @@ execsh(void) {
>  	unsetenv("TERMCAP");
>  	setenv("LOGNAME", pw->pw_name, 1);
>  	setenv("USER", pw->pw_name, 1);
> -	setenv("SHELL", args[0], 1);
> +	setenv("SHELL", sh, 1);
>  	setenv("HOME", pw->pw_dir, 1);
>  	setenv("TERM", termname, 1);
>  	setenv("WINDOWID", buf, 1);
> @@ -1184,7 +1186,7 @@ execsh(void) {
>  	signal(SIGTERM, SIG_DFL);
>  	signal(SIGALRM, SIG_DFL);
>
> -	execvp(args[0], args);
> +	execvp(prog, args);
>  	exit(EXIT_FAILURE);
>  }
>
> Guys, what do you think about it?
>
> Regards,
>

-- Random832
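The quoted patch separates two things the old code conflated: the program st executes and the value exported as SHELL. A minimal sketch of just that selection logic, pulled out into a standalone function so it can be read in isolation. `struct launch` and `choose_launch` are hypothetical names used for illustration, not part of st:

```c
#include <stddef.h>

/* Hypothetical helper type, not part of st. */
struct launch {
	const char *sh;    /* value to export as SHELL */
	const char *prog;  /* program that would be passed to execvp() */
};

/*
 * pw_shell: login shell from the passwd entry (may be empty)
 * fallback: compiled-in default shell
 * utmp:     utmp wrapper program, or NULL if not configured
 * opt_cmd:  command given on the command line, or NULL
 */
static struct launch
choose_launch(const char *pw_shell, const char *fallback,
              const char *utmp, char *const *opt_cmd)
{
	struct launch l;

	/* SHELL always names the user's shell, never the wrapper. */
	l.sh = (pw_shell && pw_shell[0]) ? pw_shell : fallback;

	if (opt_cmd)
		l.prog = opt_cmd[0];   /* explicit command wins */
	else if (utmp)
		l.prog = utmp;         /* otherwise the wrapper, if any */
	else
		l.prog = l.sh;         /* otherwise the shell itself */
	return l;
}
```

With this split, a configured utmp wrapper changes only what is executed; SHELL keeps pointing at the user's shell in every case.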
Re: [dev] [st][PATCH] Add support for utmp in st
On Sun, Oct 12, 2014, at 14:32, k...@shike2.com wrote: > If the user doesn't like the key assignation on st he is free of changing > it > in his config.h (maybe we could add it to the FAQ). That doesn't mean that the question of what the default should be is not worth discussing. > > You didn't comment on the prior/next/find/select issue, either. > > I don't unerstand what you mean here. The fact that you claim that matching the key codes to the labels on the keys are so important, yet you send the key codes associated with the VT220 Prior/Next/Find/Select keys when the user presses PageUp/PageDown/Home/End. > Quoting from this page: > > It is unwise to conflict with certain variables that are > frequently exported by widely used command interpreters and > applications: That is not from the section where COLUMNS and LINES are defined (scroll further down the page). The table which the sentence you pasted is attached to also includes other variables that are _definitely_ defined by the standard, like TZ and HOME. LINES and COLUMNS are, for that matter, defined in the same section that TERM is defined in. I think we have been talking at cross purposes, though... I was not saying, precisely, that it was the terminal emulator's responsibility to _set_ them, merely to ensure they are _not_ set to values inherited from a different terminal, which you appeared to be rejecting.
Re: [dev] [st][PATCH] Add support for utmp in st
On Sun, Oct 12, 2014, at 03:48, k...@shike2.com wrote: > And the profile runs in the same tty that st opens. St by default > executes a non login shell, so profile is not loaded, but utmp executes > a login shell (because it creates the utmp session, so it is more > logical for it to execute a login shell). Why shouldn't a non-login shell have a utmp session? And if this option is to use a login shell, rather than merely using utmp, then I don't think it should be a compile-time option - just because someone sometimes wants a login shell (which could be done before, if desired, by running e.g. sh -l) doesn't mean they always want one. > I work with systems where BACKSPACE deletes the previous > character, and it is really painful you cannot generate a BACKSPACE > character with some terminal emulators. The position of St developers > is very clear about this topic, st must generates the correct ascii > value for each key. What character does DEL generate? I'm assuming it generates either DELETE or "Remove Here", and either way it's going to be equally painful that you can't generate one of the sequences that a VT220 does. Meanwhile, the VT220 has no key that generates Backspace. You didn't comment on the prior/next/find/select issue, either. > > Their meaning is defined in the standard. The method of obtaining > > default values is not, but that means it's the implementation's > > responsibility, not that it doesn't mean anything at all. > > Can you put here in which part of POSIX they are defined?. I'm > sorry, but they are not standard (although are commons), and > even there are some shells (dash for example) that don't set them. http://pubs.opengroup.org/onlinepubs/007908799/xbd/envvar.html > > But unsetting it, along with the initial call to cresize, should be fine > > on most systems, so maybe I've been too harsh about this. > > I'm sorry, but this is a work of the shell, because it is not possible > for a terminal (a real one, not emulated) to set variables. 
A real terminal has a fixed size, which is known in termcap/terminfo. If a terminal supports multiple sizes, you would historically have had to alter the variables manually.
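For context on why stale LINES and COLUMNS values are harmful: on a pty, a program can ask the tty driver for the current size instead of trusting the environment. A minimal sketch, assuming a system that provides the TIOCGWINSZ ioctl; the function name `term_size` and the 24x80 fallback are choices made for this example only:

```c
#include <sys/ioctl.h>
#include <termios.h>
#include <unistd.h>

/*
 * Ask the tty driver for the window size of fd.
 * Returns 0 on success, -1 if fd is not a terminal (or reports a
 * zero size), in which case an arbitrary 24x80 fallback is stored.
 */
static int
term_size(int fd, int *rows, int *cols)
{
	struct winsize ws;

	if (ioctl(fd, TIOCGWINSZ, &ws) == 0 && ws.ws_row > 0 && ws.ws_col > 0) {
		*rows = ws.ws_row;
		*cols = ws.ws_col;
		return 0;
	}
	*rows = 24;
	*cols = 80;
	return -1;
}
```

A caller would typically pass STDOUT_FILENO and call the function again after each SIGWINCH.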
Re: [dev] [st][PATCH] Add support for utmp in st
On Sat, Oct 11, 2014, at 04:07, k...@shike2.com wrote:
> Value of erase key for example, or in general the configuration
> of line kernel driver

These can't come from the profile either, since st opens a new tty that is not the same device the user logged in on.

> (stty(1)). Backspace key in st generates
> BACKSPACE, but almost all terminals generate DELETE instead
> (read FAQ for more details).

It's not clear why the position of the key and the intent of the user typing it is less important than the label of the key. What's st's position on prior/next/find/select vs pgup/pgdn/home/end? And speaking of the editing keypad keys, the code that most terminals send for the del key is that for the VT220 "remove here" key. The delete key above enter on the VT220 is labeled

> I also have some adittionals
> configurations like for example 'tput smkx' (set keypad on),
> or `tabs`.About the three variables you tell, TERM is the only
> that a terminal must set, LINES and COLUMNS are shell stuff
> and are not even standard.

Their meaning is defined in the standard. The method of obtaining default values is not, but that means it's the implementation's responsibility, not that it doesn't mean anything at all. But unsetting it, along with the initial call to cresize, should be fine on most systems, so maybe I've been too harsh about this.
Re: [dev] [st][PATCH] Add support for utmp in st
On Tue, Sep 23, 2014, at 01:18, Roberto E. Vargas Caballero wrote: > St runs an interactive shell and not a login shell, and it means > that profile is not loaded. The default terminal configuration > in some system is not the correct for st, but since profile is > not loaded there is no way of getting a script configures the > correct values. What exactly does "terminal configuration" mean here? TERM, LINES, and COLUMNS? Shouldn't st itself be responsible for setting these? They certainly don't belong in the profile. What is utmp doing, exactly, and why does st want to run the user's default shell instead of the SHELL that's passed in to st's environment by its parent? Is it appropriate to be setting SHELL to utmp? Why set SHELL at all? What program does utmp execute, and is it intentional that utmp is not executed if the user specifies a command?
Re: [dev] Ideas for using sic
On Wed, Oct 1, 2014, at 12:57, q...@c9x.me wrote: > On Mon, Sep 29, 2014 at 09:55:07PM -0700, Eric Pruitt wrote: > > rlwrap ./sic -h "$IRC_HOST" | tee -a irc-logs | grcat sic.grcat > > Hi, > > how does rlwrap deal with random text that gets inserted by sic > when some data arrives on the channel? This was my main problem > with sic, to prevent that and enable multichannel I have written > http://c9x.me/irc/. It occurs to me that a "line input" program (that would work along the lines of a mud client, with a separate editable input line from where output goes, and maybe managing scrollback) would be a good candidate for a "do one thing" utility.
Re: [dev] [RFC] Design of a vim like text editor
On Thu, Sep 25, 2014, at 08:57, Raphaël Proust wrote: > I actually have my vimrc setting K as an upward J (i.e., join current > line with the previous one) (although I haven't made the effort to > make it work in visual mode because then I just use J): > nnoremap K :.-,.join Why not just map it to kJ?
Re: [dev] [RFC] Design of a vim like text editor
On Wed, Sep 24, 2014, at 15:36, Marc André Tanner wrote: > > - 'J' in visual mode is not implemented > > Why would one use it? To be able to select lines to be joined interactively instead of having to count the lines by hand (since there's no J, only J). I do this all the time.
Re: [dev] [RFC] Design of a vim like text editor
On Wed, Sep 24, 2014, at 15:21, Marc André Tanner wrote: > > x should not delete the end of line character (but this might be solved > > with the placement issue above) > > I (and a few others? Christian Neukirchen?) actually like the fact that > the newline is treated like a normal character. You might consider an option like "whichwrap" [which can make vim delete newline with x - well, not x, but 2x.] to enable and disable this behavior.
Re: [dev] [st] Understading st behaviour
On Wed, Apr 16, 2014, at 4:19, Amadeus Folego wrote: > It works! As I am using tmux just for the scrollback and paste > capabilities I am not worried with losing sessions. > > Maybe I'll write a suckless multiplexer for this sometime. Eh - "multiplexing" refers to the multiple session capability, not to the scrolling. The basic issue is that tmux provides three relatively distinct features: scrolling, multiplexing, and detachability. A program providing any one of these capabilities essentially has to be a terminal emulator - you can take some shortcuts, like passing through the keyboard, and passing through output rather than reinterpreting it, but you've got to parse all output control sequences to know what's on the screen. For scrolling, you need it in order to understand what has scrolled off the screen and in order to restore the main screen when you're done with scrolling. For multiplexing, you need it in order to effectively switch between windows. For detaching, you need it to restore the content when reattaching. I've actually used a detaching program that doesn't track screen contents (it discards all output while detached, and sends SIGWINCH or control-L on reattach to make the program redraw itself) - it's not pleasant to deal with for non-fullscreen programs. You could do multiplexing the same way, in principle, but it's intractable for scrolling. A "truly suckless" design would have the three features in separate programs. And since they all have to do essentially the same thing (maintain their own idea of the screen state and redraw it on demand), this functionality could be in a library. Or you could just have it in the scrolling program and the other two programs don't care, which would make it a somewhat unpleasant experience to try to use them without being in conjunction with the scrolling program. That's also three separate programs you have to control from the keyboard.
Re: [dev] What is bad with Python
On Wed, Mar 12, 2014, at 15:04, FRIGN wrote: > Impressive, but better use > $ LD_TRACE_LOADED_OBJECTS=1 t > instead of > $ ldd t > next time to prevent arbitrary code-execution[1] in case you're dealing > with unknown binaries. I don't know if it was here and you or somewhere else or someone else, but someone said this before and I pointed out the problems with this argument. It's even worse in this case because you propose using LD_TRACE_LOADED_OBJECTS=1 t [which won't actually work, incidentally, without . in PATH] instead of LD_TRACE_LOADED_OBJECTS=1 /lib/ld-linux.so.2 ./t - your proposed command doesn't actually prevent the exploit (it actually makes it easier, by making it possible to exploit with a mere statically-linked program rather than a fancy ELF interpreter trick) Also, wanting to do this with an unknown, untrusted executable is, in practice, _incredibly rare_. And since this is an executable he just built himself, it obviously doesn't apply here. The 'safe' command [which, remember, you got wrong] is onerously long for a suggestion that people should use every time. Maybe the best way forward is to make ldd default to the safe way and require user confirmation (with a warning) before the unsafe one.
Re: [dev] [sbase] move mknod(1) to ubase
On Sat, Jan 25, 2014, at 17:46, Roberto E. Vargas Caballero wrote: > Uhmm, it looks bad. If we want to be 100% POSIX complaint then we have to > move > mknod to ubase, and change the mknod system call of tar (and next > archivers that > could be implemented in sbase) to a system("mknod ..."). The mknod utility isn't in POSIX either. POSIX permits tar implementations to ignore block and character device entries: http://pubs.opengroup.org/onlinepubs/7908799/xcu/pax.html
Re: [dev] portable photoshop-like lite application based on C?
On Tue, Dec 3, 2013, at 9:50, Markus Teich wrote: > Mihail Zenkov wrote: > > ldd /usr/bin/gimp-2.8 > > Heyho, > > http://www.catonmat.net/blog/ldd-arbitrary-code-execution/ Considering that he probably _actually_ executes the very same gimp-2.8 binary all the time, your concern is misplaced. This attack is highly situational, requiring the attacker to cause someone to encounter a binary that they would not otherwise execute and to be curious about what libraries it uses. "Don't run ldd on an unknown binary you wouldn't execute" becomes "don't run ldd ever on anything" - the cargo cult at its finest. I propose not allowing untrusted binaries to be placed in /usr/bin in the first place.
Re: [dev] suckless shell prompt?
On Tue, Nov 26, 2013, at 12:09, Bryan Bennett wrote: > And sending that email calls into question your ability to either read > a full thread or to recognize human names. In my defense, you'd already had it pointed out to you once and continued in your misconception without even understanding the correction. That you would then miraculously realize your mistake and send a later email (which you have no reason to assume that I'd received at the time I wrote my response) retracting it is not something I could reasonably have been expected to predict.
Re: [dev] suckless shell prompt?
On Mon, Nov 25, 2013, at 5:26, Martti Kühne wrote: > Announcing a shell prompt and including git.h indeed makes no sense > whatsoever. What part of git is useful when writing a shell > interpreter? I'm sorry, I can't possibly imagine how this isn't > apparent to you. Do you understand the difference between a prompt and an interpreter? This is a program that is meant to be called to print stuff before each command you type (Several shells include the ability to call such a program). Not a "shell interpreter". This is _so_ blindingly obvious that your failure to recognize it calls your ability to have basic reading comprehension into question.
Re: [dev] suckless shell prompt?
On Thu, Nov 21, 2013, at 13:44, Martti Kühne wrote: > Staring at the code in horror. > Something about git and nyancat. > Without running the code - I have trust issues from similar occasions > - you're kidding, right? The nyancat thing is clearly just a little joke. As for git... you can't _possibly_ be serious about being horrified that a program written for the specific purpose of displaying git repository information uses git.
Re: [dev] Mailing list behavior - was: Question about arg.h
On Thu, Nov 7, 2013, at 11:42, Calvin Morrison wrote: > Why do I top post? yes i am lazy! After being with gmail since it was > in beta, I still don't have an option to god damned bottom-post by > default!! Top posting or bottom posting isn't an "option", it's determined by _where you click the mouse_. You're not supposed to just start typing where the cursor drops, you're supposed to edit out the bits of the quote that you're not replying to. I'm sick of people blaming their email clients and other people taking this at face value. What the hell would such an option _do_?
Re: [dev] Suckless remote shell?
On Tue, Nov 5, 2013, at 9:43, Szabolcs Nagy wrote: > you don't have large file support, The lack of large file support is entirely an artifact of the fact that the "lseek" listed on that page uses an int instead of an off_t. The existence of special APIs for large file support on e.g. Linux and Solaris is an artifact of the fact that OSes made before a certain time period used a 32-bit type for off_t. A modern OS does not need any more system calls for large file support, since you can simply discard the non-large-file-supporting versions of those system calls.
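To illustrate the point about off_t: where off_t is a 64-bit type, plain lseek already handles offsets past 4 GiB and no separate "large file" call is needed. A minimal sketch; `_FILE_OFFSET_BITS` is the glibc switch that selects a 64-bit off_t on 32-bit targets (elsewhere it is assumed to be unnecessary), and `seek_gib` is a name made up for this example:

```c
#define _FILE_OFFSET_BITS 64   /* glibc: must come before any #include */
#include <sys/types.h>
#include <unistd.h>

/* Seek to an absolute offset given in GiB; returns the new offset or -1. */
static off_t
seek_gib(int fd, int gib)
{
	/* Widen before multiplying so the product is computed in off_t. */
	off_t target = (off_t)gib * 1024 * 1024 * 1024;

	return lseek(fd, target, SEEK_SET);
}
```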
Re: [dev] st: bracketed paste mode
On Thu, Sep 19, 2013, at 10:51, Nick wrote: > To check, how does this work exactly? Does X send the escape code to > any window when pasting with middle click, and those which don't > understand it just ignore it? And then once st has done the > appropriate stuff with the pasted text, vim (for example) will > detect that and behave as though :paste is enabled for the duration > of the paste? The application has to request it be enabled with a private mode escape sequence. I don't believe vim presently has any built-in support for it, but I could be wrong - and you could probably hack it by putting the mode in t_is or t_ti, putting the end escape sequence in :set paste, and setting a keybinding for the start sequence.
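For reference, the sequences involved in bracketed paste as implemented by xterm-compatible terminals: the application enables the mode with ESC [ ? 2004 h (and disables it with ESC [ ? 2004 l), after which the terminal wraps pasted text in ESC [ 200 ~ and ESC [ 201 ~. A minimal sketch of the terminal side; `wrap_paste` and the macro names are hypothetical and not taken from st:

```c
#include <stdio.h>
#include <string.h>

#define PASTE_ENABLE  "\033[?2004h"  /* sent by the application */
#define PASTE_DISABLE "\033[?2004l"  /* sent by the application */
#define PASTE_START   "\033[200~"    /* sent by the terminal before a paste */
#define PASTE_END     "\033[201~"    /* sent by the terminal after a paste */

/*
 * Build what the terminal would write to the pty for a paste.
 * bracketed is nonzero if the application enabled the mode.
 * Returns the number of bytes stored, or -1 if out is too small.
 */
static int
wrap_paste(char *out, size_t outsz, const char *text, int bracketed)
{
	int n;

	if (bracketed)
		n = snprintf(out, outsz, "%s%s%s", PASTE_START, text, PASTE_END);
	else
		n = snprintf(out, outsz, "%s", text);
	return (n < 0 || (size_t)n >= outsz) ? -1 : n;
}
```

An application that never sends the enable sequence sees pasted text exactly as before, which is why the mode has to be requested.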
Re: [dev] [sbase] sponge v2
On Tue, Jul 2, 2013, at 13:42, Calvin Morrison wrote: > doesn't sponge soak up into memory, not into a file? Sponge's "killer feature" is that it doesn't open the output file until after the input is finished. Putting it in the middle of a pipeline throws that away, because then something other than sponge opens and writes the file.
Re: [dev] [st] Implementing non-latin support...
On Sat, Jun 15, 2013, at 0:35, Eon S. Jeon wrote: > Thanks for your interest. > > Would you explain how you tested? I've done only few tests: echo & vim. > The cursor handling should be incomplete, because I used a very hacky > method to workaround the innate ASCII-ism structure. For cursor behavior, generally what other terminals do is allow the cursor to "actually" be in either of the two cells (and movement commands can place it in either one), but they _draw_ it over the whole character (moving the cursor from one half of a wide character to the other therefore has no visual change). When e.g. horizontal movement in something like a text editor goes one whole wide character at a time, it's generally because the application is enforcing this by moving it two columns explicitly. What you should do is run the command "stty cbreak -echo; cat", then do some typing (and pasting of wide characters), moving around the cursor with arrows (which send single cursor movement escape sequences), and type in other escape sequences for anything you're curious about. I've attached a file I used as a test suite to discover the behavior of other terminals. Note that you should do this outside of tmux if you use it; tmux itself has some bugs in this area that can make it hard to understand what's going on. 
[0m#8[H [4;10H[0;7mOverwrite Tests:[36;44m [5;10HEEEEEEEEEEEEE [6;10HEEEEEEEEEEEEE [7;10HEEEEEEEEEEEEE [8;10HEEEEEEEEEEEEE [9;10HEEEEEEEEEEEEE [9;10H[41;33mA[AB[AC[AD[AEF[BG[BH[BI[BJ[BK [9;72H[0;7mWrap tests:[m [10;73Hï¼ï¼ï¼ï¼ï¼ï¼ï¼ï¼ï¼ [11;74Hï¼ï¼ï¼ï¼ï¼ï¼ï¼ï¼ï¼ [12;75Hï¼ï¼ï¼ï¼ï¼ï¼ï¼ï¼ï¼ [13;76Hï¼ï¼ï¼ï¼ï¼ï¼ï¼ï¼ï¼ [15H[0;7mDeletion Tests:[m [16Hï¼ï¼ï¼[16H[1P [17Hï¼ï¼ï¼[17H[2P [18Hï¼ï¼ï¼[18H[3P [19Hï¼ï¼ï¼[19H[4P [20Hï¼ï¼ï¼[20H[5P [21Hï¼ï¼ï¼[21H[6P [16;10Hï¼ï¼ï¼[16;10H[P [17;10Hï¼ï¼ï¼[17;11H[P [18;10Hï¼ï¼ï¼[18;12H[P [19;10Hï¼ï¼ï¼[19;13H[P [20;10Hï¼ï¼ï¼[20;14H[P [21;10Hï¼ï¼ï¼[21;15H[P [16;20Hï¼ï¼ï¼[16;20H[P [17;20Hï¼ï¼ï¼[17;20H[P[P [18;20Hï¼ï¼ï¼[18;20H[P[P[P [19;20Hï¼ï¼ï¼[19;20H[P[P[P[P [20;20Hï¼ï¼ï¼[20;20H[P[P[P[P[P [21;20Hï¼ï¼ï¼[21;20H[P[P[P[P[P[P [14;40H| [15;40H| [16;40H| [17;40H| [18;40H| [19;40H| [20;40H| [21;40H| [22;40H| [23;40H| [23H
Re: [dev] [st] Implementing non-latin support...
On Fri, Jun 14, 2013, at 23:22, Eon S. Jeon wrote: > I'm not used to IRC, but I'll try to stay in the channel. It'll be nice > to talk about this topic. > > By the way, would you give me some information about your patch? I > started working on this, because I had not been able to find actual > works. > > Well, instead, I found some mails posted by you in April. I kinda agree > with what you were talking about. It does feel awkward to store utf8 > stream > instead of code points, though I decided to bear it. lol I stored utf8 because it already stores utf8; but then I ended up not being able to actually come up with a solution for combining characters, so what do I know? There's a copy of my st.c attached to one of those emails, I think. -- Random832
Re: [dev] [st] Implementing non-latin support...
On Fri, Jun 14, 2013, at 17:24, esj...@lavabit.com wrote: > I'm currently working on non-latin character support. I uploaded my > progress to github. > > Github URL: https://github.com/esjeon/st/tree/stable-nonlatin > (branch 'stable-nonlatin', meaning it's based on stable(?) release 0.4.1) > ... and here's my test string: 한글 漢字 > ひらがな > > Everything looks just okay. Basically, wide characters are displayed > correctly, and can be selected and copied. I have not tested with input > methods, because I don't use them. I already had a wide character patch a few weeks ago, and did some fairly extensive testing of what other terminals do with them in various overwriting/insertion/deletion situations. Are you on IRC?
Re: [dev] [sbase] 64-bit type for split
On Tue, Jun 11, 2013, at 13:35, Galos, David wrote: > In my implementation of split, the ability to split files into rather > large chunks is important. However, c89 does not provide a 64-bit int > type by default. Although I could manually emulate 64-bit counting, a > uvlong would be far cleaner. Is there a suckless-approved way of using > such an integer in a c89 environment? c89 provides whatever size types it wants to. How exactly do you think you are going to be able to work with / create files larger than whatever off_t type is provided by the environment supports? Or are you limiting this to pure ansi instead of posix?
Re: [dev] [sbase] changes to test
On Thu, May 30, 2013, at 15:31, Christoph Lohmann wrote: > Please make this a diff or patch and include the manpage too. Just > throwing out code pieces does not really keep maintainers motivated. Okay - I'll get it in patch format later today, but it might be this weekend before I have time to write a manpage - test has a _lot_ of options. -- Random832
Re: [dev] [sbase] changes to test
On Thu, May 30, 2013, at 10:09, random...@fastmail.us wrote:
> This version has binary operators (e.g. = != -gt -lt) implemented
> (limited to the r

My client ate part of this sentence. It was "limited to the range of intmax_t".
[dev] [sbase] changes to test
I had partially implemented the test/[ command a while ago, and then got distracted with other things and never came back to it - I remembered about it when I saw this other sbase patch.

This version has binary operators (e.g. = != -gt -lt) implemented (limited to the r, and properly handles being called as /bin/[ (previous version required argv[0] to be == "[" to invoke [ behavior, this one simply checks the last character of argv[0]).

It still only supports the POSIX single-test syntax, with no support for the XSI ( ) -a -o operators.

/* See LICENSE file for copyright and license details. */
#include <inttypes.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>
#include "util.h"

static bool unary(const char *, const char *);
static bool binary(const char *, const char *, const char *);
static void usage(void);

bool is_bracket = false;

int
main(int argc, char *argv[])
{
	bool ret = false, not = false;

	argv0 = argv[0];
	if(*argv0 && argv0[strlen(argv0)-1] == '[') {
		/* checks if argv[0] ends with [
		 * for [ or /bin/[ etc */
		is_bracket = true;
		if(strcmp(argv[argc-1], "]") != 0)
			usage();
		argc--;
	}
	if(argc > 2 && !strcmp(argv[1], "!")) {
		not = true;
		argv++;
		argc--;
	}
	switch(argc) {
	case 2:
		ret = *argv[1] != '\0';
		break;
	case 3:
		ret = unary(argv[1], argv[2]);
		break;
	case 4:
		ret = binary(argv[1], argv[2], argv[3]);
		break;
	default:
		usage();
	}
	if(not)
		ret = !ret;
	return ret ? EXIT_SUCCESS : EXIT_FAILURE;
}

bool
unary(const char *op, const char *arg)
{
	struct stat st;
	int r;

	if(op[0] != '-' || op[1] == '\0' || op[2] != '\0')
		usage();
	switch(op[1]) {
	case 'b': case 'c': case 'd': case 'f': case 'g':
	case 'p': case 'S': case 's': case 'u':
		if((r = stat(arg, &st)) == -1)
			return false; /* -e */
		/* every case of the inner switch returns */
		switch(op[1]) {
		case 'b': return S_ISBLK(st.st_mode);
		case 'c': return S_ISCHR(st.st_mode);
		case 'd': return S_ISDIR(st.st_mode);
		case 'f': return S_ISREG(st.st_mode);
		case 'g': return st.st_mode & S_ISGID;
		case 'p': return S_ISFIFO(st.st_mode);
		case 'S': return S_ISSOCK(st.st_mode);
		case 's': return st.st_size > 0;
		case 'u': return st.st_mode & S_ISUID;
		}
	case 'e': return access(arg, F_OK) == 0;
	case 'r': return access(arg, R_OK) == 0;
	case 'w': return access(arg, W_OK) == 0;
	case 'x': return access(arg, X_OK) == 0;
	case 'h': case 'L':
		return lstat(arg, &st) == 0 && S_ISLNK(st.st_mode);
	case 't': return isatty((int)estrtol(arg, 0));
	case 'n': return arg[0] != '\0';
	case 'z': return arg[0] == '\0';
	default:
		usage();
	}
	return false; /* should not reach */
}

bool
binary(const char *arg1, const char *op, const char *arg2)
{
	intmax_t iarg1, iarg2;

	if(!strcmp(op,"=") || !strcmp(op,"==")) {
		return strcmp(arg1,arg2) == 0;
	}
	if(!strcmp(op,"!=")) {
		return strcmp(arg1,arg2) != 0;
	}
	/* Note: this does not handle correctly if the values are both
	 * out of range in the same direction, it will consider them
	 * equal. */
	iarg1 = strtoimax(arg1, 0, 10);
	iarg2 = strtoimax(arg2, 0, 10);
	if(!strcmp(op,"-eq")) return iarg1 == iarg2;
	if(!strcmp(op,"-ne")) return iarg1 != iarg2;
	if(!strcmp(op,"-gt")) return iarg1 > iarg2;
	if(!strcmp(op,"-ge")) return iarg1 >= iarg2;
	if(!strcmp(op,"-lt")) return iarg1 < iarg2;
	if(!strcmp(op,"-le")) return iarg1 <= iarg2;
	usage();
	return false; /* should not reach */
}

void
usage(void)
{
	const char *ket = is_bracket ? " ]" : "";

	eprintf("usage: %s string%s\n"
	        "       %s [!] [-bcdefghLnprSstuwxz] string%s\n"
	        "       %s [!] string1 {=,!=} string2%s\n"
	        "       %s [!] int1 -{eq,ne,gt,ge,lt,le} int2%s\n"
	        , argv0, ket, argv0, ket, argv0, ket, argv0, ket);
}
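The comment in binary() notes that two operands which are out of range in the same direction compare as equal, because strtoimax clamps both to the same limit. One way to detect that case is to check errno and the end pointer after the conversion. A minimal sketch; `parse_int` is a hypothetical helper, not part of the posted code or of sbase:

```c
#include <errno.h>
#include <inttypes.h>

/*
 * Returns 0 and stores the value on success, -1 if s is not a
 * complete decimal integer or does not fit in intmax_t.
 */
static int
parse_int(const char *s, intmax_t *out)
{
	char *end;

	errno = 0;
	*out = strtoimax(s, &end, 10);
	/* end == s: no digits; *end != 0: trailing junk; ERANGE: clamped */
	if (end == s || *end != '\0' || errno == ERANGE)
		return -1;
	return 0;
}
```

A caller in the style of the code above could invoke usage() (or report an error) when either operand fails to parse, rather than comparing clamped values.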
Re: [dev] Re: Why HTTP is so bad?
On Sun, May 26, 2013, at 9:21, Dmitrij Czarkoff wrote: > May it owe to the fact that this particular IPC protocol is *the* > protocol > used for nearly all IPC in the system? I have no idea what protocol you are talking about.
Re: [dev] upload via html?
On 05/25/2013 07:29 PM, Nicolas Braud-Santoni wrote:
> Well, SFTP requires you to create a user account. (I'm aware that it may
> not be one with which you can SSH in). Some people might not want this.

Everything runs as a user. You could use www-data, whatever anonymous FTP uses, or simply "nobody". There's no fundamental reason you couldn't write an SFTP daemon that allows anonymous access. However, this doesn't exist by default.

Also, and this is something many people may not know, it's non-trivial to make an account that cannot be used for _port forwarding_ - simply making it impossible to log in with a shell [e.g. shell set to /bin/false] doesn't accomplish this.
Re: [dev] Re: Why HTTP is so bad?
On 05/25/2013 12:55 AM, Strake wrote:
> Yes. Thus I can easily swap out any component, or insert mediators
> between components. For example, I could write my own fetcher to scrub
> the HTTP headers, or block ads; and I wouldn't need plug-ins to view
> PDFs or watch movies.

Why is the requirement that it conform to your IPC protocol* less onerous than requiring it to conform to a particular in-process API that would make it a "plug-in"?

*which has to handle navigation on both ends, A] what happens when you click a link in your viewer and B] what happens to your viewer when the user navigates away from it.

Also, is the browser required to download the whole file before opening the viewer, or can for example a PDF viewer display the first page before the last page is downloaded? Also for large files (highly relevant to a movie viewer) with a file format that allows it, you could take advantage of range fetching, but in both of these cases the viewer has to speak HTTP and just be told a URL by the navigation component.
Re: [dev] Re: Why HTTP is so bad?
On 05/24/2013 07:13 PM, Strake wrote:
> > And you spend a day on wikipedia or tvtropes and you've got two
> > hundred HTML viewers open?
>
> Yes.

I meant as opposed to the usual dozen.

> The viewer sends a "go" message back to the fetcher, which kills the
> old viewer and loads the new one, and can keep a URL log.

So the fetcher (which presumably also has UI elements such as an address bar, back/forward button, etc) is the monolithic browser I described. How exactly is this different from the current model? That the two components communicate via IPC rather than an in-process API?
Re: [dev] Re: Why HTTP is so bad?
On Fri, May 24, 2013, at 16:02, Strake wrote: > Yes. A web browser ought to have a component to fetch documents and > start the appropriate viewer, as in mailcap. The whole monolithic web > browser model is flawed. And you spend a day on wikipedia or tvtropes and you've got two hundred HTML viewers open? You need _something_ monolithic to manage a linear (or, rather, branching only when you choose to, via open new window or new tab) browsing history, even if content viewers aren't part of it. When you click a link within "the appropriate viewer", it needs to be _replaced_ with the viewer for the content at the link you clicked on. And if you don't like the way people normally browse a site like wikipedia or tvtropes, then... well, you've missed the point of hypertext, and what you're building isn't a web browser.
Re: [dev] upload via html?
On Tue, May 14, 2013, at 8:50, Martti Kühne wrote: > On Tue, May 14, 2013 at 2:44 PM, wrote: > > On Mon, May 13, 2013, at 23:20, Sam Watkins wrote: > >> HTTP PUT with ranges would be useful, could mount filesystems over HTTP. > > > > There's no standard HTTP directory listing. > > How is lack of a standard a problem in that concern? Context. I was replying to someone saying something about mounting filesystems over HTTP.
Re: [dev] upload via html?
On Mon, May 13, 2013, at 23:20, Sam Watkins wrote: > HTTP PUT with ranges would be useful, could mount filesystems over HTTP. There's no standard HTTP directory listing.
[libutf] Re: [dev] [st][patch] not roll our own utf functions
On 05/05/2013 01:06 PM, Nick wrote:
> Hmm, I'm not sure that's the right decision. Maybe include the
> appropriate .c & .h file for libutf in the source tree? That's what I
> do in a couple of projects. I don't have strong feelings about it, but
> libutf is pretty reasonable and I'm not convinced it should be avoided.

I also have to wonder what's the point of libutf at all if it's not going to be used as the UTF-8 library for suckless projects.
Re: [dev] [st] RFC halt function
On Thu, Apr 25, 2013, at 14:15, Christoph Lohmann wrote: > Nice joke. Try to implement a scrollback buffer without bugs and flaw‐ > lessly. > > > Sincerely, > > Christoph Lohmann The buffer's the easy part. What's hard is actually implementing scrolling. I've been tempted to hack in a way to have the mouse wheel transparently tell tmux to scroll up and down through its modal scrolling feature.
Re: [dev] st: Large pile of code
On Wed, Apr 24, 2013, at 15:32, Kent Overstreet wrote: > I switched to gnu99 for typeof() - it makes it possible to write min > and max macros that don't evaluate their arguments twice, and IMO is a > very worthwhile extension. Wait, you switched _to_ gnu99? For _that_? A) Why do min and max need to be macros at all? Also, where do you call them on anything that's not an int? B) Where do you call them on anything that has side effects (i.e. that _needs_ to not be evaluated twice)?
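To make the point about macros concrete, here is a minimal sketch in plain C99 (no typeof). The names imin, imax, MIN_MACRO and next_value are made up for illustration and are not taken from st. It only shows that ordinary int functions evaluate each argument once, while the textbook macro may evaluate one of them twice.

```c
/* Plain functions: each argument is evaluated exactly once, and no
 * GNU extension such as typeof() is needed when everything is an int. */
static inline int imin(int a, int b) { return a < b ? a : b; }
static inline int imax(int a, int b) { return a > b ? a : b; }

/* The textbook macro, for contrast: the smaller argument is evaluated
 * a second time when it is returned. */
#define MIN_MACRO(a, b) ((a) < (b) ? (a) : (b))

/* Hypothetical helper with a side effect, so that a double evaluation
 * becomes observable through the call counter. */
static int calls;
static int next_value(void) { return ++calls; }
```

With these definitions, imin(next_value(), 10) calls next_value() once, whereas MIN_MACRO(next_value(), 10) calls it twice. That only matters if some call site really passes an expression with side effects.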
Re: [dev] st: Large pile of code
On Wed, Apr 24, 2013, at 9:32, Carlos Torres wrote: > I like the separation of term.c from st.c, I agree that makes reading > st.c clearer. I can't comment on the removal of forward declarations, > typedefs and static vars though the resulting difference is legible as > well. (frankly code in alphabetical order makes me want to sort it > according to code flow and surrounding context...) i think the choice > of using the fontconfig utf8 functions was a good idea. I frowned > when you switched to 'gnu99' from 'c99' (i pictured a lot of flames on > that) If it _can_ be compiled in c99 mode, there's no reason it shouldn't be - then people can compile it using LLVM/clang, tendra, pcc, etc. How hard is it going to be to merge these changes with the changes made to the main version since he branched off from it?
Re: [dev] [PATCH] Fix selecting clearing and BCE
On 04/23/2013 05:27 PM, Roberto E. Vargas Caballero wrote: It is very confusing see a hightlight blank line, that really is selecting the previous content of the line. If the selecting mark keeps in the screen it is only some garbage in it. If you can find other terminal emulator with this behaviour please let me know it. Maybe the behavior is wrong - but if the problem is that it is _still selected_ (i.e. hilight goes away when you select something else), it's not something that can be solved with anything to do with visual attributes only. That was why I was asking for clarification whether it is _still selected_, or just _still hilighted_. I wasn't able to view the video or run st at the time when you posted the video... now I've run st and confirmed that the problem is that it is _still selected_. I can work on a patch to fix this today. This really has nothing to do with the visual attribute, it's that the logic for removing the selection when its content changes (whether by erasing or by text being printed within it) is broken or missing.
Re: [dev] [PATCH] Fix selecting clearing and BCE
On Tue, Apr 23, 2013, at 17:05, Roberto E. Vargas Caballero wrote: > What _exactly_ is the behavior you are observing? Are you sure it's not _actually_ staying selected, rather than simply drawing that way? If you click the mouse somewhere else, does the original selection go away, or does it stay reversed? If it goes away, what is the problem? Are you expecting it should go away immediately when it is erased? There is some merit to the idea that the selection should go away if any character within it is modified - maybe we should be talking about that. -- Random832
Re: [dev] [PATCH] Fix selecting clearing and BCE
On Tue, Apr 23, 2013, at 16:21, Roberto E. Vargas Caballero wrote: > In drawregion you have: > > 3172 bool ena_sel = sel.bx != -1; > 3173 > 3174 if(sel.alt ^ IS_SET(MODE_ALTSCREEN)) > 3175 ena_sel = 0; > ... > 3190 if(ena_sel && *(new.c) && selected(x, y)) > 3191 new.mode ^= ATTR_REVERSE; > > in selclear: > > 937 sel.bx = -1; > 938 tsetdirt(sel.b.y, sel.e.y); > > in bpress: > > 822 if (sel.snap != 0) { > 823 tsetdirt(sel.b.y, sel.e.y); > 824 draw(); > 825 } > > > > It means when you select something you modify the attribute of the > selected > region. That's not true. Only the attribute passed to xdraws() is altered - the real attribute stored in the character cell is left alone. Line 3189 new = term.line[y][x]; makes a _copy_ of the Glyph structure, and line 3191 only modifies the copy, not the original.
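The copy semantics described above can be sketched in isolation. The Glyph layout and the draw_mode helper below are simplified stand-ins made up for illustration, not st's actual definitions. The only thing being shown is that C struct assignment copies the value, so toggling a bit on the copy leaves the stored cell alone.

```c
/* Simplified stand-ins; st's real Glyph has more fields. */
enum { ATTR_REVERSE = 1 };

typedef struct {
    char c[4];
    unsigned mode;
} Glyph;

/* Returns the mode that would be handed to the drawing routine for a
 * cell, without modifying the cell itself. */
static unsigned draw_mode(const Glyph *cell, int is_selected)
{
    Glyph copy = *cell;              /* struct assignment copies the whole Glyph */
    if (is_selected)
        copy.mode ^= ATTR_REVERSE;   /* only the local copy is altered */
    return copy.mode;
}
```

After calling draw_mode on a selected cell, the cell's own mode field is unchanged, which is the behavior claimed for lines 3189 and 3191.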
Re: [dev] [PATCH] Fix selecting clearing and BCE
On Tue, Apr 23, 2013, at 14:34, Roberto E. Vargas Caballero wrote: > From: "Roberto E. Vargas Caballero" > > The commit b78c5085f72 changed the st behaviour enabling BCE capability, > that means erase regions using background color. Problem comes when you > clear a region with a selection, because in this case the real mode of > the > Glyph is not the value of term.line[y][x], due in drawregion we had > enabled > the ATTR_REVERSE bit. I don't understand the issue. How is this desired behavior? It looks like your change makes it toggle the _real_ ATTR_REVERSE bit on the selected region, making the selection appear to vanish, and it'll end up in the wrong colors once the selection is actually removed.
Re: [dev] [st] colors and attributes, general question
On Tue, Apr 23, 2013, at 11:53, Christoph Lohmann wrote: > It’s the simple way of doing all the brightening and reversing. St is > keeping to what other terminals do. But since none of them keeps to any > standard colors or good behaviour is this what makes st being what it is > – a simple terminal. My point was, it's only "the simple way" when you've already got both colors calculated because the function draws both. But if drawing backgrounds is moved into a separate function, as I was planning to do, that function would be simpler if it didn't have to think about the effects that bold/italic/underline have on the foreground color. My question was whether this behavior is like this because other terminals do the same thing, or _only_ because it's simpler.
[dev] [st] colors and attributes, general question
I'm planning on reworking xdraws and drawregion to draw the background and text as separate functions. To do this I need to understand some things: As I understand it, the behavior is to have all attribute effects on color (e.g. bold brightening, italic/underline colors) affect only the foreground and not the background when in normal mode, and affect only the background and not the foreground in reverse (ATTR_REVERSE) mode. Is this understanding correct? As I understand it, the behavior for MODE_REVERSE is to use RGB color inversion, and not bg/fg swapping (so yellow-on-red becomes blue-on-cyan, not red-on-yellow), on all colors _except_ for the default bg/fg colors. Is this understanding correct? Is the above outlined behavior actually correct by the standards / desirable / does it match the behavior of other terminals? config.def.h says "Another logic would only make the simple feature too complex." but I find that, in making this change, supporting the current behavior (my understanding as outlined above) is actually more complex, because it requires me to duplicate the attribute color mapping in both functions. There is also, as far as I can tell, no support for brightening the background for ATTR_BLINK. -- Random832
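The difference between the two reverse-video interpretations mentioned above can be checked with a small sketch. The Rgb and Pair types and the function names are made up for illustration and do not come from st, and the special case for the default bg/fg colors is deliberately left out.

```c
/* Hypothetical color types, 8 bits per channel. */
typedef struct { unsigned char r, g, b; } Rgb;
typedef struct { Rgb fg, bg; } Pair;

/* Per-channel inversion of a single color. */
static Rgb invert(Rgb c)
{
    Rgb out;
    out.r = (unsigned char)(255 - c.r);
    out.g = (unsigned char)(255 - c.g);
    out.b = (unsigned char)(255 - c.b);
    return out;
}

/* Reverse video by RGB inversion: both colors are inverted in place. */
static Pair reverse_by_inversion(Pair p)
{
    Pair out;
    out.fg = invert(p.fg);
    out.bg = invert(p.bg);
    return out;
}

/* Reverse video by swapping foreground and background. */
static Pair reverse_by_swap(Pair p)
{
    Pair out;
    out.fg = p.bg;
    out.bg = p.fg;
    return out;
}
```

Feeding yellow (255,255,0) on red (255,0,0) through reverse_by_inversion gives blue (0,0,255) on cyan (0,255,255), while reverse_by_swap gives red on yellow, matching the example in the question.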
Re: [dev] [st] double-width usage
On Tue, Apr 23, 2013, at 11:01, Silvan Jegen wrote: > I saw, compiled and tested it but when using mutt only half of the > (Japanese Kanji) characters would be drawn (so presumably only one > character cell of a two-cell double character). If I wasn't at a > conference I would deliver some screenshots but as things stand, I can > only get back to you after I am back home in a few days. Don't worry about screenshots, I know what it looks like. That is the graphical glitch I was referring to (along with being left-aligned). I'm considering possible ways to fix it - but all I can think of is to rewrite the entire drawregion function to draw all backgrounds first and then all characters.
Re: [dev] [st] [PATCH] 8bit-meta like xterm
On Tue, Apr 23, 2013, at 10:30, Thorsten Glaser wrote: > random...@fastmail.us dixit: > > >Wait a minute... what exactly do you _expect_ meta to do? Using (for > >example) meta-a to type 0xE1 "a with acute" is _not_, in fact, the > >expected or intended behavior; it is a bug. And I don't think it will > > No, it is the intended behaviour. > http://fsinfo.noone.org/~abe/typing-8bit.html The fact that someone discovered it, _thought_ it was intended, and showed other people how to do it does not mean that it actually was intended. > >even work with UTF-8 applications, and st is an exclusively UTF-8 > >terminal. > > XTerm handles that transparently: when in UTF-8 mode, Meta-d > is still CHR$(ASC("d")+128) = "ä", just U+00E4 instead of a > raw '\xE4' octet. If this were an intended feature why would it elevate latin-1 over other unicode characters? This only proves my point. > This is *extremely* useful – especially as it leads people > away from national keyboard layouts towards QWERTY while > retaining the ability to write business eMails, which require > correct spelling. And what the heck is wrong with national keyboard layouts that it's "useful" to "lead people away from" them?
Re: [dev] [st] [PATCH] 8bit-meta like xterm
On Tue, Apr 23, 2013, at 7:51, Otto Modinos wrote: It means first of all vim. I have also tried mocp, and that didn't work either. What apps are you using that work? Huh? Vim doesn't have any keybindings that use meta. Wait a minute... what exactly do you _expect_ meta to do? Using (for example) meta-a to type 0xE1 "a with acute" is _not_, in fact, the expected or intended behavior; it is a bug. And I don't think it will even work with UTF-8 applications, and st is an exclusively UTF-8 terminal. What I expect meta to do is for example in irssi meta-a goes to the next window with activity, meta-1 goes to window 1, etc. -- Random832
Re: [dev] [st] [PATCH] 8bit-meta like xterm
On 04/23/2013 04:50 AM, Christoph Lohmann wrote: I am considering making this the default behaviour of st. Are there any arguments against it? I'm actually confused by what he means by "most of the apps I tried didn't recognize the escape sequence", because every app I've ever used recognizes it (which is simply to prefix a character with \033 to represent meta), no app I've ever used recognizes the 8-bit behavior, and an app expecting the 8th bit (which I've never seen) would not handle non-ascii text input properly.
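For reference, the two conventions under discussion can be written down as a short sketch. The function names are invented for illustration and are not part of st or xterm. One prefixes the key with ESC (\033), the other sets the 8th bit of the key.

```c
#include <stddef.h>

/* ESC-prefix convention: Meta+key is sent as ESC (0x1b) followed by the key.
 * Writes two bytes to out and returns the number of bytes written. */
static size_t meta_escape(unsigned char key, unsigned char out[2])
{
    out[0] = 0x1b;
    out[1] = key;
    return 2;
}

/* 8-bit convention: Meta+key is sent as the key with bit 7 set.
 * Writes one byte to out and returns the number of bytes written. */
static size_t meta_8bit(unsigned char key, unsigned char out[1])
{
    out[0] = (unsigned char)(key | 0x80);
    return 1;
}
```

For the key 'a' (0x61) the 8-bit form is the single byte 0xE1, which a Latin-1 application reads as "a with acute" and which is not a valid UTF-8 sequence on its own. The ESC form stays pure ASCII.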
Re: [dev] [st] double-width usage
On 04/23/2013 03:07 AM, Christoph Lohmann wrote: Hello comrades, Here’s some RFC for people using double‐width characters in terminals in their daily life. Which applications do you use that handle double-width as you expect them? Do these applications use the double-width for the layout? Any Chinese or Japanese user? If double-width characters would be drawn to fit the standard cell size of the terminal (drawing them in half the font size) would this suffice your need? This question implies that it is possible to simply increase the average fontsize so the complex glyphs look good. Would this suffice your need? Naming the applications would be important so I can test st to their compatibility. Sincerely, Christoph Lohmann Did you see the st.c I posted a few days ago? The logic for double width is mostly complete in it - I just have to fix a few graphical glitches, and there are a couple of corner cases (mainly regarding erasing a double width character by overwriting with a single width when background colors are involved) that different terminals don't handle the same way, that it's not clear which terminal we should follow or if it's even important to emulate one particular behavior.
Re: [dev] [st] Need help implementing combining characters
On 04/21/2013 12:45 PM, Carlos Torres wrote: Maybe send out what you have and others can better grok what you intend, and see how it may fit? I could do that, but I haven't actually modified the drawing code substantially yet, except to include the combining characters themselves in the buffer that drawregion passes to xdraws. The general design I am using is to expand the 'c' array within the Glyph struct, and store multiple UTF-8 characters (zero-padded) in it. I'm also having drawing issues with double-width characters that I don't know how to fix, so maybe it would be best if I just send what I have. Should I send it in the form of a patch or just my st.c?
[dev] Need help implementing combining characters
I've got the logic fully implemented (both for maintaining multiple characters in a single cell and for copying the selection) but I can't figure out how to make them draw correctly. I don't understand what the xdraws and drawregion functions are doing. The current behavior is that it draws the combining glyph in a cell of its own next to the base glyph.
Re: [dev] [st] wide characters
On Mon, Apr 15, 2013, at 15:36, Thorsten Glaser wrote: > Actually, wint_t is the standard type to use for this. One > could also use wchar_t but that may be an unsigned short on > some systems, or a signed or unsigned int. Those systems aren't using wchar_t *or* wint_t for unicode, though. The main reason for wint_t's existence is that wchar_t isn't guaranteed to be able to represent a WEOF value distinct from all valid character values. wchar_t can be used just fine for any actual character, but if the system doesn't use unicode as its wchar type, wint_t could (for example) be a signed 16-bit int to wchar_t's unsigned 8-bit. You can use #if __STDC_ISO_10646__ to test whether the implementation uses unicode for wchar_t (most modern systems do, though some may not define this constant) - if so, then wchar_t is, naturally, guaranteed to be able to represent at least the range 0 to 0x10FFFF, and wint_t that plus WEOF (usually -1). They're usually both 32-bit signed ints. MS Windows uses an unsigned short for both types due to various historical reasons.
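A minimal sketch of the checks mentioned above, using only standard C99 facilities from <wchar.h>. The helper names are made up for illustration. It reports whether the implementation defines __STDC_ISO_10646__, whether wchar_t is wide enough for the full Unicode range, and whether a wint_t value is WEOF.

```c
#include <wchar.h>

/* 1 if the implementation promises that wchar_t values are ISO 10646
 * (Unicode) code points, 0 if it makes no such promise. */
static int wchar_is_unicode(void)
{
#if defined(__STDC_ISO_10646__)
    return 1;
#else
    return 0;
#endif
}

/* 1 if wchar_t can hold every code point up to 0x10FFFF, 0 otherwise
 * (for example where wchar_t is a 16-bit type). */
static int wchar_holds_unicode_range(void)
{
    return WCHAR_MAX >= 0x10FFFF;
}

/* WEOF has type wint_t and compares unequal to every valid character. */
static int is_weof(wint_t c)
{
    return c == WEOF;
}
```

The results are platform-dependent by design, so the sketch makes no claim about what any particular system returns.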
Re: [dev] [st] wide characters
On Mon, Apr 15, 2013, at 15:16, Strake wrote: > On 15/04/2013, random...@fastmail.us wrote: > > On Mon, Apr 15, 2013, at 10:58, Martti Kühne wrote: > >> According to a quick google those chars can become as wide as 6 > >> bytes, > > > > No, they can't. I have no idea what your source on this is. > > In UTF-8 the maximum encoded character length is 6 bytes [1] What on earth does that have to do with using an int to store the code point *instead of* the raw UTF-8 bytes (which are used _now_)? Also, this is out of date; the latest version of unicode (since 2003 at the latest) limits code points to 0x10FFFF and therefore UTF-8 sequences to four bytes. Unless your manpage is much older than mine, it states this clearly and you misread it.
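The four-byte limit follows directly from the code point range. As a small sketch (utf8_len is a made-up name, not a function from st or libutf), the encoded length of a code point under the current UTF-8 definition can be computed like this:

```c
/* Number of bytes needed to encode code point cp in UTF-8, or 0 if cp
 * is not encodable: above 0x10FFFF, or a UTF-16 surrogate. */
static int utf8_len(unsigned long cp)
{
    if (cp < 0x80)
        return 1;
    if (cp < 0x800)
        return 2;
    if (cp >= 0xD800 && cp <= 0xDFFF)
        return 0;                /* surrogates are not valid in UTF-8 */
    if (cp < 0x10000)
        return 3;
    if (cp <= 0x10FFFF)
        return 4;
    return 0;                    /* beyond the Unicode range */
}
```

The five- and six-byte forms from the original UTF-8 design only covered values above 0x10FFFF, so they never occur once the range is capped there.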
Re: [dev] [st] wide characters
On Mon, Apr 15, 2013, at 10:58, Martti Kühne wrote: > On Sun, Apr 14, 2013 at 2:56 AM, Random832 wrote: > > Okay, but why not work with a unicode code point as an int? > > -1 from me. > It is utter madness to waste 32 (64 on x86_64) bits for a single > glyph. A. current usage is char[4] B. int is 32 bits on x86_64. There's no I in LP64. > According to a quick google those chars can become as wide as 6 > bytes, No, they can't. I have no idea what your source on this is. > and believe me you don't want that, as long as there are > mblen(3) / mbrlen(3)... I don't know how these functions are relevant to your argument.
Re: [dev] [st] wide characters
On 04/14/2013 02:10 AM, Christoph Lohmann wrote: Greetings. On Sun, 14 Apr 2013 08:10:22 +0200 Random832 wrote: I am forced to ask, though, why character cell values are stored in utf-8 rather than as wchar_t (or as an explicitly unicode int) in the first place, particularly since the simplest way to detect a wide character is to call the function wcwidth. What was the reason for this design decision? It doesn't save any space, since on most systems UTF_SIZ == sizeof(int) == sizeof(wchar_t). That design decision can change when I’m actually implementing the double‐width and double‐height support in st. The codebase is small enough to change such a type in less than 10 minutes. So no religion was introduced here. The reason for my question about using codepoints instead of UTF-8 was because I thought it might make it easier to support combining diacritics, not wide characters. The two problems are broadly related because both of them affect the number of character cells occupied by a string. And I don't know the st codebase well enough (or at all, really) to tell at a glance what would have to be changed to be able to support a double-width character cell, or to support wrapping to the next line if one is output at the second-to-last column. I hadn't yet the time to read all the double-width implementations in other terminals so st would do the »right thing« in implementing all questionable cases. Double‐width characters are like BCE a design decision applications need to adapt to. Some corner cases I haven't yet found a good answer to: * Is there any standard for this except for setting the flag in terminfo and taking up two cells in the terminal? I don't know if there's a standard. I can find nothing about character cell terminals in any UTR, and ECMA 48 is silent on the question of wide characters. I don't know what terminfo flag you are referring to. I was talking about support for east asian characters, not VT100-style stretching of ASCII characters.
I suspect the widcs/swidm/rwidm capabilities refer to the latter (though the only actual instance in the terminfo database is a swidm string on the att730). Observed behavior in various terminals that do support them is: * cursor position can be in either half of a double character, though the whole character is hilighted (all observed terminals) * outputting one at the end of the line (i.e. where a pair of two narrow characters would be split across lines) fails entirely (xterm) or wraps to the next line leaving the last cell alone (vte, tmux, mlterm, kterm). * outputting a narrow character on top of a wide character erases the entire wide character (xterm, tmux, mlterm, kterm) or erases only when in the left half (vte) * deleting (e.g. with ESC [ P) part of a character has various different behaviors: ** on xterm and kterm, deleting either half of a character replaces the remaining half with a single-width blank space. ** tmux's behavior is very buggy: a vertical line drawn across a different part of the screen _after_ deleting different parts of wide characters on different lines ended up redrawing incorrectly after refreshing. As for the wide characters themselves, deleting the left half deletes the entire character and deleting the right half has no effect, but there is some hidden state involved - a sequence of two deletions will delete a single wide character. I suspect the "right half" is filled with some placeholder value that is not output to the host terminal, and they are deleted individually. This is consistent with all of my observations. ** on mlterm, deleting the left half of a character deletes the entire character; deleting the right half replaces it with two spaces. ** on vte, deleting the right half of a character replaces the _next_ character with a space. Deleting the left half replaces the present character with a space, but seems to leave some hidden state, since the cursor on this "space" is still double width. 
* the xterm/kterm behavior seems the most rational, since it yields no visual glitches, always keeps the cursor in the same logical position, and a deletion always shifts characters right of it by the same amount. I haven't made any detailed investigation into the actual set of characters that are considered wide (or combining) by each terminal and by various applications, (except tmux, which has a list of ranges in utf8.c). I also haven't investigated whether any of them have locale-dependent treatment of "ambiguous" characters (e.g. greek or cyrillic) which are wide in historical east asian fonts (except tmux, which does not) mlterm does have an option that makes it work differently; the above results are with -Z enabled. * If st has double-width default. * What happens if the application does naive character
Re: [dev] [st] wide characters
On 04/13/2013 07:07 PM, Aurélien Aptel wrote: The ISO/IEC 10646:2003 Unicode standard 4.0 says that: "The width of wchar_t is compiler-specific and can be as small as 8 bits. Consequently, programs that need to be portable across any C or C++ compiler should not use wchar_t for storing Unicode text. The wchar_t type is intended for storing compiler-defined wide characters, which may be Unicode characters in some compilers." utf-8 is rather straightforward to handle and process. Okay, but why not work with a unicode code point as an int?
[dev] [st] wide characters
I don't mean as in wchar_t, I mean as in characters (generally in East Asian languages) that are meant to take up two character cells. I am forced to ask, though, why character cell values are stored in utf-8 rather than as wchar_t (or as an explicitly unicode int) in the first place, particularly since the simplest way to detect a wide character is to call the function wcwidth. What was the reason for this design decision? It doesn't save any space, since on most systems UTF_SIZ == sizeof(int) == sizeof(wchar_t). And I don't know the st codebase well enough (or at all, really) to tell at a glance what would have to be changed to be able to support a double-width character cell, or to support wrapping to the next line if one is output at the second-to-last column.
Re: [dev] [st] windows port?
On Thu, Apr 11, 2013, at 10:59, Max DeLiso wrote: My aim is to create a minimalist terminal emulator for windows. I want a project whose relationship to the cmd/conhost/csrss triad is analogous to the relationship between st and xterm/x. I'm going to try and lift out of st all of the platform agnostic bits which I am able to, and generally use it as a reference for terminal emulation routines. If it doesn't work _with_ the "cmd/conhost/csrss triad", what programs are going to run in it? Cygwin, I suppose. The problem, in general, with unix-ish terminal emulators on windows is they don't work with applications designed to run in the console.
[dev] [sbase] cp and security
I've written most of cp, but one issue keeps bugging me. I can't figure out how to get rid of race conditions within the constraints that sbase is implemented in (POSIX 2001, no XSI extensions). If we were using POSIX 2008 or XSI extensions, I could use the *at() functions (openat() and friends), or at least fchdir(), to reliably solve this problem. As it is, I'm left with two choices: (1) emulate fchdir with a "magic cookie" struct containing an absolute path, device, and inode number [stat(".") every time and panic if device and inode number don't match the cookie], or (2) do nothing. Any thoughts?