mksh on EBCDIC, testing

2017-04-27 Thread Thorsten Glaser
Hi,

I’m starting a new thread so I can archive the old one.

I’ve merged most of your patch; I only did some things differently.
I also added your copyright to misc.c, which had the most logic
from you; feel free to send patches to change your copyright info
on files you feel you now have co-authorship of.

I’ve disabled the Vi editing mode for EBCDIC for now; it has its
own lookup table (although we can probably handle that with the
rtt2asc() macro as well, now that I think about it).

I’ve not merged the huuuge “intro to EBCDIC in the comments” part,
as some of it is no longer entirely true after our discussion.
Please *do* update it and send me an appropriate patch when you
have time.

Can you please test mksh CVS HEAD? (CVS and github mirror should
update within the next couple of minutes.)

$ sh Build.sh -r -E && ./test.sh

The new -E option is necessary to enable EBCDIC; I chose not to
rely on autodetection, as we will need this in the testsuite, too.
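
For illustration only, here is a minimal sketch of how an explicit
switch could be cross-checked against the compiler's execution
character set. It relies on the fact that 'A' is 0xC1 in EBCDIC
codepages and 0x41 in ASCII. The function names are hypothetical and
are not taken from mksh or Build.sh.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical helper, not part of mksh: report whether this
 * translation unit was compiled with an EBCDIC execution charset.
 * 'A' is 0xC1 in EBCDIC codepages and 0x41 in ASCII; the cast
 * avoids surprises on platforms where plain char is signed.
 */
bool
charset_is_ebcdic(void)
{
	return ((unsigned char)'A' == 0xC1);
}

/*
 * Hypothetical sanity check: abort if the explicit request
 * (for example, from a build option) disagrees with what the
 * compiler actually produced.
 */
void
check_ebcdic_request(bool ebcdic_requested)
{
	assert(ebcdic_requested == charset_is_ebcdic());
}
```

The explicit option stays the source of truth in this sketch; the
detection only guards against a mismatch.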

Thanks,
//mirabilos
-- 
[...] if maybe ext3fs wasn't a better pick, or jfs, or maybe reiserfs, oh but
what about xfs, and if only i had waited until reiser4 was ready... in the be-
ginning, there was ffs, and in the middle, there was ffs, and at the end, there
was still ffs, and the sys admins knew it was good. :)  -- Ted Unangst über *fs


[Bug 625164] Re: some extended Korn shell globs are really slow

2017-04-27 Thread Thorsten Glaser
mksh’s globbing really sucks. Way out: parse the patterns as a special
kind of regex (an NFA, possibly also making $KSH_MATCH an array).

References:

- https://research.swtch.com/glob

- https://swtch.com/~rsc/regexp/regexp1.html
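
As a rough illustration of the non-exponential idea, here is a minimal
sketch of a matcher that restarts only from the most recent '*', along
the lines of what the first reference describes. It handles only '?'
and '*', not bracket expressions or the extended @(...) forms from the
bug, and it is not mksh code; the name glob_match is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Match name against pattern, where '?' matches any one character
 * and '*' matches any run of characters (possibly empty).
 * Only the most recent '*' is remembered as a restart point, so the
 * worst case is O(strlen(pattern) * strlen(name)), not exponential.
 */
bool
glob_match(const char *pattern, const char *name)
{
	size_t px = 0, nx = 0;			/* current positions */
	size_t next_px = 0, next_nx = 0;	/* restart positions */
	bool have_restart = false;

	assert(pattern != NULL && name != NULL);

	while (pattern[px] != '\0' || name[nx] != '\0') {
		if (pattern[px] == '*') {
			/*
			 * Try matching zero characters first; on a later
			 * mismatch, retry with the star eating one more.
			 */
			next_px = px;
			next_nx = nx + 1;
			have_restart = true;
			px++;
			continue;
		}
		if (pattern[px] != '\0' && name[nx] != '\0' &&
		    (pattern[px] == '?' || pattern[px] == name[nx])) {
			px++;
			nx++;
			continue;
		}
		/* mismatch: restart if the star can still grow */
		if (have_restart && name[next_nx - 1] != '\0') {
			px = next_px;
			nx = next_nx;
			continue;
		}
		return (false);
	}
	return (true);
}
```

With this approach, a pattern such as a*a*a*a*b against a long run of
"a" characters fails after a bounded number of steps instead of trying
every way to split the input among the stars.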

https://bugs.launchpad.net/bugs/625164

Title:
  some extended Korn shell globs are really slow

Status in mksh:
  In Progress

Bug description:
  Attaching a testcase.

  The first glob is decently fast; the second one is noticeably slow (about a
  second) on a 3 GHz Athlon.
  The third glob needs to be killed with SIGKILL, of all things.

  Also, I have another shell script, where replacing
  [[ $foo = *@(x)* ]] with [[ $foo = *'x'* ]] and
  [[ $foo = @(1|2………),* ]] with [[ $foo = 1,* || $foo = 2………,* ]]
  made it noticeably faster.

  The probable culprits are gmatchx, do_gmatch, and friends, mostly from
  misc.c.

  This bug serves as documentation for now, because I have no idea how
  to tackle it, but if someone takes it up and submits patches, be my
  guest.



Re: [PATCH] IBM z/OS + EBCDIC support

2017-04-27 Thread Thorsten Glaser
Daniel Richard G. dixit:

>Good news!
>
>I played with this some more, and found what was missing: a call to
>setlocale().

Oh.

>/* very much NOT Latin-1 compatible */
>setlocale(LC_ALL, "Ru_RU.IBM-1025");
[...]
>...the input should have been converted to ISO 8859-5.
>
>So it seems like maybe the IBM docs are a bit flexible in what they mean
>when they say "ISO 8859-1" :-]

Yeah, their definition of "ASCII codepage" is also a bit... off.

>Do you still want the other tables?

No, thanks, this is information enough.

>thing to note is that not just any EBCDIC codepage can be used in a
>POSIX environment, because if you can't encode e.g. square brackets,
>then basic things like shell scripts will break.

Hmm. I could check for required characters in the output,
or just leave this to the user. (We likely only support
the codepage the shell was compiled for anyway, due to
all those embedded strings.)
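
A check for required characters could look roughly like the sketch
below. It assumes a hypothetical 256-entry table that maps each byte of
the codepage to its equivalent in a reference character set (0 meaning
"no equivalent"), and a list of required characters expressed in that
same reference set. Neither the table layout nor the function name
comes from mksh.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: to_ref[i] holds the reference-charset
 * equivalent of codepage byte i, or 0 if there is none.
 * Returns the first character of "required" that no codepage
 * byte maps to, or '\0' if all of them can be encoded.
 */
char
first_missing_char(const unsigned char to_ref[256], const char *required)
{
	size_t i;

	assert(to_ref != NULL && required != NULL);

	for (; *required != '\0'; required++) {
		for (i = 0; i < 256; i++)
			if (to_ref[i] == (unsigned char)*required)
				break;
		if (i == 256)
			return (*required);	/* not encodable */
	}
	return ('\0');
}
```

A caller could pass a string such as "[]{}|" and refuse to continue, or
merely warn, if the result is not '\0'.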

>> When does it error out, too?
>
>It's in the doc. Both failure modes (non-SBCS locale, out-of-memory
>condition) should be extremely rare, to the point that they don't really
>need to be handled gracefully.

OK, thanks.

bye,
//mirabilos
-- 
 Beware of ritual lest you forget the meaning behind it.
 yeah but it means if you really care about something, don't
ritualise it, or you will lose it. don't fetishise it, don't
obsess. or you'll forget why you love it in the first place.