Re: Sortlessness?

2013-09-04 Thread John Gilmore
Googling turns up the fact that the fundamental paper by René
Haentjens, "Ordering universal character strings", is available at

http://www.hpl.hp.com/hpjournal/dtj/vol5num3/vol5num3art4.txt

Tony H was right to mention the work of the National Language
Technical Center at IBM's Toronto Laboratory.  (It is|was actually in
North York, Ontario, a Toronto suburb.)

I cherish my copies of its multivolume National Language Design Guide,
and anyone who can find copies of them on the net should download
them.

Why the NLTC was killed off, notionally by IBM Canada, is unlikely
ever to be fully understood.  No one outside IBM is in any position to
speculate about such things, and those inside it all have their own
organizational political imperatives to defend.  What is clear in the
record is that it was a centre of excellence.  Nowhere else, for
example, have I seen cogent treatments of the problems of handling
Cyrillic text embedded in Roman text, Roman text embedded in Arabic
text, and the like.


John Gilmore, Ashland, MA 01721 - USA


Re: Sortlessness?

2013-09-04 Thread Paul Gilmartin
(I'm trying to move this to IBM-MAIN; it really doesn't
belong on ASSEMBLER-LIST.  And not trimming quoted material
as much as I usually would.)

On 2013-09-04 10:29, Tony Harminc wrote:
> On 1 September 2013 00:51, Paul Gilmartin wrote:
>> On 2013-08-31, at 08:55, John Gilmore wrote:
>>>
>>> ...  They use data transformations to make it possible
>>> for two keys to be compared using a single CLC[L].  (DB2 does similar
>>> things too.)
>>>
>> This can be particularly complex for literary collating conventions
>> such as EN_US which DFSORT gets terribly wrong.  I tried a PMR on
>> this a few years ago.  When I reported that DFSORT and a C program
>> using strcoll() produce similar incorrect results, DFSORT and I
>> agreed that the problem should belong to LE.
>>
>> LE gave me WAD with a rationale so outrageous that I gave up in
>> disgust, making no effort to escalate.
>
> Isn't it a POSIX violation to produce incorrect collation results for
> a locale? Not, I suppose, that that's stopped them before.
>
> It's a shame because IBM was in the forefront of getting this
> collation stuff right, and into the POSIX standards. See the early
> Redbook GG24-3516 Keys to Sort and Search for Culturally Expected
> Results, and much subsequent work from IBM's long gone National
> Language Technical Center.
>
Thanks for the reference.  I'll look for it on publibz.  Or might I
find it on InfoCenter?

The first point of frustration is the inconsistency in the *names*
of the locales.  They're case-sensitive on most platforms; case-
insensitive (I think) on z/OS.  I needed to supply the following
preamble to make my test case portable:

static char
#if defined( __APPLE__ )
*US = "en_US.UTF-8",
*CA = "en_CA.UTF-8",
#elif defined( __linux__ )
*US = "en_US.utf8",
*CA = "en_CA.utf8",
#elif defined( __MVS__ )
#if ( '0' == 0xf0 )
*US = "En_US.IBM-1047",  /* EBCDIC */
*CA = "En_CA.IBM-1047",
#else
*US = "En_US.UTF-8.xplink",  /* ASCII  */
*CA = "En_GB.UTF-8.xplink",
#endif
#elif defined( __sun )
*US = "en_US.ISO8859-1",
*CA = "en_CA.ISO8859-1",
#else
*US = "en_US.utf8",
*CA = "en_CA.utf8",
#endif
*C = "C";
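
For anyone who wants to try this, here is a minimal sketch of how locale names like those above are typically used: select the collating locale with setlocale() and sort with qsort() and strcoll(). This is not the test case mentioned in the thread (that code is not shown); the names sort_in_locale and compare_collated are made up for illustration, and the result for any locale other than "C" depends on what the platform has installed.

```c
#include <assert.h>
#include <locale.h>
#include <stdlib.h>
#include <string.h>

/* qsort() comparator: compare two char* elements using the
   collating rules of the current LC_COLLATE locale. */
static int compare_collated(const void *a, const void *b)
{
    const char *const *sa = a;
    const char *const *sb = b;
    return strcoll(*sa, *sb);
}

/* Hypothetical helper: sort n strings under the named locale.
   Returns 0 on success, -1 if the platform does not know the locale.
   Note that this changes the process-wide LC_COLLATE setting. */
int sort_in_locale(const char *locale_name, const char **strings, size_t n)
{
    assert(locale_name != NULL && strings != NULL);
    if (setlocale(LC_COLLATE, locale_name) == NULL)
        return -1;
    qsort(strings, n, sizeof *strings, compare_collated);
    return 0;
}
```

Called as sort_in_locale(US, words, count) with the preamble above, a return of -1 means the locale name was rejected, which is exactly the portability problem the preamble works around.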


gil


Re: Sortlessness?

2013-09-04 Thread John Gilmore
Peter,

The URL you cite is worthy, but it is a small proper subset of the
NLTC materials.

I found myself disagreeing with some of it as I read it, but that is
no bad thing.

John Gilmore, Ashland, MA 01721 - USA


Re: Sortlessness?

2013-09-04 Thread Farley, Peter x23353
John,

Does this set of IBM "globalization guidelines" web pages match any part(s) of 
the NLTC design guide you mentioned?

http://www-01.ibm.com/software/globalization/guidelines/outline.html

Just curious if what I found there matches what you have.

Peter

-Original Message-
From: IBM Mainframe Assembler List [mailto:ASSEMBLER-LIST@LISTSERV.UGA.EDU] On 
Behalf Of John Gilmore
Sent: Wednesday, September 04, 2013 5:07 PM
To: ASSEMBLER-LIST@LISTSERV.UGA.EDU
Subject: Re: Sortlessness?

Googling turns up the fact that the fundamental paper by René
Haentjens, "Ordering universal character strings", is available at

http://www.hpl.hp.com/hpjournal/dtj/vol5num3/vol5num3art4.txt

Tony H was right to mention the work of the National Language
Technical Center at IBM's Toronto Laboratory.  (It is|was actually in
North York, Ontario, a Toronto suburb.)

I cherish my copies of its multivolume National Language Design Guide,
and anyone who can find copies of them on the net should download
them.

Why the NLTC was killed off, notionally by IBM Canada, is unlikely
ever to be fully understood.  No one outside IBM is in any position to
speculate about such things, and those inside it all have their own
organizational political imperatives to defend.  What is clear in the
record is that it was a centre of excellence.  Nowhere else, for
example, have I seen cogent treatments of the problems of handling
Cyrillic text embedded in Roman text, Roman text embedded in Arabic
text, and the like.



Re: Sortlessness?

2013-09-04 Thread Tony Harminc
On 1 September 2013 00:51, Paul Gilmartin wrote:
> On 2013-08-31, at 08:55, John Gilmore wrote:
>>
>> ...  They use data transformations to make it possible
>> for two keys to be compared using a single CLC[L].  (DB2 does similar
>> things too.)
>>
> This can be particularly complex for literary collating conventions
> such as EN_US which DFSORT gets terribly wrong.  I tried a PMR on
> this a few years ago.  When I reported that DFSORT and a C program
> using strcoll() produce similar incorrect results, DFSORT and I
> agreed that the problem should belong to LE.
>
> LE gave me WAD with a rationale so outrageous that I gave up in
> disgust, making no effort to escalate.

Isn't it a POSIX violation to produce incorrect collation results for
a locale? Not, I suppose, that that's stopped them before.

It's a shame because IBM was in the forefront of getting this
collation stuff right, and into the POSIX standards. See the early
Redbook GG24-3516 Keys to Sort and Search for Culturally Expected
Results, and much subsequent work from IBM's long gone National
Language Technical Center.

Tony H.


Re: Sortlessness?

2013-09-01 Thread Martin Truebner
Paul,

>> ... so outrageous that I gave up in disgust

Reflects my feeling in some of the arguments I had lately about PMRs
(which had to be opened to begin with, and the reasons they were
closed later).

--
Martin

Pi_cap_CPU - all you ever need around MWLC/SCRT/CMT in z/VSE
more at http://www.picapcpu.de


Re: Sortlessness?

2013-09-01 Thread Paul Gilmartin
On 2013-08-31, at 08:55, John Gilmore wrote:
>
> ...  They use data transformations to make it possible
> for two keys to be compared using a single CLC[L].  (DB2 does similar
> things too.)
>
This can be particularly complex for literary collating conventions
such as EN_US which DFSORT gets terribly wrong.  I tried a PMR on
this a few years ago.  When I reported that DFSORT and a C program
using strcoll() produce similar incorrect results, DFSORT and I
agreed that the problem should belong to LE.

LE gave me WAD with a rationale so outrageous that I gave up in
disgust, making no effort to escalate.

"Never argue with an idiot. They will only bring you down to their
level and beat you with experience" -- diverse attributions, from
George Carlin to Mark Twain to far older.

If anyone else cares to take up the banner, I'll gladly donate my test
cases, both DFSORT and C; a mere few dozen lines each.
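
The "transform the key so a plain byte compare works" idea quoted above has a standard C counterpart in strxfrm(), which is meant to produce keys whose strcmp() order matches strcoll() order in the current locale. The sketch below is only an illustration of that pairing, not the test case offered here; make_collation_key and sign_of are hypothetical names, and whether a given platform's locale tables give culturally correct results is a separate question.

```c
#include <assert.h>
#include <locale.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: build a collation key for text under the current
   LC_COLLATE locale.  The caller frees the result; NULL means malloc failed. */
char *make_collation_key(const char *text)
{
    assert(text != NULL);
    /* With a length of 0, strxfrm() only reports the space required. */
    size_t needed = strxfrm(NULL, text, 0) + 1;
    char *key = malloc(needed);
    if (key != NULL)
        strxfrm(key, text, needed);
    return key;
}

/* Reduce a comparison result to -1, 0 or +1 so two results can be compared. */
int sign_of(int value)
{
    return (value > 0) - (value < 0);
}
```

For any two strings a and b, sign_of(strcmp(key_a, key_b)) should equal sign_of(strcoll(a, b)); a mismatch would point at the library rather than at the sort product.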

-- gil


Re: Sortlessness?

2013-08-31 Thread Andreas F. Geissbuehler

zMan ...

/me notes that his quip about using a card sorter has, as usual, caused
rampant pedanticism and topic drift, wonders why he bothers...


No idea how many will search the list archives after the topic was finally
beaten to death. Hopefully they will find one or more posts with complete,
accurate and comprehensible information. That's why I bothered posting on
this trivia subject.

Andreas Geissbuehler


Re: Sortlessness?

2013-08-31 Thread Andreas F. Geissbuehler

On 2013-08-30 16:04, zMan wrote:

 If you need one, there's always
http://www.ebay.com/itm/IBM-Model-83-Punch-Card-Sorter-/300954726197?pt=US_Vintage_Computers_Mainframes&hash=item46124caf35


At 16:28 -0600 on 08/30/2013, Paul Gilmartin wrote about Re:
Sortlessness?:
And it operates in time linear with respect to the size of
the input data set, implying that for a sufficiently large
input data set it will outperform most competing technologies.

-- gil


Robert A. Rosenberg added:
I think that should be "it operates in time linear with respect to the
size of the input data set TIMES THE LENGTH OF THE SORT
FIELD". IOW:
The actual time (ignoring the time it takes to collect the 12 stacks of
cards and putting them back into the feed tray for sorting on the
next column) is the same as a single column/pass sort of a deck whose
size is X times as large (where X is the number of columns you are
sorting on).


FWIW...alpha-numeric columns need 2 passes through the sorter !!

I think it should be "it operates in time linear with respect to the
NUMBER OF CARDS TIMES the SUM of the sort field columns
PLUS the number of alpha-numeric columns in the sort field. The
latter includes the number of numeric field columns with +/- sign."

e.g. request for some report sorted on:
cc.71-72 Prov/State and cc.11-16 Date MMDDYY
requires *10* passes through the sorter in this order:
sort N on cc. 14, 13, 12, 11, 16, 15, 72
sort Z on cc. 72
sort N on cc. 71
sort Z on cc. 71
Say 20'000 cards / 1000 cpm = 20 min / pass, 3:20 hrs total
excluding card jams, human errors, ...
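
The arithmetic above can be written down as a small sketch. The function names are made up for illustration, and the model is only the one stated in this post: one pass per sort column plus one extra pass per alphanumeric column, with handling time between passes ignored.

```c
#include <assert.h>

/* Passes needed: every sort column gets one numeric pass, and each
   alphanumeric column needs a second (zone) pass on top of that. */
int sorter_passes(int total_columns, int alphanumeric_columns)
{
    assert(alphanumeric_columns >= 0);
    assert(total_columns >= alphanumeric_columns);
    return total_columns + alphanumeric_columns;
}

/* Whole minutes of machine time, ignoring jams and restacking. */
long sorter_minutes(long cards, int passes, long cards_per_minute)
{
    assert(cards >= 0 && passes >= 0 && cards_per_minute > 0);
    return cards * passes / cards_per_minute;
}
```

For the report above, sorter_passes(8, 2) gives the 10 passes, and sorter_minutes(20000, 10, 1000) gives 200 minutes, i.e. the 3:20 hrs quoted.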

Andreas Geissbuehler


Re: Sortlessness?

2013-08-31 Thread zMan
/me notes that his quip about using a card sorter has, as usual, caused
rampant pedanticism and topic drift, wonders why he bothers...


On Sat, Aug 31, 2013 at 10:55 AM, John Gilmore wrote:

> There is a qualitative difference between modern sorting technology
> and the very simple 'logical' or lexicographic sorting operations
> performed by a card sorter, one that Chris Blaicher properly emphasized
> in an earlier post.  They use data transformations to make it possible
> for two keys to be compared using a single CLC[L].  (DB2 does similar
> things too.)
>
> A single example will suffice here.  The four signed binary-integer
> storage formats and the nine floating-point storage formats all use
> the twos-complement sign-representation, 0b for non-negative or 1b for
> negative.  The single-byte signed representation of -128 is thus
> 10000000b and that of +127 is 01111111b.  Lexicographically, the 2C
> representation of -128 is greater than the 2C representation of +127.
> This inconvenience can be dealt with in a constructed key that
> concatenates 'mixed' data-type sort fields in at least two ways, e.g.,
> by complementing the high-order, leftmost bits of such quantities.
>
> Operations of this kind are well beyond the scope of card sorters.
> Sorts have become black boxes.  Few of their users know or care much
> about what goes on inside them, and this is a pity because they embody
> a lot of not at all obvious technology that is of considerable
> interest.  Much of it is or, better, would be useful elsewhere too.
>
> John Gilmore, Ashland, MA 01721 - USA
>



--
zMan -- "I've got a mainframe and I'm not afraid to use it"


Re: Sortlessness?

2013-08-31 Thread John Gilmore
There is a qualitative difference between modern sorting technology
and the very simple 'logical' or lexicographic sorting operations
performed by a card sorter, one that Chris Blaicher properly emphasized
in an earlier post.  They use data transformations to make it possible
for two keys to be compared using a single CLC[L].  (DB2 does similar
things too.)

A single example will suffice here.  The four signed binary-integer
storage formats and the nine floating-point storage formats all use
the twos-complement sign-representation, 0b for non-negative or 1b for
negative.  The single-byte signed representation of -128 is thus
10000000b and that of +127 is 01111111b.  Lexicographically, the 2C
representation of -128 is greater than the 2C representation of +127.
This inconvenience can be dealt with in a constructed key that
concatenates 'mixed' data-type sort fields in at least two ways, e.g.,
by complementing the high-order, leftmost bits of such quantities.
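
A minimal sketch of the sign-bit trick, in C rather than assembler and for binary integers only: complement the high-order bit and store the result most significant byte first, so that an unsigned byte-by-byte compare (memcmp() here, standing in for CLC) orders the keys numerically. The function names are illustrative and are not taken from any sort product.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* One-byte case from the text: -128 (10000000b) becomes 00000000b and
   +127 (01111111b) becomes 11111111b once the sign bit is complemented. */
unsigned char int8_sort_key(int8_t value)
{
    return (unsigned char)((uint8_t)value ^ 0x80u);
}

/* Four-byte case: flip the sign bit, then store big-endian so that the
   leftmost byte is the most significant one in the comparison. */
void int32_to_sort_key(int32_t value, unsigned char key[4])
{
    uint32_t biased = (uint32_t)value ^ UINT32_C(0x80000000);
    key[0] = (unsigned char)(biased >> 24);
    key[1] = (unsigned char)(biased >> 16);
    key[2] = (unsigned char)(biased >> 8);
    key[3] = (unsigned char)biased;
}

/* Compare two integers purely through their transformed keys. */
int compare_int32_keys(int32_t a, int32_t b)
{
    unsigned char key_a[4], key_b[4];
    int32_to_sort_key(a, key_a);
    int32_to_sort_key(b, key_b);
    return memcmp(key_a, key_b, sizeof key_a);
}
```

Because each transformed field compares correctly as raw bytes, several such fields can be concatenated into one key and compared in a single operation.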

Operations of this kind are well beyond the scope of card sorters.
Sorts have become black boxes.  Few of their users know or care much
about what goes on inside them, and this is a pity because they embody
a lot of not at all obvious technology that is of considerable
interest.  Much of it is or, better, would be useful elsewhere too.

John Gilmore, Ashland, MA 01721 - USA


Re: Sortlessness?

2013-08-31 Thread Gerhard Postpischil

On 8/31/2013 9:42 AM, Blaicher, Christopher Y. wrote:

Consider that a fast card sorter could process about 2,000 cards a
minute, or even if it could process 20,000 cards a minute, that works
out to about 2,666 bytes a second for the 2,000 card case or 26,666
bytes a second for the fictional 20,000 card case.


It's worse than that, since the sorter requires either one (numeric
only) or two (alphanumeric) passes per sort column. Luckily my only use
was confined to short numeric fields.

Gerhard Postpischil
Bradford, Vermont


Re: Sortlessness?

2013-08-31 Thread Blaicher, Christopher Y.
I find the comment "for a sufficiently large input data set it will outperform 
most competing technologies" most interesting, and dated.

It beat competing technologies of the day, but not of today.

Consider that a fast card sorter could process about 2,000 cards a minute, or 
even if it could process 20,000 cards a minute, that works out to about 2,666 
bytes a second for the 2,000 card case or 26,666 bytes a second for the 
fictional 20,000 card case.

Syncsort MFX typically will process over 100,000,000 bytes per second, 
depending on the input and output devices as they tend to be the limiting 
factors.  CPU time may not be linear based on a number of factors, but I can 
buy more processors, I cannot buy more wall clock time.
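
The throughput figures above can be reproduced with a few lines of C. This is only a sketch of the arithmetic; it assumes all 80 columns of each card count as data, and the function names are made up for illustration.

```c
#include <assert.h>

enum { BYTES_PER_CARD = 80 };   /* assumption: full 80-column cards */

/* Bytes per second moved by one sorter pass (fraction truncated). */
long long sorter_bytes_per_second(long long cards_per_minute)
{
    assert(cards_per_minute > 0);
    return cards_per_minute * BYTES_PER_CARD / 60;
}

/* How many times faster a software sort running at software_bps bytes
   per second is than a single pass of the card sorter. */
long long speedup_over_sorter(long long software_bps, long long cards_per_minute)
{
    assert(software_bps >= 0 && cards_per_minute > 0);
    return software_bps * 60 / (cards_per_minute * BYTES_PER_CARD);
}
```

With these figures, 100,000,000 bytes per second works out to 37,500 times the rate of one pass at 2,000 cards a minute, before counting the extra passes a multi-column sort needs.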

The card sorter was the speed demon of its day, but it was replaced for a 
reason.

Chris Blaicher
Principal Software Engineer, Software Development
Syncsort Incorporated
50 Tice Boulevard, Woodcliff Lake, NJ 07677
P: 201-930-8260  |  M: 512-627-3803
E: cblaic...@syncsort.com


-Original Message-
From: IBM Mainframe Assembler List [mailto:ASSEMBLER-LIST@LISTSERV.UGA.EDU] On 
Behalf Of Paul Gilmartin
Sent: Friday, August 30, 2013 5:29 PM
To: MVS List Server 2
Subject: Re: Sortlessness?

On 2013-08-30 16:04, zMan wrote:
> If you need one, there's always
> http://www.ebay.com/itm/IBM-Model-83-Punch-Card-Sorter-/300954726197?p
> t=US_Vintage_Computers_Mainframes&hash=item46124caf35
>
And it operates in time linear with respect to the size of the input data set, 
implying that for a sufficiently large input data set it will outperform most 
competing technologies.

-- gil


Re: Sortlessness?

2013-08-30 Thread Robert A. Rosenberg

At 16:28 -0600 on 08/30/2013, Paul Gilmartin wrote about Re: Sortlessness?:


On 2013-08-30 16:04, zMan wrote:

 If you need one, there's always

http://www.ebay.com/itm/IBM-Model-83-Punch-Card-Sorter-/300954726197?pt=US_Vintage_Computers_Mainframes&hash=item46124caf35


And it operates in time linear with respect to the size of
the input data set, implying that for a sufficiently large
input data set it will outperform most competing technologies.

-- gil


I think that should be "it operates in time linear with respect to the size of
the input data set TIMES THE LENGTH OF THE SORT FIELD". IOW: The
actual time (ignoring the time it takes to collect the 12 stacks of
cards and putting them back into the feed tray for sorting on the
next column) is the same as a single column/pass sort of a deck whose
size is X times as large (where X is the number of columns you are
sorting on).
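
What the sorter does is a least-significant-column-first radix sort, which is why the cost is cards times columns. A minimal sketch in C follows, assuming digit-only columns (no zone passes) and records held as strings; sorter_pass and lsd_card_sort are hypothetical names chosen for this illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

enum { POCKETS = 10 };   /* digit pockets 0-9; zone punches are ignored here */

/* One pass through the sorter: distribute the deck on a single column and
   restack pocket 0 first.  Stable, like the machine.  Returns 0, or -1 if
   memory runs out. */
int sorter_pass(const char **deck, size_t n, size_t column)
{
    size_t start[POCKETS + 1] = {0};
    const char **restacked;
    size_t i;
    int p;

    if (n == 0)
        return 0;
    restacked = malloc(n * sizeof *restacked);
    if (restacked == NULL)
        return -1;

    for (i = 0; i < n; i++) {              /* count the cards for each pocket */
        int digit = deck[i][column] - '0';
        assert(digit >= 0 && digit < POCKETS);
        start[digit + 1]++;
    }
    for (p = 0; p < POCKETS; p++)          /* pocket p starts where p-1 ends */
        start[p + 1] += start[p];
    for (i = 0; i < n; i++)                /* drop each card into its pocket */
        restacked[start[deck[i][column] - '0']++] = deck[i];

    memcpy(deck, restacked, n * sizeof *restacked);
    free(restacked);
    return 0;
}

/* Sort on `width` columns starting at first_column, rightmost column first.
   Returns the number of passes made, or -1 on failure. */
int lsd_card_sort(const char **deck, size_t n, size_t first_column, size_t width)
{
    int passes = 0;
    size_t c;

    for (c = width; c-- > 0; ) {
        if (sorter_pass(deck, n, first_column + c) != 0)
            return -1;
        passes++;
    }
    return passes;
}
```

Each pass touches every card once, so the total work is n times width, matching the correction above.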


Re: Sortlessness?

2013-08-30 Thread Roger Bolan
As a programmer, I only had to put my deck in the sorter AFTER I dropped
it! :)


On Fri, Aug 30, 2013 at 2:43 PM, Capps, Joey wrote:

> Not since the old days when you dropped your card deck in a physical card
> sorter :-)
>
> -Original Message-
> From: IBM Mainframe Assembler List [mailto:ASSEMBLER-LIST@LISTSERV.UGA.EDU]
> On Behalf Of Gord Tomlin
> Sent: Friday, August 30, 2013 3:33 PM
> To: ASSEMBLER-LIST@LISTSERV.UGA.EDU
> Subject: Sortlessness?
>
> Just a little Friday afternoon curiosity...does anyone have, or know of,
> any z/OS systems that have *no* sort product (DFSORT, Syncsort, etc.)
> installed? Before anyone asks, no, I am not planning to develop one!
>
> --
>
> Regards, Gord Tomlin
> Action Software International
> (a division of Mazda Computer Corporation)
> Tel: (905) 470-7113, Fax: (905) 470-6507
>


Re: Sortlessness?

2013-08-30 Thread Paul Gilmartin
On 2013-08-30 14:32, Gord Tomlin wrote:
> Just a little Friday afternoon curiosity...does anyone have, or know of,
> any z/OS systems that have *no* sort product (DFSORT, Syncsort, etc.)
> installed? Before anyone asks, no, I am not planning to develop one!
>
Does the POSIX sort command count?  I find it very useful.

-- gil


Re: Sortlessness?

2013-08-30 Thread zMan
If you need one, there's always
http://www.ebay.com/itm/IBM-Model-83-Punch-Card-Sorter-/300954726197?pt=US_Vintage_Computers_Mainframes&hash=item46124caf35


On Fri, Aug 30, 2013 at 2:54 PM, Paul Gilmartin wrote:

> On 2013-08-30 14:32, Gord Tomlin wrote:
> > Just a little Friday afternoon curiosity...does anyone have, or know of,
> > any z/OS systems that have *no* sort product (DFSORT, Syncsort, etc.)
> > installed? Before anyone asks, no, I am not planning to develop one!
> >
> Does the POSIX sort command count?  I find it very useful.
>
> -- gil
>



--
zMan -- "I've got a mainframe and I'm not afraid to use it"


Re: Sortlessness?

2013-08-30 Thread Paul Gilmartin
On 2013-08-30 16:04, zMan wrote:
> If you need one, there's always
> http://www.ebay.com/itm/IBM-Model-83-Punch-Card-Sorter-/300954726197?pt=US_Vintage_Computers_Mainframes&hash=item46124caf35
>
And it operates in time linear with respect to the size of
the input data set, implying that for a sufficiently large
input data set it will outperform most competing technologies.

-- gil


Re: Sortlessness?

2013-08-30 Thread Capps, Joey
Not since the old days when you dropped your card deck in a physical card 
sorter :-)

-Original Message-
From: IBM Mainframe Assembler List [mailto:ASSEMBLER-LIST@LISTSERV.UGA.EDU] On 
Behalf Of Gord Tomlin
Sent: Friday, August 30, 2013 3:33 PM
To: ASSEMBLER-LIST@LISTSERV.UGA.EDU
Subject: Sortlessness?

Just a little Friday afternoon curiosity...does anyone have, or know of, any 
z/OS systems that have *no* sort product (DFSORT, Syncsort, etc.) installed? 
Before anyone asks, no, I am not planning to develop one!

--

Regards, Gord Tomlin
Action Software International
(a division of Mazda Computer Corporation)
Tel: (905) 470-7113, Fax: (905) 470-6507


Re: Sortlessness?

2013-08-30 Thread Ed Jaffe

On 8/30/2013 1:32 PM, Gord Tomlin wrote:

Just a little Friday afternoon curiosity...does anyone have, or know of,
any z/OS systems that have *no* sort product (DFSORT, Syncsort, etc.)
installed? Before anyone asks, no, I am not planning to develop one!


We ran for many years without a commercial sort product. Hard to imagine
a production customer could get away with that...

--
Edward E Jaffe
Phoenix Software International, Inc
831 Parkview Drive North
El Segundo, CA 90245
http://www.phoenixsoftware.com/


Sortlessness?

2013-08-30 Thread Gord Tomlin

Just a little Friday afternoon curiosity...does anyone have, or know of,
any z/OS systems that have *no* sort product (DFSORT, Syncsort, etc.)
installed? Before anyone asks, no, I am not planning to develop one!

--

Regards, Gord Tomlin
Action Software International
(a division of Mazda Computer Corporation)
Tel: (905) 470-7113, Fax: (905) 470-6507