At 15:53 -0600 on 12/24/2015, Joel C. Ewing wrote about Re: Is there
a source for detailed, instruction-level perfo:
As Tom has noted, the most dramatic performance enhancements typically
come from a change in the strategy or algorithm used. In my experience
you get better results by finding ways to accomplish the end result
with fewer actions rather than by micro-optimizing the individual
actions.
This story (and the others) reminds me of an incident that occurred
early in my programming life.
We had an application that read Column Binary data on a 2540 Card
Reader. The gotcha was that the card was not pure Column Binary but
half CB, with the other half being normal EBCDIC. The program would
read the card as CB but not eject it, leaving the card image in the
reader's buffer. It would then do a second read (from the buffer) as
EBCDIC with the bad-format flag on and eject the card. The result was
a 160-byte image of the card as CB and an 80-byte EBCDIC image with
the CB columns as random junk.
This worked until the Bean Counters wanted to replace the 2540 with a
2501 (since we were no longer punching output cards, we did not need
the punch capability of the 2540). Since the 2501 was an unbuffered
read-and-eject device, you got one crack at reading the card, so it
could only be read as CB (unless we did a second pass of the deck to
get the EBCDIC data). It was decided to have the program take the CB
image and convert the EBCDIC section from CB. The task of writing the
conversion routine was given to another programmer, who built a table
of all 256 2-byte bit patterns that represented the holes in the
card. His program would then search the table one column at a time (I
do not remember whether this was a binary or hash search). In any
case, the program was slow and inefficient.
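For contrast, the original table-search approach might have looked
something like the sketch below (Python rather than assembler; the
12-bit hole-pattern encoding, the function names, and the tiny
stand-in for the full 256-entry table are all hypothetical): one
search per column, 80 per card.

```python
import bisect

# Hypothetical subset of the real table: (12-bit hole pattern, character),
# with bit 11 = row 12, bit 10 = row 11, bit 9 = row 0, bit 8 = row 1, etc.
TABLE = sorted([(0x000, ' '), (0x080, '2'), (0x100, '1'),
                (0x900, 'A'), (0x500, 'J')])
KEYS = [k for k, _ in TABLE]

def lookup(pattern):
    """Binary-search the table for one column's hole pattern."""
    i = bisect.bisect_left(KEYS, pattern)
    if i < len(KEYS) and KEYS[i] == pattern:
        return TABLE[i][1]
    return '?'                      # unrecognized hole pattern

def convert_by_search(cb_image):
    # One search per 2-byte column -- the per-column cost that made
    # the original routine slow.
    return ''.join(lookup(int.from_bytes(cb_image[i:i + 2], 'big'))
                   for i in range(0, len(cb_image), 2))
```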
I was asked to look at his code and see if I could speed it up. I was
able to do so by starting from scratch, using a few TRs and an OC.
The basic idea was to use a TR to separate the top 6 rows from the
bottom 6 rows of the card image in the CB buffer, then TR each of the
two sets of rows to form a 5-bit map showing whether Row 12/11/0/8/9
was punched and a 3-bit binary number from 0 to 7 showing which row
in the 1-7 range was punched (i.e., Row 5 yielded 101). OC'ing the
top row over the bottom yielded a value showing which punches were on
the card. Running the result through one final TR converted the
Card-Image EBCDIC into the Internal Mapping.
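The TR/OC sequence maps nicely onto Python, where bytes.translate()
behaves like TR (a 256-entry byte-for-byte table lookup over a whole
buffer) and a byte-wise OR stands in for OC. The sketch below is only
an illustration under assumed formats: the exact 2540 column-binary
byte layout and the full EBCDIC hole-pattern table are not in the
post, so the punch layout, the intermediate code, and the ASCII
(rather than EBCDIC) output table here are all hypothetical.

```python
def punch(rows):
    """Assumed layout: 2 bytes per column, the first carrying rows
    12,11,0,1,2,3 in its high six bits, the second rows 4..9."""
    top = bot = 0
    for r in rows:
        if r == 12:   top |= 0x80
        elif r == 11: top |= 0x40
        elif r == 0:  top |= 0x20
        elif r <= 3:  top |= 0x10 >> (r - 1)   # rows 1-3
        else:         bot |= 0x80 >> (r - 4)   # rows 4-9
    return bytes([top, bot])

def code(rows):
    """Intermediate byte: flag bits for rows 12/11/0/8/9, plus the
    punched row number (1-7) in the low three bits."""
    c = 0
    for r in rows:
        if r == 12:   c |= 0x80
        elif r == 11: c |= 0x40
        elif r == 0:  c |= 0x20
        elif r == 8:  c |= 0x10
        elif r == 9:  c |= 0x08
        else:         c |= r                   # rows 1-7 as binary
    return c

# The two TR tables: top-half byte -> partial code, bottom-half -> partial code.
TOP = bytes(code([r for r, m in ((12, 0x80), (11, 0x40), (0, 0x20),
                                 (1, 0x10), (2, 0x08), (3, 0x04)) if b & m])
            for b in range(256))
BOT = bytes(code([r for r, m in ((4, 0x80), (5, 0x40), (6, 0x20),
                                 (7, 0x10), (8, 0x08), (9, 0x04)) if b & m])
            for b in range(256))

# Final TR table: hole pattern -> character (ASCII here for readability;
# only the standard digit and zone-punch graphics are filled in).
FINAL = bytearray(b'?' * 256)
FINAL[code([])] = ord(' ')
FINAL[code([0])] = ord('0')
for d in range(1, 10):
    FINAL[code([d])] = ord('0') + d
for i, ch in enumerate('ABCDEFGHI'):               # 12 + 1..9
    FINAL[code([12, i + 1])] = ord(ch)
for i, ch in enumerate('JKLMNOPQR'):               # 11 + 1..9
    FINAL[code([11, i + 1])] = ord(ch)
for i, ch in enumerate('STUVWXYZ'):                # 0 + 2..9
    FINAL[code([0, i + 2])] = ord(ch)
FINAL = bytes(FINAL)

def convert(cb_image):
    tops = cb_image[0::2].translate(TOP)               # first TR
    bots = cb_image[1::2].translate(BOT)               # second TR
    merged = bytes(a | b for a, b in zip(tops, bots))  # the OC step
    return merged.translate(FINAL)                     # final TR

card = b''.join(punch(r) for r in
                ([12, 9], [12, 2], [11, 4], [], [3], [6], [0]))
convert(card)   # -> b'IBM 360'
```

The point of the design is that every per-column decision is baked
into the tables once, so the per-card work is just three table passes
and an OR, with no searching at all.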
Using the same sequence with a different set of TR tables, and
replacing the final TR with a TRT that checked for more than 1 bit
on, acted as a sanity check on the EBCDIC part of the CB. My version
ran VERY fast. The major effort was creating the TR tables (and all
of the mapping info needed was already there, since it had been
worked out by the original programmer when he created his tables).
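The TRT-style check can be sketched the same way: translate the
merged hole-pattern bytes through a table that yields nonzero for any
invalid code, then scan for the first nonzero byte, which is roughly
what TRT does in a single instruction. The VALID set below is a
made-up handful of codes purely for illustration.

```python
# Hypothetical set of legal intermediate codes (e.g. blank, '1', '2',
# 'A' = 12+1, 'J' = 11+1 under the encoding sketched earlier).
VALID = {0x00, 0x01, 0x02, 0x81, 0x41}
BAD = bytes(0 if b in VALID else 1 for b in range(256))

def first_bad_column(merged):
    """TRT analog: return the index of the first column whose hole
    pattern is not valid EBCDIC, or -1 if the whole card passes."""
    return merged.translate(BAD).find(1)
```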
----------------------------------------------------------------------
For IBM-MAIN subscribe / signoff / archive access instructions,
send email to lists...@listserv.ua.edu with the message: INFO IBM-MAIN