Concatenating TERSEd data?

2008-10-17 Thread Tim Hare
We need to TERSE a fairly large (for us) amount of data. This data is in 
multiple separate datasets now, but needs to be sent as one large sequential 
dataset.  We can TERSE the concatenated sequential input of course; but out 
of curiosity I'm wondering: can you TERSE the individual components, 
concatenate the results via IEBGENER, and then UNTERSE the resulting file on 
the other end? 

From what I remember about Lempel-Ziv, the dictionary is built as you go 
along but it might mean that the second and subsequent files concatenated 
would be read with incomplete information, resulting in erroneous 
decompression results?

--
For IBM-MAIN subscribe / signoff / archive access instructions,
send email to [EMAIL PROTECTED] with the message: GET IBM-MAIN INFO
Search the archives at http://bama.ua.edu/archives/ibm-main.html



Re: Concatenating TERSEd data?

2008-10-17 Thread Jürgen Kehr

Hi,

couldn't you put the sequential datasets into one large PO dataset and 
TERSE that one? On the other end you UNTERSE it to a PO dataset again and 
unload the members with IEBGENER.


Tim Hare wrote:
We need to TERSE a fairly large (for us) amount of data. This data is in 
multiple separate datasets now, but needs to be sent as one large sequential 
dataset.  We can TERSE the concatenated sequential input of course; but out 
of curiosity I'm wondering: can you TERSE the individual components, 
concatenate the results via IEBGENER, and then UNTERSE the resulting file on 
the other end? 

From what I remember about Lempel-Ziv, the dictionary is built as you go 
along but it might mean that the second and subsequent files concatenated 
would be read with incomplete information, resulting in erroneous 
decompression results?



  


--

___



Freundliche Gruesse / Kind regards



Dipl.Math. Juergen Kehr, IT Schulung & Beratung, IT Education + Consulting

Tel.  +49-561-9528788  Fax   +49-561-9528789  Mobil +49-172-5129389

ICQ 292-318-696 (JKehr)



mailto:[EMAIL PROTECTED]

mailto:[EMAIL PROTECTED]

___




Re: Concatenating TERSEd data?

2008-10-17 Thread John Kington
Tim,


 We need to TERSE a fairly large (for us) amount of data. This data is in
 multiple separate datasets now, but needs to be sent as one large
sequential
 dataset.  We can TERSE the concatenated sequential input of course; but
out
 of curiosity I'm wondering: can you TERSE the individual components,
 concatenate the results via IEBGENER, and then UNTERSE the resulting file
on
 the other end?

You should be able to experiment with this easily. When I need to xmit
multiple datasets from one z/OS environment to another, I always create a
DSS backup, terse it, xmit it, unterse it and run a DSS restore.
Regards,
John




Re: Concatenating TERSEd data?

2008-10-17 Thread Tony Harminc
2008/10/17 Tim Hare [EMAIL PROTECTED]:
 We need to TERSE a fairly large (for us) amount of data. This data is in
 multiple separate datasets now, but needs to be sent as one large sequential
 dataset.  We can TERSE the concatenated sequential input of course; but out
 of curiosity I'm wondering: can you TERSE the individual components,
 concatenate the results via IEBGENER, and then UNTERSE the resulting file on
 the other end?

It's trivial to try, but I very much doubt it...

 From what I remember about Lempel-Ziv, the dictionary is built as you go
 along but it might mean that the second and subsequent files concatenated
 would be read with incomplete information, resulting in erroneous
 decompression results?

Terse appears to be Lempel-Ziv-Wegner (or Welch, depending on whose
expired patent you prefer W to stand for), but it is a particular
implementation of a general algorithm, and there are header and
trailer records, both undocumented. By inspection, the header is a
pretty straightforward 12-byte piece that describes both some original
dataset characteristics and some encoding method info, but the trailer
is longer and less obvious. It looks to me as though the trailer is
just informational, but I don't know if it contains enough information
to be skipped over reliably.

Regardless, the dictionary after the first compress/decompress
operation would not be the same as the initial dictionary, and you
would have no way to tell the decompressor to start with a virgin
dictionary.

Without knowing much about the encoding, you could terse and
concatenate the segments, and then at the other end run a splitter
program to scan through the compressed data looking for headers, and
invoke the deterse for each segment. Unfortunately the headers are
not uniquely identifiable, i.e. there is no eyecatcher, and a
syntactically correct header could occur within the compressed data
stream. So your splitter program would have to scan forward from the
13th byte, treating the data stream as 12-bit chunks, until you reach
a zero chunk, indicating logical EOF, then figure out how to skip over
the trailer, which doesn't appear to contain its own length, and scan
for the next header. It's always possible AMATERSE already does this.
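The scan Tony describes can be sketched in Python. Everything here is an assumption drawn from his inspection, not from any documented spec: the 12-byte header length, the 12-bit chunk size, and a zero chunk marking logical EOF. The function is a hypothetical helper, not AMATERSE behavior.

```python
def find_eof_chunk(blob: bytes, start: int = 12):
    """Scan 12-bit chunks from byte offset `start` (i.e. past an assumed
    12-byte header); return the index of the first all-zero chunk
    (assumed logical EOF marker), or None if no such chunk occurs."""
    bitpos = start * 8
    total_bits = len(blob) * 8
    idx = 0
    while bitpos + 12 <= total_bits:
        byte = bitpos // 8
        if bitpos % 8 == 0:
            # Chunk is byte-aligned: whole byte plus top nibble of the next.
            chunk = (blob[byte] << 4) | (blob[byte + 1] >> 4)
        else:
            # Chunk starts mid-byte: bottom nibble plus the whole next byte.
            chunk = ((blob[byte] & 0x0F) << 8) | blob[byte + 1]
        if chunk == 0:
            return idx
        bitpos += 12
        idx += 1
    return None

# 12 filler "header" bytes, then chunks 0x123, 0x456, 0x000 packed as 12-bit values.
blob = b"\xff" * 12 + bytes([0x12, 0x34, 0x56, 0x00, 0x00])
assert find_eof_chunk(blob) == 2   # the third chunk is the zero "EOF" chunk
```

A real splitter would then still have to skip the trailer, whose length this sketch deliberately does not guess at.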

Another approach might be to put the original multiple datasets into
members of a PDS, and terse that with AMATERSE, which understands
PDS[E]s. After the deterse, you would have an identical PDS, which
could be easily turned back into a sequential dataset. Or run a DSS
dump selecting your datasets, terse the output of that, then deterse
and DSS restore at the other end.

Tony H.




Re: Concatenating TERSEd data?

2008-10-17 Thread John McKown
I would probably do the following:

1. TERSE each dataset to be transmitted.

2. Create a PDS large enough to contain each TERSEd dataset as a separate
member.

3. Put another member in that PDS with the JCL to restore the TERSEd datasets within it.

4. XMIT that PDS (not TERSE).

Why the XMIT at the end instead of another TERSE? Because TERSE'ing the PDS
would likely just use CPU with little further compression.

Why TERSE each dataset, then make each TERSE a member of a PDS instead of
using DFDSS of each dataset, then TERSE'ing the DFDSS dump? Because not
everybody has DFDSS.

Of course step 2 assumes that you have a single volume with enough space to
contain the PDS with all the members. If this is not true, then I'd likely
download each TERSE'd dataset to my desktop. Once on my desktop, I'd use
zip without compression to combine the TERSE'd datasets into a single
zip file.
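The "zip without compression" step is straightforward on the desktop side; here is a sketch using Python's standard zipfile module with ZIP_STORED (the dataset names and contents are made up for illustration):

```python
import io
import zipfile

# Two hypothetical already-TERSEd datasets, downloaded to the desktop.
tersed = {
    "payroll.trs": b"\x02\x00 example tersed bytes",
    "billing.trs": b"\x02\x00 more tersed bytes",
}

# ZIP_STORED archives the members verbatim -- no second compression
# pass, which would burn CPU for little gain on TERSEd data.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w", compression=zipfile.ZIP_STORED) as zf:
    for name, data in tersed.items():
        zf.writestr(name, data)

# Reading back: each member comes out byte-for-byte identical, ready
# to be uploaded and unTERSEd individually on the other end.
with zipfile.ZipFile(io.BytesIO(buf.getvalue())) as zf:
    assert zf.read("payroll.trs") == tersed["payroll.trs"]
```

The same effect comes from `zip -0` with a command-line zip, or any archiver that can store members uncompressed.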

--
John




Re: Concatenating TERSEd data?

2008-10-17 Thread Paul Gilmartin
On Fri, 17 Oct 2008 14:23:44 -0500, John McKown wrote:

Why the XMIT at the end instead of another TERSE? Because TERSE'ing the PDS
would likely just use CPU with little further compression.

Yes, but XMIT expands it somewhat with control sequences.

Of course step 2 assumes that you have a single volume with enough space to
contain the PDS with all the members. If this is not true, then I'd likely
download each TERSE'd dataset to my desktop. Once on my desktop, I'd use
zip without compression to combine the TERSE'd datasets into a single
zip file.

What's the advantage of zip without compression over pax?  tar?

Dammit! why won't AMATERSE tolerate HFS files, NFS files, or
POSIX pipes allocated with JCL or DYNALLOC as its TERSE'd
datasets?

-- gil




Re: Concatenating TERSEd data?

2008-10-17 Thread Mark Zelden
On Fri, 17 Oct 2008 12:00:48 -0500, Tim Hare [EMAIL PROTECTED] wrote:


We need to TERSE a fairly large (for us) amount of data. This data is in
multiple separate datasets now, but needs to be sent as one large sequential
dataset.  We can TERSE the concatenated sequential input of course; but out
of curiosity I'm wondering: can you TERSE the individual components,
concatenate the results via IEBGENER, and then UNTERSE the resulting file on
the other end?

On Fri, 17 Oct 2008 14:23:44 -0500, John McKown [EMAIL PROTECTED] wrote:

I would probably do the following:

1. TERSE each dataset to be transmitted.

2. Create a PDS large enough to contain each TERSEd dataset as a separate
member.


PDS?  If this is a large amount of data (the OP didn't give a clue as to what
that really meant...) then it's not going to fit within the (relatively) small
size restriction of a PDS (< 64K tracks).  

Why not TERSE them (concatenated), with the output on a single tape
data set (multi-volume if required) if the size is too big for disk?
I've done that to deal with 15 volume 3390-3 SADUMP data sets.

Mark
--
Mark Zelden
Sr. Software and Systems Architect - z/OS Team Lead
Zurich North America / Farmers Insurance Group - ZFUS G-ITO
mailto:[EMAIL PROTECTED]
z/OS Systems Programming expert at http://expertanswercenter.techtarget.com/
Mark's MVS Utilities: http://home.flash.net/~mzelden/mvsutil.html




Re: Concatenating TERSEd data?

2008-10-17 Thread John McKown
On Fri, 17 Oct 2008, Paul Gilmartin wrote:

 What's the advantage of zip without compression over pax?  tar?

Most Windows people have zip and know what it is. But, in this case, it 
might be better to pax the files. Does pax read and write legacy datasets?

 
 Dammit! why won't AMATERSE tolerate HFS files, NFS files, or
 POSIX pipes allocated with JCL or DYNALLOC as its TERSE'd
 datasets?

Likely the header information referred to earlier specifies things like 
DSORG, LRECL, RECFM, and other attributes that don't really apply to a UNIX 
file. Not that AMATERSE couldn't be extended to support UNIX files by 
using some other values in that header information.

 
 -- gil

-- 
Q: What do theoretical physicists drink beer from?
A: Ein Stein.

Maranatha!
John McKown




Re: Concatenating TERSEd data?

2008-10-17 Thread Paul Gilmartin
On Fri, 17 Oct 2008 18:09:43 -0500, John McKown wrote:

 What's the advantage of zip without compression over pax?  tar?

Most Windows people have zip and know what it is. But, in this case, it
might be better to pax the files. Does pax read and write legacy datasets?

I was thinking terse, then tar; even as you suggested terse, then zip.

Pax reads and writes legacy data sets on the archive side only.

 Dammit! why won't AMATERSE tolerate HFS files, NFS files, or
 POSIX pipes allocated with JCL or DYNALLOC as its TERSE'd
 datasets?

Likely the header information referred to earlier specifies things like
DSORG, LRECL, RECFM, and other attributes that don't really apply to a UNIX
file. Not that AMATERSE couldn't be extended to support UNIX files by
using some other values in that header information.

No, no; I was thinking UNIX files on the tersed side, not the
untersed.  Now, it's possible to terse and cp the archive to
a Unix file, and even unterse directly from the unix archive
provided you precatenate an empty legacy data set !?!?.

I.e. you'd like to terse to NFS files mounted from your
Linux system, then pax on Linux.

-- gil
