Re: Dropping duplicates.
Check out the Unix uniq utility, which will eliminate duplicate lines in a sorted file.

--
David Andrews
A. Duda and Sons, Inc.
[EMAIL PROTECTED]

--
For IBM-MAIN subscribe / signoff / archive access instructions, send email to [EMAIL PROTECTED] with the message: GET IBM-MAIN INFO
Search the archives at http://bama.ua.edu/archives/ibm-main.html
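For anyone without uniq to hand, here is a minimal Python sketch of the same idea. uniq compares each line only with the one before it, so the input does not have to be fully sorted; it is enough that duplicates sit next to each other, which is the situation described in this thread. The function name and the sample data are made up for illustration.

```python
from itertools import groupby

def drop_adjacent_duplicates(lines):
    """Yield one line per run of identical adjacent lines (uniq-like)."""
    # groupby() with no key groups consecutive equal items together,
    # so each group corresponds to one run of duplicate lines.
    for line, _run in groupby(lines):
        yield line

# Hypothetical sample: duplicates are grouped, but the data is not sorted.
sample = ["A", "A", "B", "C", "C", "C", "A"]
result = list(drop_adjacent_duplicates(sample))
print(result)
```

The input order is preserved, and a value that reappears later in a separate run (the final "A" here) is kept, because only neighbouring lines are compared.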
Dropping duplicates.
Listers,

Anyone have any thoughts on this? I have a large sequential file, and I need to drop duplicate records from it. Sort would work fine if I knew the correct key sequence, but that information is not immediately available. The file needs to retain its input sequence. Duplicates are always grouped together. Short of writing a program, is there a quick way to fix this?

Thanks,
Adrian.

Webmaster, http://www.losangelesmetro.net.
Supporter of Expo Light Rail - Enabler for the Digital Coast http://www.friends4expo.org.
Re: Dropping duplicates.
Thanks Frank,

No DFSORT and no ICETOOL here. DFSORT would have been great. A regular sort will not work without a SORT or MERGE statement.

The file characteristics are:

  Organization . . . : PS
  Record format . . . : FB
  Record length . . . : 320
  Block size . . . . : 27840

The first eighty bytes look like:

  077075333730D2123200506001435L062M79506 MPC

It fills 114 cylinders. Dups are identical records, and they should not exist.

Thanks again,
A.
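If it does come down to writing a program, the logic is small. The following is only a sketch in Python, assuming the dataset has been copied somewhere Python is available as a raw binary stream of fixed 320-byte records (RECFM=FB, LRECL=320 as stated above). The function name, the in-memory demo data and the error handling are illustrative assumptions, not anything from the original post.

```python
import io

RECORD_LENGTH = 320  # LRECL quoted in the post (RECFM=FB)

def copy_without_adjacent_duplicates(src, dst, record_length=RECORD_LENGTH):
    """Copy fixed-length records from src to dst, skipping any record that
    is byte-for-byte identical to the record immediately before it.
    Returns a (records_read, records_written) tuple."""
    previous = None
    records_read = 0
    records_written = 0
    while True:
        record = src.read(record_length)
        if not record:
            break  # end of input
        if len(record) != record_length:
            raise ValueError("trailing partial record in input")
        records_read += 1
        if record != previous:
            dst.write(record)
            records_written += 1
        previous = record
    return records_read, records_written

def make_record(tag):
    """Build a hypothetical 320-byte record: a tag padded with blanks."""
    return tag.ljust(RECORD_LENGTH, b" ")

# Demo with made-up records held in memory instead of real files.
tags = (b"AAA", b"AAA", b"BBB", b"AAA", b"AAA", b"AAA")
source = io.BytesIO(b"".join(make_record(t) for t in tags))
target = io.BytesIO()
counts = copy_without_adjacent_duplicates(source, target)
print(counts)
```

Only one record is held in memory at a time, so the size of the file (114 cylinders here) should not matter, and the input sequence is retained because records are written in the order they are read.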
Re: Dropping duplicates.
Actually I thought of a much better way to do this with DFSORT's ICETOOL given that you say all of the duplicates are grouped together. This version only requires one copy pass rather than two sort passes.

//S1 EXEC PGM=ICETOOL
//TOOLMSG DD SYSOUT=*
//DFSMSG DD SYSOUT=*
//IN DD *
01
02
01
01
02
03
04
01
02
03
01
/*
//OUT DD SYSOUT=*
//TOOLIN DD *
* Select first record with each key.
SELECT FROM(IN) TO(OUT) ON(1,4,CH) FIRST USING(CTL1)
/*
//CTL1CNTL DD *
* Force copy instead of sort since dup records are
* grouped together.
  OPTION COPY
/*

Frank Yaeger - DFSORT Team (IBM)
Specialties: ICETOOL, IFTHEN, OVERLAY, Symbols, Migration

DFSORT/MVS is on the Web at http://www.ibm.com/storage/dfsort/
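For readers who do not have DFSORT, the effect described above (copy the input in order and keep the first record of each group that shares the ON field) can be sketched in Python as follows. This only mirrors the behaviour as the post describes it and is not a statement about how DFSORT works internally. The key position and length are taken from ON(1,4,CH); the function names and sample records are invented for illustration.

```python
from itertools import groupby

KEY_START = 1    # 1-based start position, as in ON(1,4,CH)
KEY_LENGTH = 4   # key length in bytes, as in ON(1,4,CH)

def record_key(record):
    """Return the compare field of a record (positions 1-4 by default)."""
    begin = KEY_START - 1  # convert the 1-based position to a 0-based index
    return record[begin:begin + KEY_LENGTH]

def select_first_per_run(records):
    """Keep the first record of each run of adjacent records with equal keys."""
    for _key, run in groupby(records, key=record_key):
        yield next(run)

# Hypothetical records: a 4-character key followed by other data.
sample = ["0001 a", "0001 b", "0002 c", "0001 d", "0001 e"]
kept = list(select_first_per_run(sample))
print(kept)
```

Note that in this sketch two neighbouring records with the same four-byte key but different remaining data are also collapsed into one. When the duplicates are whole identical records, the compare field would need to cover every byte that can differ.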
Re: Dropping duplicates.
On 29 Jun 2005 17:21:35 -0700, in bit.listserv.ibm-main (Message-ID:[EMAIL PROTECTED]) [EMAIL PROTECTED] (Adrian H Auer-Hudson) wrote:

> Anyone have any thoughts on this: I have a large sequential file. I need to drop duplicate records from said file. Sort would work fine if I knew the correct key sequence. This information is not immediately available. The file needs to retain its input sequence. Duplicates are always grouped together. Short of writing a program is there a quick way to fix this?

These are statements for Syncsort. I'm not sure if there's an exact equivalent for DFSORT. Also, I haven't tried it, but this *might* work:

  SORT FIELDS=COPY
  SUM FIELDS=NONE