--zhXaljGHf11kAtnf
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable

On Tue, Aug 26, 2003 at 08:03:35PM -0500, Ron Johnson wrote:
> On Tue, 2003-08-26 at 18:23, Bijan Soleymani wrote:
> > On Tue, Aug 26, 2003 at 02:30:57PM -0500, Ron Johnson wrote:
> > > On Tue, 2003-08-26 at 13:29, David Turetsky wrote:
> > > > > On Tue, 2003-08-26 at 10:05, Kirk Strauser wrote:
> > > > > From: Ron Johnson [mailto:[EMAIL PROTECTED]
> > > > > For example, COBOL has intrinsic constructs for easily handling
> > > > > ISAM files in a variety of manners.  Likewise, there is a very
> > > > > powerful intrinsic SORT verb.
> > > > >
> > > >
> > > > Yes, but how does that compare with similarly powerful features in Perl?
> > >
> > > I *knew* someone would ask about the Programmable Extraction and
> > > Reporting Language...
> > >
> > > Please don't think that I am implying that Perl or C are bad languages.
> > > I certainly wouldn't write a logfile analyzer in COBOL.
> > >
> > > For my knowledge, how would Perl sort the contents of a file,
> > > allowing the programmer to use a complex algorithm as an
> > > input filter, and then take the output stream, processing it
> > > 1 record at a time, without needing to write to and then read
> > > from temporary files with all of the extra SLOC that that entails?
> >
> > 1) Read from records from file into an array.
> > 2) Order them with any perl code you want and store them in an array.
> > 3) Use a nice foreach loop to process them.
> > --- Outline of Perl code ---
> > #!/usr/bin/perl
> > @records = read_records_function("records.txt");
> > #either
> > @sorted = sort @records; #to put things in alphabetical order
> > #or
> > @sorted = sort function @records; #to sort using a function
> > # or even
> > @sorted = sort {sort-function-code} @records; #to have it in-line
> > foreach $record (@sorted)
> > {
> >     # code to process records
> > }
>
> That's great for in-memory stuff.
>
> What about when there are, say, 10 or 40M records to process?
>
> And what if you only need to SORT a fraction of those 40M records,
> and the winnowing algorithm is very complicated, possibly needing
> to access other files or database tables in the process?

Well then you do it differently:

open(INPUT, "<", "records.txt") or die "Cannot open records.txt: $!";
LOOP: while ($record = <INPUT>)
{
    # skip any record the winnowing function rejects
    if (!input_filter($record))
    {
        next LOOP;
    }
    else
    {
        # code to process record
    }
}
close(INPUT);

Where input_filter() is the winnowing function you define.

Note:
This program assumes that records are separated by newlines.
But you could set the input record separator ($/) to whatever you want.

Bijan
-- 
Bijan Soleymani <[EMAIL PROTECTED]>
http://www.crasseux.com

--zhXaljGHf11kAtnf--
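
The loop above filters records as they are read, but it does not yet sort the survivors. A minimal sketch of one way to combine the two steps follows: stream the file, keep only the records that pass the filter, then sort what is left. The record layout (comma-separated, numeric key in the first field), the filter rule, and the name filter_and_sort() are assumptions made for illustration and do not come from the thread. Memory use grows with the number of records that survive the filter, not with the size of the file; if the survivors still do not fit in memory, an external sort would be needed instead.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical winnowing function: keep a record only if its first
# comma-separated field is an integer greater than 100.
sub input_filter {
    my ($record) = @_;
    my ($key) = split /,/, $record;
    return defined $key && $key =~ /^\d+$/ && $key > 100;
}

# Read the file one record at a time, keep the records that pass
# input_filter(), and return them sorted numerically by first field.
sub filter_and_sort {
    my ($path) = @_;
    local $/ = "\n";    # input record separator; change to suit the data
    open(my $in, '<', $path) or die "Cannot open $path: $!";
    my @kept;
    while (my $record = <$in>) {
        chomp $record;
        push @kept, $record if input_filter($record);
    }
    close($in);
    my @sorted = sort { (split /,/, $a)[0] <=> (split /,/, $b)[0] } @kept;
    return @sorted;
}

# Demo: write three sample records to a temporary file, then process
# the filtered, sorted records one at a time.
my ($fh, $path) = tempfile(UNLINK => 1);
print {$fh} "300,gamma\n42,alpha\n150,beta\n";
close($fh);

foreach my $record (filter_and_sort($path)) {
    print "$record\n";    # prints 150,beta then 300,gamma
}
```

Only the filter and the sort comparator depend on the record layout, so a more complicated winnowing rule (one that consults other files or database tables, say) would replace the body of input_filter() without touching the loop.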