Hi,
I have three data files, each consisting of records of double-quoted
strings that are comma separated. The first data "field" of each
record contains a unique sequence number. All the records in the files
contain the same number of "fields".
Step 1: I need to merge two of these files of records together, sorted
on the sequence number, into a separate output file.
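Step 1 could be sketched as below, assuming each record has already been
read into an array reference whose first element is the sequence number.
The sample records and the format_record helper are made up for this
illustration, and writing an embedded quote as "" on output is an
assumption the post does not confirm.

```perl
use strict;
use warnings;

# Each record is an array reference; element 0 holds the sequence number.
# These sample records are hypothetical.
my @file1 = ( [ '1100', '', 'new', 'yes' ], [ '1103', 'c', 'old', 'no' ] );
my @file2 = ( [ '1101', "two\nlines", 'old', 'no' ] );

# With inputs of only a few megabytes, sorting the combined list in memory
# is simpler than a streaming merge.
my @merged = sort { $a->[0] <=> $b->[0] } @file1, @file2;

# Turn a record back into quoted, comma-separated text ending in a newline.
# Assumption: an embedded quote is doubled ("") on output.
sub format_record {
    my ($record) = @_;
    my @quoted;
    for my $field (@$record) {
        ( my $copy = $field ) =~ s/"/""/g;
        push @quoted, qq{"$copy"};
    }
    return join( ',', @quoted ) . "\n";
}

print format_record($_) for @merged;
```

Embedded newlines survive because they stay inside the quoted field when
the record is written back out.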
Step 2: I need to compare this merged output file (call it A) with the
third input file. This time, I need to look for missing sequence
numbers (they increment by 1) and insert dummy records for these
missing sequence numbers into the merged output file (A).
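One possible reading of Step 2, sketched under loudly stated assumptions:
a sequence number counts as missing when it lies between the lowest and
highest numbers in A and appears in neither A nor the third file, and a
dummy record holds that number followed by empty strings. The fill_gaps
name, its arguments, and the sample data are hypothetical. Sequence
numbers are assumed to be plain integers without leading zeros.

```perl
use strict;
use warnings;

# Return the records of @$merged plus a dummy record for every sequence
# number in A's range that is in neither @$merged nor %$in_third.
sub fill_gaps {
    my ( $merged, $in_third, $field_count ) = @_;
    return () unless @$merged;

    # Index A's records by sequence number.
    my %have;
    $have{ $_->[0] } = $_ for @$merged;
    my @numbers = sort { $a <=> $b } keys %have;

    my @filled;
    for my $n ( $numbers[0] .. $numbers[-1] ) {
        if ( $have{$n} ) {
            push @filled, $have{$n};
        }
        elsif ( !$in_third->{$n} ) {
            # Dummy record: the number, then empty strings for the rest.
            push @filled, [ $n, ('') x ( $field_count - 1 ) ];
        }
    }
    return @filled;
}

my @merged   = ( [ 1100, 'a' ], [ 1103, 'b' ] );   # file A (made up)
my %in_third = ( 1101 => 1 );                      # numbers seen in file 3
my @filled   = fill_gaps( \@merged, \%in_third, 2 );
print join( ' ', map { $_->[0] } @filled ), "\n";  # prints "1100 1102 1103"
```

If the third file should instead be the sole reference for which numbers
must exist, the test in the elsif branch would change accordingly.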
I wouldn't think this is too difficult, but...
The problem is that some of the double-quoted strings contain newlines.
OK, there are a lot of newlines in these quoted strings - and they have
to remain in the data.
So, when I open a filehandle and try to split the file contents on the
commas into an array, I only get the first line of file data.
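The one-line symptom is consistent with how Perl reads a filehandle: the
string argument of split is evaluated in scalar context, so a filehandle
read placed there returns a single line. A minimal sketch of the
difference, using in-memory filehandles in place of a real file (the
variable names are made up for the illustration):

```perl
use strict;
use warnings;

my $data = qq{"1100","","new","yes"\n"1101","today is the best\nday of my life","old","no"\n};

# In-memory filehandles stand in for a real file so the demo is self-contained.
open my $fh, '<', \$data or die "Cannot open in-memory file: $!";
my @first_only = split /,/, <$fh>;    # scalar context: only one line is read
close $fh;

open $fh, '<', \$data or die "Cannot open in-memory file: $!";
my $everything = do { local $/; <$fh> };    # $/ undefined: whole file is read
close $fh;

print scalar(@first_only), " fields came from the first line only\n";
print length($everything), " characters were read when slurping\n";
```

Slurping is reasonable here because the files are small. Splitting the
slurped text on commas would still break on commas or newlines inside
quoted fields, so the quotes need to be taken into account as well.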
Here's my code (try not to laugh too hard!)
select STDOUT;
$|=1;
select STDERR;
$|=1;
$InFile1 = (qq/$ENV{'TEMP'}\\InFile1.txt/);
open INFILE1, "< $InFile1";
@InArray1 = split(",", <INFILE1>);
print "did it\n";
close INFILE1;
foreach $InArray1(@InArray1)
{
print $InArray1;
}
My resulting output is:
C:\Temp>perl -w migration3.pl
did it
"1100new""yes"
The contents of my test file (InFile1.txt) are:
"1100","","new","yes"
"1101","today is the best
day of my life","old","no"
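A record-aware reader for data like the test file above might look like
the following sketch. It assumes every field is double-quoted, that a
record ends at a newline following a closing quote, and that a literal
quote inside a field is written as "" (the post does not say how quotes
are escaped). The parse_quoted_records name is hypothetical.

```perl
use strict;
use warnings;

# Split text into records (array references of fields), honouring quotes so
# that commas and newlines inside a quoted field stay part of that field.
sub parse_quoted_records {
    my ($text) = @_;
    my ( @records, @fields );
    # Match one quoted field plus the separator that follows it.
    while ( $text =~ /\G"((?:[^"]|"")*)"(,|\r?\n|\z)/gc ) {
        my ( $field, $separator ) = ( $1, $2 );
        $field =~ s/""/"/g;    # assumed escape: "" means one literal quote
        push @fields, $field;
        if ( $separator ne ',' ) {    # newline or end of text closes a record
            push @records, [@fields];
            @fields = ();
        }
    }
    my $end = pos($text) // 0;
    warn "Parsing stopped at offset $end\n" if $end < length($text);
    return @records;
}

# The two-record test file, held in a string for the demo.
my $text = qq{"1100","","new","yes"\n"1101","today is the best\nday of my life","old","no"\n};

my @records = parse_quoted_records($text);
printf "%d records, %d fields in the first\n", scalar @records, scalar @{ $records[0] };
```

The warning fires when the text contains something the pattern does not
cover, such as an unquoted field or a blank line, rather than silently
dropping the rest of the input.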
This is just a tiny test file that I tossed together. The real data
consists of ~20 fields per record. Of these, there are maybe 8-12
fields that could contain embedded newlines. There could be multiple
newlines in any of these fields.
Also, the actual data files are not huge. Each is under 7 MB.
I did look around to see if there were any modules available that would
help me out. I looked at File::Sort, Sort::Merge, and File::MergeSort.
I'm not sure how to get past this first hurdle. These modules are
either looking at the file contents line by line or the input mechanism
is completely open, and I would need to supply my own.
Any assistance is appreciated! :)
Best Regards,
-Bill
___
Perl-Win32-Users mailing list