> On 1/19/2010 12:09 AM, Perl Noob wrote:
>> I have a data file with thousands of records. The problem is that the
>> records in the data file span two lines for each record. I want to
>> write a perl script that makes each record a single line. The file
>> looks like this:
>>
>> RECORD1FIELD1 RECORD1FIELD2 RECORD1FIELD3 RECORD1FIELD3
>> RECORD1FIELD4 RECORD1FIELD5
>>
>> RECORD2FIELD1 RECORD2FIELD2 RECORD2FIELD3 RECORD2FIELD3
>> RECORD2FIELD4 RECORD2FIELD5
>>
>> . . .
>>
>> What I want is this:
>>
>> RECORD1FIELD1 . . .RECORD1FIELD5
>> RECORD2FIELD1 . . .RECORD2FIELD5
>>
>>
>> The second line of each record actually has a bunch of spaces before
>> the first field. I thought I could exploit this with:
>>
>> s/\n //gi;
>>
>> what I thought would happen is the script would look for a new line
>> followed by a bunch of empty spaces and delete only those. But that
>> didn't work.
>>
>> Using a hex editor I saw that each new line was 0D 0A. I then tried:
>>
>> s/\x0D\x0A//gi;
>>
>> that didn't work either.
>>
>> I just want to move the second line of each record to the end of the
>> first. It seems so simple, but I am exhausted of trying different
>> things.
>>
>
> I see a couple of choices. Your example data seems to have an
> extra newline between logical records. If that's true, then
> you can read them as paragraphs, e.g.,
>
> #!/usr/bin/perl
>
> use warnings;
> use strict;
>
> $/ = "\n\n"; # one of the paragraph modes
>
> while( <DATA> ) {
>     my @fields = split;
>     print "@fields\n";
> }
>
>
> __DATA__
> RECORD1FIELD1 RECORD1FIELD2 RECORD1FIELD3 RECORD1FIELD3
> RECORD1FIELD4 RECORD1FIELD5
>
> RECORD2FIELD1 RECORD2FIELD2 RECORD2FIELD3 RECORD2FIELD3
> RECORD2FIELD4 RECORD2FIELD5
>
>
> If the apparent extra newline was not intentional, then
> you could simply read two lines at a time, e.g.,
>
> #!/usr/bin/perl
>
> use warnings;
> use strict;
>
> while( <DATA> ) {
>     $_ .= <DATA>;
>     my @fields = split;
>     print "@fields\n";
> }
>
>
> __DATA__
> RECORD1FIELD1 RECORD1FIELD2 RECORD1FIELD3 RECORD1FIELD3
> RECORD1FIELD4 RECORD1FIELD5
> RECORD2FIELD1 RECORD2FIELD2 RECORD2FIELD3 RECORD2FIELD3
> RECORD2FIELD4 RECORD2FIELD5
>
>
> --
> Brad

I am AMAZED at the help available in this forum. It is an awesome
resource.

I can see, though, that my situation needs to be stated more clearly.
The data is not consistent throughout the entire file. I WISH I only
had to join every other line. The problem is not quite that simple.

The data I need is always formatted the same way within the file, but
it is not so neat as to be on every other line. The common
characteristic of the data I need is that the first line of each record
ends with an end-of-line marker and the following line starts with 65
spaces.

Here is a better sample of what I described:

_______BEGIN SAMPLE DATA FILE_________________
RandomJunkNothingImportantMoreJunk
StuffthatdoesntmatterWhocaresaboutthis
RECORD1FIELD1(3 spaces)RECORD1FIELD2(3 spaces)RECORD1FIELD3(newline)
(65 spaces)RECORD1FIELD4(12 spaces)RECORD1FIELD5
RECORD2FIELD1(3 spaces)RECORD2FIELD2(3 spaces)RECORD2FIELD3(newline)
(65 spaces)RECORD2FIELD4(12 spaces)RECORD2FIELD5
RandomJunkNothingImportantMoreJunk
StuffthatdoesntmatterWhocaresaboutthis
MoreJunkThatDoesntmatterStuffIdontwantWhocaresaboutthis
RECORD3FIELD1(3 spaces)RECORD3FIELD2(3 spaces)RECORD3FIELD3(newline)
(65 spaces)RECORD3FIELD4(12 spaces)RECORD3FIELD5
RECORD4FIELD1(3 spaces)RECORD4FIELD2(3 spaces)RECORD4FIELD3(newline)
(65 spaces)RECORD4FIELD4(12 spaces)RECORD4FIELD5
RECORD5FIELD1(3 spaces)RECORD5FIELD2(3 spaces)RECORD5FIELD3(newline)
(65 spaces)RECORD5FIELD4(12 spaces)RECORD5FIELD5
RECORD6FIELD1(3 spaces)RECORD6FIELD2(3 spaces)RECORD6FIELD3(newline)
(65 spaces)RECORD6FIELD4(12 spaces)RECORD6FIELD5
___________END SAMPLE DATA FILE ____________________

You will notice in the sample above that the only consistent feature of
the usable data is the (newline) followed by (65 spaces). Therefore, if
I could find a way to do a search and replace like

s/(newline)(65spaces)//gi;

that would be great. I just need to find each (newline) followed by
(65 spaces) and delete it. I just am not sure how to do that. My brain
hurts.

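For what it is worth, here is a minimal sketch of that search and replace. It rests on a few assumptions: the whole file fits in memory, the continuation lines start with 65 literal space characters (not tabs), and the line endings are the 0D 0A pairs seen in the hex editor (a bare newline is tolerated too). The sub name join_continuation_lines is made up for illustration, and leaving one space in place of the removed text is my own choice so that FIELD3 and FIELD4 do not run together; use an empty replacement to delete it outright.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: pull each continuation line (one that begins with
# 65 spaces) up onto the end of the line before it.
sub join_continuation_lines {
    my ($text) = @_;

    # \r?   optional carriage return (0D), so CRLF and LF files both work
    # \n    the newline (0A)
    # {65}  applied to the space before it: exactly 65 literal spaces
    # The match is replaced with one space (an assumption, see above).
    $text =~ s/\r?\n {65}/ /g;

    return $text;
}

# The pattern can only match if a newline and the spaces that follow it
# are in the same string, so the file is read in one piece ("slurped")
# rather than line by line.
if (@ARGV) {
    my $file = $ARGV[0];
    open my $fh, '<', $file or die "Cannot open $file: $!";
    my $text = do { local $/; <$fh> };   # undef $/ reads the whole file
    close $fh;

    print join_continuation_lines($text);
}
```

Run it as `perl join.pl datafile > cleaned` (both file names are placeholders). Lines that do not begin with 65 spaces, including the junk lines, pass through unchanged.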