Re: help me with a parsing script please

Rob Dixon Fri, 13 May 2011 01:15:49 -0700

On 12/05/2011 10:23, Nathalie Conte wrote:


Hi,

I have this file format:
chr start end strand
x 12 24 1
x 24 48 1
1 100 124 -1
1 124 148 -1

Basically, I would like to create a new file by combining the start of the
first line (12) with the end of the second line (48), and so on for each
following pair of lines. The output should look like this:
x 12 48 1
1 100 148 -1

I have this script to split and iterate over each line, but I don't know
how to group two lines together and take the start of the first line and
the end of the second line. Could you please advise? Thanks

unless (open(FH, '<', $file)){
die "Cannot open file \"$file\": $!\n";
}

my @list = <FH>;
close FH;

open(OUTFILE, ">grouped.txt");


foreach my $line(@list){
chomp $line;
my @coordinates = split(' ', $line);
my $chromosome = $coordinates[0];
my $start = $coordinates[1];
my $end = $coordinates[2];
my $strand = $coordinates[3];
...???


Hi Nathalie

I have written something that should work for you. It includes basic
checks (that the chromosome and strand fields in the two lines match,
and that the end field of the first line matches the start field of the
second line). You may want to add more, depending on how much you trust
your data.

HTH,

Rob


use strict;
use warnings;

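# Read the data two lines at a time; each pair becomes one output line
while (my $line1 = <DATA>) {
  my $line2 = <DATA>;
  last unless defined $line2;  # a trailing unpaired line is ignored

  # Split each line on whitespace into chr, start, end, strand
  my @data = (
    [ split ' ', $line1 ],
    [ split ' ', $line2 ],
  );

  die "Chromosome mismatch" unless $data[0][0] eq $data[1][0];
  die "Strand mismatch" unless $data[0][3] == $data[1][3];
  die "End of first line differs from start of second" unless $data[0][2] == $data[1][1];

  # Keep the start of the first line and take the end of the second
  $data[0][2] = $data[1][2];

  print "@{$data[0]}\n";
}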

__DATA__
x     12    24    1
x    24    48    1
1    100    124    -1
1    124    148    -1

**OUTPUT**
x 12 48 1
1 100 148 -1
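
If you want the same logic to read your real file and write grouped.txt, something along these lines might do. It is only a sketch: the sub name group_pairs is made up for illustration, taking the input file name from the command line is an assumption, and the extra checks (four fields per line, start less than end, an even number of data lines) are guesses at what "more checks" could mean for your data. It also assumes the "chr start end strand" header line may be present and should be skipped.

```perl
use strict;
use warnings;

# Hypothetical helper: takes a list of input lines and returns one
# merged line for each consecutive pair of data lines.
sub group_pairs {
    my @lines = @_;
    my @merged;

    # Drop blank lines and the "chr start end strand" header, if present
    my @rows = grep { /\S/ && !/^chr\s+start\b/i } @lines;

    die "Odd number of data lines: the last line has no partner\n"
        if @rows % 2;

    while (@rows) {
        my @first  = split ' ', shift @rows;
        my @second = split ' ', shift @rows;

        # Extra sanity checks (assumptions, adjust to your data)
        for my $row (\@first, \@second) {
            die "Expected 4 fields, got: @$row\n" unless @$row == 4;
            die "Start is not less than end: @$row\n"
                unless $row->[1] < $row->[2];
        }

        # The same three checks as in the script above
        die "Chromosome mismatch: @first / @second\n"
            unless $first[0] eq $second[0];
        die "Strand mismatch: @first / @second\n"
            unless $first[3] == $second[3];
        die "Lines are not adjacent: @first / @second\n"
            unless $first[2] == $second[1];

        # Start of the first line, end of the second
        push @merged, join ' ', $first[0], $first[1], $second[2], $first[3];
    }

    return @merged;
}

# Usage: perl group.pl input.txt   (the script and file names are assumptions)
my ($file) = @ARGV;
if (defined $file) {
    open my $in,  '<', $file         or die "Cannot open '$file': $!\n";
    open my $out, '>', 'grouped.txt' or die "Cannot write 'grouped.txt': $!\n";
    print {$out} "$_\n" for group_pairs(<$in>);
    close $out or die "Cannot close 'grouped.txt': $!\n";
}
```

Keeping the pairing in a sub that works on a list of lines means you can try it on a few hand-written lines before pointing it at a large file.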


