Wang, Li wrote:
Dear All


Hello,

Thanks very much for your help!
I tried the script with my real data and found that the situation is more
complicated.
The following is part of my data:
scaffold_1_13528        T/T     C/T     T/T     C/T     T/T     C/T     T/T     
T/T     N/N     T/T     C/T     T/T     C/C     C/T     C/T     T/T     T/T     
T/T     T/T     T/T     T/T     C/T     C/T     C/T     C/T     T/T     T/T     
T/T     N/N     T/T (keep)
scaffold_1_13531        G/G     G/G     G/G     G/G     G/G     G/C     G/G     
G/G     N/N     G/G     G/C     G/G     G/G     G/G     G/G     G/G     G/G     
G/G     G/G     G/G     G/G     G/G     G/G     G/G     G/G     G/G     G/G     
G/G     N/N     G/G  (keep)
scaffold_1_13546        A/A     A/A     A/A     A/A     A/A     C/A     A/A     
A/A     N/N     A/A     C/A     N/N     A/A     A/A     N/N     N/N     A/A     
N/N     A/A     N/N     A/A     A/A     A/A     C/A     C/A     A/A     A/A     
N/N     N/N     A/A  (keep)
scaffold_1_22222        N/N     C/C     N/N     C/C     N/N     N/N     C/C     
C/C     N/N     N/N     C/C     C/C     N/N     C/C     C/C     C/C     N/N     
N/N     C/C     N/N     C/C     N/N     C/C     N/N     C/C     C/C     C/C     
N/N     N/N     C/C (delete)
scaffold_1_113139       C/C     C/C     N/N     C/C     N/N     C/C     C/C     
C/G     N/N     C/C     N/N     C/C     N/N     C/C     C/C     N/N     N/N     
C/C     N/N     C/G     C/G     N/N     C/C     C/C     N/N     C/C     C/C     
N/N     C/C     C/C  (keep)
scaffold_1_113140       G/G     G/G     N/N     G/G     N/N     G/G     G/G     
G/G     N/N     G/G     G/G     G/G     N/N     G/G     G/G     N/N     N/N     
G/G     N/N     G/G     G/G     N/N     G/G     G/G     N/N     G/G     G/A     
N/N     G/A     G/G  (keep)
scaffold_1_113207       A/A     A/A     N/N     A/A     N/N     A/A     A/A     
A/A     N/N     A/A     A/A     A/A     N/N     A/A     A/A     N/N     N/N     
A/A     N/N     A/A     A/A     N/N     A/A     A/A     N/N     A/A     A/A     
N/N     A/A     A/A (delete)
scaffold_1_114021       C/C     C/C     N/N     C/C     N/N     C/C     C/C     
C/T     C/C     C/C     C/C     N/N     N/N     C/C     C/C     C/T     N/N     
C/C     N/N     C/T     C/C     N/N     C/C     C/C     C/C     C/C     C/C     
N/N     C/C     C/C  (keep)
scaffold_1_114213       A/C     C/C     A/A     C/C     N/N     A/C     A/A     
A/A     A/A     A/C     A/C     A/C     N/N     A/A     A/A     A/A     N/N     
A/A     A/A     A/A     A/A     N/N     A/A     C/C     A/A     A/A     A/A     
N/N     A/A     A/A  (keep)

If, ignoring "N/N", all the remaining genotypes in a line are the same, the line
should be deleted. The "scaffold" column gives the position of the SNP.

My code is as follows:
#! /usr/bin/perl
use strict;
use warnings;

my $usage="perl $0<infile>\n";

my $in=shift or die $usage;
open (IN,$in) or die "Error: not found the $in\n";

my $outfile = "SNPFilterSeg.txt";
open (OUT, ">$outfile");

my $i;


while (<IN>){
     next if /^#/;
     $_=~s/\n//;
        $_=~s/\r//;
        my @tmp=split("\t",$_);
        my @arr;
        for ($i=1; $i<=30; $i++){
        next if $tmp[$i] =~ m/N\/N/; #filter out all "N/N"
        @arr = split("\t",$tmp[$i]); #assign the filtered data to a new array @arr
        }
     if (@arr == grep $arr[0] eq $_, @arr) {
        print OUT "here\n";
       }
      else{
        print OUT "@tmp\n";
        }
     }

close IN;
close OUT;


#!/usr/bin/perl
use strict;
use warnings;

my $usage = "perl $0 <infile>\n";

my $in = shift or die $usage;
open IN, '<', $in or die "Cannot open '$in' because: $!";

my $outfile = "SNPFilterSeg.txt";
open OUT, '>', $outfile or die "Cannot open '$outfile' because: $!";


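while ( <IN> ) {
    next if /^#/;    # skip comment lines
    # Split on whitespace, drop the scaffold name and ignore 'N/N' calls.
    my ( undef, @tmp ) = grep $_ ne 'N/N', split;
    # Keep the line unless every remaining genotype equals the first one.
    print OUT $_ if @tmp != grep $tmp[ 0 ] eq $_, @tmp;
    }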

close IN;
close OUT;
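
The same test can also be written as a small subroutine, which makes it easy to try on single lines. This is only a sketch: the name is_polymorphic is made up for illustration, it counts distinct genotypes with a hash instead of comparing grep counts (the result should be the same), and it assumes the "(keep)" and "(delete)" notes in the sample are annotations for this post and are not present in the real file.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of the script above): returns true when a
# line should be kept, i.e. when its non-'N/N' genotypes are not all equal.
sub is_polymorphic {
    my ($line) = @_;

    # Split on whitespace, ignore 'N/N', and drop the leading scaffold name.
    my ( undef, @calls ) = grep { $_ ne 'N/N' } split ' ', $line;

    # Record each distinct genotype once.
    my %seen;
    $seen{$_}++ for @calls;

    # More than one distinct genotype means the site varies.
    return keys(%seen) > 1;
}

# Two shortened example lines in the same layout as the data above.
for my $line ( "scaffold_1_13531\tG/G\tG/C\tN/N\tG/G\n",
               "scaffold_1_22222\tN/N\tC/C\tN/N\tC/C\n" ) {
    print $line if is_polymorphic($line);    # only the first line is printed
}
```

A line whose calls are all 'N/N' has no genotypes left, so it is dropped as well, just as in the loop above.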




John
--
Any intelligent fool can make things bigger and
more complex... It takes a touch of genius -
and a lot of courage to move in the opposite
direction.                   -- Albert Einstein

