a question related to file selection

2009-01-23 Thread Li, Aiguo (NIH/NCI) [E]
Hi all,

 

I need to copy files from a directory to a folder daily.  How can I select
files based on the dates they were created?

 

Thanks,

 

AG Lee
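
One possible approach, as a minimal sketch: Perl's -M file test reports a file's age in days based on its modification time, which is usually the closest portable stand-in for a creation date (a true creation time is not available on every filesystem). The helper names recent_files and copy_recent and the cutoff argument are illustrative assumptions, not part of the original post.

```perl
use strict;
use warnings;
use File::Copy qw(copy);
use File::Spec;

# Return the plain files in $dir modified within the last $max_age_days days.
# -M measures age in days relative to the time the script started.
sub recent_files {
    my ($dir, $max_age_days) = @_;
    opendir(my $dh, $dir) or die "Cannot open directory $dir: $!";
    my @recent;
    for my $name (sort readdir($dh)) {
        my $path = File::Spec->catfile($dir, $name);
        next unless -f $path;                      # skip ".", ".." and subdirectories
        push @recent, $name if (-M $path) <= $max_age_days;
    }
    closedir($dh);
    return @recent;
}

# Copy the selected files into $dest and return their names.
sub copy_recent {
    my ($src, $dest, $max_age_days) = @_;
    my @files = recent_files($src, $max_age_days);
    for my $name (@files) {
        copy(File::Spec->catfile($src, $name), File::Spec->catfile($dest, $name))
            or die "Cannot copy $name: $!";
    }
    return @files;
}
```

For a daily job, a call such as copy_recent($from, $to, 1) would pick up the files changed during the last 24 hours.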



A question about how to split files

2009-01-15 Thread Li, Aiguo (NIH/NCI) [E]
Hi all,

 

I need to split a file containing three columns of data, as shown below,
into three separate files.  Each split file should contain the row names and
one column of data, and the column name should be used as the file name.  Is
there any advanced Perl function that allows me to do this?

 

probe set    E10662B_U133P2    E10662_U133P2    HF1589_U133P2
1007_s_at    12.59             12.9             12.7
1053_at      9.72              9.01             9.55
117_at       9.59              8.99             8.91
121_at       10.21             10.16            10.08
1255_g_at    6.02              6.28             6.01
1294_at      9.32              9.66             9.29
1316_at      10.05             10.01            9.89
1320_at      4.36              5.04             5.56
1405_i_at    5.52              8.24             6.52

 

 

Thanks in advance!

 

AG Lee
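
There is no single built-in for this, but it can be done with split and one output filehandle per column. The following is a sketch only, assuming the input is tab-delimited and that each output file is named after its column with a ".txt" suffix; the name split_columns and the suffix are made up for illustration.

```perl
use strict;
use warnings;
use File::Spec;

# Write one file per data column into $outdir; each file holds the row
# names plus that single column. Returns the column names.
sub split_columns {
    my ($infile, $outdir) = @_;
    open(my $in, '<', $infile) or die "Cannot open $infile: $!";
    my $header = <$in>;
    defined $header or die "$infile is empty";
    chomp $header;
    my ($row_label, @names) = split /\t/, $header;

    # Open one output handle per column and write its two-column header.
    my @out;
    for my $name (@names) {
        my $path = File::Spec->catfile($outdir, "$name.txt");
        open(my $fh, '>', $path) or die "Cannot write $path: $!";
        print {$fh} "$row_label\t$name\n";
        push @out, $fh;
    }

    while (my $line = <$in>) {
        chomp $line;
        next unless length $line;
        my ($row, @values) = split /\t/, $line;
        for my $i (0 .. $#out) {
            my $value = defined $values[$i] ? $values[$i] : '';
            print { $out[$i] } "$row\t$value\n";
        }
    }
    close $_ for @out;
    close $in;
    return @names;
}
```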

 



a question related to working with array

2008-08-15 Thread Li, Aiguo (NIH/NCI) [E]
Hello,

I have a programming question for you all.

I have three arrays: arrayA1, arrayA2 (arrayA1 and arrayA2 are the same
length) and arrayB.  Using the following code I created a fourth array,
arrayC, that contains the elements common to arrayA1 and arrayB.
Now I need to get the corresponding elements of arrayA2 into arrayD.
Can anyone help me with this issue?

map { $original {$_} = 1 } @arrayA1;

 

@arrayC = grep {$original {$_} } @arrayB;

 
The bottom-line question is how to track the position of $_ in arrayA1
when generating the intersection with arrayB, and then use that index to
get the corresponding elements of arrayA2.



Thanks in advance for your help and have a good night!

 AG Lee
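
One way to keep track of positions is to store each element's index, rather than 1, as the hash value. A minimal sketch follows; the function name corresponding is hypothetical, and it assumes that when arrayA1 contains duplicates the first occurrence is the one wanted.

```perl
use strict;
use warnings;

# Given references to arrayA1, arrayA2 and arrayB, return references to
# arrayC (elements of B that occur in A1) and arrayD (the A2 partners).
sub corresponding {
    my ($a1, $a2, $b_list) = @_;
    my %index_of;
    for my $i (0 .. $#{$a1}) {
        # Keep the first index if an element appears more than once.
        $index_of{ $a1->[$i] } = $i unless exists $index_of{ $a1->[$i] };
    }
    my (@c, @d);
    for my $elem (@{$b_list}) {
        next unless exists $index_of{$elem};
        push @c, $elem;
        push @d, $a2->[ $index_of{$elem} ];
    }
    return (\@c, \@d);
}
```

Called as my ($c, $d) = corresponding(\@arrayA1, \@arrayA2, \@arrayB), the two results stay aligned element by element.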

 

--
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
http://learn.perl.org/




A question related to column subsetting

2005-08-19 Thread Li, Aiguo (NIH/NCI)
Hi, all.

 

I need to write a Perl script to extract a subset of columns based on the
column header.  For example, I have a dataset containing 200 columns and a
list of 10 column names.  I would like to extract a dataset that contains only
those 10 columns from the whole dataset.

 

What I have in mind now is to generate array1 from the header of the whole
dataset and array2 from the 10 column names, then find the
positions of those 10 column names in array1, and then read the entire
dataset to extract the 10 columns.  Am I on the right track?  Is there a
ready-to-use function for this type of task?

 

Thanks in advance

 

AG
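
The plan described above maps directly onto a hash from column name to position plus an array slice. This is a sketch under the assumption that the data are tab-delimited; extract_columns is an illustrative name, and it works on already opened filehandles.

```perl
use strict;
use warnings;

# Copy only the columns named in @wanted from $in to $out (filehandles).
sub extract_columns {
    my ($in, $out, @wanted) = @_;
    my $header = <$in>;
    defined $header or die "Input is empty";
    chomp $header;
    my @names = split /\t/, $header, -1;

    # Map each column name to its position in the header.
    my %position;
    @position{@names} = (0 .. $#names);

    my @idx;
    for my $name (@wanted) {
        exists $position{$name} or die "No column named '$name'";
        push @idx, $position{$name};
    }

    print {$out} join("\t", @wanted), "\n";
    while (my $line = <$in>) {
        chomp $line;
        my @fields = split /\t/, $line, -1;
        # Short rows yield empty strings instead of warnings.
        print {$out} join("\t", map { defined $fields[$_] ? $fields[$_] : '' } @idx), "\n";
    }
    return scalar @idx;
}
```

The columns come out in the order given in @wanted, which need not match their order in the input.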



question related to readdir function

2005-07-15 Thread Li, Aiguo (NIH/NCI)
Dear all.
 
I wrote a piece of code to read and open all files in a directory and do
something with each of them.
But I found that the variable assigned to @file_array splits the file names
into pieces.  For example, the ". . test.rpt" file is separated into three
elements in the array, which are "." "." "test_rpt".  
 
My question is: how can I assign the entire file name to one variable without
splitting it?
 
Thanks,
 
AG
 
#!usr/local/bin/perl
 
use strict;
use warnings;
 
my $dirtoget="C:/perl/work/data/";
 
opendir (DIR, $dirtoget) or die 'Can not open DIR';
 
my @file_array;
 
 @file_array =readdir(DIR);
 closedir(DIR);
 
foreach my $filename (@file_array) {
 
 print $filename;
open (IN, "$dirtoget$filename") || die 'Can not open IN';
 
while(<IN>){
print $_;}
close(IN);}
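
For what it is worth, readdir does not split names: every directory listing includes the two special entries "." (the directory itself) and ".." (its parent) alongside the real file names, so the three elements are three separate entries. A sketch that skips them and builds full paths is below; the helper names plain_files and print_all are made up for this example.

```perl
use strict;
use warnings;
use File::Spec;

# Return full paths of the plain files in $dir.
sub plain_files {
    my ($dir) = @_;
    opendir(my $dh, $dir) or die "Cannot open directory $dir: $!";
    my @paths;
    for my $name (sort readdir($dh)) {
        my $path = File::Spec->catfile($dir, $name);
        push @paths, $path if -f $path;    # drops ".", ".." and subdirectories
    }
    closedir($dh);
    return @paths;
}

# Print the contents of every plain file in $dir.
sub print_all {
    my ($dir) = @_;
    for my $path (plain_files($dir)) {
        open(my $in, '<', $path) or die "Cannot open $path: $!";
        while (my $line = <$in>) {
            print $line;
        }
        close($in);
    }
}
```

Note that readdir returns bare names, so the directory has to be prepended before the file can be opened.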



Number or string?

2005-02-22 Thread Li, Aiguo (NIH/NCI)
Hi, all.

I have the following data from a file.
__data from file IN__
SNP_A-1512608   23  148840899   0.828109    0.823391    11128
SNP_A-1512550   23  148841154   1.717397    1.750476    11129
SNP_A-1518843   23  149078514   0.832285    0.99744     11130
SNP_A-1507809   23  149080794   1.463225    1.463085    11131
SNP_A-1519263   23  149309465   0.990172    1.124282    11132
SNP_A-1512795   23  149514662   1.37836     1.51924     11133
SNP_A-1518711   23  149890944   1.541307    1.920374    11134
SNP_A-1517959   23  150083331   0.535966    0.942863    11135

While trying to get the difference between column 4 and column 5, I got the
following error message:
"Argument "STD" isn't numeric in subtraction (-) at cpdiffer.pl line 19, <IN> line 1.
Argument "gli3ak" isn't numeric in subtraction (-) at cpdiffer.pl line 19, <IN> line 1."

MY QUESTION IS: why does Perl treat the numeric values in column 4 as strings, and
how do I convert them?

Thanks,

AG Lee
The code is as below:

#!/bin/perl -w

#use strict;
use warnings;

my $differ;


open (IN, "C:/perl/work/data/cpdata.txt") or die ("Can not open cpdata.txt file! \n");
open (OUT, ">C:/perl/work/data/cpout.txt") or die ("Can not open cpout.txt file! \n");
while(my $line = <IN>)
{
chomp $line;

my ($snp, $chro, $location, $gli3ak, $std, $rest) = split(/\t/, $line);
#print $line;
#print "$snp, $chro, $location, $gli3ak, $std, $rest\n";

$differ = $gli3ak-$std;

if(($differ >= 2) || ($differ <= -1.5))
{

print OUT "$snp\t$chro\t$location\t$gli3ak\t$std\t$rest\n";
}

}
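
The warnings name the strings "STD" and "gli3ak" and point at input line 1, which suggests that the file starts with a header row and that the header is what reaches the subtraction. Perl converts numeric strings to numbers automatically, so no explicit conversion should be needed for the data rows. A sketch that skips any row whose two columns are not numeric is shown here; filter_by_difference is an illustrative name and it takes already opened filehandles.

```perl
use strict;
use warnings;
use Scalar::Util qw(looks_like_number);

# Copy rows whose (column 4 - column 5) is >= 2 or <= -1.5 from $in to $out.
# Rows with non-numeric values in those columns (such as a header) are skipped.
sub filter_by_difference {
    my ($in, $out) = @_;
    my $kept = 0;
    while (my $line = <$in>) {
        chomp $line;
        my ($snp, $chro, $location, $gli3ak, $std, $rest) = split /\t/, $line;
        next unless defined $std
            && looks_like_number($gli3ak)
            && looks_like_number($std);
        my $differ = $gli3ak - $std;
        if ($differ >= 2 || $differ <= -1.5) {
            print {$out} "$line\n";
            $kept++;
        }
    }
    return $kept;
}
```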





How to go back to a line and do something with it in an input file?

2005-01-24 Thread Li, Aiguo (NIH/NCI)
hello, all.

I need to transform the column 3 data from an input file, as shown below,
into three values: 0, 0.4 and 1.
If there are more than 3 consecutive NoCall rows in column 3, change NoCall to
"1".  This rule has the highest precedence.
For the rows where column 2 is between 52 and 105 (the range comes from a file),
change the values in column 3 to "0.4".
For the rows outside this region, change the values in column 3 to "0".

Print the result to an output file.

I am struggling with how to go back and change column 3 of the earlier lines after
3 consecutive NoCall rows have been counted.

I currently read the input data into an array and used this line to go back
to the beginning of first NoCall line: my $nocalline = $contents[$line_num -
3];
Perl does not like it and an error message says: ""my" variable $nocalline
masks earlier declaration in same scope a
 line 38. Use of implicit split to @_ is deprecated at trynocall.pl line 38.
syntax error at trynocall.pl line 38, near ");"

Can anyone give some hints?

Thanks

Aiguo


input data is:  output data should be:
==  
1   22  AA  1   22  0
1   24  AA  1   24  0
1   26  NoCall  1   26  0
1   27  BB  1   27  0
1   30  AB  1   30  0
1   35  NoCall  1   35  1
1   40  NoCall  1   40  1
1   41  NoCall  1   41  1
1   42  NoCall  1   42  1
1   48  AB  1   48  0
1   50  AB  1   50  0
1   52  BB  1   52  0.4
1   53  NoCall  1   53  0.4
1   55  NoCall  1   55  0.4
1   56  BB  1   56  0.4
1   66  AA  1   66  0.4
1   70  NoCall  1   70  1
1   90  NoCall  1   90  1
1   99  NoCall  1   99  1
1   100 NoCall  1   100 1
1   101 NoCall  1   101 1
1   103 AA  1   103 0.4
1   105 BB  1   105 0.4
2   22  AA  2   22  0
2   24  BB  2   24  0
2   26  AB  2   26  0
2   27  BB  2   27  0
2   30  AA  2   30  0
2   35  AA  2   35  0
2   40  AA  2   40  0


the code I have
=
#!usr/local/bin/perl

use strict;
use warnings;

open (DATA, "C:/Aiguo_2004/SNP_Johnpark/work files/summary/testnocall1.txt")
or die "Can not open file \n";
open (LOH, "C:/Aiguo_2004/SNP_Johnpark/work files/summary/loh.txt") or die
"Can not open file \n";
open (OUT1, ">C:/perl/work/nocallout.txt") or "die can not open file \n";

my @contents = <DATA>;
my $chro_num;
my $lohbegin;
my $lohend;
my $line_num = 0;


while(my $lohline = <LOH>) #this file contains regions for changing to 0.4
{
($chro_num, $lohbegin, $lohend) = split(/\t/, $lohline);
foreach my $line (@contents) 
{
while($line =~ /^[1..9]/)
{
$line_num++;
#print $line_num;
(my $chro, my $position, my $call) = split (/\t/, $line);

if(($chro_num == $chro)&&($position >= $lohbegin) &&
($position <= $lohend))
{
call = 0.4;
unless($call =~ "NoCall")
{
my $n++;
my $m++;

if($n >= 3)
{
my $nocalline = $contents[$line_num - 3];
my ($nocall_chro, $nocall_begin, $nocall =
split(/\t/, $nocalline);
print "Nocall chro = $nocall_chro, Nocall
begin = $nocall_begin \n"; 
$nocall = "Nocall";
}
#else{$call = 0.4;}
print $call;
$contents[$line_num] = join('\t', $chro, $position,
$call);
}
}
#print "Chro:$chro, Position:$position, call=$call\n";
}
#print @contents;
}
}

close DATA;
close OUT1;
close LOH;
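
Since the whole file is already in memory, going back becomes unnecessary if the work is done in two passes: first assign 0 or 0.4 by region, then find each run of NoCall rows and overwrite the whole run with 1. The following is a sketch under stated assumptions: rows and regions are passed as array references, a run counts when it has at least $min_run rows (3 here, as in the posted code; the wording "more than 3" would mean 4), and runs do not cross chromosomes. The name transform_calls is hypothetical.

```perl
use strict;
use warnings;

# $rows:    [ [chro, position, call], ... ]
# $regions: [ [chro, begin, end], ... ]
# Returns a list of [chro, position, value] with value 0, 0.4 or 1.
sub transform_calls {
    my ($rows, $regions, $min_run) = @_;
    $min_run = 3 unless defined $min_run;
    my @value;

    # Pass 1: 0.4 inside any region, 0 elsewhere.
    for my $i (0 .. $#{$rows}) {
        my ($chro, $pos) = @{ $rows->[$i] };
        $value[$i] = 0;
        for my $region (@{$regions}) {
            my ($r_chro, $begin, $end) = @{$region};
            if ($chro == $r_chro && $pos >= $begin && $pos <= $end) {
                $value[$i] = 0.4;
                last;
            }
        }
    }

    # Pass 2: runs of consecutive NoCall rows take precedence.
    my $i = 0;
    while ($i <= $#{$rows}) {
        if ($rows->[$i][2] ne 'NoCall') { $i++; next; }
        my $j = $i;
        $j++ while $j < $#{$rows}
            && $rows->[ $j + 1 ][2] eq 'NoCall'
            && $rows->[ $j + 1 ][0] == $rows->[$i][0];
        if ($j - $i + 1 >= $min_run) {
            $value[$_] = 1 for $i .. $j;
        }
        $i = $j + 1;
    }
    return map { [ $rows->[$_][0], $rows->[$_][1], $value[$_] ] } 0 .. $#{$rows};
}
```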





Why the if loop does not work??

2004-12-09 Thread Li, Aiguo (NIH/NCI)

I am trying to merge two files based on the SNP_A-## list in each file. For
some reason, the regular expression in the if statement does not match, and I
cannot match the keys taken from the hash to the strings from the input
file.  Could anybody help me find the problem?

Thanks,

AG


#!usr/local/bin/perl

use strict;
use warnings;
open (DATA, "C:/perl/work/A172cn.txt") or die "Can not open file $!\n";
open (DATA2, "C:/perl/work/a127_gdas.txt") or die "Can not open file $! \n";

while(<DATA>)
{
my $mykey;
my $myvalue;
my %Hash;
my %mainhash = ();

next unless /^SNP/;
%Hash=getkeyvalue($mykey,$myvalue);

foreach $mykey (keys(%Hash)) 
{

my $inline ;

while($inline = <DATA2>)
{

next unless /SNP/;
#print "mykey $mykey my value: $Hash{$mykey} \n";
if($inline =~ m/($mykey)/) 
{
print "$mykey $Hash{$mykey} $inline \n";

}
}
}
}

sub getkeyvalue
{

my @line = ();
my $value;
my $col;


@line = split('\t', $_);

$col = $line[0];

chomp $col;
$value =join("\t", $line[1], $line[2]);
return ($col, $value);
}


#__DATA__
#SNP_A-1509443  3   3776202
#SNP_A_1518557  3   3776202
#SNP_A_1514538  5   5350951
#SNP_A_1516403  1   5483872
#BFFX-BioB-M_at  P P P P P A P
#[snip]

#__DATA2__
#SNP ID dbSNP RS ID Chromosome  Physical Position   TSC ID
A172_Call   A172_Call Zone
#7085   SNP_A-1509443   rs1393064   1   2882121 TSC0565952  AA
0.02861
#4900   SNP_A-1518557   rs9663211   3985402 TSC0273278  AA
0.152388
#8258   SNP_A-1517286   rs1599169   1   4804829 TSC0694296  BB
0.538696
#10947  SNP_A-1516024   rs5803091   4982250 TSC1478148  AA
0.569713
#7794   SNP_A-1514538   rs1414379   1   5468765 TSC0609730  AA
0.299872
#9130   SNP_A-1516403   rs1890191   1   5596686 TSC0913001  AA
0.221319
#7214   SNP_A-1518687   rs1396904   1   6605831 TSC0574502  BB
0.040226
#526SNP_A-1509959   rs9504931   6654350 TSC0042354  BB
0.123611
#4345   SNP_A-1515791   rs8452631   7133863 TSC0218512
NoCall  0.814947
#7914   SNP_A-1512212   rs1418490   1   7134783 TSC0617931  BB
0.077556
#4470   SNP_A-1513560   rs7056951   7145191 TSC0246331  AA
0.700697
#8386   SNP_A-1519671   rs2286511   7620645 TSC0730553  AA
0.09444
#4854   SNP_A-1515942   rs9661341   8082754 TSC0272985  BB
0.212891
#637SNP_A-1509129   rs2054741   10542407TSC0043572
BB  0.122514
#9481   SNP_A-1512107   rs1281034   1   10706737TSC0984465
NoCall  10
#432SNP_A-1514390   rs7182061   11004020TSC0041639
BB  0.66461
#10471  SNP_A-1518041   rs2206321   1   12221853TSC1262794
AA  0.058009
#[snip]
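
A few things in the posted code look likely to prevent matches: the inner loop reads the second file to its end for the first key, so later keys find nothing left to read; "next unless /SNP/" tests $_ rather than $inline; and some ids in the first sample are written with an underscore (SNP_A_1518557) where the second file uses a hyphen (SNP_A-1518557). A sketch that loads file 1 into a hash once and then makes a single pass over file 2 with an exact lookup is given below. It assumes the id is the second tab-separated column of file 2, as the sample suggests; merge_by_id is an illustrative name.

```perl
use strict;
use warnings;

# $fh1: id, chromosome, location per line; $fh2: id in the second column.
# Returns merged lines for ids present in both files.
sub merge_by_id {
    my ($fh1, $fh2) = @_;
    my %value_for;
    while (my $line = <$fh1>) {
        chomp $line;
        next unless $line =~ /^SNP/;
        my ($id, $chro, $loc) = split /\t/, $line;
        next unless defined $loc;
        $value_for{$id} = join "\t", $chro, $loc;
    }

    my @merged;
    while (my $line = <$fh2>) {
        chomp $line;
        my @fields = split /\t/, $line;
        my $id = $fields[1];    # assumed position of the SNP id in file 2
        next unless defined $id && exists $value_for{$id};
        push @merged, join "\t", $id, $value_for{$id}, $line;
    }
    return @merged;
}
```

A hash lookup with exists also avoids treating the key as a regular expression, where characters such as "." or "-" inside brackets could take on special meaning.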





Merge two files

2004-12-07 Thread Li, Aiguo (NIH/NCI)

Hello, all.

I need to merge two files according to an id list in each file.  What I am
trying to do is to have a subroutine that gets the id from data file 1 as
the key for a hash and a second subroutine that puts the value from data file 1 into
the hash.  Then, while reading data file 2 in a while loop, I match
the hash key to the id on each line; if it matches, I print $_ and $key =>
$value from the hash.

I am not sure whether my logic is right, and I am having trouble
achieving my second goal: filling data into the hash.  Could anyone see what is
wrong?

Thanks,

#!usr/local/bin/perl

use strict;
use warnings;



while()
{
my $mykey;
my $myvalue;
my @a;

my %Hash = ();
next unless /^SNP/;
@a = getkeyvalue();
$mykey = $a[0];
$myvalue = $a[1];
#print "a0 $mykey, a1 $myvalue \n";
%Hash=($mykey,$myvalue);
print "Hash: $mykey is $myvalue";
return %Hash
}


sub getkeyvalue 
{
my $line ;
my @line = ();
my $value;
my $col;

@line = split('\t', $_);

#print @line;

$col = $line[0];

chomp $col;
#print "key = $col, $line[0]";
$value = join("\t", @line[1..2]);
#print "value =: $value";
return ($col, $value);
}
__DATA1__

ids             chro_num    Location
SNP_A-1509443   1   2672921
SNP_A-1518557   3   3776202
SNP_A-1516024   1   4798845
SNP_A-1514538   5   5350951
SNP_A-1516403   1   5483872
SNP_A-1518687   2   6493017
SNP_A-1509959   1   6541536
SNP_A-1512212   4   7021969
SNP_A-1513560   1   7032377
SNP_A-1519671   5   7507831
[snip]  

__DATA2__
SNP ID  dbSNP RS ID Chromosome  Physical Position   TSC ID
A172_Call   A172_Call Zone
7085SNP_A-1509443   rs1393064   1   2882121 TSC0565952  AA
0.02861
4900SNP_A-1518557   rs9663211   3985402 TSC0273278  AA
0.152388
8258SNP_A-1517286   rs1599169   1   4804829 TSC0694296  BB
0.538696
10947   SNP_A-1516024   rs5803091   4982250 TSC1478148  AA
0.569713
7794SNP_A-1514538   rs1414379   1   5468765 TSC0609730  AA
0.299872
9130SNP_A-1516403   rs1890191   1   5596686 TSC0913001  AA
0.221319
7214SNP_A-1518687   rs1396904   1   6605831 TSC0574502  BB
0.040226
526 SNP_A-1509959   rs9504931   6654350 TSC0042354  BB
0.123611
4345SNP_A-1515791   rs8452631   7133863 TSC0218512
NoCall  0.814947
7914SNP_A-1512212   rs1418490   1   7134783 TSC0617931  BB
0.077556
4470SNP_A-1513560   rs7056951   7145191 TSC0246331  AA
0.700697
8386SNP_A-1519671   rs2286511   7620645 TSC0730553  AA
0.09444
4854SNP_A-1515942   rs9661341   8082754 TSC0272985  BB
0.212891
637 SNP_A-1509129   rs2054741   10542407TSC0043572
BB  0.122514
9481SNP_A-1512107   rs1281034   1   10706737TSC0984465
NoCall  10
432 SNP_A-1514390   rs7182061   11004020TSC0041639
BB  0.66461
10471   SNP_A-1518041   rs2206321   1   12221853TSC1262794
AA  0.058009
[snip]
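
On the hash-filling step: in the posted loop, %Hash is declared inside the while block, so it starts empty on every line; the assignment %Hash=($mykey,$myvalue) replaces the whole hash rather than adding to it; and return is only valid inside a subroutine. A sketch that accumulates entries instead is below. It keeps the getkeyvalue name from the post but passes the line as an argument; load_table is a hypothetical helper.

```perl
use strict;
use warnings;

# Split one line into (id, "chro<TAB>location").
sub getkeyvalue {
    my ($line) = @_;
    chomp $line;
    my ($id, @rest) = split /\t/, $line;
    my $value = join "\t", grep { defined } @rest[0, 1];
    return ($id, $value);
}

# Read all SNP lines from a filehandle into one hash.
sub load_table {
    my ($fh) = @_;
    my %hash;                        # declared once, outside the loop
    while (my $line = <$fh>) {
        next unless $line =~ /^SNP/;
        my ($key, $value) = getkeyvalue($line);
        $hash{$key} = $value;        # adds one entry, keeps the others
    }
    return %hash;
}
```

The returned hash can then be consulted with exists $hash{$id} while reading the second file.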





RE: Where is it wrong with my code

2004-12-02 Thread Li, Aiguo (NIH/NCI)
Never mind.  It is working now.

Thanks,

Aiguo

-----Original Message-----
From: Li, Aiguo (NIH/NCI) 
Sent: Thursday, December 02, 2004 10:11 AM
To: Perl Beginners
Subject: Where is it wrong with my code


Hello, all.

I am trying to assign a "P" for any values greater than 1.0 and assign a "A"
otherwise.  However, I need to skip the header line and the first column.
Something is wrong with my code and it does not skip the first column well.
Please help me to detect the bug.

Thanks,

Aiguo

#!/usr/bin/perl
use warnings;
use strict;

while (<DATA>) {
next unless /_at/;
my @groupPA = split(/\t/, $_);
print " \t @groupPA \n";
foreach my $groupPA (@groupPA){
my $call;
my @PAcall;
next unless $groupPA =~ (/\d+/);
print $groupPA;
#print $_;
if($groupPA =~ /\d+/ && $groupPA >= 1 ){$call = "P";}
elsif($groupPA =~ /\d+/ && $groupPA < 1){$call ="A";}
push (@PAcall, $call);
#print "@PAcall \n";
}
}
__DATA__
TypeHF  LY  M   MSS NC  NR  N   D
CC  CR  MS  T98 U87
1405_i_at   1.2 0   0   0   0   0   0
1   0   0   0   0   0
1431_at 1.2 0   1   2   0   0.5 1.5 2
2   1.3 2   0   1.3
1438_at 0.4 0.7 2   0   2   2   0
2   2   2   2   0   0
1487_at 2   2   2   2   2   2   2   2
2   2   2   2   2
1494_f_at   0.4 0   0   1   0   0   0
1   1   0   0   0.7 0.7
1598_g_at   2   2   2   2   2   2   2
2   2   2   2   2   2
160020_at   2   2   2   2   0   1   2
2   2   2   2   2   2
1729_at 2   2   2   2   0   0   0   2
2   2   2   2   2
1773_at 1.2 2   2   0   0   0   0   2
1   1.3 2   0   2
177_at  1.2 1.3 2   2   2   1   1
2   2   1.3 2   2   2
179_at  2   2   2   2   2   1   1.5 2
2   2   1   2   1.3
[snip] 





Where is it wrong with my code

2004-12-02 Thread Li, Aiguo (NIH/NCI)
Hello, all.

I am trying to assign a "P" to any value greater than 1.0 and an "A"
otherwise.  However, I need to skip the header line and the first column.
Something is wrong with my code, and it does not skip the first column properly.
Please help me find the bug.

Thanks,

Aiguo

#!/usr/bin/perl
use warnings;
use strict;

while (<DATA>) {
next unless /_at/;
my @groupPA = split(/\t/, $_);
print " \t @groupPA \n";
foreach my $groupPA (@groupPA){
my $call;
my @PAcall;
next unless $groupPA =~ (/\d+/);
print $groupPA;
#print $_;
if($groupPA =~ /\d+/ && $groupPA >= 1 ){$call = "P";}
elsif($groupPA =~ /\d+/ && $groupPA < 1){$call ="A";}
push (@PAcall, $call);
#print "@PAcall \n";
}
}
__DATA__
TypeHF  LY  M   MSS NC  NR  N   D
CC  CR  MS  T98 U87
1405_i_at   1.2 0   0   0   0   0   0
1   0   0   0   0   0
1431_at 1.2 0   1   2   0   0.5 1.5 2
2   1.3 2   0   1.3
1438_at 0.4 0.7 2   0   2   2   0
2   2   2   2   0   0
1487_at 2   2   2   2   2   2   2   2
2   2   2   2   2
1494_f_at   0.4 0   0   1   0   0   0
1   1   0   0   0.7 0.7
1598_g_at   2   2   2   2   2   2   2
2   2   2   2   2   2
160020_at   2   2   2   2   0   1   2
2   2   2   2   2   2
1729_at 2   2   2   2   0   0   0   2
2   2   2   2   2
1773_at 1.2 2   2   0   0   0   0   2
1   1.3 2   0   2
177_at  1.2 1.3 2   2   2   1   1
2   2   1.3 2   2   2
179_at  2   2   2   2   2   1   1.5 2
2   2   1   2   1.3
[snip] 
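
A likely cause: probe names such as 1405_i_at contain digits, so the /\d+/ test matches the first column too, and @PAcall is declared inside the inner loop, so it is emptied for every value. Taking the first field off the list before looping avoids both. In this sketch pa_calls is an illustrative name, every field after the first is assumed to be numeric, and the threshold follows the posted code (>= 1) rather than the "greater than 1.0" wording.

```perl
use strict;
use warnings;

# Turn one tab-separated data line into (probe name, list of P/A calls).
sub pa_calls {
    my ($line) = @_;
    chomp $line;
    my ($probe, @values) = split /\t/, $line;    # first column set aside
    my @calls = map { $_ >= 1 ? 'P' : 'A' } @values;
    return ($probe, @calls);
}
```

Inside the read loop, "next unless /_at/" still skips the header, and each data line can then be handled with my ($probe, @PAcall) = pa_calls($_).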





Print an array without the duplicate

2004-11-18 Thread Li, Aiguo (NIH/NCI)

Hi, all.

I need to print a header line without duplicated items and am struggling with
the elimination of the repeated items.  The following code is not working
yet.  Could anyone help me make this work?

#!usr/bin/perl

use strict;
use warnings;

my @header;
my $i ;
my $size = @header;


while (<DATA>) {
  if (/Type/) 
{
@header = split(/\t/, $_);
print $_;
for ($i=1; $i<= $size; $i++)
{
if (my @item[i] ne @item[i+1])
{pirnt $item[i+1];}
}
}
  } 
  
  
 __DATA__
 Type   HF  HF  HF  HF  LY  LY  LY
 [snip]

 Output should be
Type    HF  LY

Thanks,

Aiguo
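
A common idiom for this is a %seen hash combined with grep, which keeps the first occurrence of each item and preserves the original order. The sketch below uses a made-up helper name, unique_in_order. As a side note on the posted code, $size is computed while @header is still empty, so the for loop would not run at all.

```perl
use strict;
use warnings;

# Return the arguments with later duplicates removed, order preserved.
sub unique_in_order {
    my %seen;
    return grep { !$seen{$_}++ } @_;
}
```

For the header line this could be used as print join("\t", unique_in_order(split /\t/, $line)), "\n" after chomp.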





RE: Why doesn't this work?

2004-11-11 Thread Li, Aiguo (NIH/NCI)

Hi, Zeus.

Thanks for your comment.  "Replicates" line is the header line and there are
two treatments in this case.

Probe id        Treat1  Treat1  Treat1  Treat1  Treat2  Treat2  Treat2
AFFX-BioB-5_at  P   P   P   P   P   P   P
AFFX-BioB-M_at  P   P   M   P   P   M   P
AFFX-BioB-3_at  P   M   P   A   A   P   P
AFFX-BioC-5_at  P   P   A   M   P   A   M
AFFX-BioC-3_at  M   M   P   A   P   P   M

The output should be as follows, without the calculation steps:

Probe id        Treat1            Treat2
AFFX-BioB-5_at  (2p +M)/4 =2      (2*3+0)/3=2
AFFX-BioB-M_at  (2*3+0)/4 =1.7    (2*3+0)/3=2
AFFX-BioB-3_at  (2*2+0)/4 =1      (2*2+0)/3=1.3
AFFX-BioC-5_at  (2*2+1)/4 =1.25   (2*1+1)/3=1
AFFX-BioC-3_at  (2*1+1)/4 = 0.75  (2*2+1)/3=1.7

The denominator is always the # of replicates in each treatment.

Thanks,

Aiguo



-----Original Message-----
From: Zeus Odin [mailto:[EMAIL PROTECTED] 
Sent: Thursday, November 11, 2004 8:08 AM
To: [EMAIL PROTECTED]
Subject: Re: Why doesn't this work?


"Aiguo Li" <[EMAIL PROTECTED]> wrote in message...
> Hello,

Hello.

> I have the following dataset and want to calculate a P/A ratio for 
> each replicates in the dataset. In this case, treatment 1 has 4 
> replicats and treatment2 has 3 replicates. The P/A = [((#of P)*2) + (# 
> of M)]/# of replicates.  The output should be two columns of P/A 
> ratios for two treatments.

Your explanation doesn't make much sense to me. Maybe it's just me.
;-)

Is the line starting with "Replicates" a header row? I gleaned from your
code that

P/A = (2p + m)/a

where p, m, and a equal the number of P's, M's, and A's respectively on a
row. If this is true, what is to be done when a row is missing A, M, or P?
Of course, if A is missing then the equation is undefined because you have
zero in the denominator. You also state that the output should be two
columns, yet in your code you print $a, $m, and $p but nothing else.

It would be helpful if you took one record from below then manually compute
the two columns you want printed.

>
>
> I have made this far with the following code, but have not been able 
> to
make
> the code work yet.  Could anybody shed some light on it?
>
> Thanks,
>
> AG
>
> #!usr/bin/perl -w
>
> use strict;
> use warnings;
>
> my @split;
> my @replicate = ( 5, 3);
> my @ratio;
> my $p=0;
> my $m=0;
> my $a=0;
> my $rep=0;
> my $item;
> my $ratio;
>
>
> open (FILE, "
>
> while(  )
> {
> #print;
> chomp;
> @split = split (/\t/, $_);
> push (@split, $_);
> #print "$_ \n";
> foreach $rep (@replicate)
> {
> for(my $i=1; $i<=$rep; $i++)
> {
> push (@split, $_);
> SWITCH:
> if ($_ =~ "P") {$p++; last SWITCH;}
> if ($_ =~ "M") {$m++; last SWITCH;}
> if ($_ =~ "A") {$a++; last SWITCH;}
> }
> print $p, $m, $a;
> @ratio = (($p*2)+$m)/$rep;
>
>
>
>
> }
> }
>
> close FILE;







Why doesn't this work?

2004-11-09 Thread Li, Aiguo (NIH/NCI)
Hello,

I have the following dataset and want to calculate a P/A ratio for each
group of replicates in the dataset. In this case, treatment 1 has 4 replicates and
treatment 2 has 3 replicates. The P/A = [((# of P)*2) + (# of M)]/# of
replicates.  The output should be two columns of P/A ratios, one for each
treatment.

Dataset:
Replicates  1   1   1   1   2   2   2
AFFX-BioB-5_at  P   P   P   P   P   P   P
AFFX-BioB-M_at  P   P   P   P   P   P   P
AFFX-BioB-3_at  P   P   P   A   A   P   P
AFFX-BioC-5_at  P   P   P   P   P   P   P
AFFX-BioC-3_at  P   P   P   P   P   P   P
AFFX-BioDn-5_at P   P   M   P   P   P   P
AFFX-BioDn-3_at P   P   P   P   P   P   P
AFFX-CreX-5_at  P   P   P   P   P   P   P
AFFX-CreX-3_at  P   P   P   P   P   P   P
AFFX-DapX-5_at  A   A   P   A   A   P   A
AFFX-DapX-M_at  A   A   A   A   A   A   A
AFFX-DapX-3_at  A   A   A   A   A   A   A
AFFX-LysX-5_at  A   A   A   P   A   A   A
AFFX-LysX-M_at  A   A   A   A   A   A   A
AFFX-LysX-3_at  A   A   A   A   P   M   A
AFFX-PheX-5_at  A   A   A   A   A   A   A
AFFX-PheX-M_at  A   A   A   A   A   A   A
AFFX-PheX-3_at  A   A   A   A   A   A   A


I have made it this far with the following code, but have not been able to make
the code work yet.  Could anybody shed some light on it?

Thanks,

AG

#!usr/bin/perl -w

use strict;
use warnings;

my @split;
my @replicate = ( 5, 3);
my @ratio;
my $p=0;
my $m=0; 
my $a=0; 
my $rep=0;
my $item;
my $ratio;


open (FILE, " ) 
{
#print;
chomp;
@split = split (/\t/, $_);
push (@split, $_);
#print "$_ \n";
foreach $rep (@replicate)
{
for(my $i=1; $i<=$rep; $i++)
{
push (@split, $_);
SWITCH:
if ($_ =~ "P") {$p++; last SWITCH;}
if ($_ =~ "M") {$m++; last SWITCH;}
if ($_ =~ "A") {$a++; last SWITCH;}
}
print $p, $m, $a;
@ratio = (($p*2)+$m)/$a;




}
}

close FILE;
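
The formula can be applied by cutting each row's calls into consecutive groups, one per treatment, and counting P and M within each group. The following sketch assumes the calls arrive as an array reference with the probe name already removed and that the group sizes (4 and 3 in the description) are supplied by the caller; pa_ratios is an illustrative name.

```perl
use strict;
use warnings;

# $calls:      [ 'P', 'P', 'M', ... ] for one probe
# $replicates: [ 4, 3 ] number of replicates per treatment, in column order
# Returns one ratio per treatment: (2 * #P + #M) / #replicates.
sub pa_ratios {
    my ($calls, $replicates) = @_;
    my @ratios;
    my $start = 0;
    for my $n (@{$replicates}) {
        my @group = @{$calls}[ $start .. $start + $n - 1 ];
        my $p = grep { $_ eq 'P' } @group;    # grep in scalar context counts
        my $m = grep { $_ eq 'M' } @group;
        push @ratios, (2 * $p + $m) / $n;
        $start += $n;
    }
    return @ratios;
}
```

Per line of the file this could be called as my ($probe, @calls) = split /\t/, $line, followed by pa_ratios(\@calls, [4, 3]). Counting per group also keeps the totals from carrying over between rows, which the running $p, $m and $a counters in the posted code would do.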





P/A ratio

2004-11-09 Thread Li, Aiguo (NIH/NCI)
Hi, all.

I have the following dataset and need to calculate a P/A ratio for each
group of replicates in the dataset. In this case, treatment 1 has 4 replicates and
treatment 2 has 3 replicates. The P/A = [((# of P)*2) + (# of M)]/# of
replicates.  The output should be two columns of P/A ratios, one for each
treatment.

Dataset:
Replicates  1   1   1   1   2   2   2
AFFX-BioB-5_at  P   P   P   P   P   P   P
AFFX-BioB-M_at  P   P   P   P   P   P   P
AFFX-BioB-3_at  P   P   P   A   A   P   P
AFFX-BioC-5_at  P   P   P   P   P   P   P
AFFX-BioC-3_at  P   P   P   P   P   P   P
AFFX-BioDn-5_at P   P   M   P   P   P   P
AFFX-BioDn-3_at P   P   P   P   P   P   P
AFFX-CreX-5_at  P   P   P   P   P   P   P
AFFX-CreX-3_at  P   P   P   P   P   P   P
AFFX-DapX-5_at  A   A   P   A   A   P   A
AFFX-DapX-M_at  A   A   A   A   A   A   A
AFFX-DapX-3_at  A   A   A   A   A   A   A
AFFX-LysX-5_at  A   A   A   P   A   A   A
AFFX-LysX-M_at  A   A   A   A   A   A   A
AFFX-LysX-3_at  A   A   A   A   P   M   A
AFFX-PheX-5_at  A   A   A   A   A   A   A
AFFX-PheX-M_at  A   A   A   A   A   A   A
AFFX-PheX-3_at  A   A   A   A   A   A   A

use strict;
use warnings;

my @split;
my @replicate = ( 4, 3);
my @ratio;
my $p=0, $m=0, $a=0, $rep=0, $i;

open (IN, "C:\replicate.txt") or die "can not open file\n";

while (<IN>) {
@split = split (/\t/, $_);
foreach $rep (@replicate)
{
for(my $i=0; $i<$rep; $i++)
{
foreach $item (@split);
{
if ($item == "P")
{
$p = $p +1;
}
elseif ($item == "M")
{
$m = $m +1;
}
else ($item == "A")
{
$a = $a +1;
}
@ratio = (($p*2)+$m)/$a;
print push (@ratio, $ratio);
}
}
}
}

I have not been able to make this work yet.  I would be appreciative of any
suggestions or help.

Thx in advance,

Aiguo Li






Generate a perl list file

2004-10-19 Thread Li, Aiguo (NIH/NCI)

Dear all.

I have a tab-delimited file as follows:

V   name    p
1.0 AAA 0.001
0.9 BBB 0.003
0.8 CCC 0.004
.

I need to convert the file into following format:
{   labels =
(
{v="1.0"; name = "AAA"; p = "0.001"; },
{v="0.9"; name = "BBB"; p = "0.003";},
{v="0.8"; name = "CCC"; p = "0.004";}
);
}

I have not been able to make the following code work yet and would like to
hear your suggestion for a better option.  The following is my thought at
this point.

print " {   labels =\n";
print "\t\t (\n";
open IN "tag.txt";
while (<IN>) {
@line = split(/\t/);
print "\t v=""@line[0]";";
print "l=""@line[1]";";
print " p= "@line[2]";";
};

Thanks,

Aiguo
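
The quoting is the hard part here: double quotes inside a double-quoted string have to be escaped, or a different delimiter such as qq(...) can be used. A sketch that builds the whole block as a string is below. It assumes the first line of the input is the header row and that the keys are always v, name and p; format_labels is a made-up name.

```perl
use strict;
use warnings;

# Read tab-delimited "V name p" rows from $in and return the labels block.
sub format_labels {
    my ($in) = @_;
    my $header = <$in>;    # header row, not used in the output
    my @records;
    while (my $line = <$in>) {
        chomp $line;
        my ($v, $name, $p) = split /\t/, $line;
        next unless defined $p;    # skip blank or incomplete lines
        # qq(...) allows literal double quotes without escaping.
        push @records, qq({v="$v"; name = "$name"; p = "$p";});
    }
    return "{   labels =\n(\n" . join(",\n", @records) . "\n);\n}\n";
}
```

Collecting the records first and joining them with ",\n" avoids a trailing comma after the last record.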

-----Original Message-----
From: Alden Meneses [mailto:[EMAIL PROTECTED] 
Sent: Monday, October 18, 2004 11:50 AM
To: [EMAIL PROTECTED]
Subject: Re: help on comparing lines in a text file


thanks GH

here is my updated code

use strict;
use warnings;

my $file = 'C:\Documents and Settings\menesea\My
Documents\alden712\ibd100\ibd100.txt';
open(IBDIN, "<$file") || die "cannot open $file $!"; while(){
my @array_a = split ' ', <IBDIN>;
my @array_b = split ' ', <IBDIN>;
compare_array();
}
close(IBDIN);

sub compare_array {
my (%seen, @bonly);
@seen{@array_a} = (); # build lookup table
foreach my $item (@array_b){
push(@bonly, $item) unless exists $seen{$item};
}
print "@bonly \n";
}

I use komodo from activestate to help write my perl scripts. When I use the
while loop it complains about the @array_a and @array_b in the subroutine
thus I get a compile error when running.

can someone tell me what I am doing wrong?
TIA,
Alden
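
The compile error is probably a scoping issue: @array_a and @array_b are declared with my inside the while block, so the subroutine defined outside that block cannot see them, and under strict that is an error. Passing the arrays in as references sidesteps this. A minimal sketch, with only_in_second as an illustrative name:

```perl
use strict;
use warnings;

# Return the elements of the second list that do not occur in the first.
sub only_in_second {
    my ($first, $second) = @_;
    my %seen;
    @seen{ @{$first} } = ();    # build lookup table
    return grep { !exists $seen{$_} } @{$second};
}
```

Inside the loop this would be called as my @bonly = only_in_second(\@array_a, \@array_b).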

On Sat, 16 Oct 2004 22:34:59 +0200, Gunnar Hjalmarsson <[EMAIL PROTECTED]>
wrote:
> Alden Meneses wrote:
> > here is my updated code but it is not my loops are not set correctly 
> > as I get nothing when i print to screen.
> >
> > open(IBDINA, "<$file") || die "cannot open $file $!"; open(IBDINB, 
> > "<$file") || die "cannot open $file $!"; chomp(@list_a=);
> > chomp(@list_b=);
> > for ($a = 0; $a < @list_a; $a+=2){
> > @array_a=(split(/ /,$list_a[$a]));
> > for ($b =1; $b < @list_b; $b+=2){
> > @array_b=(split(/ /,$list_b[$b]));
> > @[EMAIL PROTECTED] = (); # build lookup table
> > foreach $item ($array_b){
> > push(@bonly, $item) unless exists $seen{$item};
> > print @bonly;
> > }
> > }
> > }
> >
> > close(IBDINA);
> > close(IBDINB);
> 
> Sorry to say it, but it looks terrible. ;-)
> 
> First and foremost, you are not using strictures and warnings. Posting 
> non-working code to a mailing list, without having had Perl perform 
> some basic checks, is bad, bad, bad.
> 
> Another thing is that you open two filehandles to the same file, and 
> unnecessarily complicates the assigning of the arrays. There is no 
> need for those outer loops.
> 
> This is a simplification of your code:
> 
> #!/usr/bin/perl
> use strict;
> use warnings;
> 
> my $file = '/path/to/file';
> 
> open IBDIN, "< $file" or die "cannot open $file $!";
> my @array_a = split ' ', ;
> my @array_b = split ' ', ;
> close IBDIN;
> 
> my (%seen, @bonly);
> @[EMAIL PROTECTED] = (); # build lookup table
> foreach my $item ($array_b){
> push(@bonly, $item) unless exists $seen{$item};
> print @bonly;
> }
> 
> __END__
> 
> Now, that code is not correct either. Actually, it doesn't even 
> compile since strictures are enabled, but just that fact illustrates 
> how using strict can help you detect a mistake.
> 
> Hopefully the above will help you move forward, and concentrate on the 
> comparison part of your program.
> 
> --
> 
> 
> Gunnar Hjalmarsson
> Email: http://www.gunnar.cc/cgi-bin/contact.pl
> 
> --
> To unsubscribe, e-mail: [EMAIL PROTECTED]
> For additional commands, e-mail: [EMAIL PROTECTED] 
>  
> 
>





Questions related to drawing graphs using Perl

2004-10-15 Thread Li, Aiguo (NIH/NCI)
Dear all.
 
I am a new user of Perl with some experience in data extraction with
pattern matching, but I have never made graphs with Perl.  I
need to draw a graph of chromosome copy number and p-values.  The data look
as follows:
 
SNP id          physical location   copy number   meta p-value
SNP_A-1507380   120264678           1.207953      -20
SNP_A-1507487   120319466           1.261954      -20
SNP_A-1517022   120783585           0.957751      -20
SNP_A-1511651   121478764           0.957812      -20
SNP_A-1516818   17549               2.004318      0.722914
SNP_A-1516270   122257923           3.298868      0.967391
SNP_A-1512747   122611515           1.419261      -14.5074
SNP_A-1519155   123556896           0.740608      -14.5074
SNP_A-1517539   124789024           1.202693      -14.5074
SNP_A-1511157   124890914           1.222511      -14.5074
 
The physical location should determine where the p-value and copy number
are placed along the vertical line.  The SNP id could be treated as
annotation.  I have some open source code, written in Perl, for drawing a chromosome
with cytobands.  The final goal is to put this copy number graph
beside the chromosome map, which makes me think that it will be
easier to draw the copy number map with Perl too.  My question to you is:
 
Is it possible to create a graph like this using Perl?
 
The graph I have in mind should look like this, and it would be better to use
histogram bars for the meta p-values.
 
 
---|
 --|
   |
  -|
 
Thanks in advance,
 
Aiguo Li
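
It is possible. One library-free option is to have Perl write an SVG file, which is plain text that browsers and many drawing programs can display. The sketch below is only an illustration of the idea: it draws one horizontal bar per SNP, positioned along a vertical axis by physical location and with a length proportional to copy number. The function name copy_number_svg, the default sizes and the pixel scale are all assumptions made for this example.

```perl
use strict;
use warnings;

# $rows: [ [snp_id, location, copy_number], ... ]
# Returns an SVG document as a string.
sub copy_number_svg {
    my ($rows, %opt) = @_;
    @{$rows} or die "No rows to draw";
    my $width  = $opt{width}  || 300;
    my $height = $opt{height} || 400;
    my $scale  = $opt{scale}  || 40;    # pixels per copy-number unit

    my @loc = sort { $a <=> $b } map { $_->[1] } @{$rows};
    my ($min, $max) = @loc[0, -1];
    my $span = ($max - $min) || 1;      # avoid dividing by zero

    my $svg = qq(<svg xmlns="http://www.w3.org/2000/svg" width="$width" height="$height">\n);
    # Vertical axis on the right edge, standing in for the chromosome.
    $svg .= qq(<line x1="$width" y1="0" x2="$width" y2="$height" stroke="black"/>\n);
    for my $row (@{$rows}) {
        my ($id, $location, $copy) = @{$row};
        my $y   = sprintf '%.1f', ($location - $min) / $span * ($height - 2);
        my $len = sprintf '%.1f', $copy * $scale;
        my $x   = sprintf '%.1f', $width - $len;
        # The SNP id is attached as a tooltip-style annotation.
        $svg .= qq(<rect x="$x" y="$y" width="$len" height="2"><title>$id</title></rect>\n);
    }
    return $svg . "</svg>\n";
}
```

Printing the returned string to a file with an .svg extension gives a picture that could be placed next to a chromosome map; a second set of bars for the meta p-values could be added in the same way.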