Position Weight Matrix of Set of Strings with Perl

2006-09-06 Thread Wijaya Edward

Dear Experts,
 
I am looking for a really efficient way to compute a position weight matrix 
(PWM) from a set of strings. Within each set, all strings have the same length. 
Basically, a PWM gives the frequency (or probability) with which each base 
[ATCG] occurs in each position/column of the strings. For example, take the 
set of strings below:
 
AAA
ATG
TTT
GTC

Note that the length of the strings in a set 
may be greater than 3. 

Would give the following result: 
 
$VAR1 =  {
'A' => [2,1,1],
'T' => [1,3,1],
'C' => [0,0,1],
'G' => [1,0,1]
 };
 
So the size of each array is the same as the length of the strings.
In my case I need a variation of it, namely the probability of 
each base occurring at a particular position:

$VAR = {
'A' => ['0.5','0.25','0.25'],
'T' => ['0.25','0.75','0.25'],
'C' => ['0','0','0.25'],
'G' => ['0.25','0','0.25']
  }
 
At this link you can find my incredibly naive and inefficient code. 
Can anybody suggest a better and faster solution than this:
 
http://www.rafb.net/paste/results/c6T7B629.html
 
 
Thanks and Regards,
Edward WIJAYA
SINGAPORE

 Institute For Infocomm Research - Disclaimer -
This email is confidential and may be privileged.  If you are not the intended 
recipient, please delete it and notify us immediately. Please do not copy or 
use it for any purpose, or disclose its contents to any other person. Thank you.


-- 
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
 




Re: pipe as an argument

2006-09-06 Thread Dr.Ruud
"Budi Milis" schreef:

> How do accept pipe as an valid argument in perl, for example:
>
> echo 123456 | ./convert_time.pl
>
> convert_time.pl:
> #!/usr/bin/perl
>
> use POSIX qw(strftime);
>
> my $time_in = $ARGV[0];
> my $time_out = strftime "%Y%m%d", localtime($time_in);
> print "$time_out\n";


I don't really understand your question, but I guess you are looking for
this:

  chomp( my $time_in = <STDIN> ) ;
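
For readers following along, here is a minimal sketch of how the original script could take the timestamp either from the command line or from a pipe. The helper name read_time_in is made up for illustration, and it accepts a filehandle as a parameter only so that it can be tried without a real pipe.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use POSIX qw(strftime);

# Hypothetical helper: take the epoch time from the argument list if one
# was given, otherwise read one line from the supplied filehandle.
sub read_time_in {
    my ($args, $fh) = @_;
    my $time_in = @$args ? $args->[0] : <$fh>;
    chomp $time_in if defined $time_in;
    return $time_in;
}

# Demonstrate with an in-memory filehandle standing in for the pipe.
my $input = "123456\n";
open my $fake_pipe, '<', \$input or die "in-memory open failed: $!";
my $time_in = read_time_in([], $fake_pipe);
print strftime("%Y%m%d", localtime $time_in), "\n";
```

In the real script the call would presumably be read_time_in(\@ARGV, \*STDIN).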

-- 
Affijn, Ruud

"Gewoon is een tijger."



 




Re: Position Weight Matrix of Set of Strings with Perl

2006-09-06 Thread Mumia W.

On 09/06/2006 04:02 AM, Wijaya Edward wrote:

> Dear Experts,
>
> I am looking for a really efficient way to compute a position weight matrix
> (PWM) from a set of strings. In each set the strings are of the same length.
> Basically PWM compute the frequency (or probabilities) of bases [ATCG] occur
> in each position/column of a string. For example the set of strings below:
>
> AAA
> ATG
> TTT
> GTC
>
> Note that the length of these strings in the set
> maybe greater than 3.
>
> Would give the following result:
>
> $VAR1 =  {
> 'A' => [2,1,1],
> 'T' => [1,3,1],
> 'C' => [0,0,1],
> 'G' => [1,0,1]
>  };
>
> So the size of the array is the same with the length of the string.
> In my case I need the variation of it, namely the probability of the
> each base occur in the particular position:
>
> $VAR = {
> 'A' => ['0.5','0.25','0.25'],
> 'T' => ['0.25','0.75','0.25'],
> 'C' => ['0','0','0.25'],
> 'G' => ['0.25','0','0.25']
>   }
>
> In this link you can  find my incredibly naive and inefficient code.
> Can any body suggest a better and faster solution than this:
>
> http://www.rafb.net/paste/results/c6T7B629.html
>
> Thanks and Regards,
> Edward WIJAYA
> SINGAPORE



Although I'm sure that smarter posters than I will turn this 
into a one-liner, I think that my solution is not so atrocious:


use strict;
use warnings;
use Data::Dumper;
local our @deep;
local $; = ','; # A vestige of a previous version

my @data = qw(AAA ATG TTT GTC);
my @d2 = map [ split // ], @data;

my (%hash);
for my $entry (@d2) {
    *deep = $entry;
    for my $nx (0..$#deep) {
        $hash{$deep[$nx]}[$nx]++;
    }
}
foreach my $entry (values %hash) {
    $entry = [ map defined $_ ? $_ : 0, @$entry ];
}
print Dumper(\%hash);

__HTH__


 




Re: Position Weight Matrix of Set of Strings with Perl

2006-09-06 Thread Rob Dixon

Wijaya Edward wrote:

> Dear Experts,
>
> I am looking for a really efficient way to compute a position weight matrix
> (PWM) from a set of strings. In each set the strings are of the same length.
> Basically PWM compute the frequency (or probabilities) of bases [ATCG] occur
> in each position/column of a string. For example the set of strings below:
>
> AAA
> ATG
> TTT
> GTC
>
> Note that the length of these strings in the set
> maybe greater than 3.
>
> Would give the following result:
>
> $VAR1 =  {
> 'A' => [2,1,1],
> 'T' => [1,3,1],
> 'C' => [0,0,1],
> 'G' => [1,0,1]
>  };
>
> So the size of the array is the same with the length of the string.
> In my case I need the variation of it, namely the probability of the
> each base occur in the particular position:
>
> $VAR = {
> 'A' => ['0.5','0.25','0.25'],
> 'T' => ['0.25','0.75','0.25'],
> 'C' => ['0','0','0.25'],
> 'G' => ['0.25','0','0.25']
>   }
>
> In this link you can  find my incredibly naive and inefficient code.
> Can any body suggest a better and faster solution than this:
>
> http://www.rafb.net/paste/results/c6T7B629.html

Hi Edward.

A nice little problem. Thank you.

The main reason for the length of your own solution is that you haven't taken
the opportunity to use hashes to store data that is parallel across the four
possible characters, so the code is about four times as long as it needs to be!

Here is my solution. I have written it to pull data from the pseudo-filehandle
DATA, as it is unlikely that you will want your actual data hard-coded as an
array.

HTH.

Rob Dixon


use strict;
use warnings;

my %pwm;

while (<DATA>) {
  my $col = 0;
  foreach my $c (/\S/g) {
    $pwm{$c}[$col++]++;
  }
}

foreach my $freq (values %pwm) {
  $_ = $_ ? $_ / keys %pwm : 0 foreach @$freq;
}

use Data::Dumper;
print Dumper \%pwm;


__END__
AAA
ATG
TTT
GTC


OUTPUT


$VAR1 = {
  'A' => [
   '0.5',
   '0.25',
   '0.25'
 ],
  'T' => [
   '0.25',
   '0.75',
   '0.25'
 ],
  'C' => [
   0,
   0,
   '0.25'
 ],
  'G' => [
   '0.25',
   0,
   '0.25'
 ]
};

 




Re: Need Help

2006-09-06 Thread Ashok Varma

Hi Kishore,

The below snippet will get your desired result.

---
#!/usr/bin/perl
use strict ;

my @structure_name = ();
my $fni = 'd:\Sample.txt' ;
my $fno = 'd:\ashok.txt' ;
my $flag = 0;

open my $fhi, '<', $fni or die "open '$fni' failed: $!" ;
open my $fho, '>', $fno or die "open '$fno' failed: $!" ;

while (<$fhi>) {
    if ($flag == 1 && $_ !~ /\}/) {
        push @structure_name, $_ if (grep {/NEED*/i} $_);
        next;
    } else {
        $flag = 0;
    }
    if ($_ =~ /knk.+pmk.*/g || $_ =~ /pmk.+knk.*/g) {
        push @structure_name, $_;
        $flag = 1;
    }
}

map {$_ =~ s/(\s| )+//g} @structure_name;

foreach my $need (@structure_name) {
    my ($key, $value) = split(/=/, $need);
    if ($value eq '{') {
        $key =~ s/some//;
        print $fho "Structure : $key\n";
    } elsif ($key =~ /NEED/) {
        print $fho "$key : $value\n";
    }
}

close $fho;
close $fhi ;
---

Let me know if you find it difficult to understand.  Hope script is not that
complicated to understand  :).  Have a nice time, Njo.
Rudd - Please do let me know how good i can optimize above script.

:o)
Ashok

On 9/4/06, Nagakishor, K <[EMAIL PROTECTED]> wrote:


I have a file containing 100's of structures in it. In that file I need
to identify the structures with particular name (ex: N1 AND N2) and dump
the values of only some fields (ex: P1 AND P2) in to another file.

If a structure does not contain the names N1 AND N2, then we should skip
it.



May anyone have a code for this or any idea of how to do this?





Re: Position Weight Matrix of Set of Strings with Perl

2006-09-06 Thread Dr.Ruud
Rob Dixon schreef:

> use strict;
> use warnings;
>
> my %pwm;
>
> while (<DATA>) {
>   my $col = 0;
>   foreach my $c (/\S/g) {
>     $pwm{$c}[$col++]++;
>   }
> }
>
> foreach my $freq (values %pwm) {
>   $_ = $_ ? $_ / keys %pwm : 0 foreach @$freq;
> }
>
> use Data::Dumper;
> print Dumper \%pwm;
>
>
> __END__
> AAA
> ATG
> TTT
> GTC


Is "keys %pwm" the right divisor, or should it be the number of lines?


Variant:

#!/usr/bin/perl
  use warnings ;
  use strict ;

  my ($c, $r, %pwm) ;
  /\s/ ? ($r++, $c=0) : $pwm{$_}[$c++]++
  for do {local $/; <DATA> =~ /(.)/sg} ;
  for (values %pwm) { ($_||=0)/=$r for @$_ } ;

  use Data::Dumper ;
  print Dumper \%pwm ;

__DATA__
AAA
ATG
TTT
GTC

:)

-- 
Affijn, Ruud

"Gewoon is een tijger."



 




Perl interface with Oracle

2006-09-06 Thread Puri, Nilay
Hi All,

On a Unix box we have Oracle 9i and Perl 5.8.
We are now upgrading Oracle to 10g.

In that case we need to re-install the Perl DBD module.

Is there any other activity that should be taken care of?

Thanks in advance,

Nilay


***
This email may contain confidential material. 
If you were not an intended recipient, 
please notify the sender and delete all copies. 
We may monitor email to and from our network. 

***




Re: pipe as an argument

2006-09-06 Thread Randal L. Schwartz
>>>>> "Budi" == "Budi Milis" <[EMAIL PROTECTED]> writes:

Budi> On 9/6/06, [EMAIL PROTECTED] <[EMAIL PROTECTED]> wrote:
>> Consider this:
>> my $arg = @ARGV ? shift @ARGV : <STDIN>;
>>

Budi> Works as I expected, many thanks.
Budi> However, my previous code was:

Budi> my $time_in = $ARG[0] || <STDIN>;

Budi> and it doesn't work, why and whats the different with yours?

Perhaps ARG is not ARGV? 
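
A typo like this is easier to spot with strictures enabled. The sketch below is only an illustration: it compiles the mistyped expression inside a string eval so that the resulting compile-time error can be captured and printed instead of stopping the program.

```perl
use strict;
use warnings;

# Compile the mistyped line under strict and capture the error message.
# $ARG[0] refers to an array named @ARG, which was never declared.
my $ok  = eval q{ use strict; my $time_in = $ARG[0]; 1 };
my $err = $@;
print "strict says: $err" unless $ok;
```

With "use strict" at the top of the original script, the same message would appear at compile time and point at the misspelled array name.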

-- 
Randal L. Schwartz - Stonehenge Consulting Services, Inc. - +1 503 777 0095
http://www.stonehenge.com/merlyn/
Perl/Unix/security consulting, Technical writing, Comedy, etc. etc.
See PerlTraining.Stonehenge.com for onsite and open-enrollment Perl training!

 




arithmetic expression while substituting

2006-09-06 Thread Michael Alipio
Hi,

Suppose I have the output of this command
date +%d.%H

which outputs:
06.11

I want to reduce the last two digits by 1,
so that it becomes 06.10.
How do I do that?

Perhaps something like this:
s/\d+$/(regexp being lookup minus 1/


thanks!

__
Do You Yahoo!?
Tired of spam?  Yahoo! Mail has the best spam protection around 
http://mail.yahoo.com 

 




Re: arithmetic expression while substituting

2006-09-06 Thread Adriano Ferreira

On 9/6/06, Michael Alipio <[EMAIL PROTECTED]> wrote:

> I want to adjust the last two digits to less 1:
>
> perhaps something like this.
> s/\d+$/(regexp being lookup minus 1/


s/(\d+)$/$1-1/e

is going to work, even though it is convoluted and not robust. For
example, '06.00' will become '06.-1'
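
To make the behaviour of the /e modifier concrete, here is a small sketch. The sub name minus_one is invented for this example, and the sprintf call is an addition that keeps the two-digit zero padding; it does nothing about the 00 case.

```perl
use strict;
use warnings;

# Hypothetical helper: subtract 1 from the trailing digits of a string.
# The /e modifier makes Perl evaluate the replacement as code.
sub minus_one {
    my ($s) = @_;
    $s =~ s/(\d+)$/sprintf '%02d', $1 - 1/e;
    return $s;
}

print minus_one('06.11'), "\n";   # prints 06.10
print minus_one('06.00'), "\n";   # prints 06.-1, the non-robust case
```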

 




Re: arithmetic expression while substituting

2006-09-06 Thread Jay Savage

On 9/5/06, Michael Alipio <[EMAIL PROTECTED]> wrote:

> Hi,
>
> Suppose I have the output of this command
> date +%d.%H
>
> which outputs:
> 06.11
>
> I want to adjust the last two digits to less 1:
> such that it becomes  06.10..
> how do I do that?
>
> perhaps something like this.
> s/\d+$/(regexp being lookup minus 1/
>
> thanks!



s/\.(\d+)$/'.' . ($1-1)/e

But as Adriano pointed out, a simple subtraction won't do what you
want. 6.00 will become 6.-1.

You could get around that by doing something like:

s/\.(\d+)/'.' . ($1 > 0 ? $1-1 : 0)/e

But that still probably won't do what you want, because 6.00 - 1
should really be 5.23 in most cases.

Your best bet is to look at a module like Date::Manip or Date::Calc.

also, there's no reason to run `date` as an external command. See
perl's built-in localtime() function.
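
As a rough sketch of the localtime() route, assuming that "one hour earlier" is what is wanted: the sub name hour_earlier is invented for this example, and it uses only the core POSIX module.

```perl
use strict;
use warnings;
use POSIX qw(strftime);

# Hypothetical helper: format "day.hour" for one hour before $epoch.
# Subtracting 3600 seconds lets localtime() handle the roll-over from
# hour 00 to hour 23 of the previous day.
sub hour_earlier {
    my ($epoch) = @_;
    return strftime('%d.%H', localtime($epoch - 3600));
}

print hour_earlier(time), "\n";
```

Because the arithmetic is done on the epoch value, no string substitution is needed at all.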

HTH,

-- jay
--
This email and attachment(s): [  ] blogable; [ x ] ask first; [  ]
private and confidential

daggerquill [at] gmail [dot] com
http://www.tuaw.com  http://www.downloadsquad.com  http://www.engatiki.org

values of β will give rise to dom!


Net:SSH:Perl error

2006-09-06 Thread Jim
Hi,
I have a script that I have been running successfully on perl 5.6.1. It uses
Net::SSH to send the code of another perl script to a remote host.  I tried
to move the script(s) to a new box running 5.8 and it errors out while
trying to run the cmd method on the command string passed to it. Here is the
error (that is very misleading) and the relevant parts of the script.  For
some reason if I try to feed a `cat` to the $cmd string, it does not work
--
use Net::SSH::Perl;
$host = "somehost";
my $ssh = Net::SSH::Perl->new($host,
   protocol => '2',
   debug => 1,
   privileged => 0);

#  THIS FAILS, WORKED FINE BEFORE
#my $cmd = "perl -e '".`cat ./cm-unix.pl`."'";
#   THIS IS JUST A ONE LINE FILE with 'ls-l' THAT FAILS
my $cmd = `cat GO`;
# WORKS FINE
my $cmd = 'cd /tmp; ls -l';

$ssh->login( $ENV{USER} );
my($stdout, $stderr, $exit) = $ssh->cmd($cmd);
...

ERROR:
input must be 8 bytes long at
/usr/lib/perl5/site_perl/5.8.0/i386-linux-thread-multi/Crypt/DES.pm line 57.
---

I have tried Google and searching through the archives but have not been able
to find out why this happens.

Thanks for any help,
Jim









 




Re: Net:SSH:Perl error

2006-09-06 Thread Tom Phoenix

On 9/6/06, Jim <[EMAIL PROTECTED]> wrote:


> input must be 8 bytes long at
> /usr/lib/perl5/site_perl/5.8.0/i386-linux-thread-multi/Crypt/DES.pm line 57.


I suspect that this is related to Unicode: A string of eight
characters isn't necessarily eight bytes anymore.

If you can track down the source of the string being used on that
line, you may be able to use pack() to limit it to eight bytes, or
perhaps unpack() to break it into eight-byte chunks, whichever is
appropriate. Or you could re-install an older version of perl, since
that works for you, and use that until the bugs are all found and
fixed.
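
The difference between characters and bytes can be seen in a few lines. This is only an illustration of the general point; whether it is the actual cause of the Crypt::DES error is not established in this thread. The sample string is made up.

```perl
use strict;
use warnings;
use Encode qw(encode);

# A string of 8 characters, one of which (e-acute) is outside ASCII.
my $chars = "caf\x{e9} bar";
my $bytes = encode('UTF-8', $chars);

print length($chars), " characters\n";
print length($bytes), " bytes once encoded as UTF-8\n";
```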

You can report the bug (or search previous reports) via rt.cpan.org:

   http://rt.cpan.org

Hope this helps!

--Tom Phoenix
Stonehenge Perl Training

 




RE: Position Weight Matrix of Set of Strings with Perl

2006-09-06 Thread Wijaya Edward

Dear Rob,
 
I was trying your script with this set of strings:
 
__DATA__
CAGGTG
CAGGTG
 
But how come it returns:
 
$VAR1 = {
'A' => [ 0, '0.5' ], 
'T' => [ 0, 0, 0, 0, '0.5' ], 
'C' => [ '0.5' ],
'G' => [ 0, 0, '0.5', '0.5', 0, '0.5' ]
};

Instead of the correct:

$VAR1 = {
'A' => [ '0', '1', '0', '0', '0', '0' ],
'T' => [ '0', '0', '0', '0', '1', '0' ],
'C' => [ '1', '0', '0', '0', '0', '0' ],
'G' => [ '0', '0', '1', '1', '0', '1' ] 
};


Hope to hear from you again.


Regards,
Edward WIJAYA
SINGAPORE
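
The output above is consistent with two properties of the quoted script: it divides by "keys %pwm" (the number of distinct bases, here 4) rather than by the number of strings (here 2), and it never fills in columns that lie beyond the last position where a base was seen. A minimal sketch that addresses both, assuming all strings have equal length, could look like this. The sub name pwm is invented for the example and this is not Rob's code.

```perl
use strict;
use warnings;

# Hypothetical helper: build a position probability matrix from
# equal-length strings. Returns a hash ref: base => [p_col0, p_col1, ...].
sub pwm {
    my @seqs = @_;
    return {} unless @seqs;
    my $width = length $seqs[0];
    my %count;
    for my $seq (@seqs) {
        my @chars = split //, $seq;
        for my $col (0 .. $#chars) {
            # Pre-fill a row with zeros the first time a base is seen,
            # so every row has one entry per column.
            $count{ $chars[$col] } ||= [ (0) x $width ];
            $count{ $chars[$col] }[$col]++;
        }
    }
    # Divide by the number of sequences, not by the number of bases.
    my $n = @seqs;
    for my $row (values %count) {
        $_ /= $n for @$row;
    }
    return \%count;
}

my $m = pwm(qw(CAGGTG CAGGTG));
print "$_ => [ @{ $m->{$_} } ]\n" for sort keys %$m;
```

For the two CAGGTG strings this prints a row of six values per base, for example "G => [ 0 0 1 1 0 1 ]".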

 


 



From: Rob Dixon [mailto:[EMAIL PROTECTED]
Sent: Wed 9/6/2006 8:14 PM
To: beginners@perl.org
Subject: Re: Position Weight Matrix of Set of Strings with Perl



use strict;
use warnings;

my %pwm;

while (<DATA>) {
   my $col = 0;
   foreach my $c (/\S/g) {
      $pwm{$c}[$col++]++;
   }
}

foreach my $freq (values %pwm) {
   $_ = $_ ? $_ / keys %pwm : 0 foreach @$freq;
}

use Data::Dumper;
print Dumper \%pwm;


__END__
AAA
ATG
TTT
GTC


OUTPUT


$VAR1 = {
   'A' => [
'0.5',
'0.25',
'0.25'
  ],
   'T' => [
'0.25',
'0.75',
'0.25'
  ],
   'C' => [
0,
0,
'0.25'
  ],
   'G' => [
'0.25',
0,
'0.25'
  ]
 };

 




regular expression question

2006-09-06 Thread chen li
Hello all,

I need a regular expression to process some data but
get stuck. I wonder if anyone here might have a clue.

 input: 
 my $line='group A 1 2 3 4';# separated by space

 results:
 my @data=("group A ",1,2,3,4);

Thanks,

Li

 




Re: regular expression question

2006-09-06 Thread Adriano Ferreira

On 9/6/06, chen li <[EMAIL PROTECTED]> wrote:

> I need a regular expression to process some data but
> get stuck. I wonder if anyone here might have a clue.
>
>  input:
>  my $line='group A 1 2 3 4';# separated by space
>
>  results:
>  my @data=("group A ",1,2,3,4);


You barely need a regular expression for this. A split followed by a
join of the first two items would do.

   @data = split ' ', $line;
   unshift @data, (shift @data . " " . shift @data . " ");

 




How to write an integer to socket

2006-09-06 Thread zhao_bingfeng
Hi, guys
In a UDP socket test routine, I want to write some integers to the server in
network order. Unfortunately, my server receives just chars! What can I do?
My code:
 
#! /usr/bin/perl
 
use IO::Socket;
$sock = new IO::Socket::INET (PeerAddr => '192.168.89.166',
  PeerPort => 27000,
  Proto=> 'udp',
 );
die "Could not create socket with error: $!\n" unless $sock;
my $v = 3;
foreach (1 .. 3) {
print $sock $v++;
}
close ($sock);
 
 

---
Life is a different teacher... 
It doesn't teach lessons, and then keep exams... 
It keeps the exams first and then teaches the lessons.


 


Re: How to write an integer to socket

2006-09-06 Thread Adriano Ferreira

On 9/6/06, [EMAIL PROTECTED] <[EMAIL PROTECTED]> wrote:

> Hi, guys
> In a udp socket test routine, I want to write some integers to server in network
> order. But unfortunately, my server receive just chars! how can I do?


Take a look at 'perldoc pack'

 




Re: How to write an integer to socket

2006-09-06 Thread Adriano Ferreira

On 9/7/06, Adriano Ferreira <[EMAIL PROTECTED]> wrote:

> On 9/6/06, [EMAIL PROTECTED] <[EMAIL PROTECTED]> wrote:
> > Hi, guys
> > In a udp socket test routine, I want to write some integers to server in
> > network order. But unfortunately, my server receive just chars! how can I do?
>
> Take a look at 'perldoc pack'


I meant 'perldoc -f pack'

 




Re: How to write an integer to socket

2006-09-06 Thread Tom Phoenix

On 9/6/06, [EMAIL PROTECTED] <[EMAIL PROTECTED]> wrote:


> In a udp socket test routine, I want to write some integers to server
> in network order.
>
> my $v = 3;
> foreach (1 .. 3) {
> print $sock $v++;
> }


That doesn't look like "network order", it looks like "plain text".
Didn't you want to use pack()?
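
A short sketch of what pack() does here, assuming the server expects each value as an unsigned 32-bit integer (the original post does not say which width is wanted). The helper name encode_u32 is invented for the example.

```perl
use strict;
use warnings;

# Hypothetical helper: encode an unsigned 32-bit integer in network
# (big-endian) byte order. 'N' is the pack template for that layout.
sub encode_u32 {
    my ($n) = @_;
    return pack 'N', $n;
}

my $wire = encode_u32(3);
printf "%d bytes: %s\n", length($wire), unpack('H*', $wire);
```

In the posted loop, the print line would then become something like: print $sock pack('N', $v++);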

Hope this helps!

--Tom Phoenix
Stonehenge Perl Training

 




Re: regular expression question

2006-09-06 Thread chen li


--- Adriano Ferreira <[EMAIL PROTECTED]> wrote:

> On 9/6/06, chen li <[EMAIL PROTECTED]> wrote:
> > I need a regular expression to process some data
> but
> > get stuck. I wonder if anyone here might have a
> clue.
> >
> >  input:
> >  my $line='group A 1 2 3 4';# separated by space
> >
> >  results:
> >  my @data=("group A ",1,2,3,4);
> 
> You barely need a regular expression for this. A
> split followed by a
> join of the first two items would do.
> 
> @data = split ' ', $line;
> unshift @data, (shift @data . " " . shift @data
> . " ");
> 
Hi Adriano,

The line code you provide doesn't work on my computer
but based on what you say I change it into this line
code and it works. 

unshift @data, join (' ',(shift @data, shift @data));
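
The most likely reason the earlier line fails is operator precedence: shift is a named unary operator and binds less tightly than the "." concatenation operator, so "shift @data . ' '" is parsed as shift applied to a concatenation. A sketch of the same idea with explicit parentheses, which also keeps the trailing space asked for in the first post:

```perl
use strict;
use warnings;

my $line = 'group A 1 2 3 4';
my @data = split ' ', $line;

# Parentheses keep each shift from swallowing the concatenation after it.
unshift @data, shift(@data) . ' ' . shift(@data) . ' ';

print join('|', @data), "\n";   # prints group A |1|2|3|4
```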

One more question: what if I have a file with different kinds of
lines, where 1) some lines have numbers only and 2) some
lines have more than 2 words at the beginning?

my $line1='1 1 1 1 1';
my $line2='group A 2 2 2 2';
my $line3='group B and C 3 3 3 3';

Do you think I need an if statement to do the job?

Thanks,

Li


 




Re: How to write an integer to socket

2006-09-06 Thread zhao_bingfeng
yeah, I know, thanks for the clue, I just cannot remember the functions. :)


> -----Original Message-----
> From: Adriano Ferreira [mailto:[EMAIL PROTECTED]
> Sent: 7 September 2006 11:01
> To: [EMAIL PROTECTED]; beginners@perl.org
> Subject: Re: How to write an integer to socket
> 
> On 9/7/06, Adriano Ferreira <[EMAIL PROTECTED]> wrote:
> > On 9/6/06, [EMAIL PROTECTED] 
> <[EMAIL PROTECTED]> wrote:
> > > Hi, guys
> > > In a udp socket test routine, I want to write some integers to 
> > > server in network order. But unfortunately, my server 
> receive just chars! how can I do?
> >
> > Take a look at 'perldoc pack'
> 
> I meant 'perldoc -f pack'
> 
 




Re: regular expression question

2006-09-06 Thread Mumia W.

On 09/06/2006 09:49 PM, chen li wrote:

> Hello all,
>
> I need a regular expression to process some data but
> get stuck. I wonder if anyone here might have a clue.
>
>  input:
>  my $line='group A 1 2 3 4';# separated by space
>
>  results:
>  my @data=("group A ",1,2,3,4);



As Adriano Ferreira said, you don't need a regex for this, but 
here it goes:


local $\ = "\n";
local $, = "\n";
my $line='group A 1 2 3 4';# separated by space
my @data = $line =~ m/(group A|\d+)/ig;
print @data;


> Thanks,
>
> Li




You're welcome.


 




Re: Position Weight Matrix of Set of Strings with Perl

2006-09-06 Thread Mumia W.

On 09/06/2006 05:41 AM, Mumia W. wrote:

> On 09/06/2006 04:02 AM, Wijaya Edward wrote:
> > Dear Experts,
> >
> > I am looking for a really efficient way to compute a position weight
> > matrix (PWM) [...]
>
> Although I'm sure that smarter posters than I will [...]


do it right.

Ugh, I forgot about Wijaya's requirement that the PWM be 
calculated in probabilities, and I also forgot that the 
lengths of the base-pair strings can be different. Here is my 
updated code:


use strict;
use warnings;
use Data::Dumper;
local our @deep;
local $" = ', ';

my $length = 5;
my @data = qw(AAA ATG TTT GTC);
@data = map [ split // ], @data;

my (%hash);
for my $entry (@data) {
    *deep = $entry;
    for my $nx (0..$#deep) {
        $hash{$deep[$nx]}[$nx]++;
    }
}

# Divide by the number of strings, not the number of distinct bases
my $count = @data;
while (my ($key, $values) = each %hash) {
    $#{$values} = $length;
    @$values = map defined $_ ? $_ / ($count) : 0, @$values;
    @$values = map sprintf('%4.2f',$_), @$values;
    print "$key => [ @{$hash{$key}} ]\n";
}

__END__

Output:
A => [ 0.50, 0.25, 0.25, 0.00, 0.00, 0.00 ]
T => [ 0.25, 0.75, 0.25, 0.00, 0.00, 0.00 ]
C => [ 0.00, 0.00, 0.25, 0.00, 0.00, 0.00 ]
G => [ 0.25, 0.00, 0.25, 0.00, 0.00, 0.00 ]
End output.

Note: I use $length to set the maximal array index value.


 




Re: regular expression question

2006-09-06 Thread David Romano
chen li wrote on Wed, Sep 06, 2006 at 08:23:42PM PDT:
> --- Adriano Ferreira <[EMAIL PROTECTED]> wrote:
> > On 9/6/06, chen li <[EMAIL PROTECTED]> wrote:
> > > I need a regular expression to process some data
> > but
> > > get stuck. I wonder if anyone here might have a
> > clue.
> > >
> > >  input:
> > >  my $line='group A 1 2 3 4';# separated by space
> > >
> > >  results:
> > >  my @data=("group A ",1,2,3,4);
> > 
> > You barely need a regular expression for this. A
> > split followed by a
> > join of the first two items would do.
> > 
> > @data = split ' ', $line;
> > unshift @data, (shift @data . " " . shift @data
> > . " ");
> > 
> Hi Adriano,
> 
> The line code you provide doesn't work on my computer
> but based on what you say I change it into this line
> code and it works. 
> 
> unshift @data, join (' ',(shift @data, shift @data));
> 
> One more question what if I have a file that have
> different lines 1) some lines have number only 2) some
> lines have more than 2 words at the begining?
> 
> my $line1='1 1 1 1 1';
> my $line2='group A 2 2 2 2';
> my $line3='group B and C 3 3 3 3';
> 
> Do you think I need a if statement to do the job?

If you want to use a regex for all these, the following might work with
your data:
use strict;
use warnings;

$"=',';
for (<DATA>) {
    my @data = m/(\D+[^\d\s]|\d+)/g;
    print "@data\n";
}

__DATA__
1 1 1 1 1
group A 2 2 2 2
group B and C 3 3 3 3

- David
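
An alternative sketch that avoids an explicit if in the calling code treats everything before the first digit as an optional label. The sub name parse_line is invented for this example, and unlike the result shown in the first post it does not keep a trailing space on the label.

```perl
use strict;
use warnings;

# Hypothetical helper: split a line into an optional leading label
# (everything before the first digit) followed by the numbers.
sub parse_line {
    my ($line) = @_;
    my ($label, $rest) = $line =~ /^(\D*?)\s*((?:\d+\s*)*)$/
        or return;
    my @numbers = split ' ', $rest;
    return length($label) ? ($label, @numbers) : @numbers;
}

for my $line ('1 1 1 1 1', 'group A 2 2 2 2', 'group B and C 3 3 3 3') {
    print join(',', parse_line($line)), "\n";
}
```

Lines that contain text after the numbers do not match the pattern, and for those the sub returns an empty list.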

-- 
"It may be true that the law cannot make a man love me, but it can stop him
from lynching me, and I think that's pretty important."
-- Martin Luther King Jr.

