Moving between hashes 2.

2004-09-17 Thread Michael Robeson
**Sorry, if this is a repeat. Wasn't sure if the mail went through. If you already replied can you re-send it to my e-mail address above as well? Thanks!***

I have two sets of data that have been stored in hashes. The first hash 
 has amino-acid (protein) sequence data. The second hash has the 
 corresponding DNA sequence of those amino-acids: 

 Hash 1
 key:    value:
 cat   = mfgdhf
 dog   = mfg--f
 mouse = mf-d-f

 Hash 2 
 key: value: 
 cat = agtcatgcacactgatcg 
 dog = agtcatgcatcg 
 mouse = agtcatcactcg 

 And I need to insert gaps (missing or absent data) proportionally into 
 the DNA sequence (Hash 2) so that the output is as follows: 

 Hash 3
 key:    value:
 cat   = agtcatgcacactgatcg
 dog   = agtcatgca------tcg
 mouse = agtcat---cac---tcg

 All the lines should end up being the same length when viewed in a
 fixed-width font such as Courier. Basically, I am having trouble scanning
 through, say, hash1{cat} and turning every dash found there into
 three dashes in the DNA sequence from hash2{cat}. Also, every
 amino-acid is represented by 3 DNA letters. This is why I need to move
 in increments of 3 and add in increments of 3 for my final data to
 appear as it does in Hash 3.

 Example of relationship: 
 M   F   -   D   -   F    = amino-acid
 agt cat --- cac --- tcg  = dna

 I have everything else set up I just need a few suggestions on how to 
 do the above. Any help will be greatly appreciated. 



 -- 
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
 


Moving between hashes 2.

2004-09-19 Thread Michael S. Robeson II
Ok, well I think I can see the forest but I have little idea as to what 
is actually going on here. I spent a few hours looking things up and I 
have a general sense of what is actually occurring but I am getting 
lost in the details that were posted in the last digest. See below:

On Sep 19, 2004, at 10:08, [EMAIL PROTECTED] wrote:
> I see that you also made use of arrays. It struck me that, since the
> starting point is strings and not lists, using substr() would be more
> straight-forward:
>
>   my %hash3;
>   for ( keys %hash1 ) {
>       while ( my $aa = substr $hash1{$_},0,1,'' ) {

I have never seen anything like this, nor can I find anything in any of
my Perl books to help me explain what the 0,1 and the '' are doing to
the substr of $hash1. I assume it is position information of some kind?
If so, what is going on?

>           $hash3{$_} .= $aa eq '-' ? '---' : substr $hash2{$_},0,3,'';
	
This is something new to me. I think I follow your use of the ?:
pattern feature. However, none of the Perl books I have discuss its
use in this fashion. So, I am unsure of how you know to do that, or
rather... how would I have known that I can do that? But basically I
see that you are looking for '-' and equating it with what is matching
between the ? and :  (i.e. '---').

	So, as far as I can tell, you are saying: "hey, if you find '-' in $aa 
then append a '---' in $hash3, otherwise append the next three DNA 
letters". However, I do not understand the syntax of how perl is 
actually doing this.

Help with explanation would be greatly appreciated. As you can see, I
grasp the big picture; I am just unable to determine mechanistically
how perl is actually going about doing it. Also, any online references
to the techniques used above would be great. I'd look for them myself,
but I do not know what some of these are actually called.

-Thanks so much, I have learned a little just from this much so far.
-mike
 



Moving between hashes 2.

2004-09-24 Thread Michael Robeson
Gunnar,
Thanks so much for the help and the links! They help quite a bit. I
decided to use the if statement you posted:

    if ( $aa eq '-' ) {
        $hash3{$_} .= '---';
    } else {
        $hash3{$_} .= substr $dna,0,3,'';
    }

instead of:

    $hash3{$_} .= $aa eq '-' ? '---' : substr $dna,0,3,'';

only because I had to add a $count++ function within the else statement
(shown below) to accomplish another task within my larger script:

    if ( $aa eq '-' ) {
        $hash3{$_} .= '---';
    } else {
        $hash3{$_} .= substr $dna,0,3,'';
        $count++
    }

I couldn't figure out if it was possible to add $count++ within the ?: 
statement above. I tried but could not get it to work.

However, everything works well at this point. Again, I really 
appreciate the help!

-Mike





RE: Moving between hashes 2.

2004-09-17 Thread Bob Showalter
Michael Robeson wrote:

Don't post MIME or HTML to the list. Plain text only.

> 
> I have two sets of data that have been stored in hashes. The first
> hash 
> has amino-acid (protein) sequence data. The second hash has the
> corresponding DNA sequence of those amino-acids:
> 
> 
> Hash 1
> key: value:
> cat   =   mfgdhf
> dog = mfg--f
> mouse =   mf-d-f
> 
> 
> Hash 2
> key: value:
> cat = agtcatgcacactgatcg
> dog = agtcatgcatcg
> mouse = agtcatcactcg
> 
> 
> And I need to insert gaps (missing or absent data) proportionally into
> the DNA sequence (Hash 2) so that the output is as follows:
> 
> 
> Hash 3
> key: value:
> cat = agtcatgcacactgatcg
> dog = agtcatgca------tcg
> mouse = agtcat---cac---tcg
> 
> 
> It doesn't look right here, but all the lines should end up being the
> same length with courier font. Basically, I am having trouble scanning
> though, say...  hash1{cat} and for every  dash found there being
> finally represented as  three dashes in hash2{cat}. Also, every
> amino-acid is represented by 3 DNA letters. This is why I need to move
> in increments of 3 and add in increments of 3 for my final data to
> appear as it does in Hash 3.
> 
> 
> Example of relationship:
> M   F   -   D   -   F    = amino-acid
> agt cat --- cac --- tcg  = dna
> 
> 
> I have everything else set up I just need a few suggestions on how to
> do the above. Any help will be greatly appreciated.

Here's one approach:

#!/usr/bin/perl

use strict;

while (<DATA>) {
    my ($key, $mask, $src) = split;
    my @mask = $mask =~ /./g;      # one element per amino-acid position
    my @src = $src =~ /.../g;      # one element per codon (3 letters)
    print "$key: ";
    print $_ eq '-' ? '---' : shift @src for @mask;
    print "\n";
}

__DATA__
cat mfgdhf agtcatgcacactgatcg
dog mfg--f agtcatgcatcg
mouse mf-d-f agtcatcactcg

Outputs:

cat: agtcatgcacactgatcg
dog: agtcatgca------tcg
mouse: agtcat---cac---tcg

 




Re: Moving between hashes 2.

2004-09-17 Thread Gunnar Hjalmarsson
Michael Robeson wrote:
> **Sorry, if this is a repeat. Wasn't sure if the mail went through.
> If you already replied can you re-send it to my e-mail address
> above as well? Thanks!***

Aren't you subscribed to the list? And if there is a problem with your
receiving of email, how about checking the list archive
http://www.mail-archive.com/beginners%40perl.org/msg61879.html
or nntp.perl.org?
--
Gunnar Hjalmarsson
Email: http://www.gunnar.cc/cgi-bin/contact.pl
 



Re: Moving between hashes 2.

2004-09-17 Thread Gunnar Hjalmarsson
Bob Showalter wrote:
> while (<DATA>) {
>     my ($key, $mask, $src) = split;
>     my @mask = $mask =~ /./g;
>     my @src = $src =~ /.../g;
>     print "$key: ";
>     print $_ eq '-' ? '---' : shift @src for @mask;
>     print "\n";
> }
> __DATA__
> cat mfgdhf agtcatgcacactgatcg
> dog mfg--f agtcatgcatcg
> mouse mf-d-f agtcatcactcg

I see that you also made use of arrays. It struck me that, since the 
starting point is strings and not lists, using substr() would be more 
straight-forward:

  my %hash3;
  for ( keys %hash1 ) {
      while ( my $aa = substr $hash1{$_},0,1,'' ) {
          $hash3{$_} .= $aa eq '-' ? '---' : substr $hash2{$_},0,3,'';
      }
  }
--
Gunnar Hjalmarsson
Email: http://www.gunnar.cc/cgi-bin/contact.pl
 



Re: Moving between hashes 2.

2004-09-19 Thread Gunnar Hjalmarsson
Michael S. Robeson II wrote:
> Ok, well I think I can see the forest but I have little idea as to
> what is actually going on here. I spent a few hours looking things
> up and I have a general sense of what is actually occurring but I
> am getting lost in the details that were posted in the last digest.

Well, before an attempt to explain and/or point you to the applicable
docs, I'd like to change my mind once again. :)  This is my latest
idea:

    my %hash3;
    for ( keys %hash1 ) {
        my $dna = $hash2{$_};
        for my $aa ( split //, $hash1{$_} ) {
            $hash3{$_} .= $aa eq '-' ? '---' : substr $dna,0,3,'';
        }
    }

I'll assume that you don't have a problem with the outer loop, that
simply iterates over the hash keys. As a first step in each iteration
I copy the DNA sequence to the $dna variable, so as not to destroy
%hash2.
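
For reference, here is a minimal, self-contained sketch of the loop above, filled in with the sample data from the original question. The explicit $key loop variable, the sort, and the printf formatting are additions for illustration only:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Sample data taken from the original question.
my %hash1 = ( cat => 'mfgdhf', dog => 'mfg--f', mouse => 'mf-d-f' );
my %hash2 = (
    cat   => 'agtcatgcacactgatcg',
    dog   => 'agtcatgcatcg',
    mouse => 'agtcatcactcg',
);

my %hash3;
for my $key ( sort keys %hash1 ) {
    my $dna = $hash2{$key};    # work on a copy so %hash2 stays intact
    for my $aa ( split //, $hash1{$key} ) {
        # A gap becomes three dashes; any other character consumes one codon.
        $hash3{$key} .= $aa eq '-' ? '---' : substr $dna, 0, 3, '';
    }
    printf "%-5s = %s\n", $key, $hash3{$key};
}
```

With that data, every value in %hash3 comes out 18 characters long; dog, for example, becomes agtcatgca------tcg.
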
Over to the 'tricky' part. The inner loop iterates over each character
in the amino-acid sequence data, and each character in turn is assigned
to $aa. For that I use the split() function:
http://www.perldoc.com/perl5.8.4/pod/func/split.html
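
As a small illustration (the sample string is taken from the question), split with an empty pattern breaks a string into single characters:

```perl
use strict;
use warnings;

# An empty pattern makes split return one element per character.
my @chars = split //, 'mf-d-f';
print join( ' ', @chars ), "\n";    # prints: m f - d - f
```
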

>   $hash3{$_} .= $aa eq '-' ? '---' : substr $hash2{$_},0,3,'';
>
> This is something new to me. I think I follow your use of the ?:
> pattern feature. However, none of the perl books I have discuss
> its use in this fashion.

That sounds strange to me, because that's how it should be used...
Read about the conditional operator in
http://www.perldoc.com/perl5.8.4/pod/perlop.html
OTOH, that notation is basically the same as:

    if ( $aa eq '-' ) {
        $hash3{$_} .= '---';
    } else {
        $hash3{$_} .= substr $dna,0,3,'';
    }

which is a little more intuitive (at least I think it is).

> So, as far as I can tell, you are saying: "hey, if you find '-' in
> $aa then append a '---' in $hash3, otherwise append the next three
> DNA letters".

Precisely.

> However, I do not understand the syntax of how perl is actually
> doing this.

Hopefully the if/else statement makes it easier to grasp, and the '.='
operator is used just for appending something to a string.
Finally we have my use of the substr() function.
http://www.perldoc.com/perl5.8.4/pod/func/substr.html
It returns the first three characters in $dna, and since I also pass
the null string as the fourth argument, it changes the content of $dna
at the same time, i.e. it replaces the first three characters with
nothing.
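
A minimal sketch of that four-argument substr() call in isolation, assuming a made-up starting value for $dna:

```perl
use strict;
use warnings;

my $dna = 'agtcatgca';

# Four-argument substr: return 3 characters starting at offset 0
# and replace them in $dna with the empty string.
my $codon = substr $dna, 0, 3, '';

print "$codon\n";    # prints: agt
print "$dna\n";      # prints: catgca
```

Calling it repeatedly therefore hands back one codon at a time until $dna is empty.
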
HTH. If you need further explanations, you'll have to ask specific
questions.
--
Gunnar Hjalmarsson
Email: http://www.gunnar.cc/cgi-bin/contact.pl
 



Re: Moving between hashes 2.

2004-09-24 Thread Gunnar Hjalmarsson
Michael Robeson wrote:
> I decided to use the if statement you posted:
>
> only because I had to add a $count++ function within the else
> statement (shown below) to accomplish another task within my larger
> script:
>
>     if ( $aa eq '-' ) {
>         $hash3{$_} .= '---';
>     } else {
>         $hash3{$_} .= substr $dna,0,3,'';
>         $count++
>     }
>
> I couldn't figure out if it was possible to add $count++ within the
> ?: statement above. I tried but could not get it to work.

Right, the conditional operator is meant for choosing between two values,
not for running extra statements. OTOH, you don't need a loop to count
characters of a certain type in a string:

    my $string = 'mfg--f';
    my $count = $string =~ tr/a-z//;
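
As a hedged sketch, the tr/// count is shown below next to one way of keeping the counter inside ?: after all, by wrapping the branch in a do block. The variable names $aa_seq, $letters and $gapped are illustrative and not from the thread:

```perl
use strict;
use warnings;

my $aa_seq = 'mfg--f';          # amino-acid sequence from the question
my $dna    = 'agtcatgcatcg';    # matching DNA sequence

# Count the letters (non-gap positions) without looping.
# With an empty replacement list and no /d, tr/// only counts.
my $letters = $aa_seq =~ tr/a-z//;

# Alternative: keep ?: and do the counting inside a do block,
# whose value is that of its last statement (the substr call).
my $count  = 0;
my $gapped = '';
for my $aa ( split //, $aa_seq ) {
    $gapped .= $aa eq '-' ? '---' : do { $count++; substr $dna, 0, 3, '' };
}

print "$letters $count $gapped\n";    # prints: 4 4 agtcatgca------tcg
```

Both counts agree here; the if/else form is arguably still the more readable choice once a branch needs to do more than produce a value.
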
--
Gunnar Hjalmarsson
Email: http://www.gunnar.cc/cgi-bin/contact.pl