Re: regex parsing-Beginner

minky arora Wed, 05 Dec 2007 09:26:47 -0800

Hi,

Thanks, everyone, for helping me with this problem. I have reported all
parts of the file as needed. There is one thing left to figure
out, the last step:


What I need to do is this: for the gene names where the difference in
range is not zero, I need to report the lines at the end of the
file (the sequence strings) that contain that range.

Example: for gene range = 35529..35736 and CDS range 35529..35723, the
diff = 35736 - 35723 = 13, which is greater than 0.

So the range 35723..35736 is contained in the line that starts at
position 35701. I need to output the sequence string associated with
that line. In some cases, more than one such line may contain the
desired range, and I need to report all of them.
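
A rough sketch of the arithmetic behind this step. It assumes every sequence line holds 60 bases and that the first line of the file starts at position 1, which matches the 35641 / 35701 / 35761 labels in the snippet but is not confirmed for the whole file. The name line_starts_for_range is made up for illustration.

```perl
use strict;
use warnings;

# Assumption: 60 bases per sequence line, numbering starts at 1,
# so line labels are 1, 61, 121, ... (35701 fits this pattern).
my $BASES_PER_LINE = 60;

# Return the starting label of every sequence line that touches $from..$to.
sub line_starts_for_range {
    my ($from, $to) = @_;
    my $first = int( ($from - 1) / $BASES_PER_LINE ) * $BASES_PER_LINE + 1;
    my $last  = int( ($to   - 1) / $BASES_PER_LINE ) * $BASES_PER_LINE + 1;
    my @starts;
    for ( my $s = $first; $s <= $last; $s += $BASES_PER_LINE ) {
        push @starts, $s;
    }
    return @starts;
}

# Values from the csfB example: gene ends at 35736, CDS ends at 35723.
my ($gene_end, $cds_end) = (35736, 35723);
if ( $gene_end - $cds_end != 0 ) {
    print join(", ", line_starts_for_range($cds_end, $gene_end)), "\n";
}
```

For the csfB values this prints 35701. A range that crosses a line boundary gives several labels, which covers the case where more than one line has to be reported.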

Here is the file snippet again:

     gene            35529..35736
                     /gene="csfB"
                     /db_xref="EMBL:2632267"
     CDS             35529..35723
                     /gene="csfB"
                     /function="unknown"
                     /note="alternate gene name: yaaM; sigma-F transcribed
                     gene; alternate gene name: yaaM
                     sigma-F transcribed gene"
                     /codon_start=1
                     /transl_table=11
                     /protein_id="CAB11800.1"
                     /db_xref="GI:2632291"

35641 tttctacatc aacttctgat cctgactatg cgttttacgt aaaaaaacta aagagcattc
35701 atacaccgcc attatattca tagacctgaa aaggtctttt tttgtactct taataataaa    //need this line
35761 aagaagatga aacttgttta aggattgaac gtagtagata ataataataa aactgagtat
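
One possible way to pick out the wanted lines, sketched under some assumptions: each sequence line is a position label followed by blocks of bases, the label is the position of the first base on that line, and the lines are already in an array (here hard-coded from the snippet, with the 35701 line on a single line). The name lines_covering is hypothetical.

```perl
use strict;
use warnings;

# Return the sequence lines whose bases overlap positions $from..$to.
# Each line is expected to look like "35701 atacaccgcc attatattca ...".
sub lines_covering {
    my ($from, $to, @lines) = @_;
    my @hits;
    for my $line (@lines) {
        # The leading number is taken as the position of the first base.
        next unless $line =~ /^\s*(\d+)\s+([a-z\s]+?)\s*$/i;
        my ($start, $bases) = ($1, $2);
        $bases =~ s/\s+//g;                       # drop spaces between blocks
        my $end = $start + length($bases) - 1;    # position of the last base
        # Two ranges overlap when each one starts before the other ends.
        push @hits, $line if $start <= $to && $end >= $from;
    }
    return @hits;
}

# Sequence lines copied from the snippet above.
my @seq = (
    "35641 tttctacatc aacttctgat cctgactatg cgttttacgt aaaaaaacta aagagcattc",
    "35701 atacaccgcc attatattca tagacctgaa aaggtctttt tttgtactct taataataaa",
    "35761 aagaagatga aacttgttta aggattgaac gtagtagata ataataataa aactgagtat",
);

# CDS end and gene end for csfB.
print "$_\n" for lines_covering(35723, 35736, @seq);
```

With the csfB range this prints only the 35701 line. Because the end of each line is computed from the bases actually present, a shorter final line in the file should not need special handling.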



Please advise on the best way to go about this.



On 12/4/07, Gunnar Hjalmarsson <[EMAIL PROTECTED]> wrote:
> minky arora wrote:
> > Gunnar Hjalmarsson wrote:
> >> minky arora wrote:
> >>> I have been able to retrieve all the info I need and in
> >>> the format that I want.
> >>
> >> Would you mind letting us know about that format? Doing so might help us
> >> help you with your next question...
> >
> > Sure,
> >
> > If I understand correctly, I am pasting a part of the file to give you
> > a better idea of what I am trying to do:
>
> <new file extract snipped>
>
> No, that's not what I meant. Anyway, if the file is not huge, I think I
> would simply slurp it.
>
>      my %diffs;
>      {
>          local $/;
>          local $_ = <FILE>;
>          while ( m/
>                      gene\s+\d+\.\.(\d+).+?
>                      gene="([^"]+)".+?
>                      CDS\s+\d+\.\.(\d+)
>                  /gsx ) {
>              $diffs{$2} = $1 - $3;
>          }
>      }
>      print "$_ -> $diffs{$_}\n" for keys %diffs;
>
> --
> Gunnar Hjalmarsson
> Email: http://www.gunnar.cc/cgi-bin/contact.pl
>
> --
> To unsubscribe, e-mail: [EMAIL PROTECTED]
> For additional commands, e-mail: [EMAIL PROTECTED]
> http://learn.perl.org/

