Adriano Allora wrote on Saturday, 18 November 2006, 11:52:
> hi to all,

Ciao Adriano

> I've got a list of tagged words, like this one (only a little bit
> longest):
>
> <tLn nr=11>
> e       CON     e
> le      DET:def il
> ha      VER:pres        avere|riavere
> detto   VER:pper        dire
> <       NOM     <unknown>
> CORR    VER:infi        corre
>
>  >       NOM     <unknown>
>
> e       CON     e
> a       PRE     a
>
> I need to transform the list below in (in which the CORR tag isn't
> tagged):
>
> <tLn nr=11>
> e       CON     e
> le      DET:def il
> ha      VER:pres        avere|riavere
> detto   VER:pper        dire
> <CORR>
> e       CON     e
> a       PRE     a
>
> So I tried to write this awful script:
>
> #!/usr/bin/perl -w
>
> use strict;
>
> $^I = '';
>
> my $tic = 0;
> my  $toc = 0;
>
> while(<>)
>       {
>       if(/^<       NOM     <unknown>.*/i)

You don't need the .* in the regex here (and below).
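
A quick way to convince yourself, as a rough sketch: the pattern is not
anchored at the end, so it already matches as soon as the line starts with
the text, and a trailing .* does not change whether the match succeeds. The
sub names and sample lines are made up for illustration, and I use \s+
between the fields on the assumption that they may be separated by tabs
rather than spaces.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helpers: the same test with and without the trailing .*
sub with_dotstar    { return $_[0] =~ /^<\s+NOM\s+<unknown>.*/i ? 1 : 0 }
sub without_dotstar { return $_[0] =~ /^<\s+NOM\s+<unknown>/i   ? 1 : 0 }

# One line that opens a tagged area and one ordinary line
for my $line ("<       NOM     <unknown>\n", "e       CON     e\n") {
    printf "%d %d\n", with_dotstar($line), without_dotstar($line);
}
```

Both columns agree on every line, so the shorter pattern is enough.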

>               {
>               $tic = 1;
>               next;
>               }
>       next if /^>       NOM     <unknown>.*/i;
>       next if $toc == 1;

$toc can only have the values 0 and 1. So, if you get here, $toc is 0...

>       $toc = 0;

...and this won't change $toc.

>       if($tic==1)
>               {
>               s/^(\/?\w+).+/$1/gi;
>               chomp();
>               $_ = "<$_>";
>               $toc = 1;
>               $tic = 0;
>               }
>       s/<>//g;
>       print;
>       }
>
> it doesn't return errors, but it stop printing the output after the
> first correction. Someone can explain me why 

Didn't look deeply enough in the code, so I can't :-)
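
One suspect, though, offered as a rough sketch rather than a full diagnosis:
once the correction block sets $toc to 1, the "next if $toc == 1" line skips
every later line, and the "$toc = 0" after it is never reached. The sub name
printed_lines and the sample input are made up; the flag logic is copied
from your script but simplified (it collects lines instead of editing a file
in place, and uses \s+ between the fields).

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: runs the flag logic of the quoted script over a
# list of lines and returns the lines that would be printed.
sub printed_lines {
    my @out;
    my ($tic, $toc) = (0, 0);
    for (@_) {
        my $line = $_;          # copy, so the caller's data is not modified
        if ($line =~ /^<\s+NOM\s+<unknown>/i) { $tic = 1; next; }
        next if $line =~ /^>\s+NOM\s+<unknown>/i;
        next if $toc == 1;      # once $toc is 1, every later line is skipped
        $toc = 0;               # only reached while $toc is already 0
        if ($tic == 1) {
            $line =~ s/^(\/?\w+).+/$1/;
            chomp $line;
            $line = "<$line>";
            $toc = 1;
            $tic = 0;
        }
        push @out, $line;
    }
    return @out;
}

print printed_lines(
    "detto\tVER:pper\tdire\n",
    "<\tNOM\t<unknown>\n",
    "CORR\tVER:infi\tcorre\n",
    ">\tNOM\t<unknown>\n",
    "e\tCON\te\n",        # never printed: $toc is still 1 here
);
```

With this input only the "detto" line and "<CORR>" come out; the final line
is dropped, which looks like the symptom you describe.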

> and eventually suggest how to correct the corrector?

The script below seems to do what you want. It's not very elegant, but (I 
think) easy to understand. I use a single $inside flag that does what you 
probably intended with $tic and $toc.

> PS: another strange thing: if I declare at the beginning of the script:
> my($tic,$toc); it returns me an error...

You don't say which error, but I got warnings like

  "Use of uninitialized value in numeric eq (==) at ./script.pl line 19, 
  <DATA> line 1.". 

$tic/$toc is used in a numeric comparison before a value has been assigned 
(my ($tic, $toc) leaves both undefined). With -w this is a warning rather 
than a fatal error: undef counts as 0 in the comparison, so initialising 
both variables to 0 silences it without changing the program flow.
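
To make the difference visible, here is a small sketch. The sub name
warnings_for_flag and the __WARN__ handler are only there for illustration;
they count the warnings instead of printing them.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: counts the warnings raised while comparing a flag
# numerically, as the quoted script does with $tic and $toc.
sub warnings_for_flag {
    my ($flag) = @_;
    my $count = 0;
    local $SIG{__WARN__} = sub { $count++ };   # capture instead of printing
    my $is_set = ($flag == 1);                 # an undefined $flag warns here
    return $count;
}

my ($tic, $toc);                       # both undefined, as with "my($tic,$toc);"
print warnings_for_flag($tic), "\n";   # undefined flag
$toc = 0;
print warnings_for_flag($toc), "\n";   # initialised flag
```

The undefined flag should produce one warning and the initialised one none,
so "my ($tic, $toc) = (0, 0);" is enough to keep -w quiet.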

I hope this helps,

Dani

#!/usr/bin/perl

use strict;
use warnings;

my $inside; # are we within a tagged area?

while(<DATA>) {
        if (/^<\s+NOM\s+<unknown>/i) {
                $inside=1;
                next;
        }
        elsif (/^>\s+NOM\s+<unknown>/i) {
                $inside=0;
                next;
        }
        elsif ($inside) {
                # capture the leading word; stop if the line has none
                my ($str) = /^(\w+)/ or die "No word at start of line $.: $_";
                print "<$str>\n";
        }
        else {
                print;
        }
}

__DATA__
<tLn nr=11>
e       CON     e
le      DET:def il
ha      VER:pres        avere|riavere
detto   VER:pper        dire
<       NOM     <unknown>
CORR    VER:infi        corre
>       NOM     <unknown>
e       CON     e
a       PRE     a
<tLn nr=11>
e       CON     e
le      DET:def il
ha      VER:pres        avere|riavere
detto   VER:pper        dire
<       NOM     <unknown>
BLA    VER:infi        corre
>       NOM     <unknown>
e       CON     e
a       PRE     a
