In accordance with:
> >
> > Please reply to list if it's a mailing list post - http://shlom.in/reply
> > .
and given that the message was not explicitly marked as a private reply, and
that replying in private does not appear to have been the author's intention,
I'm CCing this to the list. Next time, please hit reply-to-all.
On Tuesday 21 September 2010 00:16:23 Jordi Durban wrote:
> Thank you for your answer Shlomi, but I don't know exactly how to write
> what you told me.
> Are the keys and values stored in the hash now??
> while ( my $line = <IN> ){
> chomp $line;
> my ($col_1, $col_2) = split (/\t/, $line);
> if ( $seen_pair{ $col_1 }{$col_2} || $seen_pair{ $col_2 }{$col_1} ){ ##### *I copied this from a book. What am I saying here??*
> #~ print OUT1 "$col_1\t$col_2\n" ;
> next;
> }
> $seen_pair{ $col_1 }{$col_2} = 1; ##### *Are here all keys-values pairs or only those unique??*
> print OUT2 "$col_1\t$col_2\n";
>
Maybe you should read a good Perl tutorial or book and properly learn Perl
from the basics. See http://perl-begin.org/ for more information.
Try doing something like this:
[code]
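use strict;
use warnings;

open my $in_fh, "<", "my_input.txt"
    or die "Cannot open input - $!";
open my $out_fh, ">", "filtered.txt"
    or die "Cannot open output - $!";

# Holds one entry per order-independent pair that has already been printed.
my %seen;

while (my $line = <$in_fh>)
{
    my $l = $line;
    chomp($l);
    my ($col_1, $col_2) = split(/\t/, $l);

    # Sort the two columns so that "a\tb" and "b\ta" yield the same token.
    my $token = join("\t", sort { $a cmp $b } ($col_1, $col_2));

    if (! exists($seen{$token}))
    {
        $seen{$token}++;
        # Print the original line, which still has its trailing newline.
        print {$out_fh} $line;
    }
}

close($in_fh);
close($out_fh);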
[/code]
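If only the numbers in the columns matter, and you want to know which rows
repeat an earlier one rather than just filter them out, something along these
lines may help. It is only a rough sketch: the names pair_token and
find_duplicate_rows are made up for illustration, and it assumes tab-separated
columns that each contain a single number, as in your sample.

```perl
use strict;
use warnings;

# Build an order-independent key from two "uid = N" style columns.
# Returns undef when either column is missing or contains no number.
sub pair_token {
    my ($col_1, $col_2) = @_;
    return unless defined $col_1 && defined $col_2;
    my ($n1) = $col_1 =~ /(\d+)/ or return;
    my ($n2) = $col_2 =~ /(\d+)/ or return;
    # Numeric sort, so (4, 1) and (1, 4) both give "1,4".
    return join(",", sort { $a <=> $b } ($n1, $n2));
}

# Return a list of [later_row, earlier_row] pairs, counting data rows from 1.
sub find_duplicate_rows {
    my (@lines) = @_;
    my (%first_seen, @dups);
    my $row = 0;
    for my $line (@lines) {
        chomp(my $l = $line);
        my ($col_1, $col_2) = split(/\t/, $l);
        my $token = pair_token($col_1, $col_2);
        next unless defined $token;    # e.g. a header line without numbers
        $row++;
        if (exists $first_seen{$token}) {
            push @dups, [$row, $first_seen{$token}];
        }
        else {
            $first_seen{$token} = $row;
        }
    }
    return @dups;
}

# Sample data modelled on the file from the original question.
my @sample = (
    "colum a\tcolum b\n",
    "uid = 1\tuid = 4\n",
    "uid = 2\tuid = 3\n",
    "uid = 3\tuid = 2\n",
    "uid = 4\tuid = 1\n",
);

for my $d (find_duplicate_rows(@sample)) {
    print "row $d->[0] repeats row $d->[1]\n";
}
```

With the sample above, this reports that row 3 repeats row 2 and that row 4
repeats row 1. To run it on a real file, pass the lines read from the
filehandle instead of @sample.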
Regards,
Shlomi Fish
> Thank you very much.
>
> 2010/9/20 Shlomi Fish <[email protected]>
>
> > Hi Jordi,
> >
> > On Monday 20 September 2010 10:16:40 Jordi Durban wrote:
> > > Hi all!
> > > I have a file like this :
> > >
> > >
> > > colum a colum b
> > > uid = 1 uid = 4
> > > uid = 2 uid = 3
> > > uid = 3 uid = 2
> > > uid = 4 uid = 1
> > >
> > > I'm trying to find those rows with the same numbers regardless of the
> > > column they are in. That is, in the example, row 2 is identical to row 3.
> > > So far, I have tried:
> > >
> > > my %seen_pair;
> > >
> > > while (my $line = <IN> ){
> > > chomp $line;
> > > my ($col_1, $col_2) = split (/\t/,$line);
> > >
> > > if ($seen_pair{$col_1 }{$col_2} || $seen_pair{ $col_2 }{$col_1} ){
> >
> > Well, you don't appear to be adding the columns to %seen_pair. What I
> > would do is this:
> >
> > 1. Extract the numbers as $n1 and $n2 (unless the rest of the strings in
> > the columns are relevant).
> >
> > 2. Prepare a unique token out of them:
> >
> > {{{
> > my $token = join(",", sort { $a <=> $b } ($n1, $n2));
> > }}}
> >
> > Notice that I sort the numbers so that both orderings of a pair yield the
> > same token. Make sure the separator used in the join does not occur
> > anywhere in the values.
> >
> > 3. Store that token in the hash, possibly along with some data about the
> > row, and check the token of each later row against it.
> >
> > Regards,
> >
> > Shlomi Fish
--
-----------------------------------------------------------------
Shlomi Fish http://www.shlomifish.org/
"Star Trek: We, the Living Dead" - http://shlom.in/st-wtld
<rindolf> She's a hot chick. But she smokes.
<go|dfish> She can smoke as long as she's smokin'.
Please reply to list if it's a mailing list post - http://shlom.in/reply .