Whoops! Sorry! I hit "Reply", not "Reply all". And I know that I have to learn more Perl. This is a Perl beginners' list, isn't it? Regards.

2010/9/21 Shlomi Fish <[email protected]>

> In accordance with:
>
> > > > Please reply to list if it's a mailing list post - http://shlom.in/reply .
>
> And the fact the message was not explicitly stated as replied-to-in-private
> and the fact it does not appear to be the author's intention, I'm CCing this
> to the list. Next time please hit reply-to-all.
>
> On Tuesday 21 September 2010 00:16:23 Jordi Durban wrote:
> > Thank you for your answer Shlomi, but I don't know exactly how to write
> > what you told me.
> > Are keys and values stored in the hash now??
> > while ( my $line = <IN> ){
> >     chomp $line;
> >     my ($col_1, $col_2) = split (/\t/, $line);
> >     if ( $seen_pair{ $col_1 }{$col_2} || $seen_pair{ $col_2 }{$col_1} ){ ##### *I copied this from a book. What am I saying here??*
> >         #~ print OUT1 "$col_1\t$col_2\n" ;
> >         next;
> >     }
> >     $seen_pair{ $col_1 }{$col_2} = 1; ##### *Are all key-value pairs stored here, or only the unique ones??*
> >     print OUT2 "$col_1\t$col_2\n";
> >
>
> Maybe you should read a good Perl tutorial or book and properly learn Perl
> from the basics. See http://perl-begin.org/ for more information.
>
> Try doing something like this:
>
> [code]
> open my $in_fh, "<", "my_input.txt"
>     or die "Cannot open input - $!";
> open my $out_fh, ">", "filtered.txt"
>     or die "Cannot open output - $!";
>
> my %seen;
>
> while (my $line = <$in_fh>)
> {
>     my $l = $line;
>     chomp($l);
>
>     my ($col_1, $col_2) = split(/\t/, $l);
>
>     # Sort the two columns so that "a b" and "b a" give the same token.
>     my $token = join("\t", sort { $a cmp $b } ($col_1, $col_2));
>
>     if (! exists($seen{$token}))
>     {
>         $seen{$token}++;
>         print {$out_fh} $line;
>     }
> }
>
> close($in_fh);
> close($out_fh);
> [/code]
>
> Regards,
>
>     Shlomi Fish
>
> > Thank you very much.
> >
> > 2010/9/20 Shlomi Fish <[email protected]>
> >
> > > Hi Jordi,
> > >
> > > On Monday 20 September 2010 10:16:40 Jordi Durban wrote:
> > > > Hi all!
> > > > I have a file like this:
> > > >
> > > > colum a    colum b
> > > > uid = 1    uid = 4
> > > > uid = 2    uid = 3
> > > > uid = 3    uid = 2
> > > > uid = 4    uid = 1
> > > >
> > > > I'm trying to find those rows with the same numbers regardless of the
> > > > column they are in. That is, in the example, row 2 is identical to row 3.
> > > > So far, I have tried:
> > > >
> > > > my %seen_pair;
> > > >
> > > > while (my $line = <IN> ){
> > > >     chomp $line;
> > > >     my ($col_1, $col_2) = split (/\t/,$line);
> > > >
> > > >     if ($seen_pair{$col_1 }{$col_2} || $seen_pair{ $col_2 }{$col_1} ){
> > >
> > > Well, you don't appear to be adding the columns to %seen_pair. What I
> > > would do is this:
> > >
> > > 1. Extract the numbers as $n1 and $n2 (unless the rest of the strings in
> > > the columns are relevant).
> > >
> > > 2. Prepare a unique token out of them:
> > >
> > > {{{
> > > my $token = join(",", sort { $a <=> $b } ($n1, $n2));
> > > }}}
> > >
> > > Notice that I sort the numbers in order to get a unique set. Make sure
> > > the joined separator does not exist anywhere.
> > >
> > > 3. Store that in the hash, possibly with some data on the next column and
> > > compare against it.
> > >
> > > Regards,
> > >
> > > Shlomi Fish
> > >
> > > --
> > > -----------------------------------------------------------------
> > > Shlomi Fish http://www.shlomifish.org/
> > > List of Portability Libraries - http://shlom.in/port-libs
> > >
> > > <rindolf> She's a hot chick. But she smokes.
> > > <go|dfish> She can smoke as long as she's smokin'.
> > >
> > > Please reply to list if it's a mailing list post - http://shlom.in/reply .
>
> --
> -----------------------------------------------------------------
> Shlomi Fish http://www.shlomifish.org/
> "Star Trek: We, the Living Dead" - http://shlom.in/st-wtld
>
> <rindolf> She's a hot chick. But she smokes.
> <go|dfish> She can smoke as long as she's smokin'.
>
> Please reply to list if it's a mailing list post - http://shlom.in/reply .
>

--
Jordi
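
For reference, the three steps suggested in the quoted reply (extract the numbers, build an order-independent token, store it in a hash) could be sketched roughly as below. This is only a sketch under assumptions that are not from the thread: the helper names pair_token and unique_pairs are made up for illustration, each column is assumed to look like "uid = N", and the sample lines are inlined instead of being read from a file.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): build an order-independent
# key for one tab-separated line such as "uid = 2\tuid = 3".
sub pair_token {
    my ($line) = @_;
    chomp $line;
    my ($col_1, $col_2) = split /\t/, $line;
    return unless defined $col_2;              # line without two columns

    # Step 1: extract the numbers (assumes each column looks like "uid = N").
    my ($n1) = $col_1 =~ /(\d+)/;
    my ($n2) = $col_2 =~ /(\d+)/;
    return unless defined $n1 && defined $n2;  # e.g. the header line

    # Step 2: sort numerically so (2, 3) and (3, 2) give the same token.
    return join ",", sort { $a <=> $b } $n1, $n2;
}

# Step 3: remember each token in a hash and keep only first occurrences.
sub unique_pairs {
    my @lines = @_;
    my %seen;
    my @kept;
    for my $line (@lines) {
        my $token = pair_token($line);
        next unless defined $token;   # skip lines with no usable pair
        next if $seen{$token}++;      # already seen, in either column order
        push @kept, $line;
    }
    return @kept;
}

# Sample data taken from the original question.
my @sample = (
    "uid = 1\tuid = 4",
    "uid = 2\tuid = 3",
    "uid = 3\tuid = 2",
    "uid = 4\tuid = 1",
);
print "$_\n" for unique_pairs(@sample);
```

With the sample data, only the first two rows are printed, because rows 3 and 4 produce the same tokens ("2,3" and "1,4") as rows 2 and 1. The hash stores one entry per unique pair, not one per input line.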
