Parag Kalra wrote:
Hello Folks,
Hello,
This is my first post here. I am trying to emulate the Linux 'sort' command in Perl. I got the following code from the Internet to sort a text file:

    # cat sort.pl
    my $column_number = 2; # Sorting by 3rd column since 0-origin based
    my $prev = "";
    for ( map { $_->[0] }
          sort { $a->[1] cmp $b->[1] }
          map { [$_, (split)[$column_number]] } <> ) {
        print unless $_ eq $prev;
        $prev = $_;
    }

Suppose I want to sort the data of a text file having the following rows and columns:

    # cat test.out
    jhvXgF U13GWt 3OvMCf VMkAWj
    4ewejk pFnjd4 ie0hZF pPipQJ
    4ewejk 4sqprx ie0hZF cqtexi
    FT9mWp d4fgMB gvZRJU XRRu0N
    hnzI2c GXAXWF 6xKH7A 3dLh18

When I sort it by the 3rd column using the 'sort' command, I get the following output:

    # sort -u -k 3 test.out
    jhvXgF U13GWt 3OvMCf VMkAWj
    hnzI2c GXAXWF 6xKH7A 3dLh18
    FT9mWp d4fgMB gvZRJU XRRu0N
    4ewejk 4sqprx ie0hZF cqtexi
    4ewejk pFnjd4 ie0hZF pPipQJ

However, when I sort the same text file by the 3rd column using the piece of code above, I get the following:

    jhvXgF U13GWt 3OvMCf VMkAWj
    hnzI2c GXAXWF 6xKH7A 3dLh18
    FT9mWp d4fgMB gvZRJU XRRu0N
    4ewejk pFnjd4 ie0hZF pPipQJ
    4ewejk 4sqprx ie0hZF cqtexi

The difference can be seen in the last two rows, in the values of the 2nd column. The reason is that 'ie0hZF' appears twice in the 3rd column and the corresponding values in the 1st column are also the same ('4ewejk'), so the discrepancy has occurred in the 2nd column. Can anybody help me fix the bug in the above code?
This is not a bug. Because you are sorting on only one column, the relative order of lines that have the same value in that column is indeterminate. If you want to sort the whole line like 'sort' does, then change:

    sort { $a->[1] cmp $b->[1] }

To:

    sort { $a->[1] cmp $b->[1] || $a->[0] cmp $b->[0] }

John
--
The programmer is fighting against the two most destructive forces in
the universe: entropy and human stupidity. -- Damian Conway
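
For reference, here is a minimal sketch of the original script with the tie-breaking comparison applied. The subroutine name sort_unique_by_column and the inline sample data are hypothetical, introduced only for illustration; the original script reads its input from <> instead. It only roughly approximates 'sort -u -k 3': with GNU sort, '-k 3' defines a key that runs from the 3rd field to the end of the line, and the ordering depends on the locale, whereas 'cmp' here compares strings character by character.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): sort lines by one
# whitespace-separated column (0-origin), break ties by comparing the
# whole line, and drop adjacent duplicate lines.
sub sort_unique_by_column {
    my ($column_number, @lines) = @_;
    my $prev = "";
    return
        grep { my $keep = $_ ne $prev; $prev = $_; $keep }   # skip repeats
        map  { $_->[0] }                                      # back to lines
        sort { $a->[1] cmp $b->[1] || $a->[0] cmp $b->[0] }   # key, then line
        map  {
            my $key = (split ' ', $_)[$column_number];        # extract column
            [ $_, defined $key ? $key : '' ]
        }
        @lines;
}

# Sample rows taken from test.out in the question.
my @rows = (
    "jhvXgF U13GWt 3OvMCf VMkAWj",
    "4ewejk pFnjd4 ie0hZF pPipQJ",
    "4ewejk 4sqprx ie0hZF cqtexi",
    "FT9mWp d4fgMB gvZRJU XRRu0N",
    "hnzI2c GXAXWF 6xKH7A 3dLh18",
);

print "$_\n" for sort_unique_by_column(2, @rows);
```

With the second comparison in place, the two lines that share 'ie0hZF' are ordered by their full text, so the '4sqprx' line comes before the 'pFnjd4' line, matching the 'sort' output shown in the question.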