Hi All...It's the Worm again,
I regularly ban evil people from my website and log their IP addresses to a log file,
like so:
216.86.29.12
202.144.64.4
212.189.244.34
212.189.244.34
24.14.15.59
194.23.95.100
152.163.205.36
209.244.229.125
168.191.249.201
152.163.205.37
213.3.226.93
216.201.37.161
24.226.175.199
Evil people don't stop at just one evil deed but keep going; ergo I have many
duplicate IP addresses in the file.
Far below in this email are some of the methods I was looking at to remove dupes from
the file, but along the way I noticed something unsavory: my rather simple code, which
reads the contents of a file into an array and then prints that array to another file,
inserts leading spaces on each line, right before the IP address.
Here is the code I used:
# open the list of IP addresses
$file1 = 'c:\ipaddresses.txt';
open (handle1, "<$file1") || die "can't open $file1: $!";
# put the IP addresses into an array, one line per element
@things = <handle1>;
close (handle1);
# name of the output file
$file2 = 'c:\spaces.txt';
open (handle2, ">$file2") || die "can't open $file2: $!";
# print the array to spaces.txt ("or" so that die applies to print, not to the string)
print handle2 "@things" or die "can't print to $file2: $!";
close (handle2);
Does anyone know if these leading spaces can be avoided, or do they have to be removed
every time an array is copied into a file?
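A likely explanation, offered as a sketch rather than a definitive diagnosis: when an array is interpolated inside double quotes, Perl joins its elements with the list separator $" (a single space by default). Each element read from the file still ends in a newline, so the joining space lands at the start of every line after the first. The snippet below reproduces this in memory; the sample @things list and the variable names $interpolated, $plain and $quoted are made up for illustration.

```perl
use strict;
use warnings;

# Stand-in for the lines read from the log; each element keeps its newline.
my @things = ("216.86.29.12\n", "202.144.64.4\n", "212.189.244.34\n");

# Interpolating the array joins the elements with $" (a single space by
# default), so every line after the first starts with a space.
my $interpolated = "@things";

# Joining with the empty string adds nothing between elements. Printing the
# bare list (print @things) behaves the same way as long as $, is unset.
my $plain = join '', @things;

# Alternative: keep the quotes but empty the list separator locally.
my $quoted;
{
    local $" = '';
    $quoted = "@things";
}

print $interpolated;   # shows the leading spaces
print $plain;          # no leading spaces
```

Under that assumption, the smallest change to the script above would be to drop the quotes and write print handle2 @things; so the array is passed to print as a list instead of being interpolated into a string.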
Thanks!
The Worm
Below are the ways I thought to remove dupes. Know of a better way?
a) If @in is sorted, and you want @out to be sorted: (this assumes all true values in
the array)
$prev = 'nonesuch';
@out = grep($_ ne $prev && ($prev = $_), @in);
This is nice in that it doesn't use much extra memory, simulating uniq(1)'s behavior
of removing only adjacent duplicates. It's less nice in that it won't work with false
values like undef, 0, or ""; "0 but true" is OK, though.
b) If you don't know whether @in is sorted:
undef %saw;
@out = grep(!$saw{$_}++, @in);
c) Like (b), but @in contains only small integers:
@out = grep(!$saw[$_]++, @in);
d) A way to do (b) without any loops or greps:
undef %saw;
@saw{@in} = ();
@out = sort keys %saw; # remove sort if undesired
e) Like (d), but @in contains only small positive integers:
undef @ary;
@ary[@in] = @in;
@out = grep {defined} @ary;
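For the IP list in question, method (b) seems the most natural fit, since the log is not sorted and the entries are strings rather than small integers. Here is a minimal sketch of it applied to lines like the ones above. The helper name unique_lines and the sample data are hypothetical; in a real script the input would come from the log file instead.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the first occurrence of each line, in the
# order the lines were first seen.
sub unique_lines {
    my @lines = @_;
    chomp @lines;    # strip newlines so a final line without one still matches
    my %saw;
    return grep { !$saw{$_}++ } @lines;
}

# Sample input standing in for the contents of the log file.
my @in = (
    "216.86.29.12\n",
    "212.189.244.34\n",
    "212.189.244.34\n",
    "24.14.15.59\n",
    "216.86.29.12\n",
);

my @out = unique_lines(@in);

# Print one address per line, adding the newline back explicitly.
print "$_\n" for @out;
```

Unlike (d), this keeps the addresses in the order they first appeared, which may or may not matter for a ban list.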
_______________________________________________
Perl-Win32-Web mailing list
[EMAIL PROTECTED]
http://listserv.ActiveState.com/mailman/listinfo/perl-win32-web