On Tue, 20 Jan 2015 17:47:58 +0000 Mike Martin <m...@redtux.org.uk> wrote:
> Take a load of Job Vacancy posts (xml files - loads of)
> Parse the Information, getting rid of as much garbage as possible
> Push a distinct list into a lookup hash

If you're running Linux (or any POSIX system), see `man sort` and search
for /-u/. Since sort(1) is compiled code, it may be faster than a Perl
hash, especially for long lists, though it is worth timing both on your
data.

> Do replace to this list against a long list of regexes

This creates a cross product: every title has to be tested against every
regex, so the work grows as (number of titles) x (number of regexes). I
don't know of a general way to reduce this.

> Spit out nicely formatted Clean Job Titles

--
Don't stop where the ink does.
        Shawn
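
For what it's worth, here is a minimal sketch of the `sort -u` suggestion above, followed by a cleanup pass. The sample titles and the two sed rules are made up for illustration; they only stand in for the real title list and the long list of regexes. LC_ALL=C is set so the sort order is the same on every system.

```sh
#!/bin/sh
# Hypothetical sample input: one raw job title per line, with duplicates.
titles='Senior Perl Developer
Junior Sysadmin
Senior Perl Developer
Perl Dev (URGENT)
Junior Sysadmin'

# sort -u sorts the lines and keeps one copy of each distinct line.
distinct=$(printf '%s\n' "$titles" | LC_ALL=C sort -u)

# Hypothetical cleanup rules standing in for the long regex list.
# sed tries every rule on every line, so the work is still
# (number of titles) x (number of rules).
clean=$(printf '%s\n' "$distinct" |
    sed -e 's/ *(URGENT)$//' -e 's/Dev$/Developer/')

printf '%s\n' "$clean"
```

Cleaning can turn two different raw titles into the same clean one, so it may be worth piping the final output through `sort -u` a second time.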