On Tuesday 10 December 2002 4:23 pm, Brian York wrote:
> I have some log files that have a bunch of crap in them. I need to extract
> all of the email addresses in them and put them in a file.  Does anyone
> know how I might be able to do this?
> Find       > emails.file
> Thanks
> Brian

If your logfile is a normal text file and your mail addresses are between 
angle brackets, you could use grep to extract the lines containing them, 
pipe the output through sed to remove the chaff on either side, and 
finally through sort and uniq, e.g.:

grep "<.*@.*>" your-logfile | sed -e 's/^.*</</'   \
                                  -e 's/?.*>/>/'   \
                                  -e 's/mailto://' \
                                  -e 's/>.*$/>/' | sort | uniq > emails.file

This assumes one address per line. Where a line holds several, only the last 
one is kept, because the .* in the first substitution is greedy.
The first substitution command removes everything from the beginning of the 
line up to and including the last opening angle bracket and replaces it with 
an angle bracket.
The second command removes any "?subject=" bits of some addresses.
The third removes the mailto: bit,
and the fourth removes anything to the right of the closing angle bracket.
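
To see what each substitution does, here is a small sketch that runs the same 
four commands over a made-up log. The log lines and the extract_addresses 
function name are my own inventions for illustration, not anything from your 
files, and the commands are written with one -e per substitution.

```shell
# Hypothetical sample log; contents are made up for illustration.
sample=$(mktemp)
cat > "$sample" <<'EOF'
Dec 10 16:01:02 host sendmail[123]: from=<alice@example.com>, size=1024
Dec 10 16:01:05 host httpd: GET <mailto:bob@example.org?subject=hello> 200
Dec 10 16:01:09 host sendmail[124]: to=<carol@example.net>, cc=<dave@example.net>, stat=Sent
Dec 10 16:02:00 host kernel: nothing to see here
EOF

# Hypothetical helper: keep lines with <...@...>, then apply the four substitutions.
extract_addresses() {
    grep "<.*@.*>" "$1" | sed -e 's/^.*</</' \
                              -e 's/?.*>/>/' \
                              -e 's/mailto://' \
                              -e 's/>.*$/>/'
}

extract_addresses "$sample"
```

This should print <alice@example.com>, <bob@example.org> and 
<dave@example.net>, one per line. The carol address on the third line is lost, 
which is the one-address-per-line limitation in action.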

The complicated part is obviously the sed command and fiddling with it to get 
the results you want. The commands I have shown may not suit your particular 
case, but I hope they at least point you in the right direction.
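
If your lines can hold more than one address, one possible variation is to let 
grep print each match on its own line. This is only a sketch and a different 
technique from the sed commands: it assumes your grep supports the -o option 
(GNU and BSD grep do, but POSIX does not require it), and the sample line is 
made up.

```shell
# Hypothetical log line with two addresses, made up for illustration.
line='to=<carol@example.net>, cc=<dave@example.net>, stat=Sent'

# -o prints only the matched text, one match per output line.
# [^<>]* keeps each match inside a single pair of angle brackets.
both=$(printf '%s\n' "$line" | grep -o '<[^<>]*@[^<>]*>')
printf '%s\n' "$both"
```

You would still want sed afterwards to strip any mailto: or "?subject=" bits.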

A tip here: leave off the sort and uniq parts until you have tweaked sed to 
your satisfaction, so you can see its standard output directly.
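
As a rough illustration of that two-stage approach, using a made-up log with a 
repeated address (the file contents and variable names are hypothetical):

```shell
# Hypothetical log with one address repeated, made up for illustration.
log=$(mktemp)
printf '%s\n' \
    'a: from=<zoe@example.com>' \
    'b: from=<amy@example.com>' \
    'c: from=<zoe@example.com>' > "$log"

# Stage 1: while tuning, stop after sed so every line is visible, duplicates included.
raw=$(grep "<.*@.*>" "$log" | sed -e 's/^.*</</' -e 's/>.*$/>/')
printf '%s\n' "$raw"

# Stage 2: once the output looks right, add sort and uniq to drop the duplicates.
final=$(printf '%s\n' "$raw" | sort | uniq)
printf '%s\n' "$final"
```

The first stage shows three lines in log order; the second leaves just the two 
distinct addresses, sorted.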

                Robin.
