Re: Help with WWW::Mechanize - Next Question

Mathew Snyder Tue, 05 Dec 2006 19:47:52 -0800

Rob Dixon wrote:
> Mathew Snyder wrote:
>> With all the help I've received I've been able to get this working.
>> This is my text:
>> #!/usr/bin/perl
>>
>> use warnings;
>> use strict;
>> use WWW::Mechanize;
>> use HTML::TokeParser;
>>
>> my $username = 'msnyder';
>> my $password = 'xxxxxxx';
>> my $status   = 'open';
>>
>> my $agent = WWW::Mechanize->new();
>> $agent->get('https://rt.ops.xxxxxxxxxxx.com/');
>>
>> $agent->submit_form(
>>         form_name => 'login',
>>         fields    => {
>>                 'user' => $username,
>>                 'pass' => $password,
>>         }
>> );
>>
>> $agent->follow_link(text => "Tickets");
>>
>> $agent->submit_form(
>>         form_name => 'BuildQuery',
>>         fields    => {
>>                 'ValueOfStatus' => $status,
>>                 'ValueOfActor'  => $username,
>>         },
>>         button    => 'DoSearch'
>> );
>>
>> my $data = $agent->content();
>> print $data;
>>
>>
>> What this will do is return to me HTML source with a list of work
>> tickets and all pertinent, associated data.  The purpose of setting
>> this up is to allow me to pull out email addresses of any work ticket
>> created as a result of spam.
>>
>> For anyone not familiar with Request Tracker from Best Practical
>> Solutions, the 'from' email address on any incoming email received by
>> Request Tracker is automatically turned into a user account.  With the
>> amount of spam flying around the Net these days those user accounts
>> add up.
>>
>> All those spam tickets are assigned to me so I can eliminate them and
>> the users created as a result of them from our database.  My goal is
>> to parse $data to pull out all the email addresses which I will then
>> sift through to remove any legitimate addresses.
>>
>> You'll notice I declare the use of HTML::TokeParser.  This leads to my
>> next question.  Do I need to use that?  Would it be simpler to just
>> parse the data matching against a regex and put any matches into a
>> file?  I imagine I don't need to sift through all the HTML tags just
>> to get to the email addresses since they are fairly easy to spot.
> 
> Hi Mathew
> 
> Ordinarily I would insist that you use a proper HTML parser, but I see
> no harm in searching for email addresses as their format is well
> defined. Use the Email::Address module, like this:
> 
> use Email::Address;
> 
> my @email = Email::Address->parse($agent->content);
> print $_->address, "\n" foreach @email;
> 
> HTH,
> 
> Rob
> 
>


I don't know whether there is a bug in the Email::Address module.  I've
changed nothing other than what you suggested, and now I'm getting a
Segmentation Fault.

Here's my code as it stands now:
#!/usr/bin/perl

use warnings;
use strict;
use WWW::Mechanize;
use Email::Address;

my $user   = 'msnyder';
my $pass   = 'xxxxxxx';
my $status = 'open';
my $queue  = 'Security';

my $agent = WWW::Mechanize->new();
$agent->get('https://rt.ops.xxxxxxxxxxx.com/');

$agent->submit_form(
        form_name => 'login',
        fields    => {
                'user' => $user,
                'pass' => $pass,
        }
);

$agent->follow_link(text => "Tickets");

$agent->submit_form(
        form_name => 'BuildQuery',
        fields    => {
                'ValueOfStatus' => $status,
                'ValueOfActor'  => $user,
                'ValueOfQueue'  => $queue,
        },
        button    => 'DoSearch'
);

my $data   = $agent->content();
my @emails = Email::Address->parse($data);

# Print one bare address per line
foreach my $email (@emails) {
        print $email->address, "\n";
}
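
While the crash is being tracked down, the plain-regex route raised
earlier in the thread could be sketched roughly as below.  This is only
an illustration under stated assumptions: the pattern is a deliberately
loose approximation of an address (not a full RFC 2822 matcher), the
extract_emails name is hypothetical, and the sample HTML is made up to
stand in for $agent->content().

```perl
#!/usr/bin/perl

use warnings;
use strict;

# Loose approximation of an email address; NOT a full RFC 2822 matcher.
my $email_re = qr/[A-Za-z0-9._%+-]+\@[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+/;

# Hypothetical helper: return the unique addresses found in a string,
# in order of first appearance (duplicates compared case-insensitively).
sub extract_emails {
        my ($text) = @_;
        my %seen;
        return grep { !$seen{lc $_}++ } $text =~ /($email_re)/g;
}

# Made-up sample standing in for $agent->content().
my $data = '<td><a href="mailto:spam@example.com">spam@example.com</a></td>'
         . '<td>someone@example.org</td>';

print "$_\n" foreach extract_emails($data);
```

A pattern this simple will miss unusual but valid addresses and may pick
up strings that only look like addresses, so the results would still
need the manual sifting described above.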

