Hi folks,

The handler interface requires you to implement two functions:
void processDriver(..)
boolean shouldProcessURL

The processDriver function can manipulate the web driver in any way
you’d like. The driver is handed to your handler already loaded with the
URL of the page currently being fetched, so you can perform whatever
manipulations you need from there. When the function returns, the
content is pulled out of the body tag of the document.

shouldProcessURL is used to check whether the handler should be loaded for
a particular URL. If you want the handler to run over every URL then just
have it return true. If you want to have it run on only certain URLs then
you can implement that logic in there.
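
To make the "only certain URLs" case concrete, here is a minimal sketch of
the kind of check that could go inside shouldProcessURL. It is not taken
from the handler code itself: the class name HandlerUrlFilter and the hosts
in LOGIN_HOSTS are made up for illustration, and it uses only the Java
standard library so you can try it without Selenium on the classpath.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

/** Hypothetical helper: decides which URLs the handler should act on. */
final class HandlerUrlFilter {

    // Assumed hosts that sit behind a login form; replace with your own.
    private static final Set<String> LOGIN_HOSTS = new HashSet<>(
            Arrays.asList("members.example.com", "forum.example.org"));

    private HandlerUrlFilter() {}

    /** Returns true only for http(s) URLs whose host is in LOGIN_HOSTS. */
    static boolean shouldProcessURL(String url) {
        if (url == null) {
            return false;
        }
        try {
            URI uri = new URI(url.trim());
            String scheme = uri.getScheme();
            String host = uri.getHost();
            if (scheme == null || host == null) {
                return false;
            }
            boolean isHttp = scheme.equalsIgnoreCase("http")
                    || scheme.equalsIgnoreCase("https");
            // Host names are case-insensitive, so compare in lower case.
            return isHttp && LOGIN_HOSTS.contains(host.toLowerCase(Locale.ROOT));
        } catch (URISyntaxException e) {
            // Malformed URL: leave it to the normal fetch path.
            return false;
        }
    }
}
```

In your handler, shouldProcessURL could simply delegate to a check like
this one, so the driver manipulation only runs for the sites you know need
a login.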

As for documentation, the Selenium docs [1] are pretty good. Handling
authentication can be a pain, and I don’t have many recommendations
there. You’ll have to search around for the approach that fits your
site. Stack Overflow is always good =D [2]

[1] http://www.seleniumhq.org/docs/03_webdriver.jsp
[2] https://stackoverflow.com/questions/24304752/how-to-handle-authentication-popup-with-selenium-webdriver-using-java

Hope that helps
--
Michael J. Joyce
Scientific Applications Software Engineer
Instrument Software and Science Data Systems
NASA Jet Propulsion Laboratory
California Institute of Technology
4800 Oak Grove Drive
Pasadena, California 91109
Mail Stop: 158-242
Cel: (626) 788-7511
Tel: (818) 354-7550
Fax: (818) 393-1370





On 10/1/15, 6:50 AM, "Christian Alan Mattmann" <mattm...@usc.edu> wrote:

>Hi Team 18,
>
>This is great and you are headed in the right direction.
>
>MikeJ - can you suggest a sample reference to take a look
>at for the team?
>
>Cheers,
>Chris
>
>+++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>Chris Mattmann, Ph.D.
>Adjunct Associate Professor, Computer Science Department
>University of Southern California
>Los Angeles, CA 90089 USA
>Email: mattm...@usc.edu
>WWW: http://sunset.usc.edu/
>+++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>
>
>
>
>-----Original Message-----
>From: Mithun Maragiri <marag...@usc.edu>
>Date: Thursday, October 1, 2015 at 12:21 AM
>To: jpluser <mattm...@usc.edu>
>Cc: "ramac...@usc.edu" <ramac...@usc.edu>, Charan Shampur
><sham...@usc.edu>, Sharan Kadagad <kada...@usc.edu>
>Subject: Team 18: Selenium handler question
>
>>Hello Professor,
>>
>>
>>We would like some help writing Selenium handler code.
>>We crawled the URLs for 30 rounds and ended up with a few URLs that
>>were not fetched.
>>We wrote a Python script to filter the URLs whose status code is not
>>OK/SUCCESS.
>>Once we had these URLs, we manually checked one of them to see why
>>it was not fetched.
>>We discovered that the website was behind a login form and needed
>>authentication to access its web pages.
>>All the fetch requests made by the crawler are HTTP GET requests, but for
>>these unfetched URLs we need to make a POST request. We are thinking of
>>this approach:
>>
>>
>>Approach:
>>> Write a script which filters all the URLs whose status code is not
>>>success.
>>> Create a webDriver for each of these URLs in the DefaultHandler().
>>> Manually sign up on each of these unfetched sites with the same login
>>>credentials. Example: login= Team18; Password= team18Password
>>> Once the driver is created, create a POST request with the URL, append
>>>our login credentials, and then make an AJAX call.
>>> After studying materials online, we realized that Selenium serves
>>>exactly this purpose. But we cannot find any examples online
>>>where someone has written a handler. We are finding it hard to
>>>understand how to write the handler.
>>
>>
>>Can you please provide some example code for writing a handler? We will
>>use that as a reference and adapt it to our needs.
>>
>>
>>Thanks,
>>
>>Team 18
>>
>>
>>
>>
>
