Hi folks,

The handler interface requires you to implement two functions:

    void processDriver(..)
    boolean shouldProcessURL(..)

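A minimal sketch of what such a handler could look like, in Java. Treat it as an illustration under assumptions, not the actual interface: PageDriver and PageElement are hypothetical stand-ins for Selenium's WebDriver and WebElement (so the snippet compiles without Selenium on the classpath), and DriverHandler, LoginFormHandler, findByName, the host members.example.com and the form field names "username" and "password" are all made up for the example. The credentials are the placeholder ones from the original question.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical stand-ins so this sketch compiles without Selenium.
// In a real handler use org.openqa.selenium.WebDriver / WebElement instead.
interface PageElement {
    void sendKeys(CharSequence... keys); // mirrors WebElement.sendKeys
    void submit();                       // mirrors WebElement.submit
}

interface PageDriver {
    // Assumed helper; real Selenium: driver.findElement(By.name(name))
    PageElement findByName(String name);
}

// The two-function contract listed above (interface name is an assumption).
interface DriverHandler {
    void processDriver(PageDriver driver);
    boolean shouldProcessURL(String url);
}

class LoginFormHandler implements DriverHandler {
    // Assumed list of hosts that were found to sit behind a login form.
    private static final List<String> LOGIN_HOSTS =
            Arrays.asList("members.example.com");

    @Override
    public boolean shouldProcessURL(String url) {
        try {
            String host = new URI(url).getHost();
            return host != null
                    && LOGIN_HOSTS.contains(host.toLowerCase(Locale.ROOT));
        } catch (URISyntaxException e) {
            return false; // malformed URL: do not run this handler
        }
    }

    @Override
    public void processDriver(PageDriver driver) {
        // The driver arrives already loaded with the page being fetched.
        // Field names are assumptions about the target site's login form.
        driver.findByName("username").sendKeys("Team18");
        PageElement password = driver.findByName("password");
        password.sendKeys("team18Password");
        password.submit(); // submits the form that contains this field
    }
}
```

With the real Selenium types, the lookup would be driver.findElement(By.name("username")), and you would probably want to wait for the post-login page to load before processDriver returns.
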
The processDriver function can do any manipulation of the web driver that
you’d like. The driver is given to your handler preloaded with the URL for
the current page being fetched, so you should be able to take it and do
whatever manipulations are necessary. The content will be pulled out of the
body tag of the document when this function returns.

shouldProcessURL is used to check whether the handler should be loaded for a
particular URL. If you want the handler to run over every URL, just have it
return true. If you want it to run on only certain URLs, you can implement
that logic in there.

As for documentation, the Selenium docs [1] are pretty good. If you need to
handle authentication, that can be a pain. I don’t have too many
recommendations there. You’ll have to search around and figure out the best
approach. Stack Overflow is always good =D [2]

[1] http://www.seleniumhq.org/docs/03_webdriver.jsp
[2] https://stackoverflow.com/questions/24304752/how-to-handle-authentication-popup-with-selenium-webdriver-using-java

Hope that helps

--
Michael J. Joyce
Scientific Applications Software Engineer
Instrument Software and Science Data Systems
NASA Jet Propulsion Laboratory
California Institute of Technology
4800 Oak Grove Drive
Pasadena, California 91109
Mail Stop: 158-242
Cel: (626) 788-7511
Tel: (818) 354-7550
Fax: (818) 393-1370

On 10/1/15, 6:50 AM, "Christian Alan Mattmann" <mattm...@usc.edu> wrote:

>Hi Team 18,
>
>This is great and you are headed in the right direction.
>
>MikeJ - can you suggest a sample reference to take a look
>at for the team?
>
>Cheers,
>Chris
>
>+++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>Chris Mattmann, Ph.D.
>Adjunct Associate Professor, Computer Science Department
>University of Southern California
>Los Angeles, CA 90089 USA
>Email: mattm...@usc.edu
>WWW: http://sunset.usc.edu/
>+++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>
>
>
>
>-----Original Message-----
>From: Mithun Maragiri <marag...@usc.edu>
>Date: Thursday, October 1, 2015 at 12:21 AM
>To: jpluser <mattm...@usc.edu>
>Cc: "ramac...@usc.edu" <ramac...@usc.edu>, Charan Shampur
><sham...@usc.edu>, Sharan Kadagad <kada...@usc.edu>
>Subject: Team 18: Selenium handler question
>
>>Hello Professor,
>>
>>
>>We want some help in writing selenium handler code.
>>We crawled the URLs for 30 rounds and we ended up with a few URLs which
>>were not fetched.
>>We wrote a python script to filter these URLs whose status code is not
>>OK/SUCCESS.
>>Once we had these URLs we manually checked any one of the URLs as to why
>>it is not fetched.
>>We discovered that the website was behind the form and needed
>>authentication to access its web pages.
>>All the fetch requests made by crawler are http GET requests but for
>>these unfetched URLs we need to make POST request. We are thinking of
>>this approach
>>
>>
>>Approach:
>>> Write a script which filters all the URLs whose status code is not
>>>success.
>>> create a webDriver for each of these URLs in the DefaultHandler()
>>> manually sign up to each of these unfetched URLs with the same login
>>>credentials: Example: login= Team18; Password= team18Password
>>> once driver is created, create a POST request with the URL and append
>>>our login credentials and then make an AJAX call
>>> after studying materials online we realized that the purpose of
>>>selenium is exactly the same. But we cannot find any examples online
>>>where someone has written a handler. We are finding it hard to
>>>understand how to write the handler.
>>
>>
>>Can you please provide some example code writing the handler? we will use
>>that as the reference and try to write as per our need
>>
>>
>>Thanks,
>>
>>Team 18
>>
>>
>>
>>
>