Hi Michael,

Maybe you can write a custom HTML parser plugin to parse the JavaScript.
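To give a rough idea of what "extracting links only" from JavaScript amounts to, here is a small sketch. It is plain Python for brevity, not Nutch's plugin API, and the regex and the name extract_js_links are my own assumptions for illustration, not the patterns parse-js actually uses. It only picks out string literals that look like URLs, so a script that merely submits a form yields nothing to follow.

```python
import re
from urllib.parse import urljoin

# Hypothetical pattern (an assumption, not taken from parse-js): a quoted
# string literal that starts with http://, https:// or a leading slash.
URL_IN_JS = re.compile(r"""["']((?:https?://|/)[^"'\s]+)["']""")


def extract_js_links(script_text, base_url):
    """Return absolute URLs found as string literals, in order, without duplicates."""
    links = []
    for match in URL_IN_JS.finditer(script_text):
        url = urljoin(base_url, match.group(1))  # resolve relative paths
        if url not in links:
            links.append(url)
    return links


# An onclick handler that names a URL: the literal can be picked up.
onclick = "window.open('/results?page=2'); return false;"
print(extract_js_links(onclick, "http://example.com/search"))

# A handler that only submits a form: there is no URL literal to extract.
submit = "document.forms[0].submit();"
print(extract_js_links(submit, "http://example.com/search"))
```

A custom parser plugin would need more than this kind of pattern matching to reach pages that exist only as the result of a form submit.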

On Tue, Jan 15, 2013 at 6:43 PM, Tejas Patil <[email protected]> wrote:

> AFAIK, you cannot configure Fetcher to make use of firefox or htmlunit. You
> will perhaps have to change the nutch source by yourself.
>
> Thanks,
> Tejas Patil
>
>
> On Tue, Jan 15, 2013 at 12:02 AM, Michael Gang <[email protected]> wrote:
>
> > Hi,
> >
> > I understand.
> > Is there a way to use for a set of predefined pages another browser as
> > fetcher?
> > For example, would it be possible to say nutch that he should use firefox
> > or htmlunit as a fetcher?
> > There are many internet sites with ajax loads and where a click makes a
> > form submit, where no real html snippets exist.
> >
> > Thanks,
> > David
> >
> >
> > On Sun, Jan 13, 2013 at 8:08 PM, Lewis John Mcgibbney <[email protected]> wrote:
> >
> > > This should be correct yes.
> > > If you look at the plugin source you can see the patterns it uses to
> > > extract links.
> > > Also you can check what's in your crawldb using the readdb command
> > > Hth
> > > Lewis
> > >
> > > On Saturday, January 12, 2013, Michael Gang <[email protected]> wrote:
> > > > Hi,
> > > >
> > > > So if there is a javascript which actually submits a form, nutch won't
> > > > follow the link, because it just deals with urls.
> > > > Is this correct?
> > > >
> > > > Thanks,
> > > > David
> > > >
> > > >
> > > > On Tue, Jan 8, 2013 at 5:15 PM, Michael Gang <[email protected]> wrote:
> > > >
> > > >> Hi all,
> > > >>
> > > >> From the features of nutch
> > > >> http://wiki.apache.org/nutch/Features
> > > >> i understand that there is a sort of javascript support
> > > >>
> > > >> JavaScript (for extracting links only?) (parse-js)
> > > >>
> > > >> I don't understand what this exactly means.
> > > >> Let's say if i have a link
> > > >> <a onclick="do_something">
> > > >> or a jquery binding in onready
> > > >> and in this code i open a new window and show there a result of a form
> > > >> submit
> > > >> will nutch extract for me the resulting page as link ?
> > > >>
> > > >> Thanks,
> > > >> David
> > > >>
> > > >>
> > > >
> > >
> > > --
> > > *Lewis*
> >
>

--
Don't Grow Old, Grow Up... :-)

