So there is no way to crawl if they have blocked their web sites from being crawled? I have
one idea, but it seems a bit foolish (it may not work, and I would have to modify the whole
architecture); still, I will ask: what if I use an HTML parser (Jsoup) instead of the
fetcher? An HTML parser can easily pull all the contents of a web page. Can I do this?
I think I would have to rewrite the rest of the parts (segments, updater, indexer, parser),
because I don't think the HTML parser will work with the already existing parts if I
replace the fetcher with it.
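
For illustration, here is a minimal, untested sketch of what that idea might look like, with Jsoup fetching and parsing a page in one step; the URL, user-agent string, and timeout below are only placeholder assumptions, not values from my setup:

import java.io.IOException;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

public class JsoupFetchSketch {
    public static void main(String[] args) throws IOException {
        // Hypothetical target URL; replace with a page you actually need to crawl.
        String url = "https://example.com/";

        // Jsoup downloads the page and parses it into a DOM in one call,
        // so it plays the role of both fetcher and parser here.
        Document doc = Jsoup.connect(url)
                .userAgent("MyCrawler/1.0")   // assumed user-agent string
                .timeout(10000)               // 10-second timeout (assumed)
                .get();

        // Plain text of the page -- this is what would go to the indexer.
        String pageText = doc.body().text();
        System.out.println("Title: " + doc.title());
        System.out.println("Text length: " + pageText.length());

        // Outgoing links -- the part the existing fetcher/updater would
        // normally feed back into the crawl for the next round.
        Elements links = doc.select("a[href]");
        for (Element link : links) {
            System.out.println("Link: " + link.attr("abs:href"));
        }
    }
}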




