Hello,

Use Droids; it's much simpler than Nutch or Heritrix:

http://incubator.apache.org/droids/
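
If a full crawler turns out to be more than you need for about 200 URLs, slow downloads are often caused by fetching pages one after another. Below is a minimal sketch of downloading them in parallel with a plain thread pool from java.util.concurrent. This is not the Droids API: the class name ParallelFetcher and its methods are hypothetical, and the sketch assumes Java 9+ and UTF-8 pages.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.net.URI;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

// Hypothetical helper for illustration; not part of Lucene, Nutch, or Droids.
final class ParallelFetcher {

    // Downloads one page as text. Assumes UTF-8, which real pages may not use.
    static String download(String url) {
        try (InputStream in = URI.create(url).toURL().openStream()) {
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Runs the fetcher for every URL on a fixed-size pool and returns
    // URL -> content in input order. URLs whose fetch throws are skipped.
    static Map<String, String> fetchAll(List<String> urls,
                                        Function<String, String> fetcher,
                                        int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            // Submit everything first so the downloads overlap.
            List<Future<String>> pending = new ArrayList<>();
            for (String url : urls) {
                Callable<String> task = () -> fetcher.apply(url);
                pending.add(pool.submit(task));
            }
            // Then collect the results in the original order.
            Map<String, String> pages = new LinkedHashMap<>();
            for (int i = 0; i < urls.size(); i++) {
                try {
                    pages.put(urls.get(i), pending.get(i).get());
                } catch (ExecutionException e) {
                    // This URL failed; leave it out of the result.
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    break;
                }
            }
            return pages;
        } finally {
            pool.shutdownNow();
        }
    }
}
```

You would call it as ParallelFetcher.fetchAll(urls, ParallelFetcher::download, 10) and then add each returned page to your Lucene index. The pool size of 10 is only a guess to tune, and the sketch sets no timeouts and ignores robots.txt, so treat it as a starting point.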

Otis
--
Sematext -- http://sematext.com/ -- Solr - Lucene - Nutch



----- Original Message ----
> From: Phan The Dai <thienthanhom...@gmail.com>
> To: java-user@lucene.apache.org
> Sent: Sat, January 16, 2010 2:20:47 AM
> Subject: A way to download URLs and index better ?
> 
> Hi everyone, please help me with this question:
> I need to download some web pages from a list of URLs (about 200 links) and
> then index them with Lucene.
> The list is not fixed, because it depends on how my process is defined.
> Currently, in my web application, I wrote a class for downloading, but its
> download time is too long.
> 
> Please recommend a Java library suited to my situation to optimize
> downloading.
> Examples would be very welcome (INPUT: list of URLs; OUTPUT: web page
> content, or an indexed repository).
> Thank you very much.