Hello,

Use Droids; it's much simpler than Nutch or Heritrix:
http://incubator.apache.org/droids/

Otis
--
Sematext -- http://sematext.com/ -- Solr - Lucene - Nutch

----- Original Message ----
> From: Phan The Dai <thienthanhom...@gmail.com>
> To: java-user@lucene.apache.org
> Sent: Sat, January 16, 2010 2:20:47 AM
> Subject: A way to download URLs and index better ?
>
> Hi everyone, please help me with this question:
> I need to download some webpages from a list of URLs (about 200 links) and
> then index them with Lucene.
> This list is not fixed, because it depends on how my process is defined.
> Currently, in my web application, I wrote a class for downloading, but its
> download time is too long.
>
> Please recommend a Java library suited to my situation for optimizing
> downloading.
> Examples would be very welcome (INPUT: list of URLs; OUTPUT: webpage
> content, or an indexed repository).
> Thank you very much.
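
For a list of about 200 URLs, a long download time often comes from fetching the pages one after another. The sketch below is not the Droids API; it is a plain-JDK illustration (Java 9 or later) of fetching a URL list on a fixed-size thread pool. The class name ParallelFetcher, its methods, and the choice to decode pages as UTF-8 are all assumptions made for this example.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.net.URI;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

// Hypothetical helper for illustration; not part of Droids, Nutch, or Lucene.
class ParallelFetcher {

    // Runs the fetcher for every URL on a fixed-size thread pool.
    // INPUT: list of URLs. OUTPUT: map of URL -> page content, in input order.
    // A URL whose fetch throws is mapped to null instead of aborting the batch.
    static Map<String, String> fetchAll(List<String> urls, int threads,
                                        Function<String, String> fetcher) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            // Submit all downloads first so they run concurrently.
            List<Future<String>> futures = new ArrayList<>();
            for (String url : urls) {
                Callable<String> task = () -> fetcher.apply(url);
                futures.add(pool.submit(task));
            }
            // Then collect the results in the original order.
            Map<String, String> pages = new LinkedHashMap<>();
            for (int i = 0; i < urls.size(); i++) {
                try {
                    pages.put(urls.get(i), futures.get(i).get());
                } catch (ExecutionException e) {
                    pages.put(urls.get(i), null);
                }
            }
            return pages;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while downloading", e);
        } finally {
            pool.shutdown();
        }
    }

    // Default fetcher: reads the whole response body, assuming UTF-8 text.
    static String download(String url) {
        try (InputStream in = URI.create(url).toURL().openStream()) {
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

A call such as ParallelFetcher.fetchAll(urls, 8, ParallelFetcher::download) would return the page contents keyed by URL, which could then be passed to the indexing step. The pool size of 8 is an arbitrary starting point; a real crawler would also need timeouts and politeness limits per host.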