nutch-dev
Messages by Thread
[jira] Commented: (NUTCH-631) MoreIndexingFilter fails with NoSuchElementException
JIRA
[jira] Commented: (NUTCH-631) MoreIndexingFilter fails with NoSuchElementException
Edward Quick (JIRA)
[jira] Updated: (NUTCH-631) MoreIndexingFilter fails with NoSuchElementException
Edward Quick (JIRA)
[jira] Commented: (NUTCH-631) MoreIndexingFilter fails with NoSuchElementException
Edward Quick (JIRA)
[jira] Commented: (NUTCH-631) MoreIndexingFilter fails with NoSuchElementException
JIRA
[jira] Updated: (NUTCH-631) MoreIndexingFilter fails with NoSuchElementException
JIRA
[jira] Assigned: (NUTCH-631) MoreIndexingFilter fails with NoSuchElementException
Chris A. Mattmann (JIRA)
[jira] Work started: (NUTCH-631) MoreIndexingFilter fails with NoSuchElementException
Chris A. Mattmann (JIRA)
[jira] Updated: (NUTCH-631) MoreIndexingFilter fails with NoSuchElementException
Sami Siren (JIRA)
[jira] Commented: (NUTCH-631) MoreIndexingFilter fails with NoSuchElementException
Chris A. Mattmann (JIRA)
[jira] Resolved: (NUTCH-631) MoreIndexingFilter fails with NoSuchElementException
Sami Siren (JIRA)
[jira] Commented: (NUTCH-631) MoreIndexingFilter fails with NoSuchElementException
hasan (JIRA)
Problem compiling plugins
Pau
Re: Problem compiling plugins
ogjunk-nutch
Re: Problem compiling plugins
Pau
Welcome Otis Gospodnetic as Nutch committer
Andrzej Bialecki
Re: Welcome Otis Gospodnetic as Nutch committer
Dennis Kubes
Re: Welcome Otis Gospodnetic as Nutch committer
ogjunk-nutch
Re: Welcome Otis Gospodnetic as Nutch committer
wuqi
[jira] Commented: (NUTCH-594) Serve Nutch search results in XML and JSON
wojtek kolodziejczyk (JIRA)
[jira] Commented: (NUTCH-594) Serve Nutch search results in XML and JSON
Dennis Kubes (JIRA)
Internet crawl: CrawlDb getting big!
Mathijs Homminga
Re: Internet crawl: CrawlDb getting big!
wuqi
Re: Internet crawl: CrawlDb getting big!
Mathijs Homminga
Re: Internet crawl: CrawlDb getting big!
wuqi
Re: Internet crawl: CrawlDb getting big!
Mathijs Homminga
Re: Internet crawl: CrawlDb getting big!
ogjunk-nutch
[Nutch Wiki] Update of "FortuneCookies" by OtisGospodnetic
Apache Wiki
[Nutch Wiki] Update of "Support" by OtisGospodnetic
Apache Wiki
[Nutch Wiki] Update of "Support" by OtisGospodnetic
Apache Wiki
[Nutch Wiki] Update of "Support" by OtisGospodnetic
Apache Wiki
[jira] Created: (NUTCH-630) Error caused by index-more plugin in the latest svn revision - 652259
taknev ivrok (JIRA)
[jira] Closed: (NUTCH-630) Error caused by index-more plugin in the latest svn revision - 652259
JIRA
Nutch 2 Architecture
info
Re: Nutch 2 Architecture
Dennis Kubes
[Nutch Wiki] Update of "Nutch2Architecture" by DennisKubes
Apache Wiki
[Nutch Wiki] Update of "Nutch2Architecture" by DennisKubes
Apache Wiki
[Nutch Wiki] Update of "FetchCycleOverlap" by OtisGospodnetic
Apache Wiki
[Nutch Wiki] Update of "FetchCycleOverlap" by OtisGospodnetic
Apache Wiki
[Nutch Wiki] Update of "FetchCycleOverlap" by OtisGospodnetic
Apache Wiki
[Nutch Wiki] Update of "GettingNutchRunningWithDebian" by StevenHayles
Apache Wiki
Re: Fetching inefficiency
ogjunk-nutch
Re: Fetching inefficiency
Ken Krugler
Fw: [jira] Closed: (INFRA-1583) Wiki => email not working for Nutch wiki
ogjunk-nutch
Re: [jira] Commented: (NUTCH-628) Host database to keep track of host-level information
ogjunk-nutch
Re: [jira] Commented: (NUTCH-628) Host database to keep track of host-level information
Andrzej Bialecki
Re: [jira] Commented: (NUTCH-628) Host database to keep track of host-level information
ogjunk-nutch
Re: [jira] Commented: (NUTCH-628) Host database to keep track of host-level information
Andrzej Bialecki
Re: [jira] Commented: (NUTCH-628) Host database to keep track of host-level information
ogjunk-nutch
Re: [jira] Commented: (NUTCH-628) Host database to keep track of host-level information
Andrzej Bialecki
[jira] Closed: (NUTCH-596) ParseSegments parse content even if its not CrawlDatum.STATUS_FETCH_SUCCESS
JIRA
[jira] Resolved: (NUTCH-596) ParseSegments parse content even if its not CrawlDatum.STATUS_FETCH_SUCCESS
JIRA
[jira] Commented: (NUTCH-596) ParseSegments parse content even if its not CrawlDatum.STATUS_FETCH_SUCCESS
Otis Gospodnetic (JIRA)
[jira] Commented: (NUTCH-596) ParseSegments parse content even if its not CrawlDatum.STATUS_FETCH_SUCCESS
Andrzej Bialecki (JIRA)
[jira] Commented: (NUTCH-596) ParseSegments parse content even if its not CrawlDatum.STATUS_FETCH_SUCCESS
Hudson (JIRA)
[jira] Assigned: (NUTCH-596) ParseSegments parse content even if its not CrawlDatum.STATUS_FETCH_SUCCESS
JIRA
[jira] Updated: (NUTCH-596) ParseSegments parse content even if its not CrawlDatum.STATUS_FETCH_SUCCESS
JIRA
[jira] Updated: (NUTCH-596) ParseSegments parse content even if its not CrawlDatum.STATUS_FETCH_SUCCESS
JIRA
[jira] Updated: (NUTCH-596) ParseSegments parse content even if its not CrawlDatum.STATUS_FETCH_SUCCESS
JIRA
[jira] Updated: (NUTCH-442) Integrate Solr/Nutch
JIRA
[jira] Updated: (NUTCH-442) Integrate Solr/Nutch
Caspar MacRae (JIRA)
[jira] Updated: (NUTCH-442) Integrate Solr/Nutch
Caspar MacRae (JIRA)
[jira] Updated: (NUTCH-442) Integrate Solr/Nutch
julien nioche (JIRA)
[jira] Updated: (NUTCH-442) Integrate Solr/Nutch
James Tan (JIRA)
[jira] Updated: (NUTCH-442) Integrate Solr/Nutch
James Tan (JIRA)
[jira] Updated: (NUTCH-442) Integrate Solr/Nutch
Guillaume Smet (JIRA)
[jira] Updated: (NUTCH-442) Integrate Solr/Nutch
Nick Tkach (JIRA)
[jira] Updated: (NUTCH-442) Integrate Solr/Nutch
JIRA
[jira] Updated: (NUTCH-442) Integrate Solr/Nutch
JIRA
[jira] Commented: (NUTCH-442) Integrate Solr/Nutch
Otis Gospodnetic (JIRA)
[jira] Commented: (NUTCH-442) Integrate Solr/Nutch
JIRA
[jira] Commented: (NUTCH-442) Integrate Solr/Nutch
JIRA
[jira] Commented: (NUTCH-442) Integrate Solr/Nutch
JIRA
[jira] Commented: (NUTCH-442) Integrate Solr/Nutch
Vladimir Garvardt (JIRA)
Re: [jira] Commented: (NUTCH-442) Integrate Solr/Nutch
Julien Nioche
[jira] Commented: (NUTCH-442) Integrate Solr/Nutch
James Tan (JIRA)
[jira] Commented: (NUTCH-442) Integrate Solr/Nutch
James Tan (JIRA)
[jira] Commented: (NUTCH-442) Integrate Solr/Nutch
Dmitry Grinberg (JIRA)
[jira] Commented: (NUTCH-442) Integrate Solr/Nutch
JIRA
[jira] Commented: (NUTCH-442) Integrate Solr/Nutch
Enis Soztutar (JIRA)
[jira] Commented: (NUTCH-442) Integrate Solr/Nutch
Andrzej Bialecki (JIRA)
[jira] Commented: (NUTCH-442) Integrate Solr/Nutch
JIRA
[jira] Commented: (NUTCH-442) Integrate Solr/Nutch
julien nioche (JIRA)
[jira] Commented: (NUTCH-442) Integrate Solr/Nutch
Felix Z. (JIRA)
[jira] Commented: (NUTCH-442) Integrate Solr/Nutch
JIRA
[jira] Commented: (NUTCH-442) Integrate Solr/Nutch
Felix Z. (JIRA)
[jira] Commented: (NUTCH-442) Integrate Solr/Nutch
julien nioche (JIRA)
[jira] Commented: (NUTCH-442) Integrate Solr/Nutch
Tony Wang (JIRA)
[jira] Commented: (NUTCH-442) Integrate Solr/Nutch
Aaron Hammond (JIRA)
[jira] Commented: (NUTCH-442) Integrate Solr/Nutch
JIRA
[jira] Commented: (NUTCH-442) Integrate Solr/Nutch
JIRA
[jira] Commented: (NUTCH-442) Integrate Solr/Nutch
JIRA
[jira] Commented: (NUTCH-442) Integrate Solr/Nutch
Hudson (JIRA)
Wiki -> email -> nutch-dev?
ogjunk-nutch
Re: Wiki -> email -> nutch-dev?
Dennis Kubes
Re: Wiki -> email -> nutch-dev?
ogjunk-nutch
Re: Wiki -> email -> nutch-dev?
Dennis Kubes
Re: Wiki -> email -> nutch-dev?
ogjunk-nutch
[jira] Created: (NUTCH-629) Detect slow and timeout servers and drop their URLs
Otis Gospodnetic (JIRA)
[jira] Updated: (NUTCH-629) Detect slow and timeout servers and drop their URLs
Otis Gospodnetic (JIRA)
[jira] Commented: (NUTCH-629) Detect slow and timeout servers and drop their URLs
Otis Gospodnetic (JIRA)
[jira] Assigned: (NUTCH-629) Detect slow and timeout servers and drop their URLs
Otis Gospodnetic (JIRA)
[jira] Created: (NUTCH-628) Host database to keep track of host-level information
Otis Gospodnetic (JIRA)
[jira] Updated: (NUTCH-628) Host database to keep track of host-level information
Otis Gospodnetic (JIRA)
[jira] Updated: (NUTCH-628) Host database to keep track of host-level information
Otis Gospodnetic (JIRA)
[jira] Issue Comment Edited: (NUTCH-628) Host database to keep track of host-level information
Otis Gospodnetic (JIRA)
[jira] Issue Comment Edited: (NUTCH-628) Host database to keep track of host-level information
Otis Gospodnetic (JIRA)
[jira] Commented: (NUTCH-628) Host database to keep track of host-level information
Andrzej Bialecki (JIRA)
[jira] Issue Comment Edited: (NUTCH-628) Host database to keep track of host-level information
Otis Gospodnetic (JIRA)
[jira] Commented: (NUTCH-628) Host database to keep track of host-level information
JIRA
[jira] Commented: (NUTCH-628) Host database to keep track of host-level information
Andrzej Bialecki (JIRA)
[jira] Commented: (NUTCH-628) Host database to keep track of host-level information
Otis Gospodnetic (JIRA)
[jira] Updated: (NUTCH-628) Host database to keep track of host-level information
JIRA
[jira] Commented: (NUTCH-628) Host database to keep track of host-level information
Otis Gospodnetic (JIRA)
[jira] Commented: (NUTCH-628) Host database to keep track of host-level information
JIRA
[jira] Commented: (NUTCH-628) Host database to keep track of host-level information
Otis Gospodnetic (JIRA)
[jira] Updated: (NUTCH-628) Host database to keep track of host-level information
JIRA
[jira] Commented: (NUTCH-628) Host database to keep track of host-level information
JIRA
[jira] Commented: (NUTCH-628) Host database to keep track of host-level information
Hudson (JIRA)
[jira] Commented: (NUTCH-628) Host database to keep track of host-level information
Otis Gospodnetic (JIRA)
[jira] Commented: (NUTCH-628) Host database to keep track of host-level information
JIRA
[jira] Commented: (NUTCH-628) Host database to keep track of host-level information
Andrzej Bialecki (JIRA)
[jira] Commented: (NUTCH-628) Host database to keep track of host-level information
JIRA
[jira] Updated: (NUTCH-628) Host database to keep track of host-level information
Chris A. Mattmann (JIRA)
Keywords in documents
Amit Kumar Verma
Re: Keywords in documents
ogjunk-nutch
Fetcher2 Reduce Phase Question
Sandeep Tata
Re: Fetcher2 Reduce Phase Question
Andrzej Bialecki
[jira] Commented: (NUTCH-570) Improvement of URL Ordering in Generator.java
Otis Gospodnetic (JIRA)
[jira] Commented: (NUTCH-570) Improvement of URL Ordering in Generator.java
Ned Rockson (JIRA)
[jira] Commented: (NUTCH-570) Improvement of URL Ordering in Generator.java
Otis Gospodnetic (JIRA)
[jira] Commented: (NUTCH-570) Improvement of URL Ordering in Generator.java
Julien Nioche (JIRA)
[jira] Commented: (NUTCH-570) Improvement of URL Ordering in Generator.java
Dmitry Lihachev (JIRA)
[jira] Commented: (NUTCH-570) Improvement of URL Ordering in Generator.java
Otis Gospodnetic (JIRA)
[jira] Commented: (NUTCH-570) Improvement of URL Ordering in Generator.java
Chris A. Mattmann (JIRA)
Re: [jira] Updated: (NUTCH-627) Minimize host address lookup
ogjunk-nutch
[jira] Closed: (NUTCH-500) Add hadoop masters configuration file into conf folder
Dennis Kubes (JIRA)
[jira] Created: (NUTCH-627) Minimize host address lookup
Otis Gospodnetic (JIRA)
[jira] Updated: (NUTCH-627) Minimize host address lookup
Otis Gospodnetic (JIRA)
Re: [jira] Updated: (NUTCH-627) Minimize host address lookup
Andrzej Bialecki
Re: [jira] Updated: (NUTCH-627) Minimize host address lookup
Dennis Kubes
Re: [jira] Updated: (NUTCH-627) Minimize host address lookup
Chris Mattmann
[jira] Assigned: (NUTCH-627) Minimize host address lookup
Otis Gospodnetic (JIRA)
[jira] Commented: (NUTCH-627) Minimize host address lookup
Andrzej Bialecki (JIRA)
[jira] Resolved: (NUTCH-627) Minimize host address lookup
Otis Gospodnetic (JIRA)
[jira] Commented: (NUTCH-627) Minimize host address lookup
Hudson (JIRA)
[jira] Closed: (NUTCH-627) Minimize host address lookup
JIRA
Hudson build is back to normal: Nutch-trunk #416
Apache Hudson Server
Re: what is the difference between nutch and some other opensource search engines
ogjunk-nutch
found a bug in plugin/protocol-http
cybercouf
Build failed in Hudson: Nutch-trunk #413
Apache Hudson Server
Hudson build is back to normal: Nutch-trunk #414
Apache Hudson Server
[jira] Created: (NUTCH-626) fetcher2 breaks out the domain with db.ignore.external.links set at cross domain redirects
Remco Verhoef (JIRA)
[jira] Updated: (NUTCH-626) fetcher2 breaks out the domain with db.ignore.external.links set at cross domain redirects
Remco Verhoef (JIRA)
[jira] Assigned: (NUTCH-626) fetcher2 breaks out the domain with db.ignore.external.links set at cross domain redirects
JIRA
[jira] Updated: (NUTCH-626) fetcher2 breaks out the domain with db.ignore.external.links set at cross domain redirects
Otis Gospodnetic (JIRA)
[jira] Updated: (NUTCH-626) fetcher2 breaks out the domain with db.ignore.external.links set at cross domain redirects
JIRA
[jira] Resolved: (NUTCH-626) fetcher2 breaks out the domain with db.ignore.external.links set at cross domain redirects
Sami Siren (JIRA)
[jira] Commented: (NUTCH-626) fetcher2 breaks out the domain with db.ignore.external.links set at cross domain redirects
Hudson (JIRA)
Build failed in Hudson: Nutch-trunk #411
Apache Hudson Server
Hudson build is back to normal: Nutch-trunk #412
Apache Hudson Server
Build failed in Hudson: Nutch-trunk #408
Apache Hudson Server
Build failed in Hudson: Nutch-trunk #409
Apache Hudson Server
Hudson build is back to normal: Nutch-trunk #410
Apache Hudson Server
Is there any LSI implementation?
Edward J. Yoon
Re: Is there any LSI implementation?
ogjunk-nutch
[jira] Commented: (NUTCH-500) Add hadoop masters configuration file into conf folder
Dennis Kubes (JIRA)
[jira] Commented: (NUTCH-500) Add hadoop masters configuration file into conf folder
Hudson (JIRA)
[jira] Updated: (NUTCH-500) Add hadoop masters configuration file into conf folder
Dennis Kubes (JIRA)
[jira] Assigned: (NUTCH-16) boost documents matching a url pattern
Dennis Kubes (JIRA)
[jira] Assigned: (NUTCH-48) "Did you mean" query enhancement/refignment feature request
Dennis Kubes (JIRA)
[jira] Closed: (NUTCH-75) Patch for WebDBReader to get more detailed information about WebDBs
Dennis Kubes (JIRA)
[jira] Assigned: (NUTCH-213) checkstyle
Dennis Kubes (JIRA)
[jira] Assigned: (NUTCH-295) More description for fetcher.threads.fetch property
Dennis Kubes (JIRA)
[jira] Assigned: (NUTCH-291) OpenSearchServlet should return "date" as well as "lastModified"
Dennis Kubes (JIRA)
[jira] Assigned: (NUTCH-249) black- white list url filtering
Dennis Kubes (JIRA)
[jira] Resolved: (NUTCH-447) Dmoz Structure Parser Tool
Dennis Kubes (JIRA)
[jira] Closed: (NUTCH-447) Dmoz Structure Parser Tool
Dennis Kubes (JIRA)
[jira] Assigned: (NUTCH-500) Add hadoop masters configuration file into conf folder
Dennis Kubes (JIRA)
[jira] Closed: (NUTCH-555) StackOverflowError in DomContentUtils
Dennis Kubes (JIRA)
[jira] Resolved: (NUTCH-555) StackOverflowError in DomContentUtils
Dennis Kubes (JIRA)
[jira] Updated: (NUTCH-609) Allow Plugins to be Loaded from Jar File(s)
Dennis Kubes (JIRA)
[jira] Updated: (NUTCH-609) Allow Plugins to be Loaded from Jar File(s)
Sami Siren (JIRA)
[jira] Updated: (NUTCH-609) Allow Plugins to be Loaded from Jar File(s)
Chris A. Mattmann (JIRA)
[jira] Created: (NUTCH-625) Non-ascii character broken in dumped content for mixed encoding (utf-8 and multi-byte)
Vinci (JIRA)
[jira] Updated: (NUTCH-625) Non-ascii character broken in dumped content for mixed encoding (utf-8 and multi-byte)
Vinci (JIRA)
[jira] Updated: (NUTCH-625) Non-ascii character broken in dumped content for mixed encoding (utf-8 and multi-byte)
JIRA
[jira] Created: (NUTCH-624) Better parsed text
Vinci (JIRA)
[jira] Updated: (NUTCH-624) Better parsed text by default parser
Vinci (JIRA)
[jira] Closed: (NUTCH-624) Better parsed text by default parser
Andrzej Bialecki (JIRA)
[jira] Commented: (NUTCH-624) Better parsed text by default parser
Andrzej Bialecki (JIRA)
Re: [jira] Created: (NUTCH-624) Better parsed text
ogjunk-nutch
Re: [jira] Created: (NUTCH-624) Better parsed text
Vinci
siteinfo.xml
Chen, Tao
[jira] Created: (NUTCH-623) Change name of plugin source directory from "languageidentifier" to "language-identifier"
Ignacio J. Ortega (JIRA)
[jira] Updated: (NUTCH-623) Change plugin source directory "languageidentifier" to "language-identifier"
Ignacio J. Ortega (JIRA)
Build failed in Hudson: Nutch-trunk #404
Apache Hudson Server
Build failed in Hudson: Nutch-trunk #405
Apache Hudson Server
Hudson build is back to normal: Nutch-trunk #406
Apache Hudson Server
Glitches debuggging on eclipse with languageidentifier plugin
Nacho (Derecho.com)