nutch-dev
Messages by Date
2008/06/30
[jira] Commented: (NUTCH-634) Patch - Nutch - Hadoop 0.17.0
Michael Gottesman (JIRA)
2008/06/30
[jira] Commented: (NUTCH-634) Patch - Nutch - Hadoop 0.17.0
Andrzej Bialecki (JIRA)
2008/06/28
[jira] Closed: (NUTCH-614) Order Inlinks by OPIC score of parent page
Dennis Kubes (JIRA)
2008/06/28
howto make nutch search only files whose path has certain string in it?
Mr Shore
2008/06/26
some technical advice
Winton Davies
2008/06/24
Re: some doubt on name of class files
Mr Shore
2008/06/24
some doubt on name of class files
Mr Shore
2008/06/24
Re: ask for help, about patch - nutch - hadoop0.17
Lincoln Ritter
2008/06/24
[jira] Updated: (NUTCH-634) Patch - Nutch - Hadoop 0.17.0
Lincoln Ritter (JIRA)
2008/06/24
Re: how do add a new filed and sort on this field
Mr Shore
2008/06/23
Re: how do add a new filed and sort on this field
Mr Shore
2008/06/23
Re: problem with URLS/nutch
Drew Hite
2008/06/23
Re: problem with URLS/nutch
All day coders
2008/06/23
Plugin Class Loading
Tyler Wykoff
2008/06/23
Re: [jira] Commented: (NUTCH-442) Integrate Solr/Nutch
Julien Nioche
2008/06/23
problem with URLS/nutch
yogesh somvanshi
2008/06/23
[jira] Created: (NUTCH-636) Http client plug-in https doesn't work on IBM JRE
Curtis d'Entremont (JIRA)
2008/06/23
Re: how do add a new filed and sort on this field
Mr Shore
2008/06/23
Re: how do add a new filed and sort on this field
Mr Shore
2008/06/23
Re: how do add a new filed and sort on this field
All day coders
2008/06/23
Re: how do add a new filed and sort on this field
Mr Shore
2008/06/23
Re: how do add a new filed and sort on this field
Mr Shore
2008/06/23
Re: how do add a new filed and sort on this field
Mr Shore
2008/06/22
Re: how do add a new filed and sort on this field
Mr Shore
2008/06/22
Re: Timeline for 1.0 release?
Otis Gospodnetic
2008/06/22
Timeline for 1.0 release?
David Grandinetti
2008/06/22
Re: need some help about distribution
Otis Gospodnetic
2008/06/22
[jira] Updated: (NUTCH-442) Integrate Solr/Nutch
julien nioche (JIRA)
2008/06/22
need some help about distribution
Mohammad Monirul Hoque
2008/06/21
[jira] Commented: (NUTCH-442) Integrate Solr/Nutch
Vladimir Garvardt (JIRA)
2008/06/20
[jira] Updated: (NUTCH-635) LinkAnalysis Tool for Nutch
Dennis Kubes (JIRA)
2008/06/20
Re: how do add a new filed and sort on this field
All day coders
2008/06/20
Boolean query
All day coders
2008/06/19
how do add a new filed and sort on this field
Mr Shore
2008/06/19
Re: nutch 2.0
Dennis Kubes
2008/06/19
nutch 2.0
Marko Bauhardt
2008/06/17
Hadoop get together @ Berlin
idrost
2008/06/15
[jira] Commented: (NUTCH-635) LinkAnalysis Tool for Nutch
Dennis Kubes (JIRA)
2008/06/15
[jira] Updated: (NUTCH-635) LinkAnalysis Tool for Nutch
Dennis Kubes (JIRA)
2008/06/15
[jira] Commented: (NUTCH-635) LinkAnalysis Tool for Nutch
Andrzej Bialecki (JIRA)
2008/06/14
[jira] Updated: (NUTCH-635) LinkAnalysis Tool for Nutch
Dennis Kubes (JIRA)
2008/06/14
[jira] Commented: (NUTCH-635) LinkAnalysis Tool for Nutch
Dennis Kubes (JIRA)
2008/06/14
[jira] Commented: (NUTCH-635) LinkAnalysis Tool for Nutch
Andrzej Bialecki (JIRA)
2008/06/13
[jira] Updated: (NUTCH-635) LinkAnalysis Tool for Nutch
Dennis Kubes (JIRA)
2008/06/12
java.lang.StackOverflowError in HTMLMetaProcessor.getMetaTagsHelper
Siddhartha Reddy
2008/06/12
[jira] Commented: (NUTCH-634) Patch - Nutch - Hadoop 0.17.0
Michael Gottesman (JIRA)
2008/06/12
Re: SegmentMerger "no input paths" problem and "special files/directories"
Lincoln Ritter
2008/06/12
Re: [jira] Commented: (NUTCH-634) Patch - Nutch - Hadoop 0.17.0
Andrzej Bialecki
2008/06/12
Re: [jira] Commented: (NUTCH-634) Patch - Nutch - Hadoop 0.17.0
Lincoln Ritter
2008/06/12
[jira] Commented: (NUTCH-634) Patch - Nutch - Hadoop 0.17.0
Andrzej Bialecki (JIRA)
2008/06/12
[jira] Updated: (NUTCH-634) Patch - Nutch - Hadoop 0.17.0
Lincoln Ritter (JIRA)
2008/06/12
[jira] Commented: (NUTCH-634) Patch - Nutch - Hadoop 0.17.0
Lincoln Ritter (JIRA)
2008/06/12
[jira] Updated: (NUTCH-634) Patch - Nutch - Hadoop 0.17.0
Michael Gottesman (JIRA)
2008/06/12
[jira] Updated: (NUTCH-635) LinkAnalysis Tool for Nutch
Dennis Kubes (JIRA)
2008/06/12
[jira] Created: (NUTCH-635) LinkAnalysis Tool for Nutch
Dennis Kubes (JIRA)
2008/06/11
[jira] Commented: (NUTCH-621) Nutch needs to declare it's crypto usage
Chris A. Mattmann (JIRA)
2008/06/11
Re: SegmentMerger "no input paths" problem and "special files/directories"
ogjunk-nutch
2008/06/11
SegmentMerger "no input paths" problem and "special files/directories"
Lincoln Ritter
2008/06/10
[jira] Commented: (NUTCH-634) Patch - Nutch - Hadoop 0.17.0
Andrzej Bialecki (JIRA)
2008/06/10
[jira] Assigned: (NUTCH-634) Patch - Nutch - Hadoop 0.17.0
Andrzej Bialecki (JIRA)
2008/06/10
[jira] Commented: (NUTCH-621) Nutch needs to declare it's crypto usage
Grant Ingersoll (JIRA)
2008/06/10
[jira] Commented: (NUTCH-621) Nutch needs to declare it's crypto usage
Sami Siren (JIRA)
2008/06/10
[jira] Commented: (NUTCH-621) Nutch needs to declare it's crypto usage
Chris A. Mattmann (JIRA)
2008/06/10
[jira] Commented: (NUTCH-621) Nutch needs to declare it's crypto usage
Grant Ingersoll (JIRA)
2008/06/09
[jira] Commented: (NUTCH-634) Patch - Nutch - Hadoop 0.17.0
Michael Gottesman (JIRA)
2008/06/09
[jira] Created: (NUTCH-634) Patch - Nutch - Hadoop 0.17.0
Michael Gottesman (JIRA)
2008/06/09
Re: nutch-0.9 and hadoop-0.15.0
ogjunk-nutch
2008/06/09
nutch-0.9 and hadoop-0.15.0
m.harig
2008/06/08
Re: nutch file content limit
m.harig
2008/06/06
Re: svn nutch with hadoop .17
ogjunk-nutch
2008/06/06
Re: nutch file content limit
ogjunk-nutch
2008/06/06
upgrade nutch-0.9 hadoop-0.17
m.harig
2008/06/06
Re: nutch file content limit
m.harig
2008/06/05
Re: svn nutch with hadoop .17
Michael Gottesman
2008/06/05
Re: svn nutch with hadoop .17
Lincoln Ritter
2008/06/05
svn nutch with hadoop .17
Michael Gottesman
2008/06/05
Re: svn nutch with hadoop 0.17
Lincoln Ritter
2008/06/05
recrawl in 1.0
scottyd
2008/06/05
Re: nutch file content limit
ogjunk-nutch
2008/06/05
Re: nutch file content limit
m.harig
2008/06/04
[jira] Commented: (NUTCH-618) Tika error "Media type alias already exists"
Hudson (JIRA)
2008/06/04
Re: nutch file content limit
ogjunk-nutch
2008/06/04
[jira] Resolved: (NUTCH-618) Tika error "Media type alias already exists"
Chris A. Mattmann (JIRA)
2008/06/03
nutch file content limit
m.harig
2008/06/03
[jira] Assigned: (NUTCH-621) Nutch needs to declare it's crypto usage
Chris A. Mattmann (JIRA)
2008/06/03
[jira] Commented: (NUTCH-621) Nutch needs to declare it's crypto usage
Chris A. Mattmann (JIRA)
2008/06/01
[jira] Commented: (NUTCH-618) Tika error "Media type alias already exists"
Chris A. Mattmann (JIRA)
2008/06/01
[jira] Updated: (NUTCH-618) Tika error "Media type alias already exists"
Chris A. Mattmann (JIRA)
2008/06/01
[jira] Work logged: (NUTCH-618) Tika error "Media type alias already exists"
Chris A. Mattmann (JIRA)
2008/06/01
[jira] Updated: (NUTCH-618) Tika error "Media type alias already exists"
Chris A. Mattmann (JIRA)
2008/05/31
[jira] Updated: (NUTCH-621) Nutch needs to declare it's crypto usage
Grant Ingersoll (JIRA)
2008/05/29
[jira] Commented: (NUTCH-618) Tika error "Media type alias already exists"
Chris A. Mattmann (JIRA)
2008/05/29
[Nutch Wiki] Update of "DownloadingNutch" by ChrisAnderson
Apache Wiki
2008/05/29
Re: Patch Nutch -> Hadoop .17
Andrzej Bialecki
2008/05/28
Running nutch tests with a special configuration
gabriele renzi
2008/05/27
Re: Crawler Data
kranthi reddy
2008/05/27
Patch Nutch -> Hadoop .17
Michael Gottesman
2008/05/25
RE: Nutch Crawling - Failed for internet crawling
Sivakumar Sivagnanam NCS
2008/05/24
Re: Nutch Crawling - Failed for internet crawling
All day coders
2008/05/21
[jira] Commented: (NUTCH-570) Improvement of URL Ordering in Generator.java
Ned Rockson (JIRA)
2008/05/21
Re: Adding Otis to JIRA
ogjunk-nutch
2008/05/21
[jira] Assigned: (NUTCH-629) Detect slow and timeout servers and drop their URLs
Otis Gospodnetic (JIRA)
2008/05/21
[jira] Assigned: (NUTCH-627) Minimize host address lookup
Otis Gospodnetic (JIRA)
2008/05/21
[jira] Updated: (NUTCH-570) Improvement of URL Ordering in Generator.java
Otis Gospodnetic (JIRA)
2008/05/21
Re: Adding Otis to JIRA
Andrzej Bialecki
2008/05/21
Adding Otis to JIRA
Otis Gospodnetic
2008/05/20
Nutch Crawling - Failed for internet crawling
Sivakumar_NCS
2008/05/19
[jira] Created: (NUTCH-632) Bug in TextParser with encoding
Antony Bowesman (JIRA)
2008/05/19
[nutch-dev] Nutch experts wanted
Jim R. Wilson
2008/05/18
[jira] Commented: (NUTCH-442) Integrate Solr/Nutch
JIRA
2008/05/18
[Nutch Wiki] Update of "Nutch 0.9 Crawl Script Tutorial" by AlessioTomasino
Apache Wiki
2008/05/15
Re: Bug in NutchAnalysis.java
ivrokv
2008/05/15
Re: Bug in NutchAnalysis.java
ogjunk-nutch
2008/05/15
Bug in NutchAnalysis.java
ivrokv
2008/05/14
Bug in Content+TextParser?
Bowesman Antony
2008/05/13
[jira] Updated: (NUTCH-442) Integrate Solr/Nutch
Caspar MacRae (JIRA)
2008/05/12
[jira] Updated: (NUTCH-442) Integrate Solr/Nutch
Caspar MacRae (JIRA)
2008/05/12
Re: Writing a plugin
Pau
2008/05/12
[Nutch Wiki] Update of "PublicServers" by Finbar Dineen
Apache Wiki
2008/05/11
Re: Writing a plugin
ogjunk-nutch
2008/05/11
Writing a plugin
Pau
2008/05/10
Re: Problem compiling plugins
Pau
2008/05/09
[jira] Created: (NUTCH-631) MoreIndexingFilter fails with NoSuchElementException
Stefan Will (JIRA)
2008/05/09
Re: Problem compiling plugins
ogjunk-nutch
2008/05/09
Problem compiling plugins
Pau
2008/05/08
Re: Welcome Otis Gospodnetic as Nutch committer
wuqi
2008/05/08
[jira] Commented: (NUTCH-594) Serve Nutch search results in XML and JSON
Dennis Kubes (JIRA)
2008/05/08
Re: Welcome Otis Gospodnetic as Nutch committer
Dennis Kubes
2008/05/08
Re: Welcome Otis Gospodnetic as Nutch committer
ogjunk-nutch
2008/05/08
Welcome Otis Gospodnetic as Nutch committer
Andrzej Bialecki
2008/05/08
[jira] Commented: (NUTCH-594) Serve Nutch search results in XML and JSON
wojtek kolodziejczyk (JIRA)
2008/05/07
[Nutch Wiki] Update of "FetchCycleOverlap" by OtisGospodnetic
Apache Wiki
2008/05/07
Re: Internet crawl: CrawlDb getting big!
ogjunk-nutch
2008/05/07
Re: Internet crawl: CrawlDb getting big!
Mathijs Homminga
2008/05/07
Re: Internet crawl: CrawlDb getting big!
wuqi
2008/05/07
Re: Internet crawl: CrawlDb getting big!
Mathijs Homminga
2008/05/06
Re: Internet crawl: CrawlDb getting big!
wuqi
2008/05/06
[Nutch Wiki] Update of "Support" by OtisGospodnetic
Apache Wiki
2008/05/06
Internet crawl: CrawlDb getting big!
Mathijs Homminga
2008/05/06
[Nutch Wiki] Update of "FortuneCookies" by OtisGospodnetic
Apache Wiki
2008/05/03
[jira] Commented: (NUTCH-621) Nutch needs to declare it's crypto usage
Grant Ingersoll (JIRA)
2008/05/01
[Nutch Wiki] Update of "Support" by OtisGospodnetic
Apache Wiki
2008/04/30
[jira] Created: (NUTCH-630) Error caused by index-more plugin in the latest svn revision - 652259
taknev ivrok (JIRA)
2008/04/28
Re: Nutch 2 Architecture
Dennis Kubes
2008/04/28
Nutch 2 Architecture
info
2008/04/27
[jira] Commented: (NUTCH-442) Integrate Solr/Nutch
JIRA
2008/04/24
[Nutch Wiki] Update of "Nutch2Architecture" by DennisKubes
Apache Wiki
2008/04/24
[Nutch Wiki] Update of "Nutch2Architecture" by DennisKubes
Apache Wiki
2008/04/23
[Nutch Wiki] Update of "FetchCycleOverlap" by OtisGospodnetic
Apache Wiki
2008/04/22
[Nutch Wiki] Update of "FetchCycleOverlap" by OtisGospodnetic
Apache Wiki
2008/04/22
[Nutch Wiki] Update of "GettingNutchRunningWithDebian" by StevenHayles
Apache Wiki
2008/04/22
Re: [jira] Commented: (NUTCH-628) Host database to keep track of host-level information
Andrzej Bialecki
2008/04/21
Re: [jira] Commented: (NUTCH-628) Host database to keep track of host-level information
ogjunk-nutch
2008/04/21
Re: Fetching inefficiency
Ken Krugler
2008/04/21
Re: Fetching inefficiency
ogjunk-nutch
2008/04/20
Re: [jira] Commented: (NUTCH-628) Host database to keep track of host-level information
Andrzej Bialecki
2008/04/19
Re: [jira] Commented: (NUTCH-628) Host database to keep track of host-level information
ogjunk-nutch
2008/04/19
Re: [jira] Commented: (NUTCH-628) Host database to keep track of host-level information
Andrzej Bialecki
2008/04/19
[jira] Commented: (NUTCH-628) Host database to keep track of host-level information
Andrzej Bialecki (JIRA)
2008/04/19
Fw: [jira] Closed: (INFRA-1583) Wiki => email not working for Nutch wiki
ogjunk-nutch
2008/04/18
[jira] Commented: (NUTCH-596) ParseSegments parse content even if its not CrawlDatum.STATUS_FETCH_SUCCESS
Hudson (JIRA)
2008/04/18
Re: [jira] Commented: (NUTCH-628) Host database to keep track of host-level information
ogjunk-nutch
2008/04/18
[jira] Updated: (NUTCH-596) ParseSegments parse content even if its not CrawlDatum.STATUS_FETCH_SUCCESS
JIRA
2008/04/18
[jira] Closed: (NUTCH-596) ParseSegments parse content even if its not CrawlDatum.STATUS_FETCH_SUCCESS
JIRA
2008/04/18
[jira] Resolved: (NUTCH-596) ParseSegments parse content even if its not CrawlDatum.STATUS_FETCH_SUCCESS
JIRA
2008/04/18
[jira] Commented: (NUTCH-628) Host database to keep track of host-level information
JIRA
2008/04/18
[jira] Commented: (NUTCH-596) ParseSegments parse content even if its not CrawlDatum.STATUS_FETCH_SUCCESS
Andrzej Bialecki (JIRA)
2008/04/18
[jira] Commented: (NUTCH-596) ParseSegments parse content even if its not CrawlDatum.STATUS_FETCH_SUCCESS
Otis Gospodnetic (JIRA)
2008/04/17
[jira] Issue Comment Edited: (NUTCH-628) Host database to keep track of host-level information
Otis Gospodnetic (JIRA)
2008/04/17
[jira] Updated: (NUTCH-596) ParseSegments parse content even if its not CrawlDatum.STATUS_FETCH_SUCCESS
JIRA
2008/04/17
[jira] Assigned: (NUTCH-596) ParseSegments parse content even if its not CrawlDatum.STATUS_FETCH_SUCCESS
JIRA
2008/04/17
[jira] Updated: (NUTCH-596) ParseSegments parse content even if its not CrawlDatum.STATUS_FETCH_SUCCESS
JIRA
2008/04/17
[jira] Updated: (NUTCH-442) Integrate Solr/Nutch
JIRA
2008/04/17
[jira] Commented: (NUTCH-628) Host database to keep track of host-level information
Andrzej Bialecki (JIRA)
2008/04/16
[jira] Issue Comment Edited: (NUTCH-628) Host database to keep track of host-level information
Otis Gospodnetic (JIRA)
2008/04/16
[jira] Issue Comment Edited: (NUTCH-628) Host database to keep track of host-level information
Otis Gospodnetic (JIRA)
2008/04/16
[jira] Updated: (NUTCH-628) Host database to keep track of host-level information
Otis Gospodnetic (JIRA)
2008/04/15
[jira] Commented: (NUTCH-442) Integrate Solr/Nutch
JIRA
2008/04/15
[jira] Updated: (NUTCH-628) Host database to keep track of host-level information
Otis Gospodnetic (JIRA)
2008/04/14
[jira] Commented: (NUTCH-442) Integrate Solr/Nutch
Otis Gospodnetic (JIRA)
2008/04/14
Re: Wiki -> email -> nutch-dev?
ogjunk-nutch
2008/04/14
[jira] Commented: (NUTCH-629) Detect slow and timeout servers and drop their URLs
Otis Gospodnetic (JIRA)
2008/04/13
Re: Wiki -> email -> nutch-dev?
Dennis Kubes
2008/04/13
Re: Wiki -> email -> nutch-dev?
ogjunk-nutch
2008/04/13
Re: Wiki -> email -> nutch-dev?
Dennis Kubes
2008/04/12
Wiki -> email -> nutch-dev?
ogjunk-nutch
2008/04/12
[jira] Updated: (NUTCH-629) Detect slow and timeout servers and drop their URLs
Otis Gospodnetic (JIRA)
2008/04/12
[jira] Created: (NUTCH-629) Detect slow and timeout servers and drop their URLs
Otis Gospodnetic (JIRA)
2008/04/11
[jira] Created: (NUTCH-628) Host database to keep track of host-level information
Otis Gospodnetic (JIRA)
2008/04/11
Re: Keywords in documents
ogjunk-nutch
2008/04/11
Keywords in documents
Amit Kumar Verma
2008/04/11
Re: Fetcher2 Reduce Phase Question
Andrzej Bialecki
2008/04/11
Fetcher2 Reduce Phase Question
Sandeep Tata
2008/04/10
[jira] Commented: (NUTCH-570) Improvement of URL Ordering in Generator.java
Otis Gospodnetic (JIRA)
2008/04/10
Re: [jira] Updated: (NUTCH-627) Minimize host address lookup
ogjunk-nutch
2008/04/10
Re: [jira] Updated: (NUTCH-627) Minimize host address lookup
Chris Mattmann
2008/04/10
[jira] Closed: (NUTCH-500) Add hadoop masters configuration file into conf folder
Dennis Kubes (JIRA)
2008/04/10
Re: [jira] Updated: (NUTCH-627) Minimize host address lookup
Dennis Kubes
2008/04/10
Re: [jira] Updated: (NUTCH-627) Minimize host address lookup
Andrzej Bialecki
2008/04/09
[jira] Updated: (NUTCH-627) Minimize host address lookup
Otis Gospodnetic (JIRA)