nutch-dev
Messages by Date
2009/12/14
[Nutch Wiki] Update of "TikaPlugin" by JulienNioche
Apache Wiki
2009/12/14
[Nutch Wiki] Update of "FrontPage" by JulienNioche
Apache Wiki
2009/12/13
Build failed in Hudson: Nutch-trunk #1011
Apache Hudson Server
2009/12/12
Build failed in Hudson: Nutch-trunk #1010
Apache Hudson Server
2009/12/11
Build failed in Hudson: Nutch-trunk #1009
Apache Hudson Server
2009/12/11
[jira] Commented: (NUTCH-422) index-extra plugin creates additional fields in the index, based on configurable logic
Morille Jerome (JIRA)
2009/12/11
[jira] Updated: (NUTCH-655) Injecting Crawl metadata
Julien Nioche (JIRA)
2009/12/10
Build failed in Hudson: Nutch-trunk #1008
Apache Hudson Server
2009/12/10
Filtering ParseSegment
MilleBii
2009/12/09
Build failed in Hudson: Nutch-trunk #1007
Apache Hudson Server
2009/12/09
RE: java.net.URL synchronization
Fuad Efendi
2009/12/09
RE: java.net.URL synchronization
Fuad Efendi
2009/12/09
java.net.URL synchronization
Otis Gospodnetic
2009/12/07
Re: State of nutchbase
Doğacan Güney
2009/12/06
Re: State of nutchbase
Andrzej Bialecki
2009/12/05
[jira] Issue Comment Edited: (NUTCH-770) Timebomb for Fetcher
MilleBii (JIRA)
2009/12/05
[jira] Commented: (NUTCH-770) Timebomb for Fetcher
MilleBii (JIRA)
2009/12/05
State of nutchbase
Alban Mouton
2009/12/05
[jira] Updated: (NUTCH-767) Update Tika to v0.5 for the MimeType detection
Julien Nioche (JIRA)
2009/12/05
[jira] Reopened: (NUTCH-767) Update Tika to v0.5 for the MimeType detection
Julien Nioche (JIRA)
2009/12/04
Hudson build is back to normal: Nutch-trunk #1002
Apache Hudson Server
2009/12/04
[jira] Commented: (NUTCH-767) Update Tika to v0.5 for the MimeType detection
Hudson (JIRA)
2009/12/04
[jira] Closed: (NUTCH-767) Update Tika to v0.5 for the MimeType detection
Andrzej Bialecki (JIRA)
2009/12/04
[jira] Updated: (NUTCH-767) Update Tika to v0.5 for the MimeType detection
Andrzej Bialecki (JIRA)
2009/12/03
Build failed in Hudson: Nutch-trunk #1001
Apache Hudson Server
2009/12/03
[jira] Updated: (NUTCH-767) Update Tika to v0.5 for the MimeType detection
Julien Nioche (JIRA)
2009/12/02
Build failed in Hudson: Nutch-trunk #1000
Apache Hudson Server
2009/12/02
[jira] Updated: (NUTCH-774) Retry interval in crawl date is set to 0
Reinhard Schwab (JIRA)
2009/12/02
[jira] Updated: (NUTCH-774) Retry interval in crawl date is set to 0
Reinhard Schwab (JIRA)
2009/12/02
[jira] Commented: (NUTCH-767) Update Tika to v0.5 for the MimeType detection
Andrzej Bialecki (JIRA)
2009/12/02
[jira] Reopened: (NUTCH-767) Update Tika to v0.5 for the MimeType detection
Andrzej Bialecki (JIRA)
2009/12/02
[jira] Created: (NUTCH-774) Retry interval in crawl date is set to 0
Reinhard Schwab (JIRA)
2009/12/01
[jira] Commented: (NUTCH-666) Analysis plugins for multiple language and new Language Identifier Tool
Raja Santosh Panda (JIRA)
2009/12/01
Build failed in Hudson: Nutch-trunk #999
Apache Hudson Server
2009/12/01
[jira] Commented: (NUTCH-767) Update Tika to v0.5 for the MimeType detection
Andrzej Bialecki (JIRA)
2009/12/01
[jira] Closed: (NUTCH-767) Update Tika to v0.5 for the MimeType detection
Andrzej Bialecki (JIRA)
2009/12/01
[jira] Updated: (NUTCH-767) Update Tika to v0.5 for the MimeType detection
Julien Nioche (JIRA)
2009/12/01
[jira] Updated: (NUTCH-767) Update Tika to v5.0 for the MimeType detection
Julien Nioche (JIRA)
2009/12/01
[jira] Closed: (NUTCH-769) Fetcher to skip queues for URLS getting repeated exceptions
Andrzej Bialecki (JIRA)
2009/12/01
[jira] Commented: (NUTCH-769) Fetcher to skip queues for URLS getting repeated exceptions
Andrzej Bialecki (JIRA)
2009/12/01
[jira] Closed: (NUTCH-768) Upgrade Nutch 1.0 to use Hadoop 0.20
Dennis Kubes (JIRA)
2009/12/01
[jira] Closed: (NUTCH-770) Timebomb for Fetcher
Andrzej Bialecki (JIRA)
2009/12/01
[jira] Commented: (NUTCH-770) Timebomb for Fetcher
Andrzej Bialecki (JIRA)
2009/12/01
[jira] Commented: (NUTCH-768) Upgrade Nutch 1.0 to use Hadoop 0.20
Andrzej Bialecki (JIRA)
2009/11/30
[jira] Commented: (NUTCH-768) Upgrade Nutch 1.0 to use Hadoop 0.20
Dennis Kubes (JIRA)
2009/11/30
Build failed in Hudson: Nutch-trunk #998
Apache Hudson Server
2009/11/30
Re: wrong wiki front page
Alban Mouton
2009/11/30
Re: wrong wiki front page
Andrzej Bialecki
2009/11/30
Re: wrong wiki front page
Alban Mouton
2009/11/30
[jira] Updated: (NUTCH-770) Timebomb for Fetcher
Julien Nioche (JIRA)
2009/11/30
[jira] Updated: (NUTCH-770) Timebomb for Fetcher
Julien Nioche (JIRA)
2009/11/30
[jira] Commented: (NUTCH-770) Timebomb for Fetcher
Andrzej Bialecki (JIRA)
2009/11/30
[jira] Commented: (NUTCH-770) Timebomb for Fetcher
Julien Nioche (JIRA)
2009/11/30
[jira] Commented: (NUTCH-692) AlreadyBeingCreatedException with Hadoop 0.19
Julien Nioche (JIRA)
2009/11/29
[jira] Issue Comment Edited: (NUTCH-770) Timebomb for Fetcher
MilleBii (JIRA)
2009/11/28
[jira] Commented: (NUTCH-739) SolrDeleteDuplications too slow when using hadoop
Hudson (JIRA)
2009/11/28
[jira] Commented: (NUTCH-746) NutchBeanConstructor does not close NutchBean upon contextDestroyed, causing resource leak in the container.
Hudson (JIRA)
2009/11/28
[jira] Commented: (NUTCH-712) ParseOutputFormat should catch java.net.MalformedURLException coming from normalizers
Hudson (JIRA)
2009/11/28
[jira] Commented: (NUTCH-741) Job file includes multiple copies of nutch config files.
Hudson (JIRA)
2009/11/28
[jira] Commented: (NUTCH-738) Close SegmentUpdater when FetchedSegments is closed
Hudson (JIRA)
2009/11/28
[Nutch Wiki] Trivial Update of "Automating_Fetches_with_Python" by newacct
Apache Wiki
2009/11/28
[jira] Commented: (NUTCH-712) ParseOutputFormat should catch java.net.MalformedURLException coming from normalizers
Andrzej Bialecki (JIRA)
2009/11/28
[jira] Closed: (NUTCH-712) ParseOutputFormat should catch java.net.MalformedURLException coming from normalizers
Andrzej Bialecki (JIRA)
2009/11/28
[jira] Closed: (NUTCH-741) Job file includes multiple copies of nutch config files.
Andrzej Bialecki (JIRA)
2009/11/28
[jira] Commented: (NUTCH-741) Job file includes multiple copies of nutch config files.
Andrzej Bialecki (JIRA)
2009/11/28
[jira] Commented: (NUTCH-692) AlreadyBeingCreatedException with Hadoop 0.19
Andrzej Bialecki (JIRA)
2009/11/28
[jira] Commented: (NUTCH-755) DomainURLFilter crashes on malformed URL
Andrzej Bialecki (JIRA)
2009/11/28
[jira] Closed: (NUTCH-755) DomainURLFilter crashes on malformed URL
Andrzej Bialecki (JIRA)
2009/11/28
[jira] Commented: (NUTCH-739) SolrDeleteDuplications too slow when using hadoop
Andrzej Bialecki (JIRA)
2009/11/28
[jira] Closed: (NUTCH-739) SolrDeleteDuplications too slow when using hadoop
Andrzej Bialecki (JIRA)
2009/11/28
[jira] Closed: (NUTCH-738) Close SegmentUpdater when FetchedSegments is closed
Andrzej Bialecki (JIRA)
2009/11/28
[jira] Commented: (NUTCH-738) Close SegmentUpdater when FetchedSegments is closed
Andrzej Bialecki (JIRA)
2009/11/28
[jira] Commented: (NUTCH-746) NutchBeanConstructor does not close NutchBean upon contextDestroyed, causing resource leak in the container.
Andrzej Bialecki (JIRA)
2009/11/28
[jira] Closed: (NUTCH-746) NutchBeanConstructor does not close NutchBean upon contextDestroyed, causing resource leak in the container.
Andrzej Bialecki (JIRA)
2009/11/28
[jira] Commented: (NUTCH-770) Timebomb for Fetcher
Andrzej Bialecki (JIRA)
2009/11/28
[jira] Commented: (NUTCH-770) Timebomb for Fetcher
MilleBii (JIRA)
2009/11/28
[jira] Commented: (NUTCH-770) Timebomb for Fetcher
Julien Nioche (JIRA)
2009/11/28
[jira] Commented: (NUTCH-769) Fetcher to skip queues for URLS getting repeated exceptions
Julien Nioche (JIRA)
2009/11/28
[jira] Updated: (NUTCH-769) Fetcher to skip queues for URLS getting repeated exceptions
Julien Nioche (JIRA)
2009/11/28
[jira] Updated: (NUTCH-770) Timebomb for Fetcher
MilleBii (JIRA)
2009/11/28
[jira] Commented: (NUTCH-769) Fetcher to skip queues for URLS getting repeated exceptions
Andrzej Bialecki (JIRA)
2009/11/28
[jira] Commented: (NUTCH-761) Avoid cloningCrawlDatum in CrawlDbReducer
Hudson (JIRA)
2009/11/28
[jira] Commented: (NUTCH-760) Allow field mapping from nutch to solr index
Hudson (JIRA)
2009/11/28
[jira] Commented: (NUTCH-765) Allow Crawl class to call Either Solr or Lucene Indexer
Hudson (JIRA)
2009/11/28
[jira] Commented: (NUTCH-772) Upgrade Nutch to use Lucene 2.9.1
Hudson (JIRA)
2009/11/28
[jira] Commented: (NUTCH-773) some minor bugs in AbstractFetchSchedule.java
Hudson (JIRA)
2009/11/28
[jira] Commented: (NUTCH-753) Prevent new Fetcher to retrieve the robots twice
Hudson (JIRA)
2009/11/27
[Nutch Wiki] Update of "FrontPage" by Davinder
Apache Wiki
2009/11/27
[Nutch Wiki] Update of "FrontPage" by Davinder
Apache Wiki
2009/11/27
[Nutch Wiki] Update of "FrontPage" by Davinder
Apache Wiki
2009/11/27
[Nutch Wiki] Update of "FrontPage" by Davinder
Apache Wiki
2009/11/26
Re: wrong wiki front page
Alban Mouton
2009/11/25
[jira] Resolved: (NUTCH-185) XMLParser is configurable xml parser plugin.
Chris A. Mattmann (JIRA)
2009/11/25
[jira] Updated: (NUTCH-768) Upgrade Nutch 1.0 to use Hadoop 0.20
Dennis Kubes (JIRA)
2009/11/25
[jira] Closed: (NUTCH-772) Upgrade Nutch to use Lucene 2.9.1
Andrzej Bialecki (JIRA)
2009/11/25
[jira] Commented: (NUTCH-772) Upgrade Nutch to use Lucene 2.9.1
Andrzej Bialecki (JIRA)
2009/11/25
[jira] Commented: (NUTCH-760) Allow field mapping from nutch to solr index
Andrzej Bialecki (JIRA)
2009/11/25
[jira] Closed: (NUTCH-760) Allow field mapping from nutch to solr index
Andrzej Bialecki (JIRA)
2009/11/25
[jira] Commented: (NUTCH-761) Avoid cloningCrawlDatum in CrawlDbReducer
Andrzej Bialecki (JIRA)
2009/11/25
[jira] Closed: (NUTCH-761) Avoid cloningCrawlDatum in CrawlDbReducer
Andrzej Bialecki (JIRA)
2009/11/25
[jira] Commented: (NUTCH-762) Alternative Generator which can generate several segments in one parse of the crawlDB
Andrzej Bialecki (JIRA)
2009/11/25
[jira] Closed: (NUTCH-753) Prevent new Fetcher to retrieve the robots twice
Andrzej Bialecki (JIRA)
2009/11/25
[jira] Commented: (NUTCH-753) Prevent new Fetcher to retrieve the robots twice
Andrzej Bialecki (JIRA)
2009/11/25
[jira] Commented: (NUTCH-773) some minor bugs in AbstractFetchSchedule.java
Andrzej Bialecki (JIRA)
2009/11/25
[jira] Closed: (NUTCH-773) some minor bugs in AbstractFetchSchedule.java
Andrzej Bialecki (JIRA)
2009/11/25
Re: svn commit: r884075 - /lucene/nutch/trunk/src/java/org/apache/nutch/indexer/solr/SolrIndexer.java
david.stu...@progressivealliance.co.uk
2009/11/25
[jira] Updated: (NUTCH-760) Allow field mapping from nutch to solr index
Andrzej Bialecki (JIRA)
2009/11/25
Re: svn commit: r884075 - /lucene/nutch/trunk/src/java/org/apache/nutch/indexer/solr/SolrIndexer.java
Andrzej Bialecki
2009/11/25
Re: svn commit: r884075 - /lucene/nutch/trunk/src/java/org/apache/nutch/indexer/solr/SolrIndexer.java
david.stu...@progressivealliance.co.uk
2009/11/25
[jira] Updated: (NUTCH-773) some minor bugs in AbstractFetchSchedule.java
Reinhard Schwab (JIRA)
2009/11/25
[jira] Updated: (NUTCH-773) some minor bugs in AbstractFetchSchedule.java
Reinhard Schwab (JIRA)
2009/11/25
[jira] Created: (NUTCH-773) some minor bugs in AbstractFetchSchedule.java
Reinhard Schwab (JIRA)
2009/11/25
Re: svn commit: r884075 - /lucene/nutch/trunk/src/java/org/apache/nutch/indexer/solr/SolrIndexer.java
Dennis Kubes
2009/11/25
[jira] Updated: (NUTCH-772) Upgrade Nutch to use Lucene 2.9.1
Andrzej Bialecki (JIRA)
2009/11/25
[jira] Created: (NUTCH-772) Upgrade Nutch to use Lucene 2.9.1
Andrzej Bialecki (JIRA)
2009/11/24
Re: Plugin Developement Help
david.stu...@progressivealliance.co.uk
2009/11/24
[jira] Commented: (NUTCH-768) Upgrade Nutch 1.0 to use Hadoop 0.20
Andrzej Bialecki (JIRA)
2009/11/24
[jira] Commented: (NUTCH-771) Add WebGraph classes to the bin/nutch script
Andrzej Bialecki (JIRA)
2009/11/24
[jira] Commented: (NUTCH-768) Upgrade Nutch 1.0 to use Hadoop 0.20
Dennis Kubes (JIRA)
2009/11/24
[jira] Created: (NUTCH-771) Add WebGraph classes to the bin/nutch script
Dennis Kubes (JIRA)
2009/11/24
wrong wiki front page
Alban Mouton
2009/11/24
[Nutch Wiki] Update of "FrontPage" by DennisKubes
Apache Wiki
2009/11/24
[Nutch Wiki] Update of "OptimizingCrawls" by DennisKubes
Apache Wiki
2009/11/24
Re: Plugin Developement Help
david.stu...@progressivealliance.co.uk
2009/11/24
Re: Plugin Developement Help
David Stuart
2009/11/24
Re: Plugin Developement Help
david.stu...@progressivealliance.co.uk
2009/11/24
Re: Plugin Developement Help
Andrzej Bialecki
2009/11/24
Plugin Developement Help
david.stu...@progressivealliance.co.uk
2009/11/23
[jira] Updated: (NUTCH-770) Timebomb for Fetcher
Julien Nioche (JIRA)
2009/11/23
[jira] Created: (NUTCH-770) Timebomb for Fetcher
Julien Nioche (JIRA)
2009/11/23
[jira] Updated: (NUTCH-769) Fetcher to skip queues for URLS getting repeated exceptions
Julien Nioche (JIRA)
2009/11/23
[jira] Created: (NUTCH-769) Fetcher to skip queues for URLS getting repeated exceptions
Julien Nioche (JIRA)
2009/11/22
Now Hbase 20
work only
2009/11/21
[jira] Created: (NUTCH-768) Upgrade Nutch 1.0 to use Hadoop 0.20
Dennis Kubes (JIRA)
2009/11/21
[jira] Closed: (NUTCH-765) Allow Crawl class to call Either Solr or Lucene Indexer
Dennis Kubes (JIRA)
2009/11/21
[jira] Resolved: (NUTCH-765) Allow Crawl class to call Either Solr or Lucene Indexer
Dennis Kubes (JIRA)
2009/11/21
[jira] Assigned: (NUTCH-765) Allow Crawl class to call Either Solr or Lucene Indexer
Dennis Kubes (JIRA)
2009/11/18
[Nutch Wiki] Trivial Update of "NutchHadoopTutorial" by ilgiz
Apache Wiki
2009/11/18
[Nutch Wiki] Update of "NutchHadoopTutorial" by ilgiz
Apache Wiki
2009/11/18
[jira] Commented: (NUTCH-767) Update version of Tika for the MimeType detection
Chris A. Mattmann (JIRA)
2009/11/18
[jira] Assigned: (NUTCH-767) Update version of Tika for the MimeType detection
Chris A. Mattmann (JIRA)
2009/11/18
[jira] Updated: (NUTCH-767) Update version of Tika for the MimeType detection
Julien Nioche (JIRA)
2009/11/18
[jira] Created: (NUTCH-767) Update version of Tika for the MimeType detection
Julien Nioche (JIRA)
2009/11/18
[jira] Updated: (NUTCH-766) Tika parser
Julien Nioche (JIRA)
2009/11/18
[jira] Updated: (NUTCH-766) Tika parser
Julien Nioche (JIRA)
2009/11/18
[jira] Created: (NUTCH-766) Tika parser
Julien Nioche (JIRA)
2009/11/17
Filtering Pages while crawling
sumittyagi
2009/11/17
Re: Update on Integration with Tika
Andrzej Bialecki
2009/11/17
Re: Update on Integration with Tika
Julien Nioche
2009/11/17
Re: Update on Integration with Tika
Andrzej Bialecki
2009/11/17
Re: Update on Integration with Tika
Julien Nioche
2009/11/17
Re: Update on Integration with Tika
Andrzej Bialecki
2009/11/17
Re: Update on Integration with Tika
Jukka Zitting
2009/11/17
Re: Update on Integration with Tika
Julien Nioche
2009/11/16
Re: Update on Integration with Tika
Ken Krugler
2009/11/16
Re: Update on Integration with Tika
Andrzej Bialecki
2009/11/16
Update on Integration with Tika
Julien Nioche
2009/11/14
Re: Plugin Help
Dennis Kubes
2009/11/14
Re: Plugin Help
david.stu...@progressivealliance.co.uk
2009/11/14
Plugin Help
david.stu...@progressivealliance.co.uk
2009/11/14
[Nutch Wiki] Update of "RunNutchInEclipse1.0" by Anas Elghafari
Apache Wiki
2009/11/13
Treating files of Office 2007
BrunoWL
2009/11/12
[jira] Created: (NUTCH-765) Allow Crawl class to call Either Solr or Lucene Indexer
Dennis Kubes (JIRA)
2009/11/12
[jira] Updated: (NUTCH-765) Allow Crawl class to call Either Solr or Lucene Indexer
Dennis Kubes (JIRA)
2009/11/12
Re: Integration with Tika
Kirby Bohling
2009/11/12
Re: Integration with Tika
Julien Nioche
2009/11/11
[jira] Commented: (NUTCH-573) Multiple Domains - Query Search
Srikarthik Venkataraman (JIRA)
2009/11/10
[jira] Commented: (NUTCH-764) Add support for vfsfile:// loading of plugins for JBoss
Andrzej Bialecki (JIRA)
2009/11/10
Re: Integration with Tika
Andrzej Bialecki
2009/11/10
Re: Patch to trunk process
Andrzej Bialecki
2009/11/10
Integration with Tika
BrunoWL
2009/11/10
[jira] Commented: (NUTCH-764) Add support for vfsfile:// loading of plugins for JBoss
tcur...@approachingpi.com (JIRA)
2009/11/10
Re: Patch to trunk process
david.stu...@progressivealliance.co.uk
2009/11/10
Re: Patch to trunk process
Andrzej Bialecki
2009/11/10
Patch to trunk process
David Stuart
2009/11/10
[jira] Commented: (NUTCH-764) Add support for vfsfile:// loading of plugins for JBoss
Andrzej Bialecki (JIRA)
2009/11/09
[jira] Updated: (NUTCH-764) Add support for vfsfile:// loading of plugins for JBoss
tcur...@approachingpi.com (JIRA)
2009/11/09
[jira] Created: (NUTCH-764) Add support for vfsfile:// loading of plugins for JBoss
tcur...@approachingpi.com (JIRA)
2009/11/09
[Nutch Wiki] Update of "FrontPage" by TerrenceCurran
Apache Wiki
2009/11/09
[Nutch Wiki] Update of "GettingNutchRunningWithJboss" by TerrenceCurran
Apache Wiki
2009/11/07
Hudson build is back to normal: Nutch-trunk #986
Apache Hudson Server
2009/11/04
Re: Free live video streaming of ApacheCon US 2009
Israel Ekpo
2009/11/04
[Nutch Wiki] Update of "ApacheConUs2009MeetUp" by AndrzejBialecki
Apache Wiki
2009/11/04
[Nutch Wiki] Update of "ApacheConUs2009MeetUp" by KenKrugler
Apache Wiki
2009/11/04
[Nutch Wiki] Update of "ApacheConUs2009MeetUp" by KenKrugler
Apache Wiki
2009/11/04
[Nutch Wiki] Update of "ApacheConUs2009MeetUp" by KenKrugler
Apache Wiki
2009/11/04
Free live video streaming of ApacheCon US 2009
Michael McCandless
2009/11/03
[jira] Updated: (NUTCH-762) Alternative Generator which can generate several segments in one parse of the crawlDB
Julien Nioche (JIRA)
2009/11/03
[jira] Created: (NUTCH-762) Alternative Generator which can generate several segments in one parse of the crawlDB
Julien Nioche (JIRA)
2009/11/03
[jira] Updated: (NUTCH-761) Avoid cloningCrawlDatum in CrawlDbReducer
Julien Nioche (JIRA)
2009/11/03
[jira] Created: (NUTCH-761) Avoid cloningCrawlDatum in CrawlDbReducer
Julien Nioche (JIRA)
2009/10/29
[jira] Commented: (NUTCH-585) [PARSE-HTML plugin] Block certain parts of HTML code from being indexed
David Stuart (JIRA)
2009/10/27
[Nutch Wiki] Update of "DownloadingNutch" by SteveKearns
Apache Wiki
2009/10/27
[Nutch Wiki] Update of "ApacheConUs2009MeetUp" by KenKrugler
Apache Wiki
2009/10/27
[jira] Commented: (NUTCH-760) Allow field mapping from nutch to solr index
David Stuart (JIRA)
2009/10/27
[jira] Updated: (NUTCH-760) Allow field mapping from nutch to solr index
David Stuart (JIRA)
2009/10/26
How to index files only with specific type
Dmitriy Fundak
2009/10/26
[jira] Commented: (NUTCH-755) DomainURLFilter crashes on malformed URL
Reinhard Schwab (JIRA)
2009/10/24
[Nutch Wiki] Trivial Update of "首页" by yongping8204
Apache Wiki
2009/10/21
Re: datanode.BlockAlreadyExistsException
Jesse Hires