Messages by Thread
-
[Nutch Wiki] Update of "RunningNutchAndSolr" by Dmitrius
Apache Wiki
-
[jira] Updated: (NUTCH-162) country code "jp" is used instead of language code "ja" for Japanese
Hiroaki Kawai (JIRA)
-
[VOTE] Apache Nutch 1.1 Release Candidate #3
Mattmann, Chris A (388J)
-
[jira] Created: (NUTCH-817) parse-(html)does follow links of full html page, parse-(tika) does follow any links and stops at level 1
matthew a. grisius (JIRA)
-
[jira] Created: (NUTCH-816) Add zip target to build.xml
Chris A. Mattmann (JIRA)
-
[jira] Work stopped: (NUTCH-466) Flexible segment format
Andrzej Bialecki (JIRA)
-
[jira] Created: (NUTCH-814) SegmentMerger bug
Andrzej Bialecki (JIRA)
-
Re: Running ANT; was -- Re: [VOTE] Apache Nutch 1.1 Release Candidate #2
Mattmann, Chris A (388J)
-
[VOTE] Apache Nutch 1.1 Release Candidate #2
Mattmann, Chris A (388J)
-
TLP Status
Grant Ingersoll
-
[jira] Created: (NUTCH-813) Repetitive crawl 403 status page
Nguyen Manh Tien (JIRA)
-
Developing Nutch for semantic search
Adarsh malu
-
[jira] Created: (NUTCH-812) Crawl.java incorrectly uses the Generator API resulting in NPE
Andrzej Bialecki (JIRA)
-
[jira] Created: (NUTCH-811) Develop an ORM framework
Enis Soztutar (JIRA)
-
[jira] Resolved: (NUTCH-570) Improvement of URL Ordering in Generator.java
Otis Gospodnetic (JIRA)
-
[VOTE 2] Board resolution for Nutch as TLP
Andrzej Bialecki
-
[VOTE] Board resolution for Nutch as TLP
Andrzej Bialecki
-
Adding jpeg parser to nutch
Gombkötő Dávid
-
[DISCUSS] Board resolution for Nutch as TLP
Andrzej Bialecki
-
[Nutch Wiki] Update of "Nutch2Roadmap" by JulienNioche
Apache Wiki
-
[VOTE] Apache Nutch 1.1 Release Candidate #1
Mattmann, Chris A (388J)
-
Nutch 2.0 roadmap
Julien Nioche
-
release of 1.1?
Julien Nioche
-
[jira] Created: (NUTCH-810) Upgrade to Tika 0.7
Julien Nioche (JIRA)
-
Question: Nutch 0.8.2 and Nutch 0.7.3?
Mattmann, Chris A (388J)
-
[jira] Created: (NUTCH-809) Parse-metatags plugin
Julien Nioche (JIRA)
-
Re: [VOTE] Apache Tika 0.7 Release Candidate #1
Mattmann, Chris A (388J)
-
[jira] Created: (NUTCH-808) Evaluate ORM Frameworks which support non-relational column-oriented datastores and RDBMs
Enis Soztutar (JIRA)
-
[jira] Created: (NUTCH-807) JSParseFilter produces weired URL
Minyao Zhu (JIRA)
-
[jira] Updated: (NUTCH-475) Adaptive crawl delay
Chris A. Mattmann (JIRA)
-
[jira] Updated: (NUTCH-541) Index url field untokenized
Chris A. Mattmann (JIRA)
-
[jira] Updated: (NUTCH-460) RDF parser plugin
Chris A. Mattmann (JIRA)
-
[jira] Updated: (NUTCH-540) some problem about the Nutch cache
Chris A. Mattmann (JIRA)
-
[jira] Updated: (NUTCH-564) External parser supports encoding attribute
Chris A. Mattmann (JIRA)
-
[jira] Updated: (NUTCH-577) Use explicit tika-config.xml file to enable mime magic detection to be turned on and off
Chris A. Mattmann (JIRA)
-
[jira] Updated: (NUTCH-763) Separate configuration files from resources to be included in the job file
Chris A. Mattmann (JIRA)
-
[jira] Created: (NUTCH-806) Merge CrawlDBScanner with CrawlDBReader
Julien Nioche (JIRA)
-
[jira] Issue Comment Edited: (NUTCH-224) Nutch doesn't handle Korean text at all
Attila Pados (JIRA)
-
[jira] Commented: (NUTCH-224) Nutch doesn't handle Korean text at all
Attila Pados (JIRA)
-
[jira] Created: (NUTCH-805) Unable to resolve the url-blah-blah, skipping
P Kaustubh (JIRA)
-
[jira] Created: (NUTCH-804) CrawlDatum.statNames can be modified
Mike Baranczak (JIRA)
-
Will Nutch move to HBase 0.20
work only
-
[Nutch Wiki] Update of "Support" by Christopher Bader
Apache Wiki
-
[Nutch Wiki] Update of "FAQ" by Ankit Dangi
Apache Wiki
-
[DISCUSS] Nutch as a top level project (TLP)?
Andrzej Bialecki
-
[jira] Created: (NUTCH-803) Upgrade Hadoop to 0.20.2
Andrzej Bialecki (JIRA)
-
Crawling authenticated websites !
Ranganath Cuddapah
-
[jira] Created: (NUTCH-802) Problems managing outlinks with large url length
JIRA
-
[jira] Updated: (NUTCH-422) index-extra plugin creates additional fields in the index, based on configurable logic
garpinc (JIRA)
-
[Nutch Wiki] Trivial Update of "Crawl" by susam
Apache Wiki
-
[Nutch Wiki] Trivial Update of "HttpAuthenticationSchemes" by susam
Apache Wiki
-
[Nutch Wiki] Update of "HttpAuthenticationSchemes" by susam
Apache Wiki
-
Creating new linked entries in crawlDB
nikinch
-
Increasing the score of especific pages
Santiago Pérez
-
[jira] Created: (NUTCH-801) Remove RTF and MP3 parse plugins
Julien Nioche (JIRA)
-
1.1 release?
Mattmann, Chris A (388J)
-
adding an Index attribute
Sahil Shah
-
hi
hussam hamdan
-
[jira] Created: (NUTCH-800) Generator builds a URL list that is not encoded
Jesse Campbell (JIRA)
-
problem while crawling with nucht 1.0
kadiyalasubhash
-
Ning's HTTP Client Library
Lukáš Vlček
-
[Nutch Wiki] Update of "Becoming_A_Nutch_Developer" by maqboolzee
Apache Wiki
-
[jira] Created: (NUTCH-799) SOLRIndexer to commit once all reducers have finished
Julien Nioche (JIRA)
-
Hudson build is back to normal : Nutch-trunk #1080
Apache Hudson Server
-
[jira] Created: (NUTCH-798) Upgrade to SOLR1.4
Julien Nioche (JIRA)
-
[Nutch Wiki] Update of "Evaluations" by IvanKelly
Apache Wiki
-
New attachment added to page Evaluations on Nutch Wiki
Apache Wiki
-
[jira] Created: (NUTCH-797) parse-tika is not properly constructing URLs when the target begins with a "?"
Robert Hohman (JIRA)