This is an automated email from the ASF dual-hosted git repository.
jnioche pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/nutch.git
The following commit(s) were added to refs/heads/master by this push:
new 9823911 Update CHANGES.txt
9823911
This is an automated email from the ASF dual-hosted git repository.
jnioche pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/nutch.git.
from 9249bb0 Syntax issue
adds 7b0103f fix for NUTCH-2046 contributed by jnioche
new 6219a16 Merge
This is an automated email from the ASF dual-hosted git repository.
jnioche pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/nutch.git
commit 6219a168293aaef3232ca548d44ef10be6828f16
Merge: 9249bb0 7b0103f
Author: Julien Nioche <jul...@digitalpebble.
Author: jnioche
Date: Tue Dec 1 12:47:42 2015
New Revision: 1717412
URL: http://svn.apache.org/viewvc?rev=1717412&view=rev
Log:
NUTCH-2177 Generator produces only one partition even in distributed mode
Modified:
nutch/trunk/CHANGES.txt
nutch/trunk/src/java/org/apache/nutch/crawl
Author: jnioche
Date: Fri Nov 27 10:00:50 2015
New Revision: 1716818
URL: http://svn.apache.org/viewvc?rev=1716818&view=rev
Log:
(Trivial)nutch-default: mentioned AdaptiveFetchSchedule in description of
scheduling class config
Modified:
nutch/trunk/conf/nutch-default.xml
Modified: nutch/trunk
Author: jnioche
Date: Tue Nov 24 15:46:47 2015
New Revision: 1716182
URL: http://svn.apache.org/viewvc?rev=1716182&view=rev
Log:
Fixed minor issue with description for elastic.port config
Modified:
nutch/trunk/conf/nutch-default.xml
Modified: nutch/trunk/conf/nutch-default.xml
URL:
http
Author: jnioche
Date: Tue Sep 22 14:04:10 2015
New Revision: 1704634
URL: http://svn.apache.org/viewvc?rev=1704634&view=rev
Log:
NUTCH-2102 WARC Exporter
Added:
nutch/trunk/src/java/org/apache/nutch/tools/warc/
nutch/trunk/src/java/org/apache/nutch/tools/warc/WARCExporter.java
nutch/trunk
Author: jnioche
Date: Fri Sep 11 11:57:56 2015
New Revision: 1702434
URL: http://svn.apache.org/r1702434
Log:
added eclipse codeformat
Added:
nutch/trunk/eclipse-codeformat.xml
Added: nutch/trunk/eclipse-codeformat.xml
URL:
http://svn.apache.org/viewvc/nutch/trunk/eclipse-codeformat.xml
Author: jnioche
Date: Wed Sep 2 17:23:52 2015
New Revision: 964073
Log:
closer.lua instead of closer.cgi
Added:
websites/production/nutch/content/
- copied from r964072, websites/staging/nutch/trunk/content/
Author: jnioche
Date: Wed Aug 26 12:41:53 2015
New Revision: 1697911
URL: http://svn.apache.org/r1697911
Log:
CloudSearch indexer
Added:
nutch/trunk/src/plugin/indexer-cloudsearch/
nutch/trunk/src/plugin/indexer-cloudsearch/README.md
nutch/trunk/src/plugin/indexer-cloudsearch
Author: jnioche
Date: Thu Jun 25 13:56:14 2015
New Revision: 1687522
URL: http://svn.apache.org/r1687522
Log:
Adding some continuous crawl goodies to the crawl script NUTCH-2036
Modified:
nutch/trunk/CHANGES.txt
nutch/trunk/src/bin/crawl
Modified: nutch/trunk/CHANGES.txt
URL:
http
Author: jnioche
Date: Wed Apr 22 09:55:23 2015
New Revision: 1675305
URL: http://svn.apache.org/r1675305
Log:
NUTCH-1990 Use URI.normalise() in BasicURLNormalizer (snagel, jnioche)
Modified:
nutch/trunk/CHANGES.txt
nutch/trunk/src/plugin/urlnormalizer-basic/src/java/org/apache/nutch/net
Author: jnioche
Date: Fri Jan 30 09:06:23 2015
New Revision: 1655966
URL: http://svn.apache.org/r1655966
Log:
NUTCH-1918 TikaParser specifies a default namespace when generating DOM
Modified:
nutch/trunk/CHANGES.txt
nutch/trunk/src/plugin/parse-tika/src/java/org/apache/nutch/parse/tika
Author: jnioche
Date: Thu Oct 16 14:01:20 2014
New Revision: 1632313
URL: http://svn.apache.org/r1632313
Log:
NUTCH-1876 Upgrade to Crawler-Commons 0.5
Modified:
nutch/branches/2.x/CHANGES.txt
nutch/branches/2.x/ivy/ivy.xml
Modified: nutch/branches/2.x/CHANGES.txt
URL:
http
Author: jnioche
Date: Fri Sep 12 07:45:33 2014
New Revision: 1624471
URL: http://svn.apache.org/r1624471
Log:
NUTCH-1840
Modified:
nutch/trunk/CHANGES.txt
nutch/trunk/src/plugin/indexer-solr/src/java/org/apache/nutch/indexwriter/solr/SolrIndexWriter.java
Modified: nutch/trunk
Author: jnioche
Date: Fri Sep 12 07:47:29 2014
New Revision: 1624472
URL: http://svn.apache.org/r1624472
Log:
NUTCH-1840
Modified:
nutch/branches/2.x/CHANGES.txt
nutch/branches/2.x/src/plugin/indexer-solr/src/java/org/apache/nutch/indexwriter/solr/SolrIndexWriter.java
Modified: nutch
Author: jnioche
Date: Mon Sep 8 20:37:19 2014
New Revision: 1623562
URL: http://svn.apache.org/r1623562
Log:
NUTCH-1837 Upgrade to Tika 1.6 (jnioche)
Modified:
nutch/trunk/CHANGES.txt
nutch/trunk/ivy/ivy.xml
nutch/trunk/src/plugin/parse-tika/ivy.xml
nutch/trunk/src/plugin/parse
Author: jnioche
Date: Fri Aug 29 08:08:55 2014
New Revision: 1621258
URL: http://svn.apache.org/r1621258
Log:
Changed title of page
Modified:
nutch/cms_site/trunk/content/version_control.md
Modified: nutch/cms_site/trunk/content/version_control.md
URL:
http://svn.apache.org/viewvc/nutch
Author: jnioche
Date: Fri Aug 29 08:10:49 2014
New Revision: 920681
Log:
Changed title of the page
Added:
websites/production/nutch/content/
- copied from r920680, websites/staging/nutch/trunk/content/
Author: jnioche
Date: Fri Aug 29 11:22:46 2014
New Revision: 1621285
URL: http://svn.apache.org/r1621285
Log:
NUTCH-1828 bin/crawl : incorrect handling of nutch errors
Modified:
nutch/branches/2.x/CHANGES.txt
nutch/branches/2.x/src/bin/crawl
Modified: nutch/branches/2.x/CHANGES.txt
URL
Author: jnioche
Date: Wed Jul 30 08:55:24 2014
New Revision: 1614586
URL: http://svn.apache.org/r1614586
Log:
NUTCH-1561 improve usability of parse-metatags and index-metadata
Modified:
nutch/trunk/CHANGES.txt
nutch/trunk/conf/nutch-default.xml
nutch/trunk/src/plugin/index-metadata
Author: jnioche
Date: Thu Jul 17 09:17:38 2014
New Revision: 1611303
URL: http://svn.apache.org/r1611303
Log:
Wrong task called in deps-jar for urlfilter-* plugins prevents ant runtime from
working
Modified:
nutch/trunk/src/plugin/urlfilter-automaton/build.xml
nutch/trunk/src/plugin
Author: jnioche
Date: Thu Jul 17 12:42:01 2014
New Revision: 1611343
URL: http://svn.apache.org/r1611343
Log:
NUTCH-1818 Add deps-test-compile task for building plugins
Modified:
nutch/trunk/CHANGES.txt
nutch/trunk/src/plugin/build-plugin.xml
nutch/trunk/src/plugin/urlfilter
Author: jnioche
Date: Wed Jul 16 10:11:01 2014
New Revision: 1610956
URL: http://svn.apache.org/r1610956
Log:
NUTCH-1817 Remove pom.xml from source (jnioche)
Removed:
nutch/trunk/pom.xml
Modified:
nutch/trunk/CHANGES.txt
Modified: nutch/trunk/CHANGES.txt
URL:
http://svn.apache.org
Author: jnioche
Date: Wed Jul 16 10:12:16 2014
New Revision: 1610957
URL: http://svn.apache.org/r1610957
Log:
NUTCH-1817 Remove pom.xml from source (jnioche)
Removed:
nutch/branches/2.x/pom.xml
Modified:
nutch/branches/2.x/CHANGES.txt
Modified: nutch/branches/2.x/CHANGES.txt
URL:
http
Author: jnioche
Date: Tue Jul 15 08:39:16 2014
New Revision: 1610624
URL: http://svn.apache.org/r1610624
Log:
NUTCH-1804 Move JUnit dependency to test scope
Modified:
nutch/trunk/CHANGES.txt
nutch/trunk/build.xml
nutch/trunk/ivy/ivy.xml
nutch/trunk/src/plugin/build-plugin.xml
Author: jnioche
Date: Tue Jul 15 09:16:47 2014
New Revision: 1610628
URL: http://svn.apache.org/r1610628
Log:
NUTCH-1502 Test for CrawlDatum state transitions (snagel)
Added:
nutch/trunk/src/test/org/apache/nutch/crawl/ContinuousCrawlTestUtil.java
nutch/trunk/src/test/org/apache/nutch
Author: jnioche
Date: Tue Jul 15 10:18:49 2014
New Revision: 1610635
URL: http://svn.apache.org/r1610635
Log:
build : resolve-test calls init task so that the ivy jar gets imported
Modified:
nutch/trunk/build.xml
Modified: nutch/trunk/build.xml
URL:
http://svn.apache.org/viewvc/nutch/trunk
Author: jnioche
Date: Wed Jul 9 14:01:20 2014
New Revision: 1609158
URL: http://svn.apache.org/r1609158
Log:
NUTCH-1799 ANT Eclipse task discovers all plugin jars automatically (jnioche)
Modified:
nutch/trunk/CHANGES.txt
nutch/trunk/build.xml
Modified: nutch/trunk/CHANGES.txt
URL
Author: jnioche
Date: Mon Jul 7 12:38:23 2014
New Revision: 1608431
URL: http://svn.apache.org/r1608431
Log:
NUTCH-578 URL fetched with 403 is generated over and over again
Modified:
nutch/trunk/CHANGES.txt
nutch/trunk/src/java/org/apache/nutch/crawl/CrawlDbReducer.java
Modified: nutch
Author: jnioche
Date: Mon Jun 30 12:38:58 2014
New Revision: 1606715
URL: http://svn.apache.org/r1606715
Log:
NUTCH-1803 Put test dependencies in a separate lib dir
Modified:
nutch/trunk/CHANGES.txt
nutch/trunk/build.xml
nutch/trunk/default.properties
Modified: nutch/trunk
Author: jnioche
Date: Mon Jun 30 13:40:06 2014
New Revision: 1606730
URL: http://svn.apache.org/r1606730
Log:
NUTCH-1802 Move TestbedProxy to test environment (jnioche)
Added:
nutch/trunk/src/test/org/apache/nutch/tools/
nutch/trunk/src/test/org/apache/nutch/tools/proxy/
nutch/trunk
Author: jnioche
Date: Fri Jun 27 07:31:02 2014
New Revision: 1605975
URL: http://svn.apache.org/r1605975
Log:
NUTCH-1798 Crawl script not calling index command correctly (Aaron Bedward via
jnioche)
Modified:
nutch/branches/2.x/CHANGES.txt
nutch/branches/2.x/src/bin/crawl
Modified
Author: jnioche
Date: Fri Jun 27 07:38:45 2014
New Revision: 1605978
URL: http://svn.apache.org/r1605978
Log:
NUTCH-385 Improve description of thread related configuration for Fetcher
Modified:
nutch/trunk/CHANGES.txt
nutch/trunk/conf/nutch-default.xml
Modified: nutch/trunk/CHANGES.txt
Author: jnioche
Date: Fri Jun 27 07:49:05 2014
New Revision: 1605979
URL: http://svn.apache.org/r1605979
Log:
NUTCH-385 Improve description of thread related configuration for Fetcher
Modified:
nutch/branches/2.x/CHANGES.txt
nutch/branches/2.x/conf/nutch-default.xml
Modified: nutch
Author: jnioche
Date: Wed Jun 25 11:01:38 2014
New Revision: 1605331
URL: http://svn.apache.org/r1605331
Log:
NUTCH-1633 slf4j is provided by hadoop and should not be included in the job
file (kaveh minooie via jnioche)
Modified:
nutch/trunk/CHANGES.txt
nutch/trunk/build.xml
Modified
Author: jnioche
Date: Tue Jun 17 08:41:57 2014
New Revision: 1603094
URL: http://svn.apache.org/r1603094
Log:
NUTCH-1793 HttpRobotRulesParser not configured properly (jnioche)
Modified:
nutch/trunk/CHANGES.txt
nutch/trunk/src/plugin/lib-http/src/java/org/apache/nutch/protocol/http/api
Author: jnioche
Date: Tue Jun 17 14:11:23 2014
New Revision: 1603179
URL: http://svn.apache.org/r1603179
Log:
NUTCH-1590 [SECURITY] Frame injection vulnerability in published Javadoc
(jnioche)
Modified:
nutch/trunk/CHANGES.txt
nutch/trunk/build.xml
Modified: nutch/trunk/CHANGES.txt
Author: jnioche
Date: Tue Jun 17 14:16:39 2014
New Revision: 1603181
URL: http://svn.apache.org/r1603181
Log:
NUTCH-1590 [SECURITY] Frame injection vulnerability in published Javadoc
(jnioche)
Modified:
nutch/branches/2.x/CHANGES.txt
nutch/branches/2.x/build.xml
Modified: nutch
Author: jnioche
Date: Wed Jun 11 15:56:20 2014
New Revision: 1601937
URL: http://svn.apache.org/r1601937
Log:
NUTCH-1736 Can't fetch page if http response header contains
Transfer-Encoding:chunked
Modified:
nutch/branches/2.x/CHANGES.txt
nutch/branches/2.x/src/java/org/apache/nutch
Author: jnioche
Date: Thu Jun 5 11:09:42 2014
New Revision: 1600599
URL: http://svn.apache.org/r1600599
Log:
Fixing blunder in NUTCH-1781
Modified:
nutch/branches/2.x/conf/gora-hbase-mapping.xml
Modified: nutch/branches/2.x/conf/gora-hbase-mapping.xml
URL:
http://svn.apache.org/viewvc
Author: jnioche
Date: Fri May 30 14:55:51 2014
New Revision: 1598622
URL: http://svn.apache.org/r1598622
Log:
NUTCH-1768 Upgrade to ElasticSearch 1.1.0
Modified:
nutch/branches/2.x/CHANGES.txt
nutch/branches/2.x/ivy/ivy.xml
nutch/branches/2.x/src/plugin/indexer-elastic/ivy.xml
Author: jnioche
Date: Fri May 16 07:59:05 2014
New Revision: 1595137
URL: http://svn.apache.org/r1595137
Log:
NUTCH-1772 Injector does not need merging if no pre-existing crawldb
Modified:
nutch/trunk/CHANGES.txt
nutch/trunk/src/java/org/apache/nutch/crawl/Injector.java
Modified: nutch
Author: jnioche
Date: Fri May 16 13:32:35 2014
New Revision: 1595193
URL: http://svn.apache.org/r1595193
Log:
NUTCH-1676 Add rudimentary SSL support to protocol-http
Modified:
nutch/trunk/CHANGES.txt
nutch/trunk/src/plugin/lib-http/src/java/org/apache/nutch/protocol/http/api
Author: jnioche
Date: Sat May 10 12:39:11 2014
New Revision: 1593694
URL: http://svn.apache.org/r1593694
Log:
NUTCH-207 Bandwidth target for fetcher rather than a thread count
Modified:
nutch/trunk/CHANGES.txt
nutch/trunk/conf/nutch-default.xml
nutch/trunk/src/java/org/apache/nutch
Author: jnioche
Date: Mon May 12 07:59:01 2014
New Revision: 1593901
URL: http://svn.apache.org/r1593901
Log:
NUTCH-1766
Modified:
nutch/trunk/CHANGES.txt
nutch/trunk/src/java/org/apache/nutch/crawl/Generator.java
Modified: nutch/trunk/CHANGES.txt
URL:
http://svn.apache.org/viewvc
Author: jnioche
Date: Mon May 12 12:58:41 2014
New Revision: 1593954
URL: http://svn.apache.org/r1593954
Log:
NUTCH-1613 Timeouts in protocol-httpclient when crawling same host with 2
threads
Modified:
nutch/branches/2.x/CHANGES.txt
nutch/branches/2.x/src/plugin/protocol-httpclient/src
Author: jnioche
Date: Wed Apr 30 13:14:40 2014
New Revision: 1591282
URL: http://svn.apache.org/r1591282
Log:
NUTCH-1720 Duplicate lines in HttpBase.java (Walter Tietze via jnioche)
Modified:
nutch/branches/2.x/CHANGES.txt
nutch/branches/2.x/src/plugin/lib-http/src/java/org/apache/nutch
Author: jnioche
Date: Mon Apr 28 11:01:31 2014
New Revision: 1590600
URL: http://svn.apache.org/r1590600
Log:
NUTCH-1759 Upgrade to Crawler Commons 0.4
Modified:
nutch/trunk/CHANGES.txt
nutch/trunk/ivy/ivy.xml
Modified: nutch/trunk/CHANGES.txt
URL:
http://svn.apache.org/viewvc/nutch
Author: jnioche
Date: Mon Apr 28 11:02:58 2014
New Revision: 1590601
URL: http://svn.apache.org/r1590601
Log:
NUTCH-1759 Upgrade to Crawler Commons 0.4
Modified:
nutch/branches/2.x/CHANGES.txt
nutch/branches/2.x/ivy/ivy.xml
Modified: nutch/branches/2.x/CHANGES.txt
URL:
http
Author: jnioche
Date: Fri Apr 18 17:33:07 2014
New Revision: 1588523
URL: http://svn.apache.org/r1588523
Log:
minor change : added count of waiting threads in status report
Modified:
nutch/trunk/src/java/org/apache/nutch/fetcher/Fetcher.java
Modified: nutch/trunk/src/java/org/apache/nutch
Author: jnioche
Date: Thu Apr 17 10:55:42 2014
New Revision: 1588206
URL: http://svn.apache.org/r1588206
Log:
updated list of committers in pom and mvn.template
Modified:
nutch/trunk/ivy/mvn.template
nutch/trunk/pom.xml
Modified: nutch/trunk/ivy/mvn.template
URL:
http://svn.apache.org
Author: jnioche
Date: Thu Apr 17 10:56:28 2014
New Revision: 1588207
URL: http://svn.apache.org/r1588207
Log:
updated list of committers in pom and mvn.template
Modified:
nutch/branches/2.x/ivy/mvn.template
nutch/branches/2.x/pom.xml
Modified: nutch/branches/2.x/ivy/mvn.template
URL
Author: jnioche
Date: Wed Apr 16 14:29:23 2014
New Revision: 1587923
URL: http://svn.apache.org/r1587923
Log:
NUTCH-1720 Duplicate code in HttpBase
Modified:
nutch/trunk/CHANGES.txt
nutch/trunk/src/plugin/lib-http/src/java/org/apache/nutch/protocol/http/api/HttpBase.java
Modified: nutch
Author: jnioche
Date: Wed Apr 16 14:38:13 2014
New Revision: 1587928
URL: http://svn.apache.org/r1587928
Log:
NUTCH-1603
Modified:
nutch/trunk/CHANGES.txt
nutch/trunk/src/plugin/parse-zip/src/java/org/apache/nutch/parse/zip/ZipParser.java
Modified: nutch/trunk/CHANGES.txt
URL:
http
Author: jnioche
Date: Wed Apr 16 14:56:09 2014
New Revision: 1587935
URL: http://svn.apache.org/r1587935
Log:
NUTCH-1743
Modified:
nutch/trunk/CHANGES.txt
nutch/trunk/src/java/org/apache/nutch/parse/ParserChecker.java
Modified: nutch/trunk/CHANGES.txt
URL:
http://svn.apache.org/viewvc
Author: jnioche
Date: Wed Apr 16 14:58:29 2014
New Revision: 1587936
URL: http://svn.apache.org/r1587936
Log:
NUTCH-1743
Modified:
nutch/branches/2.x/CHANGES.txt
nutch/branches/2.x/src/java/org/apache/nutch/parse/ParserChecker.java
Modified: nutch/branches/2.x/CHANGES.txt
URL:
http
Author: jnioche
Date: Fri Feb 21 12:07:15 2014
New Revision: 1570546
URL: http://svn.apache.org/r1570546
Log:
NUTCH-1729 upgrade tika 1.5
Modified:
nutch/branches/2.x/CHANGES.txt
nutch/branches/2.x/ivy/ivy.xml
nutch/branches/2.x/src/plugin/parse-tika/ivy.xml
nutch/branches/2.x
Author: jnioche
Date: Mon Nov 18 12:08:45 2013
New Revision: 1543010
URL: http://svn.apache.org/r1543010
Log:
NUTCH-1668 Remove package org.apache.nutch.indexer.solr
Removed:
nutch/trunk/src/java/org/apache/nutch/indexer/solr/
Modified:
nutch/trunk/CHANGES.txt
nutch/trunk/ivy/ivy.xml
Author: jnioche
Date: Fri Nov 15 09:20:03 2013
New Revision: 1542208
URL: http://svn.apache.org/r1542208
Log:
Removed all in one Crawl class (NUTCH-1621)
Modified:
nutch/branches/2.x/src/java/org/apache/nutch/api/JobManager.java
nutch/branches/2.x/src/java/org/apache/nutch/api/impl
Author: jnioche
Date: Thu Nov 14 11:55:33 2013
New Revision: 1541883
URL: http://svn.apache.org/r1541883
Log:
NUTCH-656 Generic Deduplicator (jnioche, snagel)
Added:
nutch/trunk/src/java/org/apache/nutch/crawl/DeduplicationJob.java
Modified:
nutch/trunk/CHANGES.txt
nutch/trunk/src
Author: jnioche
Date: Thu Nov 14 12:11:36 2013
New Revision: 1541885
URL: http://svn.apache.org/r1541885
Log:
Removed all in one Crawl class (NUTCH-1621)
Removed:
nutch/trunk/src/java/org/apache/nutch/crawl/Crawl.java
Modified:
nutch/trunk/CHANGES.txt
nutch/trunk/src/bin/nutch
Author: jnioche
Date: Thu Nov 14 12:12:32 2013
New Revision: 1541886
URL: http://svn.apache.org/r1541886
Log:
Removed all in one Crawl class (NUTCH-1621)
Removed:
nutch/branches/2.x/src/java/org/apache/nutch/crawl/Crawler.java
Modified:
nutch/branches/2.x/CHANGES.txt
nutch/branches/2
Author: jnioche
Date: Thu Nov 14 14:36:12 2013
New Revision: 1541917
URL: http://svn.apache.org/r1541917
Log:
Giving Cleaning and Deduplication jobs a name to display
Modified:
nutch/trunk/src/java/org/apache/nutch/crawl/DeduplicationJob.java
nutch/trunk/src/java/org/apache/nutch/indexer
Author: jnioche
Date: Mon Nov 11 10:15:03 2013
New Revision: 1540654
URL: http://svn.apache.org/r1540654
Log:
NUTCH-1666 Optimisation for BasicURLNormalizer (jnioche)
Modified:
nutch/trunk/CHANGES.txt
nutch/trunk/src/plugin/urlnormalizer-basic/src/java/org/apache/nutch/net/urlnormalizer
Author: jnioche
Date: Wed Oct 23 09:21:07 2013
New Revision: 1534962
URL: http://svn.apache.org/r1534962
Log:
Fix NUTCH-1640
Modified:
nutch/trunk/src/java/org/apache/nutch/parse/ParseSegment.java
Modified: nutch/trunk/src/java/org/apache/nutch/parse/ParseSegment.java
URL:
http
Author: jnioche
Date: Tue Oct 22 13:38:31 2013
New Revision: 1534633
URL: http://svn.apache.org/r1534633
Log:
moved ab and siren to former committers
Modified:
nutch/site/forrest/src/documentation/content/xdocs/credits.xml
nutch/site/publish/credits.html
nutch/site/publish
Author: jnioche
Date: Mon Oct 14 09:45:46 2013
New Revision: 1531833
URL: http://svn.apache.org/r1531833
Log:
NUTCH-1606
Modified:
nutch/trunk/CHANGES.txt
nutch/trunk/src/java/org/apache/nutch/crawl/FetchScheduleFactory.java
nutch/trunk/src/java/org/apache/nutch/crawl
Author: jnioche
Date: Thu Oct 10 13:50:17 2013
New Revision: 1530979
URL: http://svn.apache.org/r1530979
Log:
NUTCH-1653 AbstractScoringFilter
Added:
nutch/trunk/src/java/org/apache/nutch/scoring/AbstractScoringFilter.java
Modified:
nutch/trunk/CHANGES.txt
Modified: nutch/trunk
Author: jnioche
Date: Mon Oct 7 10:08:43 2013
New Revision: 1529813
URL: http://svn.apache.org/r1529813
Log:
NUTCH-1562
Modified:
nutch/trunk/CHANGES.txt
nutch/trunk/conf/nutch-default.xml
nutch/trunk/src/java/org/apache/nutch/indexer/IndexingFilters.java
nutch/trunk/src/java
Author: jnioche
Date: Fri Sep 20 08:03:24 2013
New Revision: 1524931
URL: http://svn.apache.org/r1524931
Log:
NUTCH-1641 Log timings for main jobs
Modified:
nutch/branches/2.x/CHANGES.txt
nutch/branches/2.x/src/java/org/apache/nutch/crawl/DbUpdaterJob.java
nutch/branches/2.x/src/java
Author: jnioche
Date: Thu Aug 29 11:27:45 2013
New Revision: 1518594
URL: http://svn.apache.org/r1518594
Log:
(NUTCH-1622) Create Outlinks with metadata
Modified:
nutch/trunk/CHANGES.txt
nutch/trunk/src/java/org/apache/nutch/parse/Outlink.java
nutch/trunk/src/java/org/apache/nutch
Author: jnioche
Date: Fri Aug 23 08:40:19 2013
New Revision: 1516746
URL: http://svn.apache.org/r1516746
Log:
NUTCH-1629 Injector skips empty lines
Modified:
nutch/trunk/CHANGES.txt
nutch/trunk/src/java/org/apache/nutch/crawl/Injector.java
Modified: nutch/trunk/CHANGES.txt
URL:
http
Author: jnioche
Date: Mon Jul 29 13:38:23 2013
New Revision: 1508065
URL: http://svn.apache.org/r1508065
Log:
NUTCH-806 Merge CrawlDBScanner with CrawlDBReader
Removed:
nutch/trunk/src/java/org/apache/nutch/tools/CrawlDBScanner.java
Modified:
nutch/trunk/CHANGES.txt
Modified: nutch
Author: jnioche
Date: Mon Jul 8 08:50:08 2013
New Revision: 1500610
URL: http://svn.apache.org/r1500610
Log:
NUTCH-1604 Protocol-factory not thread-safe
Modified:
nutch/branches/2.x/CHANGES.txt
nutch/branches/2.x/src/java/org/apache/nutch/protocol/ProtocolFactory.java
Modified: nutch
Author: jnioche
Date: Mon Jun 10 11:24:14 2013
New Revision: 1491420
URL: http://svn.apache.org/r1491420
Log:
NUTCH-1522 Upgrade to Tika 1.3
Modified:
nutch/trunk/CHANGES.txt
nutch/trunk/ivy/ivy.xml
nutch/trunk/src/plugin/parse-tika/ivy.xml
nutch/trunk/src/plugin/parse-tika
Author: jnioche
Date: Mon Jun 10 11:25:36 2013
New Revision: 1491421
URL: http://svn.apache.org/r1491421
Log:
NUTCH-1522 Upgrade to Tika 1.3
Modified:
nutch/branches/2.x/CHANGES.txt
nutch/branches/2.x/ivy/ivy.xml
nutch/branches/2.x/src/plugin/parse-tika/ivy.xml
nutch/branches/2.x
Author: jnioche
Date: Thu Apr 18 09:08:16 2013
New Revision: 1469231
URL: http://svn.apache.org/r1469231
Log:
Fixed detection of job file for distributed mode
Modified:
nutch/trunk/src/bin/crawl
Modified: nutch/trunk/src/bin/crawl
URL:
http://svn.apache.org/viewvc/nutch/trunk/src/bin/crawl
Author: jnioche
Date: Thu Apr 18 09:26:07 2013
New Revision: 1469241
URL: http://svn.apache.org/r1469241
Log:
Fixed detection of job file for distributed mode
Modified:
nutch/branches/2.x/src/bin/crawl
Modified: nutch/branches/2.x/src/bin/crawl
URL:
http://svn.apache.org/viewvc/nutch
Author: jnioche
Date: Fri Jan 18 11:53:13 2013
New Revision: 1435101
URL: http://svn.apache.org/viewvc?rev=1435101&view=rev
Log:
NUTCH-840 Port tests from parse-html to parse-tika
Added:
nutch/trunk/src/plugin/parse-tika/src/test/org/apache/nutch/tika/TestDOMContentUtils.java
nutch
Author: jnioche
Date: Sat Oct 20 09:14:40 2012
New Revision: 1400397
URL: http://svn.apache.org/viewvc?rev=1400397&view=rev
Log:
NUTCH-1433 Upgrade to Tika 1.2
Modified:
nutch/branches/2.x/CHANGES.txt
nutch/branches/2.x/ivy/ivy.xml
nutch/branches/2.x/src/plugin/parse-tika/ivy.xml
Author: jnioche
Date: Tue Jul 10 08:44:31 2012
New Revision: 1359526
URL: http://svn.apache.org/viewvc?rev=1359526&view=rev
Log:
Renaming nutchgora branch into 2.x
Added:
nutch/branches/2.x/
- copied from r1359525, nutch/branches/nutchgora/
Removed:
nutch/branches/nutchgora/
Author: jnioche
Date: Tue Jul 10 15:21:33 2012
New Revision: 1359725
URL: http://svn.apache.org/viewvc?rev=1359725&view=rev
Log:
Fixed change file after release 1.5.1
Modified:
nutch/trunk/CHANGES.txt
Modified: nutch/trunk/CHANGES.txt
URL:
http://svn.apache.org/viewvc/nutch/trunk
Author: jnioche
Date: Wed Jun 20 09:28:16 2012
New Revision: 1352008
URL: http://svn.apache.org/viewvc?rev=1352008&view=rev
Log:
NUTCH-1400 + changed version to 1.5.1-SNAPSHOT
Modified:
nutch/trunk/CHANGES.txt
nutch/trunk/default.properties
nutch/trunk/src/bin/nutch
Modified: nutch
Author: jnioche
Date: Wed Jun 20 10:34:29 2012
New Revision: 1352037
URL: http://svn.apache.org/viewvc?rev=1352037&view=rev
Log:
NUTCH-1391 readdb -stats fires java.io.EOFException
Modified:
nutch/branches/nutchgora/CHANGES.txt
nutch/branches/nutchgora/src/java/org/apache/nutch/crawl
Author: jnioche
Date: Fri Jun 15 14:07:59 2012
New Revision: 1350630
URL: http://svn.apache.org/viewvc?rev=1350630&view=rev
Log:
NUTCH-1398 Upgrade to Hadoop 1.0.3
Modified:
nutch/trunk/CHANGES.txt
nutch/trunk/ivy/ivy.xml
Modified: nutch/trunk/CHANGES.txt
URL:
http://svn.apache.org
Author: jnioche
Date: Tue May 22 09:10:00 2012
New Revision: 1341365
URL: http://svn.apache.org/viewvc?rev=1341365&view=rev
Log:
Minor changes to the Maven template
Modified:
nutch/trunk/ivy/mvn.template
Modified: nutch/trunk/ivy/mvn.template
URL:
http://svn.apache.org/viewvc/nutch/trunk
Author: jnioche
Date: Tue May 22 09:38:47 2012
New Revision: 1341375
URL: http://svn.apache.org/viewvc?rev=1341375&view=rev
Log:
fix mvn template
Modified:
nutch/trunk/ivy/mvn.template
Modified: nutch/trunk/ivy/mvn.template
URL:
http://svn.apache.org/viewvc/nutch/trunk/ivy/mvn.template?rev
Author: jnioche
Date: Thu Apr 19 10:49:56 2012
New Revision: 1327896
URL: http://svn.apache.org/viewvc?rev=1327896&view=rev
Log:
new ANT target for packaging binary release
Modified:
nutch/trunk/build.xml
Modified: nutch/trunk/build.xml
URL:
http://svn.apache.org/viewvc/nutch/trunk
Author: jnioche
Date: Tue Apr 3 12:39:38 2012
New Revision: 1308865
URL: http://svn.apache.org/viewvc?rev=1308865&view=rev
Log:
NUTCH-1208 Don't include KEYS file in bin distribution
Modified:
nutch/trunk/CHANGES.txt
nutch/trunk/build.xml
Modified: nutch/trunk/CHANGES.txt
URL:
http
Author: jnioche
Date: Mon Apr 2 11:50:35 2012
New Revision: 1308310
URL: http://svn.apache.org/viewvc?rev=1308310&view=rev
Log:
NUTCH-1234 Upgrade to Tika 1.1 (jnioche, markus)
Removed:
nutch/trunk/src/plugin/parse-tika/src/java/org/apache/nutch/parse/tika/TikaConfig.java
Modified
Author: jnioche
Date: Wed Mar 21 12:43:19 2012
New Revision: 1303371
URL: http://svn.apache.org/viewvc?rev=1303371&view=rev
Log:
NUTCH-809 Parse-metatags plugin (jnioche)
Added:
nutch/trunk/src/plugin/parse-metatags/
nutch/trunk/src/plugin/parse-metatags/README.txt
nutch/trunk/src
Author: jnioche
Date: Thu Jan 5 11:05:43 2012
New Revision: 1227553
URL: http://svn.apache.org/viewvc?rev=1227553&view=rev
Log:
NUTCH-1146 Prevent generation of _SUCCESS files in output
Modified:
nutch/trunk/CHANGES.txt
nutch/trunk/conf/nutch-default.xml
Modified: nutch/trunk
Author: jnioche
Date: Thu Jan 5 12:09:09 2012
New Revision: 1227565
URL: http://svn.apache.org/viewvc?rev=1227565&view=rev
Log:
NUTCH-1243 Unit Jar removed from lib
Modified:
nutch/trunk/ivy/ivy.xml
Modified: nutch/trunk/ivy/ivy.xml
URL:
http://svn.apache.org/viewvc/nutch/trunk/ivy/ivy.xml
Author: jnioche
Date: Sat Oct 1 08:24:32 2011
New Revision: 1177944
URL: http://svn.apache.org/viewvc?rev=1177944&view=rev
Log:
NUTCH-1141 Configurable Fetcher queue depth
Modified:
nutch/trunk/CHANGES.txt
nutch/trunk/conf/nutch-default.xml
nutch/trunk/src/java/org/apache/nutch
Author: jnioche
Date: Wed Sep 28 11:15:53 2011
New Revision: 1176823
URL: http://svn.apache.org/viewvc?rev=1176823&view=rev
Log:
NUTCH-937 Put plugins in classes/plugins in job file (Claudio Martella, Ferdy
Galema, jnioche)
Modified:
nutch/trunk/CHANGES.txt
nutch/trunk/build.xml
Author: jnioche
Date: Wed Sep 28 11:18:21 2011
New Revision: 1176824
URL: http://svn.apache.org/viewvc?rev=1176824&view=rev
Log:
NUTCH-937 Put plugins in classes/plugins in job file (Claudio Martella, Ferdy
Galema, jnioche)
Modified:
nutch/branches/nutchgora/CHANGES.txt
nutch/branches
Author: jnioche
Date: Sun Sep 25 20:47:41 2011
New Revision: 1175571
URL: http://svn.apache.org/viewvc?rev=1175571&view=rev
Log:
NUTCH-1131 Rely on published artefacts for GORA (jnioche)
Modified:
nutch/branches/nutchgora/CHANGES.txt
nutch/branches/nutchgora/ivy/ivy.xml
nutch/branches
Author: jnioche
Date: Thu Sep 1 13:08:23 2011
New Revision: 1164064
URL: http://svn.apache.org/viewvc?rev=1164064&view=rev
Log:
NUTCH-1073 Renamed parameters 'fetcher.threads.per.host.by.ip' and
'fetcher.threads.per.host'
Modified:
nutch/branches/branch-1.4/CHANGES.txt
nutch/branches
Author: jnioche
Date: Thu Sep 1 15:15:33 2011
New Revision: 1164108
URL: http://svn.apache.org/viewvc?rev=1164108&view=rev
Log:
NUTCH-1096 Empty (not null) ContentLength results in failure of fetch (Ferdy
Galema via jnioche)
Modified:
nutch/branches/branch-1.4/CHANGES.txt
nutch