[
https://issues.apache.org/jira/browse/NUTCH-2274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15328012#comment-15328012
]
Lewis John McGibbney commented on NUTCH-2274:
-
Thanks for registering [~bmzhao] are you
[
https://issues.apache.org/jira/browse/NUTCH-2273?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Lewis John McGibbney reassigned NUTCH-2273:
---
Assignee: Lewis John McGibbney
> Selenium and InteractiveSelenium Do Not
[
https://issues.apache.org/jira/browse/NUTCH-2274?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Lewis John McGibbney reassigned NUTCH-2274:
---
Assignee: Lewis John McGibbney
> InteractiveSelenium Plugin's DefaultHandler
[
https://issues.apache.org/jira/browse/NUTCH-2271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15325106#comment-15325106
]
Lewis John McGibbney commented on NUTCH-2271:
-
Please take a look at the nutch-default.xml
[
https://issues.apache.org/jira/browse/NUTCH-2271?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Lewis John McGibbney resolved NUTCH-2271.
-
Resolution: Not A Bug
Nutch 1.12 supports Solr 5.4.1 not 6. Also Nutch 1.12 is
[
https://issues.apache.org/jira/browse/NUTCH-2271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15310680#comment-15310680
]
Lewis John McGibbney commented on NUTCH-2271:
-
No. Please check build.xml for which version
[
https://issues.apache.org/jira/browse/NUTCH-2265?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Lewis John McGibbney updated NUTCH-2265:
Fix Version/s: (was: 1.12)
1.13
> Write A Test Package for
[
https://issues.apache.org/jira/browse/NUTCH-2234?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15298842#comment-15298842
]
Lewis John McGibbney commented on NUTCH-2234:
-
bq. I can update the patch or open a PR on
[
https://issues.apache.org/jira/browse/NUTCH-2089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Lewis John McGibbney updated NUTCH-2089:
Fix Version/s: (was: 2.5)
2.4
> Move Nutch 2.x to compile on
[
https://issues.apache.org/jira/browse/NUTCH-2089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Lewis John McGibbney updated NUTCH-2089:
Summary: Move Nutch 2.x to compile on JDK 8 (was: Move Nutch to compile on JDK 8)
[
https://issues.apache.org/jira/browse/NUTCH-2266?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Lewis John McGibbney resolved NUTCH-2266.
-
Resolution: Fixed
Thanks [~kamaci] :)
> Fix dead link in build.xml for javadoc
>
[
https://issues.apache.org/jira/browse/NUTCH-2266?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Lewis John McGibbney updated NUTCH-2266:
Fix Version/s: (was: 2.5)
2.4
> Fix dead link in build.xml
[
https://issues.apache.org/jira/browse/NUTCH-2089?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15296981#comment-15296981
]
Lewis John McGibbney edited comment on NUTCH-2089 at 5/23/16 7:59 PM:
--
[
https://issues.apache.org/jira/browse/NUTCH-2089?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15296981#comment-15296981
]
Lewis John McGibbney commented on NUTCH-2089:
-
What about javadoc?
> Move Nutch to compile on
[
https://issues.apache.org/jira/browse/NUTCH-?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15295673#comment-15295673
]
Lewis John McGibbney commented on NUTCH-:
-
Thanks Furkan. Do you have a unit test which
[
https://issues.apache.org/jira/browse/NUTCH-2263?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Lewis John McGibbney updated NUTCH-2263:
Fix Version/s: (was: 2.4.1)
1.12
> Support for mingram and
[
https://issues.apache.org/jira/browse/NUTCH-2263?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Lewis John McGibbney resolved NUTCH-2263.
-
Resolution: Fixed
Assignee: Furkan KAMACI
Thank you [~kamaci] nice patch
>
[
https://issues.apache.org/jira/browse/NUTCH-2122?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Lewis John McGibbney updated NUTCH-2122:
Summary: Implement Javadoc package-info.html for webui packages (was:
Implement
[
https://issues.apache.org/jira/browse/NUTCH-2122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15292454#comment-15292454
]
Lewis John McGibbney commented on NUTCH-2122:
-
I agree :)
> Implement Javadoc package.html
[
https://issues.apache.org/jira/browse/NUTCH-1858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15291595#comment-15291595
]
Lewis John McGibbney commented on NUTCH-1858:
-
AFAIK a script or two exist to do the
[
https://issues.apache.org/jira/browse/NUTCH-1858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15291400#comment-15291400
]
Lewis John McGibbney commented on NUTCH-1858:
-
I honestly do not know. This is a huge amount of
[
https://issues.apache.org/jira/browse/NUTCH-2122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15291395#comment-15291395
]
Lewis John McGibbney commented on NUTCH-2122:
-
Hi Furkan, please change the 'service' to
[
https://issues.apache.org/jira/browse/NUTCH-?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15287572#comment-15287572
]
Lewis John McGibbney commented on NUTCH-:
-
[~kamaci] can you please check this issue out
[
https://issues.apache.org/jira/browse/NUTCH-2112?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Lewis John McGibbney updated NUTCH-2112:
Fix Version/s: 2.4
> Missing org.restlet.jee when building with gora-solr
>
[
https://issues.apache.org/jira/browse/NUTCH-2248?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Lewis John McGibbney reopened NUTCH-2248:
-
> CSS parser plugin
> -
>
> Key: NUTCH-2248
>
[
https://issues.apache.org/jira/browse/NUTCH-2251?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Lewis John McGibbney updated NUTCH-2251:
Fix Version/s: (was: 1.10)
1.13
> Make
[
https://issues.apache.org/jira/browse/NUTCH-2260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Lewis John McGibbney resolved NUTCH-2260.
-
Resolution: Fixed
> JAVA_HOME and hbase-common dependency absent from hbase
[
https://issues.apache.org/jira/browse/NUTCH-2259?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Lewis John McGibbney resolved NUTCH-2259.
-
Resolution: Fixed
> Nutch 2.x HBase Docker requires a logs folder to run
[
https://issues.apache.org/jira/browse/NUTCH-2259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15283643#comment-15283643
]
Lewis John McGibbney commented on NUTCH-2259:
-
PR is available at
Lewis John McGibbney created NUTCH-2259:
---
Summary: Nutch 2.x HBase Docker requires a logs folder to run exception free
Key: NUTCH-2259
URL: https://issues.apache.org/jira/browse/NUTCH-2259
[
https://issues.apache.org/jira/browse/NUTCH-2258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15281054#comment-15281054
]
Lewis John McGibbney commented on NUTCH-2258:
-
I accidentally created this issue for NUTCH
Lewis John McGibbney created NUTCH-2258:
---
Summary: Provide Javadoc for ScriptInput/OutputFormat's
Key: NUTCH-2258
URL: https://issues.apache.org/jira/browse/NUTCH-2258
Project: Nutch
[
https://issues.apache.org/jira/browse/NUTCH-2252?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Lewis John McGibbney resolved NUTCH-2252.
-
Resolution: Fixed
Thanks [~kwhitehall] and folks.
> Allow phantomjs as a
[
https://issues.apache.org/jira/browse/NUTCH-1824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15276727#comment-15276727
]
Lewis John McGibbney commented on NUTCH-1824:
-
Hi [~xjtujiyong] I'll try and scope later and
[
https://issues.apache.org/jira/browse/NUTCH-2252?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Lewis John McGibbney updated NUTCH-2252:
Fix Version/s: 1.12
> Allow phantomjs as a browser for selenium options
>
[
https://issues.apache.org/jira/browse/NUTCH-2252?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Lewis John McGibbney updated NUTCH-2252:
Affects Version/s: 1.12
> Allow phantomjs as a browser for selenium options
>
[
https://issues.apache.org/jira/browse/NUTCH-2252?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Lewis John McGibbney reassigned NUTCH-2252:
---
Assignee: Lewis John McGibbney
> Allow phantomjs as a browser for selenium
[
https://issues.apache.org/jira/browse/NUTCH-2188?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Lewis John McGibbney updated NUTCH-2188:
Fix Version/s: (was: 1.9)
1.12
> While crawling with solr
[
https://issues.apache.org/jira/browse/NUTCH-2217?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Lewis John McGibbney updated NUTCH-2217:
Fix Version/s: (was: 1.11)
2.5
> Crawl pages with specified
[
https://issues.apache.org/jira/browse/NUTCH-2238?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Lewis John McGibbney resolved NUTCH-2238.
-
Resolution: Fixed
Thank you [~ptorrestr]
> Indexer for Elasticsearch 2.x
>
[
https://issues.apache.org/jira/browse/NUTCH-1741?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Lewis John McGibbney updated NUTCH-1741:
Fix Version/s: (was: 2.3.2)
2.4
> Support of Sitemaps in
[
https://issues.apache.org/jira/browse/NUTCH-?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Lewis John McGibbney updated NUTCH-:
Fix Version/s: (was: 2.3.2)
2.4
> re-fetch deletes all
[
https://issues.apache.org/jira/browse/NUTCH-2238?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Lewis John McGibbney updated NUTCH-2238:
Assignee: Pablo Torres
> Indexer for Elasticsearch 2.x
>
[
https://issues.apache.org/jira/browse/NUTCH-2238?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Lewis John McGibbney updated NUTCH-2238:
Fix Version/s: (was: 2.3.2)
> Indexer for Elasticsearch 2.x
>
[
https://issues.apache.org/jira/browse/NUTCH-?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15230393#comment-15230393
]
Lewis John McGibbney commented on NUTCH-:
-
I've committed and closed the MemStore
[
https://issues.apache.org/jira/browse/NUTCH-2191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15213527#comment-15213527
]
Lewis John McGibbney commented on NUTCH-2191:
-
Thanks [~karanjeets] good job
> Add
[
https://issues.apache.org/jira/browse/NUTCH-2191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15213390#comment-15213390
]
Lewis John McGibbney commented on NUTCH-2191:
-
[~karanjeets] if you can please step through
[
https://issues.apache.org/jira/browse/NUTCH-2089?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15212656#comment-15212656
]
Lewis John McGibbney commented on NUTCH-2089:
-
Compiler warnings, etc. Feel free to analyze
[
https://issues.apache.org/jira/browse/NUTCH-2005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15210241#comment-15210241
]
Lewis John McGibbney commented on NUTCH-2005:
-
[~mefaraz...@gmail.com], please check out the
[
https://issues.apache.org/jira/browse/NUTCH-2005?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Lewis John McGibbney updated NUTCH-2005:
Description:
Recent developments within the tracing community have brought projects
[
https://issues.apache.org/jira/browse/NUTCH-2005?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Lewis John McGibbney updated NUTCH-2005:
Fix Version/s: 2.4
> Implement HTrace'ing in Nutch
> -
[
https://issues.apache.org/jira/browse/NUTCH-2005?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Lewis John McGibbney updated NUTCH-2005:
Description:
Recent developments within the tracing community have brought projects
[
https://issues.apache.org/jira/browse/NUTCH-1756?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Lewis John McGibbney updated NUTCH-1756:
Description:
It will be beneficial to have a security layer for NutchServer once we
[
https://issues.apache.org/jira/browse/NUTCH-1756?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Lewis John McGibbney updated NUTCH-1756:
Labels: gsoc2016 (was: )
> Security layer for NutchServer
>
[
https://issues.apache.org/jira/browse/NUTCH-2005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15209411#comment-15209411
]
Lewis John McGibbney commented on NUTCH-2005:
-
[~mefaraz...@gmail.com] are you still
[
https://issues.apache.org/jira/browse/NUTCH-1492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15198869#comment-15198869
]
Lewis John McGibbney commented on NUTCH-1492:
-
[~renato2099] what about this shit?
> Support
[
https://issues.apache.org/jira/browse/NUTCH-2185?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Lewis John McGibbney updated NUTCH-2185:
Description: I'm finishing off a Nutch protocol implementation for
interacting with
[
https://issues.apache.org/jira/browse/NUTCH-2202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15187238#comment-15187238
]
Lewis John McGibbney commented on NUTCH-2202:
-
[~robertmeusel] please don't look into it yet.
[
https://issues.apache.org/jira/browse/NUTCH-2202?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Lewis John McGibbney reassigned NUTCH-2202:
---
Assignee: Lewis John McGibbney
> Integration of Anthelion (Focused Crawling
[
https://issues.apache.org/jira/browse/NUTCH-2202?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Lewis John McGibbney updated NUTCH-2202:
Description: We have recently released anthelion, which is a focused
crawler plugin
[
https://issues.apache.org/jira/browse/NUTCH-?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15178568#comment-15178568
]
Lewis John McGibbney commented on NUTCH-:
-
Nice [~abenjell],
In Nutch we use
[
https://issues.apache.org/jira/browse/NUTCH-2184?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15176122#comment-15176122
]
Lewis John McGibbney commented on NUTCH-2184:
-
sh*t, I didn't push up my assertions. I'll get
[
https://issues.apache.org/jira/browse/NUTCH-?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15176048#comment-15176048
]
Lewis John McGibbney edited comment on NUTCH- at 3/2/16 5:34 PM:
-
[
https://issues.apache.org/jira/browse/NUTCH-?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15176048#comment-15176048
]
Lewis John McGibbney commented on NUTCH-:
-
I don't think there is any workaround, no. The
[
https://issues.apache.org/jira/browse/NUTCH-?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15171487#comment-15171487
]
Lewis John McGibbney commented on NUTCH-:
-
Hi, I can replicate this on
[
https://issues.apache.org/jira/browse/NUTCH-?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Lewis John McGibbney updated NUTCH-:
Fix Version/s: 2.3.2
> re-fetch deletes all metadata except _csh_ and _rs_
>
[
https://issues.apache.org/jira/browse/NUTCH-1741?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Lewis John McGibbney updated NUTCH-1741:
Fix Version/s: (was: 2.4)
2.3.2
> Support of Sitemaps in
[
https://issues.apache.org/jira/browse/NUTCH-2234?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15169521#comment-15169521
]
Lewis John McGibbney commented on NUTCH-2234:
-
Out of curiosity. What versions of httpcore and
[
https://issues.apache.org/jira/browse/NUTCH-2235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15168253#comment-15168253
]
Lewis John McGibbney commented on NUTCH-2235:
-
The source of this issue is ordering of Nutch
[
https://issues.apache.org/jira/browse/NUTCH-2235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15168253#comment-15168253
]
Lewis John McGibbney edited comment on NUTCH-2235 at 2/26/16 1:38 AM:
--
[
https://issues.apache.org/jira/browse/NUTCH-2235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15168156#comment-15168156
]
Lewis John McGibbney commented on NUTCH-2235:
-
Looks like the issue is with httpcore instead
[
https://issues.apache.org/jira/browse/NUTCH-2235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15168097#comment-15168097
]
Lewis John McGibbney commented on NUTCH-2235:
-
{code}
jar tf apache-nutch-1.12-SNAPSHOT.job |
[
https://issues.apache.org/jira/browse/NUTCH-2235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15168095#comment-15168095
]
Lewis John McGibbney commented on NUTCH-2235:
-
This issue is commonly associated with
Lewis John McGibbney created NUTCH-2235:
---
Summary: Classpath discrepancy with protocol-selenium in deploy mode
Key: NUTCH-2235
URL: https://issues.apache.org/jira/browse/NUTCH-2235
Project:
[
https://issues.apache.org/jira/browse/NUTCH-1712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15167963#comment-15167963
]
Lewis John McGibbney commented on NUTCH-1712:
-
Is the Nutch codebase now acting off of Git? If
[
https://issues.apache.org/jira/browse/NUTCH-?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15167309#comment-15167309
]
Lewis John McGibbney edited comment on NUTCH- at 2/25/16 3:21 PM:
--
[
https://issues.apache.org/jira/browse/NUTCH-?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15167309#comment-15167309
]
Lewis John McGibbney edited comment on NUTCH- at 2/25/16 3:20 PM:
--
[
https://issues.apache.org/jira/browse/NUTCH-?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15167309#comment-15167309
]
Lewis John McGibbney commented on NUTCH-:
-
We need to step through crawl steps and find
[
https://issues.apache.org/jira/browse/NUTCH-?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Lewis John McGibbney updated NUTCH-:
Description:
This problem happens the second time I crawl a page
{code}
[
https://issues.apache.org/jira/browse/NUTCH-?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Lewis John McGibbney updated NUTCH-:
Summary: re-fetch deletes all metadata except _csh_ and _rs_ (was: fetch
deletes
[
https://issues.apache.org/jira/browse/NUTCH-?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Lewis John McGibbney reassigned NUTCH-:
---
Assignee: Lewis John McGibbney
> fetch deletes all metadata except _csh_
[
https://issues.apache.org/jira/browse/NUTCH-2218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15152751#comment-15152751
]
Lewis John McGibbney commented on NUTCH-2218:
-
Nice Mike thanks
> Switch CrawlCompletion arg
[
https://issues.apache.org/jira/browse/NUTCH-2035?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Lewis John McGibbney reassigned NUTCH-2035:
---
Assignee: Lewis John McGibbney
> Regex filter using case sensitive rules.
>
[
https://issues.apache.org/jira/browse/NUTCH-2033?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Lewis John McGibbney updated NUTCH-2033:
Fix Version/s: 1.12
> parse-tika skips valid documents.
>
[
https://issues.apache.org/jira/browse/NUTCH-2033?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Lewis John McGibbney reassigned NUTCH-2033:
---
Assignee: Lewis John McGibbney
> parse-tika skips valid documents.
>
[
https://issues.apache.org/jira/browse/NUTCH-2035?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Lewis John McGibbney updated NUTCH-2035:
Fix Version/s: 1.12
> Regex filter using case sensitive rules.
>
[
https://issues.apache.org/jira/browse/NUTCH-2034?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Lewis John McGibbney reassigned NUTCH-2034:
---
Assignee: Lewis John McGibbney
> CrawlDB filtered documents counter.
>
[
https://issues.apache.org/jira/browse/NUTCH-2032?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Lewis John McGibbney updated NUTCH-2032:
Fix Version/s: 1.12
> Plugin to index the raw content of a readable document.
>
[
https://issues.apache.org/jira/browse/NUTCH-2032?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Lewis John McGibbney reassigned NUTCH-2032:
---
Assignee: Lewis John McGibbney
> Plugin to index the raw content of a
[
https://issues.apache.org/jira/browse/NUTCH-2034?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Lewis John McGibbney updated NUTCH-2034:
Fix Version/s: 1.12
> CrawlDB filtered documents counter.
>
[
https://issues.apache.org/jira/browse/NUTCH-2046?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Lewis John McGibbney updated NUTCH-2046:
Fix Version/s: 1.12
> The crawl script should be able to skip an initial injection.
[
https://issues.apache.org/jira/browse/NUTCH-2046?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Lewis John McGibbney reassigned NUTCH-2046:
---
Assignee: Lewis John McGibbney
> The crawl script should be able to skip an
[
https://issues.apache.org/jira/browse/NUTCH-2144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Lewis John McGibbney updated NUTCH-2144:
Fix Version/s: 1.12
> Plugin to override db.ignore.external to exempt interesting
[
https://issues.apache.org/jira/browse/NUTCH-2144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15141213#comment-15141213
]
Lewis John McGibbney commented on NUTCH-2144:
-
Hi [~thammegowda], limitations I see are as
[
https://issues.apache.org/jira/browse/NUTCH-2005?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Lewis John McGibbney updated NUTCH-2005:
Labels: gsoc2016 (was: )
> Implement HTrace'ing in Nutch
>
[
https://issues.apache.org/jira/browse/NUTCH-2144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15141296#comment-15141296
]
Lewis John McGibbney commented on NUTCH-2144:
-
bq. [~chrismattmann] I am not sure if Tika can
[
https://issues.apache.org/jira/browse/NUTCH-1314?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Lewis John McGibbney reassigned NUTCH-1314:
---
Assignee: Lewis John McGibbney
> Impose a limit on the length of outlink
[
https://issues.apache.org/jira/browse/NUTCH-1314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15137375#comment-15137375
]
Lewis John McGibbney commented on NUTCH-1314:
-
Committed @ revisions 1729218 and 1729219 in
[
https://issues.apache.org/jira/browse/NUTCH-1314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15129575#comment-15129575
]
Lewis John McGibbney commented on NUTCH-1314:
-
Yep, if someone can consolidate the patches
[
https://issues.apache.org/jira/browse/NUTCH-1741?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Lewis John McGibbney updated NUTCH-1741:
Attachment: NUTCH-1741v7.patch
Managed to update this at the weekend and forgot to