[
https://issues.apache.org/jira/browse/NUTCH-2467?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Markus Jelsma updated NUTCH-2467:
-
Attachment: NUTCH-2467.patch
Incredibly stupid patch, but I did it because the sitemap.type thing
[
https://issues.apache.org/jira/browse/NUTCH-2467?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Markus Jelsma updated NUTCH-2467:
-
Patch Info: Patch Available
> Sitemap type field can be n
Markus Jelsma created NUTCH-2467:
Summary: Sitemap type field can be null
Key: NUTCH-2467
URL: https://issues.apache.org/jira/browse/NUTCH-2467
Project: Nutch
Issue Type: Bug
Affects
[
https://issues.apache.org/jira/browse/NUTCH-2466?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Markus Jelsma updated NUTCH-2466:
-
Patch Info: Patch Available
> Sitemap processor to follow redire
[
https://issues.apache.org/jira/browse/NUTCH-2466?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Markus Jelsma updated NUTCH-2466:
-
Attachment: NUTCH-2466.patch
Patch for master!
> Sitemap processor to follow redire
Markus Jelsma created NUTCH-2466:
Summary: Sitemap processor to follow redirects
Key: NUTCH-2466
URL: https://issues.apache.org/jira/browse/NUTCH-2466
Project: Nutch
Issue Type: Bug
[
https://issues.apache.org/jira/browse/NUTCH-2439?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Markus Jelsma updated NUTCH-2439:
-
Summary: Upgrade to Apache Tika 1.17 (was: Upgrade to Apache Tika 1.16)
> Upgrade to Apache T
[
https://issues.apache.org/jira/browse/NUTCH-2368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16251361#comment-16251361
]
Markus Jelsma commented on NUTCH-2368:
--
Ah, can you open a new issue for this?
Thanks!
> Varia
[
https://issues.apache.org/jira/browse/NUTCH-2458?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Markus Jelsma resolved NUTCH-2458.
--
Resolution: Fixed
Committed to 9e4d9544..705686e7 master -> master
> TikaParser doesn'
[
https://issues.apache.org/jira/browse/NUTCH-2458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16246018#comment-16246018
]
Markus Jelsma commented on NUTCH-2458:
--
Will commit this one shortly..
> TikaParser doesn't w
[
https://issues.apache.org/jira/browse/NUTCH-2458?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Markus Jelsma updated NUTCH-2458:
-
Attachment: NUTCH-2458.patch
Patch for master!
> TikaParser doesn't work with tika-config.
Markus Jelsma created NUTCH-2458:
Summary: TikaParser doesn't work with tika-config.xml set
Key: NUTCH-2458
URL: https://issues.apache.org/jira/browse/NUTCH-2458
Project: Nutch
Issue Type
[
https://issues.apache.org/jira/browse/NUTCH-2456?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16244761#comment-16244761
]
Markus Jelsma commented on NUTCH-2456:
--
What will this patch achieve then? Just the case of ignoring
[
https://issues.apache.org/jira/browse/NUTCH-2456?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16244638#comment-16244638
]
Markus Jelsma commented on NUTCH-2456:
--
It will have side-effects, the same as you already mentioned
[
https://issues.apache.org/jira/browse/NUTCH-2368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16244230#comment-16244230
]
Markus Jelsma commented on NUTCH-2368:
--
Hello - see NUTCH-2420 .
> Variable generate.max.co
[
https://issues.apache.org/jira/browse/NUTCH-2420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16240491#comment-16240491
]
Markus Jelsma commented on NUTCH-2420:
--
Committed and created NUTCH-2455
Thanks!
> Bug in varia
[
https://issues.apache.org/jira/browse/NUTCH-2455?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Markus Jelsma updated NUTCH-2455:
-
Description:
Citing Sebastian at NUTCH-2420:
??The correct solution would be to use <host,sc
Markus Jelsma created NUTCH-2455:
Summary: Speed up the merging of HostDb entries for variable fetch
delay
Key: NUTCH-2455
URL: https://issues.apache.org/jira/browse/NUTCH-2455
Project: Nutch
[
https://issues.apache.org/jira/browse/NUTCH-2420?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Markus Jelsma resolved NUTCH-2420.
--
Resolution: Fixed
remote:517dbdf..6199492 6199492f5e1e8811022257c88dbf63f1e1c739d0
[
https://issues.apache.org/jira/browse/NUTCH-2368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16237641#comment-16237641
]
Markus Jelsma commented on NUTCH-2368:
--
Could you open a new issue, this one has already been
[
https://issues.apache.org/jira/browse/NUTCH-1932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Markus Jelsma updated NUTCH-1932:
-
Description:
Orphan scoring filter that determines whether a page has become orphaned, e.g
[
https://issues.apache.org/jira/browse/NUTCH-1932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16218704#comment-16218704
]
Markus Jelsma commented on NUTCH-1932:
--
Yes! Indeed! Go ahead and thanks for taking this one over
[
https://issues.apache.org/jira/browse/NUTCH-2394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16218583#comment-16218583
]
Markus Jelsma commented on NUTCH-2394:
--
Ah crap, the trim() habit, a time when I was very
[
https://issues.apache.org/jira/browse/NUTCH-2386?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Markus Jelsma closed NUTCH-2386.
Thanks Sebastian!
> BasicURLNormalizer does not encode curly bra
[
https://issues.apache.org/jira/browse/NUTCH-2386?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Markus Jelsma resolved NUTCH-2386.
--
Resolution: Fixed
4cfec6e3..bd8c8476 master -> master
> BasicURLNormalize
[
https://issues.apache.org/jira/browse/NUTCH-1932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16218558#comment-16218558
]
Markus Jelsma commented on NUTCH-1932:
--
This looks fine to me! I am happy to see that you removed my
[
https://issues.apache.org/jira/browse/NUTCH-1932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Markus Jelsma updated NUTCH-1932:
-
Fix Version/s: 1.14
> Automatically remove orphaned pa
[
https://issues.apache.org/jira/browse/NUTCH-1932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Markus Jelsma updated NUTCH-1932:
-
Affects Version/s: 1.13
> Automatically remove orphaned pa
[
https://issues.apache.org/jira/browse/NUTCH-2448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16215804#comment-16215804
]
Markus Jelsma commented on NUTCH-2448:
--
Fine, go ahead!
> Allow Sending an empty http.agent.vers
[
https://issues.apache.org/jira/browse/NUTCH-1932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16215190#comment-16215190
]
Markus Jelsma commented on NUTCH-1932:
--
I agree! I will try to make some time for it tomorrow
[
https://issues.apache.org/jira/browse/NUTCH-2445?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Markus Jelsma closed NUTCH-2445.
Thanks Sebastian!
> Fetcher following outlinks to keep track of already fetched it
[
https://issues.apache.org/jira/browse/NUTCH-2445?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Markus Jelsma resolved NUTCH-2445.
--
Resolution: Fixed
remote:3c21a6b..0cdd095 0cdd095c881eed52dc461e559ce6ae278e99157f
[
https://issues.apache.org/jira/browse/NUTCH-2444?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Markus Jelsma resolved NUTCH-2444.
--
Resolution: Fixed
remote:602c663..3c21a6b 3c21a6b2abaa17ecc66a1c76d1239c213c56ba4e
[
https://issues.apache.org/jira/browse/NUTCH-2444?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Markus Jelsma closed NUTCH-2444.
Thanks!
> HostDB CSV dumper to emit field header by defa
[
https://issues.apache.org/jira/browse/NUTCH-2445?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Markus Jelsma updated NUTCH-2445:
-
Attachment: NUTCH-2445.patch
Updated patch!
> Fetcher following outlinks to keep tr
[
https://issues.apache.org/jira/browse/NUTCH-2447?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Markus Jelsma updated NUTCH-2447:
-
Attachment: NUTCH-2447.patch
Added comment indicating it's filthy code with reference to here
[
https://issues.apache.org/jira/browse/NUTCH-2447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16215057#comment-16215057
]
Markus Jelsma commented on NUTCH-2447:
--
As a side note, also pay attention to this incredibly ugly
[
https://issues.apache.org/jira/browse/NUTCH-2447?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Markus Jelsma updated NUTCH-2447:
-
Attachment: NUTCH-2447.patch
Patch for master! Keep in mind, this only works for protocol-http
[
https://issues.apache.org/jira/browse/NUTCH-2447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16215055#comment-16215055
]
Markus Jelsma edited comment on NUTCH-2447 at 10/23/17 12:36 PM:
-
Patch
[
https://issues.apache.org/jira/browse/NUTCH-2447?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Markus Jelsma updated NUTCH-2447:
-
Description:
Nutch is unable to crawl some websites, regardless of protocol plugin you are
using
[
https://issues.apache.org/jira/browse/NUTCH-2447?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Markus Jelsma updated NUTCH-2447:
-
Description:
{code}
2017-10-23 12:43:52,911 INFO api.HttpRobotRulesParser - Couldn't get
Markus Jelsma created NUTCH-2447:
Summary: Work-around SSLProtocolException: handshake alert:
unrecognized_name
Key: NUTCH-2447
URL: https://issues.apache.org/jira/browse/NUTCH-2447
Project: Nutch
[
https://issues.apache.org/jira/browse/NUTCH-2445?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Markus Jelsma updated NUTCH-2445:
-
Attachment: NUTCH-2445.patch
Correct patch!
> Fetcher following outlinks to keep tr
[
https://issues.apache.org/jira/browse/NUTCH-2445?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Markus Jelsma updated NUTCH-2445:
-
Attachment: (was: NUTCH-2445.patch)
> Fetcher following outlinks to keep track of alre
[
https://issues.apache.org/jira/browse/NUTCH-2445?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Markus Jelsma updated NUTCH-2445:
-
Attachment: NUTCH-2445.patch
Patch!
> Fetcher following outlinks to keep track of alre
Markus Jelsma created NUTCH-2445:
Summary: Fetcher following outlinks to keep track of already
fetched items
Key: NUTCH-2445
URL: https://issues.apache.org/jira/browse/NUTCH-2445
Project: Nutch
[
https://issues.apache.org/jira/browse/NUTCH-2444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16212397#comment-16212397
]
Markus Jelsma commented on NUTCH-2444:
--
Will commit this one shortly unless objections
> HostDB
[
https://issues.apache.org/jira/browse/NUTCH-2444?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Markus Jelsma updated NUTCH-2444:
-
Attachment: NUTCH-2444.patch
Patch!
> HostDB CSV dumper to emit field header by defa
[
https://issues.apache.org/jira/browse/NUTCH-1932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16212381#comment-16212381
]
Markus Jelsma commented on NUTCH-1932:
--
Hello - I don't disagree, I merely updated to master
[
https://issues.apache.org/jira/browse/NUTCH-1932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Markus Jelsma updated NUTCH-1932:
-
Attachment: NUTCH-1932.patch
Updated patch for master.
> Automatically remove orphaned pa
[
https://issues.apache.org/jira/browse/NUTCH-2411?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Markus Jelsma updated NUTCH-2411:
-
Attachment: NUTCH-2411.patch
Crap, previous version was wrong too!
> Index-metadata to supp
[
https://issues.apache.org/jira/browse/NUTCH-2411?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Markus Jelsma updated NUTCH-2411:
-
Attachment: NUTCH-2411.patch
The patch was missing support
[
https://issues.apache.org/jira/browse/NUTCH-2411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16209124#comment-16209124
]
Markus Jelsma commented on NUTCH-2411:
--
Will commit shortly unless objections
> Index-metad
[
https://issues.apache.org/jira/browse/NUTCH-2439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16208332#comment-16208332
]
Markus Jelsma commented on NUTCH-2439:
--
No idea, but probably someone on Tika's user list will so i
[
https://issues.apache.org/jira/browse/NUTCH-2411?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Markus Jelsma updated NUTCH-2411:
-
Attachment: NUTCH-2411.patch
Don't add empty fields.
> Index-metadata to support index
[
https://issues.apache.org/jira/browse/NUTCH-2439?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Markus Jelsma updated NUTCH-2439:
-
Attachment: (was: NUTCH-2439.patch)
> Upgrade to Apache Tika 1
[
https://issues.apache.org/jira/browse/NUTCH-2439?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Markus Jelsma updated NUTCH-2439:
-
Attachment: NUTCH-2439.patch
updated patch
> Upgrade to Apache Tika 1
[
https://issues.apache.org/jira/browse/NUTCH-2439?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Markus Jelsma updated NUTCH-2439:
-
Attachment: NUTCH-2439.patch
> Upgrade to Apache Tika 1
[
https://issues.apache.org/jira/browse/NUTCH-2439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16200495#comment-16200495
]
Markus Jelsma commented on NUTCH-2439:
--
Ah, I removed slf4j-api from plugin.xml and it works
[
https://issues.apache.org/jira/browse/NUTCH-2439?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Markus Jelsma updated NUTCH-2439:
-
Attachment: NUTCH-2439.patch
Patch for master! Upgrade went fine, getting the libraries was also
Markus Jelsma created NUTCH-2439:
Summary: Upgrade to Apache Tika 1.16
Key: NUTCH-2439
URL: https://issues.apache.org/jira/browse/NUTCH-2439
Project: Nutch
Issue Type: Improvement
[
https://issues.apache.org/jira/browse/NUTCH-2418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16184069#comment-16184069
]
Markus Jelsma commented on NUTCH-2418:
--
This was due to a custom parser emitting null for the title
[
https://issues.apache.org/jira/browse/NUTCH-2418?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Markus Jelsma closed NUTCH-2418.
> NPE in org.apache.hadoop.io.Text from FetcherThr
[
https://issues.apache.org/jira/browse/NUTCH-2418?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Markus Jelsma resolved NUTCH-2418.
--
Resolution: Not A Problem
> NPE in org.apache.hadoop.io.Text from FetcherThr
[
https://issues.apache.org/jira/browse/NUTCH-2434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Markus Jelsma updated NUTCH-2434:
-
Attachment: NUTCH-2434.patch
Patch for master
> Option to reset parameters HTMLMetaT
Markus Jelsma created NUTCH-2434:
Summary: Option to reset parameters HTMLMetaTags
Key: NUTCH-2434
URL: https://issues.apache.org/jira/browse/NUTCH-2434
Project: Nutch
Issue Type: Task
[
https://issues.apache.org/jira/browse/NUTCH-2418?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Markus Jelsma updated NUTCH-2418:
-
Description:
{code}
2017-09-05 15:28:54,539 INFO [FetcherThread
[
https://issues.apache.org/jira/browse/NUTCH-2432?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Markus Jelsma updated NUTCH-2432:
-
Attachment: NUTCH-2432.patch
Patch for master!
> Protocol httpclient to disable cook
Markus Jelsma created NUTCH-2432:
Summary: Protocol httpclient to disable cookies if
http.enable.cookie.header is false
Key: NUTCH-2432
URL: https://issues.apache.org/jira/browse/NUTCH-2432
Project
[
https://issues.apache.org/jira/browse/NUTCH-2420?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Markus Jelsma updated NUTCH-2420:
-
Attachment: NUTCH-2420.patch
Patch for master! This calls open and reset each time a HostDatum
Markus Jelsma created NUTCH-2420:
Summary: Bug in variable generate.max.count and
fetcher.server.delay
Key: NUTCH-2420
URL: https://issues.apache.org/jira/browse/NUTCH-2420
Project: Nutch
[
https://issues.apache.org/jira/browse/NUTCH-2419?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Markus Jelsma updated NUTCH-2419:
-
Attachment: NUTCH-2419.patch
Patch for trunk!
> Domain blacklist URL filter does not resp
[
https://issues.apache.org/jira/browse/NUTCH-2417?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Markus Jelsma updated NUTCH-2417:
-
Attachment: (was: NUTCH-2417.patch)
> Support for variable fetch delay via FreeGenera
[
https://issues.apache.org/jira/browse/NUTCH-2417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16155138#comment-16155138
]
Markus Jelsma commented on NUTCH-2417:
--
No patch, wrong ticket!
> Support for variable fetch de
[
https://issues.apache.org/jira/browse/NUTCH-2417?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Markus Jelsma updated NUTCH-2417:
-
Attachment: NUTCH-2417.patch
Patch for trunk!
> Support for variable fetch delay
Markus Jelsma created NUTCH-2419:
Summary: Domain blacklist URL filter does not respect command-line
override for file
Key: NUTCH-2419
URL: https://issues.apache.org/jira/browse/NUTCH-2419
Project
Markus Jelsma created NUTCH-2418:
Summary: NPE in org.apache.hadoop.io.Text from FetcherThread
Key: NUTCH-2418
URL: https://issues.apache.org/jira/browse/NUTCH-2418
Project: Nutch
Issue Type
[
https://issues.apache.org/jira/browse/NUTCH-2411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16148906#comment-16148906
]
Markus Jelsma commented on NUTCH-2411:
--
Added param to explicitly list the fields that are supposed
[
https://issues.apache.org/jira/browse/NUTCH-2411?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Markus Jelsma updated NUTCH-2411:
-
Description:
{code}
index.metadata.separator
Separator to use if you want to index
[
https://issues.apache.org/jira/browse/NUTCH-2411?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Markus Jelsma updated NUTCH-2411:
-
Attachment: NUTCH-2411-1.13.patch
> Index-metadata to support indexing multiple val
[
https://issues.apache.org/jira/browse/NUTCH-2411?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Markus Jelsma updated NUTCH-2411:
-
Attachment: NUTCH-2411-1.13.patch
Patch for 1.13
> Index-metadata to support indexing multi
[
https://issues.apache.org/jira/browse/NUTCH-2415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16147994#comment-16147994
]
Markus Jelsma commented on NUTCH-2415:
--
[~jorgelbg] good points, please feel free to assign this one
Markus Jelsma created NUTCH-2417:
Summary: Support for variable fetch delay via FreeGenerator
Key: NUTCH-2417
URL: https://issues.apache.org/jira/browse/NUTCH-2417
Project: Nutch
Issue Type
[
https://issues.apache.org/jira/browse/NUTCH-2416?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Markus Jelsma updated NUTCH-2416:
-
Attachment: NUTCH-2416.patch
Didn't save changes at previous patch!
> Fetcher to log thread
[
https://issues.apache.org/jira/browse/NUTCH-2416?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Markus Jelsma updated NUTCH-2416:
-
Attachment: NUTCH-2416.patch
> Fetcher to log thread
[
https://issues.apache.org/jira/browse/NUTCH-2416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16146915#comment-16146915
]
Markus Jelsma commented on NUTCH-2416:
--
It seems Thread.currentThread().toString() prints the same
Markus Jelsma created NUTCH-2416:
Summary: Fetcher to log thread ID
Key: NUTCH-2416
URL: https://issues.apache.org/jira/browse/NUTCH-2416
Project: Nutch
Issue Type: Improvement
[
https://issues.apache.org/jira/browse/NUTCH-2416?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Markus Jelsma updated NUTCH-2416:
-
Attachment: NUTCH-2416.patch
> Fetcher to log thread
[
https://issues.apache.org/jira/browse/NUTCH-2414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16144237#comment-16144237
]
Markus Jelsma commented on NUTCH-2414:
--
Although filtering on lang field is a good idea, i think we
[
https://issues.apache.org/jira/browse/NUTCH-2411?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Markus Jelsma updated NUTCH-2411:
-
Affects Version/s: 1.13
> Index-metadata to support indexing multiple values for a fi
[
https://issues.apache.org/jira/browse/NUTCH-2411?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Markus Jelsma updated NUTCH-2411:
-
Fix Version/s: 1.14
> Index-metadata to support indexing multiple values for a fi
[
https://issues.apache.org/jira/browse/NUTCH-2411?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Markus Jelsma updated NUTCH-2411:
-
Description:
{code}
index.metadata.separator
Separator to use if you want to index
[
https://issues.apache.org/jira/browse/NUTCH-2411?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Markus Jelsma updated NUTCH-2411:
-
Attachment: NUTCH-2411.patch
Patch!
> Index-metadata to support indexing multiple val
Markus Jelsma created NUTCH-2411:
Summary: Index-metadata to support indexing multiple values for a
field
Key: NUTCH-2411
URL: https://issues.apache.org/jira/browse/NUTCH-2411
Project: Nutch
Although it has not troubled me so far, if you think something can be
improved I would most certainly welcome it! Feel free to open an issue in
our Jira, or provide a patch or pull request so it can be dealt with.
Regards,
Markus
-Original message-
> From:kenneth mcfarland
[
https://issues.apache.org/jira/browse/NUTCH-2335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16130206#comment-16130206
]
Markus Jelsma commented on NUTCH-2335:
--
Ok, with this modification, it doesn't print with -noFilter
[
https://issues.apache.org/jira/browse/NUTCH-2335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16130204#comment-16130204
]
Markus Jelsma commented on NUTCH-2335:
--
I moved the sys.out to:
{code}
if (filters != null
[
https://issues.apache.org/jira/browse/NUTCH-2335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16130191#comment-16130191
]
Markus Jelsma commented on NUTCH-2335:
--
I see, will modify the println to the correct branch
[
https://issues.apache.org/jira/browse/NUTCH-2335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16130110#comment-16130110
]
Markus Jelsma commented on NUTCH-2335:
--
Passing -noFilter does not change anything. I am staring
[
https://issues.apache.org/jira/browse/NUTCH-2335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16130105#comment-16130105
]
Markus Jelsma commented on NUTCH-2335:
--
Sebastian, there is a problem with either this patch