[
https://issues.apache.org/jira/browse/NUTCH-2715?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16833858#comment-16833858
]
Sebastian Nagel commented on NUTCH-2715:
Hi [~yossi], thanks! Unfortunately both WARC writer
[
https://issues.apache.org/jira/browse/NUTCH-2650?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sebastian Nagel reassigned NUTCH-2650:
--
Assignee: Sebastian Nagel
> -addBinaryContent -base64 flags are causing "String
[
https://issues.apache.org/jira/browse/NUTCH-2706?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sebastian Nagel updated NUTCH-2706:
---
Fix Version/s: 1.16
> -addBinaryContent flag can cause "String length must be a multiple of
[
https://issues.apache.org/jira/browse/NUTCH-2706?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sebastian Nagel reassigned NUTCH-2706:
--
Assignee: Sebastian Nagel
> -addBinaryContent flag can cause "String length must be a
[
https://issues.apache.org/jira/browse/NUTCH-2650?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sebastian Nagel updated NUTCH-2650:
---
Fix Version/s: 1.16
> -addBinaryContent -base64 flags are causing "String length must be a
[
https://issues.apache.org/jira/browse/NUTCH-2706?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16832841#comment-16832841
]
Sebastian Nagel commented on NUTCH-2706:
Hi [~pemanuel], I was able to reproduce the issue and
[
https://issues.apache.org/jira/browse/NUTCH-2585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16832568#comment-16832568
]
Sebastian Nagel commented on NUTCH-2585:
PR including fix is open:
[
https://issues.apache.org/jira/browse/NUTCH-2585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16832542#comment-16832542
]
Sebastian Nagel commented on NUTCH-2585:
Ok, this is reproduced using parallel streams (see
[
https://issues.apache.org/jira/browse/NUTCH-2706?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16819060#comment-16819060
]
Sebastian Nagel commented on NUTCH-2706:
Thanks for the hint, [~pemanuel]! I'll have a look.
>
[
https://issues.apache.org/jira/browse/NUTCH-2709?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sebastian Nagel updated NUTCH-2709:
---
Component/s: protocol
> Remove unused properties and code related to HTTP protocol
>
Sebastian Nagel created NUTCH-2709:
--
Summary: Remove unused properties and code related to HTTP protocol
Key: NUTCH-2709
URL: https://issues.apache.org/jira/browse/NUTCH-2709
Project: Nutch
[
https://issues.apache.org/jira/browse/NUTCH-2702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sebastian Nagel reassigned NUTCH-2702:
--
Assignee: Sebastian Nagel
> Fetcher: suppress stack for frequent exceptions
>
[
https://issues.apache.org/jira/browse/NUTCH-2704?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sebastian Nagel resolved NUTCH-2704.
Resolution: Implemented
> Upgrade crawler-commons dependency to 1.0
>
[
https://issues.apache.org/jira/browse/NUTCH-2704?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sebastian Nagel reassigned NUTCH-2704:
--
Assignee: Sebastian Nagel
> Upgrade crawler-commons dependency to 1.0
>
[
https://issues.apache.org/jira/browse/NUTCH-2699?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sebastian Nagel reassigned NUTCH-2699:
--
Assignee: Sebastian Nagel
> Protocol-okhttp: needless loops to increment requested
[
https://issues.apache.org/jira/browse/NUTCH-2699?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sebastian Nagel resolved NUTCH-2699.
Resolution: Fixed
> Protocol-okhttp: needless loops to increment requested bytes counter
[
https://issues.apache.org/jira/browse/NUTCH-2699?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Work on NUTCH-2699 started by Sebastian Nagel.
--
> Protocol-okhttp: needless loops to increment requested bytes counter when
Sebastian Nagel created NUTCH-2708:
--
Summary: urlfilter-automaton: update library dependency
(dk.brics.automaton)
Key: NUTCH-2708
URL: https://issues.apache.org/jira/browse/NUTCH-2708
Project: Nutch
[
https://issues.apache.org/jira/browse/NUTCH-2690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16815355#comment-16815355
]
Sebastian Nagel edited comment on NUTCH-2690 at 4/11/19 11:53 AM:
--
PR
[
https://issues.apache.org/jira/browse/NUTCH-2690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16815355#comment-16815355
]
Sebastian Nagel commented on NUTCH-2690:
PR updated, squashed and rebased to current master.
I'll
[
https://issues.apache.org/jira/browse/NUTCH-2279?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sebastian Nagel reassigned NUTCH-2279:
--
Assignee: Sebastian Nagel
> LinkRank fails when using Hadoop MR output compression
>
[
https://issues.apache.org/jira/browse/NUTCH-2700?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sebastian Nagel reassigned NUTCH-2700:
--
Assignee: Sebastian Nagel
> Indexchecker: improve command-line help
>
[
https://issues.apache.org/jira/browse/NUTCH-2700?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sebastian Nagel resolved NUTCH-2700.
Resolution: Implemented
> Indexchecker: improve command-line help
>
[
https://issues.apache.org/jira/browse/NUTCH-2700?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Work on NUTCH-2700 started by Sebastian Nagel.
--
> Indexchecker: improve command-line help
>
[
https://issues.apache.org/jira/browse/NUTCH-2703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16815282#comment-16815282
]
Sebastian Nagel commented on NUTCH-2703:
+1
But I would opt to make it configurable. I'll open a
[
https://issues.apache.org/jira/browse/NUTCH-2701?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sebastian Nagel resolved NUTCH-2701.
Resolution: Implemented
Merged/committed. Thanks, [~markus17]!
> Fetcher: log dates and
[
https://issues.apache.org/jira/browse/NUTCH-2666?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sebastian Nagel resolved NUTCH-2666.
Resolution: Implemented
Merged into master, will be available in 1.16. Thanks,
[
https://issues.apache.org/jira/browse/NUTCH-2666?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sebastian Nagel reassigned NUTCH-2666:
--
Assignee: Sebastian Nagel
> Increase default value for http.content.limit /
[
https://issues.apache.org/jira/browse/NUTCH-2683?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sebastian Nagel resolved NUTCH-2683.
Resolution: Implemented
> DeduplicationJob: add option to prefer https:// over http://
>
[
https://issues.apache.org/jira/browse/NUTCH-2683?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sebastian Nagel reassigned NUTCH-2683:
--
Assignee: Sebastian Nagel
> DeduplicationJob: add option to prefer https:// over
[
https://issues.apache.org/jira/browse/NUTCH-2688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16813732#comment-16813732
]
Sebastian Nagel commented on NUTCH-2688:
Thanks, [~roannel]! In general, yes it would be
[
https://issues.apache.org/jira/browse/NUTCH-2706?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16813724#comment-16813724
]
Sebastian Nagel commented on NUTCH-2706:
Hi [~pemanuel], can you share the document which causes
[
https://issues.apache.org/jira/browse/NUTCH-2334?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16813569#comment-16813569
]
Sebastian Nagel commented on NUTCH-2334:
Hi [~roannel], I think now with the AND and OR voting
[
https://issues.apache.org/jira/browse/NUTCH-2669?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16812257#comment-16812257
]
Sebastian Nagel commented on NUTCH-2669:
At least, IVY-1586 has been confirmed and assigned. Need
[
https://issues.apache.org/jira/browse/NUTCH-2707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16811949#comment-16811949
]
Sebastian Nagel commented on NUTCH-2707:
Turns out that there are a few more servers which do not
[
https://issues.apache.org/jira/browse/NUTCH-2707?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sebastian Nagel updated NUTCH-2707:
---
Summary: protocol-okhttp fails to decompress content if Content-Encoding
header is wrong
Sebastian Nagel created NUTCH-2707:
--
Summary: protocol-okhttp fails to decompress gzip-encoded content
Key: NUTCH-2707
URL: https://issues.apache.org/jira/browse/NUTCH-2707
Project: Nutch
Sebastian Nagel created NUTCH-2705:
--
Summary: urlfilter-validator rejects IPv6 URLs
Key: NUTCH-2705
URL: https://issues.apache.org/jira/browse/NUTCH-2705
Project: Nutch
Issue Type: Bug
Sebastian Nagel created NUTCH-2704:
--
Summary: Upgrade crawler-commons dependency to 1.0
Key: NUTCH-2704
URL: https://issues.apache.org/jira/browse/NUTCH-2704
Project: Nutch
Issue Type:
[
https://issues.apache.org/jira/browse/NUTCH-2701?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sebastian Nagel reassigned NUTCH-2701:
--
Assignee: Sebastian Nagel
> Fetcher: log dates and times also in human-readable form
[
https://issues.apache.org/jira/browse/NUTCH-2701?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16799010#comment-16799010
]
Sebastian Nagel commented on NUTCH-2701:
PR which fixes the logging:
{noformat}
19/03/22 11:50:48
[
https://issues.apache.org/jira/browse/NUTCH-2703?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sebastian Nagel updated NUTCH-2703:
---
Summary: parse-tika: Boilerpipe should not run for non-(X)HTML pages (was:
Boilerpipe
[
https://issues.apache.org/jira/browse/NUTCH-2703?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sebastian Nagel updated NUTCH-2703:
---
Component/s: plugin
> parse-tika: Boilerpipe should not run for non-(X)HTML pages
>
Sebastian Nagel created NUTCH-2702:
--
Summary: Fetcher: suppress stack for frequent exceptions
Key: NUTCH-2702
URL: https://issues.apache.org/jira/browse/NUTCH-2702
Project: Nutch
Issue
Sebastian Nagel created NUTCH-2701:
--
Summary: Fetcher: log dates and times also in human-readable form
Key: NUTCH-2701
URL: https://issues.apache.org/jira/browse/NUTCH-2701
Project: Nutch
[
https://issues.apache.org/jira/browse/NUTCH-2669?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16792447#comment-16792447
]
Sebastian Nagel commented on NUTCH-2669:
Hi [~lewismc],
one point is that the work-around to
Sebastian Nagel created NUTCH-2700:
--
Summary: Indexchecker: improve command-line help
Key: NUTCH-2700
URL: https://issues.apache.org/jira/browse/NUTCH-2700
Project: Nutch
Issue Type:
Sebastian Nagel created NUTCH-2699:
--
Summary: Protocol-okhttp: needless loops to increment requested
bytes counter when more content is already buffered
Key: NUTCH-2699
URL:
[
https://issues.apache.org/jira/browse/NUTCH-2696?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sebastian Nagel reassigned NUTCH-2696:
--
Assignee: Sebastian Nagel
> Nutch SegmentReader does not dump non-ASCII characters
[
https://issues.apache.org/jira/browse/NUTCH-2683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16785836#comment-16785836
]
Sebastian Nagel commented on NUTCH-2683:
Any comments or objections? Thanks! Otherwise I'll
[
https://issues.apache.org/jira/browse/NUTCH-2666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16785819#comment-16785819
]
Sebastian Nagel commented on NUTCH-2666:
Any objections? It's a huge jump but it may be
[
https://issues.apache.org/jira/browse/NUTCH-2292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16784851#comment-16784851
]
Sebastian Nagel commented on NUTCH-2292:
Yes, we can try to get a GSoC project. At a first
[
https://issues.apache.org/jira/browse/NUTCH-2292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16782415#comment-16782415
]
Sebastian Nagel commented on NUTCH-2292:
Hi [~lewismc], hi [~thammegowda],
I've tried to rebase
[
https://issues.apache.org/jira/browse/NUTCH-2697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16778154#comment-16778154
]
Sebastian Nagel commented on NUTCH-2697:
Thanks! I also had no success in getting this sorted
[
https://issues.apache.org/jira/browse/NUTCH-2697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16776858#comment-16776858
]
Sebastian Nagel commented on NUTCH-2697:
Hi [~chrisgavin], thanks for the patch/PR. Can you
[
https://issues.apache.org/jira/browse/NUTCH-2697?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sebastian Nagel updated NUTCH-2697:
---
Affects Version/s: 1.16
> Upgrade Ivy to fix the issue of an unset packaging.type property.
[
https://issues.apache.org/jira/browse/NUTCH-2697?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sebastian Nagel updated NUTCH-2697:
---
Fix Version/s: 1.16
> Upgrade Ivy to fix the issue of an unset packaging.type property.
>
[
https://issues.apache.org/jira/browse/NUTCH-2695?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16776634#comment-16776634
]
Sebastian Nagel commented on NUTCH-2695:
Hi [~malcolmt], the build should succeed with the second
[
https://issues.apache.org/jira/browse/NUTCH-2460?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sebastian Nagel resolved NUTCH-2460.
Resolution: Implemented
Implemented as part of NUTCH-2676.
> use the headless option of
[
https://issues.apache.org/jira/browse/NUTCH-2676?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sebastian Nagel resolved NUTCH-2676.
Resolution: Fixed
Tested again successfully with chrome and gecko drivers. Merged PR #430.
[
https://issues.apache.org/jira/browse/NUTCH-2696?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sebastian Nagel updated NUTCH-2696:
---
Fix Version/s: 1.16
> Nutch SegmentReader does not dump non-ASCII characters with Hadoop 3.x
[
https://issues.apache.org/jira/browse/NUTCH-2696?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sebastian Nagel updated NUTCH-2696:
---
Affects Version/s: 1.15
> Nutch SegmentReader does not dump non-ASCII characters with Hadoop
[
https://issues.apache.org/jira/browse/NUTCH-2696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16775641#comment-16775641
]
Sebastian Nagel commented on NUTCH-2696:
Hi [~lhervaud], thanks for the bug report! This issue is
[
https://issues.apache.org/jira/browse/NUTCH-2692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16775308#comment-16775308
]
Sebastian Nagel commented on NUTCH-2692:
Hi [~markus17], the commit added
[
https://issues.apache.org/jira/browse/NUTCH-2684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sebastian Nagel resolved NUTCH-2684.
Resolution: Fixed
Thanks, [~roannel]!
> Add README.md file to all indexer writers plugins
[
https://issues.apache.org/jira/browse/NUTCH-2692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16775264#comment-16775264
]
Sebastian Nagel commented on NUTCH-2692:
+1 Ideally, the new property should also be described
[
https://issues.apache.org/jira/browse/NUTCH-2693?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sebastian Nagel resolved NUTCH-2693.
Resolution: Fixed
Committed/merged.
> Misspelled configuration property names in
[
https://issues.apache.org/jira/browse/NUTCH-2693?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sebastian Nagel reassigned NUTCH-2693:
--
Assignee: Sebastian Nagel
> Misspelled configuration property names in documentation
[
https://issues.apache.org/jira/browse/NUTCH-2627?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sebastian Nagel resolved NUTCH-2627.
Resolution: Implemented
Assignee: Sebastian Nagel
Committed/merged.
> Fetcher to
[
https://issues.apache.org/jira/browse/NUTCH-2695?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sebastian Nagel resolved NUTCH-2695.
Resolution: Fixed
Fix Version/s: 1.16
Fixed/merged. Thanks, [~malcolmt]!
> Fix
[
https://issues.apache.org/jira/browse/NUTCH-2695?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sebastian Nagel reassigned NUTCH-2695:
--
Assignee: Sebastian Nagel
> Fix some alerts raised by LGTM
>
[
https://issues.apache.org/jira/browse/NUTCH-2695?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16775152#comment-16775152
]
Sebastian Nagel commented on NUTCH-2695:
Hi [~malcolmt], thanks! I'll merge the pull request and
[
https://issues.apache.org/jira/browse/NUTCH-2694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16775104#comment-16775104
]
Sebastian Nagel commented on NUTCH-2694:
+1 but also requires a few changes in ResolverThread, see
[
https://issues.apache.org/jira/browse/NUTCH-2694?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sebastian Nagel updated NUTCH-2694:
---
Attachment: NUTCH-2694-2.patch
> HostDB to aggregate by long instead of integer
>
[
https://issues.apache.org/jira/browse/NUTCH-2694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16774512#comment-16774512
]
Sebastian Nagel commented on NUTCH-2694:
+1
A HostDatum has no version byte in its serialization
Sebastian Nagel created NUTCH-2693:
--
Summary: Misspelled configuration property names in documentation
Key: NUTCH-2693
URL: https://issues.apache.org/jira/browse/NUTCH-2693
Project: Nutch
[
https://issues.apache.org/jira/browse/NUTCH-2689?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sebastian Nagel resolved NUTCH-2689.
Resolution: Implemented
Thanks, [~markus17]! Merged.
> Speed up urlfilter-regex and
[
https://issues.apache.org/jira/browse/NUTCH-2691?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sebastian Nagel resolved NUTCH-2691.
Resolution: Implemented
Merged. Thanks, [~yossi]!
> Improve logging from scoring-depth
[
https://issues.apache.org/jira/browse/NUTCH-2685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sebastian Nagel resolved NUTCH-2685.
Resolution: Fixed
Merged. Thanks, [~roannel]!
> Add README.md file to all exchange
[
https://issues.apache.org/jira/browse/NUTCH-2691?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16748878#comment-16748878
]
Sebastian Nagel commented on NUTCH-2691:
+1
> Improve logging from scoring-depth plugin
>
Sebastian Nagel created NUTCH-2690:
--
Summary: Configurable and fast URL filter
Key: NUTCH-2690
URL: https://issues.apache.org/jira/browse/NUTCH-2690
Project: Nutch
Issue Type: Improvement
[
https://issues.apache.org/jira/browse/NUTCH-2686?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sebastian Nagel resolved NUTCH-2686.
Resolution: Fixed
Thanks, [~roannel]! Resolving because PR is merged. The failure of unit
[
https://issues.apache.org/jira/browse/NUTCH-2689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16748765#comment-16748765
]
Sebastian Nagel commented on NUTCH-2689:
The benchmark times before ...
{noformat}
% grep 'bench
[
https://issues.apache.org/jira/browse/NUTCH-2689?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sebastian Nagel reassigned NUTCH-2689:
--
Assignee: Sebastian Nagel
> Speed up urlfilter-regex and urlfilter-automaton
>
Sebastian Nagel created NUTCH-2689:
--
Summary: Speed up urlfilter-regex and urlfilter-automaton
Key: NUTCH-2689
URL: https://issues.apache.org/jira/browse/NUTCH-2689
Project: Nutch
Issue
[
https://issues.apache.org/jira/browse/NUTCH-2682?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sebastian Nagel resolved NUTCH-2682.
Resolution: Fixed
Assignee: Sebastian Nagel
> Upgrade to Tika 1.20
>
[
https://issues.apache.org/jira/browse/NUTCH-2629?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sebastian Nagel resolved NUTCH-2629.
Resolution: Fixed
Thanks, [~roannel]! Looks good. I've also added a short note that the
[
https://issues.apache.org/jira/browse/NUTCH-2680?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sebastian Nagel reassigned NUTCH-2680:
--
Assignee: Sebastian Nagel
> Documentation: https supported by multiple protocol
[
https://issues.apache.org/jira/browse/NUTCH-2680?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sebastian Nagel resolved NUTCH-2680.
Resolution: Fixed
> Documentation: https supported by multiple protocol plugins not only
[
https://issues.apache.org/jira/browse/NUTCH-2663?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sebastian Nagel resolved NUTCH-2663.
Resolution: Fixed
Merged PR. Thanks, [~jorgelbg]!
> Improve index-jexl-filter syntax for
[
https://issues.apache.org/jira/browse/NUTCH-2653?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sebastian Nagel resolved NUTCH-2653.
Resolution: Fixed
Fix contained in NUTCH-2678.
> ProtocolFactory.getProtocol(url)
[
https://issues.apache.org/jira/browse/NUTCH-2678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16746230#comment-16746230
]
Sebastian Nagel commented on NUTCH-2678:
Yes. It includes the fix for the unit test.
> Allow for
[
https://issues.apache.org/jira/browse/NUTCH-2678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16745926#comment-16745926
]
Sebastian Nagel commented on NUTCH-2678:
Hi [~markus17], great! I've tested everything again:
[
https://issues.apache.org/jira/browse/NUTCH-2688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16746056#comment-16746056
]
Sebastian Nagel commented on NUTCH-2688:
Using block comments sounds reasonable. Also the rules
[
https://issues.apache.org/jira/browse/NUTCH-2687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16746050#comment-16746050
]
Sebastian Nagel commented on NUTCH-2687:
+1
Just for completion - the HTTP header for the given
[
https://issues.apache.org/jira/browse/NUTCH-2676?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16745191#comment-16745191
]
Sebastian Nagel commented on NUTCH-2676:
Thanks, [~virt], the PR looks promising! If done,
[
https://issues.apache.org/jira/browse/NUTCH-2676?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16735780#comment-16735780
]
Sebastian Nagel commented on NUTCH-2676:
Great! Thanks!
> Update to the latest selenium and add
[
https://issues.apache.org/jira/browse/NUTCH-2666?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sebastian Nagel updated NUTCH-2666:
---
Fix Version/s: 1.16
> Increase default value for http.content.limit / ftp.content.limit /
>
[
https://issues.apache.org/jira/browse/NUTCH-2666?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sebastian Nagel updated NUTCH-2666:
---
Summary: Increase default value for http.content.limit / ftp.content.limit
/