[jira] [Commented] (NUTCH-2557) protocol-http fails to follow redirections when an HTTP response body is invalid

2018-06-12 Thread Omkar Reddy (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16509469#comment-16509469 ] Omkar Reddy commented on NUTCH-2557: A simple and wise solution. Thanks. > protocol

[jira] [Commented] (NUTCH-2565) MergeDB incorrectly handles unfetched CrawlDatums

2018-06-12 Thread Jurian Broertjes (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16509573#comment-16509573 ] Jurian Broertjes commented on NUTCH-2565: - One solution would be to sum the retri

[jira] [Commented] (NUTCH-2565) MergeDB incorrectly handles unfetched CrawlDatums

2018-06-12 Thread Sebastian Nagel (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16509596#comment-16509596 ] Sebastian Nagel commented on NUTCH-2565: I thought first about making the conditi

Re: Nutch 1.14 issues

2018-06-12 Thread Sebastian Nagel
Hi Arkadi, thanks for your feedback and suggestions. I can understand your frustration but I also want to clarify: - Arch is a nice project, for sure. But Arch is GPL licensed, which makes contributions a one-way route (Nutch -> Arch) and keeps me from even looking into the Arch sources. Sorry

[jira] [Commented] (NUTCH-2012) Merge parsechecker and indexchecker

2018-06-12 Thread Jurian Broertjes (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2012?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16509686#comment-16509686 ] Jurian Broertjes commented on NUTCH-2012: - It looks like the process() function s

[jira] [Commented] (NUTCH-2595) Upgrade crawler-commons dependency to 0.10

2018-06-12 Thread ASF GitHub Bot (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16509763#comment-16509763 ] ASF GitHub Bot commented on NUTCH-2595: --- sebastian-nagel closed pull request #345:

[jira] [Resolved] (NUTCH-2595) Upgrade crawler-commons dependency to 0.10

2018-06-12 Thread Sebastian Nagel (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2595?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sebastian Nagel resolved NUTCH-2595. Resolution: Implemented > Upgrade crawler-commons dependency to 0.10 >

[jira] [Assigned] (NUTCH-2595) Upgrade crawler-commons dependency to 0.10

2018-06-12 Thread Sebastian Nagel (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2595?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sebastian Nagel reassigned NUTCH-2595: -- Assignee: Sebastian Nagel > Upgrade crawler-commons dependency to 0.10 > -

[jira] [Work started] (NUTCH-2576) HTTP protocol plugin based on okhttp

2018-06-12 Thread Sebastian Nagel (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2576?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on NUTCH-2576 started by Sebastian Nagel. -- > HTTP protocol plugin based on okhttp > ---

[jira] [Assigned] (NUTCH-2576) HTTP protocol plugin based on okhttp

2018-06-12 Thread Sebastian Nagel (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2576?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sebastian Nagel reassigned NUTCH-2576: -- Assignee: Sebastian Nagel > HTTP protocol plugin based on okhttp > ---

[jira] [Resolved] (NUTCH-2576) HTTP protocol plugin based on okhttp

2018-06-12 Thread Sebastian Nagel (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2576?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sebastian Nagel resolved NUTCH-2576. Resolution: Implemented > HTTP protocol plugin based on okhttp > --

[jira] [Commented] (NUTCH-2576) HTTP protocol plugin based on okhttp

2018-06-12 Thread ASF GitHub Bot (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16509766#comment-16509766 ] ASF GitHub Bot commented on NUTCH-2576: --- sebastian-nagel closed pull request #328:

[jira] [Commented] (NUTCH-2040) Upgrade to recent version of Crawler-Commons

2018-06-12 Thread ASF GitHub Bot (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16509768#comment-16509768 ] ASF GitHub Bot commented on NUTCH-2040: --- sebastian-nagel closed pull request #346:

[jira] [Resolved] (NUTCH-2040) Upgrade to recent version of Crawler-Commons

2018-06-12 Thread Sebastian Nagel (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2040?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sebastian Nagel resolved NUTCH-2040. Resolution: Implemented > Upgrade to recent version of Crawler-Commons > --

[jira] [Commented] (NUTCH-2012) Merge parsechecker and indexchecker

2018-06-12 Thread ASF GitHub Bot (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2012?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16509779#comment-16509779 ] ASF GitHub Bot commented on NUTCH-2012: --- sju opened a new pull request #348: NUTCH-

[jira] [Commented] (NUTCH-2576) HTTP protocol plugin based on okhttp

2018-06-12 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16509786#comment-16509786 ] Hudson commented on NUTCH-2576: --- SUCCESS: Integrated in Jenkins build Nutch-trunk #3533 (Se

[jira] [Commented] (NUTCH-2595) Upgrade crawler-commons dependency to 0.10

2018-06-12 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16509785#comment-16509785 ] Hudson commented on NUTCH-2595: --- SUCCESS: Integrated in Jenkins build Nutch-trunk #3533 (Se

[jira] [Commented] (NUTCH-2565) MergeDB incorrectly handles unfetched CrawlDatums

2018-06-12 Thread Jurian Broertjes (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16509801#comment-16509801 ] Jurian Broertjes commented on NUTCH-2565: - Maybe it would be sufficient to only t

[jira] [Commented] (NUTCH-2549) protocol-http does not behave the same as browsers

2018-06-12 Thread ASF GitHub Bot (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16509804#comment-16509804 ] ASF GitHub Bot commented on NUTCH-2549: --- sebastian-nagel closed pull request #347:

[jira] [Commented] (NUTCH-2040) Upgrade to recent version of Crawler-Commons

2018-06-12 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16509848#comment-16509848 ] Hudson commented on NUTCH-2040: --- SUCCESS: Integrated in Jenkins build Nutch-nutchgora #1612

[jira] [Commented] (NUTCH-2549) protocol-http does not behave the same as browsers

2018-06-12 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16509864#comment-16509864 ] Hudson commented on NUTCH-2549: --- SUCCESS: Integrated in Jenkins build Nutch-trunk #3534 (Se

[jira] [Commented] (NUTCH-2559) protocol-http cannot handle colons after the HTTP status code

2018-06-12 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16509868#comment-16509868 ] Hudson commented on NUTCH-2559: --- SUCCESS: Integrated in Jenkins build Nutch-trunk #3534 (Se

[jira] [Commented] (NUTCH-2563) HTTP header spellchecking issues

2018-06-12 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16509870#comment-16509870 ] Hudson commented on NUTCH-2563: --- SUCCESS: Integrated in Jenkins build Nutch-trunk #3534 (Se

[jira] [Commented] (NUTCH-2558) protocol-http cannot handle a missing HTTP status line

2018-06-12 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16509869#comment-16509869 ] Hudson commented on NUTCH-2558: --- SUCCESS: Integrated in Jenkins build Nutch-trunk #3534 (Se

[jira] [Commented] (NUTCH-2564) protocol-http throws an error when the content-length header is not a number

2018-06-12 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16509867#comment-16509867 ] Hudson commented on NUTCH-2564: --- SUCCESS: Integrated in Jenkins build Nutch-trunk #3534 (Se

[jira] [Commented] (NUTCH-2557) protocol-http fails to follow redirections when an HTTP response body is invalid

2018-06-12 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16509871#comment-16509871 ] Hudson commented on NUTCH-2557: --- SUCCESS: Integrated in Jenkins build Nutch-trunk #3534 (Se

[jira] [Commented] (NUTCH-2560) protocol-http throws an error when an http header spans over multiple lines

2018-06-12 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16509872#comment-16509872 ] Hudson commented on NUTCH-2560: --- SUCCESS: Integrated in Jenkins build Nutch-trunk #3534 (Se

[jira] [Commented] (NUTCH-2555) URL normalization problem: path not starting with a '/'

2018-06-12 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16509865#comment-16509865 ] Hudson commented on NUTCH-2555: --- SUCCESS: Integrated in Jenkins build Nutch-trunk #3534 (Se

[jira] [Commented] (NUTCH-2556) protocol-http makes invalid HTTP/1.0 requests

2018-06-12 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2556?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16509866#comment-16509866 ] Hudson commented on NUTCH-2556: --- SUCCESS: Integrated in Jenkins build Nutch-trunk #3534 (Se

[jira] [Resolved] (NUTCH-2555) URL normalization problem: path not starting with a '/'

2018-06-12 Thread Sebastian Nagel (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sebastian Nagel resolved NUTCH-2555. Resolution: Fixed > URL normalization problem: path not starting with a '/' > -

[jira] [Resolved] (NUTCH-2556) protocol-http makes invalid HTTP/1.0 requests

2018-06-12 Thread Sebastian Nagel (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2556?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sebastian Nagel resolved NUTCH-2556. Resolution: Fixed HTTP/1.1 is now the default for protocol-http but setting http.useHttp11

[jira] [Resolved] (NUTCH-2558) protocol-http cannot handle a missing HTTP status line

2018-06-12 Thread Sebastian Nagel (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2558?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sebastian Nagel resolved NUTCH-2558. Resolution: Fixed > protocol-http cannot handle a missing HTTP status line > --

[jira] [Resolved] (NUTCH-2559) protocol-http cannot handle colons after the HTTP status code

2018-06-12 Thread Sebastian Nagel (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2559?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sebastian Nagel resolved NUTCH-2559. Resolution: Fixed > protocol-http cannot handle colons after the HTTP status code > ---

[jira] [Resolved] (NUTCH-2561) protocol-http can be made to read arbitrarily large HTTP responses

2018-06-12 Thread Sebastian Nagel (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2561?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sebastian Nagel resolved NUTCH-2561. Resolution: Fixed Thanks, [~gbouchar], esp. for the idea for the unit test server. > proto

[jira] [Resolved] (NUTCH-2557) protocol-http fails to follow redirections when an HTTP response body is invalid

2018-06-12 Thread Sebastian Nagel (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2557?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sebastian Nagel resolved NUTCH-2557. Resolution: Fixed Thanks, [~gbouchar] and [~omkar20895]! > protocol-http fails to follow r

[jira] [Resolved] (NUTCH-2563) HTTP header spellchecking issues

2018-06-12 Thread Sebastian Nagel (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2563?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sebastian Nagel resolved NUTCH-2563. Resolution: Fixed > HTTP header spellchecking issues > > >

[jira] [Resolved] (NUTCH-2549) protocol-http does not behave the same as browsers

2018-06-12 Thread Sebastian Nagel (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2549?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sebastian Nagel resolved NUTCH-2549. Resolution: Fixed Thanks, [~gbouchar] for the careful analysis! > protocol-http does not b

[jira] [Updated] (NUTCH-2512) Nutch does not build under JDK9

2018-06-12 Thread Sebastian Nagel (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sebastian Nagel updated NUTCH-2512: --- Fix Version/s: (was: 1.15) 1.16 > Nutch does not build under JDK9 > --

[jira] [Resolved] (NUTCH-2560) protocol-http throws an error when an http header spans over multiple lines

2018-06-12 Thread Sebastian Nagel (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2560?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sebastian Nagel resolved NUTCH-2560. Resolution: Cannot Reproduce Thanks, [~gbouchar]. There is now a unit test for multi-line h

[jira] [Resolved] (NUTCH-2564) protocol-http throws an error when the content-length header is not a number

2018-06-12 Thread Sebastian Nagel (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2564?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sebastian Nagel resolved NUTCH-2564. Resolution: Fixed > protocol-http throws an error when the content-length header is not a n

[jira] [Updated] (NUTCH-2292) Mavenize the build for nutch-core and nutch-plugins

2018-06-12 Thread Sebastian Nagel (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2292?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sebastian Nagel updated NUTCH-2292: --- Fix Version/s: (was: 1.15) 1.16 > Mavenize the build for nutch-core an

[jira] [Updated] (NUTCH-2030) ParseZip plugin is not able to extract language from zip document, this could solve that problem.

2018-06-12 Thread Sebastian Nagel (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2030?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sebastian Nagel updated NUTCH-2030: --- Fix Version/s: (was: 1.15) 1.16 > ParseZip plugin is not able to extra

[jira] [Updated] (NUTCH-2334) Extension point for schedulers

2018-06-12 Thread Sebastian Nagel (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2334?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sebastian Nagel updated NUTCH-2334: --- Fix Version/s: (was: 1.15) 1.16 > Extension point for schedulers > ---

[jira] [Commented] (NUTCH-2030) ParseZip plugin is not able to extract language from zip document, this could solve that problem.

2018-06-12 Thread Sebastian Nagel (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16510069#comment-16510069 ] Sebastian Nagel commented on NUTCH-2030: So, it's about parse-zip or the "lang" f

[jira] [Updated] (NUTCH-2032) Plugin to index the raw content of a readable document.

2018-06-12 Thread Sebastian Nagel (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2032?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sebastian Nagel updated NUTCH-2032: --- Fix Version/s: (was: 1.15) > Plugin to index the raw content of a readable document. > -

[jira] [Commented] (NUTCH-2140) Atomic update and optimistic concurrency update using Solr

2018-06-12 Thread Sebastian Nagel (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16510075#comment-16510075 ] Sebastian Nagel commented on NUTCH-2140: Hi [~roannel], is this still a requireme

[jira] [Updated] (NUTCH-2369) Create a new GraphGenerator Tool for writing Nutch Records as a Full Web Graph

2018-06-12 Thread Sebastian Nagel (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2369?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sebastian Nagel updated NUTCH-2369: --- Fix Version/s: (was: 1.15) > Create a new GraphGenerator Tool for writing Nutch Records a

[jira] [Updated] (NUTCH-2267) Solr indexer fails at the end of the job with a java error message

2018-06-12 Thread Sebastian Nagel (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2267?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sebastian Nagel updated NUTCH-2267: --- Fix Version/s: (was: 1.15) > Solr indexer fails at the end of the job with a java error m

[jira] [Resolved] (NUTCH-2267) Solr indexer fails at the end of the job with a java error message

2018-06-12 Thread Sebastian Nagel (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2267?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sebastian Nagel resolved NUTCH-2267. Resolution: Done PR has been merged. Closing this for now. Thanks to everyone involved! >

[jira] [Resolved] (NUTCH-2312) Support PhantomJS as a WebDriver in protocol-selenium

2018-06-12 Thread Sebastian Nagel (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2312?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sebastian Nagel resolved NUTCH-2312. Resolution: Incomplete Fix Version/s: (was: 1.15) No patch/PR provided so far.

[jira] [Updated] (NUTCH-2382) indexer-hbase Nutch 1.x branch

2018-06-12 Thread Sebastian Nagel (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2382?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sebastian Nagel updated NUTCH-2382: --- Fix Version/s: (was: 1.15) 1.16 > indexer-hbase Nutch 1.x branch > ---

[jira] [Commented] (NUTCH-2382) indexer-hbase Nutch 1.x branch

2018-06-12 Thread Sebastian Nagel (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16510086#comment-16510086 ] Sebastian Nagel commented on NUTCH-2382: After NUTCH-1480 the patch needs to be u

[jira] [Resolved] (NUTCH-2251) Make CommonCrawlFormatJackson instance reusable by properly handling object state

2018-06-12 Thread Sebastian Nagel (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2251?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sebastian Nagel resolved NUTCH-2251. Resolution: Duplicate Fix Version/s: (was: 1.15) > Make CommonCrawlFormatJackson

[jira] [Updated] (NUTCH-2239) Selenium Handlers for Ajax Patterns from Student submissions

2018-06-12 Thread Sebastian Nagel (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sebastian Nagel updated NUTCH-2239: --- Fix Version/s: (was: 1.15) > Selenium Handlers for Ajax Patterns from Student submissions

[jira] [Commented] (NUTCH-2239) Selenium Handlers for Ajax Patterns from Student submissions

2018-06-12 Thread Sebastian Nagel (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16510090#comment-16510090 ] Sebastian Nagel commented on NUTCH-2239: Hi [~chrismattmann], still in progress?

[jira] [Updated] (NUTCH-2265) Write A Test Package for Scoring Similarity

2018-06-12 Thread Sebastian Nagel (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2265?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sebastian Nagel updated NUTCH-2265: --- Fix Version/s: (was: 1.15) > Write A Test Package for Scoring Similarity > --

[jira] [Updated] (NUTCH-2249) WordNet Integration for Cosine Similarity

2018-06-12 Thread Sebastian Nagel (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2249?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sebastian Nagel updated NUTCH-2249: --- Fix Version/s: (was: 1.15) > WordNet Integration for Cosine Similarity >

[jira] [Updated] (NUTCH-2209) Improved Tokenization for Similarity Scoring plugin

2018-06-12 Thread Sebastian Nagel (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2209?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sebastian Nagel updated NUTCH-2209: --- Fix Version/s: (was: 1.15) > Improved Tokenization for Similarity Scoring plugin > --

[jira] [Resolved] (NUTCH-2209) Improved Tokenization for Similarity Scoring plugin

2018-06-12 Thread Sebastian Nagel (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2209?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sebastian Nagel resolved NUTCH-2209. Resolution: Done This has been already committed (pull request merged) for Nutch 1.12. > I

[jira] [Updated] (NUTCH-2147) MetadataScoringFilter for Nutch

2018-06-12 Thread Sebastian Nagel (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2147?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sebastian Nagel updated NUTCH-2147: --- Fix Version/s: (was: 1.15) > MetadataScoringFilter for Nutch > --

Re: Nutch 1.14 issues

2018-06-12 Thread Arkadi.Kosmynin
Hi Sebastian, Sorry, clarifying my objectives: I am not frustrated, just trying to help. I did not write this message to request fixes for Arch. All these issues have been fixed in Arch, except perhaps the native library issue, but I may fix it as well, if lucky enough. I wrote that message to