[jira] [Commented] (NUTCH-2665) Upgrade to Apache Tika 1.19.1
[ https://issues.apache.org/jira/browse/NUTCH-2665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16730203#comment-16730203 ]

Markus Jelsma commented on NUTCH-2665:

Thanks!

> Upgrade to Apache Tika 1.19.1
>
> Key: NUTCH-2665
> URL: https://issues.apache.org/jira/browse/NUTCH-2665
> Project: Nutch
> Issue Type: Task
> Components: parser
> Affects Versions: 2.3.1
> Reporter: Markus Jelsma
> Assignee: Markus Jelsma
> Priority: Major
> Attachments: NUTCH-2665.patch, NUTCH-2665.patch
>
> Borrowing from [~wastl-nagel]'s efforts on NUTCH-2651, 2.x can be upgraded to
> Apache Tika 1.19.1 as well.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
[jira] [Commented] (NUTCH-2665) Upgrade to Apache Tika 1.19.1
[ https://issues.apache.org/jira/browse/NUTCH-2665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16662234#comment-16662234 ]

Markus Jelsma commented on NUTCH-2665:

On my machine the test really does fail with the latest patch, weird! With the patch removed everything passes; applying it again makes this test fail.

Also, what about the duplicate, NUTCH-2667? It has a patch, but it doesn't entirely correspond to this one. One of the two issues should be closed as a duplicate.

> Upgrade to Apache Tika 1.19.1
>
> Key: NUTCH-2665
> URL: https://issues.apache.org/jira/browse/NUTCH-2665
> Project: Nutch
> Issue Type: Task
> Components: parser
> Affects Versions: 2.3.1
> Reporter: Markus Jelsma
> Assignee: Markus Jelsma
> Priority: Major
> Fix For: 2.4
> Attachments: NUTCH-2665.patch, NUTCH-2665.patch
>
> Borrowing from [~wastl-nagel]'s efforts on NUTCH-2651, 2.x can be upgraded to
> Apache Tika 1.19.1 as well.
[jira] [Commented] (NUTCH-2665) Upgrade to Apache Tika 1.19.1
[ https://issues.apache.org/jira/browse/NUTCH-2665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16662187#comment-16662187 ]

Sebastian Nagel commented on NUTCH-2665:

Hi [~markus17], I do not see this test failure when applying the second patch and running {{ant clean runtime test}}. But all parse-tika tests fail with a NoSuchMethodError. Did you run "clean"?
[jira] [Commented] (NUTCH-2665) Upgrade to Apache Tika 1.19.1
[ https://issues.apache.org/jira/browse/NUTCH-2665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16662072#comment-16662072 ]

Jorge Luis Betancourt Gonzalez commented on NUTCH-2665:

+1 [~markus17] I think it's safe to update the test.
[jira] [Commented] (NUTCH-2665) Upgrade to Apache Tika 1.19.1
[ https://issues.apache.org/jira/browse/NUTCH-2665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16661983#comment-16661983 ]

Markus Jelsma commented on NUTCH-2665:

Hello [~axr], yes, it compiles fine; that is what the default.properties patch is for.

Running the tests:
{code}
ContentType http://127.0.0.1:47501/basic-http.jsp expected:<[application/xhtml+x]ml> but was:<[text/ht]ml>
junit.framework.AssertionFailedError: ContentType http://127.0.0.1:47501/basic-http.jsp expected:<[application/xhtml+x]ml> but was:<[text/ht]ml>
	at org.apache.nutch.protocol.http.TestProtocolHttp.fetchPage(TestProtocolHttp.java:134)
	at org.apache.nutch.protocol.http.TestProtocolHttp.testStatusCode(TestProtocolHttp.java:79)
{code}

This fails, but I am actually fine with this response. I propose changing the test to assert text/html instead. Opinions?
[jira] [Commented] (NUTCH-2665) Upgrade to Apache Tika 1.19.1
[ https://issues.apache.org/jira/browse/NUTCH-2665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16661680#comment-16661680 ]

Akshar Dave commented on NUTCH-2665:

Were you able to commit this change and build successfully? I am trying to build locally after merging all the changes and am getting a dependency-related error:

[ivy:resolve] ::
[ivy:resolve] :: UNRESOLVED DEPENDENCIES ::
[ivy:resolve] ::
[ivy:resolve] :: javax.measure#unit-api;working@axr.local: not found
[ivy:resolve] ::
[jira] [Commented] (NUTCH-2665) Upgrade to Apache Tika 1.19.1
[ https://issues.apache.org/jira/browse/NUTCH-2665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16660625#comment-16660625 ]

Markus Jelsma commented on NUTCH-2665:

I'll commit this one later today, if I don't forget, unless there are further objections.
[jira] [Commented] (NUTCH-2665) Upgrade to Apache Tika 1.19.1
[ https://issues.apache.org/jira/browse/NUTCH-2665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16660525#comment-16660525 ]

Markus Jelsma commented on NUTCH-2665:

Updated patch defining the property in ivysettings.xml.
[jira] [Commented] (NUTCH-2665) Upgrade to Apache Tika 1.19.1
[ https://issues.apache.org/jira/browse/NUTCH-2665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16660518#comment-16660518 ]

Sebastian Nagel commented on NUTCH-2665:

+1 Thanks, [~markus17]!

For 1.x I needed several trials to get the fix for the javax.ws dependency working on the [Jenkins builds|https://builds.apache.org/job/Nutch-trunk/]. Defining packaging.type=jar in the default.properties didn't work, also adding it as an ant param did not (equiv. to {{ant -Dpackaging.type=jar ...}}). Defining the property in the ivysettings.xml finally solved it, see [65c4fed|https://gitbox.apache.org/repos/asf?p=nutch.git;a=commitdiff;h=65c4fedfacdb873a050e97a50602ed366c7b5a98].

Can you integrate this change into your patch?
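
For readers unfamiliar with Ivy, a property definition of the kind described in the comment above might look roughly like the following in ivysettings.xml. This is only a sketch: it assumes Ivy's standard {{<property>}} settings element, and the surrounding content of the file is elided. The actual change is the one in commit 65c4fed and may differ in placement or attributes.

{code:xml}
<ivysettings>
  <!-- Assumption: define packaging.type at the Ivy settings level, as the
       comment above describes, rather than in default.properties or as an
       ant -D parameter. -->
  <property name="packaging.type" value="jar"/>

  <!-- ... existing settings and resolvers unchanged ... -->
</ivysettings>
{code}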
[jira] [Commented] (NUTCH-2665) Upgrade to Apache Tika 1.19.1
[ https://issues.apache.org/jira/browse/NUTCH-2665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16660455#comment-16660455 ]

Markus Jelsma commented on NUTCH-2665:

Patch for 2.x!