[jira] [Updated] (NUTCH-1478) Parse-metatags and index-metadata plugin for Nutch 2.x series

2014-01-28 Thread JIRA
[ https://issues.apache.org/jira/browse/NUTCH-1478?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yasin Kılınç updated NUTCH-1478: Attachment: NUTCH-1478v4.patch I reviewed this patch and fixed some bugs. +1 for commit. > Parse-me

[jira] [Created] (NUTCH-1718) update description of property http.robots.agent

2014-01-28 Thread Sebastian Nagel (JIRA)
Sebastian Nagel created NUTCH-1718: -- Summary: update description of property http.robots.agent Key: NUTCH-1718 URL: https://issues.apache.org/jira/browse/NUTCH-1718 Project: Nutch Issue Type

[jira] [Updated] (NUTCH-1478) Parse-metatags and index-metadata plugin for Nutch 2.x series

2014-01-28 Thread JIRA
[ https://issues.apache.org/jira/browse/NUTCH-1478?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yasin Kılınç updated NUTCH-1478: Attachment: NUTCH-1478v4.patch > Parse-metatags and index-metadata plugin for Nutch 2.x series > -

[jira] [Updated] (NUTCH-1478) Parse-metatags and index-metadata plugin for Nutch 2.x series

2014-01-28 Thread JIRA
[ https://issues.apache.org/jira/browse/NUTCH-1478?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yasin Kılınç updated NUTCH-1478: Attachment: (was: NUTCH-1478v4.patch) > Parse-metatags and index-metadata plugin for Nutch 2.x

[jira] [Updated] (NUTCH-1478) Parse-metatags and index-metadata plugin for Nutch 2.x series

2014-01-28 Thread JIRA
[ https://issues.apache.org/jira/browse/NUTCH-1478?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yasin Kılınç updated NUTCH-1478: Attachment: NUTCH-1478v4.patch > Parse-metatags and index-metadata plugin for Nutch 2.x series > -

[jira] [Updated] (NUTCH-1478) Parse-metatags and index-metadata plugin for Nutch 2.x series

2014-01-28 Thread JIRA
[ https://issues.apache.org/jira/browse/NUTCH-1478?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yasin Kılınç updated NUTCH-1478: Attachment: (was: NUTCH-1478v4.patch) > Parse-metatags and index-metadata plugin for Nutch 2.x

[jira] [Commented] (NUTCH-1414) Date extraction parse filter

2014-01-28 Thread Luke (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13883986#comment-13883986 ] Luke commented on NUTCH-1414: - Thanks for the response. If I can follow up a bit more on passi

[jira] [Commented] (NUTCH-1414) Date extraction parse filter

2014-01-28 Thread Markus Jelsma (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13883997#comment-13883997 ] Markus Jelsma commented on NUTCH-1414: -- Hi - you don't have to change it, it is alrea

[jira] [Commented] (NUTCH-1718) update description of property http.robots.agent

2014-01-28 Thread Markus Jelsma (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13883998#comment-13883998 ] Markus Jelsma commented on NUTCH-1718: -- Yes +1 > update description of property http

[jira] [Updated] (NUTCH-1714) Nutch 2.x upgrade to use GORA_94 branch

2014-01-28 Thread JIRA
[ https://issues.apache.org/jira/browse/NUTCH-1714?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alparslan Avcı updated NUTCH-1714: -- Attachment: NUTCH-1714v2.patch I've uploaded a patch that fixes direct call of Host class const

[jira] [Commented] (NUTCH-1414) Date extraction parse filter

2014-01-28 Thread Luke (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13884073#comment-13884073 ] Luke commented on NUTCH-1414: - I think I found the culprit. From the link you gave, Solr want

[jira] [Comment Edited] (NUTCH-1414) Date extraction parse filter

2014-01-28 Thread Luke (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13884073#comment-13884073 ] Luke edited comment on NUTCH-1414 at 1/28/14 12:42 PM: --- I think I fo

[jira] [Commented] (NUTCH-1414) Date extraction parse filter

2014-01-28 Thread Markus Jelsma (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13884078#comment-13884078 ] Markus Jelsma commented on NUTCH-1414: -- Ah yes, you are right. That makes sense, the

[jira] [Commented] (NUTCH-1717) HostDB not to complain if filters/normalizers are disabled

2014-01-28 Thread Lewis John McGibbney (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1717?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13884091#comment-13884091 ] Lewis John McGibbney commented on NUTCH-1717: - +1 > HostDB not to complain if

[jira] [Updated] (NUTCH-1478) Parse-metatags and index-metadata plugin for Nutch 2.x series

2014-01-28 Thread JIRA
[ https://issues.apache.org/jira/browse/NUTCH-1478?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yasin Kılınç updated NUTCH-1478: Attachment: (was: NUTCH-1478v4.patch) > Parse-metatags and index-metadata plugin for Nutch 2.x

[jira] [Updated] (NUTCH-1478) Parse-metatags and index-metadata plugin for Nutch 2.x series

2014-01-28 Thread JIRA
[ https://issues.apache.org/jira/browse/NUTCH-1478?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yasin Kılınç updated NUTCH-1478: Attachment: NUTCH-1478v4.patch I added a new patch. It passes all test cases. > Parse-metatags and ind

[jira] [Commented] (NUTCH-1717) HostDB not to complain if filters/normalizers are disabled

2014-01-28 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1717?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13884150#comment-13884150 ] Hudson commented on NUTCH-1717: --- SUCCESS: Integrated in Nutch-trunk #2509 (See [https://bui

[jira] [Created] (NUTCH-1719) DomainStatistics fails in 2.x because URL is not unreversed

2014-01-28 Thread Gerhard Gossen (JIRA)
Gerhard Gossen created NUTCH-1719: - Summary: DomainStatistics fails in 2.x because URL is not unreversed Key: NUTCH-1719 URL: https://issues.apache.org/jira/browse/NUTCH-1719 Project: Nutch

[jira] [Updated] (NUTCH-1719) DomainStatistics fails in 2.x because URL is not unreversed

2014-01-28 Thread Gerhard Gossen (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1719?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gerhard Gossen updated NUTCH-1719: -- Attachment: domainstats.patch > DomainStatistics fails in 2.x because URL is not unreversed > -

[jira] [Updated] (NUTCH-1719) DomainStatistics fails in 2.x because URL is not unreversed

2014-01-28 Thread Gerhard Gossen (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1719?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gerhard Gossen updated NUTCH-1719: -- Description: With Nutch 2.x, {{org.apache.nutch.util.domain.DomainStatistics}} always returns

[jira] [Updated] (NUTCH-1465) Support sitemaps in Nutch

2014-01-28 Thread Tejas Patil (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1465?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tejas Patil updated NUTCH-1465: --- Attachment: NUTCH-1465-trunk.v5.patch Adding new patch 'v5' with below changes: 1. Added Apache licen

[jira] [Updated] (NUTCH-1718) update description of property http.robots.agent

2014-01-28 Thread Tejas Patil (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1718?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tejas Patil updated NUTCH-1718: --- Attachment: NUTCH-1718-trunk.v1.patch Thanks [~wastl-nagel] for bringing this up. I should have updat

[jira] [Updated] (NUTCH-1719) DomainStatistics fails in 2.x because URL is not unreversed

2014-01-28 Thread Lewis John McGibbney (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1719?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney updated NUTCH-1719: Fix Version/s: 2.3 > DomainStatistics fails in 2.x because URL is not unreversed >

[jira] [Commented] (NUTCH-1719) DomainStatistics fails in 2.x because URL is not unreversed

2014-01-28 Thread Lewis John McGibbney (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1719?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13884548#comment-13884548 ] Lewis John McGibbney commented on NUTCH-1719: - +1, anyone else have comments?

[jira] [Commented] (NUTCH-1253) Incompatible neko and xerces versions

2014-01-28 Thread Lewis John McGibbney (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13884550#comment-13884550 ] Lewis John McGibbney commented on NUTCH-1253: - I would like to commit by tomor

[jira] [Updated] (NUTCH-1253) Incompatible neko and xerces versions

2014-01-28 Thread Sebastian Nagel (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1253?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sebastian Nagel updated NUTCH-1253: --- Attachment: NUTCH-1253-trunk.v2.patch Hi [~lewismc], the HTML which fails to parse looks not

[jira] [Comment Edited] (NUTCH-1253) Incompatible neko and xerces versions

2014-01-28 Thread Sebastian Nagel (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13884818#comment-13884818 ] Sebastian Nagel edited comment on NUTCH-1253 at 1/28/14 10:57 PM: --

[jira] [Commented] (NUTCH-1718) update description of property http.robots.agent

2014-01-28 Thread Daniel Kugel (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13885093#comment-13885093 ] Daniel Kugel commented on NUTCH-1718: - In that case I think that either the issue titl