[jira] [Commented] (NUTCH-1714) Nutch 2.x upgrade to use GORA_94 branch

2014-04-27 Thread Lewis John McGibbney (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13982692#comment-13982692 ] Lewis John McGibbney commented on NUTCH-1714: - Can you elaborate? Do you mean

[jira] [Commented] (NUTCH-1129) Any23 Nutch plugin

2014-04-27 Thread Lewis John McGibbney (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1129?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13982688#comment-13982688 ] Lewis John McGibbney commented on NUTCH-1129: - Did anyone get an opportunity t

[jira] [Updated] (NUTCH-1714) Nutch 2.x upgrade to use GORA_94 branch

2014-04-27 Thread Lewis John McGibbney (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1714?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney updated NUTCH-1714: Assignee: Alparslan Avcı > Nutch 2.x upgrade to use GORA_94 branch > --

[DISCUSS] Roadmap for 2.3 Release

2014-04-27 Thread Lewis John Mcgibbney
Hi Folks, I suggest we get in https://issues.apache.org/jira/browse/NUTCH-1714 and then push a release. Does anyone have any other suggestions/additions? Unless someone else wants to do RM, I can put time in if required. Thanks, Lewis -- *Lewis*

[jira] [Commented] (NUTCH-1364) Add a counter in Generator for malformed urls

2014-04-27 Thread Lewis John McGibbney (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13982684#comment-13982684 ] Lewis John McGibbney commented on NUTCH-1364: - [~diaa_abdallah] bq. Doesn't t

[jira] [Commented] (NUTCH-1364) Add a counter in Generator for malformed urls

2014-04-27 Thread Diaa (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13982492#comment-13982492 ] Diaa commented on NUTCH-1364: - Why is the exception shown if it is already reported with Malfo

[jira] [Updated] (NUTCH-797) URL not properly constructed when link target begins with a "?"

2014-04-27 Thread Sebastian Nagel (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-797?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sebastian Nagel updated NUTCH-797: -- Attachment: NUTCH-797-2x-v2.patch Simplified patch for 2.x, without changes to fixEmbeddedParams

[jira] [Updated] (NUTCH-1767) remove special treatment of "params" in relative links

2014-04-27 Thread Sebastian Nagel (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1767?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sebastian Nagel updated NUTCH-1767: --- Attachment: test_nutch_1767-2.html test_nutch_1767-1.html Test documents. Afa

[jira] [Created] (NUTCH-1767) remove special treatment of "params" in relative links

2014-04-27 Thread Sebastian Nagel (JIRA)
Sebastian Nagel created NUTCH-1767: -- Summary: remove special treatment of "params" in relative links Key: NUTCH-1767 URL: https://issues.apache.org/jira/browse/NUTCH-1767 Project: Nutch Issu

[jira] [Updated] (NUTCH-1766) Generator to unlock crawldb and remove tempdir if generate job fails

2014-04-27 Thread Diaa (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1766?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Diaa updated NUTCH-1766: Attachment: GeneratorErrorHandling.patch > Generator to unlock crawldb and remove tempdir if generate job fails > -

[jira] [Commented] (NUTCH-1766) Generator to unlock crawldb and remove tempdir if generate job fails

2014-04-27 Thread Diaa (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1766?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13982485#comment-13982485 ] Diaa commented on NUTCH-1766: - Attaching Generator patch to fix the issue > Generator to unlo

[jira] [Created] (NUTCH-1766) Generator to unlock crawldb and remove tempdir if generate job fails

2014-04-27 Thread Diaa (JIRA)
Diaa created NUTCH-1766: --- Summary: Generator to unlock crawldb and remove tempdir if generate job fails Key: NUTCH-1766 URL: https://issues.apache.org/jira/browse/NUTCH-1766 Project: Nutch Issue Type:

[jira] [Commented] (NUTCH-566) Sun's URL class has bug in creation of relative query URLs

2014-04-27 Thread Sebastian Nagel (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13982276#comment-13982276 ] Sebastian Nagel commented on NUTCH-566: --- Linked to duplicate issues NUTCH-797 and NUT

[jira] [Commented] (NUTCH-952) fix outlink which started with '?' in html parser

2014-04-27 Thread Sebastian Nagel (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13982275#comment-13982275 ] Sebastian Nagel commented on NUTCH-952: --- Linked to duplicate issues NUTCH-797 and NUT

[jira] [Updated] (NUTCH-797) URL not properly constructed when link target begins with a "?"

2014-04-27 Thread Sebastian Nagel (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-797?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sebastian Nagel updated NUTCH-797: -- Summary: URL not properly constructed when link target begins with a "?" (was: parse-tika is no

[jira] [Commented] (NUTCH-797) URL not properly constructed when link target begins with a "?"

2014-04-27 Thread Sebastian Nagel (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13982272#comment-13982272 ] Sebastian Nagel commented on NUTCH-797: --- Changed title: it's also a problem of parse-

[jira] [Commented] (NUTCH-797) parse-tika is not properly constructing URLs when the target begins with a "?"

2014-04-27 Thread Sebastian Nagel (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13982271#comment-13982271 ] Sebastian Nagel commented on NUTCH-797: --- Ok, then I'll take over to patch 2.x and res

[jira] [Commented] (NUTCH-797) parse-tika is not properly constructing URLs when the target begins with a "?"

2014-04-27 Thread Julien Nioche (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13982215#comment-13982215 ] Julien Nioche commented on NUTCH-797: - No idea Seb. Have just assigned this one from An