[jira] [Updated] (NUTCH-1098) better url-normalizer basic

2011-11-04 Thread Radim Kolar (Updated) (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1098?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Radim Kolar updated NUTCH-1098: Attachment: (was: patch-with-utf8-encoding.diff)

[jira] [Updated] (NUTCH-1070) Run nutch under native windows (no cygwin)

2011-11-03 Thread Radim Kolar (Updated) (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1070?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Radim Kolar updated NUTCH-1070: Attachment: (was: bash.c)

[jira] [Updated] (NUTCH-1070) Run nutch under native windows (no cygwin)

2011-11-03 Thread Radim Kolar (Updated) (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1070?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Radim Kolar updated NUTCH-1070: Attachment: (was: chmod.c)

[jira] [Updated] (NUTCH-1070) Run nutch under native windows (no cygwin)

2011-11-03 Thread Radim Kolar (Updated) (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1070?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Radim Kolar updated NUTCH-1070: Attachment: (was: nutch.bat)

[jira] [Updated] (NUTCH-1194) CrawlDB lock should be released earlier

2011-11-03 Thread Radim Kolar (Updated) (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1194?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Radim Kolar updated NUTCH-1194: Comment: was deleted (was: locking should be done in setup/cleanup task. Currently if you kill proce

[jira] [Updated] (NUTCH-1098) better url-normalizer basic

2011-11-02 Thread Radim Kolar (Updated) (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1098?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Radim Kolar updated NUTCH-1098: Attachment: patch-with-utf8-encoding.diff Added support for encoding string to UTF-8 and then URL %es

[jira] [Updated] (NUTCH-1098) better url-normalizer basic

2011-11-02 Thread Radim Kolar (Updated) (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1098?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Radim Kolar updated NUTCH-1098: Attachment: (was: patch-urlnormalizer.diff)

[jira] [Updated] (NUTCH-1098) better url-normalizer basic

2011-10-24 Thread Radim Kolar (Updated) (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1098?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Radim Kolar updated NUTCH-1098: Attachment: patch-urlnormalizer.diff Do not decode # and / characters during %XX decoding. Unit tests

[jira] [Updated] (NUTCH-1098) better url-normalizer basic

2011-10-24 Thread Radim Kolar (Updated) (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1098?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Radim Kolar updated NUTCH-1098: Attachment: (was: patch-urlnormalizer.diff)

[jira] [Updated] (NUTCH-1098) better url-normalizer basic

2011-10-24 Thread Radim Kolar (Updated) (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1098?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Radim Kolar updated NUTCH-1098: Attachment: (was: nutch.diff)

[jira] [Updated] (NUTCH-1098) better url-normalizer basic

2011-10-19 Thread Radim Kolar (Updated) (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1098?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Radim Kolar updated NUTCH-1098: Attachment: patch-urlnormalizer.diff