[jira] Commented: (NUTCH-706) Url regex normalizer

2009-02-27 Thread Meghna Kukreja (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-706?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12677460#action_12677460 ]

Meghna Kukreja commented on NUTCH-706:

The pattern should be changed to: ([;_\?&]((?i)

[jira] Created: (NUTCH-706) Url regex normalizer

2009-02-27 Thread Meghna Kukreja (JIRA)
Url regex normalizer

Key: NUTCH-706
URL: https://issues.apache.org/jira/browse/NUTCH-706
Project: Nutch
Issue Type: Bug
Affects Versions: 1.0.0
Reporter: Meghna Kukreja
Priority: Minor

[jira] Commented: (NUTCH-374) when http.content.limit be set to -1 and Response.CONTENT_ENCODING is gzip or x-gzip , it can not fetch any thing.

2006-09-29 Thread Meghna Kukreja (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-374?page=comments#action_12438722 ]

Meghna Kukreja commented on NUTCH-374:

I have experienced this same problem and I fixed it by making this change to the function unzipBestEffort() in GZIPUtil
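
The snippet above is cut off before the change itself, so the actual fix is not reproduced here. Purely as an illustration of the situation named in the issue title (a content limit of -1 combined with gzip-encoded content), the following Java sketch shows one way a best-effort gunzip routine could treat a negative limit as "no limit". The class name `GzipSketch`, the `sizeLimit` parameter, and the `gzip` helper are assumptions made for this example; this is not the Nutch code or the commenter's patch.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical stand-in for a gzip utility; not the Nutch GZIPUtil class.
final class GzipSketch {

    // Decompresses as much of `in` as possible.
    // Assumption: a negative sizeLimit means "no limit", mirroring the
    // convention of http.content.limit = -1 mentioned in the issue title.
    static byte[] unzipBestEffort(byte[] in, int sizeLimit) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPInputStream gz = new GZIPInputStream(new ByteArrayInputStream(in))) {
            byte[] buf = new byte[4096];
            while (true) {
                int want = buf.length;
                if (sizeLimit >= 0) {
                    // Only apply the cap when the limit is non-negative.
                    int remaining = sizeLimit - out.size();
                    if (remaining <= 0) {
                        break;
                    }
                    want = Math.min(want, remaining);
                }
                int n = gz.read(buf, 0, want);
                if (n < 0) {
                    break; // end of the compressed stream
                }
                out.write(buf, 0, n);
            }
        } catch (IOException e) {
            // Best effort: keep whatever was decoded before the failure.
        }
        return out.toByteArray();
    }

    // Helper used only to build test input for the sketch.
    static byte[] gzip(byte[] raw) {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(bos)) {
            gz.write(raw);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bos.toByteArray();
    }
}
```

With this sketch, `GzipSketch.unzipBestEffort(data, -1)` returns the full decompressed content, while a non-negative limit truncates the output to at most that many bytes. A routine that compared the decoded size against the limit without checking its sign would stop immediately when the limit is -1, which is consistent with the symptom the issue title describes.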