[ https://issues.apache.org/jira/browse/NUTCH-706?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12677460#action_12677460 ]
Meghna Kukreja commented on NUTCH-706:
--
The pattern should be changed to:
([;_\?&]((?i)
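For readers unfamiliar with the two constructs visible in that prefix, here is a minimal sketch of how they behave in java.util.regex. The character class `[;_\?&]` matches exactly one of `;`, `_`, `?` or `&`, and an inline `(?i)` placed inside a group turns on case-insensitive matching up to the end of that group. The pattern above is cut off, so the `sid=[^&#]*` tail and the class name `InlineFlagSketch` below are assumptions made only for illustration, not the pattern proposed for the fix.

```java
import java.util.regex.Pattern;

public class InlineFlagSketch {
    // Hypothetical pattern: only the leading "([;_\?&]((?i)" comes from the
    // comment above; the "sid=[^&#]*" continuation is an assumed stand-in.
    static final Pattern SKETCH =
            Pattern.compile("([;_\\?&]((?i)sid=[^&#]*))");

    // Returns true if the URL contains a delimiter from the character class
    // immediately followed by "sid=" in any letter case.
    static boolean matches(String url) {
        return SKETCH.matcher(url).find();
    }

    public static void main(String[] args) {
        // '?' is in the character class and (?i) lets "SID" match "sid".
        System.out.println(matches("http://example.com/a?SID=123")); // true
        // ';' is in the character class as well.
        System.out.println(matches("http://example.com/a;sid=123")); // true
        // '/' is not in the character class, so nothing matches here.
        System.out.println(matches("http://example.com/a/sid=123")); // false
    }
}
```

Because the flag sits inside the inner group, anything written after that group's closing parenthesis would be matched case-sensitively again.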
Url regex normalizer
Key: NUTCH-706
URL: https://issues.apache.org/jira/browse/NUTCH-706
Project: Nutch
Issue Type: Bug
Affects Versions: 1.0.0
Reporter: Meghna Kukreja
Priority: Minor
[ http://issues.apache.org/jira/browse/NUTCH-374?page=comments#action_12438722 ]
Meghna Kukreja commented on NUTCH-374:
--
I experienced the same problem and fixed it by making this change to the
unzipBestEffort() function in GZIPUtil: