[jira] [Updated] (NUTCH-2525) Metadata indexer cannot handle uppercase parse metadata

2019-05-07 Thread Jurian Broertjes (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2525?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jurian Broertjes updated NUTCH-2525: Attachment: NUTCH-2525-p1.patch > Metadata indexer cannot handle uppercase parse metadata

[jira] [Commented] (NUTCH-2525) Metadata indexer cannot handle uppercase parse metadata

2019-05-07 Thread Jurian Broertjes (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16834822#comment-16834822 ] Jurian Broertjes commented on NUTCH-2525: Updated patch so it applies against master > Metadata

[jira] [Commented] (NUTCH-2715) WARCExporter fails on large records

2019-05-07 Thread Yossi Tamari (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2715?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16834751#comment-16834751 ] Yossi Tamari commented on NUTCH-2715: It seems to me like the commoncrawldump plugin is literally

[jira] [Updated] (NUTCH-2716) protocol-http: Response headers are not stored for a compressed response

2019-05-07 Thread Yossi Tamari (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2716?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yossi Tamari updated NUTCH-2716: Description: Even when store.http.headers=true, the HTTP headers are not saved for a gzipped or