[ https://issues.apache.org/jira/browse/NUTCH-2716?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Sebastian Nagel reassigned NUTCH-2716:
--------------------------------------

    Assignee: Sebastian Nagel

> protocol-http: Response headers are not stored for a compressed response
> ------------------------------------------------------------------------
>
>                 Key: NUTCH-2716
>                 URL: https://issues.apache.org/jira/browse/NUTCH-2716
>             Project: Nutch
>          Issue Type: Bug
>          Components: protocol
>    Affects Versions: 1.15
>            Reporter: Yossi Tamari
>            Assignee: Sebastian Nagel
>            Priority: Major
>             Fix For: 1.16
>
>
> Even when store.http.headers=true, the HTTP headers are not saved for a
> gzipped or deflated response, because they may contain an incorrect
> Content-Length header.
> This causes WARCExporter to generate "resource" (headerless) entries instead
> of "response" entries.
> While I can see why reporting the wrong Content-Encoding and Content-Length
> may be a bug, removing all the headers is not a fix.
> I am not submitting a patch yet since I'm not sure what the best fix is, but
> my guess is that the best patch is to remove those two header lines and store
> the rest of the headers. If there is no objection, I can submit a patch that
> does this. Otherwise, what would be a better fix?
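
To make the proposal concrete, here is a minimal sketch of "remove those two header lines and store the rest". It is not taken from the Nutch code base: the class name HeaderFilter, the method stripStaleHeaders, and the assumption that the headers are available as one CRLF-separated string are all hypothetical, chosen only for illustration.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;
import java.util.stream.Collectors;

// Hypothetical helper, not part of Nutch: drops the two headers that stop
// describing the body once it has been decompressed, and keeps everything else.
final class HeaderFilter {

    // Header names are compared case-insensitively, so store them lower-cased.
    private static final Set<String> STALE =
        new HashSet<>(Arrays.asList("content-encoding", "content-length"));

    // Assumes rawHeaders is the status line plus header lines, separated by CRLF.
    static String stripStaleHeaders(String rawHeaders) {
        return Arrays.stream(rawHeaders.split("\r\n"))
            .filter(line -> !isStale(line))
            .collect(Collectors.joining("\r\n"));
    }

    private static boolean isStale(String line) {
        int colon = line.indexOf(':');
        if (colon <= 0) {
            return false; // status line or malformed line: keep it unchanged
        }
        String name = line.substring(0, colon).trim().toLowerCase(Locale.ROOT);
        return STALE.contains(name);
    }
}
```

With this approach the stored headers would still carry fields such as Content-Type and Server, so an exporter would have enough to write a "response" record. Whether the two dropped headers should instead be rewritten to match the decompressed body is the open question raised above.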