Yeah, I've overwritten the Content-Length header with the length of the
decompressed content byte array.
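For anyone reading along, here is a rough sketch of that idea, not the actual Nutch patch. The class ContentLengthRewrite and its methods are hypothetical names made up for illustration, the headers are modelled as a plain Map rather than Nutch's own metadata classes, and gzip is assumed as the content encoding. It needs Java 9+ for readAllBytes().

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper class; not part of Nutch.
public class ContentLengthRewrite {

  /** Decompresses a gzip-encoded response body into a byte array. */
  static byte[] gunzip(byte[] compressed) throws IOException {
    try (GZIPInputStream in =
        new GZIPInputStream(new ByteArrayInputStream(compressed))) {
      return in.readAllBytes();
    }
  }

  /** Compresses bytes with gzip; only used here to build sample input. */
  static byte[] gzip(byte[] raw) throws IOException {
    ByteArrayOutputStream bos = new ByteArrayOutputStream();
    try (GZIPOutputStream out = new GZIPOutputStream(bos)) {
      out.write(raw);
    }
    return bos.toByteArray();
  }

  /**
   * Returns a copy of the headers in which Content-Length reflects the
   * length of the given (decompressed) content. Any existing
   * Content-Length entry is dropped, whatever its letter case.
   */
  static Map<String, String> withContentLength(
      Map<String, String> headers, byte[] content) {
    Map<String, String> copy = new LinkedHashMap<>();
    for (Map.Entry<String, String> e : headers.entrySet()) {
      if (!e.getKey().equalsIgnoreCase("Content-Length")) {
        copy.put(e.getKey(), e.getValue());
      }
    }
    copy.put("Content-Length", String.valueOf(content.length));
    return copy;
  }

  public static void main(String[] args) throws IOException {
    byte[] raw = "<html>hello</html>".getBytes(StandardCharsets.UTF_8);
    byte[] onTheWire = gzip(raw);

    // Headers as the server sent them: the length of the compressed body.
    Map<String, String> headers = new LinkedHashMap<>();
    headers.put("Content-Type", "text/html");
    headers.put("Content-Length", String.valueOf(onTheWire.length));

    byte[] content = gunzip(onTheWire);
    Map<String, String> rewritten = withContentLength(headers, content);
    System.out.println(rewritten.get("Content-Length")
        + " == " + content.length);
  }
}
```

The sketch only touches Content-Length; whether other headers need the same treatment depends on what the WARC consumers expect.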
Luckily, our clients are modest in what they demand from their WARCs.
Many thanks,
Markus
On Wed, 31 Jul 2024 at 14:22, Sebastian Nagel wrote:
> Hi Markus,
>
> >> And i do not agree
Hi Markus,
>> And i do not agree with it. Almost all content is compressed now, so this
>> will never work. We need the headers and response code stored for WARC
>> export and do not care about an incorrect length header.
No, don't do this. You need to rewrite the header. There are many WARC rea
Aah, thanks Lewis. We're still on 1.15; glad to see this was fixed already,
and that I would have patched it in exactly the same way.
Thanks!
On Tue, 30 Jul 2024 at 18:42, lewis john mcgibbney wrote:
> Hi Markus,
>
> Which version of Nutch are you referring to? I'm not seeing this exact
> code in
Hi Markus,
Which version of Nutch are you referring to? I'm not seeing this exact
code in master branch.
Is this roughly the code you are referencing?
https://github.com/apache/nutch/blob/master/src/plugin/protocol-http/src/java/org/apache/nutch/protocol/http/HttpResponse.java#L304-L318
Thanks
le