Thanks, Benjamin. Your patch has been applied (trivial, so no FSF copyright assignment is required).
Regards, Tim

On Friday, 3 March 2017 09:00:57 CET Benjamin Esham wrote:
> Hello,
>
> When producing WARC files, Wget records the requested URI in the
> "WARC-Target-URI" field. I noticed that Wget encloses the value of this URI
> within <angle brackets> in blocks with "WARC-Type: request", but not those
> with types of "response", "resource", "revisit", or "metadata". Enclosing
> URIs within angle brackets is required by the spec [1]. I'm attaching a
> patch that adds the angle brackets for all block types.
>
> (Doing this for "request" blocks was the subject of bug 47281 [2], which was
> fixed almost exactly a year ago. My patch simply extends the use of the
> warc_write_header_uri function to the other appropriate places.)
>
> Here is a truncated example of the output from Wget 1.19.1:
>
> WARC/1.0
> WARC-Type: response
> WARC-Record-ID: <urn:uuid:95D7B77A-C019-4E91-9BBB-7526B68864F2>
> WARC-Warcinfo-ID: <urn:uuid:29F863DF-B273-498B-B91C-B50B2FD1BFCD>
> WARC-Concurrent-To: <urn:uuid:EDCAF84C-D7A6-43CE-AE78-AEE16D3B7F4B>
> WARC-Target-URI: https://www.gnu.org/software/wget/
>
> And from the patched version:
>
> WARC/1.0
> WARC-Type: response
> WARC-Record-ID: <urn:uuid:54F2170C-C3FA-4B05-A8B1-116466D92401>
> WARC-Warcinfo-ID: <urn:uuid:29BCF957-0D4D-4933-9CA3-F7FF2218D144>
> WARC-Concurrent-To: <urn:uuid:61FCAFA4-5DF9-4CC0-A6C6-BC233601EF1E>
> WARC-Target-URI: <https://www.gnu.org/software/wget/>
>
> Best regards,
>
> Benjamin
>
>
> [1] http://bibnum.bnf.fr/WARC/WARC_ISO_28500_version1_latestdraft.pdf
>
> [2] http://savannah.gnu.org/bugs/?47281
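
For readers following the thread, the change described in the quoted message can be illustrated with a minimal sketch. This is not Wget's actual code: format_uri_header is a hypothetical helper invented for illustration, and it only shows the idea of wrapping a URI-valued WARC header in angle brackets for every record type. It assumes WARC header lines end in CRLF.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical helper (not part of Wget): format one WARC header line
   whose value is a URI, enclosing the URI in angle brackets.
   Returns the number of characters written, or -1 if the buffer is
   too small to hold the whole line.  */
static int
format_uri_header (char *buf, size_t size, const char *name, const char *uri)
{
  /* Assumption: header lines are terminated by CRLF.  */
  int n = snprintf (buf, size, "%s: <%s>\r\n", name, uri);

  if (n < 0 || (size_t) n >= size)
    return -1;                  /* encoding error or truncated output */
  return n;
}
```

Called with the name "WARC-Target-URI" and the URI "https://www.gnu.org/software/wget/", the helper fills the buffer with the bracketed form shown in the patched output quoted above, regardless of the record's WARC-Type.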
