Re: research and reflections on HTTP header ordering (was: Re: [PATCH] FIFO header order support in HTTP::Headers)

Gisle Aas Mon, 01 Feb 2010 13:09:56 -0800

On Tue, Jan 26, 2010 at 17:34, Mark Stosberg <m...@summersault.com> wrote:
>
> In 2008 there was some discussion about an option to preserve the
> ordering of HTTP headers. Part of that thread is quoted below.
>
> The idea resurfaced in another form with the release of
> HTTP::Headers::Fast, which provided a method to get back the
> headers unsorted. However, the motivation was different there--
> performance-- and the implementation was different as well. It returns
> headers in essentially random order instead of the order in which
> they were created or transmitted.
>
> I took an interest in the issue of HTTP header ordering and researched
> what several other Perl modules do with regard to this as well as Ruby's
> Rack. I published the result on my blog:
>
> http://mark.stosberg.com/blog/2010/01/generating-http-headers-sorted-or-unsorted.html
>
> The summary is that I support the option for unsorted headers in
> HTTP::Headers. Michael Greb made a good case for it, and the
> possibility for a performance improvement is attractive too.


I would prefer if there was a way to make the sorted headers as fast
as unsorted headers :-)

I would still like to see support for preserving the ordering of
headers at some point.

Instead of introducing the 'as_string_without_sort' method, could we
achieve the same effect with an 'order' argument to 'as_string'?  It
could take values like 'sorted'/'original'/'dontcare'.
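
A rough sketch of how such an 'order' argument might behave. This is
only an illustration: 'headers_as_string' is a hypothetical stand-alone
helper, not part of HTTP::Headers, and 'sorted' here means a plain
case-insensitive sort by field name, which is an assumption rather than
the ordering the module uses today.

```perl
use strict;
use warnings;
use 5.010;

# Hypothetical helper (not the HTTP::Headers API). Takes an array ref of
# [name, value] pairs in insertion order, plus 'order' and 'eol' options.
sub headers_as_string {
    my ($pairs, %opt) = @_;
    my $order = $opt{order} // 'sorted';
    my $eol   = $opt{eol}   // "\n";

    my @out;
    if ($order eq 'sorted') {
        # Case-insensitive sort by name; ties keep insertion order.
        my @idx = sort {
            lc($pairs->[$a][0]) cmp lc($pairs->[$b][0]) || $a <=> $b
        } 0 .. $#$pairs;
        @out = @{$pairs}[@idx];
    }
    elsif ($order eq 'original' || $order eq 'dontcare') {
        # 'dontcare' may use whatever is cheapest; in this sketch that
        # happens to be insertion order as well.
        @out = @$pairs;
    }
    else {
        die "unknown order '$order'";
    }
    return join '', map { "$_->[0]: $_->[1]$eol" } @out;
}

my $pairs = [ [ 'Server', 'Fool/1.0' ], [ 'Content-Type', 'text/plain' ] ];
print headers_as_string($pairs, order => 'original');
print headers_as_string($pairs, order => 'sorted');
```

A single option keeps the method count down and leaves room for more
orderings later without another method name.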

--Gisle


> On Sun, 7 Sep 2008 15:53:46 +0200
> "Gisle Aas" <gi...@aas.no> wrote:
>
>> On Sun, Sep 7, 2008 at 1:49 PM, Michael Greb <mg...@linode.com> wrote:
>> > On Sep 5, 2008, at 7:23 PM, Gisle Aas wrote:
>> >>
>> >> True; and in this case we need to define what happens when fields are
>> >> modified with 'push', 'set' or 'init' and 'remove' as that's the API
>> >> that modifies stuff.  Let me suggest the following definition of the
>> >> behaviour:
>> >>
>> >> - 'push' always appends the field at the end of all headers.  Multiple
>> >> occurrences of a field name do not have to be consecutive.
>> >>
>> >> - 'init' either does nothing or it works like 'push'.
>> >>
>> >> - 'remove' will always remove all occurrences of a field.
>> >>
>> >> - 'set' will work like 'push' if no other occurrence of the field exists.
>> >>
>> >> - 'set' will update the first occurrence if the field exists (and
>> >> remove all other occurrences).  If multiple field values are provided
>> >> with 'set' they are basically all injected at the location of the
>> >> first existing value.
>> >
>> >
>> > On Sep 6, 2008 at 2:57 AM, Gisle Aas wrote:
>> >>
>> >> I think it makes sense to be able to enable them separately.
>> >> Suggested interface:
>> >>
>> >>  $h->scan(\&cb, original_order => 1, original_case => 1);
>> >>  $h->as_string(eol => "\n", original_order => 1, original_case => 1);
>> >
>> > The attached patch uses the interface above and works towards the behavior
>> > outlined in the first message.  Due to the headers being stored as a hash,
>> > pushing does not currently preserve previous values: second and subsequent
>> > pushes of the same header will overwrite the previous value.  Supporting
>> > this would require a change in how the headers are stored within the
>> > module.
>> >  Your thoughts?
>>
>> I think it's better to just use your original approach and just keep
>> the representation like it used to be, with the addition of an array that
>> records the original field names and their order.  This should lead to
>> a smaller patch as the only thing that needs to change is the code that
>> sets headers and the scan method.  I also like header lookups to be
>> efficient and the representation compact.
>>
>> > Server: Fool/1.0
>> > content-encoding: gzip
>> > Content-Type: text/plain; charset="UTF-8"
>> > Content-Encoding: base64
>> > Date: Fri Sep  5 10:24:37 CEST 2008
>> >
>> > Would be stored as (assuming push_header):
>>
>> My suggestion would be:
>>
>> bless {
>>     "content-encoding" => ["gzip", "base64"],
>>     "content-type" => "text/plain; charset=\"UTF-8\"",
>>     "date" => "Fri Sep  5 10:24:37 CEST 2008",
>>     "server" => "Fool/1.0",
>>     "::original_fields" => [
>>         "Server",
>>         "content-encoding",
>>         "Content-Type",
>>         "Content-Encoding",
>>         "Date",
>>     ],
>> }, "HTTP::Headers";
>>
>> The invariant that needs to hold is that there is the same number of
>> elements in {"::original_fields"} as there are values for all the
>> other keys.
>>
>> Pushing a value is trivial; only change from what we have now is
>> appending the original field name to {"::original_fields"}.
>>
>> The only state modification operation that becomes more complex is
>> setting a header value.  It has to:
>>
>>   - update the values in the hash as before
>>   - locate the first occurrence of the field name in
>> {"::original_fields"}  => $idx
>>   - remove all other occurrences of the field name
>>   - splice(@{"::original_fields"}, $idx, 1, ($orig_field_name) x
>> $numbers_of_values_set);
>>
>> When 'scan' wants to iterate over the original headers it would have
>> to keep an index into the values array for each field that repeats.
>>
>> A more compact representation could be to store {"::original_fields"}
>> as a ":"-separated string; but we can think about that optimization
>> later.
>>
>> --Gisle
>>
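
For reference, the 2008 proposal quoted above could be sketched roughly
as follows.  This is a minimal illustration under stated assumptions,
not the HTTP::Headers implementation: the package name 'OrderedHeaders'
and its method names are hypothetical, and values are always kept in an
array ref for simplicity (the quoted layout stores single values as
plain scalars).

```perl
use strict;
use warnings;
use 5.010;

package OrderedHeaders;    # hypothetical name, not part of libwww-perl

sub new {
    my $class = shift;
    return bless { '::original_fields' => [] }, $class;
}

# push: store the value under the lowercased key and record the
# original spelling at the end of the order list.
sub push_header {
    my ($self, $field, $value) = @_;
    push @{ $self->{ lc($field) } }, $value;
    push @{ $self->{'::original_fields'} }, $field;
}

# set: replace all values, keeping the position of the first occurrence
# and dropping every other occurrence from the order list.
sub set_header {
    my ($self, $field, @values) = @_;
    my $key = lc($field);
    my (@kept, $idx);
    for my $name (@{ $self->{'::original_fields'} }) {
        if (lc($name) eq $key) {
            $idx //= scalar @kept;    # slot of the first occurrence
            next;
        }
        push @kept, $name;
    }
    $idx //= scalar @kept;            # unknown field: behave like push
    splice @kept, $idx, 0, ($field) x scalar(@values);
    $self->{$key} = [@values];
    $self->{'::original_fields'} = \@kept;
}

# scan: walk the order list, keeping a per-field index into the values.
sub scan {
    my ($self, $cb) = @_;
    my %next;
    for my $name (@{ $self->{'::original_fields'} }) {
        my $key = lc($name);
        $cb->($name, $self->{$key}[ $next{$key}++ ]);
    }
}

package main;

my $h = OrderedHeaders->new;
$h->push_header('Server',           'Fool/1.0');
$h->push_header('content-encoding', 'gzip');
$h->push_header('Content-Type',     'text/plain; charset="UTF-8"');
$h->push_header('Content-Encoding', 'base64');
$h->scan(sub { print "$_[0]: $_[1]\n" });
```

Both 'push_header' and 'set_header' keep the invariant that the order
list has exactly one entry per stored value.
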
>
>
> --
>  . . . . . . . . . . . . . . . . . . . . . . . . . . .
>   Mark Stosberg            Principal Developer
>   m...@summersault.com     Summersault, LLC
>   765-939-9301 ext 202     database driven websites
>  . . . . . http://www.summersault.com/ . . . . . . . .
>
>
>
