I am forwarding your request to wikitech-l, in the hope that more people
there can comment on this issue.

For those who did not follow the entire thread: the user does not send
an Accept-Encoding: gzip header, but nevertheless gets a gzipped
response.
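
A client can check for this symptom directly. The following is a minimal
Python sketch, not taken from the thread: it treats a body as gzip when it
starts with the two gzip magic bytes (0x1f, 0x8b) and decompresses it with
the standard gzip module. The names looks_gzipped and body_as_text are made
up for illustration, and UTF-8 is an assumed charset.

```python
import gzip

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream


def looks_gzipped(body: bytes) -> bool:
    """Return True if the raw response body starts with the gzip magic bytes."""
    return body[:2] == GZIP_MAGIC


def body_as_text(body: bytes, charset: str = "utf-8") -> str:
    """Decompress the body if it looks gzipped, then decode it to text.

    The charset default is an assumption; a real client would read it
    from the Content-Type header.
    """
    if looks_gzipped(body):
        body = gzip.decompress(body)
    return body.decode(charset)
```

The check works on the raw bytes alone, so it applies even when the response
headers are not available to the calling code.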

On Thu, Nov 25, 2010 at 8:19 PM, Anand Ramanathan <rcan...@gmail.com> wrote:
> Bryan: No, I didn't set the Accept-Encoding header explicitly - I found the
> following related issue on Bugzilla: 7098
>
> Andrew: Yes, thanks. I see that curl can support this, and so can open-uri.
>
> I wanted to clarify if I should be handling this in the client:
> As per HTTP 1.1 (section 14.3), for non-browser user agents, if no
> Accept-Encoding is explicitly set, the response should be the document
> itself if the server supports returning the document itself (identity).
> However, if the server is unable to return the document itself, it is
> preferable to return gzip or compressed content.
> I think this issue happens whenever I hit a cache node that has the gzip
> version cached, but not the identity version. From a server standpoint, it
> seems like the right behavior. So, it is up to the client, which needs to do
> one of the following:
> a) Set Accept-Encoding to mark gzip as not acceptable and identity as
> acceptable. In this case, a cache node containing only the gzip-encoded
> document will miss, and eventually a node that contains the identity version
> will return it.
> (This is a leap of faith, as I cannot target such a cache node explicitly.
> If a node has both gzip and identity content, and is responding with gzip
> for a request with no explicit Accept-Encoding set, then it violates the
> spec and is a bug. Can anyone comment on this?)
> b) Set Accept-Encoding to accept gzip or identity (or leave it unset), and
> on the client, if Content-Encoding is gzip, unzip it explicitly.
> I am fine with either of these approaches. Is this an accurate assessment of
> the issue and options?
> Thanks
> Anand
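
The two options can be sketched as follows. This is a hedged Python
illustration using only the standard library, not code from the thread:
identity_only_request and decode_body are hypothetical helper names, and
whether the cache nodes honour "gzip;q=0" is an assumption that nothing in
this thread confirms.

```python
import gzip
import urllib.request


def identity_only_request(url: str) -> urllib.request.Request:
    """Option (a): declare gzip unacceptable and identity acceptable."""
    return urllib.request.Request(
        url, headers={"Accept-Encoding": "identity, gzip;q=0"}
    )


def decode_body(content_encoding, body: bytes) -> bytes:
    """Option (b): gunzip only when the response says it is gzip-encoded.

    content_encoding is the value of the Content-Encoding response header,
    or None when the header is absent.
    """
    if content_encoding and content_encoding.strip().lower() == "gzip":
        return gzip.decompress(body)
    return body
```

With urllib.request.urlopen, the first argument to decode_body would be
response.headers.get("Content-Encoding") and the second response.read().
Option (b) does not depend on which cache node answers, which makes it the
more defensive of the two.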
>
> On Thu, Nov 25, 2010 at 4:23 AM, Andrew Dunbar <hippytr...@gmail.com> wrote:
>>
>> On 25 November 2010 19:41, Anand Ramanathan <rcan...@gmail.com> wrote:
>> > Yes, confirmed that they are. It is gzip - what is the best way to deal
>> > with this? Is this a bug that is tracked, or is this something worth
>> > handling in client code (checking if gzip and manually unzipping)?
>> > Thanks
>> > Anand
>>
>> Curl can definitely handle gzipped responses. Here's something about
>> it from a very quick Google search:
>> http://curl.haxx.se/mail/curlphp-2004-01/0043.html
>>
>> Andrew Dunbar (hippietrail)
>>
>>
>> > On Thu, Nov 25, 2010 at 12:12 AM, Bryan Tong Minh <bryan.tongm...@gmail.com> wrote:
>> >>
>> >> On Thu, Nov 25, 2010 at 9:02 AM, Anand Ramanathan <rcan...@gmail.com> wrote:
>> >> > OK, I got it again: Here is my curl output (headers + first few
>> >> > characters) for the garbled India wikipedia page (and the proper China
>> >> > wikipedia page for comparison below that):
>> >>
>> >> Can you verify that the first two characters are 0x1f and 0x8b
>> >> respectively? Looks like gzip.
>> >>
>> >> _______________________________________________
>> >> Mediawiki-api mailing list
>> >> Mediawiki-api@lists.wikimedia.org
>> >> https://lists.wikimedia.org/mailman/listinfo/mediawiki-api

