On Apr 2, 2011, at 3:29am, Oleg Kalnichevski wrote:
> On Sat, 2011-04-02 at 06:10 -0400, Chad La Joie wrote:
>> Okay, that makes sense.
>>
>> To test this, is there a way I can force the content type on the client
>> side, prior to requesting the response entity, via the response object?
>>
>
> You can try adding Accept and / or Accept-Charset header to the request
> message and see if the origin server responds appropriately.
>
> However, you might generally be better off using some sort of content
> detection algorithm, such as the one provided by the Apache Tika toolkit. I suspect
> wget does exactly that.

Tika tries to follow the recommendations of RFC 3023:

    If an application/xml entity is received where the charset
    parameter is omitted, no information is being provided about the
    charset by the MIME Content-Type header. Conforming XML
    processors MUST follow the requirements in section 4.3.3 of [XML]
    that directly address this contingency.

Which means it will look for a byte-order mark and an encoding declaration
inside the XML content.
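
To make that concrete, here is a rough Python sketch of the kind of check involved. This is not Tika's actual code. The function name sniff_xml_encoding and the 1024-byte look-ahead are my own assumptions for illustration. It only handles a byte-order mark or an ASCII-compatible XML declaration, and it falls back to UTF-8, which is what the XML spec assumes when neither is present.

```python
import codecs
import re

# Byte-order marks to test, with the 4-byte UTF-32 marks first so that
# UTF-32LE (FF FE 00 00) is not mistaken for UTF-16LE (FF FE).
_BOMS = [
    (codecs.BOM_UTF32_BE, "UTF-32BE"),
    (codecs.BOM_UTF32_LE, "UTF-32LE"),
    (codecs.BOM_UTF8, "UTF-8"),
    (codecs.BOM_UTF16_BE, "UTF-16BE"),
    (codecs.BOM_UTF16_LE, "UTF-16LE"),
]

# Matches the encoding pseudo-attribute of an XML declaration at the very
# start of the data. Only works when the declaration is ASCII-compatible.
_XML_DECL_ENCODING = re.compile(
    rb"^<\?xml[^>]*?\sencoding\s*=\s*[\"']([A-Za-z][A-Za-z0-9._-]*)[\"']"
)


def sniff_xml_encoding(data: bytes, default: str = "UTF-8") -> str:
    """Guess the encoding of an XML entity from its leading bytes.

    Hypothetical helper: checks for a BOM, then for an encoding
    declaration, then returns the default.
    """
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name
    # Assumption: the declaration, if any, fits in the first 1024 bytes.
    match = _XML_DECL_ENCODING.match(data[:1024])
    if match:
        return match.group(1).decode("ascii")
    return default
```

You would call it on the first chunk of the response body, for example sniff_xml_encoding(body[:1024]), and then decode the whole entity with the returned name. A real detector also has to cope with UTF-16 or EBCDIC content that carries no BOM, which this sketch ignores.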
-- Ken
--------------------------
Ken Krugler
+1 530-210-6378
http://bixolabs.com
e l a s t i c w e b m i n i n g