On Mon, 2011-04-04 at 08:16 -0400, Chad La Joie wrote:
> Then I guess I misunderstood what you were saying before.  Are you
> suggesting then that the server is transcoding the file when it serves
> it up?  And that the missing bytes actually go missing before HttpClient
> gets the response?


That is one possibility. Besides, I suspect that your application also
needs to convert the response content to a stream of characters in order
to parse the XML message. That conversion is another place where things
could go wrong.
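
To illustrate the second possibility, here is a minimal sketch using only
JDK classes. It assumes the raw bytes have already been obtained (for
example via EntityUtils.toByteArray, mentioned further down the thread) and
substitutes a hard-coded UTF-8 sample for them. The class name
RawBytesXmlDemo, its methods, and the sample document are hypothetical. It
contrasts handing the bytes straight to the XML parser, which can then
honor the encoding declaration, with decoding them to characters first
under a wrongly assumed charset.

```java
import java.io.ByteArrayInputStream;
import java.io.StringReader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class RawBytesXmlDemo {

    // Hypothetical stand-in for the raw response body: a UTF-8 encoded
    // XML document containing one non-ASCII character (U+00E9).
    public static byte[] sampleBytes() {
        String xml = "<?xml version=\"1.0\" encoding=\"UTF-8\"?><name>Jos\u00e9</name>";
        return xml.getBytes(StandardCharsets.UTF_8);
    }

    // Give the parser the undecoded bytes; it reads the XML declaration
    // and picks the character encoding itself.
    public static String parseFromBytes(byte[] raw) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(new ByteArrayInputStream(raw));
        return doc.getDocumentElement().getTextContent();
    }

    // Decode to characters first with a caller-supplied charset, then parse.
    // The parser now sees characters, so the declared encoding is not used.
    public static String parseFromString(byte[] raw, Charset assumed) throws Exception {
        String text = new String(raw, assumed);
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(new InputSource(new StringReader(text)));
        return doc.getDocumentElement().getTextContent();
    }

    public static void main(String[] args) throws Exception {
        byte[] raw = sampleBytes();
        String fromBytes = parseFromBytes(raw);
        String fromChars = parseFromString(raw, StandardCharsets.ISO_8859_1);
        // Compare against the intended text rather than printing non-ASCII
        // characters, since console encodings vary.
        System.out.println("bytes path intact: " + fromBytes.equals("Jos\u00e9"));
        System.out.println("chars path intact: " + fromChars.equals("Jos\u00e9"));
    }
}
```

In this sketch the byte path preserves the text, while the character path
turns the two UTF-8 bytes of the accented letter into two separate
characters. If the XML is the only consumer of the response, passing the
undecoded bytes to the parser avoids having to trust the charset reported
by the server.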

Hope this helps

Oleg

> On 4/4/11 8:08 AM, Oleg Kalnichevski wrote:
> > On Mon, 2011-04-04 at 07:34 -0400, Chad La Joie wrote:
> >> Yeah, unfortunately that didn't work.
> >>
> >> Is there any way to get the old v3 behavior that gives you access to the
> >> raw bytes of the entity before any sort of character decoding is done?
> >>
> >> I strongly suspect that very few web servers out there are properly
> >> configured to return the correct character encoding so this could
> >> definitely be an ongoing problem.
> >>
> > 
> > EntityUtils.toByteArray returns raw response content without attempting
> > to decode it. 
> > 
> > http://hc.apache.org/httpcomponents-core-ga/httpcore/xref/org/apache/http/util/EntityUtils.html#81
> > 
> > Oleg
> > 
> > 
> >> On 4/2/11 6:29 AM, Oleg Kalnichevski wrote:
> >>> On Sat, 2011-04-02 at 06:10 -0400, Chad La Joie wrote:
> >>>> Okay, that makes sense.
> >>>>
> >>>> To test this, is there a way I can force the content type on the client
> >>>> side, prior to requesting the response entity, via the response object?
> >>>>
> >>>
> >>> You can try adding an Accept and/or Accept-Charset header to the
> >>> request message and see if the origin server responds appropriately.
> >>>
> >>> However, generally you might be better off using some sort of content
> >>> detection algorithm, such as the one provided by the Apache Tika
> >>> toolkit. I suspect wget does exactly that.
> >>>
> >>> http://tika.apache.org/0.9/detection.html
> >>> http://tika.apache.org/
> >>>
> >>> Oleg
> >>>
> >>>
> >>>
> >>> ---------------------------------------------------------------------
> >>> To unsubscribe, e-mail: [email protected]
> >>> For additional commands, e-mail: [email protected]
> >>>
> >>>
> >>
> > 
> > 
> > 
> > 
> > 
> 



