On Mon, 2011-04-04 at 08:16 -0400, Chad La Joie wrote:
> Then I guess I misunderstood what you were saying before. Are you
> suggesting then that the server is transcoding the file when it serves
> it up? And that the missing bytes actually go missing before HttpClient
> gets the response?

That is one possibility. Besides, I suspect that your application also
needs to convert the response content to a stream of characters in order
to be able to parse the XML message. That is another place where things
can go screwy.

Hope this helps

Oleg

> On 4/4/11 8:08 AM, Oleg Kalnichevski wrote:
> > On Mon, 2011-04-04 at 07:34 -0400, Chad La Joie wrote:
> >> Yeah, unfortunately that didn't work.
> >>
> >> Is there any way to get the old v3 behavior that gives you access to the
> >> raw bytes of the entity before any sort of character decoding is done?
> >>
> >> I strongly suspect that very few web servers out there are properly
> >> configured to return the correct character encoding so this could
> >> definitely be an ongoing problem.
> >>
> >
> > EntityUtils.toByteArray returns raw response content without attempting
> > to decode it.
> >
> > http://hc.apache.org/httpcomponents-core-ga/httpcore/xref/org/apache/http/util/EntityUtils.html#81
> >
> > Oleg
> >
> >
> >> On 4/2/11 6:29 AM, Oleg Kalnichevski wrote:
> >>> On Sat, 2011-04-02 at 06:10 -0400, Chad La Joie wrote:
> >>>> Okay, that makes sense.
> >>>>
> >>>> To test this, is there a way I can force the content type on the client
> >>>> side, prior to requesting the response entity, via the response object?
> >>>>
> >>>
> >>> You can try adding an Accept and/or Accept-Charset header to the request
> >>> message and see if the origin server responds appropriately.
> >>>
> >>> However, generally you might be better off using some sort of content
> >>> detection algorithm such as that provided by the Apache Tika toolkit.
> >>> I suspect wget does exactly that.
> >>>
> >>> http://tika.apache.org/0.9/detection.html
> >>> http://tika.apache.org/
> >>>
> >>> Oleg

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
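
To illustrate the point about character conversion, here is a minimal sketch using only the JDK. It assumes the raw response bytes are already in hand (for example from EntityUtils.toByteArray) and compares two ways of parsing them: handing the bytes to the XML parser directly, and decoding them to a String first with a charset that may be wrong. The class name, method names and sample message are hypothetical and not taken from the thread.

```java
import java.io.ByteArrayInputStream;
import java.io.StringReader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical demo class; not part of HttpClient or the original thread.
class RawBytesXmlDemo {

    private static DocumentBuilder newBuilder() throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder();
    }

    // Parse straight from the raw bytes. The parser works out the encoding
    // itself from the XML declaration, so no HTTP header is trusted.
    static String textFromRawBytes(byte[] raw) throws Exception {
        Document doc = newBuilder().parse(new ByteArrayInputStream(raw));
        return doc.getDocumentElement().getTextContent();
    }

    // Decode to characters first, as code does when it trusts a charset
    // reported elsewhere. Given a character stream, the parser no longer
    // uses the encoding named in the XML declaration.
    static String textViaString(byte[] raw, Charset assumed) throws Exception {
        String decoded = new String(raw, assumed);
        Document doc = newBuilder().parse(new InputSource(new StringReader(decoded)));
        return doc.getDocumentElement().getTextContent();
    }

    // Stand-in for a response body: UTF-8 XML with one non-ASCII character.
    static byte[] sampleBody() {
        String xml = "<?xml version=\"1.0\" encoding=\"UTF-8\"?><msg>caf\u00e9</msg>";
        return xml.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws Exception {
        byte[] raw = sampleBody();
        // Correct: four characters, "caf" plus e-acute.
        System.out.println(textFromRawBytes(raw).length()); // prints 4
        // Wrong charset assumed: the two UTF-8 bytes become two characters.
        System.out.println(textViaString(raw, StandardCharsets.ISO_8859_1).length()); // prints 5
    }
}
```

If the text comes out different between the two paths, the damage is happening in the client-side decoding step rather than on the server.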
