On Tue, 1 Sep 2009 17:54:42 +0200
Oleg Kalnichevski wrote:
> This is the expected behaviour. You can override it, though, by using
> this workaround:
>
> http://svn.apache.org/repos/asf/httpcomponents/oac.hc3x/trunk/src/contrib/org/apache/commons/httpclient/contrib/ssl/HostConfigurationWithStick
Gerald Turner wrote:
Hello HttpClient Users List, I have spent the last couple days upgrading
a dozen applications from HttpClient 3.1 to 4.0.
First off, I must say that I'm very pleased that
MultiThreadedHttpConnectionManager (now ThreadSafeClientConnManager) is
using synchronization rather tha
Hello HttpClient Users List, I have spent the last couple days upgrading
a dozen applications from HttpClient 3.1 to 4.0.
First off, I must say that I'm very pleased that
MultiThreadedHttpConnectionManager (now ThreadSafeClientConnManager) is
using synchronization rather than thread interrupts. B
Hi Magnus,
I used curl to grab the file, and the bytes at 0x1845...0x1847 are
0xC3 0xA5, which is valid UTF-8 for the U+00E5 code point (Latin small
letter a with ring above).
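
To illustrate the point about those two bytes, here is a minimal sketch using only the Java standard library. The class and method names (DecodeDemo, decodeUtf8, decodeLatin1) are made up for this example and are not part of HttpClient. It decodes the same byte pair once as UTF-8 and once as ISO-8859-1 and prints the resulting code points.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; not part of HttpClient or HttpCore.
class DecodeDemo {

    // Decode the raw bytes as UTF-8.
    static String decodeUtf8(byte[] raw) {
        return new String(raw, StandardCharsets.UTF_8);
    }

    // Decode the same raw bytes as ISO-8859-1 (one char per byte).
    static String decodeLatin1(byte[] raw) {
        return new String(raw, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // The two bytes reported in the message above.
        byte[] raw = {(byte) 0xC3, (byte) 0xA5};

        String utf8 = decodeUtf8(raw);
        String latin1 = decodeLatin1(raw);

        // Print code points in hex so the output does not depend on
        // the console encoding.
        for (char c : utf8.toCharArray()) {
            System.out.printf("UTF-8 decode:      U+%04X%n", (int) c);
        }
        for (char c : latin1.toCharArray()) {
            System.out.printf("ISO-8859-1 decode: U+%04X%n", (int) c);
        }
    }
}
```

Decoded as UTF-8 the pair yields a single character; decoded as ISO-8859-1 it yields two, which is the kind of garbling discussed in this thread.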
I also used Bixo (http://bixo.101tec.com) to crawl the same page, and
wound up with the same raw data. Bixo uses H
On Wed, Sep 02, 2009 at 11:54:42AM -0400, NBW wrote:
> No use of things like InputStreamReaders then I take it.
>
InputStreamReader is used by one utility method in HttpCore
(EntityUtils#toString()). However, it uses an InputStreamReader constructor
that explicitly takes the charset name to be
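
As a rough illustration of reading a stream with an explicitly named charset, here is a sketch in plain Java. It is not the HttpCore code; the class and method names (ExplicitCharsetRead, readAll) are assumptions made for this example. Because the charset name is passed to the InputStreamReader constructor, the result should not depend on the platform default or on -Dfile.encoding.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;

// Hypothetical helper; not the EntityUtils implementation.
class ExplicitCharsetRead {

    // Read the whole stream into a String using the named charset.
    static String readAll(InputStream in, String charsetName) {
        // The two-argument constructor takes the charset name explicitly,
        // so the JVM default charset is never consulted.
        try (Reader reader = new InputStreamReader(in, charsetName)) {
            StringBuilder sb = new StringBuilder();
            char[] buf = new char[1024];
            int n;
            while ((n = reader.read(buf)) != -1) {
                sb.append(buf, 0, n);
            }
            return sb.toString();
        } catch (IOException e) {
            // Also covers UnsupportedEncodingException for unknown names.
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] raw = {(byte) 0xC3, (byte) 0xA5};
        String text = readAll(new ByteArrayInputStream(raw), "UTF-8");
        System.out.printf("chars read: %d, first: U+%04X%n",
                text.length(), (int) text.charAt(0));
    }
}
```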
No use of things like InputStreamReaders then I take it.
On Wed, Sep 2, 2009 at 11:41 AM, Oleg Kalnichevski wrote:
> On Wed, Sep 02, 2009 at 11:39:35AM -0400, NBW wrote:
> > What about passing -Dfile.encoding=utf-8?
> >
>
> HttpClient does not use system properties (per design)
>
> Oleg
>
>
>
On Wed, Sep 02, 2009 at 11:39:35AM -0400, NBW wrote:
> What about passing -Dfile.encoding=utf-8?
>
HttpClient does not use system properties (per design)
Oleg
> On Wed, Sep 2, 2009 at 10:58 AM, Magnus Olstad Hansen wrote:
>
> >
> > > But when you call httpclient.execute(httpget, responseHandl
What about passing -Dfile.encoding=utf-8?
On Wed, Sep 2, 2009 at 10:58 AM, Magnus Olstad Hansen wrote:
>
> > But when you call httpclient.execute(httpget, responseHandler), the
> > BasicResponseHandler will call EntityUtils.toString, and that in turn
> > uses ISO-8859-1 as its default charset whe
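
The fallback behaviour described in the quoted text can be sketched roughly as follows. This is plain Java written for illustration, not the HttpClient or HttpCore source; the class and method names (CharsetFallback, charsetOf) and the sample header values are assumptions. It takes the charset from a Content-Type header value when one is declared and otherwise returns a caller-supplied fallback such as ISO-8859-1.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper; not part of HttpClient or HttpCore.
class CharsetFallback {

    // Return the charset named in a Content-Type value such as
    // "text/html; charset=UTF-8", or the fallback if none is usable.
    static Charset charsetOf(String contentType, Charset fallback) {
        if (contentType == null) {
            return fallback;
        }
        for (String part : contentType.split(";")) {
            String p = part.trim();
            // Case-insensitive match on the "charset=" parameter name.
            if (p.regionMatches(true, 0, "charset=", 0, 8)) {
                String name = p.substring(8).trim().replace("\"", "");
                try {
                    return Charset.forName(name);
                } catch (IllegalArgumentException e) {
                    // Illegal or unsupported charset name: use the fallback.
                    return fallback;
                }
            }
        }
        return fallback;
    }

    public static void main(String[] args) {
        Charset latin1 = StandardCharsets.ISO_8859_1;
        System.out.println(charsetOf("text/html; charset=UTF-8", latin1));
        System.out.println(charsetOf("text/html", latin1));
    }
}
```

A page that is really UTF-8 but arrives without a declared charset would take the fallback path here, which matches the symptom reported in this thread.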
Thanks for the reply, Ken.
>
> The basic problem is that determining the character set of a web page
> is complex, and not something that HttpClient is designed to handle.
>
> If you check out (for example) the Nutch source, you'll see that it
> has a multi-step process, where it uses the Content-t
Hi Magnus,
I don't know exactly where you can plug this in, but this project helped
me a lot with parsing non-ISO charsets:
http://jchardet.sourceforge.net/
hope that helps
Florent
Pingwy
27, rue des arènes
49100 Angers
Magnus Olstad Hansen wrote:
Hello,
I'm using HttpClient 4.0 to download a
Hi Magnus,
On Sep 2, 2009, at 1:22am, Magnus Olstad Hansen wrote:
Hello,
I'm using HttpClient 4.0 to download a webpage the same way as shown
in one of the examples. This is my method to return a webpage as a
string:
protected static String leechUrl(String url) throws
IOException
On Wed, Sep 02, 2009 at 10:22:16AM +0200, Magnus Olstad Hansen wrote:
> Hello,
>
> I'm using HttpClient 4.0 to download a webpage the same way as shown in
> one of the examples. This is my method to return a webpage as a string:
>
> protected static String leechUrl(String url) throws IOExc
Hello,
I'm using HttpClient 4.0 to download a webpage the same way as shown in
one of the examples. This is my method to return a webpage as a string:
protected static String leechUrl(String url) throws IOException {
HttpClient httpclient = new DefaultHttpClient();
13 matches