Hello,
I would like to migrate from HttpClient 3.x to HttpClient 4.x, but I am
having difficulty handling redirects. The code works properly under Commons
HttpClient but breaks when migrated to HttpComponents Client. Some of the
links get undesirable redirects, but when I set
"http.protocol.handle-redirects" to 'false' I get no result at all for
some of the links.
Commons HttpClient 3.x code:
    private static HttpClient httpClient = null;
    private static MultiThreadedHttpConnectionManager connectionManager = null;
    private static final long MAX_CONNECTION_IDLE_TIME = 60000; // milliseconds

    static {
        //HttpURLConnection.setFollowRedirects(true);
        CookieManager manager = new CookieManager();
        manager.setCookiePolicy(CookiePolicy.ACCEPT_ALL);
        CookieHandler.setDefault(manager);
        connectionManager = new MultiThreadedHttpConnectionManager();
        connectionManager.getParams().setDefaultMaxConnectionsPerHost(1000);
        // will need to set from properties file
        connectionManager.getParams().setMaxTotalConnections(1000);
        httpClient = new HttpClient(connectionManager);
    }

    /*
     * Retrieve HTML
     */
    public String fetchURL(String url) throws IOException {
        if (StringUtils.isEmpty(url))
            return null;
        GetMethod getMethod = new GetMethod(url);
        //HttpClient httpClient = new HttpClient();
        //configureMethod(getMethod);
        //ObjectInputStream oin = null;
        InputStream in = null;
        int code = -1;
        String html = "";
        String lastModified = null;
        try {
            code = httpClient.executeMethod(getMethod);
            in = getMethod.getResponseBodyAsStream();
            //oin = new ObjectInputStream(in);
            //html = getMethod.getResponseBodyAsString();
            html = CharStreams.toString(new InputStreamReader(in));
        }
        catch (Exception except) {
        }
        finally {
            try {
                //oin.close();
                in.close();
            }
            catch (Exception except) {}
            getMethod.releaseConnection();
            connectionManager.closeIdleConnections(MAX_CONNECTION_IDLE_TIME);
        }
        // 4xx and 5xx are errors; 400 itself is "Bad Request"
        if (code < 400) {
            return html.replaceAll("\\s+", " ");
        } else {
            // IOException, to match the declared throws clause
            throw new IOException("URL: " + url + " returned response code " + code);
        }
    }
HttpComponents Client 4.x code:
    private static HttpClient httpClient = null;
    private static HttpParams params = null;
    //private static MultiThreadedHttpConnectionManager connectionManager = null;
    private static ThreadSafeClientConnManager connectionManager = null;
    private static final int MAX_CONNECTION_IDLE_TIME = 60000; // milliseconds

    static {
        //HttpURLConnection.setFollowRedirects(true);
        CookieManager manager = new CookieManager();
        manager.setCookiePolicy(CookiePolicy.ACCEPT_ALL);
        CookieHandler.setDefault(manager);
        connectionManager = new ThreadSafeClientConnManager();
        // will need to set from properties file
        connectionManager.setDefaultMaxPerRoute(1000);
        connectionManager.setMaxTotal(1000);
        httpClient = new DefaultHttpClient(connectionManager);
        // HTTP parameters stores header etc.
        params = new BasicHttpParams();
        params.setParameter("http.protocol.handle-redirects", false);
    }

    /*
     * Retrieve HTML
     */
    public String fetchURL(String url) throws IOException {
        if (StringUtils.isEmpty(url))
            return null;
        InputStream in = null;
        //int code = -1;
        String html = "";
        // Prepare a request object
        HttpGet httpget = new HttpGet(url);
        httpget.setParams(params);
        // Execute the request
        HttpResponse response = httpClient.execute(httpget);
        // The response status
        //System.out.println(response.getStatusLine());
        int code = response.getStatusLine().getStatusCode();
        // Get hold of the response entity
        HttpEntity entity = response.getEntity();
        // If the response does not enclose an entity, there is no need
        // to worry about connection release
        if (entity != null) {
            try {
                //code = httpClient.executeMethod(getMethod);
                //in = getMethod.getResponseBodyAsStream();
                in = entity.getContent();
                html = CharStreams.toString(new InputStreamReader(in));
            }
            catch (Exception except) {
                // IOException, to match the declared throws clause
                throw new IOException("URL: " + url + " returned response code " + code);
            }
            finally {
                try {
                    //oin.close();
                    in.close();
                }
                catch (Exception except) {}
                //getMethod.releaseConnection();
                connectionManager.closeIdleConnections(MAX_CONNECTION_IDLE_TIME,
                        TimeUnit.MILLISECONDS);
                connectionManager.closeExpiredConnections();
            }
        }
        // 4xx and 5xx are errors; 400 itself is "Bad Request"
        if (code < 400) {
            return html;
        } else {
            throw new IOException("URL: " + url + " returned response code " + code);
        }
    }
I don't want redirects, but under HttpClient 4.x, if I enable them, I get
some that are undesirable, e.g. http://www.walmart.com/ =>
http://mobile.walmart.com/. Under HttpClient 3.x no such redirect
happens.
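To make "undesirable" concrete, here is a rough sketch of the kind of check
I have in mind. It is plain Java and does not use HttpClient at all; the
class name RedirectFilter and the method isDesirable are made up for
illustration, and the "mobile." / "m." host prefixes are only my assumption
about what a mobile site looks like. I assume a check like this could be
called from wherever HttpClient 4.x decides to follow a redirect, but I have
not worked out that part.

```java
import java.net.URI;
import java.util.Locale;

// Hypothetical helper, not part of HttpClient: decides whether a redirect
// target is one I would be willing to follow.
final class RedirectFilter {

    private RedirectFilter() {}

    // Returns false for redirects that land on a "mobile." or "m." host,
    // e.g. http://www.walmart.com/ => http://mobile.walmart.com/,
    // and true for everything else that resolves to a host.
    static boolean isDesirable(URI original, URI target) {
        // A Location value may be relative; resolve it against the original URL.
        URI resolved = original.resolve(target);
        String host = resolved.getHost();
        if (host == null) {
            return false; // cannot tell where this goes, so do not follow
        }
        String lower = host.toLowerCase(Locale.ROOT);
        // Assumption: mobile sites are recognisable by these host prefixes.
        return !(lower.startsWith("mobile.") || lower.startsWith("m."));
    }
}
```

With this, RedirectFilter.isDesirable(URI.create("http://www.walmart.com/"),
URI.create("http://mobile.walmart.com/")) would reject the redirect above,
while a redirect that stays on www.walmart.com would be allowed.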
What do I need to do to migrate HttpClient 3.x to HttpClient 4.x without
breaking the code?
Thanks in advance.
Mugoma.