On Sun, Jul 30, 2006 at 11:55:22PM -0400, Kyama Ashok-E51121 wrote:
> Is there any java class/commons libraries which can encode the url that
> HttpClient accepts.

I used this code snippet:

......
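// Needs java.util.BitSet, java.io.UnsupportedEncodingException and
// org.apache.commons.codec.net.URLCodec.
// Characters that are passed through without escaping.
private static final BitSet safeChars = new BitSet();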
static {
    for (char c = 'a'; c <= 'z'; c++) {
        safeChars.set(c);
        safeChars.set(Character.toUpperCase(c));
    }
    for (char c = '0'; c <= '9'; c++)
        safeChars.set(c);
    safeChars.set(':');
    safeChars.set('/');
    safeChars.set('_');
    safeChars.set('-');
    safeChars.set('.');
    safeChars.set('&');
    safeChars.set('%');
    safeChars.set('=');
    safeChars.set('*');
    safeChars.set('?');
}

......

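// Percent-encodes every byte of _url that is not in safeChars.
// '%' is in safeChars, so existing escapes pass through unchanged.
private String customEscape(String _url, String encoding)
        throws UnsupportedEncodingException {
    byte[] escaped = URLCodec.encodeUrl(safeChars,
            _url.getBytes(encoding == null ? "ISO-8859-1" : encoding));
    // The escaped bytes are pure ASCII, so decode with an explicit charset.
    return new String(escaped, "US-ASCII");
}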

That worked for most cases. If your URLs contain fragments, you may also need
to add '#' to safeChars.
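
If you would rather not depend on commons-codec, the same idea can be sketched
in plain Java. This is only a rough illustration, not the commons-codec
implementation: the class name SafeUrlEscaper and the method name escape are
made up for this example, and '#' is added to the safe set as suggested above.

```java
import java.nio.charset.Charset;
import java.util.BitSet;

// Hypothetical helper: percent-encodes every byte that is not in SAFE.
final class SafeUrlEscaper {
    private static final BitSet SAFE = new BitSet(128);
    static {
        for (char c = 'a'; c <= 'z'; c++) {
            SAFE.set(c);
            SAFE.set(Character.toUpperCase(c));
        }
        for (char c = '0'; c <= '9'; c++)
            SAFE.set(c);
        // Same punctuation as in the snippet above, plus '#' for fragments.
        for (char c : ":/_-.&%=*?#".toCharArray())
            SAFE.set(c);
    }

    // Falls back to ISO-8859-1 (latin1) when no encoding is given.
    static String escape(String url, String encoding) {
        Charset cs = Charset.forName(encoding == null ? "ISO-8859-1" : encoding);
        StringBuilder out = new StringBuilder();
        for (byte b : url.getBytes(cs)) {
            int v = b & 0xFF; // treat the byte as unsigned
            if (SAFE.get(v)) {
                out.append((char) v);
            } else {
                out.append('%');
                out.append(Character.toUpperCase(Character.forDigit(v >> 4, 16)));
                out.append(Character.toUpperCase(Character.forDigit(v & 0xF, 16)));
            }
        }
        return out.toString();
    }
}
```

Because '%' is treated as safe, a URL that is already escaped passes through
unchanged, but a literal '%' in the input is not escaped either.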

I copied the code from commons-codec and altered it a bit for my requirements.
It has worked fine for most of the cases I have run into during 1.5 years of
spider development ;)

-- 
Eugene N Dzhurinsky

