Re: refresh header proxy

2004-01-12 Thread Mike Moran
Kalnichevski, Oleg wrote:
I think all you need to know is what the header looks like, as I did look at
the logs. It simply ignores the header. The header looks like this:
Refresh: 0; URL=https://


Well, things _may_ be a little bit more complicated than that. 
[ ... ]

I had to do some parsing of this type of header when writing a parser 
that extracted these from their in-html incarnation. At the time I 
couldn't find much out about them either. FWIW, the following regexp 
caught a lot of the html pages I saw in the wild:

;\s*[Uu][Rr][Ll]=\s*([^\s]+)\s*$

The main thing to watch out for was the variation in case of the URL= 
part. This may not be an issue if the header is generated by an actual 
http server (as opposed to being in some html or added by a CGI script).
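
A minimal sketch of applying the regexp above to a Refresh value in Java. The class and method names (RefreshHeaderSketch, extractUrl) and the example.com URLs are made up for illustration; only the pattern itself comes from the message.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RefreshHeaderSketch {

    // The regexp quoted above, with backslashes doubled for a Java string literal.
    private static final Pattern URL_PART =
        Pattern.compile(";\\s*[Uu][Rr][Ll]=\\s*([^\\s]+)\\s*$");

    /** Returns the URL part of a Refresh value, or null if there is none. */
    static String extractUrl(String refreshValue) {
        Matcher m = URL_PART.matcher(refreshValue);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        // Mixed case and stray whitespace around "URL=" are both tolerated.
        System.out.println(extractUrl("0; URL=http://example.com/next"));
        System.out.println(extractUrl("5;url= http://example.com/"));
        // A bare delay with no URL part yields null.
        System.out.println(extractUrl("0"));
    }
}
```

The pattern only looks for the URL part, so the delay before the semicolon still has to be parsed separately if it matters.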

--
Mike
-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]


Re: NTLM class

2003-08-14 Thread Mike Moran
On Thursday, August 14, 2003, at 03:36  pm, Michael Becke wrote:

+1 for me as well.
Me too (+1, obviously non-binding). I'm about a quarter the way through 
integrating rc1 into some code and the internal JCE hidden setup would 
be a total spanner-in-the-works.

--
Mike


Re: normalizing a URI with ..'s in it ?

2003-07-24 Thread Mike Moran
Michael Becke wrote:
On Wednesday, July 23, 2003, at 06:18 PM, Mike Moran wrote:

[Oleg agreed] Right. Out of interest, which set of test cases does the 
URI class use, the ones from rfc2396 or rfc2396bis?


The tests are from rfc2396bis.
This is verging rapidly off-topic, but I was wondering if you knew 
anywhere I could keep up-to-date on the standards track of rfc2396bis? 
I've written some code to do the normalization we were talking about and 
I am swithering about whether I should enable it. The different handling 
of paths such as /../../ between rfc2396 and rfc2396bis may have 
knock-on effects in my code, so it would be nice if they were *standard* 
knock-on effects :-)

--
Mike


Re: normalizing a URI with ..'s in it ?

2003-07-23 Thread Mike Moran
On Wednesday, Jul 23, 2003, at 20:37 Europe/London, Michael Becke wrote:

Mike Moran wrote:
Btw, I presume this is the algorithm given in section 5.2 of
http://www.apache.org/~fielding/uri/rev-2002/rfc2396bis.html#absolutize ?
If so, this is just a draft
(draft-fielding-uri-rfc2396bis-03.txt). It does actually differ from
rfc2396 in how it handles abnormal URLs (though I think that's  
irrelevant here).
Yes, this is the algorithm.  We decided to upgrade to ensure that URI  
parsing was consistent across Apache.  I think this was at the request  
of Roy Fielding.  Oleg, is that correct?
[Oleg agreed] Right. Out of interest, which set of test cases does the  
URI class use, the ones from rfc2396 or rfc2396bis?


The string my/relative/../../another/relative would never be output  
from merge() or given to remove_dot_segments() in the section 5.2  
algorithm. If you are just applying remove_dot_segments() to this  
string then it will get confused and output a weird answer because  
it's not expecting that input (ie a path that doesn't have a / at  
the start).

I may be wrong, but I didn't think normalization could be applied to  
anything but absolute URLs.
I agree that when resolving a path relative to a base URI a relative  
path should never be passed to remove_dot_segments().  However,  
according to section 6.2.2.3 remove_dot_segments() can be used for  
path segment normalization.

I guess what it comes down to is that normalization is meant to  
generate a URI with a valid absolute path.  The value output in this  
case is a little strange but I think it's correct.  Essentially  
normalize should not be used on relative URIs.
I would agree. Doesn't this mean that normalize() should throw an  
exception if it *is* called on a non-absolute URI?
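
A rough sketch of the behaviour under discussion. It follows remove_dot_segments as later published in RFC 3986, section 5.2.4, which is an assumption here since draft-03 may differ in detail. The names DotSegmentsSketch and normalizePath are hypothetical and are not the HttpClient URI API.

```java
public class DotSegmentsSketch {

    /** remove_dot_segments, following the steps of RFC 3986 section 5.2.4. */
    static String removeDotSegments(String path) {
        StringBuilder in = new StringBuilder(path);
        StringBuilder out = new StringBuilder();
        while (in.length() > 0) {
            String s = in.toString();
            if (s.startsWith("../")) {
                in.delete(0, 3);                 // step A: drop a leading "../"
            } else if (s.startsWith("./")) {
                in.delete(0, 2);                 // step A: drop a leading "./"
            } else if (s.startsWith("/./")) {
                in.delete(0, 2);                 // step B: "/./" becomes "/"
            } else if (s.equals("/.")) {
                in.replace(0, 2, "/");           // step B: "/." becomes "/"
            } else if (s.startsWith("/../")) {
                in.delete(0, 3);                 // step C: "/../" becomes "/"
                removeLastSegment(out);          //         and the output loses a segment
            } else if (s.equals("/..")) {
                in.replace(0, 3, "/");           // step C: "/.." becomes "/"
                removeLastSegment(out);
            } else if (s.equals(".") || s.equals("..")) {
                in.setLength(0);                 // step D: a lone "." or ".." vanishes
            } else {
                int end = s.indexOf('/', 1);     // step E: move one segment across,
                if (end < 0) end = s.length();   //         including its leading "/"
                out.append(s, 0, end);
                in.delete(0, end);
            }
        }
        return out.toString();
    }

    private static void removeLastSegment(StringBuilder out) {
        int slash = out.lastIndexOf("/");
        out.setLength(slash < 0 ? 0 : slash);
    }

    /** Hypothetical guard: refuse anything that is not an absolute path. */
    static String normalizePath(String path) {
        if (!path.startsWith("/")) {
            throw new IllegalArgumentException("not an absolute path: " + path);
        }
        return removeDotSegments(path);
    }

    public static void main(String[] args) {
        System.out.println(removeDotSegments("/a/b/c/./../../g"));                   // /a/g
        System.out.println(removeDotSegments("/../../x"));                           // /x
        // The relative input from the thread comes out looking absolute:
        System.out.println(removeDotSegments("my/relative/../../another/relative")); // /another/relative
    }
}
```

The last call shows the odd answer: the relative path gains a leading slash, which is the kind of input the guard in normalizePath() would reject instead.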

--
Mike


Re: DO NOT REPLY [Bug 21754] - NullPointerException when releasing connection

2003-07-21 Thread Mike Moran
On Monday, Jul 21, 2003, at 09:32 Europe/London, Kalnichevski, Oleg 
wrote:

Mike, I can guarantee that there will be no API changes whatsoever as 
long as you are using the stable 2.0 branch. However, if we are not 
allowed to change even internal APIs on the development 2.1 branch, I 
personally might also just seriously consider stopping contributing to 
HttpClient at all.
I understand this as a recent bugbear of yours :-) I am not asking for 
no change, far from it. Instead, what I am asking is: what is the 
supported 'user profile'? For example, will this profile support 
'component assemblers' as well as 'User-Agent users'? From my point of 
view I see the HttpClient library as a grab-bag of components to stick 
together, not a single entry point.

I have followed the discussion around the need to refactor the 
internals of the HttpClient library for some time and I would certainly 
agree that they need reworking. But what is the end goal? Is it almost 
a User-Agent (just add water) ie the HttpClient class-to-be?

The main reason I am coming to the HttpClient library is because 
existing solutions such as the Sun one or innovation.ch have bugs or 
aren't transparent enough. I already have a User-Agent which does 
everything the HttpClient class does. I just need a way to plug in an 
HTTP 1.1 provider which performs better and does not have the same 
bugs/deficiencies as Sun/innovation.ch. If you wish I can itemize 
exactly what is wrong with these libraries and tell you what I need.

Again, many thanks for the work (I think I've probably spent more time 
and chars talking about this issue than I'll use doing the actual work 
:-) )

--
Mike Moran


Re: Post Method

2003-06-10 Thread Mike Moran
Michael Becke wrote:
I think there may be a bug here as well.  According to the spec, space 
characters should be represented as '+' but URIUtil is encoding them as 
'%20'.  I think the reserved character set is perhaps also incorrect. 
According to rfc 1738 ;, /, ?, :, @, = and & are the 
reserved chars but URI is also including +, $ and ,.  My guess is 
that most servers translate all hex encoded characters but it seems that 
we are not quite to spec.
You may want to double-check this with rfc2396, which updates rfc1738. 
My interpretation of the '+'/'%20' issue was that both were legal 
escapings of space, however it may be worth another reading on my part.
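
For comparison, a small sketch using only the JDK (not HttpClient's URIUtil) that shows the two escapings side by side. As far as I know, '+' for space belongs to form encoding (application/x-www-form-urlencoded), while generic URI escaping gives "%20". The class and method names are made up for illustration.

```java
import java.io.UnsupportedEncodingException;
import java.net.URI;
import java.net.URISyntaxException;
import java.net.URLDecoder;
import java.net.URLEncoder;

public class SpaceEscapingSketch {

    /** Form-style escaping: a space becomes '+', a literal '+' becomes "%2B". */
    static String formEncode(String value) throws UnsupportedEncodingException {
        return URLEncoder.encode(value, "UTF-8");
    }

    /** Generic URI path escaping via java.net.URI: a space becomes "%20". */
    static String pathEncode(String path) throws URISyntaxException {
        return new URI(null, null, path, null).getRawPath();
    }

    /** Form-style decoding accepts both '+' and "%20" as a space. */
    static String formDecode(String value) throws UnsupportedEncodingException {
        return URLDecoder.decode(value, "UTF-8");
    }

    public static void main(String[] args) throws Exception {
        System.out.println(formEncode("a+b c"));       // a%2Bb+c
        System.out.println(pathEncode("/my file.txt")); // /my%20file.txt
        System.out.println(formDecode("a+b%20c"));      // a b c
    }
}
```

A server that decodes form data will usually accept either spelling, which may be why the difference rarely shows up in practice.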

--
Mike Moran


Re: Questions related to the use of HttpClient classes

2003-06-06 Thread Mike Moran
Adrian Sutton wrote:
On Friday, June 6, 2003, at 08:55  AM, Om Narayan wrote:
[ ... ]
I have another question to add:  How do I prime the connections in the
MultiThreadedHttpConnectionManager pool?  I am running into a problem where
it is taking a long time (almost 30 secs using https) to open the first
connection. As a result the session bean that is opening the connection is
timing out. I need a way to make the first call to create the connection
work without timing out. Any suggestion is welcome.


I'd say the delay here is actually initialising JSSE rather than 
actually making the first connection.   There should be a way to 
initialise JSSE without actually making a connection. 
[ ... ]

This might be difficult. I ran into a problem like this while using a 
JCE implementation. The setup time seems to be caused by Sun's 
implementation of SecureRandom. The following thread may be useful:

http://forum.java.sun.com/thread.jsp?thread=4250&forum=2&message=11205
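
One possible workaround, sketched under the assumption that the cost really is one-time seed generation: trigger the seeding on a background thread at startup so the first real connection does not pay for it. Whether this helps depends on the JVM and security provider. The class and method names are hypothetical.

```java
import java.security.SecureRandom;

public class SecureRandomWarmUp {

    /** Starts a daemon thread that forces SecureRandom to seed itself. */
    static Thread startWarmUp() {
        Thread t = new Thread(new Runnable() {
            public void run() {
                // Asking for bytes is what forces the (possibly slow) seeding.
                new SecureRandom().nextBytes(new byte[8]);
            }
        }, "securerandom-warmup");
        t.setDaemon(true);
        t.start();
        return t;
    }

    /** Waits up to timeoutMillis; returns true if the warm-up has finished. */
    static boolean awaitWarmUp(Thread warmUp, long timeoutMillis) {
        try {
            warmUp.join(timeoutMillis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return !warmUp.isAlive();
    }

    public static void main(String[] args) {
        Thread warmUp = startWarmUp();
        // Other startup work would go here before the first https request.
        System.out.println("warm-up finished: " + awaitWarmUp(warmUp, 60000));
    }
}
```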

--
Mike




Re: 300 Multiple Choices handling?

2003-06-02 Thread Mike Moran
Adrian Sutton wrote:
Hi Mike,
HttpClient returns 300 as the status code as would be expected in such a 
case.  
Sounds reasonable. Does it also make the body available in this case?

The developer is then free to select whichever option they want.
The URL you gave however return 302 not 300 and HttpClient throws an 
exception because cross-site redirects are not supported.
Umm. I think HttpClient and wget must disagree:

$ wget --server-response http://www.blooberry.com//indexdotpreview/html/index8.htm
--12:48:55--  http://www.blooberry.com//indexdotpreview/html/index8.htm
           => `index8.htm'
Resolving www.blooberry.com... done.
Connecting to www.blooberry.com[204.122.16.82]:80... connected.
HTTP request sent, awaiting response...
 1 HTTP/1.1 300 Multiple Choices

...

A telnet to port 80 for that page also gives 300 Multiple Choices.

I'll create a patch for the docs to mention 300 responses.  Anything 
particularly important about them that I should note?
I'm not sure what the docs should say other than pointing out that 
you'll need to parse or display the body in some non-HTTP way to get any 
sense out of it.

--
Mike Moran




Re: 300 Multiple Choices handling?

2003-06-02 Thread Mike Moran
Ortwin Glück wrote:
Has anybody ever seen a 300 in the wild?
Yes. I gave an example in the email I sent. This page is linked to on 
that site.

However, I would say that they aren't numerous.

--
Mike Moran




Re: 300 Multiple Choices handling?

2003-06-02 Thread Mike Moran
Adrian Sutton wrote:
On Monday, June 2, 2003, at 09:57  PM, Mike Moran wrote:

Adrian Sutton wrote:
[ ... ]
Umm. I think HttpClient and wget must disagree:

$ wget --server-response http://www.blooberry.com//indexdotpreview/html/index8.htm
--12:48:55--  http://www.blooberry.com//indexdotpreview/html/index8.htm
           => `index8.htm'
Resolving www.blooberry.com... done.
Connecting to www.blooberry.com[204.122.16.82]:80... connected.
HTTP request sent, awaiting response...
 1 HTTP/1.1 300 Multiple Choices

...

A telnet to port 80 for that page also gives 300 Multiple Choices.


Interesting  I do get a 300 from telnet, but a 302 from HttpClient:
302
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<HTML><HEAD>
<TITLE>302 Found</TITLE>
</HEAD><BODY>
<H1>Found</H1>
The document has moved <A HREF="http://www.eskimo.com/notfound.html">here</A>.<P>
</BODY></HTML>
[ ... ]

I wonder... is HttpClient perhaps parsing out www.blooberry.com/ as 
the value for the Host: header? My URL is slightly bogus, though perhaps 
technically valid. What happens with:
http://www.blooberry.com/indexdotpreview/html/index8.htm
(Note no double slash)

www.eskimo.com and www.blooberry.com appear to be on the same subnet; 
www.eskimo.com is perhaps the default VirtualHost.

Just wondering ...

--
Mike Moran


HashMap does use equals() [was Re: DO NOT REPLY [Bug 18355] - HttpState cannot differentiate credentials for different hosts with same Realm names]

2003-04-03 Thread Mike Moran
[EMAIL PROTECTED] wrote:
[ ... ]
The reason is because HashMap only compares the hashCodes of 
the objects and never consults equals. 

[ ... ]

This is not the case. HashMap uses equals() when the hashCode() of two 
objects are the same. If you want good performance it is a good idea to 
have the hashCode() different or at least evenly distributed. However, 
it is not necessary. As a test, try the following code:

import java.util.*;

public class HashMapThing
{
    // A key whose hash code is chosen by the caller, so collisions can be forced.
    private static class Key
    {
        private int hashCode;
        private String content;
        private String tag;
        public Key(int hashCode, String content, String tag)
        {
            this.hashCode = hashCode;
            this.content = content;
            this.tag = tag;
        }
        public int hashCode()
        {
            System.out.println("hashCode() called on " + toString());
            return hashCode;
        }
        // Equality is decided by content only; the tag just labels the instance.
        public boolean equals(Object o)
        {
            System.out.println("equals() called on " + toString()
                               + ", " + o.toString());
            if (o instanceof Key) {
                Key other = (Key) o;
                return this.content.equals(other.content);
            }
            else {
                return false;
            }
        }

        public String toString()
        {
            return "tag: \"" + tag
                + "\" code: " + hashCode
                + " content: \"" + content + "\"";
        }
    }

    public static void main(String[] args) throws Exception
    {
        // All three keys share hash code 0; A and B are equal, C is not.
        Key entryA = new Key(0, "0", "A");
        Key entryB = new Key(0, "0", "B");
        Key entryC = new Key(0, "1", "C");
        System.out.println("Adding entries:");
        Map map = new HashMap();
        System.out.println("A");
        map.put(entryA, "1");
        System.out.println("B");
        map.put(entryB, "2");
        System.out.println("C");
        map.put(entryC, "3");
        System.out.println("\nGetting entries:");
        String out1 = (String) map.get(entryA);
        String out2 = (String) map.get(entryC);
        System.out.println("\nEntries:");
        System.out.println("1: " + out1);
        System.out.println("2: " + out2);
    }
}
--
Mike


Re: [REMINDER] HttpClient IRC event - irc log

2003-02-28 Thread Mike Moran
On Friday, February 28, 2003, at 01:51 AM, Dennis Cook wrote:

Here is the complete session:

[ ... ]
<Jandalf> 1) I'm kinda amazed that cross host redirects has been absent for
so long.  It seems glaring.
[ ... ]

Indeed it is. A large amount of the redirects I see are cross-site 
redirects. It makes sense, given the number of cgi scripts there are 
which count people leaving a site to go elsewhere.

Incidentally, some sites don't even give an absolute URL; you have to 
resolve it relative to the current URL, ie you treat the redirected-from 
URL as the base. I would think the RFC says this is a no-no, and I 
probably wouldn't expect HttpClient to go this far. However, it's one 
example why I wouldn't use any built-in redirection.
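
For what it's worth, a sketch of that resolution step using java.net.URI from the JDK rather than anything in HttpClient. The class and method names and the example.com URLs are made up for illustration.

```java
import java.net.URI;

public class RelativeRedirectSketch {

    /**
     * Resolves a Location value against the URL that produced the redirect.
     * An absolute Location is returned unchanged by URI.resolve().
     */
    static URI resolveLocation(URI redirectedFrom, String location) {
        return redirectedFrom.resolve(location.trim());
    }

    public static void main(String[] args) {
        URI from = URI.create("http://example.com/a/b/page.html");
        System.out.println(resolveLocation(from, "../other.html"));
        System.out.println(resolveLocation(from, "/top.html"));
        System.out.println(resolveLocation(from, "http://other.example.org/x"));
    }
}
```

Doing this in the calling code keeps the decision about whether to follow the resolved URL with the caller.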

--
Mike


Re: Significant HttpClient HttpMethodBase overhaul. Need early feedback

2003-02-27 Thread Mike Moran
Sam Maloney wrote:

Makes sense to me,
I would definitely agree on your point that 'Client' logic should be in 
HttpClient and not in HttpMethodBase. (I would say redirect, auth and even 
auto-retry would count as 'Client' logic).

[ ... ]

I would also agree. I would like and expect my use of the library to be 
one-to-one with respect to HTTP methods eg I ask it to do one thing and 
it does it. If anything makes the fetch not `complete' then it informs 
me and I decide whether to do a retry/follow a redirect/handle 
authentication.

Also:
- In my context the limits on redirects are not local to a call ie the 
number of redirects that are allowed on any one call may be affected by 
the general number of fetches that have been done so far.

- I require full control of when a fetch is done. This is to allow 
limiting of number and type of hits to a web server. If auto-retry is 
enabled then I have no way of throttling or scheduling hits.

As I think Laura has mentioned, it would be nice if HttpClient could be 
decomposed into reusable components for implementing a user agent, but 
it is essential that I be able to peel away any user agent layer that 
exists. From the sound of it, the current HttpMethodBase does too much 
on its own for me to use it.

--
Mike




Re: DO NOT REPLY [Bug 17487] - waitForResponse is using busywait

2003-02-27 Thread Mike Moran
Michael Becke wrote:

Agreed. A second thread should not be needed though.  It can be done 
using something like: 
[ ... ]



Oleg Kalnichevski wrote: 
[ ... ]

Oleg
 
On Thu, 2003-02-27 at 15:36, Ortwin Glück wrote:

Oleg Kalnichevski wrote:

Odi, are you sure you want to have an extra thread per HttpMethod? I do
not think so
Oleg


Better than a busy wait, isn't it? 

I just wanted to butt into this to point out that on some platforms, 
such as 2.2 linux, a thread equates to a process id, and you can quickly 
run out of them. In these cases, a busy wait is far preferable to a new 
Thread. Also firing off a Thread when things are slow can cause sudden 
spikes in Thread use. I've recently seen an analogous problem with an 
older version of jboss when doing RMI connections which was a pain in 
the arse to work around.
--
Mike





Re: DO NOT REPLY [Bug 10807] - Handle virtual hosts, relative urls, multi-homing

2003-02-27 Thread Mike Moran
[EMAIL PROTECTED] wrote:
[ ... ]


--- Additional Comments From [EMAIL PROTECTED]  2003-02-27 18:37 ---
I'd like to go ahead and tackle this one, but I need a little clarification. 
Does the following correctly describe what we want?

- we want to perform a get on www.google.com, let's say www.google.com has X > 1
IP addresses 
- we want to specify which IP address x to actually connect to
- we want www.google.com in the Http header instead of x

If this is the case, it sounds like we may want support for custom DNS
resolution.  Though this might be a little more than is needed for this simple
case I think that is what it boils down to.
I would support addition regardless, but then that's just me. I assume 
by custom DNS resolution you mean passing in the resolved values eg 
the HttpClient library is told: here's the Name/IP mapping, do a GET to 
this IP with this Host: Name?

--
Mike


Re: DO NOT REPLY [Bug 17487] - waitForResponse is using busywait

2003-02-27 Thread Mike Moran
Oleg Kalnichevski wrote:

Mike
May I add your comments to the bug report?
Yes, sure.

--
Mike




Re: DO NOT REPLY [Bug 10807] - Handle virtual hosts, relative urls, multi-homing

2003-02-27 Thread Mike Moran
Michael Becke wrote:

I would support addition regardless, but then that's just me. I 
assume by custom DNS resolution you mean passing in the resolved 
values eg the HttpClient library is told: here's the Name/IP 
mapping, do a GET to this IP with this Host: Name?


I was thinking more along the lines of an interface that would provide 
DNS resolution.  This interface would be used by HttpConnection when 
opening connections.  In general DNS names would be used everywhere 
except for when creating sockets. 
That would also suffice. If possible, I would prefer to pass an 
implementation of the interface in to HttpConnection upon creation 
rather than have it as a global setting, but I presume that's easier anyway.
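
To make the idea concrete, a sketch of what such an interface might look like. Everything here (HostResolver, SystemResolver, FixedResolver, addressFor) is hypothetical and not part of HttpClient; 192.0.2.10 is a documentation-only address.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.HashMap;
import java.util.Map;

public class ResolverSketch {

    /** Hypothetical interface a connection could be given on creation. */
    interface HostResolver {
        InetAddress resolve(String hostName) throws UnknownHostException;
    }

    /** Default behaviour: defer to the JVM's own lookup. */
    static class SystemResolver implements HostResolver {
        public InetAddress resolve(String hostName) throws UnknownHostException {
            return InetAddress.getByName(hostName);
        }
    }

    /** Caller-supplied Name/IP mapping, falling back to another resolver. */
    static class FixedResolver implements HostResolver {
        private final Map<String, InetAddress> table = new HashMap<String, InetAddress>();
        private final HostResolver fallback;

        FixedResolver(HostResolver fallback) {
            this.fallback = fallback;
        }

        void map(String hostName, String ipLiteral) {
            try {
                // getByName on an IP literal only parses it; getByAddress then
                // attaches the name without consulting any name service.
                byte[] raw = InetAddress.getByName(ipLiteral).getAddress();
                table.put(hostName, InetAddress.getByAddress(hostName, raw));
            } catch (UnknownHostException e) {
                throw new IllegalArgumentException("bad IP literal: " + ipLiteral);
            }
        }

        public InetAddress resolve(String hostName) throws UnknownHostException {
            InetAddress fixed = table.get(hostName);
            return fixed != null ? fixed : fallback.resolve(hostName);
        }
    }

    /** Convenience for the demo: the resolved address as text, or null. */
    static String addressFor(HostResolver resolver, String hostName) {
        try {
            return resolver.resolve(hostName).getHostAddress();
        } catch (UnknownHostException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        FixedResolver resolver = new FixedResolver(new SystemResolver());
        resolver.map("www.example.com", "192.0.2.10");
        System.out.println(addressFor(resolver, "www.example.com"));
    }
}
```

Passing the resolver in per connection, as here, avoids a global setting and lets different callers use different mappings.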

One other thing is that, currently, as a side-effect of using the 
Socket(DNS name, ...) constructor, the DNS lookup and Socket connection 
processes seem to be rolled into one. I was wondering if it is 
worthwhile setting a separate timeout for DNS lookup? As it stands, if 
DNS becomes slow for some reason then, even if the remote server is 
responding quickly, you'll get exceptions. Most people wouldn't care, 
but it would be nice to be able to set a longer timeout for DNS responses 
if you happened to know they sometimes take a while. This would also 
allow some leeway for an implementation of this DNS interface to do 
retries internally without worrying about a `connection' timeout externally.
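
A sketch of how the two timeouts could be separated with plain JDK calls. Note that it spends a worker thread per lookup, and that InetAddress.getByName() cannot be interrupted, so a timed-out lookup keeps running in its daemon thread until the name service answers. The class and method names are hypothetical.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class DnsTimeoutSketch {

    /** Resolves host, giving up after dnsTimeoutMillis. */
    static InetAddress lookup(final String host, long dnsTimeoutMillis) throws IOException {
        ExecutorService executor = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "dns-lookup");
            t.setDaemon(true);   // a stuck lookup must not keep the JVM alive
            return t;
        });
        try {
            Callable<InetAddress> task = () -> InetAddress.getByName(host);
            Future<InetAddress> pending = executor.submit(task);
            return pending.get(dnsTimeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            throw new IOException("DNS lookup timed out for " + host);
        } catch (ExecutionException e) {
            throw new IOException("DNS lookup failed for " + host, e.getCause());
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IOException("interrupted while resolving " + host);
        } finally {
            executor.shutdown();
        }
    }

    /** Lookup and connect, each with its own time limit. */
    static Socket connect(String host, int port, long dnsTimeoutMillis,
                          int connectTimeoutMillis) throws IOException {
        InetAddress address = lookup(host, dnsTimeoutMillis);
        Socket socket = new Socket();
        socket.connect(new InetSocketAddress(address, port), connectTimeoutMillis);
        return socket;
    }

    public static void main(String[] args) throws IOException {
        // An IP literal is parsed locally, so this runs without a name server.
        System.out.println(lookup("127.0.0.1", 5000).getHostAddress());
    }
}
```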

--
Mike (moran) :-)




nogoop: HttpClient competition

2003-02-06 Thread Mike Moran
I just found this on my travels: http://www.nogoop.com/product_16.html

I thought I'd mention it as there was a thread going on a little while 
ago about HttpClient competitors. They make explicit comparisons to 
HttpClient.

--
Mike





Re: using httpclient without a HttpClient object (was Redirects?)

2003-02-03 Thread Mike Moran

On Monday, February 3, 2003, at 09:33 PM, Laura Werner wrote:


Jeffrey Dever wrote:


Is there anyone out there that has code that actually calls the 
HttpMethod.execute()?  Anything that looks like this:

HttpState state = new HttpState();
HttpConnection connection = new HttpConnection(host, port, secure);
HttpMethod method = new GetMethod(path);
int status = method.execute(state, connection);

I do.  I'm the one who doesn't use HttpClient at all, because it's too 
simplistic for me.  I need to maintain a single HttpConnectionManager 
but a bunch of HttpState objects (one per thread in my application), 
so I have my own function that does the same thing as 
HttpClient.executeMethod.
[ ... ]

I would second the request to leave entry points into the `engine' 
behind HttpClient.java. As far as I understand it, HttpClient.java is 
just an implementation of a simple user agent. My expected use (and 
limited current use) of the API would likely not even mention 
HttpClient.java and would itself constitute a user-agent.

To take the example of redirects, this is something I need control of 
(auto-redirects is the first thing I turn off in Sun's 
HttpURLConnection).
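
For reference, a sketch of what turning off auto-redirects looks like with the JDK's HttpURLConnection. The class name and the open() helper are made up for illustration; setInstanceFollowRedirects() is the standard JDK call.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;

public class NoAutoRedirectSketch {

    /** Opens (but does not yet connect) a connection that will not follow redirects. */
    static HttpURLConnection open(URL url) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setInstanceFollowRedirects(false);
        return conn;
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 0) {
            System.out.println("usage: java NoAutoRedirectSketch <http-url>");
            return;
        }
        HttpURLConnection conn = open(new URL(args[0]));
        // With redirects off, a 3xx status and its Location header are handed
        // back to the caller, who decides what to fetch next.
        System.out.println(conn.getResponseCode() + " Location: "
                           + conn.getHeaderField("Location"));
        conn.disconnect();
    }
}
```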

--
Mike





Re: Configurable DNS resolver?

2003-01-27 Thread Mike Moran
Ortwin Glück wrote:


Michael Becke wrote:


If you set the host of a method or HttpClient to an IP address then 
it will connect to that address.  DNS names are not required, but 
will be resolved using the default Java method if used.

Mike


This causes problems on Multi-Homed sites. A DNS name is required in 
the HTTP request (Host request header) to uniquely reference the site. 

I was having a look through the latest release code and that was one of 
the things that occurred to me. Am I right in thinking that it would be 
HostConfiguration/HttpConnection that would require extra methods/calls 
to make a distinction between the host connected to (Socket level) and 
the advertised  host (Host: header)?
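
To illustrate the distinction with raw sockets rather than HttpClient classes, here is a sketch in which the address connected to and the name in the Host: header are separate parameters. The class, its method names, and the sample host and address are all hypothetical.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

public class HostHeaderSketch {

    /** Builds a minimal GET request advertising the given host name. */
    static String buildGet(String advertisedHost, String path) {
        return "GET " + path + " HTTP/1.1\r\n"
             + "Host: " + advertisedHost + "\r\n"
             + "Connection: close\r\n"
             + "\r\n";
    }

    /** Connects to connectTo (name or IP) but sends advertisedHost in Host:. */
    static String fetchStatusLine(String connectTo, int port,
                                  String advertisedHost, String path) throws IOException {
        try (Socket socket = new Socket(connectTo, port)) {
            OutputStream out = socket.getOutputStream();
            out.write(buildGet(advertisedHost, path).getBytes(StandardCharsets.US_ASCII));
            out.flush();
            BufferedReader in = new BufferedReader(
                new InputStreamReader(socket.getInputStream(), StandardCharsets.ISO_8859_1));
            return in.readLine();   // e.g. the "HTTP/1.1 ..." status line
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 2) {
            // Without arguments, just show the request that would be sent.
            System.out.print(buildGet("www.example.com", "/"));
            return;
        }
        System.out.println(fetchStatusLine(args[0], 80, args[1], "/"));
    }
}
```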

--
Mike

