Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Jakarta-httpclient 
Wiki" for change notification.

The following page has been changed by RolandWeber:
http://wiki.apache.org/jakarta-httpclient/GuidedTourOfHttpCore

------------------------------------------------------------------------------
+ #DEPRECATED
- ## pragma section-numbers 2
- = A Guided Tour of HttpCore (module-main) =
  
+ This page has been [http://wiki.apache.org/HttpComponents/GuidedTourOfHttpCore moved]
+ to the new [http://wiki.apache.org/HttpComponents/ HttpComponents Wiki].
- ----
- [[TableOfContents]]
- ----
  
- == Welcome ==
- 
- Welcome, visitors, to this guided tour of [wiki:Self:HttpComponents HttpCore].
- I am your tour guide.
- If you could please come a little closer and gather around me,
- so I don't have to shout? Thank you, that's much better.
- I'm about to give you a short introduction, and then we'll visit
- some interesting places in !HttpCore so you can see how it works.
- Whenever you've got a question, feel free to ask.
- That's what I'm here for, to answer your questions.
- 
- As you probably know, HTTP is a protocol for exchanging messages
- between a client and a server. It's in widespread use, and it
- typically runs on top of plain TCP/IP or secure TLS/SSL sockets.
- [[BR]]
- Here at [http://www.apache.org/ Apache], there is an implementation
- of the client side of that protocol called the Commons HttpClient.
- Informally, we also call it "the 3.x codebase" or simply "the old code".
- It proved quite useful to a lot of people, but the old code has severe
- limitations in its design.
- For example, there is a class called {{{HttpMethodBase}}}.
- It represents a request and a response at the same time,
- and it also implements logic for processing both.
- This kind of design where different things are crammed together
- in a single place makes it really hard to maintain or extend the code.
- 
- Therefore we started a new successor project called HttpComponents.
- Based on the experience gained with the old code, it implements the
- HTTP protocol with a new approach. Above all, there are several modules
- dealing with different aspects of the big problem.
- As you can gather from its name, the !HttpCore module is at the
- very heart of this effort. It defines stuff on which all the other
- modules depend and rely.
- [[BR]]
- !HttpCore deals with representation of HTTP messages, and
- with transport logic for sending and receiving those messages.
- It also defines a kind of framework infrastructure so other
- modules can plug in functionality.
- Unlike the old code, !HttpCore is not specific to the client side
- of HTTP communication; it can also be used for the server side.
- And because it is so fundamentally different from its predecessor,
- we put all the code into an all-new package hierarchy so you
- don't confuse them.
- 
-  '''Q:''' I have a question.
-  [[BR]]
-  '''A:''' Yes, please? What would you like to ask?
-  [[BR]]
-  '''Q:'''
-  If it is in a new package hierarchy, applications written
-  for the old code will not be able to use the new one?
-  [[BR]]
-  '''A:'''
-  Yes, that is correct. Because the old code was limited in its design,
-  we had to change the API. Applications have to be rewritten to
-  make use of the new code. There was no way to avoid this.
-  The all-new package names at least make sure that both old and new code
-  can be used in the same environment, for example a Servlet engine,
-  without interference.
- 
- Now, if you would like to follow me to the main hall...
- it's called package {{{org.apache.http}}}.
- You may want to keep the
- [http://hc.apache.org/httpcomponents-core/httpcore/apidocs/index.html JavaDocs]
- at hand, that will make it easier for you to follow my explanations.
- 
- 
- 
- == Messages ==
- 
- The first problem we had to deal with is the representation of messages.
- If you don't know how to represent a message, you can't send or receive it,
- right?
- So here we have a set of interfaces for the building blocks of an HTTP message.
- There's the {{{RequestLine}}} for a request and the
- {{{StatusLine}}} for a response, both containing a {{{ProtocolVersion}}}.
- The latter is so elementary that we made it a class instead of an interface,
- and of course we have the {{{HttpVersion}}} derived from it.
- Then we have a {{{Header}}} with name and value, where the value
- can have multiple {{{HeaderElement}}}s. And finally there is the
- message body, the {{{HttpEntity}}}.
- 
-  '''Q:'''
-  {{{HttpVersion}}} derived from {{{ProtocolVersion}}}?
-  Wouldn't the protocol always be HTTP in HttpCore?
-  [[BR]]
-  '''A:'''
-  Not quite. There is at least one other protocol,
-  the Session Initiation Protocol SIP,
-  which has a message format identical to that of HTTP.
-  Only the protocol name and version differ.
-  Since it's so similar, we tried to keep the door open.
-  [[BR]]
-  '''Q:''' That makes sense.
- 
- So, out of these building blocks, we collect messages.
- Every {{{HttpMessage}}} has headers, which can be added or deleted at will.
- {{{HttpRequest}}} adds the request line,
- {{{HttpEntityEnclosingRequest}}} an entity.
- {{{HttpResponse}}} adds a status line and also an entity.
- For convenient integration into frameworks that employ the Factory pattern,
- there are factory interfaces for both requests and responses.
- 
-  '''Q:''' Aren't there responses without an entity?
-  [[BR]]
-  '''A:'''
-  Yes. You are very attentive!
-  You don't have to provide an entity to the response, you can leave it null.
-  With requests, you usually know in advance whether you want to provide an
-  entity or not. Also, many requests are GET requests and don't have an entity.
-  That's why we created two different interfaces.
-  Responses mostly have an entity, except in very special cases. Having to
-  know in advance whether there will be an entity or not would have made the
-  API cumbersome to use in some situations, so we went with a single interface.
- 
-  '''Q:'''
-  You said headers can be added and deleted from an {{{HttpMessage}}}.
-  What if I want a read-only message?
-  [[BR]]
-  '''A:'''
-  We've given up the idea of distinguishing between modifiable and
-  non-modifiable messages. It created an insane number of interfaces,
-  and we had to spread instanceof checks and downcasts all over the place.
-  !HttpCore after all is meant for people who know what they are doing.
-  If you want a message to remain unchanged, simply don't change it ;-)
-  If you really have to prevent modifications, you can implement the
-  interface with a custom class that throws an exception whenever a
-  modifying method is called.
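If you would like to see that last idea spelled out, here is a minimal sketch. Please note that {{{SimpleMessage}}}, {{{ModifiableMessage}}} and {{{ReadOnlyMessage}}} are hypothetical names made up for this illustration. The real {{{HttpMessage}}} interface has many more methods, and a real wrapper would have to reject every modifying one of them.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, stripped-down message type for illustration only.
// It is NOT the real HttpMessage interface, which has many more methods.
interface SimpleMessage {
    List<String> getHeaderValues(String name);
    void addHeader(String name, String value);
    void removeHeaders(String name);
}

// Plain modifiable implementation backed by a list of name/value pairs.
class ModifiableMessage implements SimpleMessage {
    private final List<String[]> headers = new ArrayList<>();

    @Override
    public List<String> getHeaderValues(String name) {
        // Returns a fresh list, so callers cannot modify the message through it.
        List<String> values = new ArrayList<>();
        for (String[] header : headers) {
            if (header[0].equalsIgnoreCase(name)) {
                values.add(header[1]);
            }
        }
        return values;
    }

    @Override
    public void addHeader(String name, String value) {
        headers.add(new String[] {name, value});
    }

    @Override
    public void removeHeaders(String name) {
        headers.removeIf(header -> header[0].equalsIgnoreCase(name));
    }
}

// Read-only view: reads are delegated, every modifying method throws.
class ReadOnlyMessage implements SimpleMessage {
    private final SimpleMessage delegate;

    ReadOnlyMessage(SimpleMessage delegate) {
        this.delegate = delegate;
    }

    @Override
    public List<String> getHeaderValues(String name) {
        return delegate.getHeaderValues(name);
    }

    @Override
    public void addHeader(String name, String value) {
        throw new UnsupportedOperationException("message is read-only");
    }

    @Override
    public void removeHeaders(String name) {
        throw new UnsupportedOperationException("message is read-only");
    }
}
```

Code that only holds the {{{ReadOnlyMessage}}} cannot change the headers, while whoever still holds the wrapped message can. That is usually what you want for a message that is built once and then handed around.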
- 
- Now, if you would come over here and have a look through this window
- into the adjoining room? It's a bit too small for all of us to go into,
- but you can see the important things from here.
- It is called package {{{org.apache.http.message}}}.
- Notice that there are basic implementations for all of the
- message representation interfaces. You'll hardly need more than
- those when writing an application that uses !HttpCore directly.
- 
-  '''Q:'''
-  I can't see implementations for {{{GetRequest}}} and {{{PostRequest}}}
-  and so on?
-  [[BR]]
-  '''A:'''
-  You are right, we don't have those convenience classes in core.
-  Core is hardcore. If you want a GET request, you just create a
-  basic request and pass "GET" as the HTTP method name. Likewise for
-  POST or PUT, except you'd create a basic entity enclosing request
-  for those.
-  You'll find convenience classes for the default HTTP methods in
-  the client, but they are really superfluous in the core.
- 
-  '''Q:'''
-  But then I could create an entity enclosing request with GET as the
-  method name. That doesn't make sense, GET requests never have an entity!?
-  [[BR]]
-  '''A:'''
-  Yes, you can do all sorts of stupid things with core.
-  Core is hardcore, and meant for people who know what they're doing.
-  And maybe you really want to create a GET request with an entity,
-  for example to test how a server responds to invalid requests?
-  [[BR]]
-  '''Q:'''
-  An interesting example. I hadn't thought about that.
- 
- All right, are there more questions about the basic implementations?
- No? Good, then let's move on to the room over there.
- It is called package {{{org.apache.http.entity}}}.
- You can find a selection of message entities in there.
- Message entities are really not that much different from the
- request entities in !HttpClient 3.x, except they are no longer tied
- to the client side. As in the old code, there are entities getting
- their content from a string, byte array, file, or input stream.
- The {{{BasicHttpEntity}}} is what we use when a message is received
- over a connection. You'll see the connections later today.
- We also have some advanced stuff for wrapping and buffering entities,
- and an {{{EntityTemplate}}} that simplifies writing a new entity
- if you have to.
- 
-  '''Q:'''
-  There used to be a multipart entity in the old code?
-  [[BR]]
-  '''A:'''
-  Indeed, there is one. But it didn't make it into core.
-  It was considered slightly out of scope even for the old code.
-  Maybe we'll bring it into another module at some time,
-  but surely not into core.
- 
- Any more questions about entities?
- Fine, then let's pass through this door, back into the main hall.
- We're going to have a look at connections next.
- 
- 
- 
- == Connections ==
- 
- Connections are needed to send HTTP messages from client to server or
- the other way 'round. On the interface level, we have the {{{HttpConnection}}}.
- It allows for checking whether a connection is open,
- for closing it or shutting it down,
- and for getting statistical data if such has been gathered.
- To actually send and receive messages, you have to use either
- {{{HttpClientConnection}}} or {{{HttpServerConnection}}},
- depending on what you implement.
- Obviously, the client connection allows for sending requests and
- receiving responses, whereas the server connection receives requests
- and sends responses. Messages are passed to and from the connections
- in terms of the interfaces we have just seen. We require two calls
- for sending the message header and the message entity. That allows
- for explicit handling of the expect-continue handshake, for example.
- 
- 
-  '''Q:'''
-  I don't see a method to open a connection?
-  [[BR]]
-  '''A:'''
-  You have an eagle's eyes, don't you? That is absolutely correct,
-  there is no method for opening a connection in the interface.
-  It took a good deal of discussion until we got to that point.
-  Opening a connection can be trickier than it might seem at first
-  glance, so we left it out of the core API.
- 
-  '''Q:'''
-  So how do I use a connection if I can't open it?
-  [[BR]]
-  '''A:'''
-  Well, the API is different from the implementation. There is no
-  {{{open()}}} in the API, but your implementation can offer that method.
-  The default implementations we ship in core expect to be given
-  an open socket, which you can create in any way you want.
- 
-  '''Q:'''
-  Talking of sockets, I don't see a socket in the interfaces either.
-  Wouldn't it be useful, for example to configure TCP/IP settings?
-  [[BR]]
-  '''A:'''
-  Oh yes, it's definitely useful to know the socket. But, you see,
-  it doesn't belong in the core API. Somebody might want to use the
-  API with some native communication library instead of Java sockets.
-  But we'll have a look at the default implementations right away,
-  you'll see the socket there.
- 
-  '''Q:'''
-  Does that mean one has to downcast to an implementation class in order
-  to obtain the IP address and port number connected to?
-  [[BR]]
-  '''A:'''
-  Oh no, it's not ''that'' bad. You see, here is one more interface
-  {{{HttpInetConnection}}} that provides access to IP addresses and
-  port numbers, both local and remote. It's an optional interface, but
-  it's supported by all our default implementations. You only have to
-  cast to the interface, not to an implementation class.
- 
- Before you ask any more questions, it's probably best we move on into
- the next room, which is called package {{{org.apache.http.impl}}}.
- As you can see, there is a whole bunch of connection implementation classes.
- Don't let that confuse you, it's just for keeping the code maintainable.
- All you really need to look at are the two classes
- {{{DefaultHttpClientConnection}}} and {{{DefaultHttpServerConnection}}}.
- You see, there are the {{{bind}}} operations I told you about,
- where you pass in an open socket to have an open connection.
- And inherited from a base class, there also is a {{{getSocket}}} method.
- 
-  '''Q:'''
-  There are a lot of inherited methods. What's this serializer stuff?
-  And the data transmitter?
-  [[BR]]
-  '''A:'''
-  Don't worry about those. We provide reasonable defaults.
-  It'll all just work by itself, you don't have to do anything.
-  It's a kind of magic ;-)
- 
-  '''Q:'''
-  There are connection re-use strategies here, and I've seen an interface
-  in the main hall. That sounds like connection management?
-  [[BR]]
-  '''A:'''
-  Yes, almost, but not quite. There is no connection management in core.
-  But there are cases where core has to decide about closing a connection.
-  Remember, {{{close()}}} is in the generic interface. The re-use strategies
-  are used to decide about closing connections.
- 
-  '''Q:'''
-  Can those re-use strategies query the statistical information
-  you've mentioned before?
-  [[BR]]
-  '''A:'''
-  Yes, that's the idea. A re-use strategy can look at the headers of
-  request or response, but also at the statistical data of the connection
-  if that is available.
- 
-  '''Q:''' Where do these doors lead to?
-  [[BR]]
-  '''A:'''
-  Eh, please don't go there, thank you. The adjoining rooms
-  {{{org.apache.http.impl.entity}}} and {{{...impl.io}}} are
-  where the transfer encodings are handled.
-  You know, the magic stuff I mentioned.
- 
- Now please, visitors... I know that the connections look very interesting
- and complicated, but you really don't want to miss the exciting things still
- coming up. So, if you follow me back to the main hall, and then on to the
- next room...
- 
- 
- 
- == Execution ==
- 
- Here we are in package {{{org.apache.http.protocol}}}.
- This is the home of the framework for executing the higher levels of HTTP.
- Remember that the lower levels, in particular transfer encodings,
- are dealt with automagically by the connections we have just left.
- The protocol framework here is concerned with putting the appropriate headers
- into messages, and with calling the connection methods at the right time
- in the right sequence.
- 
- For example, the expect-continue handshake is dealt with here, both on the
- client and server side. For those of you who are not familiar with the
- details of that handshake, I'll explain it briefly.
- When sending a message with a body that is large or tricky to generate,
- clients don't want to risk sending the message data just to get a simple
- error response from the server, for example because authentication is
- required. In that case, the client will put a special Expect: header into
- the request and send only the message headers. The server is expected
- to check the message headers, and to respond with a status code of 100
- if it finds everything ready for processing the request entity. Only
- then will the client send the rest of the request. If the server detects
- a problem, it responds with the appropriate error status code and the
- request body is never sent.
- [[BR]]
- Here we have the {{{HttpRequestExecutor}}}, the client side implementation
- for protocol execution. It handles the expect-continue handshake, and it
- also checks whether an incoming response has an entity that needs to be read.
- For the server side we have {{{HttpService}}}, which checks whether the
- incoming request has an entity, and uses {{{HttpExpectationVerifier}}}
- if the expect-continue handshake is employed.
- Both use an {{{HttpProcessor}}} to modify and interpret headers.
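For those who prefer code to words, the handshake described above can be sketched roughly as follows. This is a toy model under loud assumptions: {{{ExpectContinueSketch}}}, {{{checkHeaders}}} and {{{execute}}} are hypothetical names, the "server" is just a method that looks for an Authorization header, and the "wire" is a string that records what was sent. It does not use {{{HttpRequestExecutor}}} or {{{HttpService}}}, and it ignores the timeout after which real clients send the body anyway.

```java
import java.util.Map;

final class ExpectContinueSketch {

    // Hypothetical stand-in for the server side: it sees only the headers
    // and answers 100 (go ahead) or an error status such as 401.
    static int checkHeaders(Map<String, String> headers) {
        return headers.containsKey("Authorization") ? 100 : 401;
    }

    // Client side: send the headers first, send the body only after a 100.
    // Everything "sent" is appended to 'wire' so the caller can inspect it.
    static int execute(Map<String, String> headers, String body, StringBuilder wire) {
        headers.put("Expect", "100-continue");
        wire.append("HEADERS;");

        int interimStatus = checkHeaders(headers);
        if (interimStatus != 100) {
            // The server reported a problem: the request body is never sent.
            return interimStatus;
        }

        wire.append("BODY(").append(body.length()).append(");");
        return 200; // assumed final status once the body has been processed
    }
}
```

Notice that sending the header and sending the body are two separate steps here. That is the reason the connection interfaces require two calls for sending a message.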
- 
- The framework for setting and interpreting headers is based on '''''interceptors'''''.
- Those are little classes which take care of one specific aspect, often just
- a single header. These are collected into a list of interceptors that need
- to be executed on a message before it is sent, or after it is received.
- A range of typically needed interceptors is provided,
- I'll just pick some examples.
- [[BR]]
- Here we have the {{{RequestUserAgent}}}. It is a request interceptor for
- outgoing requests, so it is executed on requests on the client side before
- they are sent. Its only task is to add a User-Agent header, if there is
- none in the request. If you don't want a User-Agent header to be sent,
- you just don't add this interceptor to your list.
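To give you a feeling for the pattern, here is a small sketch of an interceptor list. It is written against a plain map of headers, not against the real message and interceptor interfaces, so {{{HeaderInterceptor}}}, {{{AddUserAgent}}} and {{{InterceptorChain}}} are hypothetical names. For brevity the sketch also treats header names as case-sensitive map keys, which real HTTP header names are not.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical interceptor type: takes care of one aspect of the headers.
interface HeaderInterceptor {
    void process(Map<String, String> headers);
}

// Adds a User-Agent header, but only if the request does not have one yet.
final class AddUserAgent implements HeaderInterceptor {
    private final String agent;

    AddUserAgent(String agent) {
        this.agent = agent;
    }

    @Override
    public void process(Map<String, String> headers) {
        headers.putIfAbsent("User-Agent", agent);
    }
}

// An ordered list of interceptors, applied to a message before it is sent.
final class InterceptorChain {
    private final List<HeaderInterceptor> interceptors = new ArrayList<>();

    InterceptorChain add(HeaderInterceptor interceptor) {
        interceptors.add(interceptor);
        return this;
    }

    void process(Map<String, String> headers) {
        for (HeaderInterceptor interceptor : interceptors) {
            interceptor.process(headers);
        }
    }
}
```

A chain built without {{{AddUserAgent}}} simply never adds the header, which mirrors the "just don't add this interceptor" advice above.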
- 
-  '''Q:'''
-  How does {{{RequestUserAgent}}} know the value for the header?
-  [[BR]]
-  '''A:'''
-  A very good question. We are keeping parameters with the request.
-  Those will be the next station of this guided tour.
- 
- A trickier interceptor is {{{RequestContent}}}, also applied before a request
- is sent on the client side. It checks whether there is an entity in the
- request and sets up Content-Length and Transfer-Encoding headers if so.
- This is a must-have interceptor if you want to send a request entity.
- On the server side, {{{ResponseContent}}} does the same for the response.
- 
-  '''Q:'''
-  Didn't you say that transfer encodings are handled automagically?
-  [[BR]]
-  '''A:'''
-  Yes, I did. These interceptors are the wizards that make it all happen.
- 
- 
- The already mentioned {{{HttpProcessor}}} holds lists of request and response
- interceptors that should be applied. You set it up once when your application
- initializes.
- 
-  '''Q:'''
-  There are very many interceptors here. How do I know which ones I need?
-  [[BR]]
-  '''A:'''
-  That is a tricky thing. You should stick to the interceptor lists used
-  in the examples. If that doesn't do what you want, just ask by posting
-  your question to the user
-  [http://hc.apache.org/mail-lists.html mailing list].
- 
- 
-  '''Q:'''
-  If I need to authenticate a request, I would use an interceptor that
-  asks the user for the password?
-  [[BR]]
-  '''A:'''
-  NO! Ahem, sorry. No. You should never execute a blocking operation of
-  this kind in an interceptor, and in particular not user interactions.
-  In general, you don't know what kind of background process will execute
-  the interceptors. You could stall the whole application, or even others
-  if it is running in a shared environment.
- 
-  '''Q:'''
-  Then an interceptor that asks the user to confirm cookies is also
-  not a good idea? But how should I do it?
-  [[BR]]
-  '''A:'''
-  You should perform user interaction before or after the execution of the
-  interceptors. For authentication, you would ask for the password before
-  executing the request, and then give the password to the interceptor.
-  For cookies, you take an interceptor that puts the cookies in a separate
-  location, and ask for confirmation when the request execution is done.
-  I was just about to explain how applications interact with interceptors.
- 
- 
- You see this interface here, {{{HttpContext}}}. That is a collection of
- named attributes, where names are strings and attributes can be any kind
- of Java object. When a request is executed, it has one specific context.
- Likewise when a request is being serviced on the server side, of course.
- The interceptors, and many other parts of the framework, have access to
- this context. So your application can put some data - like a password -
- into the context, and an interceptor picks it up. On the other hand, an
- interceptor can put data into the context - like incoming cookies - and your
- application picks that up after the execution.
- [[BR]]
- The context is also the place to keep session information, like the cookies
- that should be sent or passwords that have already been entered.
- Mind you, core does not handle cookies or authentication.
- Core is hardcore, it just provides the framework for doing that.
- The examples show what attributes need to be present in the context for
- the default interceptors to work. We have synchronized and unsynchronized
- implementations of the {{{HttpContext}}} interface.
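The data flow through the context can be sketched like this. Everything in the sketch is an assumption made for illustration: the context is modelled as a plain {{{Map}}}, the attribute names {{{demo.credentials}}} and {{{demo.received-cookies}}} are invented, and so is the {{{X-Demo-Credentials}}} header. The attribute names expected by the default interceptors are shown in the examples, not here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

final class ContextSketch {

    // Hypothetical attribute names, invented for this sketch.
    static final String CREDENTIALS = "demo.credentials";
    static final String RECEIVED_COOKIES = "demo.received-cookies";

    // Plays the role of a request interceptor: picks up data that the
    // application put into the context before execution. No user interaction.
    static void beforeSend(Map<String, String> requestHeaders, Map<String, Object> context) {
        Object credentials = context.get(CREDENTIALS);
        if (credentials instanceof String) {
            // Made-up header, standing in for a real authentication scheme.
            requestHeaders.put("X-Demo-Credentials", (String) credentials);
        }
    }

    // Plays the role of a response interceptor: leaves incoming cookies in
    // the context, where the application finds them after execution.
    static void afterReceive(Map<String, String> responseHeaders, Map<String, Object> context) {
        String cookie = responseHeaders.get("Set-Cookie");
        if (cookie != null) {
            @SuppressWarnings("unchecked")
            List<String> pending = (List<String>) context.computeIfAbsent(
                    RECEIVED_COOKIES, key -> new ArrayList<String>());
            pending.add(cookie);
        }
    }
}
```

The application would fill in {{{demo.credentials}}} before execution and inspect {{{demo.received-cookies}}} afterwards, for example to ask the user for confirmation, so that no interceptor ever has to block.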
- 
- Now, if you would kindly follow me to the last stop on our little tour...
- 
- 
- == Parameters ==
- 
- This is package {{{org.apache.http.params}}}, home of the parameter framework.
- We introduced the preferences framework with version 3.0 of the old code.
- The 4.0 version is a natural evolution of that rather than a radical redesign.
- We keep maps of named parameters in instances of {{{HttpParams}}}.
- Parameters get attached to HTTP messages, so they are available to all
- objects involved in processing a message: interceptors, connections,
- and whatever else other modules are going to add on top of core.
- [[BR]]
- The names of parameters are defined in {{{PNames}}} interfaces,
- where each interface lists parameters for a particular part of the framework.
- We also have {{{Bean}}} classes for these parameter sets.
- These beans don't store the parameters in attributes, but put them into a
- parameter map. This comes in handy if you want to use something like the
- [http://www.springframework.org/about Spring framework],
- which can populate beans from configuration files but wouldn't know what to do with a map.
- [[BR]]
- In the old code, parameters were hierarchical. This feature is still present,
- we can link a map of parameters with another one providing defaults.
- However, this feature should ''never'' be used by applications directly.
- Parameters may and will be linked inside the framework, and having both
- application and framework set up parameter hierarchies would wreak havoc
- on both.
- 
-  '''Q:'''
-  What's the difference between parameters and contexts?
-  Both are maps of named attributes.
-  [[BR]]
-  '''A:'''
-  From a framework perspective, parameters are read-only. The application
-  prepares parameters in advance, then the framework reads them.
-  The context is updated by the framework.
-  Furthermore, parameters are meant to hold data, whereas the context can
-  hold any kind of attribute. As a rule of thumb, if it is something you'd
-  write into a properties file, that's a candidate for parameters. If you
-  need to set up a callback at runtime, that goes into the context.
- 
- Caution has to be used when updating parameters after they have been
- passed to the framework. You should avoid updating a parameter set at
- all while execution or servicing is in progress.
- The default implementation of {{{HttpParams}}} is unsynchronized,
- because the framework will use it read-only.
- [[BR]]
- The parameter ''values'' themselves should be read-only at all times.
- So if for example you stored a modifiable map as a parameter value,
- never modify that map again. If you have to update the parameter set
- with a new map, then copy the old one, modify the copy, and replace
- the old value with the modified copy.
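The copy, modify, replace routine can be sketched in a few lines. The sketch assumes that the parameter set is modelled as a plain {{{Map}}} and that the parameter name {{{demo.routes}}} is made up. {{{ParamValueUpdate}}} and its methods are hypothetical helpers, not part of the parameter framework.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

final class ParamValueUpdate {

    // Returns a new read-only map that contains the old entries plus one
    // more. The old map is copied and never modified.
    static Map<String, String> withEntry(Map<String, String> oldValue, String key, String value) {
        Map<String, String> copy = new HashMap<>(oldValue);
        copy.put(key, value);
        return Collections.unmodifiableMap(copy);
    }

    // Replaces the parameter value in the (simplified) parameter set with
    // the modified copy instead of changing the stored map in place.
    static void addRoute(Map<String, Object> params, String host, String proxy) {
        @SuppressWarnings("unchecked")
        Map<String, String> oldValue = (Map<String, String>) params.get("demo.routes");
        if (oldValue == null) {
            oldValue = Collections.emptyMap();
        }
        params.put("demo.routes", withEntry(oldValue, host, proxy));
    }
}
```

Anything that still holds a reference to the old map keeps seeing the old, unchanged content. Only code that reads the parameter again gets the new map.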
- 
-  '''Q:'''
-  If parameters are read-only for the framework, why is the interface read-write?
-  [[BR]]
-  '''A:'''
-  Good question! That's for the users,
-  so they can get and modify parameters without typecasting:
-  {{{
-    request.getParams().setParameter("name", value);
- }}}
- 
-  '''Q:'''
-  What's with these {{{HttpProtocolParams}}} and {{{HttpConnectionParams}}}?
-  Are these special implementations of the interface?
-  [[BR]]
-  '''A:'''
-  Ah, no. Those classes contain static helper methods for
-  getting and setting the respective parameters.
-  This encapsulates typecasts and provides at least some type safety:
-  {{{
-    HttpProtocolParams.setVersion(request.getParams(), HttpVersion.HTTP_1_0);
-    HttpVersion version = HttpProtocolParams.getVersion(request.getParams());
- }}}
- 
-  '''Q:'''
-  Is there a helper that loads parameters from a properties file?
-  [[BR]]
-  '''A:'''
-  No, core is hardcore. We don't deal with configuration via properties files
-  in core. If we supported properties files today, we'd be asked to support
-  XML configuration tomorrow, and something else the day after.
-  There would be no end to it.
-  Besides, instantiating parameters with the correct type is not trivial.
-  A string parameter needs to be stored as {{{String}}}, whereas an integer
-  parameter needs to be stored as {{{Integer}}}. It becomes even worse for
-  custom parameter types. We don't want this kind of type conversion logic
-  in core. Maybe in an extra module, sometime.
- 
- 
- == Farewell ==
- 
- I hope you enjoyed our tour of the
- [wiki:Self:HttpComponents HttpCore module]
- and found the experience enlightening.
- If you have any more questions, do not hesitate to post them on the
- user [http://hc.apache.org/mail-lists.html mailing list].
- Saying that, you might want to search the archives of the mailing lists
- first, in case somebody else already got an answer to a similar question.
- We are also considering offering guided tours of other modules in the future.
- We'd be happy if you join one of those when they become available.
- 
- Thank you all, and see you next time!
- 
