David Nesting ("NESTING, DAVID M (SBCSI)") wrote to <mailto:[EMAIL PROTECTED]> on 2 August 2004 in "4xx responses for bad query strings?" (<mid:[EMAIL PROTECTED] s.sbc.com>):
I'm trying to determine a proper way to indicate that a requested piece of information was not found, where the information is keyed not just on the URI components identifying the HTTP resource, but on the query string as well.
First you need to eliminate the misunderstanding over "HTTP resource". As far as Web architecture is concerned, URIs like
"http://example.com/path/display"
and
"http://example.com/path/display?page"
are nearly opaque and their respective resources have no inherent relationship. The second URI, with the query string, identifies a first-class HTTP resource.
If, however, one of those "logical" pages doesn't exist, with what HTTP response code should this resource respond?
In short, one of 200 OK, 204 No Content, or 404 Not Found.
The choice depends on what you want to convey.
My first thought was to use a 404 response, but it can be argued pretty convincingly that this response code indicates the HTTP resource itself (the "display" resource) is missing, when it's not.
You'll find at least this humble correspondent resistant to such argument. For more authoritative answers, direct your question to the World Wide Web Consortium's Technical Architecture Group (TAG), the authors of the HTTP/1.1 specification in RFC 2616, the authors of the URI specification in RFC 2396, or the contributors to the revision of RFC 2396. (Granted, there are plenty who belong to more than one of those groups.)
It's just the *query* to that resource failed to return any content.
But why and how did it fail to return content? Details about your situation would help.
Imagine an English-language dictionary service that allows queries of the form http://example.info/lexicon?term=<input-term> . An HTTP request with a Request-URI of "http://example.info/lexicon?term=ant" should yield "200 OK" with definitions of the word "ant" in the response entity body.
If a word were recognized as belonging to the English language but had no definition available from the lexicon (just play along for the sake of example), a response of "204 No Content" would be best. A 204 response signals to crawlers as well as to human users that there is no definition available there.
If the input term did not form a word recognized as belonging to the English language, a response of "404 Not Found" would be best.
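To make the three cases concrete, here is a minimal sketch in Python of how such a lexicon might choose its status code. Everything in it is an assumption made for illustration: the function name lookup, the two tiny word tables, and the response bodies are hypothetical and not part of any real service.

```python
from http import HTTPStatus
from urllib.parse import urlsplit, parse_qs

# Hypothetical data, invented purely for this example.
DEFINITIONS = {"ant": "a small social insect"}
# Words recognized as English; "aardvark" is recognized but has no definition here.
KNOWN_WORDS = {"ant", "aardvark"}


def lookup(request_uri):
    """Return (status, body) for a request to the hypothetical lexicon."""
    # The query string is part of the URI, so the term is read from it.
    query = parse_qs(urlsplit(request_uri).query)
    terms = query.get("term", [])
    term = terms[0].lower() if terms else ""

    if term in DEFINITIONS:
        # Recognized word with a definition: 200 OK.
        return HTTPStatus.OK, DEFINITIONS[term]
    if term in KNOWN_WORDS:
        # Recognized word, nothing to say about it: 204 No Content.
        return HTTPStatus.NO_CONTENT, ""
    # Not a recognized word (or no term given): 404 Not Found.
    return HTTPStatus.NOT_FOUND, "No such word: " + term
```

Under these assumptions, lookup("http://example.info/lexicon?term=ant") gives 200, a request for "aardvark" gives 204 with an empty body, and an unrecognized term such as "xyzzy" gives 404.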
My second thought was to use a 403, since that could be interpreted as just a generic refusal to handle the request.
This seems at the margins of acceptable practice, but I'm having difficulty explaining my position.
I don't think a 500-series error would be appropriate, because that implies a problem with the server, not with the request.
I agree.
It could also be argued that the resource itself is operating correctly and the request was fine and is getting a valid response (of "no such content"), therefore it should use a 200 response code. The problem with this, though, is that search engines will end up indexing it as though it were legitimate, which is not desirable.
So would "204 No Content" suffice?
I'm a little curious to know if there is a recommended practice here.
I don't know of any, but what I don't know could fill an encyclopedia.
Many HTTP response codes tie themselves to the presence, absence or abilities of the HTTP *resource* itself, without discussing resources that may change behavior based upon a query string.
I refer to my opening position in this message that the URI with a query string identifies a separate and full-fledged resource.
-- Etan Wexler.