Re: [CODE4LIB] Inlining HTTP Headers in URLs

2010-06-02 Thread Jonathan Rochkind

Joe Hourcle wrote:

On Tue, 1 Jun 2010, Jonathan Rochkind wrote:


  
Accept-Ranges is a response header, not something that the client's
supposed to be sending.
Weird. Then can anyone explain why it's included as a request parameter
in the SRU 2.0 draft? (Section 4.9.2.)


Jonathan


Re: [CODE4LIB] Inlining HTTP Headers in URLs

2010-06-02 Thread Jonathan Rochkind

Erik Hetzner wrote:


Accept-Encoding is a little strange. It is used for gzip or deflate
compression, largely. I cannot imagine needing a link to a version
that is gzipped.

It is also hard to imagine why a link would want to specify the
charset to be used, possibly overriding a client’s preference. If my
browser says it only supports UTF-8 or latin-1, it is probably
telling the truth.
Perhaps when the client/user-agent is not actually a web browser that 
is simply going to display the document to the user, but is some kind of 
other software. Imagine perhaps archiving software that, by policy, only 
will take UTF-8 encoded documents, and you need to supply a URL which is 
guaranteed to deliver such a thing.


Sure, the hypothetical archiving software could/should(?) just send an 
actual HTTP header to make sure it gets a UTF-8 charset document. But 
maybe sometimes it makes sense to provide an identifier that actually 
identifies/points to the UTF-8 charset version -- and that in the actual 
in-practice real world is more likely to return that UTF-8 charset 
version from an HTTP request, without relying on content negotiation, 
which is often mis-implemented.

We could probably come up with a similar reasonable-if-edge-case for 
encoding.


So I'm not thinking so much of over-riding the conneg -- I'm thinking 
of your initial useful framework, one URI identifies a more abstract 
'document', the other identifies a specific representation. And 
sometimes it's probably useful to identify a specific representation in 
a specific charset, or, more of a stretch, encoding. No?


I notice you didn't mention 'language'; I assume we agree that one is 
even less of a stretch, and has clearer use cases for including in a 
URL, like content-type.


Jonathan


Re: [CODE4LIB] Inlining HTTP Headers in URLs

2010-06-02 Thread Erik Hetzner
At Wed, 2 Jun 2010 15:23:05 -0400,
Jonathan Rochkind wrote:
 
 Erik Hetzner wrote:
 
  Accept-Encoding is a little strange. It is used for gzip or deflate
  compression, largely. I cannot imagine needing a link to a version
  that is gzipped.
 
  It is also hard to imagine why a link would want to specify the
  charset to be used, possibly overriding a client’s preference. If my
  browser says it only supports UTF-8 or latin-1, it is probably
  telling the truth.

 Perhaps when the client/user-agent is not actually a web browser that 
 is simply going to display the document to the user, but is some kind of 
 other software. Imagine perhaps archiving software that, by policy, only 
 will take UTF-8 encoded documents, and you need to supply a URL which is 
 guaranteed to deliver such a thing.
 
 Sure, the hypothetical archiving software could/should(?) just send an 
 actual HTTP header to make sure it gets a UTF-8 charset document. But 
 maybe sometimes it makes sense to provide an identifier that actually 
 identifies/points to the UTF-8 charset version -- and that in the actual 
 in-practice real world is more likely to return that UTF-8 charset 
 version from an HTTP request, without relying on content negotiation, 
 which is often mis-implemented.
 
 We could probably come up with a similar reasonable-if-edge-case for 
 encoding.

 So I'm not thinking so much of over-riding the conneg -- I'm thinking 
 of your initial useful framework, one URI identifies a more abstract 
 'document', the other identifies a specific representation. And 
 sometimes it's probably useful to identify a specific representation in 
 a specific charset, or, more of a stretch, encoding. No?

I’m certainly not thinking it should never be done. Personally I would
leave it out of SRU without a serious use case, but that is obviously
not my decision. Still, in my capacity as nobody whatsoever, I would
advise against it. ;)
 
 I notice you didn't mention 'language'; I assume we agree that one is 
 even less of a stretch, and has clearer use cases for including in a 
 URL, like content-type.

Definitely.

best, Erik
Sent from my free software system http://fsf.org/.




[CODE4LIB] Inlining HTTP Headers in URLs

2010-06-01 Thread LeVan,Ralph
I've been sensing a flaw in HTTP for some time now.  It seems like you
ought to be able to do everything through a URL that you can using a
complete interface to HTTP.  Specifically, I'd love to be able to
specify values for HTTP headers in a URL.

 

To plug that gap locally, I'm looking for a Java servlet filter that
will look for query parameters in a URL, recognize that some of them are
HTTP headers, strip those query params, and set those headers in the
request that my Java servlet eventually gets.

 

Does such a filter exist already?  I've looked and not been able to find
anything.  It seems like the work of minutes to produce such a filter.
I'll be happy to put it out as Open Source if there's any interest.

 

Thanks!

 

Ralph
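[A minimal sketch of the core logic such a filter would need, in plain Java so it stands alone. A real filter would additionally wrap the request in an HttpServletRequestWrapper that overrides getHeader()/getHeaders(); the "http"-prefixed parameter convention (httpAccept, httpAcceptLanguage, ...) and the header whitelist here are assumptions for illustration, not part of any spec.]

```java
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Splits a raw query string into header-like parameters (to be set as
// request headers by the filter) and ordinary parameters that pass through.
public class HeaderParamSplitter {

    public final Map<String, String> headers = new LinkedHashMap<>();
    public final Map<String, String> params = new LinkedHashMap<>();

    // Only headers on this whitelist may be set from the URL.
    private static final Set<String> ALLOWED =
        Set.of("Accept", "Accept-Language", "Accept-Charset");

    public HeaderParamSplitter(String query) {
        for (String pair : query.split("&")) {
            int eq = pair.indexOf('=');
            String name = eq < 0 ? pair : pair.substring(0, eq);
            String value = eq < 0 ? ""
                : URLDecoder.decode(pair.substring(eq + 1), StandardCharsets.UTF_8);
            String header = toHeaderName(name);
            if (header != null && ALLOWED.contains(header)) {
                headers.put(header, value);
            } else {
                params.put(name, value);
            }
        }
    }

    // httpAcceptLanguage -> Accept-Language; null if not an http* parameter.
    private static String toHeaderName(String param) {
        if (!param.startsWith("http") || param.length() == 4) return null;
        StringBuilder sb = new StringBuilder();
        for (char c : param.substring(4).toCharArray()) {
            if (Character.isUpperCase(c) && sb.length() > 0) sb.append('-');
            sb.append(c);
        }
        return sb.toString();
    }
}
```

[Whitelisting which headers may be set this way seems prudent: silently honoring arbitrary httpXxx parameters would let any link rewrite headers like Referer or Cookie.]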


Re: [CODE4LIB] Inlining HTTP Headers in URLs

2010-06-01 Thread Jonathan Rochkind
You could do this using mod_rewrite in Apache alone, if you can find no 
better way to do it, assuming your app is Apache-fronted.


But it's not as obvious to me as it is to you that "you ought to be 
able to do everything through a URL that you can using a complete 
interface to HTTP." I guess it is convenient for debugging, though.
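[For the Apache route, an untested sketch of what that might look like with mod_rewrite plus mod_headers. The httpAccept parameter name is just an example convention; note the copied value would still be percent-encoded, and this fragment does not strip the parameter from the query string.]

```apache
# Copy an httpAccept query parameter into the Accept request header
# before the request reaches the backend servlet.
RewriteEngine On
RewriteCond %{QUERY_STRING} (?:^|&)httpAccept=([^&]+)
RewriteRule ^ - [E=ACCEPT_OVERRIDE:%1]
RequestHeader set Accept "%{ACCEPT_OVERRIDE}e" env=ACCEPT_OVERRIDE
```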


LeVan,Ralph wrote:

I've been sensing a flaw in HTTP for some time now.  It seems like you
ought to be able to do everything through a URL that you can using a
complete interface to HTTP.  Specifically, I'd love to be able to
specify values for HTTP headers in a URL.

 


To plug that gap locally, I'm looking for a Java servlet filter that
will look for query parameters in a URL, recognize that some of them are
HTTP headers, strip those query params, and set those headers in the
request that my Java servlet eventually gets.

 


Does such a filter exist already?  I've looked and not been able to find
anything.  It seems like the work of minutes to produce such a filter.
I'll be happy to put it out as Open Source if there's any interest.

 


Thanks!

 


Ralph

  


Re: [CODE4LIB] Inlining HTTP Headers in URLs

2010-06-01 Thread Nate Vack
On Tue, Jun 1, 2010 at 1:21 PM, LeVan,Ralph le...@oclc.org wrote:
 I've been sensing a flaw in HTTP for some time now.  It seems like you
 ought to be able to do everything through a URL that you can using a
 complete interface to HTTP.  Specifically, I'd love to be able to
 specify values for HTTP headers in a URL.

No, you shouldn't. HTTP headers and the URL string are for completely
different purposes.

You can certainly do similar things with headers and URI components
(for example, a client could use either to specify what content type
it expects, or language preferences) but the uses don't overlap much.

Think: do you really want clients specifying Referer: or
Cache-control: as URL parameters? It's not that it'd be harmful,
just... odd.

For debugging your requests, something like LiveHTTPHeaders or Fiddler
is almost certainly cleaner.

Cheers,
-Nate


Re: [CODE4LIB] Inlining HTTP Headers in URLs

2010-06-01 Thread Erik Hetzner
At Tue, 1 Jun 2010 14:21:56 -0400,
LeVan,Ralph wrote:
 
 I've been sensing a flaw in HTTP for some time now.  It seems like you
 ought to be able to do everything through a URL that you can using a
 complete interface to HTTP.  Specifically, I'd love to be able to
 specify values for HTTP headers in a URL.
 
 To plug that gap locally, I'm looking for a Java servlet filter that
 will look for query parameters in a URL, recognize that some of them are
 HTTP headers, strip those query params, and set those headers in the
 request that my Java servlet eventually gets.

 Does such a filter exist already?  I've looked and not been able to find
 anything.  It seems like the work of minutes to produce such a filter.
 I'll be happy to put it out as Open Source if there's any interest.

Hi -

I am having a hard time imagining the use case for this.

Why should you allow a link to determine things like the User-Agent
header? HTTP headers are set by the client for a reason.

Furthermore, as somebody involved in web archiving, I would like to
ask you not to do this.

It is already hard enough for us to tell that:

  http://example.org/HELLOWORLD

is usually the same as:

  http://www.example.org/HELLOWORLD

or:

  http://www.example.org/helloworld

I don’t want to work in a world where this might be the same as:

  http://192.0.32.10/helloworld?HTTP-Host=example.org

Apologies if this sounds hostile, and thanks for reading.

best, Erik Hetzner
Sent from my free software system http://fsf.org/.




Re: [CODE4LIB] Inlining HTTP Headers in URLs

2010-06-01 Thread Ed Summers
On Tue, Jun 1, 2010 at 2:21 PM, LeVan,Ralph le...@oclc.org wrote:
 I've been sensing a flaw in HTTP for some time now.  It seems like you
 ought to be able to do everything through a URL that you can using a
 complete interface to HTTP.  Specifically, I'd love to be able to
 specify values for HTTP headers in a URL.

It might help (me) if you describe what you are trying to do first.

//Ed


Re: [CODE4LIB] Inlining HTTP Headers in URLs

2010-06-01 Thread LeVan,Ralph
 -Original Message-
 From: Code for Libraries [mailto:code4...@listserv.nd.edu] On Behalf Of
 Erik Hetzner
 
 I am having a hard time imagining the use case for this.

A simple use case comes from OpenSearch and its use of URL templates. To 
enable the return of RSS as the response to an SRU query, we added the 
parameter httpAccept=application/rss+xml to the SRU URL in the OpenSearch 
template and coded for it in the SRU server. Had we had such a filter in 
front of the request, the servlet's life would have been easier.

That seemed like a specific solution to what could be a generalizable problem.

Ralph
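[One encoding pitfall with that template, worth flagging: in application/x-www-form-urlencoded query strings a literal '+' decodes to a space, so httpAccept=application/rss+xml left unencoded arrives at the server as the wrong media type. A small Java illustration:]

```java
import java.net.URLDecoder;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

public class PlusPitfall {
    public static void main(String[] args) {
        // In form-urlencoded query strings, '+' means space:
        String naive = URLDecoder.decode("application/rss+xml", StandardCharsets.UTF_8);
        System.out.println(naive); // application/rss xml -- not the intended media type

        // Encoding the value first keeps the '+' intact on the round trip:
        String encoded = URLEncoder.encode("application/rss+xml", StandardCharsets.UTF_8);
        System.out.println(encoded); // application%2Frss%2Bxml
        System.out.println(URLDecoder.decode(encoded, StandardCharsets.UTF_8));
    }
}
```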


Re: [CODE4LIB] Inlining HTTP Headers in URLs

2010-06-01 Thread Joe Hourcle

On Tue, 1 Jun 2010, Jonathan Rochkind wrote:

Accept-Ranges, I have no idea about; I don't understand that header's 
purpose well enough. But SRU also provides a query param for that, and it 
seems less clear to me whether that's ever useful or justifiable.


Accept-Ranges is a response header, not something that the client's 
supposed to be sending.


The client sends a 'Range' header (with an optional 'If-Range' if you're 
concerned with the resource having changed), and in response, the server 
sends a 206 status with a 'Content-Range' header.


See
http://labs.apache.org/webarch/http/draft-fielding-http/p5-range.html

...

I only know of two values for 'Accept-Ranges' -- 'none' (i.e., I don't 
accept partial downloads) and 'bytes', so for incomplete downloads you can 
start where you left off.  If you know the file is excessively large, I 
guess you could use it to transfer it in parallel to abuse the TCP 
congestion rules (or, if you have a way of knowing that there are multiple 
mirrors, to spread the load across servers).


-Joe
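[The client side of the exchange Joe describes can be sketched with a couple of small helpers; the class and method names are made up for illustration.]

```java
// Build a Range request header for resuming an interrupted download, and
// parse the Content-Range header that accompanies the server's 206 response.
public class RangeHelper {

    // Resume from byte `offset` to the end of the resource.
    public static String resumeFrom(long offset) {
        return "bytes=" + offset + "-";
    }

    // Parse "bytes 500-999/1234" into {first, last, total}.
    public static long[] parseContentRange(String value) {
        java.util.regex.Matcher m = java.util.regex.Pattern
            .compile("bytes (\\d+)-(\\d+)/(\\d+)")
            .matcher(value);
        if (!m.matches()) throw new IllegalArgumentException(value);
        return new long[] {
            Long.parseLong(m.group(1)),
            Long.parseLong(m.group(2)),
            Long.parseLong(m.group(3))
        };
    }
}
```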


Re: [CODE4LIB] Inlining HTTP Headers in URLs

2010-06-01 Thread Ed Summers
On Tue, Jun 1, 2010 at 6:35 PM, Erik Hetzner erik.hetz...@ucop.edu wrote:
 There is a time for a URI that can use content-negotiation (the Accept
 header, etc.) to get, e.g., PDF, HTML, or plain text. As an example:

  http://example.org/RFC1

 And there is a time when we want to explicitly refer to a particular
 resource that has only ONE type. For example, the canonical version of
 an RFC:

  http://example.org/RFC1.txt

 But these are different resources. If you want to be able to link to
 search results that must be returned in RSS, a query parameter or file
 extension is proper.

 But this query param or file extension, in my opinion, is quite
 different than HTTP content negotiation or the Accept header.

Nicely put Erik!

//Ed
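[Server-side, Erik's distinction might be sketched like this: an extensionless URI negotiates on the Accept header, while the .txt URI pins exactly one representation. The paths, offered types, and the deliberately naive negotiation are all hypothetical.]

```java
import java.util.List;

public class RepresentationChooser {

    // Representations the server can produce for the abstract resource.
    private static final List<String> AVAILABLE =
        List.of("text/html", "application/pdf", "text/plain");

    public static String choose(String path, String acceptHeader) {
        // Explicit extension: one fixed representation, no negotiation.
        if (path.endsWith(".txt")) return "text/plain";
        // Naive negotiation: first offered type the client lists wins.
        for (String part : acceptHeader.split(",")) {
            String type = part.split(";")[0].trim();
            if (AVAILABLE.contains(type)) return type;
        }
        return AVAILABLE.get(0); // default representation
    }
}
```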