On Fri, May 9, 2008 at 1:37 PM, Danek Duvall <[EMAIL PROTECTED]> wrote:
> I still think that fully pipelined file ops are a cleaner way to go, if we

I strongly agree in this regard.

Currently in depot.py, it says that "pkg.depotd will reduce to a
special purpose HTTP/HTTPS server explicitly for the version
management operations, and must manipulate the various state
files--catalogs, in particular--such that the pkg(1) pull client can
operate accurately with only a basic HTTP/HTTPS server in place."

When I had that discussion Monday night, one of the topics that came
up was the fact that they didn't want to run any special software to
be able to have a package mirror. They just wanted to be able to untar
a repository and "go."

Given services like Akamai and others, I have to agree with that
sentiment to a certain extent.

The primary issue I see with this is that it has the potential to
create an inconsistent user experience. Specifically, filelist
operations (or other static retrieval functionality) could go to a
"standard" server (on a service like Akamai), while "search -r"
requests and the like might go to a much slower-responding server.
The result would be really fast downloads but extremely slow
searches.

This also means that our client would need to become "smarter" about
how it performs operations: it would need to check whether the target
depot server supports anything other than "static requests" (by
checking /versions/, I would assume).
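To make that concrete, here is a minimal sketch of such a probe. The response layout assumed here (one "operation version version ..." line per supported operation) is an illustration of the idea, not the exact depot wire format:

```python
import urllib.error
import urllib.request


def parse_versions(text):
    """Parse a hypothetical /versions/ response into {operation: [versions]}."""
    ops = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) >= 2:
            ops[fields[0]] = [int(v) for v in fields[1:] if v.isdigit()]
    return ops


def probe_depot(base_url, timeout=5):
    """Return the operations a depot advertises, or None if nothing answers
    the /versions/ query -- in which case the client would fall back to
    treating the server as a static file mirror."""
    url = base_url.rstrip("/") + "/versions/0"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            text = resp.read().decode("utf-8", "replace")
    except (urllib.error.URLError, OSError):
        return None  # no depot answering; static requests only
    return parse_versions(text)
```

The client could then route "search -r" and friends only to servers whose probe result includes those operations.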

This does lead to some interesting possibilities though. For example,
the server could maintain a list of "file request servers" in the same
way it maintains a catalog of packages. The client could use that
list and attempt to automatically determine the geographically closest
and most responsive server (via ping or some other simple method?) and
then use that for "file requests" while continuing to use the central
depot server for all other requests. Or that list could indicate which
server the client should use instead of attempting to auto-determine
where to get the content.

Going that direction would allow us to maintain a "relatively trusted"
set of "file request mirrors" on the depot that clients could retrieve
in the same way we handle catalogs now. In addition, the client could
maintain its own custom list of "file request mirrors" to use in
addition to ones reported by the depot server.
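As a rough illustration of the selection step, something like the following could merge the depot-reported mirrors with the client's own list and pick whichever responds fastest. All of the names here are hypothetical, and the default latency probe (a plain HTTP HEAD) is just one of the "simple methods" mentioned above:

```python
import time
import urllib.request


def pick_mirror(depot_mirrors, local_mirrors, probe=None, timeout=3):
    """Return the most responsive "file request" mirror, or None if every
    candidate is unreachable.  Locally configured mirrors are tried first,
    so they win ties against depot-reported ones."""
    if probe is None:
        def probe(url):
            # Cheap responsiveness check: time an HTTP HEAD request.
            req = urllib.request.Request(url, method="HEAD")
            start = time.monotonic()
            urllib.request.urlopen(req, timeout=timeout).close()
            return time.monotonic() - start

    best, best_rtt = None, None
    for url in list(local_mirrors) + list(depot_mirrors):
        try:
            rtt = probe(url)
        except OSError:
            continue  # unreachable mirror: skip it
        if best_rtt is None or rtt < best_rtt:
            best, best_rtt = url, rtt
    return best
```

The `probe` parameter is there so the policy (ping, HEAD timing, a geo lookup) can be swapped without touching the selection logic.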

I know this creates some additional issues, including how to determine
whether a target "file request server" is a legitimate mirror, but I
think it is worth pursuing.

> can actually use such a construct in Python.  So I'm still curious if
> CherryPy will provide that for us, server-side, leaving us only needing a
> client-side implementation.

Yes! Thankfully, CherryPy does support this in its own WSGI server,
and since CherryPy applications can be tied directly to Apache through
mod_python, the support is there as well:
"The builtin WSGI server is now HTTP/1.1 compliant! It correctly
handles persistent connections, pipelining, Expect/100-continue, and
the "chunked" transfer-coding (receive only)." [1]

I managed to find an example of how to do HTTP/1.1 pipelined requests
in Python [2] and discovered that httplib2 [3] apparently supports
this directly [4].
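For anyone who hasn't seen it, the client side of pipelining boils down to writing several requests on one persistent connection before reading any responses. A minimal sketch (the paths are made up, and a real client would also have to handle chunked bodies and mid-stream connection closes):

```python
def build_pipelined_requests(host, paths):
    """Serialize several GET requests back-to-back, the way an HTTP/1.1
    pipelining client writes them on a single persistent connection
    before reading any of the responses."""
    reqs = []
    for path in paths:
        reqs.append(
            "GET {} HTTP/1.1\r\n"
            "Host: {}\r\n"
            "Connection: keep-alive\r\n"
            "\r\n".format(path, host)
        )
    return "".join(reqs).encode("ascii")


# The result can then be written to a socket in one shot, e.g.:
#   sock = socket.create_connection((host, 80))
#   sock.sendall(build_pipelined_requests(host, ["/versions/0", "/catalog/0"]))
#   ...then parse the responses back in request order.
```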

Cheers,
-- 
Shawn Walker

"To err is human -- and to blame it on a computer is even more so." -
Robert Orben

[1] http://trac.cherrypy.org/wiki/WhatsNewIn30
[2] http://trac.cherrypy.org/browser/trunk/cherrypy/test/test_conn.py?rev=#L282
[3] http://bitworking.org/projects/httplib2/ref/module-httplib2.html
[4] 
http://www.xml.com/pub/a/2006/03/29/httplib2-http-persistence-and-authentication.html
_______________________________________________
pkg-discuss mailing list
[email protected]
http://mail.opensolaris.org/mailman/listinfo/pkg-discuss