Re: [pkg-discuss] Code review request for [bugs 1154, 1237, 1845, 1887, 1888]

Shawn Walker Fri, 16 May 2008 17:13:56 -0700

2008/5/16  <[EMAIL PROTECTED]>:
> Shawn,
>
>> > The approach that you're taking now doesn't need to employ streaming.
>> > You could just write the tarfile to a temporary file, serve it, and
>> > delete it.  I'm not sure that's the right approach, though.  Are you
>> > sure there's no sneaky way to get CherryPy to write a streaming response
>> > with an unknown length?
>>
>> The issue is not the unknown length; cherrypy doesn't care about that.
>>
>> The issue is that cherrypy does not provide a file object to write to
>> for the response.
>
> This was why I asked if there was some sleazy trick we could play to
> obtain the file object.  The output has to go back to the requestor over
> a socket, so there's a descriptor for it _somewhere_.  I just wondered
> if we might be able to get hold of it through the framework by doing
> things we weren't supposed to do.


I have not yet been able to find a way to do that.

Everything I've tried so far has resulted in "bad things".

I did post a question to the cherrypy users group though, so I'll see
if someone knows a better way.


>> One possible thought I had was to try creating a temporary file and
>> use that for the tarfile object, and after each tar_stream.add(),
>> stream the temporary file and then truncate it. I would repeat that
>> until done. However, I'm not sure if such chicanery would work :-)
>
> Me either.  Another thought would be to write a separate streaming
> filelist daemon and use Apache's reverse proxy to map filelist ops to
> the separate service.

Yeah, the chicanery didn't work (one of the pkgsend tests failed, not
sure why yet). I tried using a temporary file, StringIO, cStringIO,
etc.

I also tried mucking with:
cherrypy.request.rfile._sock.makefile('w')
cherrypy.request.rfile._sock._fileobject

I think, unfortunately, that this will be an issue I'll have to
resolve somehow before this gets putback.

Either that, or very soon after.

I know creating a temporary file the size of whatever we're streaming
isn't really practical when serving a large number of requests.

I just don't know of any other way to handle it at the moment.
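
One direction that might be worth checking, sketched here with the
standard library only: tarfile's stream mode ("w|") needs nothing more
than an object with a write() method, so the archive can be collected
in memory and handed out chunk by chunk from a generator. The names
ChunkBuffer and stream_tar are made up for illustration, and wiring the
generator into CherryPy (for example via its response streaming
support) is an assumption that has not been tested against the depot.

```python
import io
import tarfile


class ChunkBuffer:
    """Write-only file-like object that collects what tarfile writes."""

    def __init__(self):
        self._chunks = []

    def write(self, data):
        self._chunks.append(bytes(data))
        return len(data)

    def drain(self):
        """Return everything written so far and reset the buffer."""
        data = b"".join(self._chunks)
        self._chunks.clear()
        return data


def stream_tar(members):
    """Yield tar archive bytes incrementally.

    `members` is an iterable of (name, payload_bytes) pairs; this is a
    hypothetical input shape chosen to keep the sketch self-contained.
    """
    buf = ChunkBuffer()
    # "w|" writes an uncompressed tar as a forward-only stream.
    tar = tarfile.open(mode="w|", fileobj=buf)
    for name, payload in members:
        info = tarfile.TarInfo(name=name)
        info.size = len(payload)
        tar.addfile(info, io.BytesIO(payload))
        # tarfile buffers internally, so a drain may be empty.
        chunk = buf.drain()
        if chunk:
            yield chunk
    tar.close()  # flushes the remaining blocks and the end-of-archive marker
    tail = buf.drain()
    if tail:
        yield tail
```

Memory use is bounded by the largest single member plus tarfile's own
block buffer, rather than by the size of the whole archive.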

>> > There has been a fair amount of hand-wringing about how filelist doesn't
>> > fit with our architectural principles.  Perhaps now would be an
>> > opportune time to investigate whether we could switch to sitting behind
>> > Apache and simply pipeline our requests for multiple files?
>> > To be clear, I'm not implying that you should do that work all by
>> > yourself.
>>
>> I was actually considering that. The problem I have is that, as
>> Stephen pointed out, the current protocol must be supported for a
>> while. As such, that would have to be /filelist/1/.
>
> I was imagining that this would simply be a bunch of GET /file/0/
> requests pipelined together.  We'd need to write a httplib that can
> issue pipelined requests, as no Python implementations seem to do this
> yet.  I'm also not sure how many requests Apache will serve before it
> says enough is enough.  (Can we request 64 different files at once?)

httplib2 can supposedly handle pipelined requests.

The cherrypy guys also have an example of doing it "by hand" using the
existing python libraries.
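
For reference, a rough idea of what "by hand" could look like with
plain sockets. This is only a sketch, not the cherrypy example itself:
it assumes every response carries a Content-Length header (no chunked
encoding), that the server answers in request order on one connection,
and the function names are invented here.

```python
import socket


def build_pipeline(host, paths):
    """Return the bytes of several back-to-back HTTP/1.1 GET requests."""
    requests = []
    for path in paths:
        request = "GET {} HTTP/1.1\r\nHost: {}\r\n\r\n".format(path, host)
        requests.append(request.encode("ascii"))
    return b"".join(requests)


def read_responses(stream, count):
    """Read `count` responses in order from a binary file-like object.

    Returns a list of (status_code, body) tuples. Assumes each response
    has a Content-Length header; a missing header is treated as length 0.
    """
    results = []
    for _ in range(count):
        status_line = stream.readline()
        status = int(status_line.split()[1])
        length = 0
        while True:
            line = stream.readline()
            if line in (b"\r\n", b"\n", b""):
                break  # blank line ends the header block
            name, _, value = line.partition(b":")
            if name.strip().lower() == b"content-length":
                length = int(value.strip())
        results.append((status, stream.read(length)))
    return results


def fetch_pipelined(host, port, paths):
    """Send all requests first, then read the responses in order."""
    with socket.create_connection((host, port)) as sock:
        sock.sendall(build_pipeline(host, paths))
        with sock.makefile("rb") as stream:
            return read_responses(stream, len(paths))
```

Something like fetch_pipelined("localhost", 10000, ["/file/0/<hash1>",
"/file/0/<hash2>"]) would be the intended use, with the host, port, and
hashes as placeholders.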

>> One of the things I struggled with while making these changes was
>> whether it was more efficient to pass the request and response object
>> around (and cleaner) or whether it was better to simply use the
>> singleton object to access them.
>
> My guess is that it might be faster to pass the request and response
> object; however, the difference probably isn't enough to be appreciable.

I'll take a look back at the code and see how big of a change it would
be to do this.

>> > repository.py:
>> > 307 - Does serve_file return a 404 if it can't find the file at the path
>> > that it has been given?
>>
>> If you specify a request path that cherrypy is unable to map to an
>> object through the "mounted object tree" (see quickstart in depot.py)
>> it will pass it off to the default page handler if you have one. If
>> you don't have one it will return a 404.
>>
>> So yes, currently the depot code is set up with a custom default page
>> handler that will return a 404 for unknown pages via face.py:unknown.
>
> I guess I may have asked this question obliquely.  I was trying to
> figure out what happens if the client requests a filehash that isn't in
> the depot.  The code in file_0 looks like this:
>
>        return serve_file(os.path.normpath(os.path.join(
>            self.scfg.file_root, misc.hash_file_name(fhash))),
>            'application/data')
>
> So if fhash isn't in the depot and the path we pass to serve_file
> doesn't actually name a file, do we get a 404 here or an exception?

We get a 404. cherrypy handles that too :-)
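
To illustrate the pattern, here is a stdlib-only stand-in, not
cherrypy's actual code: a path that cannot be opened is turned into a
"not found" error, and the dispatch layer maps that error to a 404
instead of letting the exception escape. NotFound, serve_static, and
handle are all hypothetical names used only for this sketch.

```python
import os


class NotFound(Exception):
    """Hypothetical stand-in for a framework's 404 error."""

    status = 404


def serve_static(root, relpath):
    """Return the file's bytes, or raise NotFound if it cannot be read."""
    path = os.path.normpath(os.path.join(root, relpath))
    try:
        with open(path, "rb") as f:
            return f.read()
    except OSError:
        # Missing file, directory, permission problem: report "not found".
        raise NotFound(relpath)


def handle(root, relpath):
    """Map the outcome to an (HTTP status, body) pair."""
    try:
        return 200, serve_static(root, relpath)
    except NotFound as err:
        return err.status, b"Not Found"
```

The handler itself never needs its own try/except around the file
access, which is the simplification described in this thread.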

> If I read your response correctly, this is a 404. I just wanted to make
> sure I understood.

It was actually a separate case, but the answer is the same.

In general, that was one of the things I liked the most: the fact that
cherrypy did what I expected in most cases.

All of the various code we had for dealing with exceptions while serving
what is essentially static content could be greatly simplified.

Cheers,
-- 
Shawn Walker

"To err is human -- and to blame it on a computer is even more so." -
Robert Orben
_______________________________________________
pkg-discuss mailing list
[email protected]
http://mail.opensolaris.org/mailman/listinfo/pkg-discuss
