Re: [pkg-discuss] Code review request for [bugs 1154, 1237, 1845, 1887, 1888]

johansen Thu, 15 May 2008 10:49:21 -0700

Hey Dude,
Thanks for doing this work.  I commented on your notes first, and the
code-review feedback is follows that.


> Notes:
> * The tar streaming now has to write the file to a temporary file
> first before streaming. This is an unfortunate side-effect of cherrypy
> not providing a file object to write data directly back to the client.
> Initially, I considered using a StringIO object, but that would have
> bloated the server process quite a bit (memory usage).

This effectively negates the intended benefit of streaming the files.
The idea was to write the tar stream out to the socket as we read each
file so that memory and disk usage would remain low.

The approach that you're taking now doesn't need to employ streaming.
You could just write the tarfile to a temporary file, serve it, and
delete it.  I'm not sure that's the right approach, though.  Are you
sure there's no sneaky way to get CherryPy to write a streaming response
with an unknown length?

In a previous e-mail you noted that it's possible to run CherryPy behind
Apache.  The documentation for downloading files seems to hint that
using Apache is preferable, at least for this task.
(http://www.cherrypy.org/wiki/FileDownload).

There has been a fair amount of hand-wringing about how filelist doesn't
fit with our architectural principles.  Perhaps now would be an
opportune time to investigate whether we could switch to sitting behind
Apache and simply pipeline our requests for multiple files?
To be clear, I'm not implying that you should do that work all by
yourself.

> * I have not yet done extensive performance testing, nor have I
> checked for the threading issues that we previously had.

Will you do both before you putback?  An extensive test may be above and
beyond the call of duty; however, some amount of performance and
threading testing would sure be nice.

I have one more question that wasn't really in your notes.  One of the
justifications for switching to CherryPy was that it would simplify our
codebase.  However, most of the code in the depot seems to have been
moved (more, or less) from depot.py to repository.py.  Is there some
portion of the existing code that you think we could simplify once we
switch to using CherryPy?  I'm also curious how you see this change
improving our ability to add new features to the depot.  Perhaps you
could give an example of how you see this helping future development?

> webrev:
> http://cr.opensolaris.org/~swalker/pkg-depot-3/

depot.py:

44 - Would it make sense to check if the import failed and print a message
that CherryPy is required to run the depot?  

catalog.py:

I'm a little slow this morning.  Why do we need to embed two different
methods in catalog.send()?  I'm sure it's justified.  I'm just having a
hard time figuring out why.

updatelog.py:

26 - This module is common to the client and the server.  Unless I'm
being a moron, this import means the client will also depend upon
cherrypy being installed.

repository.py:

Just so I'm sure I understand, what does the @cherrypy.expose decorator do?

79 - If cherrypy handles this automatically, can we assert this
condition, or throw some kind of exception if we're processing a
function that isn't exposed?

307 - Does serve_file return a 404 if it can't find the file at the path
that it has been given?

There are a number of places in updatelog and transaction where you
refer to the current request or response as cherrypy.request and
cherrypy.response, respectively.  Are these guaranteed to always be for
the HTTP session we're currently processing?  Do you know how cherrypy
enforces this?

Thanks,

-j
_______________________________________________
pkg-discuss mailing list
[email protected]
http://mail.opensolaris.org/mailman/listinfo/pkg-discuss

Re: [pkg-discuss] Code review request for [bugs 1154, 1237, 1845, 1887, 1888]

Reply via email to