This proposal adds hacks to a hack to solve a problem that we never
bothered to design for in the prototype.

> [I]t's annoying to see this (elided for clarity):
> 
> [07/May/2008:18:52:41 -0700] "POST /filelist/0 HTTP/1.0" 200
> [07/May/2008:18:52:36 -0700] "POST /filelist/0 HTTP/1.1" 200
> [07/May/2008:18:52:43 -0700] "POST /filelist/0 HTTP/1.1" 200
> 
> As you can see, all the stuff you might want to know is buried
> in the POST operation.

Yes, but that's not actually the problem.  By the time we print out this
log message, we've decoded the header in the POST request and could
easily print the content hashes for the files that were requested.

The problem is that we're logging from a proxy and only grabbing the
URL.
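
To illustrate logging from the depot rather than the proxy, here is a
minimal sketch.  The "File-Name-<n>: <hash>" header layout and both
function names are assumptions made for the example, not a description
of the actual filelist wire format.

```python
def requested_hashes(headers):
    """Pull content hashes out of the decoded POST headers.

    Assumes (hypothetically) that each requested file arrives as a
    'File-Name-<n>: <hash>' header.
    """
    prefix = "file-name-"
    return [value for name, value in sorted(headers.items())
            if name.lower().startswith(prefix)]


def format_filelist_log(method, path, headers):
    """Build a log line that carries the hashes as well as the URL."""
    hashes = requested_hashes(headers)
    return "%s %s files=%s" % (method, path, ",".join(hashes))
```

With something like this the log line carries the requested hashes
instead of a bare "POST /filelist/0".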

> For those not familiar with what RESTful means,
> well, it's the opposite of this-- that is to say, the URL should provide
> the context.
> 
> So my short-term proposal is to alter this to something like:
> 
>  POST /filelist/0/[EMAIL PROTECTED]
>                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

If you were to actually write this so the URL provides the context, it
would be something like:

        GET /filelist/0/file1=<hash1>&file2=<hash2>&file3=<hash3>

Filelist is completely package agnostic.  It's a way of retrieving an
arbitrary set of files in a batch from the server.

> This could mean: get all files associated with this package (this is the
> "install fresh" case).  This could even be mapped as:
> 
>   GET /package/0/SUNWgnome-img-editor-help-es/0.5.11%2C5.11-0.86
>       ^^^^^^^^

This approach makes sense, since the operation is actually related to
the package in question.

>   GET /filelist/0/SUNWgnome-img-editor-help-es/0.5.11%2C5.11-0.86/1:2:7:9
> 
> This could mean "get the files associated with the 1st, 2nd, 7th, 9th
> file actions in the manifest" (this would rely on the client and server
> both obeying a predictable sort order for files in a manifest).

I don't like the idea of demanding that the manifest be ordered in a
particular way.  This seems like an invitation for utter chaos when bugs
inevitably creep into the manifest or sorting code.
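
A small sketch of the failure mode I'm worried about.  The function
names and the toy manifests are made up for illustration; the only
point is that positions depend on both sides sorting identically, while
content hashes do not.

```python
def select_by_index(manifest_files, indices):
    """Pick files by 1-based position, as in the quoted '1:2:7:9' form."""
    return [manifest_files[i - 1] for i in indices]


def select_by_hash(manifest_files, hashes):
    """Pick files by content hash, independent of manifest order."""
    wanted = set(hashes)
    return [f for f in manifest_files if f in wanted]


# The client and server disagree on sort order because of a bug.
server_view = ["hash_a", "hash_b", "hash_c"]
client_view = ["hash_b", "hash_a", "hash_c"]

# The client wants its first file (hash_b) and asks for index 1.
wanted = client_view[0]
by_index = select_by_index(server_view, [1])   # server returns hash_a
by_hash = select_by_hash(server_view, [wanted])  # server returns hash_b
```

The index-based request silently returns the wrong file; the hash-based
one cannot.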

Instead, I would make your "package" operation both a GET and a POST.  In
the case where you're only interested in a particular subset of files in
the package, send a POST with a header that contains just the content
hashes of the files you're interested in.  However, I still think there
are drawbacks to this approach.  If you're only grabbing a few small
files from each of a bunch of different packages, it would make a lot
more sense to aggregate them into one larger request and avoid a
separate round trip per package.
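
As a rough sketch of the aggregation I mean, assuming the client
already knows which content hashes it needs for each package (the
function name and the sample data are hypothetical):

```python
def aggregate_requests(needed_by_package):
    """Merge per-package hash lists into one de-duplicated batch.

    Order of first appearance is preserved, so the result is stable.
    """
    seen = set()
    batch = []
    for hashes in needed_by_package.values():
        for h in hashes:
            if h not in seen:
                seen.add(h)
                batch.append(h)
    return batch


needed = {
    "pkgA": ["h1", "h2"],
    "pkgB": ["h2", "h3"],
    "pkgC": ["h4"],
}
batch = aggregate_requests(needed)
```

A per-package operation would need three requests here; a
package-agnostic batch needs one, and a file shared by two packages is
only fetched once.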

More to the point, I don't actually think we care that much about the
exact breakdown of packages downloaded.  We can count the number of
times a manifest gets downloaded.

What's really interesting are the packages that the user requested, as
opposed to the requested packages plus the required dependencies.
Sending a header to the server that contained the FMRIs that we parsed
on the pkg(1) command line would be far more illuminating, at least if
we want to measure which packages are actually popular.
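
Something along these lines, as a sketch only.  The header name
"X-Pkg-Requested-FMRIs", the helper names, and the example URL are all
invented for illustration; nothing like this exists in the client
today.  Each FMRI is percent-encoded because FMRIs can themselves
contain commas.

```python
import urllib.parse
import urllib.request

# Hypothetical header name, chosen only for this example.
INTENT_HEADER = "X-Pkg-Requested-FMRIs"


def encode_fmris(fmris):
    """Percent-encode each FMRI and join them with commas."""
    return ",".join(urllib.parse.quote(f, safe="") for f in fmris)


def make_request(url, fmris):
    """Build (but do not send) a request tagged with the user's FMRIs."""
    return urllib.request.Request(
        url, headers={INTENT_HEADER: encode_fmris(fmris)})


req = make_request("http://depot.example.invalid/filelist/0",
                   ["SUNWfoo@0.5.11,5.11-0.86", "SUNWbar"])
```

The server could then log that header alongside the URL and tell
explicitly requested packages apart from pulled-in dependencies.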

Part of the work we're going to need to do for mirroring will involve
making filelist handle a failure to download requested files from a
particular host.  We're going to need to make filelist more flexible.
If you really want this kind of observability for package file
downloads, and it's not entirely clear to me that you do, I would use
the "package" operation that you proposed.  If we turn out not to need
filelist for the mirroring stuff, we can take it out and let "package"
be its eventual successor.  I don't see why they couldn't co-exist for
now, though.

-j
_______________________________________________
pkg-discuss mailing list
[email protected]
http://mail.opensolaris.org/mailman/listinfo/pkg-discuss