[email protected] wrote:
Brock,

On Mon, Sep 28, 2009 at 02:24:37PM -0700, Brock Pytlik wrote:
Webrev:
http://cr.opensolaris.org/~bpytlik/ips-7960-v1/

transport.py:

  - lines 456-460:  Why is this exception handler necessary?
Why wouldn't it be? This seemed like a reasonable place to translate the exception into something the client code was prepared to deal with.
  - lines 669:  This check is different from the previous conditional.
    Is it possible for cache_path not to exist, even if lookup returns a
    path?
If lookup returns a path, then at that moment, that path exists in the cache. If more than one process (with different layout configurations or with different readonly settings) is operating on that directory, it's possible for race conditions to happen. That's one reason the lookup method has an option to return a file handle.
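A minimal sketch of that distinction (function names here are illustrative, not the actual transport API): a path returned by lookup can be stale by the time the caller opens it, while an open handle survives a concurrent move or removal.

```python
import os

def lookup_path(cache_dir, name):
    """Return the path if the file currently exists in the cache (racy)."""
    path = os.path.join(cache_dir, name)
    return path if os.path.exists(path) else None

def lookup_handle(cache_dir, name):
    """Return an open read-only handle, or None.  Once open() succeeds,
    the caller keeps access to the content even if another process
    later moves or removes the file."""
    try:
        return open(os.path.join(cache_dir, name), "rb")
    except (FileNotFoundError, PermissionError):
        return None
```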
modules/file_layout:

  - The source under this directory hierarchy is < 300 lines of code.
    Could we coalesce this into a single file in modules/ or am I
    missing something?
I thought that since the layouts and file manager were fundamentally different pieces of functionality, they should go into different files.
file_layout/__init__.py:

  - lines 29-31: This means that if you import file_layout without
    further arguments, you'll get everything in file_manager and layout.
    Is this what you intended?  You're generally importing these modules
    specifically when you need them in other code.
I've changed this to __all__ = []. Thanks for catching this.
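For anyone following along, a small self-contained demonstration of what an empty __all__ buys you (the package name here is a throwaway stand-in, not the real file_layout package):

```python
import os
import sys
import tempfile

# Build a throwaway package whose __init__.py sets __all__ = [].
pkgdir = os.path.join(tempfile.mkdtemp(), "file_layout_demo")
os.mkdir(pkgdir)
with open(os.path.join(pkgdir, "__init__.py"), "w") as f:
    f.write("__all__ = []\n")
with open(os.path.join(pkgdir, "layout.py"), "w") as f:
    f.write("FAN_WIDTH = 256\n")
sys.path.insert(0, os.path.dirname(pkgdir))

ns = {}
exec("from file_layout_demo import *", ns)
assert "layout" not in ns            # the wildcard import pulled in nothing

from file_layout_demo import layout  # explicit imports still work
assert layout.FAN_WIDTH == 256
```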

file_layout/file_manager.py:

  - line 33: How about, "This exception is raised when the caller
    attempts to modify a read-only FileManager."
Sure
  - lines 40-42: This might be more clear as, "The FileManager cannot
    create %s because it is configured read-only."
Except that it's not always creation that's a problem. Assuming you're fine with create also being substitutable, I'm fine with this.
  - line 46: This name conflicts with a class in another module:
    api_errors.PermissionsException.
I was under the impression that it was fine to have names that were the same but in different modules... that's why the module names are attached to them. In fact, there are other places in the code where this is done today. (see api_errors.py vs search_errors.py)
  - General comment about these two exceptions: Should these inherit
    from some form of API exception?  Unless you're catching these in
    every code path that might emit them, it probably makes sense to
    give the API some kind of ability to catch these.
I don't think so. This is also what the server will be using to organize its files, so I didn't want to include client specific exceptions (which the api exceptions are). I'll double check that the exceptions are being caught properly on the server side as well.
  - lines 72-73: Should this be a class property?  It doesn't look like
    it's going to vary between instances.
I don't think so. If it ever changed, it'd probably vary by instance, not by class. For example, if the layouts were stored on the file system (something I considered for a bit before deciding it was currently unnecessary), then part of initializing an instance would be to load them off disk. Besides, in that world, if a multi-threaded program were operating on two different images with different layouts, having it as a class variable would break things.
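The breakage being argued about can be shown in a few lines (class and attribute names here are simplified stand-ins for the real FileManager):

```python
class Layout:
    def __init__(self, name):
        self.name = name

class FileManager:
    def __init__(self, root, layouts=None):
        self.root = root
        # Per-instance state: each image carries its own layout list, so
        # two managers with different configurations never share it.
        self.layouts = layouts if layouts is not None else [Layout("256-fan")]

fm1 = FileManager("/image1")
fm2 = FileManager("/image2", layouts=[Layout("flat")])
fm2.layouts.append(Layout("extra"))
assert [l.name for l in fm1.layouts] == ["256-fan"]   # fm1 untouched
```

Had layouts been a class attribute, the append on fm2 would have been visible through fm1 as well.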
  - line 91:  How does this function cope with files that may be
    readable in one tree, but not the other?  If you've found a file
    that exists in both layouts, but is only readable (or modifiable) in
    that exists in both layouts, but is only readable (or modifyable) in
    one, is there any mechanism to ensure that this works properly?
    What do you do if you can't move the file from the old layout to the
    new layout?  It looks like you have some code on lines 132-134 that
    tries to handle this, but it doesn't appear to make a distinction
    between the move failing because the destination wasn't writable,
    and the move failing because the source wasn't writable.  I can
    imagine some cases where we'd want to copy the file over instead,
    even if we can't remove it.

    In fact, if you had a depot on a stick, you might want the layout
    manager to copy the files over to a local disk as they're accessed,
    but this current model doesn't look like it would support that kind
    of incremental depot migration.
I'm not sure what you mean by "one tree but not the other." Assuming layout can replace tree, then right now it doesn't cope. What happens currently if someone goes in and makes selected files in the download directory unreadable? If it's useful, I can stat the file on line 110 and ensure that it's readable.

I'm also happy to introduce logic that copies the files, but in general, I think that (like the stat issue) is introducing needless complexity. If a user's messing with the directories/permissions/files inside /var/pkg, then they're on their own to some extent (at least that's always been my understanding).
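If we did go down that road, the distinction Danek is asking for could look roughly like this (a sketch only; the function name and return values are made up, not the code under review):

```python
import os
import shutil

def migrate(src, dst):
    """Move a file from an old layout to a new one, distinguishing an
    unreadable source from an unwritable destination, and falling back
    to a copy when the source can't be removed."""
    if not os.access(src, os.R_OK):
        return "source-unreadable"   # nothing we can do with this file
    try:
        os.rename(src, dst)          # cheap move within one filesystem
        return "moved"
    except OSError:
        pass
    try:
        shutil.copy2(src, dst)       # move failed; try copying instead
    except OSError:
        return "destination-unwritable"
    try:
        os.unlink(src)               # best effort: drop the original
        return "moved"
    except OSError:
        return "copied"              # e.g. a read-only depot on a stick
```

The "copied" branch is what would make the depot-on-a-stick incremental migration possible.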

  - lines 147-149: Why not defer the lookup on 146 until we've checked
    read-only?
Yeah, I had it that way originally, not sure what I was thinking when I changed it.
  - lines 182-184: Same comment as above.

file_layout/layout.py:

  - This module needs more documentation.
OK
  - Why do you allow Original to name itself "s1" by defining SHA1_160,
    but Fan256Once allows the class that instantiates it to give its
    name.  Wouldn't it be better to encode the name as a class property
    in both cases instead?
Original can only be SHA1_160 since that's the old format and we're not going back to change it. Fan256Once can also use SHA_256 when we add it, hence the name is an argument to the constructor.
  - lines 59/60 and 65/66:  These should match the function signatures
    in the parent class that they are overriding.
They do, except that in one class they're declared static and in the other they're not. I can make them all non-static if you prefer. This works, though.
General comments:

  - You did this work as part of a performance investigation surrounding
    file layouts and directory access times.  It would help the rest of
    us provide useful feedback if you would include information about
    the experiments that you ran, and the data that you obtained.  In
    particular, how did you decide that one level of 256 directories was
    optimal?
Here's the openoffice spread sheet if you'd like to see the numbers (http://cr.opensolaris.org/~bpytlik/ips-dir-timing.ods). I can also put out the python script I used, or the entire hg repo so you can see each experimental step I took.

Short version: I created various numbers of files, from 1 up to 10M, and placed them on disk using different layouts. The layouts included: flat (all files in one directory); multi-level fanout, where each level had a fanout of N and the number of levels was determined by rounding up log base N of the total number of files, for N of 16, 256, 1024, 4096, and 32768; the final choice of one level of 256-way fanout; and the current scheme. I stopped experimenting with the current scheme because its time to complete was growing at an unsustainable rate. I also looked at different encodings for the file names to see whether the difference in name length mattered. It didn't make a noticeable difference at the scales we're looking at.
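As a sanity check on the level arithmetic (this is just the round-up-the-log rule restated in code, not the benchmark script itself):

```python
import math

def fanout_levels(total_files, fan):
    """Levels needed so each directory holds roughly `fan` entries:
    round up log base `fan` of the total file count."""
    return max(1, math.ceil(math.log(total_files, fan)))

# For the 10M-file experiments:
assert fanout_levels(10_000_000, 16) == 6
assert fanout_levels(10_000_000, 256) == 3
assert fanout_levels(10_000_000, 4096) == 2
# And one level of 256, at 32768 files per directory, caps out at:
assert 256 * 32768 == 8_388_608
```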

Initially, I was evaluating four tasks. 1) The time it takes to create N files and place them in the layout correctly. 2) Take 10 percent of the files and consider that the working set; take 20% of that set and consider that the highly active set. Drawing from the highly active set twice as often as the rest of the working set (our rough guesstimate of a real-world situation), time how long it takes to open and close N files. 3) The time an rsync -n takes for the entire structure. 4) The time it takes to remove the entire directory tree.
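For the curious, the task-2 access pattern can be reconstructed roughly like this (a hypothetical sketch; the exact draw ratio and set boundaries in the real script may differ):

```python
import random

def access_sequence(files, n_opens, seed=0):
    """10% of the files form the working set, the first 20% of that is
    the highly active set, and the hot set is drawn from twice as often
    as the rest of the working set."""
    rng = random.Random(seed)
    working = files[: max(1, len(files) // 10)]
    hot = working[: max(1, len(working) // 5)]
    warm = working[len(hot):] or hot   # degenerate case: tiny file lists
    seq = []
    for _ in range(n_opens):
        # two draws from the hot set for every one from the rest
        pool = hot if rng.random() < 2 / 3 else warm
        seq.append(rng.choice(pool))
    return seq
```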

Because rsync -n and removal consistently produced the same graph shapes for all the layouts, I eventually dropped them from the tests.

Looking at the numbers, flat was the best performer on ZFS (the only FS we tested on) and also has the advantage of being the simplest to implement. However, some other file systems we would like to support can't really cope with 1M files in a single directory. It also has some downsides for a user: ls'ing the directory used 374M of RAM, if I remember right, while running os.listdir within Python used about 100M. For reference, the current dev depot was at about 1M files a month ago when I took my measurements. A client system seemed to be in the 300k to 600k range, depending on how long the machine had been upgraded, what packages were installed, etc. Rather than fork the code per platform/use, I tried the 256-once layout. This had performance comparable to flat (again, see the spreadsheet) and would let us avoid forking the code unless we had a client or server on a non-ZFS file system with more than 8,388,608 (256 * 32768) files in it. This seemed like a reasonable limit.

I don't claim that this is the optimal solution across all platforms. I make the following claims:
1) It won't break (in the absolute sense) on reasonably sized servers and clients across platforms.
2) It's a big performance improvement for ZFS servers and clients.
3) Other platforms can be forked at a later date if a more optimal layout is found for them.
4) A better layout can be added in the future if someone else thinks of one.
5) The most important bit: it's faster than what we have now, roughly an order of magnitude faster for removal and rsync.

  - You've created the file layout and file manager as part of an
    extensible architecture.  Would you provide more information about
    your design and architecture?  As an example, how would I go about
    adding a new layout to this design?
You'd add a new class to the layout file, and add an instance of it as the first item in self.layouts in FileManager.
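In skeleton form, the extension recipe looks something like this (class and method names are illustrative stand-ins, not the actual pkg(5) API):

```python
import os

class Layout:
    """A layout maps a file name to its relative path on disk."""
    def lookup(self, name):
        raise NotImplementedError

class Flat(Layout):
    """The old scheme: every file in one directory."""
    def lookup(self, name):
        return name

class FanOnce(Layout):
    """One level of fanout keyed on the first two characters."""
    def lookup(self, name):
        return os.path.join(name[:2], name)

class FileManager:
    def __init__(self):
        # A new layout is adopted by inserting an instance of it first;
        # older layouts remain behind it so existing files are still found
        # and can be migrated as they're accessed.
        self.layouts = [FanOnce(), Flat()]

fm = FileManager()
assert fm.layouts[0].lookup("ab12cd") == os.path.join("ab", "ab12cd")
assert fm.layouts[1].lookup("ab12cd") == "ab12cd"
```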
  - The FileManager implements a cache of sorts, but I don't see any
    hooks for some of the typical operations that would be performed on
    a cache.  How do you determine what items to evict when the cache is
    full?  How do you plan to support multiple policies, both for sizing
    and eviction?  Are there any options for those who need to perform
    external cache maintenance?
Not this bug.

-j

_______________________________________________
pkg-discuss mailing list
[email protected]
http://mail.opensolaris.org/mailman/listinfo/pkg-discuss