Background/Scope
----------------

Last week, Katie, Morgen, Ted and I got together to review the current packaging of Chandler and the progress of the flattening project. We also talked about creating some patterns for Chandler's API organization. This post is my attempt at summarizing our discussion and hopefully consensus on how Chandler APIs should be organized, within the following overall scope:

1. Get API packaging/grouping settled in 0.6. Given that this will likely involve more flattening and flinging directories around, getting the overall groupings settled is all we can reasonably expect to do in 0.6.

2. Examine individual APIs more closely in 0.7. This will be a good time (or at least a better time than now) to revisit Chandler's content model, the implementation of stamping, and inheritance vs. composition questions in general.


Goals/Issues
------------

Given the scope, the goals of the API guidelines are to:

Make Things Easy to Find and Use
One of the 0.6 API goals is that someone should be able to create a parcel like the Amazon or Flickr parcels without needing someone from OSAF to sit over their shoulder. Currently, however, you need either fairly intimate knowledge of the codebase, or you end up copying things into your parcel that you don't understand and can't fix. This is partly because we don't...

Distinguish Internal Components From APIs
Right now, there are no organizational cues that distinguish internal components from APIs in the codebase. In many cases there are not even cues to tell a module apart from a class within that module, which can produce confusing errors for people who are less familiar with the codebase. This isn't just a "newbie" problem, either; providing organizational cues is more like making a handle ridged so you can get a better grip on it: it makes things easier for everybody, all the time. And speaking of making things easier, we also want to...

Provide Easy API Access in Scripting/Testing Tools
Our API strategy needs to consider tools like "headless", CPIAScript, and any future embedded-in-Chandler development or scripting tools. They need a way to access APIs that's consistent with the way they're used in Python. If the way we're accessing APIs in Python is "too hard" to be used in these other tools, then the way we're accessing APIs is too hard, period, and we need to make it better, rather than making "headless" or CPIAScript into isolated, training-wheels-only ghettoes. Similarly, we want to...

Follow Python Community Lessons-Learned
...so that we don't create an isolated island of Chandler-only rules. We're not in a position to dictate standards, and in any case the common Python practices have solid reasoning behind them. Minor deviations (like allowing "_" in module names) aren't a big deal, but the overall look-and-feel of the APIs shouldn't produce any big surprises.


Proposed Guidelines
-------------------

Don't Stop Application Work to Refactor
Obviously, application feature work takes priority. It's also best to keep reorganization a separate task from making actual changes or additions, because otherwise it becomes hard to tell what's a problem with a move, versus what's a problem with a change.


APIs are short, all-lowercase, top-level names under "osaf"
Each API will be accessible via "from osaf import apiname". For example, "from osaf import pim, mail, startup" would import the pim, mail, and startup APIs. You can then use expressions like "pim.Contact" to refer to a particular class or function.

Notice that this means that APIs that are packages *must* import any API classes, functions, etc. from the package's internal modules, to make them available for direct access from the API namespace. Users *must not* need to import modules from inside an API package, and the only time third-level imports may be used is from within the API package itself. For example, "from osaf.pim.items import ContentItem" should not appear in any code outside the osaf.pim package.

Instead, "from osaf import pim" and "pim.ContentItem" should be used wherever possible, so that we have a consistent message as to what a thing is called, and the same name can be used in all of our tools. For example, the Python interpreter for the "headless" tool can import the APIs so that it suffices to refer to 'pim.Contact' to use the Contact type. The same applies to CPIAScript, etc., so that we have only *one* name by which a given thing is known, across all tools.
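As a rough sketch of the re-export rule, the script below builds a throwaway package on disk and then uses it the way client code would. Everything in it is an assumption made for illustration: "osaf_demo" stands in for "osaf", the classes are empty placeholders, and the relative-import syntax is the Python 3 spelling rather than anything taken from the Chandler tree.

```python
import importlib
import os
import sys
import tempfile
import textwrap

# Build a throwaway package tree on disk. "osaf_demo" is a hypothetical
# stand-in for "osaf"; ContentItem and Contact are empty placeholders.
root = tempfile.mkdtemp()
pim_dir = os.path.join(root, "osaf_demo", "pim")
os.makedirs(pim_dir)


def write(path, source):
    """Write dedented source text to the given path."""
    with open(path, "w") as f:
        f.write(textwrap.dedent(source))


write(os.path.join(root, "osaf_demo", "__init__.py"), "")

# Internal modules: code outside the pim package should never import these.
write(os.path.join(pim_dir, "items.py"), """
    class ContentItem(object):
        pass
""")
write(os.path.join(pim_dir, "contacts.py"), """
    from .items import ContentItem

    class Contact(ContentItem):
        pass
""")

# The package __init__ pulls the API names up into the "pim" namespace.
write(os.path.join(pim_dir, "__init__.py"), """
    from .items import ContentItem
    from .contacts import Contact
""")

sys.path.insert(0, root)
importlib.invalidate_caches()

# Client code uses only the two-level import and the "pim." prefix.
from osaf_demo import pim

contact = pim.Contact()
print(isinstance(contact, pim.ContentItem))
```

The only third-level import in the sketch ("from .items import ContentItem") lives inside the pim package itself, which is the one place the guideline allows it.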

(Since a given API symbol is always used with its API name, we may also want to rename some API symbols for a better fit; for example, if there were a MailMessage type exported from the mail API, it might make sense to rename it to mail.Message.)

In the top level 'osaf.__init__' module, we will have an '__all__' list containing the names of API packages that conform to these API guidelines, so that tools like "headless" can use "from osaf import *" to obtain all of the APIs. (If an API package does not yet conform to these guidelines, it must not be added to this list.)
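To illustrate how a top-level '__all__' list could gate what a tool sees, here is a minimal sketch. The package name "osaf_all", the "legacy" module, and the way the session namespace is prepared are all hypothetical; the actual "headless" tool may set things up differently.

```python
import importlib
import os
import sys
import tempfile

# "osaf_all" is a hypothetical stand-in for "osaf"; "legacy" represents an
# API package that does not yet conform and so is left out of __all__.
root = tempfile.mkdtemp()
pkg_dir = os.path.join(root, "osaf_all")
os.makedirs(pkg_dir)

files = {
    "__init__.py": "__all__ = ['pim', 'mail']\n",
    "pim.py": "class Contact(object):\n    pass\n",
    "mail.py": "class Message(object):\n    pass\n",
    "legacy.py": "class OldThing(object):\n    pass\n",
}
for name, source in files.items():
    with open(os.path.join(pkg_dir, name), "w") as f:
        f.write(source)

sys.path.insert(0, root)
importlib.invalidate_caches()

# Simulate what a scripting tool might do when preparing the namespace
# for an interactive session.
session = {}
exec("from osaf_all import *", session)

# Ignore dunder entries such as __builtins__ that exec() adds.
api_names = sorted(name for name in session if not name.startswith("__"))
print(api_names)
```

Only the modules named in '__all__' end up in the session, so a non-conforming package stays invisible to the tool until someone deliberately lists it.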


APIs Are Also Defined Using '__all__'
API modules should define an '__all__' list at the top of the module, listing the names of API classes, functions, constants, etc. in that module. (See application.schema and osaf.startup for examples.) The pydoc tool and "import *" both use this list to control their operation, and it gives people reading the code a quick list of what is actually part of the API. It frees you from having to use '_private' names for classes or functions you don't want to include in the API, or which are not yet ready for inclusion in the API. '__all__' is also useful for assembling a package API from individual modules' API; more on this below.
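A small sketch of the module-level case follows. The module name "contacts_demo" and the names inside it are invented for illustration; the module is created in memory only so that the example is self-contained.

```python
import sys
import types

# Source for a hypothetical API module. "contacts_demo", Contact,
# find_contact and normalize are invented names for illustration.
SOURCE = '''
__all__ = ['Contact', 'find_contact']

class Contact(object):
    def __init__(self, name):
        self.name = name

def normalize(name):
    # Helper that is not part of the API, yet needs no leading underscore.
    return name.strip().lower()

def find_contact(contacts, name):
    wanted = normalize(name)
    for contact in contacts:
        if normalize(contact.name) == wanted:
            return contact
    return None
'''

# Create the module in memory and register it so it can be imported.
module = types.ModuleType("contacts_demo")
exec(SOURCE, module.__dict__)
sys.modules["contacts_demo"] = module

# "import *" copies only the names listed in __all__.
namespace = {}
exec("from contacts_demo import *", namespace)
exported = sorted(name for name in namespace if not name.startswith("__"))
print(exported)
```

Here 'normalize' stays out of the importing namespace even though it has no leading underscore, which is the freedom from '_private' names described above.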


Modules are Better Than Packages
Don't create a package for an API unless there's too much code. What "too much code" is can vary, but anything over 2500 lines is probably too much, and anything under 1000 probably doesn't make sense to split. If in doubt, err on the side of keeping it a single module, because someone browsing the code can skim a single module more easily than they can dive into a subdirectory and poke around in more modules. (Especially if you put an __all__ list at the top of the module so they know what the API is.)

Keep in mind also that packages are not for categorization. It's easier for people to scan a flat list of mixed APIs like "app, pim, mail, startup, webserver, sharing, etc." than to delve into a many-leveled tree that perfectly categorizes the code.

If you're creating an API module under osaf, you can put its tests into osaf.tests, and you can put any doctest files adjacent to the module. (E.g. 'startups.txt' is adjacent to the 'osaf.startup' module.) This is a great opportunity to include programmer overview documentation and testable examples, right where they can be found when browsing the source tree.
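For anyone who hasn't used doctest files, the sketch below shows the general shape: a plain-text overview whose examples are executed as tests. The file name "overview.txt" and its contents are hypothetical and not taken from the Chandler source, and the file is written to a temporary directory only to keep the example self-contained.

```python
import doctest
import os
import tempfile

# A hypothetical overview document; the file name and its contents are
# invented for illustration and are not taken from the Chandler source.
OVERVIEW = """\
Working with Sorted Names
=========================

This overview doubles as a test. Names sort case-insensitively when a
key function is supplied:

    >>> sorted(["pim", "Mail", "startup"], key=str.lower)
    ['Mail', 'pim', 'startup']

"""

doc_dir = tempfile.mkdtemp()
doc_path = os.path.join(doc_dir, "overview.txt")
with open(doc_path, "w") as f:
    f.write(OVERVIEW)

# module_relative=False makes doctest treat doc_path as a filesystem path.
results = doctest.testfile(doc_path, module_relative=False)
print(results.failed, results.attempted)
```

In a real tree the text file would simply sit next to the module it documents, and the test suite would run it with the same doctest.testfile call.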


Use Packages When a Module Gets Too Big
But *only* if it's really too big. It's easier to expand a module into a package than to go the other way around. When you split a module into multiple modules, you should make sure each module defines the right '__all__' list for its contents, and add a '__parcel__="osaf.whatever"' setting, to ensure that the entire package goes in the same parcel. The __init__.py for your new package should then use 'from foo import *' for each module within the package, so that existing code still works. You should *not* give any of the split modules the same name as the old module; this is confusing. Instead, break up the original module along thematic lines and name the two or more modules for those themes within the overall API scope.
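Here is a hedged sketch of what such a split might look like. The names "osaf_split", "sharing", "shares", "conduits", Share, Conduit and ShareHelper are all hypothetical, '__parcel__' is just an inert string outside of Chandler, and the package __init__ uses the explicit relative form ('from .shares import *') that Python 3 requires.

```python
import importlib
import os
import sys
import tempfile

# "osaf_split" stands in for "osaf", and "sharing" for an API that has
# outgrown a single module. All module and class names are hypothetical.
root = tempfile.mkdtemp()
api_dir = os.path.join(root, "osaf_split", "sharing")
os.makedirs(api_dir)

files = {
    os.path.join(root, "osaf_split", "__init__.py"): "",
    # Each thematic module declares its own API and its parcel.
    # __parcel__ is Chandler-specific; here it is only a plain string.
    os.path.join(api_dir, "shares.py"): (
        "__parcel__ = 'osaf_split.sharing'\n"
        "__all__ = ['Share']\n"
        "class Share(object):\n"
        "    pass\n"
        "class ShareHelper(object):\n"
        "    pass\n"
    ),
    os.path.join(api_dir, "conduits.py"): (
        "__parcel__ = 'osaf_split.sharing'\n"
        "__all__ = ['Conduit']\n"
        "class Conduit(object):\n"
        "    pass\n"
    ),
    # The package __init__ reassembles the API from the modules' __all__
    # lists, so code written against the old single module keeps working.
    os.path.join(api_dir, "__init__.py"): (
        "from .shares import *\n"
        "from .conduits import *\n"
    ),
}
for path, source in files.items():
    with open(path, "w") as f:
        f.write(source)

sys.path.insert(0, root)
importlib.invalidate_caches()

from osaf_split import sharing

# API classes are reachable from the package; the helper is not.
print(hasattr(sharing, "Share"), hasattr(sharing, "Conduit"))
print(hasattr(sharing, "ShareHelper"))
```

Neither new module is called "sharing", so nobody has to wonder whether "sharing" means the package or a module inside it.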

You can leave your package's tests under osaf.tests if you like, or move them directly to your package directory. Since by convention we name all test modules beginning with "Test", it is not necessary to create a 'tests' subpackage: the Test* modules are visibly different and alphabetically grouped, and will eventually be the only modules using UpperAndLower case names. This does not mean that we are flattening existing 'tests' subpackages; that doesn't have enough API impact to be worth doing. But it's also not really worth creating new 'tests' packages.
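To show why the naming convention alone is enough to separate tests from API code, the sketch below uses the standard library's unittest discovery with a "Test*.py" pattern. This is an assumption for illustration only: Chandler's own test runner may collect tests differently, and the file names "contacts_util.py" and "TestContactsUtil.py" are hypothetical.

```python
import os
import tempfile
import unittest

# A hypothetical package directory holding one API module and one test
# module side by side; both file names are invented for illustration.
pkg_dir = tempfile.mkdtemp()

with open(os.path.join(pkg_dir, "contacts_util.py"), "w") as f:
    f.write("def full_name(first, last):\n"
            "    return first + ' ' + last\n")

with open(os.path.join(pkg_dir, "TestContactsUtil.py"), "w") as f:
    f.write("import unittest\n"
            "from contacts_util import full_name\n"
            "\n"
            "class FullNameTest(unittest.TestCase):\n"
            "    def test_full_name(self):\n"
            "        self.assertEqual(full_name('Ada', 'Lovelace'),\n"
            "                         'Ada Lovelace')\n")

# Only files matching Test*.py are collected, so no 'tests' subpackage
# is needed to tell test modules apart from API modules.
suite = unittest.TestLoader().discover(pkg_dir, pattern="Test*.py")
result = unittest.TextTestRunner(verbosity=0).run(suite)
print(result.testsRun, result.wasSuccessful())
```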


0.6 Plans
---------

The 'pim' package is already in transition towards meeting these guidelines, although there are still many flattening operations pending, such as moving all of the calendar modules up one level, and eliminating the subparcels for contacts, tasks, mail, etc. These won't take place until after the sets branch is reintegrated and stable.

The plan for the current osaf.pim.mail module is that most of it should be moved to osaf.mail.__init__, except for the five classes needed to support the core pim contentmodel: AccountBase, EmailAddress, MIMEBase, MIMEContainer, and MailMessageMixin. All of the other classes and APIs make more sense associated with the 'mail' API, as an API for the mail service and mail client operations in general.

For 0.6, the APIs we end up with should be at least pim, mail, sharing, startup, and webserver. Also, as part of the parcel.xml elimination process, the osaf.examples.* packages are going to become top-level packages under the 'parcels' directory, the way that 'feeds' and 'photos' already have.

_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

Open Source Applications Foundation "Dev" mailing list
http://lists.osafoundation.org/mailman/listinfo/dev
