Background/Scope
----------------
Last week, Katie, Morgen, Ted and I got together to review the current
packaging of Chandler and the progress of the flattening project. We also
talked about creating some patterns for Chandler's API organization. This
post is my attempt at summarizing our discussion and hopefully consensus on
how Chandler APIs should be organized, within the following overall scope:
1. Get API packaging/grouping settled in 0.6. Given that this will likely
involve more flattening and flinging directories around, getting the
overall groupings settled is all we can reasonably expect to do in 0.6.
2. Examine individual APIs more closely in 0.7. This will be a good time
(or at least a better time than now) to revisit Chandler's content model,
the implementation of stamping, and inheritance vs. composition questions
in general.
Goals/Issues
------------
Given the scope, the goals of the API guidelines are to:
Make Things Easy to Find and Use
One of the 0.6 API goals is that someone should be able to create a
parcel like the Amazon or Flickr parcels without needing someone from OSAF
to sit over their shoulder. Currently, however, you have to either have
fairly intimate knowledge of the codebase or else copy things into your
parcel that you don't understand and can't fix. This is partly because we
don't...
Distinguish Internal Components From APIs
Right now, there are no organizational cues that distinguish internal
components from APIs in the codebase, and in many cases there are no cues
to even tell the difference between a module and a class within that
module, which can produce confusing errors for people less familiar with
the codebase. This isn't just a "newbie" problem, either; providing
organizational cues is more like making a handle ridged so you can get a
better grip on it: it makes things easier for everybody, all the time. And
speaking of making things easier, we also want to...
Provide Easy API Access in Scripting/Testing Tools
Our API strategy needs to consider tools like "headless", CPIAScript,
and any future embedded-in-Chandler developing or scripting tools. They
need a way to access APIs that's consistent with the way they're used in
Python. If the way we're accessing APIs in Python is "too hard" to be used
in these other tools, then the way we're accessing APIs is too hard,
period, and we need to make it better, rather than making "headless" or
CPIAScript into isolated, training-wheels-only ghettoes. Similarly, we
want to...
Follow Python Community Lessons-Learned
...so that we don't create an isolated island of Chandler-only
rules. We're not in a position to dictate standards, and in any case the
common Python practices have solid reasoning behind them. Minor deviations
(like allowing "_" in module names) aren't a big deal, but the overall
look-and-feel of the APIs shouldn't produce any big surprises.
Proposed Guidelines
-------------------
Don't Stop Application Work to Refactor
Obviously, application feature work takes priority. It's also best to
keep reorganization a separate task from making actual changes or
additions, because then it becomes hard to tell what's a problem with a
move, versus what's a problem with a change.
APIs are short, all-lowercase, top-level names under "osaf"
Each API will be accessible via "from osaf import apiname". For
example, "from osaf import pim, mail, startup" would import the pim, mail,
and startup APIs. You can then use expressions like "pim.Contact" to refer
to a particular class or function.
Notice that this means that APIs that are packages *must* import any
API classes, functions, etc. from the package's internal modules, to make
them available for direct access from the API namespace. Users *must not*
need to import modules from inside an API package, and the only time
third-level imports may be used is from within the API package itself. For
example, "from osaf.pim.items import ContentItem" should not appear in any
code outside the osaf.pim package.
Instead, "from osaf import pim" and "pim.ContentItem" should be used
wherever possible, so that we have a consistent message as to what a thing
is called, and the same name can be used in all of our tools. For example,
the Python interpreter for the "headless" tool can import the APIs so that
it suffices to refer to 'pim.Contact' to use the Contact type. The same
applies to CPIAScript, etc., so that we have only *one* name by which a
given thing is known, across all tools.
(Since a given API symbol is always used with its API name, we may also
want to rename some APIs for a better fit; for example, if there were a
MailMessage type exported from the mail API, it might make sense to rename
it to mail.Message.)
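
As a rough sketch of the re-export rule above: the snippet below builds a
throwaway package on disk and shows a class defined in an internal module
being reached through the API namespace only. The name "osaf_demo" is a
hypothetical stand-in for "osaf", and the one-class "items" module is an
assumption made for illustration, not the real osaf.pim layout. It also
uses an absolute import in __init__.py so it runs on current Python.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Hypothetical stand-in tree: "osaf_demo" plays the role of "osaf".
root = Path(tempfile.mkdtemp())
pkg = root / "osaf_demo" / "pim"
pkg.mkdir(parents=True)
(root / "osaf_demo" / "__init__.py").write_text("")

# Internal module: where the class actually lives.
(pkg / "items.py").write_text(
    "__all__ = ['ContentItem']\n"
    "class ContentItem(object):\n"
    "    pass\n"
)

# The package __init__ pulls the internal module's API into the API namespace,
# so callers never need a third-level import.
(pkg / "__init__.py").write_text("from osaf_demo.pim.items import *\n")

sys.path.insert(0, str(root))
importlib.invalidate_caches()

# What client code is expected to write:
from osaf_demo import pim

item = pim.ContentItem()
print(type(item).__module__)  # osaf_demo.pim.items
```

The class still lives in the internal module, but client code only ever
spells it "pim.ContentItem".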
In the top level 'osaf.__init__' module, we will have an '__all__' list
containing the names of API packages that conform to these API guidelines,
so that tools like "headless" can use "from osaf import *" to obtain all of
the APIs. (If an API package does not yet conform to these guidelines, it
must not be added to this list.)
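
To illustrate how the top-level '__all__' list would gate "import *", here
is a small sketch. The package name "osaf_demo_all" and its three modules
are hypothetical; "legacy" stands for an API that does not yet conform and
is therefore left out of the list.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Hypothetical top-level package with two conforming APIs and one that is not.
root = Path(tempfile.mkdtemp())
top = root / "osaf_demo_all"
top.mkdir()
(top / "__init__.py").write_text("__all__ = ['pim', 'mail']\n")
for name in ("pim", "mail", "legacy"):
    (top / (name + ".py")).write_text("NAME = %r\n" % name)

sys.path.insert(0, str(root))
importlib.invalidate_caches()

# Roughly what a tool like "headless" would do at startup.
namespace = {}
exec("from osaf_demo_all import *", namespace)

exported = sorted(k for k in namespace if not k.startswith("__"))
print(exported)  # ['mail', 'pim']
```

Only the names listed in '__all__' reach the tool's namespace; "legacy"
remains importable explicitly but is not picked up by "import *".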
APIs Are Also Defined Using '__all__'
API modules should define an '__all__' list at the top of the module,
listing the names of API classes, functions, constants, etc. in that
module. (See application.schema and osaf.startup for examples.) The pydoc
tool and "import *" both use this list to control their operation, and it
gives people reading the code a quick list of what is actually part of the
API. It frees you from having to use '_private' names for classes or
functions you don't want to include in the API, or which are not yet ready
for inclusion in the API. '__all__' is also useful for assembling a
package API from individual modules' API; more on this below.
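
A minimal sketch of an API module with '__all__' at the top follows. The
module name "demo_api" and its contents are made up for illustration; the
module is built in memory only so that the example is self-contained.

```python
import sys
import types

# Source of a hypothetical API module. __all__ comes first, so a reader
# sees the API before any implementation.
source = '''
__all__ = ["Contact", "find_contact"]

class Contact(object):
    def __init__(self, name):
        self.name = name

def find_contact(contacts, name):
    return [c for c in contacts if c.name == name]

def normalize(name):
    # Helper that is not part of the API; no leading underscore needed.
    return name.strip().lower()
'''

# Register the module so that a normal import statement can find it.
module = types.ModuleType("demo_api")
exec(source, module.__dict__)
sys.modules["demo_api"] = module

namespace = {}
exec("from demo_api import *", namespace)

public = sorted(k for k in namespace if not k.startswith("__"))
print(public)  # ['Contact', 'find_contact']
```

The helper "normalize" stays reachable as "demo_api.normalize" for code
inside the module's own package, but it is not exported by "import *".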
Modules are Better Than Packages
Don't create a package for an API unless there's too much code. What
"too much code" is can vary, but anything over 2500 lines is probably too
much, and anything under 1000 probably doesn't make sense to split. If in
doubt, err on the side of keeping it a single module, because someone
browsing the code can skim a single module more easily than they can dive
into a subdirectory and poke around in more modules. (Especially if you
put an __all__ list at the top of the module so they know what the API is.)
Keep in mind also that packages are not for categorization. It's
easier for people to scan a flat list of mixed APIs like "app, pim, mail,
startup, webserver, sharing, etc." than to delve into a many-leveled tree
that perfectly categorizes the code.
If you're creating an API module under osaf, you can put its tests
into osaf.tests, and you can put any doctest files adjacent to the
module. (E.g. 'startups.txt' is adjacent to the 'osaf.startup'
module.) This is a great opportunity to include programmer overview
documentation and testable examples, right where they can be found when
browsing the source tree.
Use Packages When a Module Gets Too Big
But *only* if it's really too big. It's easier to expand a module
into a package than to go the other way around. When you split a module
into multiple modules, you should make sure each module defines the right
'__all__' list for its contents, and add a '__parcel__="osaf.whatever"'
setting, to
ensure that the entire package goes in the same parcel. The __init__.py
for your new package should then use 'from foo import *' for each module
within the package, so that existing code still works. You should *not*
give any of the split modules the same name as the old module; this is
confusing. Instead, break up the original module along thematic lines and
name the two or more modules for those themes within the overall API scope.
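
The split described above might look something like the following sketch.
Everything named here is hypothetical: "whatever_demo" stands in for an API
that used to be a single module, and "accounts" and "messages" are the
thematic modules it was split into. The '__parcel__' line is included only
to show where it would go; plain Python ignores it. The __init__.py uses
absolute imports so that the sketch runs on current Python.

```python
import importlib
import sys
import tempfile
from pathlib import Path

root = Path(tempfile.mkdtemp())
pkg = root / "whatever_demo"
pkg.mkdir()

# Each split module declares its own API and the parcel it belongs to.
(pkg / "accounts.py").write_text(
    "__all__ = ['Account']\n"
    "__parcel__ = 'whatever_demo'\n"
    "class Account(object):\n"
    "    pass\n"
)
(pkg / "messages.py").write_text(
    "__all__ = ['Message']\n"
    "__parcel__ = 'whatever_demo'\n"
    "class Message(object):\n"
    "    pass\n"
)

# The package __init__ reassembles the old module's API from the pieces.
(pkg / "__init__.py").write_text(
    "from whatever_demo.accounts import *\n"
    "from whatever_demo.messages import *\n"
)

sys.path.insert(0, str(root))
importlib.invalidate_caches()

# Code written against the old single module keeps working unchanged.
import whatever_demo

print(whatever_demo.Account.__name__, whatever_demo.Message.__name__)
```

Because each module's '__all__' lists only its API names, the package
namespace picks up exactly those names and nothing else from the split
modules.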
You can leave your package's tests under osaf.tests if you like, or
move them directly to your package directory. Since by convention we name
all test modules beginning with "Test", it is not necessary to create a
'tests' subpackage. (Because the Test* modules are visibly different and
alphabetically grouped, as well as eventually being the only modules that
should be using UpperAndLower case names.) This does not mean that we are
flattening 'tests' subpackages; this doesn't have enough API impact to be
worth doing. But it's also not really worth creating new 'tests' packages
either.
0.6 Plans
---------
The 'pim' package is already in transition towards meeting these
guidelines, although there are still many flattening operations pending,
such as moving all of the calendar modules up one level, and eliminating
the subparcels for contacts, tasks, mail, etc. These won't take place
until after the sets branch is reintegrated and stable.
The plan for the current osaf.pim.mail module is that most of it should be
moved to osaf.mail.__init__, except for the five classes needed to support
the core pim contentmodel: AccountBase, EmailAddress, MIMEBase,
MIMEContainer, and MailMessageMixin. All of the other classes and APIs
make more sense associated with the 'mail' API, as an API for the mail
service and mail client operations in general.
For 0.6, the APIs we end up with should be at least pim, mail, sharing,
startup, and webserver. Also, as part of the parcel.xml elimination
process, the osaf.examples.* packages are going to become top-level
packages under the 'parcels' directory, the way that 'feeds' and 'photos'
already have.
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
Open Source Applications Foundation "Dev" mailing list
http://lists.osafoundation.org/mailman/listinfo/dev