Richard Wackerbarth writes:

> As an example, suppose that I want to have an "intelligent" ToDo
> indicator. As a minimum, I need to be able to get from the data
> store a list of MLs that have pending requests AND for which I am
> authorized to do that work. Typically, this would be some kind of
> join.

OK. But in my head, Python is a dynamic language, and we should be
able to use the ORM to dynamically revise the DB schema, and access
such complexly specified data.

> I don't pretend to know just what our users will want to add. But
> they should be allowed to write an SQL-type description of their
> needs and they shouldn't "muck" with the inner workings of the
> message handling schema to do so.

So by "SQL-type" do you mean they *must* have access to the RDBMS so
they can actually write SQL, or just that the provided interface needs
to allow queries with logical operators? The "-type" suggests you
mean the latter. But you've also suggested the former.

I don't like the idea of direct access to the underlying database,
because there isn't necessarily going to be just one, and it may be
that Mailman needs certain kinds of access to component DBs (eg,
updating email addresses) but the organization would like to have
access controls on them based on another component database
(authorized admins, say). Also, we're not in a position to require
that all databases be kept in, say, PostgreSQL. They may not even be
RDBMSes (LDAP member databases, sendmail alias files). So we need a
layer of abstraction.

> > The point is that the message distribution agent is
> > mission-critical; if it goes down you are well-and-truly screwed.
> > If the web UI goes down, it might not even be noticed for weeks.
>
> I don't buy that. If you advertise a subscribe URL, or any other
> function, that is just as much a "mission critical" component as
> any other.

We'll have to agree to disagree.

> I don't see user passwords providing much direct use in the mail
> distribution system.

I don't understand what you're thinking. You started this thread with
the observation that various components are keeping data in different
places, and that this data is often redundant but not synced, or
inaccessible.
To me this suggests a design principle: a single conceptual database
managed by a core component (i.e., one that is present in every
Mailman 3 system). The implementation of that database may very well
include multiple database systems (eg, the organization's LDAP
directory, a PostgreSQL database for the tables related to list
configurations, and an MTA alias file for the list addresses).
However, these need to be managed via a single common API, and the
data must not be private to any non-core component.

The fact that some data are not useful to all components seems to me
to be a red herring. The point of a DBMS in general is that you can
flexibly access only the data you need for the job at hand, in a form
optimized for the job at hand.

> > So what? This extension needs to be done *somewhere*; you aren't
> > going to be able to avoid it by throwing it out of the core.
>
> No, but I will "compartmentalize" it.

You mean "as a single entity in the distribution of core components",
or "as per-component entities containing what each component needs"?

> No, I am suggesting that either you implement the functionality by
> specifying that some particular structure be set in a standard
> database (and provide a reference implementation of doing so) or

I think that's a non-starter. We are not in a position to specify
that there even *be* a standard database backing our API, unless we're
willing to push the redundancy/inaccessibility problems to the next
higher level by copying databases from organizational sources
*outside* of Mailman *into* Mailman-only databases. I consider that
unacceptable; use of external databases for subscriber lists is a
high-frequency RFE, and it would be *way* higher if it weren't for the
extremely high quality of MM-U participants, most of whom check the
FAQ/tracker and notice that there already is an RFE on file. AIUI,
Barry does too.

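To make the "single conceptual database behind one common API" idea
from earlier in this message concrete, here is a minimal Python
sketch. Every name in it (CoreDatabase, DictBackend, the "member" and
"list_config" kinds) is invented for illustration and is not Mailman
code; the dict-based backend merely stands in for an LDAP directory,
an SQL table, or an alias file.

```python
# Hypothetical sketch: one core-owned API in front of several stores.
# None of these classes exist in Mailman; all names are made up.


class DictBackend:
    """Stand-in for a real store (LDAP directory, SQL table, alias file)."""

    def __init__(self, rows):
        self._rows = dict(rows)

    def get(self, key):
        return self._rows[key]

    def put(self, key, value):
        self._rows[key] = value


class CoreDatabase:
    """Single conceptual database: every component goes through this."""

    def __init__(self):
        self._backends = {}

    def register(self, kind, backend):
        # 'kind' names a category of data, e.g. "member" or "list_config".
        self._backends[kind] = backend

    def get(self, kind, key):
        return self._backends[kind].get(key)

    def put(self, kind, key, value):
        self._backends[kind].put(key, value)


db = CoreDatabase()
db.register("member", DictBackend({"alice@example.com": {"full_name": "Alice"}}))
db.register("list_config", DictBackend({"dev@example.com": {"moderated": True}}))

# Mail distribution and the web UI would make the same calls; neither
# knows which storage system actually answers.
print(db.get("member", "alice@example.com")["full_name"])
print(db.get("list_config", "dev@example.com")["moderated"])
```

The only point of the sketch is that no component holds a private
store: swapping the object registered under "member" for one that
talks to an external directory changes nothing for the callers.
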
> that you specify REST interfaces that implement the appropriate
> functions and REQUIRE that all components manipulate that data ONLY
> through those interfaces.
>
> The REST interface is not a single entity, but a collection of
> components that inter-operate.

This makes no sense to me. I see the architecture as

    +--------------+           +-------+
    |   Message    |           |       |
    | Distribution | . . . . . | WebUI |
    +--------------+           +-------+
            \        |         /
             \       |        /
              \      |       /
         +-----------------------+
         |       REST API        |
         +-----------------------+
              /      |       \
             /       |        \
            /        |         \
    +------------+           +------------+
    | Subscriber | . . . . . |   Social   |
    |    List    |           | Networking |
    |            |           |    Data    |
    +------------+           +------------+

where the "MD" component may perceive a member in terms of only
subscriber data (i.e., something on the order of (FullName, Email,
BounceCount)), while the "WUI" component might be interested in
something like (Avatar, FullName, Email, IsATroll). (Of course the
lower ellipsis also includes a site config DB and a list config DB.)

To my mind a Pythonic base REST API would return MemberObjects with
appropriate properties, and the properties would be turned into DB
queries on access. For performance-critical cases there would be a
separate .query() method on MemberObjects that would look up a vector
of attributes in one DB query. There would also be a .select() method
on the MailmanDB object which would return a list of MemberObjects
with specified properties, optionally as a (MemberObject,
*values_of_requested_properties) tuple or dict.

> Further, "each non-core module will do it differently and
> incompatibly" is a red herring. There MUST be a SPECIFICATION of
> the interface and EVERY implementation MUST meet those
> REQUIREMENTS. What ever else it does will not affect any other part
> of the system.

Have you ever told a baby to stop sucking their thumb, and use the
pacifier? You have to pull the thumb out to get the point across.
In the same way, there's going to have to be one implementation, and
that implementation will be distributed with the core. Otherwise
there WILL be a SPECIFICATION of the interface and EVERY
implementation WILL meet those REQUIREMENTS (except where the
implementer finds it inconvenient), and we're back where we started.

> That is because you have not followed the principles and allow
> "someone else" to provide that service.

True. (I wish you'd stop using "you" in this kind of statement; it
isn't true, I didn't code any of this. And it doesn't matter who
did.)

Announcing principles isn't going to help enough, though. Python
operates on the basis of "consenting adults" and can't force anybody
to write a program in a particular way. Unless the API is actually
provided in *every* Mailman 3 distribution, and is well-enough
designed to be TOOWTDI, implementers will work around it.

> >> I view your argument as the message handler claiming "I'm special!
> >
> > It is. First, it is mission-critical; nothing else is.
>
> And the underlying RDBMS, the MTA, etc. are not?

This confounds levels of architecture.

> This is my objection. IF some particular data is exposed, then it
> should be maintained by one handler, without back doors. If that
> handler is local, then the interface need not serialize the data
> and transmit it, but the access should be isomorphic to doing so.

That's not an objection, that's a somewhat more precise restatement of
what I wrote.

> Credentials should be kept in a separate box. And that box should
> be kept where ever it best fits in the overall data flow.

Precisely. Since databases will be needed by all components when
present, they should be kept in or with a component that will always
be present. That's what "core" means.

> From a design perspective, it should be easy to place it anywhere
> the installer desires.

No. That exposes an implementation detail. As far as installers are
concerned, the database *is* the API.
Where it is located is none of their business.

There will need to be a little leakage here, because admins will want
to link the Mailman DB to existing organizational DBs. So the
possibility of specifying an existing external database needs to be
considered. But this is only slightly more than the amount of
information required to configure Mailman's own PostgreSQL or MySQL
database, and these are not going to be "placed" by a Mailman admin,
but rather configured and accessed from a provided installation
(whether by the user organization, or by an OS distro). So I don't
see a need to make a big distinction here, except that the "own"
database will have a schema designed for Mailman, but an external
database will need some kind of "adapter" to match schema.

> For distribution, a reference implementation of EVERY interface
> should be included.

I don't see how that's possible in your design, since you propose to
allow components to implement their own databases.

> And substituting a different implementation should be a simple as
> excluding the distribution version and dropping in the alternate.

Sure, but is there a reason why this might be difficult? ISTM that
Python's orientation to duck-typing will make this happen naturally.
(I don't mean to ignore the possibility of problems, but if you have
something specific in mind we can be careful to avoid that in the
design process.)
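
For what it's worth, the MemberObject idea from earlier in this
message (properties turned into lookups on access, a .query() method
for several attributes at once, and a .select() method on the database
object) might be sketched roughly as follows. This is an illustration
under assumptions, not existing Mailman code: the InMemoryBackend
class, the fetch()/keys() protocol, and the attribute names are all
hypothetical. Because MemberObject relies only on the backend having
fetch() and keys(), an alternate backend can be dropped in by duck
typing.

```python
# Hypothetical sketch of the MemberObject / MailmanDB idea; not Mailman code.


class InMemoryBackend:
    """Any object with fetch() and keys() will do (duck typing)."""

    def __init__(self, rows):
        self._rows = rows
        self.calls = 0  # counts backend round trips, for illustration only

    def fetch(self, email, names):
        self.calls += 1
        row = self._rows[email]
        return {name: row[name] for name in names}

    def keys(self):
        return list(self._rows)


class MemberObject:
    def __init__(self, backend, email):
        self._backend = backend
        self.email = email

    def __getattr__(self, name):
        # Invoked only when normal lookup fails: one backend call per access.
        if name.startswith("_"):
            raise AttributeError(name)
        try:
            return self._backend.fetch(self.email, [name])[name]
        except KeyError:
            raise AttributeError(name) from None

    def query(self, *names):
        # Several attributes in a single backend round trip.
        return self._backend.fetch(self.email, list(names))


class MailmanDB:
    def __init__(self, backend):
        self._backend = backend

    def select(self, *names):
        # One (MemberObject, *values) tuple per member known to the backend.
        result = []
        for email in self._backend.keys():
            member = MemberObject(self._backend, email)
            values = member.query(*names)
            result.append((member, *(values[n] for n in names)))
        return result


backend = InMemoryBackend({
    "alice@example.com": {"full_name": "Alice", "bounce_count": 0},
    "bob@example.com": {"full_name": "Bob", "bounce_count": 3},
})
alice = MemberObject(backend, "alice@example.com")
print(alice.full_name)                           # lookup on attribute access
print(alice.query("full_name", "bounce_count"))  # both in one call
for member, bounces in MailmanDB(backend).select("bounce_count"):
    print(member.email, bounces)
```

An "adapter" for an external database would, in this picture, just be
another class offering the same fetch() and keys() methods while
translating to the external schema internally.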