On Wed, Jun 22, 2022 at 04:13:40AM +0000, Hamid Maadani wrote:

> > This sort of "concat" operation is a bad idea, because it is prone to 
> > collisions...
>  
> Those were just examples to discuss a point. You can find similar
> types of concatenations in multiple guides written for setting up
> postfix with a mysql backend. For example refer to
> 'virtual_alias_domains.cf' mentioned in arch linux's wiki page:
>
> https://wiki.archlinux.org/title/Virtual_user_mail_system_with_Postfix,_Dovecot_and_Roundcube

There are lots of Wikis giving dubious advice.  Yes, in some corner
cases one might actually want to compute some result elements as
concatenations of multiple input fields, and perhaps this can be
supported, but it should not be encouraged, and the simple cases where
this is not used should be easy and natural to express.

> I was just trying to understand if these type operations (concat,
> etc.) need to be supported in the projection. Am I correct in
> understanding they are not?

They're not essential, but can be added as expert features.  Let's get
the basics right first, and talk about the expert features second.

> If the result_attribute + result_format design is the best practice,
> I'm all for that.  need to go look at the result_format and understand
> how to use it with mongo..

It is the "basics right" approach, which avoids advanced MongoDB
syntax.

> >> which would return:
> >> maadani,ha...@dexo.tech/,dukhovni,vik...@postfix.org/
> > 
> > which makes no sense.
>  
> This is honestly confusing to me. This was meant to show we are
> printing multiple multi-valued results as one comma separated string.

These *particular* results make no sense because you're mixing last
names with directory paths.  The list elements are from different
semantic domains.

> When you say this makes no sense, are you referring to this result not
> being useful to postfix because of multiple mail-paths in it? or the
> comma separated string part!?

Neither; it is the disparate semantics of the elements.  Had the
elements all come from the same semantic domain, and not been compounded
from multiple input columns, they would typically all have the same
post-processing requirements, which could likely be handled with just
"result_format".

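As a rough illustration of that kind of uniform post-processing, here is
a minimal Python sketch.  It is not Postfix code: it assumes a simplified
template that handles only %s (whole value), %u (local part), %d (domain
part) and %% (literal percent), and the function names and addresses are
made up for the example.

```python
def expand_result_format(fmt, value):
    """Expand a simplified result_format-style template for one value.

    Returns None when the value should be skipped, for example when %d
    is requested but the value has no domain part.
    """
    local, sep, domain = value.rpartition("@")
    if not sep:
        # Unqualified value: treat all of it as the local part.
        local, domain = value, ""
    out = []
    i = 0
    while i < len(fmt):
        ch = fmt[i]
        if ch == "%" and i + 1 < len(fmt):
            code = fmt[i + 1]
            i += 2
            if code == "%":
                out.append("%")
            elif code == "s":
                out.append(value)
            elif code == "u":
                if not local:
                    return None
                out.append(local)
            elif code == "d":
                if not domain:
                    return None
                out.append(domain)
            else:
                # Unknown expansion: leave it untouched in this sketch.
                out.append("%" + code)
        else:
            out.append(ch)
            i += 1
    return "".join(out)


def format_results(fmt, values):
    """Apply one template to every value and join with commas."""
    expanded = (expand_result_format(fmt, v) for v in values)
    return ",".join(e for e in expanded if e is not None)


# Hypothetical values, all from one semantic domain (mailbox addresses).
paths = format_results("%d/%u/", ["alice@example.org", "bob@example.net"])
```

Because every element is an address, one template suffices for all of
them; no per-column special cases are needed.
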
> > You do have to decide how mailing lists are modeled in MongoDB. Are
> > they one row per member? Is it a list of "_id" values? Or a list of
> > email addresses? If the former, how does list expansion work? Can
> > MongoDB do joins as well as projections? ...
> 
> I imagine each list as a JSON object with an array of addresses inside
> of it.  Something like:
>
>     { "createdAt": ISODate("<some date>"), "active": 1,
>       "addresses": [ "ha...@dexo.tech", "vik...@postfix.org" ] }
>  
> Would that work?

Only if you provide code to handle list-valued result columns, and if
such denormalised schemas are best-practice for MongoDB.  A more typical
database practice is to have a "member" table, which makes it easy to
insert users into lists without modifying the list itself, to delete
a user from all the lists that user is a member of, ...

Member tables work best when the database supports some form of "join"
operation, though of course they could be as simple as:

    { "list": "somel...@example.net", "member": "la...@example.net" }
    { "list": "somel...@example.net", "member": "cu...@example.net" }
    { "list": "somel...@example.net", "member": "m...@example.net" }

with both the list name and the member primary address stored by value,
rather than by reference.
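
To sketch why the one-row-per-member layout is convenient, here is a
small in-memory Python model.  It is only an illustration under stated
assumptions: plain lists of dicts stand in for a collection, and the
addresses and function names are hypothetical.

```python
# One document per (list, member) pair, both stored by value.
members = [
    {"list": "staff@example.net", "member": "alice@example.net"},
    {"list": "staff@example.net", "member": "bob@example.net"},
    {"list": "ops@example.net", "member": "alice@example.net"},
]


def expand_list(rows, list_addr):
    """Return the member addresses of one list, in stored order."""
    return [r["member"] for r in rows if r["list"] == list_addr]


def add_member(rows, list_addr, member):
    """Adding a member is a plain insert; no list document is modified."""
    rows.append({"list": list_addr, "member": member})


def remove_everywhere(rows, member):
    """Drop a user from every list by filtering on the member field."""
    return [r for r in rows if r["member"] != member]


add_member(members, "ops@example.net", "carol@example.net")
staff = expand_list(members, "staff@example.net")
without_alice = remove_everywhere(members, "alice@example.net")
```

Expanding a list is a single equality match on "list", and removing a
user from all lists is a single equality match on "member"; neither needs
to rewrite an embedded array.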

Is there some prior art in this space?  Has anyone used MongoDB for
managing email users and lists with some other MTA?

> MongoDB supports joins, but through "aggregation pipelines":
> https://www.mongodb.com/docs/manual/aggregation
> 
> here, we are using 'mongoc_collection_find_with_opts' which runs a
> 'find' operation. If support for joins are necessary, we should switch
> to 'mongoc_database_aggregate' and require 'filter' to be in the
> pipeline format:
> http://mongoc.org/libmongoc/current/mongoc_database_aggregate.html

Well, this is an important design choice.  What sort of schemas are
best-practice in this space?  Are joins needed to enable some data
"normalisation"? ...

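To make the question concrete, here is a hedged Python sketch of what a
normalised, by-reference layout would ask of the lookup code.  The
"pipeline" value only shows the general shape of a MongoDB aggregation
with "$match" and "$lookup" stages; the collection and field names
("users", "member_id", "address") are assumptions, and the join itself is
emulated in memory rather than run against a server.

```python
# Shape of an aggregation pipeline that would do the join server-side.
pipeline = [
    {"$match": {"list": "staff@example.net"}},
    {"$lookup": {
        "from": "users",
        "localField": "member_id",
        "foreignField": "_id",
        "as": "user",
    }},
]

# Hypothetical data: members refer to users by "_id", not by address.
users = [
    {"_id": 1, "address": "alice@example.net"},
    {"_id": 2, "address": "bob@example.net"},
]
member_refs = [
    {"list": "staff@example.net", "member_id": 1},
    {"list": "staff@example.net", "member_id": 2},
    {"list": "ops@example.net", "member_id": 1},
]


def lookup(rows, foreign, local_field, foreign_field, as_field):
    """In-memory stand-in for an equality-match join."""
    joined = []
    for row in rows:
        matches = [f for f in foreign
                   if f.get(foreign_field) == row.get(local_field)]
        joined.append({**row, as_field: matches})
    return joined


def expand_by_reference(list_addr):
    """Match the list's rows, join to users, collect the addresses."""
    matched = [m for m in member_refs if m["list"] == list_addr]
    joined = lookup(matched, users, "member_id", "_id", "user")
    return [u["address"] for row in joined for u in row["user"]]


staff = expand_by_reference("staff@example.net")
```

With by-value member rows no such join is needed, which is why the choice
of schema decides whether a plain 'find' is enough.
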
> One more question, what's the policy regarding multiple databases? the
> way that the module is now, it supports multiple collections (tables)
> in only one database. Should I put any effort in supporting multiple?
> For example, if mailboxes are in cluster 1 and mail lists in cluster 2
> (separate URIs basically)?

There should be no assumption that all tables use the same database.
Each table designates its source database.  The thing that need not
be supported (and is likely impossible to express or difficult to
implement) is joins or other operations that span multiple databases.

-- 
    Viktor.
