time for cyrus-imap v3.2?

2019-11-04 Thread Ricardo Signes
So, I think the plan was to cut a stable Cyrus 3.2 after we had stable JMAP. Is 
that time now? We talked about this on the Zoom call today.

Cyrus master has been pretty stable for JMAP core and mail. I think we need to do 
one more pass through to look for places where Cyrus extensions might leak 
through without the correct `using` options, but apart from that, I don't think 
we expect its mail API to change apart from bugfixes.

The other part of the conversation was declaring pre-3 releases EOL except for 
security fixes.

I don't have much of a horse in this race, but it felt like a bit of a looming 
question.

--
Ricardo Signes (rjbs)


yearly release cycle

2019-12-13 Thread Ricardo Signes
Hey, remember last month when I asked about releasing Cyrus v3.2?

That thread had some more conversation about what needs to get done before 
v3.2, and I wanted to come back to it and turn some things on their head.

Right now, we’re talking about Cyrus releases being feature-bound. “We’ll 
release v3.2 when feature X is done.” I think we’re not being well-served by 
that. As feature X is delayed (for various reasons that we can’t easily 
eliminate), it doesn’t just delay the feature, but also all the other minor 
bugfixes and optimizations that we’ve made in the master branch. Also, it sets 
up the idea that we delay releases for the sake of fixes, instead of releasing 
the fixes that are ready.

That is: every additional criterion for a new release is another doorway to 
delay. Instead of opening those doors, I would rather try to eliminate all of 
them.

I propose that instead of tying releases to milestones, we tie them to the 
calendar. For the sake of full disclosure: I am modeling this suggestion on the 
release cycle of perl, which I ran for 
several years. I found the process more than satisfactory, then.

 1. A new *unstable release* of Cyrus is made every month. We promise only that 
it compiled and passed the Cassandane test suite on the release manager’s 
computer. It might contain regressions from previous unstable releases, it 
might have crashers or corruptors. We try to avoid any of these, but the goal 
here is a snapshot for easy month-to-month testing. These are the 
odd-middle-digit releases. (3.3.x)

 2. A new *major release* of Cyrus is made every year. We will have tested it 
on as many configurations as we can readily test. We will have, some time 
before the release, frozen the branch for risky changes, to reduce churn. In 
the meantime, new work lives in feature branches. (The changelogs from each 
unstable release provide a good basis for the whole-year changelog!) These are 
the even-middle-digit third-digit-zero releases. (3.4.0)

 3. A new *maintenance release* of Cyrus is made for the last two stable 
releases when there are enough fixes to critical bugs to warrant it. These are 
the even-middle-digit third-digit-nonzero releases (3.4.1)
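
The numbering scheme in the three points above can be written down as a small classifier. This is only a sketch in Python: the function name `classify_release` and the returned labels are illustrative assumptions, not part of any Cyrus tooling, and it assumes plain three-part `x.y.z` version strings.

```python
def classify_release(version: str) -> str:
    """Classify an x.y.z version string under the proposed numbering scheme."""
    _series, middle, third = (int(part) for part in version.split("."))
    if middle % 2 == 1:
        return "unstable"      # odd middle digit, e.g. 3.3.x
    if third == 0:
        return "major"         # even middle digit, third digit zero, e.g. 3.4.0
    return "maintenance"       # even middle digit, third digit nonzero, e.g. 3.4.1


for v in ("3.3.2", "3.4.0", "3.4.1"):
    print(v, classify_release(v))
```

A release script could use a check like this to refuse, say, tagging a monthly snapshot with an even middle digit.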

For the above to work, some more properties need to be maintained.

Maintenance releases should be no-brainers to install, so they must only fix 
regressions, crashers, security vulnerabilities, and the like. This means that 
once you’re on 3.4.0, you can always upgrade within the 3.4 series with 
minimal risk. It also means you get no optimizations, features, and the like.

Major releases must clearly document any incompatible changes or upgrade steps 
required. Because non-regression bugfixes aren’t backported, we want everyone 
to be able to upgrade from major release to major release, so incompatible 
changes must be kept to a minimum.

In part, this is just “don’t kill off a feature people use just because it’s a 
little annoying.” The more important one is “don’t introduce half-baked things 
that might need to change,” because people will come to rely on them before you 
get the updates finished. For features that will require multiple years to get 
right, they have to go behind a default-off configuration option. I’d strongly 
suggest they all have a uniform substring like “unstable”. That way, when a 
complaint comes in that the behavior of JMAP calendaring has changed, we can 
reply, “well, to use it, you had to turn on the unstable_jmap_calendaring 
option.”

If we go with this policy, we’ll need to…

 1. identify what issues are *blockers* to v3.2.0, meaning they’re regressions 
from v3.0 and would reasonably prevent someone from upgrading; this does *not* 
include all known bugs, since they may be bugs that already exist in the last 
stable release!

 2. pick a release target for v3.2.0; I will arbitrarily suggest March 2 as 
“not too far off, but far off enough that we can get things in order”; also, if 
you’re American, March 2 is 3/2 ;-)

 3. produce a changelog, and especially identify what changes in master need 
documentation as “incompatible changes”

 4. produce a list of changes in master that should be put behind an unstable 
configuration option and then do it

 5. decide when to stop merging non-release-related things to master

 6. make a plan for who will do monthly snapshot releases

I’ve spoken with ellie and Bron about just a few of these, such that I don’t 
think it’s all crazy. (ellie notes, correctly, I think, that the first set of 
releases like this will be the hard ones, where we work out things like “how do 
we keep track of incompatibilities, upgrade steps, and also how do we make 
snapshots dead easy to release.”) If there’s general agreement, I am definitely 
ready to pitch in and help try to make it work!

—
rjbs



Re: yearly release cycle

2019-12-17 Thread Ricardo Signes
On Tue, Dec 17, 2019, at 12:58, Anatoli wrote:
> Hi Ricardo,

Hi!

> But I couldn't understand from the description what are the benefits of
> tying major releases to certain calendar dates vs to make a release when
> certain desired features are implemented and well tested.

By promising a new major release every year, you know, when your significant 
improvement to Cyrus is accepted, it will very likely be released within a 
year. Right now, users who added a major feature in 2017 are still waiting for 
a stable release. For example, Sieve duplicate detection was implemented in 
March 2017. I don't think we have a stable version that has this feature. If 
this had been a contribution from a potential repeat contributor, it's easy to 
imagine that they'd have given up in frustration, by now. (Good thing it was 
good ol' reliable Ken!)

The problem with "we will release when X is ready" is that X might not be ready 
in a year, meaning all the little things don't get released. Also, you can't 
shove those into maintenance releases, because the little things still can be 
destabilizing, which makes a maintenance upgrade less of a no-brainer.

In the event that a cool new feature isn't quite ready a month before release, 
I would argue: yes, it has to wait another year. I think it will be pretty rare 
that this happens, though. If it comes up, of course, an exception to the rules 
could be discussed, but in my experience, it won't. These kinds of features, in 
largely volunteer-staffed projects, are rarely good at sticking to a timeline.

> Then, when you implement a new large feature, who would test it?

1. new large features should have tests written for them, which should be run 
by developers and dedicated test runners
2. some people always like to run snapshot releases; I have often done coding 
on dev releases of languages, and some people will surely run their personal 
services on snapshots
3. feature authors write features so they can use them; this means they're also 
both motivated and likely to use them before they're declared ready for general 
release

I feel pretty strongly that #3 is the big test. We're almost always close to 
bleeding edge Cyrus at work, because we have tons of new features that we rely 
on since cyrus-imapd-3.0.0. We know that many, many of these have been heavily 
tested in the real world, and we want to declare them generally ready for use, 
and then be able to do the same regularly as we move forward.

> Today,
> for example, I (as an advanced user and a potential community dev) can
> run 3.1 branch at some semi-production deployments (and I sometimes do)
> and report issues. If, with the new scheme, you only guarantee that the
> unstable branch just compiles, certainly I wouldn't be using it
> anywhere, and probably neither would other users. Then pre-production
> testing of new features would be exclusively the developers' task, with
> obvious limitations.

I think you are seriously overestimating the kind of stability guarantee you 
get from a 3.1 release. It's really not much more than the proposed snapshot 
releases, but on a looser timetable. Mostly, we get our current feedback from 
master, rather than snapshots, because there are fewer known snapshot 
deployments. Deploying snapshots regularly will give more points where we're 
specifically asking for feedback. (Also, I guaranteed Cassandane tests would 
pass, which is a *far* stronger guarantee than compilation.)

My expectation is that in reality, the snapshots will be, at any given time, 
very close to what Fastmail is running in production, or at least in (real, 
used by real people for real mail) testing environments.

> So when the devs are sure that a new feature works well (in their setups
> and for their use cases), it is included in the next major stable
> release... and suddenly a lot of migrating users start finding issues.
> That could create an impression that the new stable releases of Cyrus
> are not that stable at all.

I expect these features will have been heavily tested over the course of the 
time between releases.

> I don't understand what you mean here, but with the current scheme
> (AFAIK) the bug fixes go to the current stable branch (3.0) and all
> users receive them without delays.

There are two kinds of bugfixes. Some are "there is an obvious regression or 
crasher." Others are "there has long been a bug that meant that SELECT would 
fail on mUTF-7 sequences containing three hyphens in a row, and I fixed it!" 
The intent here is to include only the first category in new maintenance 
releases, because that optimizes maint releases for stability, making them easy 
to install without fear. The other fixes are put into the next possible 
snapshot for inclusion in the next major release.

I think your major concerns are:

1. new features might languish for a longer time than needed to be known stable
2. snapshots will be less reliable under this regime than before

I feel strongly that #1 wil

Re: yearly release cycle

2019-12-20 Thread Ricardo Signes
On Wed, Dec 18, 2019, at 05:13, Дилян Палаузов wrote:
> The email of Quanah Gibson-Mount from 25 July about the general policy on 
> integrating patches in Cyrus SASL is not
> answered.
> 
> Will the time–based release policy also apply to Cyrus SASL?

I think there was some discussion / decision on this a while back, but I don't 
remember. cyrus-sasl always floats just outside my field of vision… I *think* 
I'll be talking to Ken on Monday, who can clear things up.

-- 
rjbs


moving the Cyrus mailing lists?

2020-01-14 Thread Ricardo Signes
This morning, I updated a bunch of my Sieve to split out some mailing list 
mail, and remembered that I've been meaning to float the idea of moving Cyrus's 
development lists.

Right now, the Cyrus lists are hosted at lists.andrew.cmu.edu. My understanding 
is that CMU is no longer involved with Cyrus development, but we've got 
assurances that these lists will keep working indefinitely. So: I don't think 
we *need* to do anything.

Fastmail is deeply committed to Cyrus IMAP, and we have a mailing list product 
(Topicbox) that I think is really good, with great search (powered by Cyrus + 
Xapian), a great web UI for both public archives and composing mail, and it 
still does all the things you'd expect from mailing lists: accepts mail via 
SMTP, has a moderation queue, and so on.

You can see an example open source project hosted on Topicbox here: 
https://illumos.topicbox.com/ <https://illumos.topicbox.com/latest>

So, consider this an offer. I'll set up a gratis Topicbox organization for 
hosting the Cyrus lists and we'll import past mail from the lists into the 
organization and, if we can get the subscriber lists, transfer subscribers. For 
me, the benefits are having an easier-to-search archive and living without fear 
that the list might go away. Obviously, I'm also a Fastmail employee, and I'm 
happy to show off Topicbox, too.

I'll wait a while to hear general interest in the idea before doing anything 
else.

-- 
Ricardo Signes (rjbs)
CTO, Fastmail

guide to migrating to virtdomains: userid

2020-01-14 Thread Ricardo Signes
We have documented that we wish to discourage settings for userid other than 
virtdomains. In discussion earlier today, ellie and I chatted about the fact 
that 3.2 is a good time to take *some* action in that regard.

I think I have a question and a request for volunteer.

*QUESTION:* Do we have an established policy for how features are deprecated 
and removed? For example, if we can make a backward incompatible change in 
Cyrus X.Y.0, how long in advance do we need to provide notice, and via what 
means?

In this case, the very least we can do is add it to the documentation. The 
second-to-least thing we can do is issue a warning when a deprecated setting is 
found, even if we don't have a timetable for removal.

*REQUEST*: We want people to change from virtdomains:X to virtdomains:userid 
for all non-userid values of X. It would be a kindness to provide a migration 
guide. Is anyone competent to write such a guide? (My eye is drawn to one Bron 
Gondwana, but perhaps he migrated his user base so long ago that all his wisdom 
on that topic has since been replaced by new wisdom.)

-- 
Ricardo Signes (rjbs)
CTO, Fastmail

minutes, Cyrus call 2020-05-11

2020-05-11 Thread Ricardo Signes
I think it's been since July that we've not posted any minutes from the regular 
developer's Zoom call. This was basically fallout from some rescheduling and 
tool changes, but it's still not great. Sorry about that! Let's get this going 
again.

Today (2020-05-11 🇺🇸) in attendance: Bron, Ellie, Ken, Rik, Robert

 * ellie
   * Travis has been testing the wrong branches — generally always master 
instead of version or feature branches
   * Also, Cassandane make failures were unreported.
   * Partha did a ton of work and this should be fixed.
   * Travis is mostly just failing on Sieve.snooze_tzid - might add that to the 
disabled tests list.
   * Getting lots of bug reports for 3.2: all the things not tested during the 
RC/beta period
   * Marco found that syslog interception doesn’t work on Red Hat systems. Would 
be good to fix it for future Cassandane users.
   * Oh yeah, 3.2 is out! Only 2 months late. 3.4 for March 2021. Feature 
freeze end of 2020.

 * Robert
   * Big features like attachment indexing - always pushed on master. e.g. if 
3.2 users do attachment indexing then we have to tell them to reindex 
everything when they get to 3.4.
   * The plan going forward will be to keep big features on long-lived feature 
branches while they're developed, and only in master when proven out. (This 
will often mean "has run on Fastmail for a long while under real user load".)
   * The new build and deploy systems will make this much easier to manage.
   * Basic plan is to include branches with particular labels in the final 
build. This will be similar to the current Fastmail branch building stuff. The 
Fastmail builder will also be able to look at our private gitlab if we want to 
build anything in secret.
   * Was looking into a possible split-conversations bug, has been transferred 
to Bron.
   * Hoping to get JSCalendar done by June, so have been spending time on that.
   * Went down a rabbit hole of looking at how languages are stored in Xapian. 
Now much simpler.
   * Implemented a way to recalculate xapian language counts exactly. Can 
always recalculate language counts.
   * Close to getting to code review for attachment indexing. 
   * Now API can set “index level” on Xapian document - can allow for partial 
indexing. Bron suggests adding headers to the level data too.
 * Ken
   * Hopefully this week subscriptions for mailbox storage by UUID get finished
   * Added JMAP Backup tweaks that Fastmail requested - adding options to 
restore or not restore drafts.
   * Sieve work!
     * Added timezone argument to Sieve snooze - not sure if it’s been tested 
other than in Cassandane. [ed.: not yet! Fastmail Sieve generator needs an 
update first]
     * Did more hacking around Sieve! Rewrote bytecode parser to use a lookup 
table.
     * Broke and then fixed vacation fcc.
     * Started deprecating old non-standard sieve stuff. (mark/unmark, etc)
     * We’re still using legacy notify - hopefully we can switch to enotify and 
start deprecating it. Rik will talk to Ken about transition plan for Fastmail.
     * Been working on making a sieve decompiler that can convert bytecode into 
scripts again.
       * Might be hard to tell the difference between “elsif” and “else { if 
{} }”.
       * But it doesn’t need to be exact, just readable to see what was there.

 * Bron:
   * Wondering if we can “autodetect” that a user wants header indexing by 
seeing the queries they run and turn it on / flag them for reindex.
   * Talking to CMU about getting DNS transferred to Fastmail server where we 
can do better website hosting setup.

 * Rik
   * ideally, this week we will finish and begin using new-style Cyrus builds, 
which will mean some changes to the policy about "what goes on master" — will 
email when we get there
   * other than that, all Cyrus work has just been doing routine deploys

-- 
Ricardo Signes (rjbs)


minutes, Cyrus call 2020-05-18

2020-05-18 Thread Ricardo Signes
Today (2020-05-18 🇺🇸) in attendance: Bron, Ellie, Ken, Rik, Robert

Ken
 * Doing some work on the Backup/restore* APIs
 * Draft restore performance has been poor, because we’re tracking by message-id
 * We think we can get a reasonable approximation by looking in the same thread.
 * Ken still needs to do restoration of parent mailboxes. Mailboxes don't 
currently have the mailboxId of their parent, so when restored can't detect a 
rename of their former parent. We could make mailbox records retain parentid 
in UUID mailboxes, but it’s not there now. Maybe after the new layout code 
lands…

Robert
 * The last week was spent mostly on Xapian work
 * Priority moved from finishing attachment indexing toward making it possible 
to perform a reindex of all users
 * Removed experimental Xapian features from the main branch.
 * Fixed small bugs with Quota.
 * Issue with Xapian - right now the in-memory backend is slow - better 
solution is index to disk at temp_path and make that a tmpfs. Need to instruct 
people that it should be on a tmpfs.
 * We discussed whether to do a separate config option or to just use temp_path.
 * [ rjbs: I believe the conclusion was that one option is *probably* fine, but 
I'm not sure it was firmly concluded ]

ellie
 * Mostly been working through fixing the bugs from 3.2.0. Hoping to do 3.2.1 
later this week or early next week.

Bron
 * not a whole lot this week
 * made http_ws do session_new_id() between calls
 * Might be adding some pushes

Rik
 * also pretty light; hoping to send out notes on proposed new merge policy for 
master branch soon

-- 
rjbs

master branch: merging less

2020-05-28 Thread Ricardo Signes
Cyrus developers,

Right now, the policy on what goes in master is "when a committer thinks it's 
ready for master, it goes in master." This hasn't been a serious problem, but 
we can do better. What I'd like to do is have feature branches live longer *as* 
branches, and then be merged when we think they're done. This will make it 
easier to declare that a feature is really in Cyrus (even if it's only in a 
3.${odd} release), and to produce builds with features turned off and on. It 
will also make it easy to drop an experiment without scouring history much. 
Finally, it should make code review better, as it can be part of the "can we 
merge this?" process and when it happens, the whole feature changeset can be 
seen at once.

There isn't much actual policy change to talk about. Something like this:

> Changes to Cyrus are made in branches. Branches aren't merged until the 
> feature seems plausibly done. When a feature is still undergoing testing, 
> it's left in a branch. Pull requests are opened on GitHub, and before the 
> branch is merged, code review is completed by a committer other than the one 
> who wrote the change. Branches should, whenever practical, be rebased before 
> merging. Trivial bugfixes and changes to documentation may be applied 
> directly to master, but when in doubt, favor making a branch!

This will go somewhere in our "Contribute code and tests" section, but right 
now there seems to be no discussion of policy after the creation of a PR.

At present, Fastmail tends to run very close to master, and we're often testing 
features before they're in a state we'd consider mergeable under this policy. 
To make it easy to build a Cyrus that includes all the latest bugfixes and 
approved experimental branches, we built a tool to merge all the pull requests 
we've flagged for inclusion. You might 
also find this tool useful for building your own test Cyruses.

-- 
rjbs

expanding allowsetacl

2020-06-24 Thread Ricardo Signes
Behold this commit:

commit da8305164877735ba29b078151c70455f1aa6eea
Author: Bron Gondwana 
Date:   Wed Oct 30 14:13:38 2019 +1100

imapd: allow disabling the SETACL command

I'm not sure what the intent here was, but I *assume* it was related to our 
plan to kill off the ability of users to set their own ACLs. If so, I think we 
need two small changes which I'd like to get out pretty soon.

 1. relax the restriction so the admin user can still use SETACL
 2. tighten the restriction so it also restricts DELETEACL

That's it! Then we'll move on to rolling that out and cleaning up users' 
existing ACLs. I have tentatively made a task for this 
<https://app.liquidplanner.com/space/14822/projects/show/58761531>.
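
The intended behavior after those two changes can be sketched as a single permission check. This is an illustrative Python model only, not the imapd code: the function name is made up, and it assumes the `allowsetacl` setting behaves as a simple on/off switch.

```python
def acl_change_permitted(command: str, allowsetacl: bool, is_admin: bool) -> bool:
    """Decide whether an ACL-modifying command may run under the proposed rules."""
    if command.upper() not in ("SETACL", "DELETEACL"):
        raise ValueError(f"not an ACL-modifying command: {command}")
    # Change 1: an admin user is never blocked by the switch.
    # Change 2: the same switch governs DELETEACL as well as SETACL.
    return is_admin or allowsetacl


print(acl_change_permitted("SETACL", allowsetacl=False, is_admin=True))
print(acl_change_permitted("DELETEACL", allowsetacl=False, is_admin=False))
```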

Bron: I just want to make sure we're not stepping on work intended for 
something else!

-- 
Ricardo Signes (rjbs)
CTO, Fastmail


Re: expanding allowsetacl

2020-06-24 Thread Ricardo Signes
On Wed, Jun 24, 2020, at 16:25, Дилян Палаузов wrote:
> Horde/IMP allow the user to edit their own IMAP ACLs. Unfortunately I do not 
> know of any WebDAV client which users can utilize to edit their own ACLs. For 
> WebDAV editing ACLs is one way to share (a caldav) collection.
> 
> Why is there a plan to kill the ability of users to set their own ACLs?

The intent here is only to make it *possible* for an individual server to 
restrict this permission, for the sake of a more structured set of managed 
ACLs. That is: to provide a control panel for ACL management with presets, 
rather than full control to every user.

I would not suggest that ACL management be removed for all installs!

-- 
rjbs


Re: expanding allowsetacl

2020-06-25 Thread Ricardo Signes
On Thu, Jun 25, 2020, at 21:59, Bron Gondwana wrote:
> So... this does also handle DELETEACL, because they both call cmd_setacl, 
> just with different parameters. Maybe it should be called cmd_frobacl or 
> something.

Foolishly I read the commit message and stopped there instead of looking at the 
code.

Thanks for the clarification!

-- 
rjbs

minutes, cyrus dev call, 2020-06

2020-07-10 Thread Ricardo Signes
We have been keeping minutes, but not getting them out in a timely fashion. 
I've adjusted my calendar to make this happen more reliably. (The basic deal is 
that these meetings happen at 7:30 a.m. my time, and as soon as they're over I 
go to get some caffeine, and then they've fallen down my priority list.)

I realize these are less useful when they're sent late, but I'll learn my 
lesson better by sending them anyway.

*2020-06-01*
 * *Present*: Bron, Robert, ellie, Ken, Rik, Neil
 * [ discussion about reindexing Fastmail's large install ]
 * Bron
   * been doing some small changes
   * wants to get things shepherded to production
   * soon: sync replication code!!
   * need to give sync client a state struct to pass around between funcs
 * ellie
   * v3.2.1 on Friday!
   * experimental Debian build passing all the cunit tests on all the Debian 
platforms
   * it was the tests more than the code under testing being a problem
   * v3.3.0 dev tag should happen soon!
   * some backports from master are pending (probably)
   * Cass failing on v3.2 because of mismatch between 
expectations/fixes on v3.2
   * rjbs: can we put Cassandane into the cyrus repo?
     * yes but it sounds like for complex reasons we might want to; rjbs will 
follow up
   * also what will we do with the uuid mailboxes test code?
 * neilj
   * rules question: fromContactGroupId, convert errors to false — has just hit 
production
   * status of backup/restore bugs?
     * Ken: as far as I know, all tasks are done
     * What about the “contacts don’t get restored back to groups”
     * for now, we’ll say this is expected, and count people who notice/complain
     * will ask Matthew to do a review of code and/or tests?
 * murch
   * uuid mailboxes all rebased by Friday before Memorial Day
   * but since, been working on other things
   * would like to move Blob/get capability out of the core capabilities 
(because it isn’t core)
   * did a fix for mime parameter wrapping (*0* stuff) for boundary string 
(wtf??)
   * did some work on AuthIndicator (BIMI) blob retrieval; had led to “we 
should try to unify some branches of blob handling code”
   * hopefully uuid mailboxes is back to top slot (assuming for example that 
restore work doesn’t turn out to be a problem)
 * rsto
   * deleted mailboxes need to retain an entry in convdb
   * we can know mailbox has been deleted, which affects how to understand the 
flags of emails in mailbox [rjbs: not sure I got this *exactly* right]

*2020-06-08*
 * moving Blob/get to JMAP_BLOB_EXTENSION capability in cyrus master, while 
adding a revert to FM builds until our middleware is ready for it: coordinate
 * Ken
   * UUID mailboxes rebased, subs db. code has been updated; will upgrade on 
first open on UUID mailbox code
   * ready to start testing???
   * fixed a crasher in vacation responses (free of const str)
   * what’s next? maybe sieve in mailboxes; rjbs says: maybe let’s get back to 
Sieve JMAP, for which we wanted a fewer-http-round-trip API (for Fm’s usage 
pattern)
 * Bron
   * seems like something in the mailbox delete code path doesn’t always clean 
up code on disk
   * clearly still live; we’re seeing recent should-be-deleted data still on 
disk
   * possibly related to duff locking code, possibly in sync client
     * rjbs: how did this get found?
     * bron: audit_slot, an internal Fastmail tool
   * slow slog of license update continues; no clear progress atm

*2020-06-15*
 * rjbs
   * to discuss: time to create v3.3.x branch so we can merge 
usermeta-by-id-bis to master
   * discussion result: we’ll just keep the branches like they are unless we 
think of a way in which a second temporary main branch is better than this
   * testing of usermeta-by-id-bis on a vm is now happening
   * will post a public “give ken a heads up” if you have an intended thing to 
merge
   * when we know this can run safely on a replica, we can merge to master
 * Ken
   * managed attachments; seem to work okay in Apple!
   * CalConnect later today: will talk DAV caching
 * ellie
   * mostly working through 3.2 bugs, but maybe fixed the last one today
   * working on release notes for 3.3 dev so she can make proper v3.3.0
 * Bron
   * working on rename bugs: something with intermediates and renames is 
causing bugs
   * rjbs hopes we can rip out the intermediate-supporting code from cyrus 
entirely [someday]
 * rjbs again
   * what’s the effort required to move to single copy of metadata per email
   * bron: the convdb is not yet appropriate, as it’s a secondary source
   * general consensus seems that we will likely just replace how messages are 
stored (hash of emailid)

*2020-06-22*
 * Discuss design of JMAP Sieve
 * Ken: still working on UUID mailboxes
 * ellie
   * v3.3.0 — about halfway done with release notes, will send an email about 
places we’re stuck
   * v3.2.2 — released today; last known new bug in v3.2 may be fixed!
 * Bron
   * main thing: working on fixing behavior of user d

meeting minutes, 2020-08-03

2020-08-03 Thread Ricardo Signes
Minutes from this week's Cyrus dev call.  If we keep ending early, I can keep 
sending them immediately following the meeting! :)

 * Ken:
   * Where to store quotas for UUID mailboxes?
     * quota should stay with user during rename, so either needs to be by uuid 
or be transactional with rename; start at the leaf, walk up mailboxes to find 
quota; look at domain if none found
     * can put the quota file right in the folder (for non-domain quotaroots)
     * can we get rid of quotalegacy?
   * testing of uuid mailboxes on FM VM began, now Ken will be building new Fm
   * Updating JMAP Sieve spec and hope to get it posted as JMAP WG doc this week
   * …and some refactoring of DAV delete code, which has gotten a bit out of 
hand
 * Bron:
   * sync replication: Bron has put a star next to it.  TWO STARS!
   * Going to change reconnection and ping logic for the indexer such that it 
handles transient errors more easily.
     * right now squatter is pinging the indexer every time, can fail hard if 
no good
     * bron will change to “just try it and if it fails, retry” — fewer pings, 
better recovery from transient error
   * rjbs asks whether Melbourne shift at Fm learned more about the high CPU 
use of Tika?
     * bron: we can always restart it once an hour once we have retry in place
     * rsto: we can use JMX to investigate what it’s doing, tune the VM if 
that’s the issue
     * current theory: it’s some bogus message
   * big reindex of Fm users continues apace
 * ellie
   * a bunch of new 3.2 issues have arisen since last meeting; new tests, 
framework work
   * next week’s away time moved to October
 * rsto
   * IETF last week, of course
   * doing some misc. bug fixing
   * working on sec’y mode and inbox role for jmap calendars
   * scheduling default calendar is now protected and movable (thanks Ken!)
   * implementing inbox role on calendars introduces same problem we have in 
mailboxes: ordering now matters
   * sec’y mode: should be a matter of just setting some cyrus setting properly
   * rjbs: aren’t we already in secretary mode in Fastmail?
     * yes, because the Perl middleware is setting this by default
     * see https://fastmail.blog/2017/11/04/shared-calendar-improvements/
   * rjbs will learn more about calendar-address-set’s behavior here - 
https://tools.ietf.org/html/draft-pot-caldav-sharing-01#section-5.1
 * IETF notes, if any:
   * bron has to finish polishing minutes from CodiMD from his 3 chaired 
sessions
   * mnot proposed an HTTP API WG, which seems interesting and useful
   * no new work from JMAP and EXTRA?
     * maybe we add S/MIME validator sometime?
     * we’ll probably do the quota work
     * we’ll probably implement all of IMAP4rev2 at some point
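
The quota lookup from Ken's notes above (start at the leaf, walk up the mailboxes, look at the domain if nothing is found) could look roughly like the Python sketch below. It is not how Cyrus stores or resolves quotas: the function name, the two dictionaries, and the use of "." as the hierarchy separator are all assumptions made for illustration.

```python
from typing import Mapping, Optional, Tuple


def find_quotaroot(
    mailbox: str,
    mailbox_quotas: Mapping[str, int],
    domain: Optional[str] = None,
    domain_quotas: Optional[Mapping[str, int]] = None,
) -> Optional[Tuple[str, str]]:
    """Return ("mailbox", name) or ("domain", name) for the nearest quota, else None."""
    name = mailbox
    while name:
        if name in mailbox_quotas:
            return ("mailbox", name)
        # Drop the last component: "user.jane.Archive.2019" -> "user.jane.Archive"
        name = name.rpartition(".")[0]
    # No mailbox in the chain carries a quota, so fall back to the domain.
    if domain is not None and domain_quotas and domain in domain_quotas:
        return ("domain", domain)
    return None


quotas = {"user.jane": 1000}
domains = {"example.com": 50000}
print(find_quotaroot("user.jane.Archive.2019", quotas, "example.com", domains))
print(find_quotaroot("user.bob.Sent", quotas, "example.com", domains))
```

Keying the lookup by mailbox name, as here, is exactly what breaks on rename; keying by UUID (or updating the quota transactionally with the rename) is the point of the discussion above.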

-- 
Ricardo Signes (rjbs)
CTO, Fastmail


standardizing logging

2020-08-17 Thread Ricardo Signes
On our weekly call this morning, we were talking about moving toward 
standardizing the format of Cyrus logs.  My interest here is in making it easy 
for a program to read and classify logs.  That's not as simple as it could be, 
right now, because often a log line is sprintf-'d with parameters.  Even worse, 
sometimes those parameters have spaces in them.

I think we all agree on something like this:
 * produce a macro that does the logging in a standard format
 * the format leads with a "category", which is a fixed string
 * extra data to be included show up like auditlog does it:  foo= bar=
 * by using a macro, we can get the location (file, function) from which the 
log line is being emitted
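
To make the idea concrete, here is a minimal sketch of the proposed line shape and of how a program could read it back. It is an illustration only, not the agreed format: the function names, the double-quoting of values, and the `file`/`func` keys are all assumptions made for this example. Quoting the values is one way to cope with parameters that contain spaces.

```python
import re

def format_logline(category, source_file, source_func, **fields):
    """Build 'category key="value" ...' (hypothetical format).

    category is assumed to be a fixed token without spaces.
    source_file/source_func stand in for what a C macro would
    capture automatically at the call site.
    """
    all_fields = dict(fields, file=source_file, func=source_func)
    parts = [category]
    for key, value in all_fields.items():
        # Escape backslashes and quotes so values may contain spaces/quotes.
        escaped = str(value).replace("\\", "\\\\").replace('"', '\\"')
        parts.append(f'{key}="{escaped}"')
    return " ".join(parts)

# key="value" pairs, where value may contain backslash-escaped characters
_FIELD_RE = re.compile(r'(\w+)="((?:[^"\\]|\\.)*)"')

def parse_logline(line):
    """Split a line from format_logline back into (category, fields)."""
    category, _, rest = line.partition(" ")
    fields = {}
    for key, raw in _FIELD_RE.findall(rest):
        fields[key] = re.sub(r"\\(.)", r"\1", raw)  # undo escaping
    return category, fields

line = format_logline(
    "mailbox_open_failed", "mailbox.c", "mailbox_open",
    mailbox="user.cassandane.Sent Items", error='no "such" file',
)
category, fields = parse_logline(line)
print(category, fields["mailbox"])
```

Because the category is a fixed leading string and every parameter is a delimited key/value pair, a classifier only needs to look at the first token, and spaces inside values no longer break parsing. In C, the macro would supply the file and function itself (for example from `__FILE__` and `__func__`) instead of taking them as arguments.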

Next steps:
 * agree on the specifics of the above and that it's the way to log in new code
 * start converting old code (prioritized by value of reading lines from each 
part of the code)

Further thoughts before we get on to specifics?

-- 
Ricardo Signes (rjbs)

meeting minutes, 2020-08-17

2020-08-17 Thread Ricardo Signes
Fun fact about working at Fastmail: so many meetings are held in the US evening 
/ AU morning that most agendas get two dates on them.  The Cyrus call, though, 
is held in the US morning / AU night, so we only have one date to mention.

Anyway, the Cyrus development call happened this morning, and here are some 
notes from it:

2020-08-17:

 * log format normalization
   * figure out what we care about
   * create a syslog wrapper which generates the format we want!
   * logline(level, function_name, category, subcategory, message)
   * auditlog: type foo= bar= …
   * create a macro to get things auto-added (fn name, line no)
   * agree what we want to do for new code
   * change it
   * tell everyone
   * [rjbs: I already sent an email asking for more info!]
 * deleting ACL when we delete folder: should we?
   * when user is recreated from pending delete in Fm, ACLs not recreated, 
including (anyone p) which allows plus addr delivery
   * ACL should be copied to the DELETED. folder
   * Doesn’t matter if we leave it on the tombstone as well, but we should 
always be bringing it back from the folder we undelete, not using the copy on 
the tombstone, because that folder is GONE.
 * lmtp fuzzy matching with plus addressing will cause delivery to parent (if 
ACL allows) if child doesn’t exist
   * is this a recent change? to be investigated
 * discuss upgrading mailboxes.db for UUID mailboxes - specifically multiple 
mailboxes with same UID
   * [discussion to be noted by Ken]
 * robert
   * bunch of work on making sure Email/changes exists on destroy of containing 
mailbox
 * involves change of semantics of deleted modseq counter
 * we now bump the email deleted modseq when we soft delete a mailbox
 * with this change: after doing Mailbox/set.delete, Email/changes will get 
cannotCalculateChanges
   * running AFL against MIME parser: so far, no results
 * will be looking into using AFL dictionaries to refine attack
 * DigitalOcean maybe not the best place to be running this; will look for 
other resource
   * looked into key-too-long errors that turned up recently; potential fix has 
been sent upstream
   * next up: JSCalendar object updates; then updating role on Mailbox/set
   * rjbs to follow up internally about testing new JMAP Calendar branch on 
future
 * murch
   * sieve has been updated to match mailbox and snooze drafts
   * fixed weird bug in uppercasing in Sieve
 * bron
   * “synchronous replication is awesome!”
   * I’ve patched synclog; there was a thing called a synclog checklog, which 
picked up the log and ran it right there
   * that’s sync replication… but kind of bogus
   * now we create it in memory
   * next step is to create a backend connection with a 30s timeout and change 
the synclog reader to read from a buffer; if our buffer is nonempty, we hand 
that to synclog
   * when we get back an OK, we know it has been committed to the “blessed” 
replica
   * would like to get this under some real testing to figure out impact on 
performance and resource consumption
   * after that will be sync caching to improve performance of checking state 
of replica
 * append is 4 round trips; this will bring down to 3
 * drop another one by knowing about existing copies of msg
 * drop another one by embedding file (email) as part of the apply if file 
small enough; otherwise separate reserve upload
   * “user has moved” bug: we tried to efficiently combine requests, but didn’t 
deduplicate mailbox names, which led to aborts — once for every mailbox in a 
rename [rjbs: surely I got this a little wrong]
 * rjbs
   * we’ve changed how we aggregate errors
   * a question about how we get “deadlock avoided” in delivery from Sieve: 
rjbs to research
 * rsto
   * needs to update Cyrus docs for new query behavior
 * murch
   * dealing with conflict between include-in-fm PRs and usermeta-bis work

-- 
Ricardo Signes (rjbs)



Re: meeting minutes, 2020-08-17

2020-08-17 Thread Ricardo Signes
On Mon, Aug 17, 2020, at 09:21, Ricardo Signes wrote:
>  * discuss upgrading mailboxes.db for UUID mailboxes - specifically multiple 
> mailboxes with same UID
>    * [discussion to be noted by Ken]

   * the current code to update existing mailboxes.db to the new form (N, I, A 
records) processes records in order by name and creates I (uniqueid) records 
accordingly
   * however, this means that if multiple mailbox names correspond to the same 
uniqueid (e.g. in the case of a rename - current name plus tombstone), the I 
record will point to the last name record alphabetically
   * so, if I rename mailbox zzz to aaa, the I record will end up pointing to the 
zzz tombstone
   * the solution will be to create a hash of uniqueid → name(s) and 
create/update N and I records in the proper modseq/timestamp order
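
A minimal sketch of the problem and the proposed fix, in Python rather than Cyrus's C. The record layout (`uniqueid`, `modseq`, `tombstone`) and the function names are hypothetical, chosen for illustration, and the tie-break that prefers a live mailbox over a tombstone at equal modseq is an assumption of this sketch, not something decided on the call.

```python
from collections import defaultdict

def build_i_records_by_name(name_records):
    """Current behaviour: walk names alphabetically; the last name wins."""
    i_records = {}
    for name in sorted(name_records):
        i_records[name_records[name]["uniqueid"]] = name
    return i_records

def build_i_records_by_modseq(name_records):
    """Proposed fix: group names by uniqueid, then apply in modseq order."""
    by_uniqueid = defaultdict(list)
    for name, rec in name_records.items():
        # Sort key: modseq first; at equal modseq a live mailbox sorts
        # after a tombstone (assumption), so the live name wins.
        by_uniqueid[rec["uniqueid"]].append(
            (rec["modseq"], not rec["tombstone"], name)
        )
    i_records = {}
    for uniqueid, entries in by_uniqueid.items():
        for _modseq, _live, name in sorted(entries):
            i_records[uniqueid] = name  # the newest entry is written last
    return i_records

# Mailbox zzz renamed to aaa: zzz is left behind as a tombstone.
records = {
    "user.cassandane.zzz": {"uniqueid": "abc123", "modseq": 10, "tombstone": True},
    "user.cassandane.aaa": {"uniqueid": "abc123", "modseq": 11, "tombstone": False},
}
print(build_i_records_by_name(records))
print(build_i_records_by_modseq(records))
```

With the example records, the name-ordered pass leaves the I record pointing at the zzz tombstone, while the modseq-ordered pass leaves it pointing at aaa.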

-- 
rjbs

meeting minutes, 2020-08-31

2020-09-01 Thread Ricardo Signes
Here are the minutes of our (fairly short) call from this week.

 * rsto
   * fixed sieve/jmapquery issue with stray CR characters
   * ran both afl and honggfuzz fuzzers for approx. 10 CPU days for
   * jansson
   * libical
   * mime parser in Cyrus (message.c)
 * no crashers found, so it looks like no super-low hanging fruits (good!)
 * issue is with getting fuzzers to find relevant edges:
   * either brute-force with lots of CPUs and long time horizon (>=weeks)
   * and/or see if we can nudge edge detection further
   * rjbs to contact a successful fuzzer of Perl about AFL fuzzing tips
   * rsto:  We likely need more human labor.
   * updated Mailbox/set to handle role updates:
 * all new and regression tests pass.
 * will start review next week after code cleanup
 * brong
   * Distracted by IETF politics
   * sync replication: haven’t touched it
   * xapian reindex: grr.  Going to write to temp_path for a bunch of records 
and then compact to temp path again.
   * skip_domains: reviewed, will merge soon
   * added bespoke method UserCounters/get; this lets us avoid IMAP connections 
from middleware when sending startup event on EventSource connection
 * ellie
   * rjbs *does* owe her a list of error messages at the top of 
to-reformat-to-new-format
   * new 3.2 release!
   * working through some GitHub issues
   * qsort_r BSD compat may soon be better-solved
 * all the time, we’ve had a macro adjusting for argument ordering plus a 
function for platforms without qsort_r … but we didn’t compile out the fallback 
fn, so our macros always called *that!* and then on BSD, the macros changed 
the order of args, but then called our other-arg-ordered internal implementation
 * this means all our builds will start using platform-supplied qsort_r, 
which could expose some surprises
 * “It’s been fun.”
 * murch
   * bulk of last week: writing/rewriting code to transform quota db to use id 
instead of name … and deciding it wasn’t really worth it, given the amount of 
code that would then also be rewritten
 * murch: “I think I convinced Bron that the right thing is to leave it 
as-is.”
 * brong: “Yup.”
   * some new bug in uuidmailboxes has turned up; maybe an uninitialized 
return(?)
   * JMAP Sieve draft should get an update this week
 * rjbs to provide some feedback [*rjbs: hey I just did that!*]
 * rjbs
   * will soon be moving Cyrus mailing lists
   * *gotta* get build of jmap-calendar cyrus this week!
   * CalDAVTalk and jscalendar: should not be a conflict, right?
 * brong: Right.
 * [ some discussion about how this affects Cassandane because of 
multivalue rrules ]

-- 
Ricardo Signes (rjbs)
CTO, Fastmail


meeting minutes, 2020-09-14

2020-09-14 Thread Ricardo Signes
2020-09-14

 * ellie
   * xsyslog is now merged!
 * please get into a habit of using it in new code.  there’s some examples 
in b9191eeb09
 * if fm ops notices specific log outputs they want fixed they can throw 
them at ellie
 * will continue to chip away at converting existing ones as 
time/motivation permits
   * mbname_from_foo() constructor APIs (esp mbname_from_userid()) should work 
harder to only produce valid mbname objects?  mbname_from_intname() and 
mbname_from_extname() look fairly thorough, but mbname_from_userid() is 
especially naive, causing #3169.  i haven’t looked at the others (if there are 
others).
   * Australia will be back in DST again in a few weeks, so we should start 
thinking about when to flip the meeting time to the other end of the day
   * added a new makefile target to run each cunit test individually (instead 
of along with all others) and found 55 tests that don’t run on their own, 
mostly due to the same leaky abstraction
 * imapd.conf contents were being generated, but not every test that relied 
on those did the generation, so they passed because an earlier test did
 * some mbox name tests remain to fix, though (?)
   * custom annotation definitions: are specified case-insensitive, but we load 
them case-sensitively — would be easy to fix, but a comment suggests that our 
DAV annotations are case-sensitive.  Q:  Do we need *custom* DAV annotations?
 * brong
   * working on compaction stuff
   * first pass of synchronous replication is done!  needs testing, but maybe 
this week can go onto unstable testing stores
   * MR called “debug syslog lock ordering”, keeps ptr atr of open locks; the 
bookkeeping has been a mess because a bunch of code just closes the fd, not an 
explicit lock destroy
 * [ some discussion of how we can improve this and track locks better ]
   * fix search code path to better eliminate items in non-mail folders
 * rsto
   * AFL has detected an httpd crasher! \o/
   * ran fuzzer for 11d against the httpd, didn’t find any other crashers
   * may turn fuzzer back to other components, or maybe update its dictionary 
for better-targeted attack
   * JMAP calendars work still happening on the side
   * “I’d like to return to the coverage report.”
 * have found a few places that are clearly under-tested
 * the rest of us should give it a look to see what else looks 
insufficiently tested
 * also: can we run an instrumented-for-code-path-entry Cyrus on a 
used-but-not-for-customers branch to gather more real-world usage?  Answer:  
yes, probably, but there are some owner/mode shenanigans to make it all work
 * murch
   * spent a bunch of time on JMAP Sieve script code, mostly the testing code
 * there have been some memory management issues with ownership of strings

-- 
Ricardo Signes (rjbs)
CTO, Fastmail


Re: meeting minutes, 2020-09-14

2020-09-14 Thread Ricardo Signes
On Mon, Sep 14, 2020, at 09:44, Ricardo Signes wrote:
> 2020-09-14

>  * rsto
>    * AFL has detected an httpd crasher! \o/

My mistake, this was found by honggfuzz.

-- 
rjbs

meeting minutes, 2020-09-28

2020-09-28 Thread Ricardo Signes
Very short call today!

2020-09-28

 * bron
   * in the process of opening a PR to always run squatter in batch mode 
whether rolling or not
   * previously, without doing indexing in chunks, users could get stuck on 
indexing all mail and back up replication!
   * possibly we’re seeing squatter failing to index some messages; have had >0 
reports of “search has no results”; one possibility is that we’ve got a bug in 
the http-based append code, so it’s affecting dav and jmap moves?  (no synclog 
append/user?)
   * got a PR in flight to reduce how often we do a crc32, to avoid CPU load
 * murch
   * last week mostly jmap sieve (implemented query!)
   * Thu/Fri: memory leak squashing!
   * handling of STATUS:CANCELLED itip was busted and has been fixed; this 
affected both itip generation and itip reply processing [*rjbs:* this was a fun 
one]
   * have added a new argument to Backup/restore* to crank up log level 
per-call to see why a restore might be failing
 * ellie
   * been doing housekeeping work
   * next v3.2 release notes are ready, just pending a last run-through of 
low-hanging pending issues to fix
 * rjbs
   * not much Cyrus to report!
   * expect some planning of upcoming 2021 projects soon

-- 
rjbs



minutes, cyrus call, 2020-10-12

2020-10-12 Thread Ricardo Signes
Largely unedited, here are the minutes from this morning's dev call:

 * murch 
   * discussion of expanded header search semantics (prefix/suffix/substr match)
 * it's unclear whether we need or want this, given the yet unknown cost v. 
complexity v. benefit ratios
 * This isn't Xapian related at all, we'll be processing fields from the 
cyrus.cache.
   * Discuss replication of JMAPSieve script ids 
 * Given our current replication protocol, SieveScript object ids will not 
replicate. *Does this matter?* (This is: replicas will have the same script 
with a different id.)
 * First: *does* it get replicated? We need to double-check that it 
doesn't, but consensus seems to be that it isn't replicated.
 * We should fix it, if it isn't replicated, but we don't really *care* as 
far as Fastmail operation is concerned. It won't affect user-visible behavior.
   * Discuss expectations of cyr_ls 
 * always use unixhierarchysep?
 * let's use admin namespace
 * (we have other tools that use internal ns, it's been a pain, and we're 
fixing it as we go)
   * fixed cross-domain searching (new term added to Xapian)
   * still waiting on CR for 8bit characters in C-D
 * ellie 
   * PR #3166 contains code from OpenBSD, NetBSD, DragonflyBSD and FreeBSD's 
top(1) utilities (for turning process state flags into human-readable strings). 
Can we accept this? — We think so.
   * we should be expecting lots of small MRs for xsyslog conversions
 * rjbs 
   * we're working on cyrus ML conversion; waiting on Dave
 * brong 
   * locking! conversationsdb has locking problems, annotations are "a whole 
locking nightmare"
   * there's a lock inversion between JMAP and other calls in the locking of 
convdb vs. mailboxes
   * we can change how convdb locking works internally so you can take/release 
a lock without closing the entire db, add a user lock, then we're done!(?)
   * …but it's a bunch of work, and that will be Bron's next project for Cyrus
   * but this week is CalConnect week

-- 
Ricardo Signes (rjbs)
CTO, Fastmail