Hi Bron,

Sorry, I had to rearrange some quotes to put my answers in a more meaningful order.


Quoting Bron Gondwana <br...@fastmailteam.com>:

On Mon, Feb 4, 2019, at 22:00, Michael Menge wrote:

Quoting Bron Gondwana <br...@fastmail.fm>:

> On Mon, Feb 4, 2019, at 20:21, Michael Menge wrote:
>>

>> Feb 4 01:10:55 mailserv03 be/cyr_expire[7626]: IOERROR: opening
>> /srv/cyrus-be/ssd-part/L/user/XXXX/2185.: No such file or directory
>> Feb 4 01:10:55 mailserv03 be/cyr_expire[7626]: IOERROR: opening
>> /srv/cyrus-be/ssd-part/L/user/XXXX/2185.: No such file or directory
>> Feb 4 01:10:55 mailserv03 be/cyr_expire[7626]: IOERROR archive
>> user.XXXX 2185 failed to copyfile
>> (/srv/cyrus-be/ssd-part/L/user/XXXX/2185. =>
>> /srv/cyrus-hdd-be/archive/ssd-part/L/user/XXXX/2185.): Unknown code
>> ____ 255
>
>
> Ouch. Yeah, that could have been caused by a bug in delivery, and
> would definitely cause conversations DB corruption if the index file
> was updated but the conversations DB wasn't or vice versa.
>
>> The file was already at /srv/cyrus-hdd-be/archive/ssd-part/L/user/XXXX/2185.

I was able to fix these problems with reconstruct, and they haven't reappeared so far. There were also other accounts with IOERRORs regarding the conversations DB but no cyr_expire archive errors, so I believe these problems are not related.

I tried rebuilding the conversations DB for the accounts with errors, but other accounts show up with errors some time later. I haven't found anything they have in
common yet.
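
To look for something the affected accounts have in common, the IOERROR lines can be tallied per account and per logging process. This is only a minimal sketch: the regular expressions are assumptions derived from the three cyr_expire lines quoted above (each on a single line, as in syslog), and tally_ioerrors is a hypothetical helper name, not part of Cyrus. Messages that name no account are grouped under "(unknown)".

```python
import re
from collections import defaultdict

# Syslog prefix plus IOERROR marker; assumed from the sample lines quoted above.
IOERROR_RE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+\s+[\d:]+)\s+(?P<host>\S+)\s+"
    r"(?P<proc>[^\s\[]+)\[(?P<pid>\d+)\]:\s+IOERROR:?\s+(?P<msg>.*)$"
)
# Account name, taken either from a spool path (/user/NAME/) or a mailbox name (user.NAME).
USER_RE = re.compile(r"/user/([^/\s]+)/|\buser\.([^\s.]+)")


def tally_ioerrors(lines):
    """Count IOERROR messages per account, split by the process that logged them."""
    by_user = defaultdict(lambda: defaultdict(int))
    for line in lines:
        m = IOERROR_RE.match(line.strip())
        if not m:
            continue  # not an IOERROR line in the assumed format
        u = USER_RE.search(m.group("msg"))
        user = (u.group(1) or u.group(2)) if u else "(unknown)"
        by_user[user][m.group("proc")] += 1
    return {user: dict(procs) for user, procs in by_user.items()}


if __name__ == "__main__":
    import sys
    # Usage: python tally.py < /path/to/syslog
    for user, procs in sorted(tally_ioerrors(sys.stdin).items()):
        print(user, procs)
```

Accounts that only ever appear with conversations errors and never with a cyr_expire entry would support the view that the two problems are unrelated.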


>> > Anyway, I don't think that would break anything.
>> >
>> > metapartition-ssd: /srv/cyrus-ssd-be/meta/ssd-part
>> > metapartition_files: header index cache expunge squat annotations
>> > lock dav archivecache
>> >
>> > Ooh, I haven't tested having cache and archivecache on the same
>> > location. That's really interesting. Again, I'd be in favour of
>> > separation here, give them different paths. That might be tricky
>> > with ssd though, the way this is laid out. I assume you have some
>> > kind of symlink farm going on?
>> >
>>
>> I didn't know that there could be a problem with cache and archivecache.
>> At the time we decided on the configuration for cyrus 3.0 I looked at the
>> imapd.conf man page and for metapartition_files decided that I want all
>> meta files on the ssd storage. There was no indication in the man page
>> that there could be a problem.
>
> Fair. I'd have to test that to see if it works correctly. I would
> hope so, but I haven't tested that configuration. This is the
> downside with having lots of different ways to do things!
>
>> How do I separate location of archivecache from the other
>> metapartition path?
>> And fix the cache and archivecache files?
>
> This I don't know a good answer for. I will test if having the same
> path for cache and archivecache could fail. I THINK that I made the
> code safe for it, but I'm not sure that it's been tested.
>
>> No, there is no symlink farm. We have mounted different iSCSI volumes to
>> /srv/cyrus-ssd-be, /srv/cyrus-hdd-be and /srv/cyrus-be
>
> Right. That makes sense.

Did you have time to look into the cache/archivecache situation yet?


> Right! I do wonder if there are some bugs in 3.0.x which are fixed
> on master around delivery to archive partition. We definitely had
> bugs on master, but I thought they were newly introduced on master
> as well, which is why the fixes weren't backported. But if you're
> having files be in the wrong location, maybe there are bugs on 3.0.x
> as well.

Are all fixes from master backported to 3.0?

Are your new commits regarding CID related to the
"IOERROR: conversations_audit on load:" and "IOERROR: conversations_audit on store" errors?

I will try your new commits on my test servers in the next few days to see if they fix
the endless loop in ctl_conversationsdb that I have seen for some accounts.

Quoting myself (Re: prblems rebuilding conversations db) Jan 24, 2019

The program loops in build_cid_cb (imap/ctl_conversationsdb.c:189)

For the problematic mailbox that I tested, for every message
record->cid was NULLCONVERSATION, so mailbox_cacherecord,
message_update_conversations and mailbox_rewrite_index_record
were called, and each returned 0.

After iterating through all messages in the mailbox, count was > 0 and r == 0,
so the while condition (!r && count) was true for the next run.
At this point record->cid was still NULLCONVERSATION for every message,
which I guess should not be the case.
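
The behaviour described above can be modelled in a few lines. This is a simplified sketch based only on that description, not the actual C source: rebuild_cids and assign_cid are hypothetical names, the value of NULLCONVERSATION is a stand-in, and max_passes exists only so that the model terminates.

```python
NULLCONVERSATION = 0  # stand-in for the C constant; the value is an assumption


def rebuild_cids(records, assign_cid, max_passes=5):
    """Repeat passes over the mailbox while no error occurred and work was done.

    assign_cid is a hypothetical stand-in for the mailbox_cacherecord /
    message_update_conversations / mailbox_rewrite_index_record sequence and
    returns 0 on success. Returns (passes run, count seen in the last pass).
    """
    passes = 0
    while True:
        r = 0
        count = 0
        for record in records:
            if record["cid"] != NULLCONVERSATION:
                continue  # already has a conversation id, nothing to do
            count += 1
            r = assign_cid(record)
            if r:
                break  # an error ends the pass and the outer loop
        passes += 1
        # Mirrors "while (!r && count)", plus the artificial cap on passes.
        if r or not count or passes >= max_passes:
            return passes, count


def working(record):
    record["cid"] = 1  # the cid is actually stored
    return 0


def broken(record):
    return 0  # reports success but leaves the cid unset


if __name__ == "__main__":
    print(rebuild_cids([{"cid": NULLCONVERSATION} for _ in range(3)], working))
    print(rebuild_cids([{"cid": NULLCONVERSATION} for _ in range(3)], broken))
```

With working the loop ends after one extra pass that finds nothing left to do. With broken every pass sees the same count and r stays 0, so only the artificial cap stops it, which matches the endless loop I observed.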

Michael

--------------------------------------------------------------------------------
M.Menge                                Tel.: (49) 7071/29-70316
Universität Tübingen                   Fax.: (49) 7071/29-5912
Zentrum für Datenverarbeitung mail: michael.me...@zdv.uni-tuebingen.de
Wächterstraße 76
72074 Tübingen
