Re: [Dovecot] Time moved backwards errors
On Fri, 2009-04-03 at 20:08, Charles Marcus wrote: > >> Hopefully you meant ntpd, not ntpdate... but I believe the OP was > >> using a VM, so ntpd is not an option... > > > How is that so? we use some vmware and xen setups, and ntpd works > > fine on both > > http://support.ntp.org/bin/view/Support/KnownOsIssues#Section_9.2.2. Interesting. I run it fine on both: I skewed the clock on host and guest, and on boot the correct time was set.
Re: [Dovecot] Time moved backwards errors
On 4/2/2009 6:05 PM, Noel Butler wrote: >>> I see this all of the time on an EL4 machine when it is under >>> high load. The clock is synced to ntp but I still get dovecot >>> killing itself. Sometimes ntp loses sync but not always. >> Hopefully you meant ntpd, not ntpdate... but I believe the OP was >> using a VM, so ntpd is not an option... > How is that so? we use some vmware and xen setups, and ntpd works > fine on both http://support.ntp.org/bin/view/Support/KnownOsIssues#Section_9.2.2. -- Best regards, Charles
Re: [Dovecot] Time moved backwards errors
Top posting has nothing to do with it, take your nazi'ism and put it where someone else might care to listen :) If your name is in the "To" field (even a 5yo can figure that one out), then I responded to you, and I've seen RH boxes before that don't play like that, hence my suggestion. On Fri, 2009-04-03 at 16:04, Tom Diehl wrote: > On Fri, 3 Apr 2009, Noel Butler wrote: > > > Why not just run ntpd and be done with it, ensure you start ntpd with > > "-g" option > > If you are responding to me (it is hard to tell when you top post) I am > running ntpd. I still have the problem described below. I have never used the > -g option but the Red Hat init scripts call ntpdate at startup before ntpd > starts. In reading the man page, it would appear that I should get the same > results as using the -g switch, since the -g switch only allows 1 large > adjustment of the system time. > > Regards,
Re: [Dovecot] Time moved backwards errors
On Fri, 3 Apr 2009, Noel Butler wrote: Why not just run ntpd and be done with it, ensure you start ntpd with "-g" option If you are responding to me (it is hard to tell when you top post) I am running ntpd. I still have the problem described below. I have never used the -g option but the Red Hat init scripts call ntpdate at startup before ntpd starts. In reading the man pg, it would appear that I should get the same results as using the -g switch, since the -g switch only allows 1 large adjustment of the system time. Regards, -- Tom Diehl tdi...@rogueind.com Spamtrap address mtd...@rogueind.com On Fri, 2009-04-03 at 03:11, Tom Diehl wrote: On Thu, 2 Apr 2009, Mark Hedges wrote: On 2009 Apr 02 (Thu) at 12:49:43 +0100 (+0100), Bloke wrote: Hello, I am experiencing a number of 'Time moved backwards errors' such as: Mar 27 11:38:20 host-78-129-239-60 dovecot: imap-login: Time just moved backwards by 729 seconds. This might cause a lot of problems, so I'll just kill myself now. http://wiki.dovecot.org/TimeMovedBackwards Mar 27 15:20:10 host-78-129-239-60 dovecot: Time just moved backwards by 4214 seconds. This might cause a lot of problems, so I'll just kill myself now. http://wiki.dovecot.org/TimeMovedBackwards Mar 29 11:08:59 host-78-129-239-60 dovecot: imap-login: Time just moved backwards by 4341 seconds. This might cause a lot of problems, so I'll just kill myself now. http://wiki.dovecot.org/TimeMovedBackwards on my Centos 5.2 openvz - based VPS I have raised the issue with my VPS provider, who are responding that 'the jury is still out' as to whether this is a system, or Dovecot problem. FWIW I saw this the other day on a real-machine CentOS 5.2 system after a severe crash in which the system put all the drives into read-only mode. The machine had to be power-cycled and dovecot complained the clock was off by 5 minutes. Maybe something else going on with the 2.6.18 kernel? Dunno. I see this all of the time on an EL4 machine when it is under high load. 
The clock is synced to ntp but I still get dovecot killing itself. Sometimes ntp loses sync but not always. I have a cronjob check every 5 minutes to see if dovecot is running and if not restart it. I know that is not the right answer and I hate things like that but until I figure out how to fix it or upgrade the machine (coming soon) I will live with it. I am just throwing this out in case it means anything to someone else. FWIW dovecot version == dovecot-1.0.3-14 Regards,
Re: [Dovecot] Time moved backwards errors
On Thu, 2 Apr 2009, Charles Marcus wrote:
> On 4/2/2009, Tom Diehl (tdi...@rogueind.com) wrote:
>> I see this all of the time on an EL4 machine when it is under high load. The clock is synced to ntp but I still get dovecot killing itself. Sometimes ntp loses sync but not always.
> Hopefully you meant ntpd, not ntpdate...
ntpd.
> but I believe the OP was using a VM, so ntpd is not an option...
Well actually it is. At least with xen and I think vmware. I run ntpd on dom0 and the domU's sync to it. In my case the EL4 machine is a real machine and running ntpd.
Regards, -- Tom Diehl tdi...@rogueind.com Spamtrap address mtd...@rogueind.com
Re: [Dovecot] Time moved backwards errors
> Why not just run ntpd and be done with it, ensure you start ntpd with > "-g" option It's more than this. ntpd should be started ASAP in the boot process, and then as late as possible in the boot process one should run ntp-wait. Only after ntp-wait finishes should time-critical services be started. H
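The ordering described above (start ntpd early, block until the clock is synchronized, only then start time-critical services) can be sketched in a few lines. This is an illustration only: `is_synced` and `start_service` are hypothetical callables standing in for whatever the init system provides (for example a wrapper around ntp-wait), and the timeout values are arbitrary.

```python
import time
from typing import Callable, Iterable


def start_after_sync(is_synced: Callable[[], bool],
                     start_service: Callable[[str], None],
                     services: Iterable[str],
                     timeout_s: float = 600.0,
                     poll_s: float = 1.0,
                     sleep: Callable[[float], None] = time.sleep) -> bool:
    """Start services only once is_synced() reports a synchronized clock.

    Returns False (and starts nothing) if the clock is still not
    synchronized after roughly timeout_s seconds of polling.
    """
    waited = 0.0
    while not is_synced():
        if waited >= timeout_s:
            return False
        sleep(poll_s)
        waited += poll_s
    for name in services:
        start_service(name)
    return True
```

The point of the design is that nothing time-sensitive runs before the one large boot-time correction has already happened.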
Re: [Dovecot] Time moved backwards errors
On Fri, 2009-04-03 at 05:43, Charles Marcus wrote: > On 4/2/2009, Tom Diehl (tdi...@rogueind.com) wrote: > > I see this all of the time on an EL4 machine when it is under high > > load. The clock is synced to ntp but I still get dovecot killing > > itself. Sometimes ntp loses sync but not always. > > Hopefully you meant ntpd, not ntpdate... but I believe the OP was using > a VM, so ntpd is not an option... How is that so? we use some vmware and xen setups, and ntpd works fine on both
Re: [Dovecot] Time moved backwards errors
Why not just run ntpd and be done with it, ensure you start ntpd with "-g" option On Fri, 2009-04-03 at 03:11, Tom Diehl wrote: > On Thu, 2 Apr 2009, Mark Hedges wrote: > > > > > > > On 2009 Apr 02 (Thu) at 12:49:43 +0100 (+0100), Bloke wrote: > >> Hello, > >> > >> I am experiencing a number of 'Time moved backwards errors' such as: > >> > >> Mar 27 11:38:20 host-78-129-239-60 dovecot: imap-login: Time just moved > >> backwards by 729 seconds. This might cause a lot of problems, so I'll just > >> kill myself now. http://wiki.dovecot.org/TimeMovedBackwards > >> Mar 27 15:20:10 host-78-129-239-60 dovecot: Time just moved backwards by > >> 4214 seconds. This might cause a lot of problems, so I'll just kill myself > >> now. http://wiki.dovecot.org/TimeMovedBackwards > >> Mar 29 11:08:59 host-78-129-239-60 dovecot: imap-login: Time just moved > >> backwards by 4341 seconds. This might cause a lot of problems, so I'll > >> just kill myself now. http://wiki.dovecot.org/TimeMovedBackwards > >> > >> on my Centos 5.2 openvz - based VPS > >> > >> I have raised the issue with my VPS provider, who are > >> responding that 'the jury is still out' as to whether this > >> is a system, or Dovecot problem. > > > > FWIW I saw this the other day on a real-machine CentOS 5.2 > > system after a severe crash in which the system put all the > > drives into read-only mode. The machine had to be > > power-cycled and dovecot complained the clock was off by 5 > > minutes. Maybe something else going on with the 2.6.18 > > kernel? Dunno. > > I see this all of the time on an EL4 machine when it is under high load. > The clock is synced to ntp but I still get dovecot killing itself. > Sometimes ntp looses sync but not always. > > I have a cronjob check every 5 minutes to see if dovecot is running and if > not restart it. I know that is not the right answer and I hate things like > that > but until I figure out how to fix it or upgrade the machine (coming soon) I > will live with it. 
> > I am just throwing this out in case it means anything to someone else. > > FWIW dovecot version == dovecot-1.0.3-14 > > Regards,
Re: [Dovecot] Time moved backwards errors
On 4/2/2009, Tom Diehl (tdi...@rogueind.com) wrote: > I see this all of the time on an EL4 machine when it is under high > load. The clock is synced to ntp but I still get dovecot killing > itself. Sometimes ntp loses sync but not always. Hopefully you meant ntpd, not ntpdate... but I believe the OP was using a VM, so ntpd is not an option... -- Best regards, Charles
Re: [Dovecot] Time moved backwards errors
On Thu, 2 Apr 2009, Mark Hedges wrote: On 2009 Apr 02 (Thu) at 12:49:43 +0100 (+0100), Bloke wrote: Hello, I am experiencing a number of 'Time moved backwards errors' such as: Mar 27 11:38:20 host-78-129-239-60 dovecot: imap-login: Time just moved backwards by 729 seconds. This might cause a lot of problems, so I'll just kill myself now. http://wiki.dovecot.org/TimeMovedBackwards Mar 27 15:20:10 host-78-129-239-60 dovecot: Time just moved backwards by 4214 seconds. This might cause a lot of problems, so I'll just kill myself now. http://wiki.dovecot.org/TimeMovedBackwards Mar 29 11:08:59 host-78-129-239-60 dovecot: imap-login: Time just moved backwards by 4341 seconds. This might cause a lot of problems, so I'll just kill myself now. http://wiki.dovecot.org/TimeMovedBackwards on my Centos 5.2 openvz - based VPS I have raised the issue with my VPS provider, who are responding that 'the jury is still out' as to whether this is a system, or Dovecot problem. FWIW I saw this the other day on a real-machine CentOS 5.2 system after a severe crash in which the system put all the drives into read-only mode. The machine had to be power-cycled and dovecot complained the clock was off by 5 minutes. Maybe something else going on with the 2.6.18 kernel? Dunno. I see this all of the time on an EL4 machine when it is under high load. The clock is synced to ntp but I still get dovecot killing itself. Sometimes ntp looses sync but not always. I have a cronjob check every 5 minutes to see if dovecot is running and if not restart it. I know that is not the right answer and I hate things like that but until I figure out how to fix it or upgrade the machine (coming soon) I will live with it. I am just throwing this out in case it means anything to someone else. FWIW dovecot version == dovecot-1.0.3-14 Regards, -- Tom Diehl tdi...@rogueind.com Spamtrap address mtd...@rogueind.com
Re: [Dovecot] Time moved backwards errors
On 2009 Apr 02 (Thu) at 12:49:43 +0100 (+0100), Bloke wrote: > Hello, > > I am experiencing a number of 'Time moved backwards errors' such as: > > Mar 27 11:38:20 host-78-129-239-60 dovecot: imap-login: Time just moved > backwards by 729 seconds. This might cause a lot of problems, so I'll just > kill myself now. http://wiki.dovecot.org/TimeMovedBackwards > Mar 27 15:20:10 host-78-129-239-60 dovecot: Time just moved backwards by 4214 > seconds. This might cause a lot of problems, so I'll just kill myself now. > http://wiki.dovecot.org/TimeMovedBackwards > Mar 29 11:08:59 host-78-129-239-60 dovecot: imap-login: Time just moved > backwards by 4341 seconds. This might cause a lot of problems, so I'll just > kill myself now. http://wiki.dovecot.org/TimeMovedBackwards > > on my Centos 5.2 openvz - based VPS > > I have raised the issue with my VPS provider, who are > responding that 'the jury is still out' as to whether this > is a system, or Dovecot problem. FWIW I saw this the other day on a real-machine CentOS 5.2 system after a severe crash in which the system put all the drives into read-only mode. The machine had to be power-cycled and dovecot complained the clock was off by 5 minutes. Maybe something else going on with the 2.6.18 kernel? Dunno. Mark
Re: [Dovecot] Time moved backwards errors
time should *never* move backwards. OSes (and programs) assume time is always moving forward. Injure your VPS provider. On 2009 Apr 02 (Thu) at 12:49:43 +0100 (+0100), Bloke wrote: :Hello, : :I am experiencing a number of 'Time moved backwards errors' such as: : :Mar 27 11:38:20 host-78-129-239-60 dovecot: imap-login: Time just moved backwards by 729 seconds. This might cause a lot of problems, so I'll just kill myself now. http://wiki.dovecot.org/TimeMovedBackwards :Mar 27 15:20:10 host-78-129-239-60 dovecot: Time just moved backwards by 4214 seconds. This might cause a lot of problems, so I'll just kill myself now. http://wiki.dovecot.org/TimeMovedBackwards :Mar 29 11:08:59 host-78-129-239-60 dovecot: imap-login: Time just moved backwards by 4341 seconds. This might cause a lot of problems, so I'll just kill myself now. http://wiki.dovecot.org/TimeMovedBackwards : :on my Centos 5.2 openvz - based VPS : :I have raised the issue with my VPS provider, who are responding that 'the jury is still out' as to whether this is a system, or Dovecot problem. : :Reading the TimeMovedBackwards article on the Dovecot wiki, and the kernel mailing list thread referred to on it, it would seem quite apparent that this is caused by a failure of the gettimeofday() function to reliably return the correct value. : :If anyone has any advice as to how I could proceed to fix this (I currently have a watch on the Dovecot service which restarts it after any failures) or how I should best phrase this to get it resolved 'upstream' by the VPS provider, I would be very grateful. : :Thank you, : :Patrick Vale : -- Valerie: Aww, Tom, you're going maudlin on me ... Tom: I reserve the right to wax maudlin as I wane eloquent ... -- Tom Chapin
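For readers wondering how a daemon can notice that time moved backwards at all, here is a minimal sketch of the general idea: remember the last wall-clock reading and compare it with the next one. This is not Dovecot's actual code; the class name and the injectable `now` parameter are assumptions made for illustration.

```python
import time


class BackwardsClockDetector:
    """Track successive wall-clock readings and report backward steps."""

    def __init__(self, now=time.time):
        self._now = now
        self._last = now()

    def check(self) -> float:
        """Return how many seconds the clock went backwards since the
        previous reading, or 0.0 if it moved forward or stood still."""
        current = self._now()
        moved_back = max(0.0, self._last - current)
        self._last = current
        return moved_back
```

A caller would run `check()` once per event-loop iteration and decide what to do (sleep, log, or exit) when the result is non-zero.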
Re: [Dovecot] Time moved backwards ....
On February 18, 2009 6:08:04 PM +0200 Harry Lachanas wrote:
> of course I do .
> The server I was talking about was a test server, fresh install, and I corrected the time zone
not possible. unix systems keep track of UTC time, not local time. you may have changed the time zone *and also* corrected the time.
> I haven't reached the point where a summer or winter time change happened ... :-) , yet .
again, irrelevant.
-frank
Re: [Dovecot] Time moved backwards ....
on 2-18-2009 8:12 AM Rob Mangiafico spake the following:
> On Wed, 18 Feb 2009, Harry Lachanas wrote:
>> OK.. So I synced the clock and got
>> dovecot: Time just moved backwards by 1 seconds. I'll sleep now until we're back in present. http://wiki.dovecot.org/TimeMovedBackwards
>
> Is this related to the leap second that occurred yesterday?
>
> Rob
There are no leap seconds planned for at least the first half of 2009. The next bulletin will be in July. The last leap second was in December 2008.
INFORMATION ON UTC - TAI
NO positive leap second will be introduced at the end of June 2009. The difference between Coordinated Universal Time UTC and the International Atomic Time TAI is: from 2009 January 1, 0h UTC, until further notice: UTC-TAI = -34 s
Leap seconds can be introduced in UTC at the end of the months of December or June, depending on the evolution of UT1-TAI. Bulletin C is mailed every six months, either to announce a time step in UTC, or to confirm that there will be no time step at the next possible date.
-- MailScanner is like deodorant... You hope everybody uses it, and you notice quickly if they don't
Re: [Dovecot] Time moved backwards ....
Rob wrote: > Is this related to the leap second that occurred yesterday? There was no leap second in February. H
Re: [Dovecot] Time moved backwards ....
-BEGIN PGP SIGNED MESSAGE- Hash: SHA1 On Wed, Feb 18, 2009 at 06:08:04PM +0200, Harry Lachanas wrote: > to...@tuxteam.de wrote: >> -BEGIN PGP SIGNED MESSAGE- >> Hash: SHA1 >> >> On Wed, Feb 18, 2009 at 05:17:18PM +0200, Harry Lachanas wrote: >> >>> OK.. >>> So I synced the clock >>> and got >>> >>> dovecot: Time just moved backwards by 1 seconds. I'll sleep now until >>> we're back in present. http://wiki.dovecot.org/TimeMovedBackwards >>> >>> ( The first time I did this the clock moved backwards 2 hours after a >>> timezone change and dovecot suicided ) That shouldn't happen: the clock of your server should be UTC and *independent* of the time zone! The time zone should be used to display times (I'm assumig an Unix-like server here -- no clue about other OSes). >>> I think I understand the concept ... >>> However a mail server should probably be synchronized to the local time >>> >> >> You don't really mean what you are saying, I think. Anyway: what do you >> > of course I do . > The server I was talking about was a test server, fresh install, and I > corrected the time zone > So If U are offended, I am sorry No, I'm not, don't worry :-) > On the other hand if U have NOT something real to share please do not > answer at least with an empty answer See below: I did propose two ways of coping with this: rebooting (implicitly) and adjusting softly the clock. > I haven't reached the point where a summer or winter time change happened > ... :-) , yet . > I would hate the moment that I would have to explain to my users that they > have to wait for a couple of hours > until the server wakes up again ... Well -- I tried to point out alternatives. See, this problem comes up in this list time and again, and I just wanted to say that dovecot *really can't do anything about it*. It's a general problem with servers. And there seems to be more problems if you talk about changing time just because you need to change the time zone. 
And you shouldn't go about changing your clock when summer time comes. Don't hesitate to ask if all this is mumbo-jumbo to you (but off-list, please, as it's quite off-topic by now) Regards - -- tomás -BEGIN PGP SIGNATURE- Version: GnuPG v1.4.6 (GNU/Linux) iD8DBQFJnDzgBcgs9XrR2kYRAoGbAJ0Zl3OAs3DpXBQURqSXRDyiLU/yFQCfVJbA 4MSDGlk42LuuyPiJa+T1E2g= =kbJM -END PGP SIGNATURE-
Re: [Dovecot] Time moved backwards ....
on 2-18-2009 7:17 AM Harry Lachanas spake the following:
> OK..
> So I synced the clock
> and got
How are you syncing the clock? The preferred method is to run ntpd to keep the clock synced by nudging the timer faster or slower instead of doing large time corrections.
> dovecot: Time just moved backwards by 1 seconds. I'll sleep now until
> we're back in present. http://wiki.dovecot.org/TimeMovedBackwards
Daemons that work with timestamped logs and files don't like time to go backwards, especially when they are running. Dovecot is nice about it, while some others will either silently die, or corrupt something.
> ( The first time I did this the clock moved backwards 2 hours after a timezone change and dovecot suicided )
> I think I understand the concept ...
> However a mail server should probably be synchronized to the local time.
>
> Suggestions ...???
>
> Harry.
-- MailScanner is like deodorant... You hope everybody uses it, and you notice quickly if they don't
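To make the "nudging" trade-off concrete, here is a rough arithmetic sketch of how long it takes to remove a clock offset by slewing instead of stepping. It assumes a maximum slew rate of 500 ppm, which is the commonly documented limit for ntpd; the function name is made up for this example and other time daemons may use different rates.

```python
def slew_seconds(offset_s: float, rate_ppm: float = 500.0) -> float:
    """Seconds of real time needed to slew away offset_s at rate_ppm.

    At 500 ppm the clock is adjusted by 0.5 ms per second of real time.
    """
    return abs(offset_s) * 1_000_000 / rate_ppm


one_second = slew_seconds(1.0)    # 2000.0 s, a little over half an hour
big_jump = slew_seconds(729.0)    # 1458000.0 s, about 16.9 days
```

This is why slewing is fine for small drifts while large offsets are normally corrected once, at boot, before time-sensitive daemons are running.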
Re: [Dovecot] Time moved backwards ....
On Feb 18, 2009, at 10:17 AM, Harry Lachanas wrote:
> dovecot: Time just moved backwards by 1 seconds. I'll sleep now until we're back in present. http://wiki.dovecot.org/TimeMovedBackwards
>
> ( The first time I did this the clock moved backwards 2 hours after a timezone change and dovecot suicided )
Dovecot tracks the UTC time, which doesn't change when you change timezones..
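A short sketch of the point about UTC: the same instant can be displayed in any time zone without the underlying timestamp changing. The instant used here is taken from the date header of the original post (17:17:18 +0200); the helper name `render` is only for illustration.

```python
from datetime import datetime, timedelta, timezone


def render(epoch_seconds: float, utc_offset_hours: int) -> datetime:
    """Display an epoch timestamp in a fixed-offset time zone."""
    tz = timezone(timedelta(hours=utc_offset_hours))
    return datetime.fromtimestamp(epoch_seconds, tz=tz)


# One instant, expressed as seconds since the epoch (always UTC based).
instant = datetime(2009, 2, 18, 15, 17, 18, tzinfo=timezone.utc).timestamp()

as_utc = render(instant, 0)     # wall clock shows 15:17:18
as_plus2 = render(instant, 2)   # wall clock shows 17:17:18
```

Both objects compare equal and carry the same timestamp, so changing only the zone cannot make the system clock jump; a two-hour jump means the clock itself was also reset.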
Re: [Dovecot] Time moved backwards ....
Words by Harry Lachanas [Wed, Feb 18, 2009 at 05:17:18PM +0200]: > OK.. > So I synced the clock > and got > > dovecot: Time just moved backwards by 1 seconds. I'll sleep now until > we're back in present. http://wiki.dovecot.org/TimeMovedBackwards > > ( The first time I did this the clock moved backwards 2 hours after a > timezone change and dovecot suicided ) > I think I understand the concept ... > However a mail server should probably be synchronized to the local time > . > > Suggestions ...??? > Configure ntpd, it syncs the time by micro adjusts. Or do I fail to see the problem ? -- Jose Celestino | http://japc.uncovering.org/files/japc-pgpkey.asc "One man's theology is another man's belly laugh." -- Robert A. Heinlein
Re: [Dovecot] Time moved backwards ....
On Wed, 18 Feb 2009, Harry Lachanas wrote:
> OK.. So I synced the clock and got
> dovecot: Time just moved backwards by 1 seconds. I'll sleep now until we're back in present. http://wiki.dovecot.org/TimeMovedBackwards
Is this related to the leap second that occurred yesterday?
Rob
Re: [Dovecot] Time moved backwards ....
to...@tuxteam.de wrote: -BEGIN PGP SIGNED MESSAGE- Hash: SHA1 On Wed, Feb 18, 2009 at 05:17:18PM +0200, Harry Lachanas wrote: OK.. So I synced the clock and got dovecot: Time just moved backwards by 1 seconds. I'll sleep now until we're back in present. http://wiki.dovecot.org/TimeMovedBackwards ( The first time I did this the clock moved backwards 2 hours after a timezone change and dovecot suicided ) I think I understand the concept ... However a mail server should probably be synchronized to the local time You don't really mean what you are saying, I think. Anyway: what do you of course I do . The server I was talking about was a test server, fresh install, and I corrected the time zone So If U are offended, I am sorry On the other hand if U have NOT something real to share please do not answer at least with an empty answer You will probably make other people tired + disappointed too, searching the list trying to locate the answer to this question do with all those little file timestamps coming from the future? I haven't reached the point where a summer or winter time change happened ... :-) , yet . I would hate the moment that I would have to explain to my users that they have to wait for a couple of hours until the server wakes up again ... Also add that I tend not to explain in techno-mambo-jumbo-geek ( metaphorically ;-) ) terminology what is going on Having said all of the above ... My apologies to the list for the extra paragraphs and being of-topic ... Cheers, Harry. Many servers dislike time jumping backwards. I've seen even cron killing itself. Above reaction of dovecot is indeed quite friendly. FWIW -- if I have to turn back the clock of a server I don't want to reboot, I just slow down the clock and wait... Regards - -- tomás -BEGIN PGP SIGNATURE- Version: GnuPG v1.4.6 (GNU/Linux) iD8DBQFJnCoIBcgs9XrR2kYRAvaTAJwMsK2IcRN6WDJcnaVrvuALzrmQmACfVC9O HJzrzZZl3FLDq90AhgTimUk= =4PDz -END PGP SIGNATURE-
Re: [Dovecot] Time moved backwards ....
-BEGIN PGP SIGNED MESSAGE- Hash: SHA1 On Wed, Feb 18, 2009 at 05:17:18PM +0200, Harry Lachanas wrote: > OK.. > So I synced the clock > and got > > dovecot: Time just moved backwards by 1 seconds. I'll sleep now until we're > back in present. http://wiki.dovecot.org/TimeMovedBackwards > > ( The first time I did this the clock moved backwards 2 hours after a > timezone change and dovecot suicided ) > I think I understand the concept ... > However a mail server should probably be synchronized to the local time You don't really mean what you are saying, I think. Anyway: what do you do with all those little file timestamps coming from the future? Many servers dislike time jumping backwards. I've seen even cron killing itself. Above reaction of dovecot is indeed quite friendly. FWIW -- if I have to turn back the clock of a server I don't want to reboot, I just slow down the clock and wait... Regards - -- tomás -BEGIN PGP SIGNATURE- Version: GnuPG v1.4.6 (GNU/Linux) iD8DBQFJnCoIBcgs9XrR2kYRAvaTAJwMsK2IcRN6WDJcnaVrvuALzrmQmACfVC9O HJzrzZZl3FLDq90AhgTimUk= =4PDz -END PGP SIGNATURE-
Re: [Dovecot] Time Moved Backwards Error Keeps Happening
[EMAIL PROTECTED] <[EMAIL PROTECTED]> wrote: > I followed the above official wiki link and apply the ntpd service for > adjusting the server timestamp. After running the ntpd service, I found > the server timestamp drift is only 14.333 microsecond, I think which > would be reasonable. That is reasonable, but dovecot (when it dies), does not think it's true. Just check more carefully to make sure there aren't any other scripts or programs changing the time more drastically. -- Sahil Tandon <[EMAIL PROTECTED]>
Re: [Dovecot] Time moved backwards by 4398 seconds
Bill Cole wrote: > You might even be better off configuring your system to not use the > TSC as a clock source. There's a strong chance that you won't really > be sacrificing anything that you actually use. Time to conclude on this. Rather than going along with my Dovecot hack, I went and changed to "clocksource=hpet", though I am not quite sure about the difference. As I was about to reboot the server anyway, I applied the updates from Red Hat, including this one (RHSA-2008:0519-24): * Due to a regression, "gettimeofday" may have gone backwards on certain x86 hardware. This issue was quite dangerous for time-sensitive systems, such as those used for transaction systems and databases, and may have caused applications to produce incorrect results, or even crash. (The same kind of patch has been going in and out of the Ubuntu kernel, where it appears to cause trouble for suspend) I have not seen the problem since applying those two changes, and am not currently willing to find out whether one of them is enough :) Cheers, Anders.
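For anyone wanting to confirm which clock source a Linux box ended up with after such a change, the kernel exposes it through sysfs. The sketch below assumes the usual path `/sys/devices/system/clocksource/clocksource0/current_clocksource`; it may be absent on older kernels or inside some containers, and the function names are invented for this example.

```python
from pathlib import Path

# Usual sysfs location on Linux; treat it as an assumption for your kernel.
DEFAULT_PATH = "/sys/devices/system/clocksource/clocksource0/current_clocksource"


def current_clocksource(path: str = DEFAULT_PATH) -> str:
    """Return the name of the active clock source, e.g. 'tsc' or 'hpet'."""
    return Path(path).read_text().strip()


def uses_tsc(path: str = DEFAULT_PATH) -> bool:
    """True if the kernel is currently using the TSC as its clock source."""
    return current_clocksource(path) == "tsc"
```

Calling `current_clocksource()` after a reboot with `clocksource=hpet` on the kernel command line would be expected to return `hpet` if the option took effect.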
Re: [Dovecot] Time moved backwards by 4398 seconds
Bill Cole wrote: > At 11:10 AM +0200 6/20/08, Anders wrote: >>By that line, the entire "time moved backwards" thing does not belong >>in Dovecot. > > I suspect that you don't understand why that is in Dovecot. Timo has > explained it in detail a few times, but the bottom line is simple: > running through the same system-clock time more than once induces a > very real risk of destroying mail. ... and running through the same system clock twice is exactly what you will do if you fail to detect a temporary forward jump of 1.2 hours when it happens. I might follow your advice of "notsc", it makes me a bit uncomfortable that gettimeofday() can fail for other applications as well. Cheers, Anders.
Re: [Dovecot] Time moved backwards by 4398 seconds
At 11:10 AM +0200 6/20/08, Anders wrote:
> Johannes Berg <[EMAIL PROTECTED]> writes:
>> On Fri, 2008-06-20 at 10:53 +0200, Anders wrote:
>>> I was puzzled that it was always 4398 seconds, in particular because this server runs an NTP daemon. A little searching for this problem shows that it is an issue with the Linux kernel gettimeofday(), see e.g. http://lkml.org/lkml/2007/8/23/96
>> The thread puts it down to buggy hardware and puts a workaround into the kernel where it belongs, not in dovecot.
I think it is more accurate to say "hardware being used for a purpose its designers did not intend" instead. Using the TSC as a clock has been iffy for quite some time, and defaulting to it in the kernel is a risky design choice and must be implemented with extreme caution. It's not that the hardware is buggy, but rather that it does things by design that are not obvious from a high-level description.
> That's not helpful. By that line, the entire "time moved backwards" thing does not belong in Dovecot.
I suspect that you don't understand why that is in Dovecot. Timo has explained it in detail a few times, but the bottom line is simple: running through the same system-clock time more than once induces a very real risk of destroying mail.
> Anyway, I was not proposing the patch to be included, just asking for advice as to whether it would be safe. I even noted that it was ugly.
"Safe" is subjective. I think it would be safer (at the cost of a bounded amount of time) to nanosleep or maybe usleep once and retry the call rather than to go into the loop.
> As I am already compiling Dovecot myself, I prefer a patch there, rather than diverting from the distribution kernel.
You might even be better off configuring your system to not use the TSC as a clock source. There's a strong chance that you won't really be sacrificing anything that you actually use.
-- Bill Cole [EMAIL PROTECTED]
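The "sleep once and retry" idea suggested above can be sketched as follows. This is an illustration of the approach, not the patch discussed in the thread (which is not shown here); the threshold `max_jump_s`, the pause length and the function name are all hypothetical.

```python
import time


def read_clock_with_retry(previous: float,
                          now=time.time,
                          max_jump_s: float = 60.0,
                          pause_s: float = 0.001,
                          sleep=time.sleep) -> float:
    """Read the clock; if it differs from `previous` by more than
    max_jump_s, pause briefly and read it exactly one more time.

    The cost is bounded: at most one short sleep and one extra read.
    """
    reading = now()
    if abs(reading - previous) > max_jump_s:
        sleep(pause_s)
        reading = now()  # a transient glitch should be gone by now
    return reading
```

If the second reading still shows the jump, it is returned as is, so a genuine clock change is accepted after one retry rather than looped on.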
Re: [Dovecot] Time moved backwards by 4398 seconds
> This bug causes Dovecot to run the IO loop in the future for one > iteration, and then die when we get back to present time. > > By the time Dovecot dies, some damage could already have happened, for > example if ioloop_time is stored to permanent storage. Hmm, ok, I was under the impression it aborted early enough. > BTW, can you send a link to the post with the resolution for the kernel? > I didn't manage to find any final conclusion, but I would like to > propose the fix to our kernel provider :). I didn't actually look up any fix, I just saw that there were fixes in the thread. Didn't check which one, if any, was merged though. johannes
Re: [Dovecot] Time moved backwards by 4398 seconds
Johannes Berg wrote:
> On Fri, 2008-06-20 at 11:10 +0200, Anders wrote:
>> Johannes Berg <[EMAIL PROTECTED]> writes:
>>> On Fri, 2008-06-20 at 10:53 +0200, Anders wrote:
>>>> I was puzzled that it was always 4398 seconds, in particular because this server runs an NTP daemon. A little searching for this problem shows that it is an issue with the Linux kernel gettimeofday(), see e.g. http://lkml.org/lkml/2007/8/23/96
>>> The thread puts it down to buggy hardware and puts a workaround into the kernel where it belongs, not in dovecot.
>> That's not helpful. By that line, the entire "time moved backwards" thing does not belong in Dovecot.
> Why? That's a different thing, dovecot is detecting that something is wrong and that it will be unsafe for it to continue operating. That's an entirely different class than trying to work around the detected problem, imho.
This bug causes Dovecot to run the IO loop in the future for one iteration, and then die when we get back to present time. By the time Dovecot dies, some damage could already have happened, for example if ioloop_time is stored to permanent storage.
BTW, can you send a link to the post with the resolution for the kernel? I didn't manage to find any final conclusion, but I would like to propose the fix to our kernel provider :).
Thanks, Anders.
Re: [Dovecot] Time moved backwards by 4398 seconds
On Fri, 2008-06-20 at 11:10 +0200, Anders wrote: > Johannes Berg <[EMAIL PROTECTED]> writes: > > > On Fri, 2008-06-20 at 10:53 +0200, Anders wrote: > >> > >> I was puzzled that it was always 4398 seconds, in particular because > >> this server runs an NTP daemon. A little searching for this problem > >> shows that it is an issue with the Linux kernel gettimeofday(), see > >> e.g. http://lkml.org/lkml/2007/8/23/96 > > > > The thread puts it down to buggy hardware and puts a workaround into the > > kernel where it belongs, not in dovecot. > > That's not helpful. > > By that line, the entire "time moved backwards" thing does not belong > in Dovecot. Why? That's a different thing, dovecot is detecting that something is wrong and that it will be unsafe for it to continue operating. That's an entirely different class than trying to work around the detected problem, imho. > Anyway, I was not proposing the patch to be included, just asking for > advice as to whether it would be safe. I even noted that it was ugly. Ok. Yeah, it does seem safe, but as Timo said it'll loop there in case there is an actual forward jump. > As I am already compiling Dovecot myself, I prefer a patch there, > rather than diverting from the distribution kernel. Heh, ok. johannes
Re: [Dovecot] Time moved backwards by 4398 seconds
On Fri, 2008-06-20 at 10:53 +0200, Anders wrote:
> Dovecot (v1.1.rc8) died tonight, with an error about time moving backwards by 4398 seconds. I can see from logs that this has happened a few times before with the imap processes, without me noticing. I sure noticed the master process missing, though :-).
>
> I was puzzled that it was always 4398 seconds, in particular because this server runs an NTP daemon. A little searching for this problem shows that it is an issue with the Linux kernel gettimeofday(), see e.g. http://lkml.org/lkml/2007/8/23/96
>
> Below is a patch (untested) to work around this issue. Do you see something wrong with this approach, apart from the ugliness?
Only problem I can see is that if there's a legitimate jump of 4395 seconds it'll busy-loop for 5 seconds before continuing. Probably not very likely to happen.
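A small arithmetic check on the recurring figure, offered as an observation rather than something established in this thread: 4398 seconds is almost exactly 2^42 nanoseconds, which would be consistent with a fixed-width value wrapping somewhere in a nanosecond-based time calculation. The snippet only verifies the arithmetic and makes no claim about which kernel code path is responsible.

```python
NANOSECONDS_PER_SECOND = 1_000_000_000

# 2**42 nanoseconds expressed in seconds.
wrap_ns = 2 ** 42
wrap_s = wrap_ns / NANOSECONDS_PER_SECOND

# Whole seconds, as a log message with integer seconds would report it.
reported = int(wrap_s)
```

A jump that is always the same size points at a systematic cause rather than at NTP adjustments, which would vary from one occurrence to the next.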
Re: [Dovecot] Time moved backwards by 4398 seconds
Johannes Berg <[EMAIL PROTECTED]> writes:

> On Fri, 2008-06-20 at 10:53 +0200, Anders wrote:
>>
>> I was puzzled that it was always 4398 seconds, in particular because
>> this server runs an NTP daemon. A little searching for this problem
>> shows that it is an issue with the Linux kernel gettimeofday(), see
>> e.g. http://lkml.org/lkml/2007/8/23/96
>
> The thread puts it down to buggy hardware and puts a workaround into the
> kernel where it belongs, not in dovecot.

That's not helpful.

By that line, the entire "time moved backwards" thing does not belong in Dovecot.

Anyway, I was not proposing the patch to be included, just asking for advice as to whether it would be safe. I even noted that it was ugly.

As I am already compiling Dovecot myself, I prefer a patch there, rather than diverting from the distribution kernel.

Cheers,
Anders.
Re: [Dovecot] Time moved backwards by 4398 seconds
On Fri, 2008-06-20 at 10:53 +0200, Anders wrote:
> Dovecot (v1.1.rc8) died tonight, with an error about time moving
> backwards by 4398 seconds. I can see from logs that this has happened a
> few times before with the imap processes, without me noticing. I sure
> noticed the master process missing, though :-).
>
> I was puzzled that it was always 4398 seconds, in particular because
> this server runs an NTP daemon. A little searching for this problem
> shows that it is an issue with the Linux kernel gettimeofday(), see
> e.g. http://lkml.org/lkml/2007/8/23/96

The thread puts it down to buggy hardware and puts a workaround into the kernel where it belongs, not in dovecot.

johannes
Re: [Dovecot] Time moved backwards
At 10:12 AM -0400 5/15/08, Neal Becker wrote: Problem I see is that an external script that *unconditionally* relaunches dovecot could be a terribly problem. It's better for dovecot to do it itself in this particular failure, because it's the only one who knows that it was just a date issue, and relaunching is safe. That certainly does not need to be the case. Dovecot does log the reason in a trivially parsed manner, so a purpose-built watchdog could rather easily detect this particular failure mode. One truly simple change that could be made that would facilitate restarting under this special situation would be to have a specific exit value for Dovecot self-destructing in a time reversal, so a model where a parent process (e.g. launchd) is playing the watchdog role could use the exit value to decide whether to relaunch. That would be less likely to run into conflict with existing practice than internal logic terminating the existing processes and relaunching. On the other hand, a more subtle handling of this issue internally without terminating at all is probably the best approach, since only Dovecot itself can really know whether an immediate relaunch after a time reversal is really safe or how to make it so. For the specific problem of "infant mortality" at boot time that initiated this thread, the best approach is still prevention. Dovecot is far from the only daemon that will run into trouble if time jumps backwards, and there are widely used approaches (such as blocking the startup procedure on a successful ntpdate and using sound hardware whose clock doesn't drift too much in the first place) that minimize the risk of time reversal after sensitive daemons have started. If the problem of time stepping backwards after boot is really *common* then it may well be a dangerous cosmetic approach to just make Dovecot auto-recover (internally or externally) because it happens to be the only daemon that watches for and reacts to such an event. 
It is impossible to prevent every backwards time step, but preventing the predictable cases system-wide is a sounder approach than making one daemon adapt to what should be a very rare event. -- Bill Cole [EMAIL PROTECTED]
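The dedicated exit value suggested above could be consumed by a parent watchdog roughly as follows. This is a minimal sketch under stated assumptions: the exit status 75 is made up for illustration (nothing in this thread says Dovecot defines such a value), and `supervise` and `should_relaunch` are hypothetical names.

```python
import subprocess
import time

# Hypothetical: an exit status reserved for "clock moved backwards".
# The value 75 is invented for this sketch; it is not a documented Dovecot code.
EXIT_TIME_MOVED_BACKWARDS = 75


def should_relaunch(exit_status: int) -> bool:
    """Relaunch only for the one failure mode known to be a clock issue."""
    return exit_status == EXIT_TIME_MOVED_BACKWARDS


def supervise(command, max_restarts=3, delay_seconds=0.0, run=subprocess.call):
    """Run command, relaunching it only after the time-reversal exit status.

    Returns the exit status of the last run. Any other status, or running
    out of restarts, ends supervision so that a human can look at it.
    """
    restarts = 0
    while True:
        status = run(command)
        if not should_relaunch(status) or restarts >= max_restarts:
            return status
        restarts += 1
        time.sleep(delay_seconds)
```

The restart cap is there because of the concern raised earlier in the thread: an unconditional relaunch loop on a machine whose clock keeps misbehaving can make things worse rather than better.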
Re: [Dovecot] Time moved backwards
On May 15, 2008, at 5:12 PM, Neal Becker wrote:

> Problem I see is that an external script that *unconditionally*
> relaunches dovecot could be a terrible problem. It's better for
> dovecot to do it itself in this particular failure, because it's the
> only one who knows that it was just a date issue, and relaunching is
> safe.

If someone wants to code this relaunching, feel free to do it and if the code looks good I'll include it.

I'll maybe try fixing this some other way for v1.2 (verify all timestamp comparison code, make timeout handling work if the clock moves, log only a warning). I'm not sure if I'm going to use a monotonic clock, since I'd like to know when time moves backwards or too much forwards and add some hooks there to do things like update dotlock file mtimes.
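One way to "know when time moves" while still having a steady reference is to compare how far the wall clock advanced against how far a monotonic clock advanced over the same interval. The sketch below illustrates that idea only; it is not Dovecot's implementation, and `ClockJumpDetector`, its tolerance, and the hook list are all hypothetical.

```python
import time


class ClockJumpDetector:
    """Compare wall-clock progress with monotonic progress between checks."""

    def __init__(self, tolerance=1.0, wall=time.time, mono=time.monotonic):
        self._wall = wall
        self._mono = mono
        self._tolerance = tolerance
        self._last_wall = wall()
        self._last_mono = mono()
        self.hooks = []  # callables that receive the jump in seconds

    def check(self):
        """Return the wall-clock jump in seconds, or 0.0 if within tolerance.

        A negative result means the wall clock moved backwards relative to
        the monotonic clock; a positive one means it jumped forwards.
        """
        now_wall, now_mono = self._wall(), self._mono()
        jump = (now_wall - self._last_wall) - (now_mono - self._last_mono)
        self._last_wall, self._last_mono = now_wall, now_mono
        if abs(jump) <= self._tolerance:
            return 0.0
        for hook in self.hooks:
            hook(jump)  # e.g. log a warning or refresh lock-file timestamps
        return jump
```

A caller would invoke `check()` once per event-loop iteration and register hooks for whatever needs adjusting after a jump, instead of terminating the process.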
Re: [Dovecot] Time moved backwards
Here's another thought:

From man ntpd:

If the -x option is included on the command line, the clock will never be stepped and only slew corrections will be used. The issues should be carefully explored before deciding to use the -x option. The maximum slew rate possible is limited to 500 parts-per-million (PPM) as a consequence of the correctness principles on which the NTP protocol and algorithm design are based. As a result, the local clock can take a long time to converge to an acceptable offset, about 2,000 s for each second the clock is outside the acceptable range. During this interval the local clock will not be consistent with any other network clock and the system cannot be used for distributed applications that require correctly synchronized network time.
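To put the man page's numbers in perspective, the convergence time follows directly from the 500 PPM limit: each second of offset costs 1,000,000 / 500 = 2,000 seconds of slewing. A small sketch of that arithmetic follows; the function name is made up, and it ignores whatever offset ntpd considers "acceptable".

```python
MAX_SLEW_PPM = 500  # maximum slew rate quoted from the ntpd man page


def slew_seconds(offset_seconds: float, slew_ppm: float = MAX_SLEW_PPM) -> float:
    """Seconds of elapsed time needed to slew away the given clock offset."""
    return abs(offset_seconds) * 1_000_000 / slew_ppm


for offset in (1, 60, 4398):
    needed = slew_seconds(offset)
    print(f"{offset:>5} s offset -> {needed:,.0f} s (~{needed / 86400:.1f} days)")
```

By this estimate a clock that is one minute off needs about 120,000 seconds, well over a day, to converge, which is why -x trades the backwards step for a long period of inconsistent time.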
Re: [Dovecot] Time moved backwards
> Problem I see is that an external script that *unconditionally* relaunches
> dovecot could be a terrible problem. It's better for dovecot to do it
> itself in this particular failure, because it's the only one who knows that
> it was just a date issue, and relaunching is safe.

But as Timo has explained, simply relaunching might *not* be safe.

johannes
Re: [Dovecot] Time moved backwards
Bill Cole wrote: > At 10:20 PM +0400 5/14/08, Eugene wrote: >>Hi people, >> >>>From: Adam McDougall <[EMAIL PROTECTED]> >>>I would just like to mention a circumstance that happened to me this >>>Sunday. We had a total power outage in our building, longer than our >>>UPS's could last and we don't have a generator for servers (nor is it >>>economical or needed). When the power came back on, my local NTP server >>>came on at the same time as my mail servers, as well a majority of my >>>other servers. My servers tried to step their time to be in sync with >>>my local NTP server, which was still busy trying to sync itself with >>>outside sources, which takes a while, so my mail servers did not get an >>>answer. Later, dovecot died because the time finally synced, and I >>>found out why pretty quick (have seen this before) but this was an >>>unusual situation. >>> >>>My point is, we had an unusual circumstance, and even though I've taken >>>steps to have my mail servers sync their time at boot and run ntpd >>>afterwards, there are some circumstances in which this is not enough, >>>and dovecot still died. Its not always because someone was lazy about >>>their time setup. >> >>My point exactly. It's amazing how some people are quick to ramble >>about someone else's administrative incompetence without taking time >>to read the situation. > > I most certainly did read your description of the situation, and my > use of the phrase "administrative incompetence" should not be taken > personally. I did not say (or mean) "administrator incompetence" and > would not try to make that sort of judgment at a distance. > >> (One person even suggested hacking the dovecot startup script to >>run ntpdate -- useless as ntpd already occupies the ports). > > That's one of the things that "ntpdate -u" is good for. > > >>Fact is, ntpd can take unpredictable delay before the initial >>time-step. 
Delay that can't be controlled, and it would be >>unreasonable to delay starting mail services until it is guaranteed >>to complete. Then, dovecot dies, and admin (who is not always >>immediately available) has to start it manually anyway (especially >>as it is not clear what to do with possibly unsynced timestamps) -- >>only after the unnecessary downtime. > > Or you can have an external watchdog that re-launches Dovecot if it > dies. This approach handles a broader set of failure modes and on > some OS's is a built-in feature of the startup subsystem. > > Because of the fact that Dovecot may be running in an environment > with an external watchdog, perhaps one like launchd or classical > SysV/Solaris init that can catch the exit of the process it spawned > and use it to trigger an immediate respawn. This means that adding an > internal respawn inside Dovecot that will not cause breakage on any > system is not as simple as it may seem. > >>So, the question is: why on earth can't we add a single line of code >>to dovecot to restart itself after terminating? > > You can do just that yourself if you believe that it is the best > option for your circumstances and adequate to handle the problem you > are having. One line of code might well do the trick you want on your > system. If Timo puts the functionality in the code he distributes, it > will need to be a great deal more than one line of code. > Problem I see is that an external script that *unconditionally* relaunches dovecot could be a terribly problem. It's better for dovecot to do it itself in this particular failure, because it's the only one who knows that it was just a date issue, and relaunching is safe.
Re: [Dovecot] Time moved backwards
At 10:20 PM +0400 5/14/08, Eugene wrote: Hi people, From: Adam McDougall <[EMAIL PROTECTED]> I would just like to mention a circumstance that happened to me this Sunday. We had a total power outage in our building, longer than our UPS's could last and we don't have a generator for servers (nor is it economical or needed). When the power came back on, my local NTP server came on at the same time as my mail servers, as well a majority of my other servers. My servers tried to step their time to be in sync with my local NTP server, which was still busy trying to sync itself with outside sources, which takes a while, so my mail servers did not get an answer. Later, dovecot died because the time finally synced, and I found out why pretty quick (have seen this before) but this was an unusual situation. My point is, we had an unusual circumstance, and even though I've taken steps to have my mail servers sync their time at boot and run ntpd afterwards, there are some circumstances in which this is not enough, and dovecot still died. Its not always because someone was lazy about their time setup. My point exactly. It's amazing how some people are quick to ramble about someone else's administrative incompetence without taking time to read the situation. I most certainly did read your description of the situation, and my use of the phrase "administrative incompetence" should not be taken personally. I did not say (or mean) "administrator incompetence" and would not try to make that sort of judgment at a distance. (One person even suggested hacking the dovecot startup script to run ntpdate -- useless as ntpd already occupies the ports). That's one of the things that "ntpdate -u" is good for. Fact is, ntpd can take unpredictable delay before the initial time-step. Delay that can't be controlled, and it would be unreasonable to delay starting mail services until it is guaranteed to complete. 
Then, dovecot dies, and admin (who is not always immediately available) has to start it manually anyway (especially as it is not clear what to do with possibly unsynced timestamps) -- only after the unnecessary downtime. Or you can have an external watchdog that re-launches Dovecot if it dies. This approach handles a broader set of failure modes and on some OS's is a built-in feature of the startup subsystem. Because of the fact that Dovecot may be running in an environment with an external watchdog, perhaps one like launchd or classical SysV/Solaris init that can catch the exit of the process it spawned and use it to trigger an immediate respawn. This means that adding an internal respawn inside Dovecot that will not cause breakage on any system is not as simple as it may seem. So, the question is: why on earth can't we add a single line of code to dovecot to restart itself after terminating? You can do just that yourself if you believe that it is the best option for your circumstances and adequate to handle the problem you are having. One line of code might well do the trick you want on your system. If Timo puts the functionality in the code he distributes, it will need to be a great deal more than one line of code. Kind of reminds me of the "fsck_y_enable=YES" option in rc.conf. Without it, if fsck does not like someting during reboot, the server would just sit there in single-user prompt, waiting for (expensive) console operations. Which is actually the right choice in some circumstances. -- Bill Cole [EMAIL PROTECTED]
Re: [Dovecot] Time moved backwards
Hi people, From: Adam McDougall <[EMAIL PROTECTED]> I would just like to mention a circumstance that happened to me this Sunday. We had a total power outage in our building, longer than our UPS's could last and we don't have a generator for servers (nor is it economical or needed). When the power came back on, my local NTP server came on at the same time as my mail servers, as well a majority of my other servers. My servers tried to step their time to be in sync with my local NTP server, which was still busy trying to sync itself with outside sources, which takes a while, so my mail servers did not get an answer. Later, dovecot died because the time finally synced, and I found out why pretty quick (have seen this before) but this was an unusual situation. My point is, we had an unusual circumstance, and even though I've taken steps to have my mail servers sync their time at boot and run ntpd afterwards, there are some circumstances in which this is not enough, and dovecot still died. Its not always because someone was lazy about their time setup. My point exactly. It's amazing how some people are quick to ramble about someone else's administrative incompetence without taking time to read the situation. (One person even suggested hacking the dovecot startup script to run ntpdate -- useless as ntpd already occupies the ports). Fact is, ntpd can take unpredictable delay before the initial time-step. Delay that can't be controlled, and it would be unreasonable to delay starting mail services until it is guaranteed to complete. Then, dovecot dies, and admin (who is not always immediately available) has to start it manually anyway (especially as it is not clear what to do with possibly unsynced timestamps) -- only after the unnecessary downtime. So, the question is: why on earth can't we add a single line of code to dovecot to restart itself after terminating? Kind of reminds me of the "fsck_y_enable=YES" option in rc.conf. 
Without it, if fsck does not like something during reboot, the server would just sit there at the single-user prompt, waiting for (expensive) console operations.

Best wishes
Eugene
Re: [Dovecot] Time moved backwards
On 5/13/08 3:33 PM, Scott Silva wrote: This would be a good case for running ntpdate on startup at least on the ntp server. Just point it to a reliable outside server. AFAIR RedHat and clones do this in the init script for ntpd. ...and how much more TIME shall we spend rechewing this non-dovecot issue ? B. Bodger
Re: [Dovecot] Time moved backwards
on 5-13-2008 12:13 PM Adam McDougall spake the following: Charles Marcus wrote: On 5/13/2008, Eugene ([EMAIL PROTECTED]) wrote: I guess terminating all current connections and restarting all processes would be pretty safe, but it's not really a high priority change for me.. Nevertheless, it would be very nice if you could fix it. It's a fairly big availability problem (for us, at least). The problem is not so much how dovecot deals with this issue, the problem is, why is your server having such drastic problems keeping its time sane? Fix that, and your problem disappears. I would just like to mention a circumstance that happened to me this Sunday. We had a total power outage in our building, longer than our UPS's could last and we don't have a generator for servers (nor is it economical or needed). When the power came back on, my local NTP server came on at the same time as my mail servers, as well a majority of my other servers. My servers tried to step their time to be in sync with my local NTP server, which was still busy trying to sync itself with outside sources, which takes a while, so my mail servers did not get an answer. Later, dovecot died because the time finally synced, and I found out why pretty quick (have seen this before) but this was an unusual situation. My point is, we had an unusual circumstance, and even though I've taken steps to have my mail servers sync their time at boot and run ntpd afterwards, there are some circumstances in which this is not enough, and dovecot still died. Its not always because someone was lazy about their time setup. But it doesn't cause me "big availability problems" since in general, my time is fine. This would be a good case for running ntpdate on startup at least on the ntp server. Just point it to a reliable outside server. AFAIR RedHat and clones do this in the init script for ntpd. -- MailScanner is like deodorant... 
You hope everybody uses it, and you notice quickly if they don't
Re: [Dovecot] Time moved backwards
Charles Marcus wrote: On 5/13/2008, Eugene ([EMAIL PROTECTED]) wrote: I guess terminating all current connections and restarting all processes would be pretty safe, but it's not really a high priority change for me.. Nevertheless, it would be very nice if you could fix it. It's a fairly big availability problem (for us, at least). The problem is not so much how dovecot deals with this issue, the problem is, why is your server having such drastic problems keeping its time sane? Fix that, and your problem disappears. I would just like to mention a circumstance that happened to me this Sunday. We had a total power outage in our building, longer than our UPS's could last and we don't have a generator for servers (nor is it economical or needed). When the power came back on, my local NTP server came on at the same time as my mail servers, as well a majority of my other servers. My servers tried to step their time to be in sync with my local NTP server, which was still busy trying to sync itself with outside sources, which takes a while, so my mail servers did not get an answer. Later, dovecot died because the time finally synced, and I found out why pretty quick (have seen this before) but this was an unusual situation. My point is, we had an unusual circumstance, and even though I've taken steps to have my mail servers sync their time at boot and run ntpd afterwards, there are some circumstances in which this is not enough, and dovecot still died. Its not always because someone was lazy about their time setup. But it doesn't cause me "big availability problems" since in general, my time is fine.
Re: [Dovecot] Time moved backwards
At 12:48 PM +0400 5/13/08, Eugene wrote: Hello, From: "Quentin Garnier" <[EMAIL PROTECTED]> I'm not the one having trouble reading, here. The proper way to start a system is to run ntp*date* (as early as possible) and then ntpd. That's what you say, and it is far from being officially accepted. There is nothing 'official' in regards to how you start up your system that should carry more influence than having it start up correctly. NTP project clearly deprecates ntpdate for several reasons. I think that is an incorrect statement, and to the degree that it is correct, it still does not mean that it is reasonable to have an unstable clock. The practice of running ntpdate as a cron job rather than a ntp daemon certainly is deprecated and always has been, but even that is not so much because of how well it works (on most systems it is perfectly functional) but rather because it does not scale across the net: the public NTP infrastructure gets what Dr. Mills calls "little fireballs of congestion" as the cron jobs fire, and that has bad impacts on everyone's NTP-based time accuracy. On the other hand, ntpdate run once at boot still serves a purpose, even if you are running the latest ntpd with the initial stepping functionality rolled in and enabled, and it is helpful to note Per Hedeland's succinct and practical counterpoint in the FAQ immediately following Dr. Mills' long explanation of the various edge cases involved in choosing when to step and when not to. In addition, "the clock should not be stepped until a consistent offset has been observed for a sanity interval, currently 15 minutes". So ntpd may in principle step time again. http://www.ntp.org/ntpfaq/NTP-s-config.htm You are misreading and/or misrepresenting the context of the piece you quote. 
A good NTP daemon is properly quite conservative about stepping the clock at times *other than* at boot, and should take precautions against trusting clocks that seem wrong once it has what should be a trustworthy synchronization and characterization of its own local clock. -- Bill Cole [EMAIL PROTECTED]
Re: [Dovecot] Time moved backwards
At 3:58 PM +0200 5/13/08, Andraž 'ruskie' Levstik wrote:

On 15:48:42 2008-05-13 Bill Cole <[EMAIL PROTECTED]> wrote:

At 11:31 AM +0400 5/13/08, Eugene wrote:
>Hi Timo,
>
>From: "Timo Sirainen" <[EMAIL PROTECTED]>
>>>I suggest that Dovecot simply terminate the current connections
>>>(causing the client to reconnect) or -- if the time change is really
>>>that much of a problem -- to restart itself automatically.
>
>>I guess terminating all current connections and restarting all
>>processes would be pretty safe, but it's not really a high priority
>>change for me..
>
>Nevertheless, it would be very nice if you could fix it. It's a
>fairly big availability problem (for us, at least).

Then you have a badly broken system. There is no explanation for time going backwards on a server on a frequent unplanned basis that is not reducible to administrative incompetence or malfunctioning hardware (and the latter as a chronic issue can be seen as just a special case of the former.)

Harsh...

I think it is not so harsh if you read what I wrote carefully. Part of what I meant to convey was that the real circumstances of a clock jumping backwards ought to be rare and predictable, such as a long period unpowered, either long enough to drain the clock battery or just long enough for the system clock to drift more than 1/8 of a second. If your system clock doesn't stay pretty close across a regular reboot, you have a hardware problem (most likely a dead clock battery...)

I should probably also note that I did not use 'incompetence' as a generic term and it does not mean 'stupid' or 'bad' or anything else more general, vague, and pejorative.

-- 
Bill Cole [EMAIL PROTECTED]
Re: [Dovecot] Time moved backwards
>>> Nevertheless, it would be very nice if you could fix it. It's a >>> fairly big availability problem (for us, at least). >> Then you have a badly broken system. There is no explanation for time >> going backwards on a server on a frequent unplanned basis that is not >> reducible to administrative incompetence or malfunctioning hardware >> (and the latter as a chronic issue can be seen as just a special case >> of the former.) > Harsh... Maybe... but it is still true... -- Best regards, Charles
Re: [Dovecot] Time moved backwards
On 15:48:42 2008-05-13 Bill Cole <[EMAIL PROTECTED]> wrote: > At 11:31 AM +0400 5/13/08, Eugene wrote: > >Hi Timo, > > > >From: "Timo Sirainen" <[EMAIL PROTECTED]> > >>>I suggest that Dovecot simply terminate the current connections > >>>(causing the client to reconnect) or -- if the time change is really > >>>that much of a problem -- to restart itself automatically. > > > >>I guess terminating all current connections and restarting all > >>processes would be pretty safe, but it's not really a high priority > >>change for me.. > > > >Nevertheless, it would be very nice if you could fix it. It's a > >fairly big availability problem (for us, at least). > > Then you have a badly broken system. There is no explanation for time > going backwards on a server on a frequent unplanned basis that is not > reducible to administrative incompetence or malfunctioning hardware > (and the latter as a chronic issue can be seen as just a special case > of the former.) > Harsh... > >And after all, if we are terminating already, adding a simple spawn > >call before that should not take much time? > > A system clock that moves backwards is indicative of a problem. > Having a service respawn itself as a response to a problem that is > outside of its control (i.e. the respawn is not itself a fix) is > begging for trouble, because that behavior has to be carefully > controlled to prevent it from contributing to a cascading problem. On > a system whose clock is untrustworthy, this is a significant > challenge. The effort to do that sort of code correctly just to > accommodate > people with broken systems seems like a terrible waste. > > > On the other hand, writing a freestanding watchdog for a critical > service is (or at least should be) something any good sysadmin can > do. 
If you are stuck with hardware so broken that it jumps backwards > in time without warning but not so broken that you can get it > replaced, and it lives in a network or resource environment that > prevents you from fixing the core problem, you can adapt to the > breakage yourself.

I use monit for monitoring services. So far it has worked great: auto restarts etc., depending on configuration, and with the (optional) web-based interface one can see a quick overview and control things.

-- 
Andraž "ruskie" Levstik
Source Mage GNU/Linux Games grimoire guru
Geek/Hacker/Tinker
Be sure brain is in gear before engaging mouth.
Ryle hira.
Key id = F4C1F89C
Key fingerprint = 6FF2 8F20 4C9D DB36 B5B6 F134 884D 72CC F4C1 F89C
Re: [Dovecot] Time moved backwards
At 11:13 AM +0400 5/13/08, Eugene wrote: Hello, I would like to suggest a change in handling of 'Time moved backwards' problem. Right now dovecot just dies. So, the scenario: 1) Colocation server is shut down for some reason. The internal time drifts. 2) Server is started again. 3) Dovecot starts successfully. 4) In about a minute, NTP daemon feels confident about adjusting the system time. That's broken. Either your startup is running in the wrong order, it is missing a step, or your NTP daemon is misconfigured. This sort of problem is why some OS's default startup procedure is intentionally designed to block on 'ntpdate' running successfully. You are likely to be better off with a system that is obviously not working than one which started and then was subjected to a backwards clock change, which can harm more than Dovecot. -- Bill Cole [EMAIL PROTECTED]
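The "block startup on a successful time sync" idea can be sketched as a small gate that runs before time-sensitive daemons are launched. This is an illustration only: `wait_for_time_sync` is a hypothetical helper, and it assumes the sync command exits with status 0 on success and non-zero on failure.

```python
import subprocess
import time


def wait_for_time_sync(sync_command, attempts=5, delay_seconds=10.0,
                       run=subprocess.call, sleep=time.sleep):
    """Run sync_command until it exits 0.

    Returns True once the clock has been set, or False if every attempt
    failed, in which case the caller should not start the daemons.
    """
    for attempt in range(1, attempts + 1):
        if run(sync_command) == 0:
            return True
        if attempt < attempts:
            sleep(delay_seconds)  # give the network or NTP server time to come up
    return False
```

A boot script might call it as `wait_for_time_sync(["ntpdate", "-u", "ntp.example.org"])`, where the server name is a placeholder, and refuse to start mail services when it returns False. That matches the point above that an obviously stopped system is preferable to one that starts and then has its clock stepped backwards.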
Re: [Dovecot] Time moved backwards
At 11:31 AM +0400 5/13/08, Eugene wrote: Hi Timo, From: "Timo Sirainen" <[EMAIL PROTECTED]> I suggest that Dovecot simply terminate the current connections (causing the client to reconnect) or -- if the time change is really that much of a problem -- to restart itself automatically. I guess terminating all current connections and restarting all processes would be pretty safe, but it's not really a high priority change for me.. Nevertheless, it would be very nice if you could fix it. It's a fairly big availability problem (for us, at least). Then you have a badly broken system. There is no explanation for time going backwards on a server on a frequent unplanned basis that is not reducible to administrative incompetence or malfunctioning hardware (and the latter as a chronic issue can be seen as just a special case of the former.) And after all, if we are terminating already, adding a simple spawn call before that should not take much time? A system clock that moves backwards is indicative of a problem. Having a service respawn itself as a response to a problem that is outside of its control (i.e. the respawn is not itself a fix) is begging for trouble, because that behavior has to be carefully controlled to prevent it from contributing to a cascading problem. On a system whose clock is untrustworthy, this is a significant challenge. The effort to do that sort of code correctly just to accommodate people with broken systems seems like a terrible waste. On the other hand, writing a freestanding watchdog for a critical service is (or at least should be) something any good sysadmin can do. If you are stuck with hardware so broken that it jumps backwards in time without warning but not so broken that you can get it replaced, and it lives in a network or resource environment that prevents you from fixing the core problem, you can adapt to the breakage yourself. -- Bill Cole [EMAIL PROTECTED]
Re: [Dovecot] Time moved backwards
On 5/13/2008, Eugene ([EMAIL PROTECTED]) wrote: >> I guess terminating all current connections and restarting all processes >> would be pretty safe, but it's not really a high priority change for >> me.. > Nevertheless, it would be very nice if you could fix it. It's a > fairly big availability problem (for us, at least). The problem is not so much how dovecot deals with this issue, the problem is, why is your server having such drastic problems keeping its time sane? Fix that, and your problem disappears. -- Best regards, Charles
Re: [Dovecot] Time moved backwards
On 10:48:28 2008-05-13 "Eugene" <[EMAIL PROTECTED]> wrote: > Hello, > > From: "Quentin Garnier" <[EMAIL PROTECTED]> > > I'm not the one having trouble reading, here. The proper way to > > start a system is to run ntp*date* (as early as possible) and then > > ntpd. > > That's what you say, and it is far from being officially accepted. > NTP project clearly deprecates ntpdate for several reasons. In > addition, "the clock should not be stepped until a consistent offset > has been observed for a sanity interval, currently 15 minutes". So ntpd > may in principle step time again. > http://www.ntp.org/ntpfaq/NTP-s-config.htm > > Eugene Maybe you should use openntpd... that's what works fine for me... -- Andraž "ruskie" Levstik Source Mage GNU/Linux Games grimoire guru Geek/Hacker/Tinker Be sure brain is in gear before engaging mouth. Ryle hira. Key id = F4C1F89C Key fingerprint = 6FF2 8F20 4C9D DB36 B5B6 F134 884D 72CC F4C1 F89C
Re: [Dovecot] Time moved backwards
On Tue, 2008-05-13 at 12:51 +0400, Anton Yuzhaninov wrote:
> Timo Sirainen wrote:
> > On Tue, 2008-05-13 at 11:13 +0400, Eugene wrote:
> >
> >> I suggest that Dovecot simply terminate the current connections (causing the
> >> client to reconnect) or -- if the time change is really that much of a
> >> problem -- to restart itself automatically.
> >
> > I guess terminating all current connections and restarting all processes
> > would be pretty safe, but it's not really a high priority change for
> > me..
>
> IMHO more robust is to use clock_gettime(CLOCK_MONOTONIC, ..) for timeouts
> and just work fine even if time was changed via settimeofday().

Two problems with that:

1) clock_gettime() doesn't work everywhere (e.g. OSX).

2) With the current design gettimeofday() has to be called anyway to get the current real timestamp, causing extra unnecessary work.

Anyway that's not the main problem. I'm more concerned about timestamp comparison code where one or both of the timestamps come from the filesystem. It might cause some corruption, such as a dotlock being deleted while another process still holds it.
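For readers unfamiliar with the monotonic-clock suggestion, here is a minimal sketch of a timeout that is immune to wall-clock steps. It uses Python's `time.monotonic()` as a stand-in for clock_gettime(CLOCK_MONOTONIC); the `Deadline` class is hypothetical and is not taken from Dovecot.

```python
import time


class Deadline:
    """A timeout measured on a monotonic clock, unaffected by wall-clock steps."""

    def __init__(self, timeout_seconds, clock=time.monotonic):
        self._clock = clock
        self._expires_at = clock() + timeout_seconds

    def remaining(self):
        """Seconds left before expiry, never negative."""
        return max(0.0, self._expires_at - self._clock())

    def expired(self):
        return self._clock() >= self._expires_at
```

This addresses only the timeout half of the problem. As noted above, comparisons against filesystem timestamps such as dotlock mtimes still depend on the real clock, so a monotonic deadline does nothing for them.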
Re: [Dovecot] Time moved backwards
Timo Sirainen wrote:
> On Tue, 2008-05-13 at 11:13 +0400, Eugene wrote:
>> I suggest that Dovecot simply terminate the current connections (causing
>> the client to reconnect) or -- if the time change is really that much of
>> a problem -- to restart itself automatically.
>
> I guess terminating all current connections and restarting all processes
> would be pretty safe, but it's not really a high priority change for
> me..

IMHO more robust is to use clock_gettime(CLOCK_MONOTONIC, ..) for timeouts and just work fine even if time was changed via settimeofday().

--
WBR, Anton Yuzhaninov
Rambler Mail
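A minimal sketch of the monotonic-timeout idea, assuming a POSIX system. The names timeout_clock_ms, struct deadline, deadline_after_ms and deadline_expired are made up for illustration and are not Dovecot APIs. Where CLOCK_MONOTONIC is missing or fails, the sketch falls back to gettimeofday(), which remains sensitive to clock steps.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/time.h>
#include <time.h>

/* Hypothetical helper: milliseconds from a clock suited to timeouts. */
static int64_t timeout_clock_ms(void)
{
#ifdef CLOCK_MONOTONIC
    struct timespec ts;

    /* Unaffected by settimeofday() or NTP steps. */
    if (clock_gettime(CLOCK_MONOTONIC, &ts) == 0)
        return (int64_t)ts.tv_sec * 1000 + ts.tv_nsec / 1000000;
#endif
    /* Fallback: wall clock, which can still jump in either direction. */
    struct timeval tv;

    gettimeofday(&tv, NULL);
    return (int64_t)tv.tv_sec * 1000 + tv.tv_usec / 1000;
}

/* A timeout expressed as an absolute point on the timeout clock. */
struct deadline {
    int64_t expires_ms;
};

static struct deadline deadline_after_ms(int64_t msecs)
{
    struct deadline d = { timeout_clock_ms() + msecs };
    return d;
}

static bool deadline_expired(const struct deadline *d)
{
    assert(d != NULL);
    return timeout_clock_ms() >= d->expires_ms;
}
```

A caller would create the deadline once, for example deadline_after_ms(30000) for a 30-second idle timeout, and poll deadline_expired() from its event loop. Wall-clock timestamps for logs or message metadata would still need gettimeofday() or time().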
Re: [Dovecot] Time moved backwards
Hello,

From: "Quentin Garnier" <[EMAIL PROTECTED]>
> I'm not the one having trouble reading, here. The proper way to
> start a system is to run ntp*date* (as early as possible) and then
> ntpd.

That's what you say, and it is far from being officially accepted. NTP project clearly deprecates ntpdate for several reasons. In addition, "the clock should not be stepped until a consistent offset has been observed for a sanity interval, currently 15 minutes". So ntpd may in principle step time again.

http://www.ntp.org/ntpfaq/NTP-s-config.htm

Eugene
Re: [Dovecot] Time moved backwards
On Tue, May 13, 2008 at 11:39:54AM +0400, Eugene wrote:
> Hello,
>
> From: "Quentin Garnier" <[EMAIL PROTECTED]>
>>> 4) In about a minute, NTP daemon feels confident about adjusting the
>>> system time.
>> The admin should run ntpdate before launching ntpd and dovecot. ntpd
>> will _never_ move time backwards under normal drifting conditions (it
>> has other ways of coping with that).
>
> Please read carefully. ntpd IS run before dovecot, but a change of time

I'm not the one having trouble reading, here. The proper way to start a system is to run ntp*date* (as early as possible) and then ntpd. NTP documentation has even more details how to do this properly.

http://support.ntp.org/bin/view/Support/StartingNTP4#Section_7.1.1.

--
Quentin Garnier - [EMAIL PROTECTED] - [EMAIL PROTECTED]
"See the look on my face from staying too long in one place
[...] every time the morning breaks I know I'm closer to falling"
KT Tunstall, Saving My Face, Drastic Fantastic, 2007.
Re: [Dovecot] Time moved backwards
Hello,

From: "Quentin Garnier" <[EMAIL PROTECTED]>
>> 4) In about a minute, NTP daemon feels confident about adjusting the
>> system time.
> The admin should run ntpdate before launching ntpd and dovecot. ntpd
> will _never_ move time backwards under normal drifting conditions (it
> has other ways of coping with that).

Please read carefully. ntpd IS run before dovecot, but a change of time happens some time later. Of course, dovecot starting script can be hacked to sleep for some time, but it feels like a wrong way to solve a problem.

Best wishes
Eugene
Re: [Dovecot] Time moved backwards
On Tue, May 13, 2008 at 11:13:39AM +0400, Eugene wrote:
> Hello,
>
> I would like to suggest a change in handling of 'Time moved backwards'
> problem.
> Right now dovecot just dies. So, the scenario:
> 1) Colocation server is shut down for some reason. The internal time drifts.
> 2) Server is started again.
> 3) Dovecot starts successfully.
> 4) In about a minute, NTP daemon feels confident about adjusting the system
> time.
> 5) Dovecot sees the changed time and dies.
> 6) Admin has to notice that, login and restart Dovecot manually.

The admin should run ntpdate before launching ntpd and dovecot. ntpd will _never_ move time backwards under normal drifting conditions (it has other ways of coping with that).

--
Quentin Garnier - [EMAIL PROTECTED] - [EMAIL PROTECTED]
"See the look on my face from staying too long in one place
[...] every time the morning breaks I know I'm closer to falling"
KT Tunstall, Saving My Face, Drastic Fantastic, 2007.
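Step 5 of the scenario quoted above depends on a process noticing that the wall clock went backwards. The thread does not show how Dovecot does this, so the following is only a sketch under the assumption that successive gettimeofday() samples are compared. The names time_moved_backwards and check_wall_clock are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/time.h>

/* True if "now" is strictly earlier than "prev". */
static bool time_moved_backwards(const struct timeval *prev,
                                 const struct timeval *now)
{
    if (now->tv_sec != prev->tv_sec)
        return now->tv_sec < prev->tv_sec;
    return now->tv_usec < prev->tv_usec;
}

/* Meant to be called once per event-loop iteration: sample the wall clock,
 * compare it with the previous sample, then remember the new one. */
static bool check_wall_clock(struct timeval *last)
{
    struct timeval now;
    bool backwards;

    assert(last != NULL);
    gettimeofday(&now, NULL);
    backwards = time_moved_backwards(last, &now);
    *last = now;
    return backwards;
}
```

What to do on a true result is the policy question debated in this thread: exit, drop the current connections, or restart. The sketch takes no position on that.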
Re: [Dovecot] Time moved backwards
Hi Timo,

From: "Timo Sirainen" <[EMAIL PROTECTED]>
>> I suggest that Dovecot simply terminate the current connections (causing
>> the client to reconnect) or -- if the time change is really that much of
>> a problem -- to restart itself automatically.
> I guess terminating all current connections and restarting all processes
> would be pretty safe, but it's not really a high priority change for
> me..

Nevertheless, it would be very nice if you could fix it. It's a fairly big availability problem (for us, at least). And after all, if we are terminating already, adding a simple spawn call before that should not take much time?

Best wishes
Eugene
Re: [Dovecot] Time moved backwards
On 09:23:57 2008-05-13 Timo Sirainen <[EMAIL PROTECTED]> wrote:
> On Tue, 2008-05-13 at 11:13 +0400, Eugene wrote:
>
> > I suggest that Dovecot simply terminate the current connections
> > (causing the client to reconnect) or -- if the time change is really
> > that much of a problem -- to restart itself automatically.
>
> I guess terminating all current connections and restarting all processes
> would be pretty safe, but it's not really a high priority change for
> me..
>
> > Maybe a config option could be introduced.
>
> There are too many settings already.

Or simply launch ntpd with the -s or whatever the appropriate switch to adjust time on bootup and ensure it starts pre-dovecot...

--
Andraž "ruskie" Levstik
Source Mage GNU/Linux Games grimoire guru
Geek/Hacker/Tinker
Be sure brain is in gear before engaging mouth.
Ryle hira.
Key id = F4C1F89C
Key fingerprint = 6FF2 8F20 4C9D DB36 B5B6 F134 884D 72CC F4C1 F89C
Re: [Dovecot] Time moved backwards
On Tue, 2008-05-13 at 11:13 +0400, Eugene wrote:
> I suggest that Dovecot simply terminate the current connections (causing the
> client to reconnect) or -- if the time change is really that much of a
> problem -- to restart itself automatically.

I guess terminating all current connections and restarting all processes would be pretty safe, but it's not really a high priority change for me..

> Maybe a config option could be introduced.

There are too many settings already.