> But you've used 'logger -t ntpdate' - this is can fail again and logs can be empty again.

What do you mean by 'fail again'? Piping to logger uses standard blocking I/O: logger reads everything it is given, so it gets all the output strace produces. If ntpdate hangs for some reason, we should see it in the strace output. If ntpdate exits, we will see that too.
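To illustrate the blocking-pipe point, here is a minimal sketch in Python. It is not the actual deployment command: the child process is only a stand-in for strace, the reading loop is a stand-in for logger, and the function name `pipe_through_reader` is made up for this example. It writes more data than a typical pipe buffer holds, so the writer has to block until the reader catches up, and nothing is dropped on the way.

```python
import subprocess
import sys


def pipe_through_reader(n_lines):
    """Run a child that prints n_lines lines and read them all via a pipe.

    The child is a hypothetical stand-in for the strace side of the pipe;
    this loop is a stand-in for the logger side.
    """
    producer = subprocess.Popen(
        [sys.executable, "-c",
         f"for i in range({n_lines}): print('line', i)"],
        stdout=subprocess.PIPE,
        text=True,
    )
    received = []
    # Blocking read: the loop ends only when the writer closes its end
    # (EOF), that is, when the producer exits.
    for line in producer.stdout:
        received.append(line.rstrip("\n"))
    producer.wait()
    return received


if __name__ == "__main__":
    lines = pipe_through_reader(20000)
    print(len(lines), "lines received; last one:", lines[-1])
```

This only covers the pipe itself. It says nothing about what happens to the messages after the reader has them.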

On Wed, Jan 27, 2016 at 12:57 PM, Maksim Malchuk <mmalc...@mirantis.com> wrote:

> But you've used 'logger -t ntpdate' - this is can fail again and logs can
> be empty again.
> My opinion we should use output redirection to the log-file directly.
>
> On Wed, Jan 27, 2016 at 11:21 AM, Stanislaw Bogatkin
> <sbogat...@mirantis.com> wrote:
>
>> Yes, I have created custom iso with debug output. It didn't help, so
>> another one with strace was created.
>>
>> On Jan 27, 2016 00:56, "Alex Schultz" <aschu...@mirantis.com> wrote:
>>
>>> On Tue, Jan 26, 2016 at 2:16 PM, Stanislaw Bogatkin
>>> <sbogat...@mirantis.com> wrote:
>>> > When there is too high strata, ntpdate can understand this and always
>>> > write this into its log. In our case there are just no log - ntpdate
>>> > send first packet, get an answer - that's all. So, fudging won't save
>>> > us, as I think.
>>> > Also, it's a really bad approach to fudge a server which doesn't have
>>> > a real clock onboard.
>>>
>>> Do you have a debug output of the ntpdate somewhere? I'm not finding
>>> it in the bugs or in some of the snapshots for the failures. I did
>>> find one snapshot with the -v change that didn't have any response
>>> information so maybe it's the other problem where there is some
>>> network connectivity isn't working correctly or the responses are
>>> getting dropped somewhere?
>>>
>>> -Alex
>>>
>>> > On Tue, Jan 26, 2016 at 10:41 PM, Alex Schultz <aschu...@mirantis.com>
>>> > wrote:
>>> >>
>>> >> On Tue, Jan 26, 2016 at 11:42 AM, Stanislaw Bogatkin
>>> >> <sbogat...@mirantis.com> wrote:
>>> >> > Hi guys,
>>> >> >
>>> >> > for some time we have a bug [0] with ntpdate. It doesn't reproduced
>>> >> > 100% of time, but breaks our BVT and swarm tests. There is no exact
>>> >> > point where problem root located. To better understand this, some
>>> >> > verbosity to ntpdate output was added but in logs we can see only
>>> >> > that packet exchange between ntpdate and server was started and was
>>> >> > never completed.
>>> >>
>>> >> So when I've hit this in my local environments there is usually one or
>>> >> two possible causes for this. 1) lack of network connectivity so ntp
>>> >> server never responds or 2) the stratum is too high. My assumption is
>>> >> that we're running into #2 because of our revert-resume in testing.
>>> >> When we resume, the ntp server on the master may take a while to
>>> >> become stable. This sync in the deployment uses the fuel master for
>>> >> synchronization so if the stratum is too high, it will fail with this
>>> >> lovely useless error. My assumption on what is happening is that
>>> >> because we aren't using a set of internal ntp servers but rather
>>> >> relying on the standard ntp.org pools. So when the master is being
>>> >> resumed it's struggling to find a good enough set of servers so it
>>> >> takes a while to sync. This then causes these deployment tasks to fail
>>> >> because the master has not yet stabilized (might also be geolocation
>>> >> related). We could either address this by fudging the stratum on the
>>> >> master server in the configs or possibly introducing our own more
>>> >> stable local ntp servers. I have a feeling fudging the stratum might
>>> >> be better when we only use the master in our ntp configuration.
>>> >>
>>> >> > As this bug is blocker, I propose to merge [1] to better
>>> >> > understanding what's going on. I created custom ISO with this
>>> >> > patchset and tried to run about 10 BVT tests on this ISO.
>>> >> > Absolutely with no luck. So, if we will merge this, we would catch
>>> >> > the problem much faster and understand root cause.
>>> >>
>>> >> I think we should merge the increased logging patch anyway because
>>> >> it'll be useful in troubleshooting but we also might want to look into
>>> >> getting an ntp peers list added into the snapshot.
>>> >>
>>> >> > I appreciate your answers, folks.
>>> >> >
>>> >> > [0] https://bugs.launchpad.net/fuel/+bug/1533082
>>> >> > [1] https://review.openstack.org/#/c/271219/
>>> >> > --
>>> >> > with best regards,
>>> >> > Stan.
>>> >>
>>> >> Thanks,
>>> >> -Alex
>>> >
>>> > --
>>> > with best regards,
>>> > Stan.
>
> --
> Best Regards,
> Maksim Malchuk,
> Senior DevOps Engineer,
> MOS: Product Engineering,
> Mirantis, Inc
> <vgor...@mirantis.com>

--
with best regards,
Stan.

__________________________________________________________________________ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev