I've been running Kea with LFC for 2 days, simulating DHCP requests at the rate of 1 lease/sec with valid lifetime of 60 seconds. I haven't reproduced the problem so far.
Marcin On 30.11.2017 09:28, Chaigneau, Nicolas wrote: > > > > I've restarted Kea a week ago with debug logs, in the hope of seeing > something when the issue manifests. > > No "luck" so far. Still waiting... > > > > >> The worker thread seems to be running. >> The following command (on the Kea process not executing LFC) shows two >> threads: >> >> ps -T -p <pid of Kea> >> PID SPID TTY TIME CMD >> 28268 28268 ? 00:00:03 kea-dhcp4 >> 28268 28269 ? 00:00:03 kea-dhcp4 >> >> >> The issue also happened on our lab, so I could try to upgrade to 1.3.0. >> >> But I would really prefer to find out the issue with 1.2.0 and fix this >> version. >> We've just deployed 1.2.0 on production, having to upgrade now would be... >> difficult :/ >> >> >> >>> It would be useful to check if the worker thread triggering timers is >>> still running or died. The kea-dhcp4 process should run main thread and >>> the second thread for timers. You could use "ps" or "top" to verify this. >>> >>> In the Kea 1.3.0, we have removed the worker thread and made several >>> significant updates to the timers scheduling. It would be useful it you >>> could verify whether Kea 1.3.0 also has this issue, but if it is >>> production environment, it is not good for experiments. >>> >>> Marcin >>> >>> On 20.11.2017 10:21, Chaigneau, Nicolas wrote: >>>> The Kea server not executing LFC is still running, so if there's something >>>> you want me to check... feel free to ask :) >>>> >>>> >>>> >>>> -----Message d'origine----- >>>> De : Marcin Siodelski [mailto:[email protected]] >>>> >>>> Nicolas, >>>> >>>> Thanks for creating the ticket. We'll have a look and see if we can >>>> reproduce it here. >>>> >>>> Marcin Siodelski >>>> ISC >>>> >>>> On 20.11.2017 09:07, Chaigneau, Nicolas wrote: >>>>> >>>>> Observed a 4th occurrence of the issue. >>>>> >>>>> Server started on 2017, November 15th at 17:20:47 >>>>> Last time LFC was executed: on November 19th at 04:50:47 >>>>> >>>>> I'll open a ticket. >>>>> >>>>> >>>>> Regards, >>>>> Nicolas. >>>>> >>>>>> >>>>>> So this is confirmed: LFC is really not executed anymore at some point, >>>>>> for a reason yet to be determined. >>>>>> >>>>>> I'll have a look at the changes between Kea 0.9.1 and 1.2.0. >>>>>> >>>>>> >>>>>> If you think of anything that could help, I'm all ears :) >>>>>> >>>>>> >>>>>> >>>>>> Regards, >>>>>> Nicolas. >>>>>> >>>>>> >>>>>> >>>>>>> Hello Marcin, >>>>>>> >>>>>>> >>>>>>> Thanks for your answer. >>>>>>> >>>>>>> >>>>>>> I was too hasty to suspect the log rotation, this has nothing to do >>>>>>> with it. Rotation occurs at night between 3 and 4am, and the last log I >>>>>>> have was in the middle of the day. >>>>>>> >>>>>>> Furthermore, I think the issue manifesting after 24H is just a >>>>>>> coincidence. Before that I've had Kea run for several days before >>>>>>> seeing the issue. So it seems random... >>>>>>> >>>>>>> I don't think there's an issue with logging either, because when I >>>>>>> stopped the server after the LFC issue appeared, then I got the normal >>>>>>> shutdown logs. >>>>>>> >>>>>>> I think the LFC is not executed because of the two observations: >>>>>>> - No log from Kea saying that LFC is launched >>>>>>> - Lease files are not modified anymore (and their last modification >>>>>>> date coincides with the last "DHCPSRV_MEMFILE_LFC_START" log from Kea) >>>>>>> >>>>>>> But you're right, I'm not 100% sure LFC is not executed... >>>>>>> >>>>>>> >>>>>>> So I've launched a little script that does loops over a "ps" to try and >>>>>>> catch the LFC process running. >>>>>>> It's not fully reliable because if LFC has nothing to do, it may start >>>>>>> and end so fast that the ps won't see it. >>>>>>> I tried that with LFC scheduled every 10 seconds. The script catches >>>>>>> about 2 LFC executions out of 3. >>>>>>> >>>>>>> So on the live system on which the LFC appears to not run anymore, I'll >>>>>>> let the script run for an hour. In this time frame LFC is normally >>>>>>> executed 6 times. >>>>>>> If I don't see anything, I'll conclude that LFC is indeed not executed. >>>>>>> >>>>>>> >>>>>>> >>>>>>> Regards, >>>>>>> Nicolas. >>>>>>> >>>>>>> -----Message d'origine----- >>>>>>> De : Marcin Siodelski [mailto:[email protected]] >>>>>>> >>>>>>> Nicolas, >>>>>>> >>>>>>> I think we should start from checking if the issue doesn't stem from the >>>>>>> log rotation. If I understand correctly, you're rotating the log file >>>>>>> daily. Per your message below, things go weird after around 24 hours >>>>>>> since server startup. Is it possible to verify that rotating every 12h >>>>>>> would cause the issue to begin after 12h rather than 24h? >>>>>>> >>>>>>> I can't see anything in the Kea code that could prevent periodic >>>>>>> execution of LFC, other than server reconfiguration. You're saying that >>>>>>> LFC is not executed because the lease files aren't modified. But, is >>>>>>> there any other log output present there? Perhaps the LFC is not >>>>>>> executed because of some trouble with logging being a result of log file >>>>>>> rotation? We should see if logging generally works fine after log file >>>>>>> rotation in the first place. Could you enable more verbose logging to >>>>>>> see if any log is produced when the LFC starting message is gone? >>>>>>> >>>>>>> Marcin Siodelski >>>>>>> ISC >>>>>>> >>>>>>> On 17.11.2017 10:41, Chaigneau, Nicolas wrote: >>>>>>>> Hello, >>>>>>>> >>>>>>>> >>>>>>>> I've observed the issue again. >>>>>>>> >>>>>>>> With a Kea server that does not receive any DHCP packet: >>>>>>>> At some point, Kea is still running but LFC is no longer executed. >>>>>>>> >>>>>>>> In Kea log file we don't have "DHCPSRV_MEMFILE_LFC_START" messages >>>>>>>> anymore. >>>>>>>> And the lease files are not modified on disk (they would be if LFC was >>>>>>>> executed, so this is not just a logging issue). >>>>>>>> >>>>>>>> Kea was started on 2017, November 15th at 16:20:23. >>>>>>>> LFC was first executed 10 minutes later, at 16:30:23. >>>>>>>> And the last execution of LFC: November 16th, at 16:40:23. >>>>>>>> Nothing after that... >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> Regards, >>>>>>>> Nicolas. >>>>>>>> >>>>>>>> >>>>>>>>> Wait... when I restarted the server, I got logs again... before the >>>>>>>>> shutdown! >>>>>>>>> So logging actually still worked. I just didn't have anything being >>>>>>>>> logged. >>>>>>>>> >>>>>>>>> 2017-11-15 17:20:45.551 INFO [kea-dhcp4.dhcp4/29754] DHCP4_SHUTDOWN >>>>>>>>> server shutdown >>>>>>>>> 2017-11-15 17:20:47.603 INFO [kea-dhcp4.dhcp4/28141] DHCP4_STARTING >>>>>>>>> Kea DHCPv4 server version 1.2.0 starting >>>>>>>>> >>>>>>>>> >>>>>>>>> The only logs I normally have in Kea logs is LFC execution every 10 >>>>>>>>> minutes: >>>>>>>>> >>>>>>>>> 2017-11-16 02:10:47.617 INFO [kea-dhcp4.dhcpsrv/28141] >>>>>>>>> DHCPSRV_MEMFILE_LFC_START starting Lease File Cleanup >>>>>>>>> 2017-11-16 02:10:47.617 INFO [kea-dhcp4.dhcpsrv/28141] >>>>>>>>> DHCPSRV_MEMFILE_LFC_EXECUTE executing Lease File Cleanup (...) >>>>>>>>> >>>>>>>>> >>>>>>>>> I don't have these logs anymore after some time. >>>>>>>>> So... either LFC is no longer executed after a while, or Kea stops >>>>>>>>> writing these log messages ? >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>>> A few more detail: >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> This worked fine before (with the same logrotate configuration), >>>>>>>>>> with: >>>>>>>>>> - Kea 0.9.1 >>>>>>>>>> - RHEL 6.4 >>>>>>>>>> >>>>>>>>>> The issue now appears with: >>>>>>>>>> - Kea 1.2.0 >>>>>>>>>> - RHEL 7.1 >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> Should I open a ticket for this ? (I'm not sure the problem is with >>>>>>>>>> Kea, maybe it's logrotate...) >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> Hello, >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> I've set up Kea log rotation with logrotate. >>>>>>>>>>> This works for some time, and then it stops working: the new log >>>>>>>>>>> file stays empty, until the server is restarted. >>>>>>>>>>> I'm not sure what's going on... has anyone else had this issue ? >>>>>>>>>>> >>>>>>>>>>> I'm using the following options for logrotate: >>>>>>>>>>> >>>>>>>>>>> daily >>>>>>>>>>> rotate 30 >>>>>>>>>>> missingok >>>>>>>>>>> nocompress >>>>>>>>>>> copytruncate >>>>>>>>>>> dateext >>>>>>>>>>> extension .log >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> I thought "copytruncate" would be good enough, but apparently it is >>>>>>>>>>> not. >>>>>>>>>>> >>>>>>>>>>> Is it possible to send a signal to Kea so that it reopens its log >>>>>>>>>>> file ? >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> Regards, >>>>>>>>>>> Nicolas. > > This message contains information that may be privileged or confidential and > is the property of the Capgemini Group. It is intended only for the person to > whom it is addressed. If you are not the intended recipient, you are not > authorized to read, print, retain, copy, disseminate, distribute, or use this > message or any part thereof. If you receive this message in error, please > notify the sender immediately and delete all copies of this message. > _______________________________________________ Kea-users mailing list [email protected] https://lists.isc.org/mailman/listinfo/kea-users
