On 12/10/2009 11:37 AM, Victor Duchovni wrote:
> On Thu, Dec 10, 2009 at 11:12:36AM -0800, Emmett Culley wrote:
> 
>>>> It seems like understanding where the "delay=86457"
>>>> and "delays=86457/0/0.36/0.18" come from would probably help me to
>>>> understand the 24 delay.
>>>
>>> Not really. The message took 1 day to enter the active queue, not
>>> surprising, since pickup seems to have it a day late. Did your system
>>> clock get changed (by a day or so) while Postfix was running?
>>
>> So, where does the delay=86457 come from?  It is clear to me that the
>> cron job's email is getting stuck on the originating machine for 24 hours.
> 
> It comes from the message sitting in the maildrop queue for 24 hours,
> or from a 24 hour delay between the invocation of sendmail(1) and EOF on
> the message body. If, for example, the process that has the write-side of
> the pipe to sendmail(1) keeps it open for 24 hours, the message will be
> delayed by 24 hours.
> 
>> I cannot let go of the fact that the log shows a delay=86457 (which is
>> a few second more that 24 hours) isn't related to the late delivery.
> 
> It is of course related, but does not answer your underlying question.
> 
>> Where does the delay=86457 come from?  Who sets it?  What does it
>> represent?  Where is the messages stored for the missing 24 hours?
> 
> The pre-active delay is measured from the time message is created until
> it enters the active queue (in this case for the first time). So the
> message was either:
> 
>       - Not yet fully formed (delayed EOF in the
>               mail(1)->sendmail(1)->postdrop(1)) process chain)
> 
>       - Fully formed, but unprocessed in the maildrop queue for 24 hours.
> 
>       - Fully formed, but unprocessed in the incoming queue for 24 hours.
> 
> The last can be ruled out, because the pickup(8) and cleanup(8)
> log entries do not predate the qmgr(8) log entries by 24 hours.
> 
> So, it is one of the first two. None of this requires in-depth
> understanding of the "delay" or "delays" fields, because the same evidence
> is clear from the logs and headers. The "delays=1day" value just corroborates
> the fact that the message went nowhere for 24 hours.
> 
> Do try to rule out the possibility that the code creating the message
> is directly or via a child-process keeping the write-side pipe open
> for 24 hours or so.
> 
> If this happens every day, just list the maildrop queue a few hours
> before the expected departure time of the message:
> 
>       # ls -l /var/spool/postfix/maildrop
> 
> and report your findings. If you find any files there with a timestamp
> nearly a day old, also run:
> 
>       # fuser /var/spool/postfix/maildrop/<filename>
> 
> and report the "ps -p<pid>" output for that process id (any parent
> processes also if relevant).
> 
Thanks to the hints above I looked into the code that I was calling from the 
cron job.  I found that one of the scripts that was being called never returned 
until the next night when it was executed again.  Hence the 24 hour delay.

I fixed that script and now the email from cron arrives when it should.

Thanks for your comments.

Reply via email to