Re: [lopsa-tech] Any SCO MMDF gurus on the list?

John BORIS Sat, 10 Jan 2009 19:50:09 -0800

Phil,
Thanks. No, we didn't fill any disks. I have about 18 messages hung on
each of the 20 servers. I am about to just delete all of them and
restart things. I did restart the MMDF scripts, but that didn't
help. In the past the main relay server used to stop accepting mail
because of a bad or malformed email. Like you said, it is flaky. Once I
learned this, whenever mail hung on the relay server I would find
the culprit and delete it. This time all 20 servers sort of decided
in unison to be stubborn and stop sending mail. The relay server
is sending email like a champ.

I haven't restarted any of the servers; that is my last resort. I use
email on these servers to handle reports and also incident reporting
(failed and successful backups).

Thanks for the reply. 

>>> Phil Pennock <[email protected]> 01/10/09 2:20 AM >>>
On 2009-01-08 at 20:27 -0500, John  BORIS wrote:
> I have a group of 20 remote servers running SCO Open Server 5.0.6 that
> have all stopped sending mail. This is an internal setup where they send
> mail to a relay server. I can telnet to the smtp port on the relay
> server from each of them so it isn't a connection issue. We did have a
> WAN outage for about an hour just before this started.  I am looking for
> something or some way to kick start the mail.  Any ideas would be greatly
> appreciated.
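
For what it's worth, the telnet check described above can be scripted. A
rough Python sketch, assuming a machine with Python 3 that can reach the
relay; the function names are made up for illustration, and a 220 greeting
only shows that the port answers, not that the local queue is moving:

```python
import socket


def smtp_banner(host, port=25, timeout=10.0):
    """Connect to an SMTP port and return the server's greeting line."""
    with socket.create_connection((host, port), timeout=timeout) as conn:
        return conn.recv(1024).decode("ascii", errors="replace").strip()


def relay_answers(host, port=25, timeout=10.0):
    """True if the host greets with a 220 line, False on any socket error."""
    try:
        return smtp_banner(host, port, timeout).startswith("220")
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False
```

Calling relay_answers() with the relay's hostname from each remote server
would reproduce the manual telnet test.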

$previous_employer (ISP) where I was postmaster used MMDF for incoming
customer mail until I migrated it away.  I used it for long enough to
make it vaguely reliable (Support Dept thought we'd finished the Exim
migration before it started because the complaints stopped, which I took
as a nice compliment).  However, the painful memories are being
self-censored.  Sorry.

Note that this was MMDF from SCO for historical reasons, but not running
on SCO -- I've only ever touched SCO once.

If I recall correctly, MMDF's a queue-based design where mail moves
between different queues as it's processed, somewhat like mailman.  I
remember liking the general design philosophy, but it's not 8-bit clean
and it's very old code.  And it involves too much FS metadata
manipulation to scale as well as some alternatives.  I think that
there's a data-file with the content in one directory and then a
control file which gets hard-linked to shuffle it between the various
stage queues, but I'm no longer sure of that.
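
Purely to illustrate the hard-link idea (this is not MMDF's code, and the
directory names "data", "q.in" and "q.out" plus the control-file contents
are made up for the example), a minimal Python sketch might look like this:

```python
import os
import tempfile


def enqueue(root, msg_id, body, queue):
    """Write the content once, and a control file into the first queue."""
    os.makedirs(os.path.join(root, "data"), exist_ok=True)
    os.makedirs(os.path.join(root, queue), exist_ok=True)
    # The message content is stored once and never moves.
    with open(os.path.join(root, "data", msg_id), "w") as f:
        f.write(body)
    # Hypothetical control file: a real one would carry envelope details.
    with open(os.path.join(root, queue, msg_id), "w") as f:
        f.write("to: someone@example.invalid\n")


def move(root, msg_id, src, dst):
    """Shuffle the control file from one stage queue to another."""
    os.makedirs(os.path.join(root, dst), exist_ok=True)
    # Link into the new queue first, then remove the old name, so the
    # control file is always reachable from at least one queue.
    os.link(os.path.join(root, src, msg_id), os.path.join(root, dst, msg_id))
    os.unlink(os.path.join(root, src, msg_id))


def pending(root, queue):
    """List the message ids currently waiting in a queue directory."""
    path = os.path.join(root, queue)
    return sorted(os.listdir(path)) if os.path.isdir(path) else []


if __name__ == "__main__":
    root = tempfile.mkdtemp()
    enqueue(root, "m1", "hello", "q.in")
    move(root, "m1", "q.in", "q.out")
    print(pending(root, "q.in"), pending(root, "q.out"))
```

Every move costs a link and an unlink on the filesystem, which is the
metadata churn I mean above.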

So, what's probably happened is that the queue-runners for the queue
handling mail to the relay host have gone down; ISTR some flakiness with
queue-runners and the keep-alive scripts.  I think the runners mostly
run independently so the outbound runners can be dead and there's no
meta-daemon to restart them -- it's just the regular start-up scripts.
Again though, I'm no longer sure of my recollection.

Honestly, my first instinct would be to down and up the service using
the regular init scripts to kick all the queue-runners into service and,
if they fail, start looking at logs for diagnostics.  If the volume is
high enough that losing the relay server backed up a lot of mail, did
you fill any disks?

-Phil

_______________________________________________
Tech mailing list
[email protected]
http://lopsa.org/cgi-bin/mailman/listinfo/tech
This list provided by the League of Professional System Administrators
 http://lopsa.org/
