I need to clarify this a bit further after doing more research and getting confirmation of the issue.

The problem wasn't a corrupted metabase, but instead a limitation in MS SMTP that was exposed by using VamSoft's ORF, which plugs into MS SMTP.  I use this software mostly for address validation, an absolute requirement when you are gatewaying through IMail for customers because of the widespread dictionary attacks and backscatter.  My ORF/MS SMTP gateways relay to my IMail/Declude server for full processing.

The exact source of the problem was a limit on threads in MS SMTP, which is something like 20 per processor (I'm not sure whether hyperthreading counts in this case).  I had ORF configured to tarpit senders of invalid addresses for 60 seconds, which it did by delaying the 5xx error code by 60 seconds.  This contributed greatly to my hitting this wall.  When the wall was hit, MS SMTP would start dropping connections, and because this pool is also shared in IIS with the FTP service, it made FTP almost non-responsive.  Removing the tarpitting allows the server much more headroom before it hits this threshold.
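
A back-of-the-envelope sketch of why the tarpit reaches that wall so quickly. It assumes the roughly 20 threads per processor mentioned above, a single processor, and that each tarpitted connection holds one thread for the full delay. These are approximations rather than documented figures, and the function name is made up for illustration.

```python
def saturating_reject_rate(threads_per_cpu, cpus, hold_seconds):
    """Rejected connections per second that keep every thread busy.

    By Little's law, busy threads = arrival rate * hold time, so the pool
    is full once arrival rate >= pool size / hold time.
    """
    pool_size = threads_per_cpu * cpus
    return pool_size / hold_seconds


# Assumed figures: ~20 threads per processor (approximate, per the text),
# one processor, and a 60-second tarpit on each invalid recipient.
with_tarpit = saturating_reject_rate(20, 1, 60)
# Hypothetical comparison: the same rejection answered in about 1 second.
without_tarpit = saturating_reject_rate(20, 1, 1)

print(f"with 60s tarpit: {with_tarpit:.2f} rejects/sec fills the pool")
print(f"without tarpit:  {without_tarpit:.2f} rejects/sec fills the pool")
```

Under those assumptions, about one tarpitted connection every three seconds is enough to occupy all 20 threads, a rate that a dictionary attack can easily exceed.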

While simple address validation and minimal blacklisting/filtering can apparently scale up to at least 1 million messages per day with ORF and MS SMTP in a light configuration, this limitation in the MS SMTP architecture makes it inappropriate for any full-scale spam blocking solution such as Declude, unless you have that application ride behind MS SMTP instead of being plugged into it.  That's probably bad news for Declude if they wanted to create an MS SMTP plug-in, though it appears that it could be worked around by avoiding the sink interface except for a select few tests such as address validation and before-arrival blacklisting (rejecting spam during the SMTP envelope based on simple tests).  Virus scanning, external filters and custom filters would eat up the limited threads too fast to be run within this framework, so they would need to create a queuing mechanism similar to what they have with IMail in order to avoid it.
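
The split described above could look roughly like the following Python sketch: only a cheap envelope check runs on the connection-handling thread, and anything slow is handed to a queue that a separate worker drains. This is not the MS SMTP sink interface or anything Declude actually does. Every name here (`on_rcpt`, `accept_message`, `heavy_scan`, the address list and the keyword test) is a hypothetical stand-in.

```python
import queue
import threading

VALID_RECIPIENTS = {"user@example.com"}  # hypothetical address list


def heavy_scan(message):
    """Stand-in for virus scanning or external filters (slow work)."""
    return "spam" if "viagra" in message["body"].lower() else "clean"


def on_rcpt(recipient):
    """Cheap envelope-time check: the only work done on the SMTP thread."""
    if recipient.lower() in VALID_RECIPIENTS:
        return "250 OK"
    return "550 No such user"


def accept_message(message, work_queue):
    """Accept quickly and hand the message off instead of scanning inline."""
    work_queue.put(message)
    return "250 Queued"


def worker(work_queue, results):
    """Drain the queue on a separate thread, outside the SMTP thread pool."""
    while True:
        message = work_queue.get()
        if message is None:  # sentinel: stop the worker
            break
        results.append((message["id"], heavy_scan(message)))


work_queue = queue.Queue()
results = []
scanner = threading.Thread(target=worker, args=(work_queue, results))
scanner.start()

rcpt_reply = on_rcpt("nobody@example.com")
accept_message({"id": 1, "body": "Quarterly report attached"}, work_queue)
accept_message({"id": 2, "body": "Cheap VIAGRA here"}, work_queue)
work_queue.put(None)
scanner.join()

print(rcpt_reply)
print(results)
```

The point of the design is that the time spent in `heavy_scan` never ties up a connection thread, so the limited pool is only held for the duration of the cheap checks.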

One other note.  I had previously used the Windows registry hack to enable a native tarpitting feature in MS SMTP, with even worse effects.  The built-in tarpitting in MS SMTP will delay all 5xx error codes by the time that you set, and this includes the 552 error used when messages are over the maximum size.  The result of this terrible oversight is that a fair number of servers will not recognize the delayed 552 error and will requeue the oversized messages over and over again, and that eats up a lot of bandwidth real, real quick.  ORF only delayed its own 5xx responses generated by its tests, so it was better to use until the thread limit was reached.
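
The difference between the two tarpitting behaviors can be summarized in a small Python sketch. It only models the behavior as described above and is not code from MS SMTP or ORF. The function name, the mode names and the `from_filter_test` flag are invented for illustration.

```python
def tarpit_delay(reply_code, from_filter_test, mode, delay_seconds=60):
    """Seconds to wait before sending an SMTP reply.

    mode "all_5xx": delay every 5xx reply (built-in behavior, as described).
    mode "filter_only": delay only 5xx replies raised by the filter's own
    tests (ORF behavior, as described).
    """
    is_5xx = 500 <= reply_code <= 599
    if not is_5xx:
        return 0
    if mode == "all_5xx":
        return delay_seconds
    if mode == "filter_only":
        return delay_seconds if from_filter_test else 0
    raise ValueError(f"unknown mode: {mode}")


# 552 (message over the size limit) comes from the server, not a filter test.
print(tarpit_delay(552, False, "all_5xx"))
print(tarpit_delay(552, False, "filter_only"))
# A 550 raised by an address-validation test is delayed in both modes.
print(tarpit_delay(550, True, "filter_only"))
```

In the "all_5xx" mode the oversized-message rejection is held back along with everything else, which is the case that led to the repeated requeuing.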

Matt



Panda Consulting S.A. Luis Alberto Arango wrote:
Thanks a lot for the follow up and answer to your own post. It may help us
in the future. You are very kind.

I am glad you were able to solve the problem. regards

Luis Arango
 

  
-----Original Message-----
From: [EMAIL PROTECTED] 
[mailto:[EMAIL PROTECTED]] On Behalf Of Matt
Sent: Wednesday, January 04, 2006 08:37 a.m.
To: Declude.JunkMail@declude.com
Subject: Re: [Declude.JunkMail] OT: Issues with Windows 2003 
FTP service

Good morning all,

I figured that I might save those that might respond some 
time...I found and fixed the issue.

Turns out that the MS SMTP part of the metabase was still 
corrupt in some way...not sure exactly how...and this was 
causing FTP of all things to behave very, very slowly (while 
MS SMTP was operating normally).  
After a lot of playing around with things I figured out that 
it was the MS SMTP segment of the metabase that when enabled 
as it was originally would cause FTP to drag, and I also 
found that stopping the MS SMTP service would cause FTP to 
return to normal.  Why???  Who really knows, but when my 
metabase was corrupted, it was a corruption in the MS SMTP 
portion of the file and somehow it is still bad (I'm thinking 
that my backup copy that I restored had the error that 
eventually caused the corruption).

Thanks,

Matt



Matt wrote:

    
I'm at my wits' end with this and I figured that I would put a feeler
out here to see if anyone has a clue as to what the source of my issue
might be.

My MSFTPSVC on one server suddenly has slowed to a crawl, i.e. 15 to
60 seconds from issuing a command to receiving a response.  This even
happens with the FTP client on the same server going to 127.0.0.1.  I
have also tested by installing a third-party FTP server on the same
box and that worked fine.  There is nothing else that is remarkable
going on with that server, and I am unsure as to what precipitated the
issue, though one possibility is the last MS security rollout that
caused my metabase to become corrupted following the reboot back on
12/22.  I fixed that with a copy from a backup and all seemed normal.
    
The corrupted metabase showed a block of random characters in the 
middle of the XML file, and it occurred in the SMTP segment.  The 
current working metabase looks just fine, but I'm thinking that 
whatever caused the corruption might have also corrupted some other 
stuff that is affecting FTP.  The release notes on those patches 
didn't suggest anything related to the FTP service or TCP/IP.

I have tried many different things from uninstalling and reinstalling
the FTP service, removing the last two MS patches (and reinstalling
them), and a host of smaller tasks.  I have run a rootkit detector and
I have real-time virus protection on the server, but that was just to
eliminate the very small possibility as the server is well firewalled,
completely patched, has only one regular RD user (myself), unnecessary
services are disabled, and I even stay away from often exploited
software such as Perl and PHP.  There is nothing else abnormal on the
server that would suggest a bug or otherwise.  Curiously this isn't
affecting the Web server or SMTP services that are also part of IIS
along with FTP.

One clue to the problem is that when I reset my router, FTP works at
full speed for maybe up to a minute.  Although this makes no sense in
the purest sense, the same thing happens when using a client on the
same box FTPing to 127.0.0.1...the FTP will work at normal speed for a
short while when FTPing to 127.0.0.1 immediately following a router
reload.  I am 99.9% positive that my network has nothing to do with
causing the issue, but this one thing suggests that there is some
interaction with TCP/IP and the FTP service that is contributing to
the issue.  This makes me think that it is a bug with the IIS rate
limiting which requires QOS to be bound to the NIC, and maybe the
router resets are resetting the QOS/rate limiting, allowing it to
operate at full speed until it adjusts back to almost no throughput.
I have rate limiting turned on for both Web and FTP, but this is only
affecting FTP.  I have tried turning off QOS and rebooting, but that
had no effect on the issue, yet the way that rate limiting works, it
seems to explain why a router reload causes things to work well for a
few moments before degrading again.

At this point my next try will probably be to uninstall and reinstall
all of IIS, but I was hoping that maybe someone around here has seen
this or a similar issue, or if there were any ideas about the possible
interaction with QOS and rate limiting gone bad, and how to reinstall
that part of Windows if possible.  I would like to avoid rebuilding
this box, but I won't keep it running in the present state with an
unknown issue even though I could migrate to a third-party FTP server
and avoid the issue.

I would appreciate any glimmers of hope that anyone might have for me
on this :)

Thanks,

Matt
---
[This E-mail was scanned for viruses by Declude EVA www.declude.com]

---
This E-mail came from the Declude.JunkMail mailing list.  To 
unsubscribe, just send an E-mail to [EMAIL PROTECTED], and type 
"unsubscribe Declude.JunkMail".  The archives can be found at 
http://www.mail-archive.com.


      
______
[Email scanned for viruses]
[Email escaneado contra virus]

    



  
