Re: [Synalist] Big stability problems with the https-demo on Linux/FPC

Jason Wed, 01 Sep 2010 06:25:57 -0700

Hi Jarto,


On Wed, Sep 1, 2010 at 1:15 AM, Jarto Tarpio <[email protected]> wrote:

> 1) Are you not creating the child threads as suspended? If you do
> that, you should have all the time in the world to call create and
> any init procedures before finally calling thread.resume
>
>
I do create them suspended, but I have noticed the issue even on the resume
(again, this may be due to the older version of FPC being used, or it may just
be a leftover artifact from an even older version).  Either way, the code
makes sure the create is done before the execute runs.
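To illustrate the "wait in execute until the create has finished" idea, here
is a minimal sketch in C with POSIX threads rather than FPC's TThread. It is
not the code from the server discussed here, and all names (worker_t,
worker_main, run_gated_worker) are made up for the example. The worker blocks
on a condition variable until the creating thread flags that setup is done:

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical per-connection worker state (illustration only). */
typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t  ready_cv;
    bool            ready;   /* set once setup is complete */
    int             config;  /* stands in for fields filled in during setup */
    int             seen;    /* what the worker observed after the gate */
} worker_t;

/* Thread body: do nothing until the creator says setup has finished. */
static void *worker_main(void *arg)
{
    worker_t *w = arg;

    pthread_mutex_lock(&w->lock);
    while (!w->ready)                      /* loop guards against spurious wakeups */
        pthread_cond_wait(&w->ready_cv, &w->lock);
    pthread_mutex_unlock(&w->lock);

    w->seen = w->config;                   /* safe: setup is complete by now */
    return NULL;
}

/* Start a worker, finish setup afterwards, then open the gate.
   Returns the config value the worker saw, or -1 if the thread
   could not be started. */
int run_gated_worker(int config)
{
    worker_t  w;
    pthread_t tid;
    int       result = -1;

    pthread_mutex_init(&w.lock, NULL);
    pthread_cond_init(&w.ready_cv, NULL);
    w.ready  = false;
    w.config = 0;
    w.seen   = -1;

    if (pthread_create(&tid, NULL, worker_main, &w) == 0) {
        /* The thread may already be running here; it is parked at the gate. */
        pthread_mutex_lock(&w.lock);
        w.config = config;                 /* the "rest of create" */
        w.ready  = true;
        pthread_cond_signal(&w.ready_cv);
        pthread_mutex_unlock(&w.lock);

        pthread_join(tid, NULL);
        result = w.seen;
    }

    pthread_cond_destroy(&w.ready_cv);
    pthread_mutex_destroy(&w.lock);
    return result;
}
```

Calling run_gated_worker(42) should hand 42 back, because the worker cannot
read config before the gate opens. The same shape should carry over to a
TThread descendant by using an event or flag that Execute waits on.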


> 2) Thanks for the settings. I'll give that a try if nothing else
> helps.
>

No problem.


>
> 3) Could you give me more info about what signals you are trapping
> (and how?) and what nmap parameters you are using? I think I have
> issues with this too.
>

There are hundreds of ways to build up the nmap attacks (some params can be
run in parallel with others). We have many lines of NMap commands to run all
the tests we can think of and/or have been recommended to us.  NMap is pretty
much a whole study in itself and is just one tool; we have perl scripts,
DoS simulators, etc. that we run as well.

As for signal trapping, that is a little more complicated. I thought I had a
good reference page for signal handling in freepascal but could not find
it.  Doing a Google search for "freepascal signal handling" should give you
a good starting point.  The freepascal documentation for fpSignal has an
example as well.



> 4) When you create the child thread with FPC, you can set the stack
> size. If you don't do that, the stack size is determined
> automatically. That results in unnecessarily huge stacks for every
> thread and you run out of memory with lots of connections. On Kylix
> you can't set the stack size in code. Using ulimit -s works there.
>
>
My understanding from this page (
http://bugs.freepascal.org/view.php?id=13105 ) is that this issue may be
fixed in newer versions, but we set the limits with ulimit before running,
and that will probably not change even if we upgrade.
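For reference, the per-thread stack size from point 4 comes down to a thread
attribute at the pthread level. The sketch below is in C, not FPC, and
spawn_with_stack is a hypothetical helper name. It starts one thread with an
explicit stack size instead of the default, which on Linux/glibc is normally
derived from the "ulimit -s" soft limit:

```c
#include <pthread.h>
#include <stddef.h>

/* Trivial thread body used only for the demonstration. */
static void *noop(void *arg)
{
    return arg;
}

/* Start and join one thread with an explicit stack size in bytes.
   Returns 0 on success, or the pthread error number on failure
   (for example EINVAL if the size is below the system minimum). */
int spawn_with_stack(size_t stack_bytes)
{
    pthread_attr_t attr;
    pthread_t      tid;
    int            rc;

    rc = pthread_attr_init(&attr);
    if (rc != 0)
        return rc;

    rc = pthread_attr_setstacksize(&attr, stack_bytes);
    if (rc == 0)
        rc = pthread_create(&tid, &attr, noop, NULL);

    pthread_attr_destroy(&attr);

    if (rc == 0)
        rc = pthread_join(tid, NULL);
    return rc;
}
```

For example, spawn_with_stack(256 * 1024) asks for 256 KiB per thread. With
hundreds of connection threads, that is a large saving in reserved address
space compared with a default of several megabytes each.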


Jason




> Jason wrote:
> > Our software is Linux only but it is stable for us.
> >
> > We run tests (monthly) that stress test our server, I minimally do 16
> > simultaneous connections (from 2 or more machines but have done tests
> > with up to 128 connections on 6+ machines on 10/100/gigE connections),
> > each sending from 10 to 160000 bytes and expecting a response of 10 to
> > 160000 bytes (data in, is "modified", and sent back); we run this test
> > for 12-48 hrs (our boxes have multiple network cards to handle the
> > throughput). We run these tests because we had issues like you are
> > describing. I do know that before the SSL lock change we had a
> > slightly faster speed, but not enough to make us pause when all our
> > tests started running without issues.
> >
> >
> > Here are five items that I remember having to deal with when we were
> > trying to get the system stable (But note that we are using an older
> > version of FPC, Synapse, and OpenSSL so some, or all, of the following
> > may no longer apply):
> >
> > 1) When there are a lot of threads, we found it appears as if the
> > Execute method gets higher priority than the Create method for the
> > same thread; so keeping the Create method very lean has been paramount
> > for us (I believe we even have logic to "wait" in the execute method
> > until the create has finished).  This may be an artifact of using an
> > older version of FPC however. (We created a manager routine that
> > controls the threads, starting, stopping, reusing, etc to handle all
> > the incoming connection threads so we don't keep creating and freeing
> > threads)
> >
> > 2) I know we compiled our own OpenSSL Library with these options:
> > enable-threads enable-shared --openssldir=/usr/ no-camellia no-capieng
> > no-cms no-gmp no-hw no-jpake no-krb5 no-mdc2 no-montasm no-rc5
> > no-rfc3779 no-seed no-zlib no-zlib-dynamic
> >
> > 3) We do trap some Signals, there is a pipe error that needs to be
> > trapped if bad data comes in (usually see this if we do heavy NMap
> > Ack/syn/etc type hacking/testing (which we also do every 3-4 months)).
> >
> > 4) We also found an issue with Linux limits "ulimit -a", if the "file
> > locks" or "open files" are limited then the max threads was limited to
> > 384 and we needed far more than that...
> >
> > 5) We had to adjust some of the timeout values for the networking in
> > the kernel (I believe we do this on bootup in the /sys directory) - I
> > think the concern was running out of Ports if the client didn't close
> > the connection properly.
> >
> >
> > Hope something helps, it's been a while since I had to make this
> > stable. In any case, this system has been stable and tested under (what
> > I consider) extreme loads for over a year now.
> >
> >
> > Jason
> >
> >
> > On Tue, Aug 31, 2010 at 7:43 AM, Jarto Tarpio <[email protected]>
> > wrote:
> >
> > > About the OpenSSL versions. Debian and Ubuntu have 0.9.8g. Suse has
> > > 0.9.8h.
> > >
> > > The locking method you mention below results in you using only
> > > one huge lock for all the openssl-methods. I tried the same and it
> > > didn't help. I wonder if the old synapse works better or if your
> > > site has just never been hit really hard?
> > >
> > > I experimented a bit with an extra "BigLock" TCriticalSection:
> > >
> > > If I add the BigLock around any socket operation (create, init of
> > > ssl, read, write, destroy), the server is a lot more stable. I can
> > > easily get hundreds of thousands of page views. However, this slows
> > > down everything pretty badly. I'll let a test like that run over the
> > > night.
> > >
> > > If I add the BigLock only around create, init of ssl and destroy
> > > (not around read and write), I get mixed results. On Debian it
> > > survived for almost an hour but on Suse it crashed in less than a
> > > minute.
> > >
> > > Note: I'm doing some really cruel testing here: Tens of millions of
> > > page views with nonstop traffic from several computers. The Windows
> > > version can take that kind of beating easily for days and weeks. The
> > > Linux version with http can also. The only thing making this
> > > unstable on Linux is https.
> > >
> > > Regards,
> > >
> > > Jarto
> > >
> > >
> >
>
>
>
>
> ------------------------------------------------------------------------------
> This SF.net Dev2Dev email is sponsored by:
>
> Show off your parallel programming skills.
> Enter the Intel(R) Threading Challenge 2010.
> http://p.sf.net/sfu/intel-thread-sfd
> _______________________________________________
> synalist-public mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/synalist-public
>

