On 9/11/06, Stefano Bagnara <[EMAIL PROTECTED]> wrote:
Bernd Fondermann wrote:
> So how do we reproduce this beast? Do we have a "no load" reproduction
> raising OOMs?

So until Noel decides to use a real profiler (with per-instance
allocation time tracking) instead of hprof :-P (which leaves us guessing
where the problem is), or at least provides us the config.xml, I will
stop my tests.

fair enough.

About your comparison between 2.2.0 and 2.3.0: it should be done in a
controlled environment (postage). I don't expect the number of spam
connections to Noel's server these days to be the same as a year
ago, or even a few months ago. My personal server handles on average
almost 5 times the traffic it did 1 year ago, and I receive/send the same
amount of valid emails.

my point was, maybe this is not a newly introduced bug... but
validating this seems to be impossible since the mail server world
changed so heavily, as you suggest.

Furthermore I think that 2.3.0 is already faster than 2.2.0, but I don't
care too much about this issue: imo the real goal is that 2.3.0 should be
more RFC compliant and have fewer bugs (or at least less critical ones) than
2.2.0. If you look at the changelog you will see that 2.2.0 has really
bad bugs. If 2.3.0 fixes those bugs (without introducing worse bugs) and
works 5% slower it would be a good thing anyway.

agreed, but as I understand it, performance _is_ an issue regarding
the growing traffic and recently surfacing spooling defects.

While writing this mail, curiosity led me to test my current postage
setup against 2.2.0 and 2.3.0-current.

yes, I did this before, too. you can really quickly drive 2.2 against
the wall, if you want.

<snip/>
So I increased the memory for James 2.2.0 to 10M. After 1 minute postage
says "unmatched: 299, matched: 0". I don't know if this is an
incompatibility between james 2.2 and postage, but I also see all the
messages still in the spool. After 2 minutes "unmatched: 586, matched:
5", after 3 minutes "unmatched: 878, matched: 13", then again OOM.

This time I increased the Xmx to 20M. After 3 minutes again "unmatched:
879, matched: 14", this time no OOM.. At the end of the test "1451
unmatched" (UNDELIVERED) and 31 matched.

When it comes to sending/receiving email, Postage should be
completely mail server agnostic. Only SMTP/POP3 is used.

<snip/>
After the 5 minutes I had almost 500 messages in outgoing
that were ALL successfully delivered to postage in the "2 minutes"
window postage waits at the end of the test. So the raw result was:
"matched: 2616, unmatched: 1". Isn't this cool?

Well, the "1" is a bug in Postage, not so cool ;-)
But yes, it is quite impressive. I also noticed similar behavior. 2.3
is a substantial improvement over 2.2, but you know better than I do
:-)

<snip/>
Please note that I never tested 2.2.0 before: this is simply my current
"random" 2.3.0 test applied to 2.2.0. And the difference is imo
*impressive*: James 2.2.0 needed 20MB for the 300mail/minute test, James
2.3.0 did 500mail/minute (under stress) using 5MB and without
throwing OOM. Furthermore James 2.2.0 had an increasing spool size while
2.3.0 had no problem on the spool. Again, this is not a test to
demonstrate that 2.3.0 is better than 2.2.0. Maybe there are tests that
work better on 2.2.0 or tests that work even better on 2.3.0. But as a
first try I would simply say *WOW*.

That is why I think that "memory leak" is a delicate term.
If what we currently have is "the best James ever", we should release pronto.
But I expect problems to pop up, after such a long time with no
release and such big changes to the code base.

If today someone steps up and says, "I have this config and only
changed releases, and before it ran for 10 days until restart and now
it's only 5 hours", it would not be acceptable to release the
software, because the server would not be ready for enterprise
production purposes.

Bottom line: I would really love to have an "adaptive" Postage
configuration that simply tries to find out the maximum flow for
the given configuration.

:-) This would mean a not-so-small refactoring, because the triggering
of "taking samples", e.g. sending mails, has to be completely revised.
With the changes I am about to check in you are only able to increase
the mail sizes, which is not the same. Adaptive load is coming later,
unless someone steps up with a solution. I'd be happy to help.
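
To make the "adaptive" idea concrete, here is a rough sketch of the search
itself, not of Postage code. Everything in it is hypothetical: the class
AdaptiveLoadSketch, the method findMaxRate and the probe callback are made-up
names, and the probe stands in for "run one test at this mails/minute rate
and return matched / (matched + unmatched)". It also assumes the matched
ratio never improves when the rate goes up. The rate is doubled until a run
fails, then the gap between the last good and the first bad rate is bisected.

```java
import java.util.function.IntToDoubleFunction;

// Hypothetical sketch: none of these names exist in Postage or James.
class AdaptiveLoadSketch {

    /**
     * Returns the highest rate (mails/minute) in [1, maxRate] at which the probe
     * still reports a matched ratio >= minMatchedRatio, or 0 if even rate 1 fails.
     * Assumes 1 <= startRate <= maxRate and that the ratio does not improve
     * as the rate grows.
     */
    public static int findMaxRate(IntToDoubleFunction probe, int startRate,
                                  int maxRate, double minMatchedRatio) {
        int good = 0;          // highest rate known to pass (0 = none yet)
        int bad;               // lowest rate known to fail
        int rate = startRate;

        // Phase 1: double the rate until a run fails or the cap is reached.
        while (true) {
            if (probe.applyAsDouble(rate) >= minMatchedRatio) {
                good = rate;
                if (rate == maxRate) {
                    return good;   // the cap itself passes, nothing more to search
                }
                rate = (int) Math.min((long) rate * 2, maxRate);
            } else {
                bad = rate;
                break;
            }
        }

        // Phase 2: bisect between the last passing and the first failing rate.
        while (bad - good > 1) {
            int mid = good + (bad - good) / 2;
            if (probe.applyAsDouble(mid) >= minMatchedRatio) {
                good = mid;
            } else {
                bad = mid;
            }
        }
        return good;
    }

    public static void main(String[] args) {
        // Simulated server (made-up numbers): delivers everything up to
        // 500 mails/minute and only half of the mails above that.
        IntToDoubleFunction simulated = r -> r <= 500 ? 1.0 : 0.5;
        System.out.println(findMaxRate(simulated, 100, 10000, 0.99));
    }
}
```

In a real setup each probe call would be a full test run of several minutes,
so the number of steps matters; doubling plus bisection keeps it logarithmic
in the rate range.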

 Bernd
