Rainer Jung wrote:
Filip Hanik - Dev Lists wrote:
hi Rainer,
so to tell the true tale, isn't the story...
- You got customers on 5.5 using session replication
Of course, and that helped in making the 5.5 cluster better.
- Your customers want to move to Tomcat 6
I do hope so! I thought that's something we want all our users to do?
But: I didn't do OACC because *anyone* asked me to. I reflected the
actual situation and I still think it is a good idea.
- You're not confident about the maturity of Tomcat 6's clustering
codebase, mainly cause you haven't used it, even though it was
originally developed in 2006. I would argue that this doubt is
mainly a lack of both usage and understanding
Yes, as I said I'm not confident in its maturity. No, it mainly comes
from the fact that it is pretty young. Yes, it was developed in 2006
but it first got delivered as default with TC 6. And TC 6 is getting
wide adoption now - but not clustering! As I said, I'm talking about
experience coming from users and I expect this to need another 6 to 12
months to grow.
- So, to mitigate this, you'd like to use the ASF as the delivery
vehicle for your custom code base to allow your customers to switch
to Tomcat 6
So how should I interpret this statement? The terms "vehicle", "your
codebase" and "your customers" make it a very insulting statement. At
least I do feel insulted.
Absolutely not. My sincere apologies if you feel that way.
I really just didn't feel I was getting the whole story when you gave
technical argument after technical argument, claiming
features were missing or removed, when in fact most of them are there
and improved.
even when it comes to documentation, TC6 is the first time there is a
complete reference documentation for clustering setup.
so please don't be insulted, instead be honest. don't make claims based
on what you don't know, make them based on what you do know.
Filip
But what are we talking about:
- "your codebase": it's our codebase. If we had to find a single person
for this cluster codebase, then it would be your codebase. As everyone
can see from yesterday's commit, the codebase is nearly identical to
the TC 5.5 codebase. I don't want to have "my" codebase. I want us to
cooperate for a strong shared Tomcat codebase.
- "your customers": I get a lot of ideas from my customers, and yes
they belong to our community as well as yours. I try to be very
cautious not to bring in very customer specific things into the Tomcat
project. I was thinking about cluster users of TC 5.5 and how they
could move forward and this is why I did port the TC 5.5 cluster. And
yes, I know users who would profit from that. But the motivation for
OACC is much more general. Migration concepts in the world of high
availability need to be very solid. The Tomcat project (we) cancelled
support for its existing cluster module from one major version to the
next, without any early warning. That's not good practice in the HA
world. So yes, I want to mitigate. I want to mitigate the fact that
we didn't make the right decisions with respect to cluster users when we
dropped the existing cluster in TC 6. And I am very happy that under
the hood we didn't change the interfaces too much, so that it is very
simple to mitigate this. OACC was done in less than a day.
- "vehicle": The official Tomcat project is the right place to offer
OACC and to care about migration issues. We only started lately to act
more properly with respect to deprecations and end-of-life
announcements. This shows the necessary awareness for the needs of our
users since Tomcat is now the leading servlet container and is
ubiquitous. And I'm very open here about my motivation and plans. Most
of us need to combine business work with our community work. And the
ASF model allows this. The questions w.r.t. doing the work inside an
ASF project are more like: does it lead to the right result for the
community? Do we work together in good collaboration? Is our decision
process open? And yes: w.r.t. meritocracy you are much in front of me.
All in all I feel accused by the above formulation.
I'm ok with you doing this for those folks in sandbox, I'd probably
recommend that we not put this as an official release to
Tomcat, as I believe our small group would do much better focusing
the effort against the current implementation. Putting into a release
means we would spend resources in the bugs that arise from the port
itself.
The effort point is a valid point. For this reason I'm OK with clearly
stating that OACC is a dead end. I already put some (probably not that
harsh) statement in the release plan file I committed yesterday before
our discussion even started.
Concerning official module releases I have a differing opinion:
- I will not suggest to bundle OACC additionally inside any other
Tomcat release file
but
- I think it will be a strong statement from a responsibility point
of view (a project asset), to provide OACC as an official
separate release download. But at least for me it's too early
to propose this, because I first want to get a clearer picture,
of what will be included in OACC. At the moment it builds and works,
but something we deliver needs to contain other parts. Not far away,
but not finished yet.
I do have some comments inline too
Rainer Jung wrote:
Although I think that a detailed technical discussion is not the
right way to determine the usefulness of OACC, some comments:
- monitoring: your reference to trunk strengthens my argument about
maturity of code. Taking trunk code instead of TC 5.5 code is the
maximum opposite approach.
- Java 5 dispatcher: I mostly agree. I got lost in the code. The
code I thought was responsible was transport/PooledSender.java which
uses a fixed pool of threads without queueing. I overlooked somehow
the Executor with queue in the Java 5 dispatcher. Nevertheless
there's still some discrepancy, because we added some aspects to
the queue in 5.5 which are gone now:
- lock fairness biased to the remover in order to reduce the
likelihood of lock starvation
the LinkedBlockingQueue implements a two-lock algorithm to avoid lock
contention around simultaneous puts and takes.
Sounds very good, though Executor uses queue.offer() and offer() does
need both of them. Nevertheless I'm confident, that the j.u.concurrent
queues and Executor are very efficient and stable. Using them may need
some tuning, if my reasoning about primary and secondary functionality
is right (bounding queue size, maybe more).
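To make the offer() point concrete, here is a minimal sketch using only java.util.concurrent. It assumes nothing about the Tribes code itself; the class name, capacity and item counts are made up for illustration. It shows that offer() on a bounded LinkedBlockingQueue never blocks the producer: once the queue is full, the call just returns false, which is why the bound (and what happens when it is hit) becomes a tuning question.

```java
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical illustration, not Tomcat/Tribes code.
class BoundedOfferSketch {

    /**
     * Offers `attempts` items to a queue bounded at `capacity`
     * and returns how many were accepted. Nothing consumes the
     * queue, so everything beyond the bound is refused.
     */
    static int offerMany(int capacity, int attempts) {
        LinkedBlockingQueue<Integer> queue = new LinkedBlockingQueue<>(capacity);
        int accepted = 0;
        for (int i = 0; i < attempts; i++) {
            // offer() is non-blocking: false means "queue full".
            if (queue.offer(i)) {
                accepted++;
            }
        }
        return accepted;
    }

    public static void main(String[] args) {
        System.out.println("accepted: " + offerMany(4, 10));
    }
}
```

With an unbounded queue (the no-argument constructor) offer() would always succeed and memory would be the only limit.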
- taking over the whole queue by the remover instead of
removing item by item (again less lock contention paired with
less context switching)
yes, that's a neat idea, question is how it compares against the
two-lock algorithm
since TC 5.5 uses a single lock, my guess would be there is a larger
risk of contention there than a two lock algorithm
That's quite possible. The queue actually supports exactly this "take
all of it" idea, but I didn't check the Executor implementation for
it. It would increase thread to memory locality. But this is more
performance optimization. As long as we are far away from lock
contention, I'm fine.
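For reference, the "take all of it" idea presumably maps to BlockingQueue.drainTo() in java.util.concurrent. The sketch below is only an illustration under that assumption; the class and method names are hypothetical and this is not how ThreadPoolExecutor or Tribes consume their queues. The remover blocks for the first item and then grabs everything else that is queued in a single call.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical illustration, not Tomcat/Tribes code.
class DrainAllSketch {

    /** Waits for at least one item, then takes over the rest of the queue in one call. */
    static List<String> takeAll(BlockingQueue<String> queue) {
        List<String> batch = new ArrayList<>();
        try {
            batch.add(queue.take());      // blocks until something is available
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return batch;                 // empty batch if interrupted while waiting
        }
        queue.drainTo(batch);             // moves all remaining items at once
        return batch;
    }

    public static void main(String[] args) {
        BlockingQueue<String> queue = new LinkedBlockingQueue<>();
        queue.add("msg-1");
        queue.add("msg-2");
        queue.add("msg-3");
        System.out.println(takeAll(queue) + ", left in queue: " + queue.size());
    }
}
```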
- limited size: Favor prevention of OutOfMemoryError over replication
correctness in case we run into replication communication problems.
Priority is always on the primary function, i.e. a working webapp,
clustering is always a secondary function which should be as
transparent as possible during normal operations
this is also implemented, and very accurate. default queue size is
set to 64MB, and you can control what behavior to use when you reach
that limit.
OK, thanks. Although the term queue size is a little misleading I agree.
Why is the default to send synchronous (alwaysSend=true), when the
amount of outstanding messages reaches the limit? I have the
impression, that for session replication once we reach the limit it's
likely that replication has a problem and switching to synchronous
then makes the replication problem a problem for normal response
delivery.
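The concern can be illustrated with a standard-library analogy. This is not how Tribes implements alwaysSend; it is a sketch using ThreadPoolExecutor rejection policies as a stand-in, with hypothetical class and method names. CallerRunsPolicy plays the role of "send synchronously when the limit is reached": the overflow work runs on the submitting thread, so a stalled backend slows the caller. DiscardPolicy plays the role of "drop when the limit is reached": the caller is never held up, at the cost of lost messages.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionHandler;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration, not Tomcat/Tribes code.
class OverflowPolicySketch {

    /**
     * Submits `tasks` small tasks to a one-thread pool whose worker is
     * stalled and whose queue holds a single entry. Returns how many
     * tasks ended up running on the submitting (caller) thread.
     */
    static int ranInCaller(RejectedExecutionHandler policy, int tasks) {
        CountDownLatch release = new CountDownLatch(1);
        AtomicInteger inCaller = new AtomicInteger();
        Thread caller = Thread.currentThread();
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 1, 0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<>(1), policy);

        // Stall the only worker, simulating a replication problem.
        pool.execute(() -> {
            try {
                release.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        // The first task fits in the queue; the rest hit the limit.
        for (int i = 0; i < tasks; i++) {
            pool.execute(() -> {
                if (Thread.currentThread() == caller) {
                    inCaller.incrementAndGet();
                }
            });
        }

        release.countDown();
        pool.shutdown();
        try {
            pool.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return inCaller.get();
    }

    public static void main(String[] args) {
        System.out.println("caller-runs: "
                + ranInCaller(new ThreadPoolExecutor.CallerRunsPolicy(), 3));
        System.out.println("discard:     "
                + ranInCaller(new ThreadPoolExecutor.DiscardPolicy(), 3));
    }
}
```

In the caller-runs case the overflow work lands on the thread that would otherwise be delivering the response, which is the effect questioned above.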
We could go into detail here, but I would prefer to do this in a
separate discussion thread. I don't think that those examples are
big problems, and maybe they are not problems at all.
I do agree that JMX support is a major uha, but the other points seem
to be based on not understanding or misunderstanding the current
implementations, and I'd like to address those for you
The technical points: quite possible. The structural motivation for
OACC: no, see above.
I want to stress once again my own experience, that a huge code base
implementing a complex apparatus needs time to mature. Thus I think
it's fair to not simply lock up happy TC 5.5 cluster users inside
5.5 and offer OACC as an intermediate step on the way to migrate to
HA/Tribes.
knock yourself out :)
Quite seriously: here I've got a language problem, I don't know how to
interpret this. Sorry about that.
Regards,
Rainer
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]