I'm getting close to having to make a buy decision on an architecture
and appserver vendor for a very large-scale, mission-critical system:
hundreds of boxes, many thousands of services.
This must include facilities for control, monitoring and status of the
whole system. SNMP and what else? The vendors are weak here!
I'd like to hear what folks think of the current vendors, and what might
be in their next releases, if anyone has some info.
Scaling, my views: efficient Name Service (NS) and robust client stub.
- Clustering.
  To me, the appserver should support flexible asymmetric clusters.
  The cluster should spin up new VMs when load demands.
  Weblogic does neither, Gemstone does both, and Inprise supports
  asymmetric clusters, but I don't remember whether it spins up VMs.
- Name service.
  The best-of-breed NS that I've seen is in Weblogic server. It uses a
  multicast group to keep all NS instances in a cluster in sync. The
  worst choice is using commercial LDAP directory servers for the local
  store and LDAP replication to keep N instances in sync. Our current
  appserver uses this architecture and we hate it for so many reasons.
- Client side stub:
  - All vendors support re-bind and re-dispatch on IOException. Is
    there any differentiation between the vendors?
  - Does anyone support a timeout on a method call? I.e. an SLA, with
    re-dispatch to a different instance when the timeout occurs?
    To me this is much lighter weight than creating a client side
    transaction context and setting a transaction timeout. The former
    approach would not appreciably increase per-method latency, whereas
    javax.jts.UserTransaction would be a costly operation for every
    method.
    How do other vendors stack up?
- Load balancing.
  - Some vendors use a bulletin board of which services are busiest and
    do method-level load balancing. I'm not sure I want this all the
    time. Is it another source of latency and another bottleneck?
  - Bind-level load balancing. All vendors that support clustering
    support at least this form of load balancing, unless they do
    method-level balancing.
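
To make the method-level timeout question above concrete, here is a rough
Java sketch of a stub that gives each call a deadline and re-dispatches to
the next instance when the deadline passes. It is an illustration only,
not any vendor's API: the class TimeoutRedispatch, its invoke method and
the idea of passing the replicas as a list of Callables are all
assumptions made for the example. It is only safe for idempotent methods,
since a timed-out call may still complete on the slow instance.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical client-side stub helper; not part of any appserver product.
final class TimeoutRedispatch {
    // Daemon threads so a hung remote call cannot keep the client JVM alive.
    private final ExecutorService pool = Executors.newCachedThreadPool(r -> {
        Thread t = new Thread(r, "stub-call");
        t.setDaemon(true);
        return t;
    });

    /**
     * Tries each replica in order and returns the first result that arrives
     * within timeoutMillis. A timeout or a failed call moves on to the next
     * replica; the last failure is rethrown if every replica fails.
     */
    <T> T invoke(List<Callable<T>> replicas, long timeoutMillis) throws Exception {
        Exception last = new IllegalStateException("no replicas configured");
        for (Callable<T> replica : replicas) {
            Future<T> call = pool.submit(replica);
            try {
                return call.get(timeoutMillis, TimeUnit.MILLISECONDS);
            } catch (TimeoutException e) {
                call.cancel(true); // interrupt the slow call, then re-dispatch
                last = e;
            } catch (ExecutionException e) {
                // The call itself failed (for example with an IOException).
                Throwable cause = e.getCause();
                last = (cause instanceof Exception) ? (Exception) cause : e;
            }
        }
        throw last;
    }

    void shutdown() {
        pool.shutdownNow();
    }

    public static void main(String[] args) throws Exception {
        TimeoutRedispatch stub = new TimeoutRedispatch();
        Callable<String> slow = () -> { Thread.sleep(2_000); return "slow instance"; };
        Callable<String> fast = () -> "fast instance";
        // The slow instance misses the 100 ms deadline, so the fast one answers.
        System.out.println(stub.invoke(Arrays.asList(slow, fast), 100));
        stub.shutdown();
    }
}
```

The sketch pays for a thread hand-off on every call, which a real stub
would probably avoid by putting the deadline in the transport layer (a
socket read timeout, say). The point is only that the deadline lives
entirely on the client and needs no transaction context.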
Resiliency / fault tolerance (clustered boxes):
- NS, object activation and system status must not have a single point
  of failure.
  Any vendor differentiation here, or particularly bad designs?
- Client stub and NS provide re-dispatch of failed (idempotent) method
  calls.
  I need method-level timeout but haven't found it. Any vendors, present
  or future?
- When a service or box fails, the NS must be quickly scrubbed of dead
  references.
- If singleton services fail, they must be restarted elsewhere.
  Does any vendor support this?
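
One common way to get dead references scrubbed quickly is to treat every
binding as a lease that the live instance must keep renewing. The Java
sketch below shows the idea with an in-memory registry. It is a
hypothetical illustration, not how any of the vendors above implement
their NS: the class LeasedNameService and its heartbeat, scrub and lookup
methods are names invented for the example, and replication between NS
instances is left out entirely.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical in-memory registry; names and API are illustrative only.
final class LeasedNameService {
    private final long leaseMillis;
    // service name -> (instance address -> time of last heartbeat)
    private final Map<String, Map<String, Long>> bindings = new ConcurrentHashMap<>();

    LeasedNameService(long leaseMillis) {
        this.leaseMillis = leaseMillis;
    }

    /** Called on bind and again on every heartbeat from a live instance. */
    void heartbeat(String service, String address, long nowMillis) {
        bindings.computeIfAbsent(service, k -> new ConcurrentHashMap<>())
                .put(address, nowMillis);
    }

    /** Drops every reference whose lease has expired; returns how many were removed. */
    int scrub(long nowMillis) {
        int removed = 0;
        for (Map<String, Long> instances : bindings.values()) {
            for (Map.Entry<String, Long> e : instances.entrySet()) {
                // remove(key, value) leaves the entry alone if a heartbeat raced in.
                if (nowMillis - e.getValue() > leaseMillis
                        && instances.remove(e.getKey(), e.getValue())) {
                    removed++;
                }
            }
        }
        return removed;
    }

    /** Returns the addresses currently bound under the given service name. */
    List<String> lookup(String service) {
        Map<String, Long> instances = bindings.get(service);
        return instances == null ? new ArrayList<>() : new ArrayList<>(instances.keySet());
    }

    public static void main(String[] args) {
        LeasedNameService ns = new LeasedNameService(3_000);
        ns.heartbeat("billing", "box1:7001", 0);
        ns.heartbeat("billing", "box2:7001", 0);
        ns.heartbeat("billing", "box1:7001", 2_500); // box1 is still alive
        ns.scrub(4_000);                             // box2 missed its lease
        System.out.println(ns.lookup("billing"));    // prints [box1:7001]
    }
}
```

The lease length bounds how long a dead reference can linger, so it
trades heartbeat traffic against how "quickly" the scrub happens.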
Thanks for your thoughts and experiences!
Curt
Architect of our next-gen telephone system.
Curt Smith
Z-Tel
email: [EMAIL PROTECTED]
work: 404-237-1166 x182
FAX: 404-237-1167