Re: [AsyncHttpClient] data collection & instrumentation

Rick McGuire Wed, 23 Jan 2008 06:51:33 -0800

Sangjin Lee wrote:

The modified patch is there on JIRA.  Some follow-up discussions...
I think the current implementation works well, but one thing that'sdifficult to do is to collecting timing data. For example, some ofthe most important instrumentation data are things like averageresponse time (from request start to request complete) and averageconnect time (from connect start to connect complete).
Currently the context object that's available to monitoring listenersis the request object, along with the timestamp of the event itself.To be able to compute a response time for a given request, one wouldneed to take the timestamp from the request start event, associate itwith the request, and store it on the listener. When the requestcomplete event fires, then one would need to look up the stored datausing the request object as a key to retrieve the timestamp for therequest start event, compute the delta, and store the delta.
While all this is possible, it has a number of issues, not the leastof which is that one would need to maintain a map of request to starttime (as well as request to connect time). This would bloat memory aswell as other implications.
A substantially easier solution would be to provide the request starttime and connect start time as part of the information that's passedto the monitoring listener. Then listeners could simply compute thediff to get the elapsed time very easily with no need to maintain mapsof any kind. This could be either part of the request object itself,or if desirable, one could consider a separate context or event objectthat contains this information. What do you think?
Thanks,
Sangjin
On Jan 22, 2008 1:33 PM, Sangjin Lee <[EMAIL PROTECTED]<mailto:[EMAIL PROTECTED]>> wrote:
    I took a look at the patch on GERONIMO-3761, and it looks great.
     Thanks.  I have modified your patch for several things, though,
    and I'm nearly ready to add it to the JIRA report.  Comments about
    the changes...

    - I rewrote the EventQueue class to use an Executor.  Since the
    Executor implementation provided by the JDK is basically a thread
    pool associated with a task queue, it provides an identical
    functionality to what was in EventQueue.  I think that it is good
    to use the constructs from java.util.concurrent.* whenever it
    makes sense, and I believe this is one of them.

    - This change also enables us to remove "synchronized" from
    notifyMonitoringListener().  The notify method will be called very
    often and concurrently, and reducing the lock contention will be
    important.  Using an Executor makes it possible to eliminate
    synchronization, at least at that level.

    - I associated a shared thread pool (Executor) for all
    dispatchers.  I think it is desirable for dispatchers to share
    this thread pool rather than each instance of dispatchers creating
    and maintaining its own thread.

    - Renamed EventQueue to EventDispatcher.

    - I also moved the monitoring listener list to EventDispatcher.  I
    also used CopyOnWriteArrayList as the implementation for the list.
     CopyOnWriteArrayList is an ideal choice for this as it is thread
    safe and lock-free.  Also, our use case is heavy read-access but
    very infrequent write-access, which CopyOnWriteArrayList is
    suitable for.

    - I moved the connection_failed notification to before the
    getSession() call.  The getSession() call here always throws an
    exception (by design), and thus notification needs to be done
    before calling getSession().

    - I rewrote the CountingMonitor to use AtomicIntegers.  This
    should be slightly safer.

    - I changed the timestamp calls from System.currentTimeMillis() to
    System.nanoTime()/1000000.  The nanoTime() call is more high-res,
    as currentTimeMillis() may be tens of milliseconds accurate on
    some platforms, and thus not suitable for these measurements.

    I also have some more follow-up questions, which I'll post soon.

    Thanks,
    Sangjin


    On Jan 17, 2008 10:51 AM, Sangjin Lee <[EMAIL PROTECTED]
    <mailto:[EMAIL PROTECTED]>> wrote:

        I like your idea of using the event listener as the main way
        of doing this.  Basically no or multiple listeners would be
        invoked on a different thread when events occur.

        The event listener APIs would define those key methods which
        would contain all the necessary information about the events
        in an immutable fashion.

        We could provide a simple adapter that is no op so people can
        override necessary methods easily.  Also, we could provide one
        implementation which is a counting listener that does the
        basic metrics collection.

        What do you think?

Only if it can be done without having to maintain the same sort ofrequest-to-start time map that you don't wish to do with the listener.The process of adding data collection should cause memory bloat thereeither, particularly if monitoring is not being used (the more likelycase). It seems more reasonable that this type of processing should bepushed into the monitoring implementation rather than have the asyncclient try to keep track of everything. This way, the overhead is onlyintroduced while metrics are being gathered. A very simple andrelatively low cost means might be to add a couple of time stamp fieldsto the request object, but only for the most significant events.Perhaps request start and connection start, but nothing else. Anotherpossible approach would be to have a mechanism that would allow themonitor to attach an annotation object to the request that could be usedto implement a lifecycle memory if needed. The cost of doing this isrelatively minor when this type of information is not needed, but it'sflexible enough to be tailored to any type of data collection.


        Thanks,
        Sangjin

        On Jan 17, 2008 2:58 AM, Rick McGuire < [EMAIL PROTECTED]
        <mailto:[EMAIL PROTECTED]>> wrote:

            Thunderbird is playing very strange games with me this
            morning, somehow
            deleting the original post.   Anyway, here are my comments
            on this.

            > I'd like to propose changes to enable some basic stat
            collection
            > and/or instrumentation to have visibility into
            performance of AHC.
            >  For a given *AsyncHttpClient*, one might want to know
            metrics like
            >
            > - total request count
            > - total success count
            > - total exception count
            > - total timeout count
            > - connection attempt count
            > - connection failure count
            > - connect time average
            > - connection close count
            > - average response time (as measured from the invocation
            time to
            > having the response ready)
            > - and others?
            Collection of metric information would, I think, be a good
            thing.
            However, I think we should separate the consolidation of
            the information
            from the collection.  That is, the client should just have
            different
            types of events for data collection, and the event
            listener would be
            responsible for presenting the information appropriately.

            For example, to create the list above, I'd see the
            following set of
            events needed:

            - request made
            - request completed
            - request failed
            - request timeout
            - connection attempt started
            - connection failed
            - connection closed

            All events would be timestamped, which would allow metrics
            like "average
            request time" to be calculated.  This set of events would
            mean the
            client would not need to maintain any metric accumulators,
            and if the
            event information is done correctly, would even allow more
            fine grained
            monitoring (e.g., average connection time for requests to
            domain
            "foo.bar.com <http://foo.bar.com>").


            >
            > Collecting these metrics should have little effect on
            the overall
            > performance.  There would be an API to access these stats.
            >
            > I was initially thinking of an IoFilter to consolidate
            these hooks,
            > but I realize some of these metrics are not readily
            available to an
            > IoFilter (e.g. connect-related numbers).  It might be
            unavoidable to
            > spread the instrumentation in a couple of places (IoHandler,
            > ConnectFutureListener, etc.).
            >
            > Taking this one step further, one might think of
            callbacks or
            > listeners for various key events such as connect
            complete, request
            > sent, etc., so callers can provide instrumenting/logging
            code via
            > event notification.  However, I think this should be
            used judiciously
            > as such injected code may cause havoc.
            I think listeners would be the way to go.  This would
            allow multiple
            monitoring types to be attached to the pipe to gather data
            as needed.
            Perhaps the approached used with the javamail API might be
            of use here.
            The javamail Store APIs have a number of listener events
            that are
            broadcast (new mail arrived, message delete, folder
            created, etc.).
            Because there are similar concerns of havoc, the events
            get posted to a
            queue, and are dispatched on to a separate thread.  The
            queue is only
            created (and the associated thread) are only created when
            there are
            listeners available to handle the events.  This allows the
            events to
            very low overhead when there are no interested parties and
            prevents the
            listeners from interfering with normal javamail operations
            by being
            processed on a different thread.


            >
            > Thoughts?  Suggestions?

Re: [AsyncHttpClient] data collection & instrumentation

Reply via email to