Hi Patrick,

On Mon, Oct 01, 2018 at 01:08:32PM -0400, Patrick Hemmer wrote:
> What are the thoughts around putting time tracking stats around LUA
> calls? Just really basic info like how long a request spent running LUA
> code. Similar to how we already have metrics for time spent in queue,
> connecting, waiting on response, etc.

I have plans for doing this as an option. The thing is that the internal
time is only updated between calls, so it stays constant while the Lua
code runs; if you just compare the timestamps taken when entering and
leaving, it will report zero ms. I'm currently working on instrumenting
the scheduler with some optional clock_gettime() calls to collect the
time spent processing tasks, and this could later be refined to surround
the Lua parts as well.
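Roughly, the idea is something like this (only a sketch to illustrate,
the callback type and the names are invented, it's not the actual code):

    #include <stdint.h>
    #include <time.h>

    /* Surround one task call with two optional clock_gettime() calls and
     * return the CPU time it consumed, in nanoseconds. Later the same
     * pair of calls could be placed around the Lua execution itself.
     */
    static uint64_t timed_call(void (*process)(void *ctx), void *ctx)
    {
        struct timespec before, after;

        clock_gettime(CLOCK_THREAD_CPUTIME_ID, &before);
        process(ctx);
        clock_gettime(CLOCK_THREAD_CPUTIME_ID, &after);

        return (uint64_t)((after.tv_sec - before.tv_sec) * 1000000000LL
                          + (after.tv_nsec - before.tv_nsec));
    }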

> Currently I accomplish this manually by grabbing the timestamp at the
> beginning and end of all LUA actions, store the duration in a
> transaction variable, and add that variable to the log. But I was
> wondering if this should instead have a native solution.
> I see we already have `tune.lua.service-timeout`, so it appears some
> timing data is already tracked.

In fact no, I've just checked and that's only the applet's inactivity
timeout.
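
By the way, I suppose the manual approach you describe looks roughly
like this (just a sketch, the action and variable names are invented):

    -- measure the wall-clock time spent in one Lua action and
    -- accumulate it in a transaction variable
    core.register_action("timed_action", { "http-req" }, function(txn)
        local t0 = core.now()              -- table with .sec and .usec

        -- ... the real Lua processing goes here ...

        local t1 = core.now()
        local ms = (t1.sec - t0.sec) * 1000 + (t1.usec - t0.usec) / 1000
        local prev = txn:get_var("txn.lua_time") or 0
        txn:set_var("txn.lua_time", prev + ms)
    end)

The txn.lua_time variable can then be added to the log-format using the
var() sample fetch.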

> However one difference between that data
> and my custom implementation is that mine includes sleep time. So we
> might want to expose the LUA time as 2 different metrics: one as CPU run
> time, and another as wall clock time.

The wall clock time is easier to collect and doesn't require precise
accounting since you can grab the time on the first entry point and
the last exit point. It's just that I'm not sure how we know we're
definitely leaving. We could probably have 2 variables per stream:
  - last_lua_entry_date
  - total_lua_service_time

The first one would get now_ms when entering a Lua rule/service, and
when leaving it, the difference between now_ms and this value would be
added to total_lua_service_time. This would include the sleep time. We
just have to be careful not to reset last_lua_entry_date when returning
from a yield.
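
In pseudo-C it would look more or less like this (only a sketch, these
stream fields and helpers don't exist yet and the names are invented):

    /* proposed per-stream fields (sketch only):
     *   s->last_lua_entry_date    : now_ms when Lua was first entered, 0 when idle
     *   s->total_lua_service_time : accumulated wall-clock Lua time, in ms
     */

    /* on entering a Lua rule/service; harmless if called again when
     * resuming from a yield, since the date is only set when unset.
     */
    static inline void stream_lua_enter(struct stream *s)
    {
        if (!s->last_lua_entry_date)
            s->last_lua_entry_date = now_ms;
    }

    /* only when we definitely leave the Lua code, not on a yield */
    static inline void stream_lua_leave(struct stream *s)
    {
        s->total_lua_service_time += now_ms - s->last_lua_entry_date;
        s->last_lua_entry_date = 0;
    }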

> The use case is that as we put more code into LUA scripts, we want to be
> aware of the impact this code is having on the performance of the
> requests, and the response times.

That's totally normal! It seems you've already studied this quite a
bit. If you can look closer to figure out the exact points that need
to be adjusted to collect the data as proposed above, that would be
great! The same entry points will help me see where to place finer
accounting for the processing time.

Thanks,
Willy
