Thanks for the response!

> Is the concern just that there would be too many spans if each call to
> next() created a span?  Does sampling help address this concern?

That's part of it, though sampling would indeed help there. My concern is that
if we sample the spans created in next(), but not the overall request trace,
we won't get a full picture of how much time was spent in the iterators over
the life of the request. That is, if the next() spans are only created 1% of
the time, the timeline would look like one long request with small blips for
each sampled call to next() (each of which is presumably fast individually).
Is my understanding there correct? I'll give it a try either way, to see what
it looks like.

> what we'd really want, is to aggregate the time spent in
> each call to next for each iterator, and then send the spans at the
> end."  But HTrace already does this, right?  Most span receivers will
> batch up the spans they receive and send them all in one big batch,
> probably "at the end."  What am I missing?

Re: batching span sending: that's definitely good from a performance
perspective. The spans themselves would still be considered separate, though,
correct? What I was trying to get at was the creation of a single aggregate
span covering the total time taken across all calls to next() for a particular
iterator, which could then be sent once the iterator is exhausted. Does that
make sense? I don't think such a span necessarily fits into HTrace's model
(nor into Zipkin's, which we're currently using for visualization), so I've
moved away from that idea a bit.
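
For concreteness, here is a rough sketch of the kind of aggregation described
above. It is plain Java and makes no HTrace calls; TimingIterator and its
accessors are hypothetical names used only for illustration, and how the total
would be attached to a span or annotation is left open.

```java
import java.util.Iterator;

/**
 * Hypothetical wrapper (not part of HTrace): accumulates the time spent in
 * the wrapped iterator's hasNext() and next() calls, so that one total can
 * be reported after the iterator is exhausted instead of one span per row.
 */
class TimingIterator<T> implements Iterator<T> {
    private final String name;
    private final Iterator<T> delegate;
    private long totalNanos = 0; // summed wall-clock time inside the delegate
    private long calls = 0;      // number of successful next() calls

    TimingIterator(String name, Iterator<T> delegate) {
        this.name = name;
        this.delegate = delegate;
    }

    @Override
    public boolean hasNext() {
        long start = System.nanoTime();
        try {
            return delegate.hasNext();
        } finally {
            totalNanos += System.nanoTime() - start;
        }
    }

    @Override
    public T next() {
        long start = System.nanoTime();
        try {
            T value = delegate.next();
            calls++;
            return value;
        } finally {
            // Time is recorded even if the delegate throws.
            totalNanos += System.nanoTime() - start;
        }
    }

    String name() { return name; }

    long totalNanos() { return totalNanos; }

    long calls() { return calls; }
}
```

Note that when such wrappers are chained, the total for an outer iterator
includes the time spent in the iterators it pulls from, so per-step time would
be the difference between adjacent totals.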

Thanks again for all the help!

Andrew




> On Aug 18, 2015, at 11:39 PM, Colin P. McCabe <[email protected]> wrote:
> 
> Hi Andrew,
> 
> Thanks for posting!
> 
> Is the concern just that there would be too many spans if each call to
> next() created a span?  Does sampling help address this concern?
> 
> You said, "what we'd really want, is to aggregate the time spent in
> each call to next for each iterator, and then send the spans at the
> end."  But HTrace already does this, right?  Most span receivers will
> batch up the spans they receive and send them all in one big batch,
> probably "at the end."  What am I missing?
> 
> cheers,
> Colin
> 
> 
> On Tue, Aug 18, 2015 at 3:03 PM, Andrew Mains
> <[email protected]> wrote:
>> Hi all,
>> 
>> This is really more of a "user" question than a "dev" question, but I'm
>> posting here since I was unable to find a user list for the project; hope
>> that's alright.
>> 
>> I was hoping to get some input on the best way to trace execution through a
>> chain of iterators. Specifically, we have a database-like application which
>> pipes data through multiple iterators, performing some transformation at
>> each step. We'd like insight into how long each step is taking in total for
>> a particular request. That is, for a chain of iterators iter_1... iter_i, we
>> want the total time spent in each iter_i for that request.
>> 
>> The naive implementation would be to start a span in each call to next, but
>> that's far too fine grained, given that we'd be starting a new span for each
>> row. What we'd really want, is to aggregate the time spent in each call to
>> next for each iterator, and then send the spans at the end. This would
>> require implementing a new span subclass, which is a bit tricky to integrate
>> at the moment (since it prevents us from using the static helpers in Trace).
>> 
>> Any thoughts on the best way to approach this issue? Is there something I'm
>> missing, or some way that we can reframe the problem such that it makes
>> sense with what's currently in htrace?
>> 
>> Let me know if there's anything that's unclear, or any further info I can
>> provide about our use case.
>> 
>> Thanks for the help!
>> 
>> Andrew
