On Fri, Apr 28, 2017, at 10:54, Mike Miller wrote: > Hello Apache Friends, > > I am trying to track down a possible issue that Accumulo is having with a > test in our Shell code that does some tracing. The test is > ShellServerIT.trace > <https://github.com/apache/accumulo/blob/master/test/src/main/java/org/apache/accumulo/test/ShellServerIT.java#L1612> > and > it will periodically fail looking for the trace "sendMutations" on line > 1624. > > I can't figure out if this is an issue with Htrace version > 3.1.0-incubation > or something crazy we are doing with multithreading. When the test is > run > through Maven, the trace seems to always be unrooted: > The following spans are not rooted (probably due to a parent span of > length > 0ms): > 2+136 [email protected] PopulateMetadata > 1+137 [email protected] wal > 1+137 [email protected] update > 4+140 [email protected] FinishCreateTable > 1+1340 [email protected] > org.apache.accumulo.core.client.impl.TabletServerBatchWriter$MutationWriter > 1 > 1+1340 [email protected] sendMutations > > But when I perform the same trace through the Accumulo shell, entering > the > commands manually, it seems to work OK: > 5+33230 [email protected] close > 1+33230 [email protected] BinMutations 1 > 4+33231 [email protected] > org.apache.accumulo.core.client.impl.TabletServerBatchWriter$MutationWriter > 1 > 4+33231 [email protected] sendMutations > > The code that is doing the trace is in our TabletServerBatchWriter here > <https://github.com/apache/accumulo/blob/master/core/src/main/java/org/apache/accumulo/core/client/impl/TabletServerBatchWriter.java#L858> > . > > I have thought of two possible reasons we are having problems... > 1 - We are tracing in 2 separate thread pools and htrace gets confused: > In org/apache/accumulo/core/client/impl/TabletServerBatchWriter.java > BinMutations trace: > binningThreadPool.execute(Trace.wrap(new Runnable() { > And sendMutations trace: > sendThreadPool.submit(Trace.wrap(new SendTask(server))); > > 2 - We change the name of the current thread just before starting the > "sendMutations" trace: > In TabletServerBatchWriter.SendTask.send() > <https://github.com/apache/accumulo/blob/master/core/src/main/java/org/apache/accumulo/core/client/impl/TabletServerBatchWriter.java#L856> > > > Any ideas/insight would be greatly appreciated. Thank you! > - Mike
Hi Mike, Changes in thread name will not be an issue, because HTrace uses thread-local data to maintain its internal state. Similarly, parent spans with length 0, although not great to look at, still do count as parents. I think what trips up a lot of people tracing with thread pools is failing to close traces. Take a look and make sure you are not experiencing this too. In later versions of HTrace, we added warnings for this kind of thing when we detected it. Another thing to look at would be which actual threads are performing which operations -- i.e. if there are "handoffs" between threads, do the spans get detached and reattached too? best, Colin
