[jira] [Resolved] (ACCUMULO-3853) Contention around ConcurrentLinkedQueue.size() in AsyncSpanReceiver

2015-05-26 Thread Josh Elser (JIRA)

 [ 
https://issues.apache.org/jira/browse/ACCUMULO-3853?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Josh Elser resolved ACCUMULO-3853.
--
Resolution: Fixed

Maintain an explicit size of the collection instead of using the size() method.
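
For readers who have not seen the patch, the idea can be sketched roughly as follows. This is a minimal illustration, not the actual AsyncSpanReceiver change: the class name CountedQueue, the maxSize parameter, and the drop-when-full behavior are all assumptions made for the example. It keeps an AtomicInteger next to the ConcurrentLinkedQueue so the size check is a constant-time read instead of a traversal.

```java
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical helper for illustration only; not the code committed for ACCUMULO-3853.
class CountedQueue<T> {
  private final ConcurrentLinkedQueue<T> queue = new ConcurrentLinkedQueue<>();
  // Explicitly tracked element count, so callers never invoke queue.size().
  private final AtomicInteger count = new AtomicInteger();
  private final int maxSize; // assumed bound, chosen by the caller

  CountedQueue(int maxSize) {
    this.maxSize = maxSize;
  }

  // Adds the item unless the tracked count has reached maxSize.
  // The check and the add are not atomic together, so under concurrency
  // the bound is approximate; that is assumed acceptable in this sketch.
  boolean offer(T item) {
    if (count.get() >= maxSize) {
      return false; // reject the item instead of walking the queue
    }
    queue.add(item);
    count.incrementAndGet();
    return true;
  }

  // Removes and returns the head, or null if the queue is empty.
  T poll() {
    T item = queue.poll();
    if (item != null) {
      count.decrementAndGet();
    }
    return item;
  }

  // Constant-time read of the tracked count.
  int size() {
    return count.get();
  }
}
```

The counter can briefly disagree with the queue's true length while another thread is between add() and incrementAndGet(), which is usually fine when the size is only used as a soft limit.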

> Contention around ConcurrentLinkedQueue.size() in AsyncSpanReceiver
> ---
>
> Key: ACCUMULO-3853
> URL: https://issues.apache.org/jira/browse/ACCUMULO-3853
> Project: Accumulo
> Issue Type: Bug
> Components: trace
> Affects Versions: 1.7.0
> Reporter: Josh Elser
> Assignee: Josh Elser
> Fix For: 1.8.0, 1.7.1
>
> Time Spent: 20m
> Remaining Estimate: 0h
>
> Trying to debug slowness in the replication merkle test, I saw a lot of stack 
> traces sitting here:
> {noformat}
> "replication task 1" #297 daemon prio=5 os_prio=0 tid=0x04624000 nid=0x6b46 runnable [0x7ff8867b6000]
>    java.lang.Thread.State: RUNNABLE
>     at java.util.concurrent.ConcurrentLinkedQueue.size(ConcurrentLinkedQueue.java:450)
>     at org.apache.accumulo.tracer.AsyncSpanReceiver.receiveSpan(AsyncSpanReceiver.java:171)
>     at org.apache.htrace.Tracer.deliver(Tracer.java:81)
>     at org.apache.htrace.impl.MilliSpan.stop(MilliSpan.java:177)
>     - locked <0xa7fe3560> (a org.apache.htrace.impl.MilliSpan)
>     at org.apache.htrace.TraceScope.close(TraceScope.java:78)
>     at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:898)
>     - locked <0xb0086118> (a org.apache.hadoop.hdfs.DFSInputStream)
>     at java.io.DataInputStream.readFully(DataInputStream.java:195)
>     at java.io.DataInputStream.readFully(DataInputStream.java:169)
>     at org.apache.accumulo.core.data.Mutation.readFields(Mutation.java:965)
>     at org.apache.accumulo.server.data.ServerMutation.readFields(ServerMutation.java:52)
>     at org.apache.accumulo.tserver.logger.LogFileValue.readFields(LogFileValue.java:45)
>     at org.apache.accumulo.tserver.replication.AccumuloReplicaSystem.getWalEdits(AccumuloReplicaSystem.java:702)
>     at org.apache.accumulo.tserver.replication.AccumuloReplicaSystem$WalClientExecReturn.execute(AccumuloReplicaSystem.java:531)
>     at org.apache.accumulo.tserver.replication.AccumuloReplicaSystem$WalClientExecReturn.execute(AccumuloReplicaSystem.java:506)
>     at org.apache.accumulo.core.client.impl.ReplicationClient.executeServicerWithReturn(ReplicationClient.java:189)
>     at org.apache.accumulo.tserver.replication.AccumuloReplicaSystem.replicateLogs(AccumuloReplicaSystem.java:429)
>     at org.apache.accumulo.tserver.replication.AccumuloReplicaSystem._replicate(AccumuloReplicaSystem.java:291)
>     at org.apache.accumulo.tserver.replication.AccumuloReplicaSystem.replicate(AccumuloReplicaSystem.java:212)
>     at org.apache.accumulo.tserver.replication.ReplicationProcessor.process(ReplicationProcessor.java:134)
>     at org.apache.accumulo.server.zookeeper.DistributedWorkQueue$1.run(DistributedWorkQueue.java:107)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>     at org.apache.accumulo.fate.util.LoggingRunnable.run(LoggingRunnable.java:35)
>     at java.lang.Thread.run(Thread.java:745)
>    Locked ownable synchronizers:
>     - <0xb00dbf48> (a java.util.concurrent.ThreadPoolExecutor$Worker)
> {noformat}
> Seems like this might still be related to HDFS-8069, but one odd thing is 
> that we were sitting in the size() method.
> For those who don't remember, size() is an O(n) operation on 
> ConcurrentLinkedQueue: it walks the whole queue to count the elements. 
> Calling size() every time we receive a new span is a little nasty.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (ACCUMULO-3853) Contention around ConcurrentLinkedQueue.size() in AsyncSpanReceiver

2015-05-27 Thread Josh Elser (JIRA)

 [ 
https://issues.apache.org/jira/browse/ACCUMULO-3853?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Josh Elser resolved ACCUMULO-3853.
--

Addressed Billie's comments about ZooTraceClient.

> Contention around ConcurrentLinkedQueue.size() in AsyncSpanReceiver
> ---
>
> Key: ACCUMULO-3853
> URL: https://issues.apache.org/jira/browse/ACCUMULO-3853
> Project: Accumulo
> Issue Type: Bug
> Components: trace
> Affects Versions: 1.7.0
> Reporter: Josh Elser
> Assignee: Josh Elser
> Fix For: 1.8.0, 1.7.1
>
> Time Spent: 40m
> Remaining Estimate: 0h


