[ 
https://issues.apache.org/jira/browse/ACCUMULO-4090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15069780#comment-15069780
 ] 

Dave Marion commented on ACCUMULO-4090:
---------------------------------------

Looking at a heap dump, I consistently see two objects in the queue for the 
jtimer object: a FailedMutations object and an anonymous TimerTask. I believe 
the following should be done:

 1. When TabletServerBatchWriter (TSBW) close() is called, 
FailedMutations.cancel() should be called as well.
 2. A reference should be kept to the TimerTask added to jtimer in the TSBW 
constructor, so that TSBW.close() can call cancel() on that task.
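
The two steps above could look roughly like the sketch below. It is not the Accumulo source: WriterSketch, latencyTask, and periodMillis are hypothetical names, and the empty run() bodies stand in for whatever the real tasks do. It only shows the pattern of keeping both TimerTask references in fields and cancelling them in close().

```java
import java.util.Timer;
import java.util.TimerTask;

// Hypothetical stand-in for TabletServerBatchWriter; not the Accumulo source.
class WriterSketch implements AutoCloseable {
  private final TimerTask failedMutations; // plays the role of FailedMutations
  private final TimerTask latencyTask;     // the formerly anonymous task, now kept in a field

  WriterSketch(Timer jtimer, long periodMillis) {
    failedMutations = new TimerTask() {
      @Override public void run() { /* periodic work would go here */ }
    };
    latencyTask = new TimerTask() {
      @Override public void run() { /* periodic work would go here */ }
    };
    // Both tasks are queued on the shared timer and repeat until cancelled.
    jtimer.schedule(failedMutations, periodMillis, periodMillis);
    jtimer.schedule(latencyTask, periodMillis, periodMillis);
  }

  @Override
  public void close() {
    // Step 1: cancel the FailedMutations task.
    failedMutations.cancel();
    // Step 2: cancel the task whose reference was saved in the constructor.
    latencyTask.cancel();
  }

  public static void main(String[] args) {
    Timer jtimer = new Timer("jtimer-sketch", true); // daemon timer, as a shared timer might be
    WriterSketch writer = new WriterSketch(jtimer, 60_000L);
    writer.close();
    // purge() reports how many cancelled tasks it dropped from the timer queue.
    System.out.println("cancelled tasks purged: " + jtimer.purge());
    jtimer.cancel();
  }
}
```

A cancelled TimerTask can stay in the Timer's queue until its next scheduled execution time, so calling Timer.purge() after cancelling may release the references sooner. Whether that is needed here is an open question.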

Looking at the TabletServerBatchWriter objects in the heap dump, I see that the 
closed field is always false. I wonder if the root cause is that this field is 
not marked as volatile (the flushing field may have the same issue).
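
To illustrate the volatile question, here is a minimal sketch, assuming the flag is read by a thread that does not hold the lock used by the writer. ClosedFlagSketch and workerSeesClose are hypothetical names, and this does not claim to reproduce how TSBW actually accesses the field.

```java
// Hypothetical illustration of a close flag shared between threads.
class ClosedFlagSketch {
  // volatile guarantees that a write by one thread is visible to later reads
  // in other threads; without it, an unsynchronized reader may never see it.
  private volatile boolean closed = false;

  void close() {
    closed = true;
  }

  boolean isClosed() {
    return closed;
  }

  // Starts a worker that spins until it observes closed == true, then reports
  // whether the worker stopped within five seconds of close() being called.
  static boolean workerSeesClose() {
    ClosedFlagSketch writer = new ClosedFlagSketch();
    Thread worker = new Thread(() -> {
      while (!writer.isClosed()) {
        Thread.yield();
      }
    });
    worker.setDaemon(true);
    worker.start();
    writer.close();
    try {
      worker.join(5000);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return false;
    }
    return !worker.isAlive();
  }

  public static void main(String[] args) {
    System.out.println("worker observed close: " + workerSeesClose());
  }
}
```

If every read and write of the field already happens inside synchronized blocks on the same lock, volatile would add nothing, and the always-false value in the heap dump would point elsewhere, for example to close() never being reached.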

> BatchWriter close not cleaning up all resources
> -----------------------------------------------
>
>                 Key: ACCUMULO-4090
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-4090
>             Project: Accumulo
>          Issue Type: Bug
>          Components: client
>    Affects Versions: 1.7.0
>            Reporter: Eric Newton
>            Assignee: Eric Newton
>
> I'm debugging an issue with a long-running ingestor, similar to the 
> TraceServer.
> After realizing that BatchWriter close needs to be called when a 
> MutationsRejectedException occurs (see ACCUMULO-4088), a close was added, and 
> the client became more stable.
> However, after a day or so, the client became sluggish. When inspecting a 
> heap dump, many TabletServerBatchWriter objects were still referenced. This 
> server should only have two BatchWriter instances at any one time, and this 
> server had >100.
> Still debugging.
> The error that initiates the issue is a SessionID not found, presumably 
> because the session timed out.  This is the cause of the 
> MutationsRejectedException seen by the client.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
