[ 
https://issues.apache.org/jira/browse/GOBBLIN-1034?focusedWorklogId=379054&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-379054
 ]

ASF GitHub Bot logged work on GOBBLIN-1034:
-------------------------------------------

                Author: ASF GitHub Bot
            Created on: 29/Jan/20 22:20
            Start Date: 29/Jan/20 22:20
    Worklog Time Spent: 10m 
      Work Description: sv2000 commented on pull request #2876: GOBBLIN-1034: Ensure underlying writers are expired from the Partitio…
URL: https://github.com/apache/incubator-gobblin/pull/2876#discussion_r372663165
 
 

 ##########
 File path: gobblin-core/src/main/java/org/apache/gobblin/writer/PartitionedDataWriter.java
 ##########
 @@ -99,13 +118,32 @@ public PartitionedDataWriter(DataWriterBuilder<S, D> builder, final State state)
     if(builder.schema != null) {
       this.state.setProp(WRITER_LATEST_SCHEMA, builder.getSchema());
     }
-    this.partitionWriters = CacheBuilder.newBuilder().build(new CacheLoader<GenericRecord, DataWriter<D>>() {
+    Long cacheExpiryInterval = this.state.getPropAsLong(PARTITIONED_WRITER_CACHE_TTL_SECONDS, DEFAULT_PARTITIONED_WRITER_CACHE_TTL_SECONDS);
+
+    this.partitionWriters = CacheBuilder.newBuilder()
+        .expireAfterAccess(cacheExpiryInterval, TimeUnit.SECONDS)
+        .removalListener(new RemovalListener<GenericRecord, DataWriter<D>>() {
+      @Override
+      public void onRemoval(RemovalNotification<GenericRecord, DataWriter<D>> notification) {
+        synchronized (PartitionedDataWriter.this) {
+          if (notification.getValue() != null) {
+            try {
+              DataWriter<D> writer = notification.getValue();
+              totalRecordsFromEvictedWriters += writer.recordsWritten();
+              totalBytesFromEvictedWriters += writer.bytesWritten();
+              writer.close();
+            } catch (IOException e) {
+              log.error("Exception {} encountered when closing data writer on cache eviction", e);
 
 Review comment:
   Hmm. I was not sure whether close exceptions should be fatal. I am now 
propagating the exception per the suggestion, but I am curious whether there 
is a strong reason to treat a failed close as fatal.
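
To make the trade-off in this comment concrete, here is a minimal plain-Java sketch of the two options: count and ignore a close failure, or record it and rethrow it later from the write path. The names `EvictionCloser`, `onEvicted`, `throwIfFailed` and the `fatal` switch are hypothetical and are not taken from the pull request. Deferring the rethrow is an assumption made here because an eviction callback generally cannot throw a checked `IOException` directly.

```java
import java.io.Closeable;
import java.io.IOException;

public class EvictionCloseSketch {

  /** Hypothetical helper: closes evicted writers and tracks close failures. */
  static class EvictionCloser {
    private final boolean fatal;       // hypothetical switch, not a Gobblin property
    private IOException firstFailure;  // recorded only when fatal == true
    private int suppressedFailures;    // counted only when fatal == false

    EvictionCloser(boolean fatal) {
      this.fatal = fatal;
    }

    /** Intended to be called from the eviction callback. */
    synchronized void onEvicted(Closeable writer) {
      try {
        writer.close();
      } catch (IOException e) {
        if (!fatal) {
          suppressedFailures++;          // non-fatal: count it and carry on
        } else if (firstFailure == null) {
          firstFailure = e;              // fatal: remember the first failure
        } else {
          firstFailure.addSuppressed(e); // keep later failures attached to it
        }
      }
    }

    /** Intended to be called from the write path; rethrows a recorded failure. */
    synchronized void throwIfFailed() throws IOException {
      if (firstFailure != null) {
        throw new IOException("Writer close failed during cache eviction", firstFailure);
      }
    }

    synchronized int suppressedFailures() {
      return suppressedFailures;
    }
  }

  public static void main(String[] args) throws IOException {
    Closeable broken = () -> { throw new IOException("simulated close failure"); };

    EvictionCloser lenient = new EvictionCloser(false);
    lenient.onEvicted(broken);
    lenient.throwIfFailed(); // does not throw: the failure was only counted
    System.out.println("lenient suppressed=" + lenient.suppressedFailures());

    EvictionCloser strict = new EvictionCloser(true);
    strict.onEvicted(broken);
    try {
      strict.throwIfFailed();
    } catch (IOException e) {
      System.out.println("strict rethrew: " + e.getCause().getMessage());
    }
  }
}
```

A failed close can mean that buffered records were never flushed, which is one possible argument for the strict variant. The lenient variant keeps a long-running job alive at the risk of silently losing that data.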
 
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
-------------------

    Worklog Id:     (was: 379054)
    Time Spent: 40m  (was: 0.5h)

> Ensure underlying writers are expired from the PartitionedDataWriter cache to 
> avoid accumulation of writers for long running Gobblin jobs
> -----------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: GOBBLIN-1034
>                 URL: https://issues.apache.org/jira/browse/GOBBLIN-1034
>             Project: Apache Gobblin
>          Issue Type: Improvement
>          Components: gobblin-core
>    Affects Versions: 0.15.0
>            Reporter: Sudarshan Vasudevan
>            Assignee: Abhishek Tiwari
>            Priority: Major
>             Fix For: 0.15.0
>
>          Time Spent: 40m
>  Remaining Estimate: 0h
>
> Currently, the underlying writers are never evicted from the 
> PartitionedDataWriter cache. For long-running Gobblin jobs (e.g. streaming), 
> this causes a memory leak, particularly if the underlying writers maintain 
> state. 
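
The fix described above relies on expiring writers that have not been accessed for a configured interval and closing them when they are evicted. The pull request does this with Guava's `CacheBuilder.expireAfterAccess` and a removal listener. The block below is only a plain-Java stand-in for that idea, written without Guava so that it is self-contained. The names `ExpiringCache`, `evictIdle` and the injected clock are hypothetical.

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;
import java.util.function.Function;
import java.util.function.LongSupplier;

public class ExpiringWriterCacheSketch {

  /** Hypothetical access-expiring cache; a stand-in for the Guava cache in the PR. */
  static class ExpiringCache<K, W extends Closeable> {

    private static final class Entry<W> {
      final W writer;
      long lastAccess;

      Entry(W writer, long lastAccess) {
        this.writer = writer;
        this.lastAccess = lastAccess;
      }
    }

    private final Map<K, Entry<W>> entries = new HashMap<>();
    private final Function<K, W> loader;
    private final long ttl;           // same unit as the clock
    private final LongSupplier clock; // injected so the example is deterministic

    ExpiringCache(Function<K, W> loader, long ttl, LongSupplier clock) {
      this.loader = loader;
      this.ttl = ttl;
      this.clock = clock;
    }

    /** Returns the writer for the key, creating it on first use, and refreshes its access time. */
    synchronized W get(K key) {
      long now = clock.getAsLong();
      Entry<W> entry = entries.get(key);
      if (entry == null) {
        entry = new Entry<>(loader.apply(key), now);
        entries.put(key, entry);
      }
      entry.lastAccess = now;
      return entry.writer;
    }

    /** Removes and closes every writer idle for at least ttl; returns how many were evicted. */
    synchronized int evictIdle() throws IOException {
      long now = clock.getAsLong();
      int evicted = 0;
      Iterator<Entry<W>> it = entries.values().iterator();
      while (it.hasNext()) {
        Entry<W> entry = it.next();
        if (now - entry.lastAccess >= ttl) {
          it.remove();
          entry.writer.close();
          evicted++;
        }
      }
      return evicted;
    }

    synchronized int size() {
      return entries.size();
    }
  }

  public static void main(String[] args) throws IOException {
    long[] now = {0};
    ExpiringCache<String, Closeable> cache = new ExpiringCache<String, Closeable>(
        key -> () -> System.out.println("closed writer for " + key), 10, () -> now[0]);

    cache.get("partition-a"); // accessed at t=0
    now[0] = 5;
    cache.get("partition-b"); // accessed at t=5
    now[0] = 12;
    int evicted = cache.evictIdle(); // only partition-a has been idle for >= 10
    System.out.println("evicted=" + evicted + ", remaining=" + cache.size());
  }
}
```

Without the eviction step, every partition seen over the lifetime of a streaming job would keep its writer in the map, which is the accumulation this issue describes.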



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
