[
https://issues.apache.org/jira/browse/APEXMALHAR-2017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15207881#comment-15207881
]
ASF GitHub Bot commented on APEXMALHAR-2017:
--------------------------------------------
Github user chandnisingh commented on a diff in the pull request:
https://github.com/apache/incubator-apex-malhar/pull/218#discussion_r57111096
--- Diff: library/src/main/java/com/datatorrent/lib/io/fs/AbstractFileOutputOperator.java ---
@@ -1195,6 +1188,24 @@ public void close() throws IOException
}
@Override
+ public void beforeCheckpoint(long l)
+ {
+ try {
+ Map<String, FSFilterStreamContext> openStreams = streamsCache.asMap();
+ for (FSFilterStreamContext streamContext: openStreams.values()) {
+ long start = System.currentTimeMillis();
+ streamContext.finalizeContext();
+ totalWritingTime += System.currentTimeMillis() - start;
+ //streamContext.resetFilter();
--- End diff --
Why is this code commented out?
> Use pre checkpoint notification to optimize operator IO
> -------------------------------------------------------
>
> Key: APEXMALHAR-2017
> URL: https://issues.apache.org/jira/browse/APEXMALHAR-2017
> Project: Apache Apex Malhar
> Issue Type: Improvement
> Reporter: Pramod Immaneni
> Assignee: Pramod Immaneni
>
> Currently, many output operators enforce persistence of data in endWindow by
> calling flush, hflush, or equivalent calls. This was done to help recovery:
> flushing in every window ensures that the data corresponding to the
> checkpointed state is always present at recovery.
> A recent addition to the engine lets operators know about an impending
> checkpoint just before it happens, using a callback. Operators can now enforce
> persistence of data once, in this callback, instead of at the end of every
> window. This results in better performance because data is written to
> persistent storage less frequently.
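
The idea in the issue description above can be illustrated with a minimal, self-contained sketch. It is not the Malhar implementation: `PreCheckpointListener`, `CountingSink`, and `BufferingOutputOperator` are hypothetical stand-ins invented for illustration, and only the callback name `beforeCheckpoint(long)` is taken from the diff. The sketch assumes the engine calls `beforeCheckpoint` once before each checkpoint, so the operator can skip the flush in `endWindow`.

```java
import java.util.ArrayList;
import java.util.List;

public class CheckpointFlushSketch {

  /** Hypothetical stand-in for the engine's pre-checkpoint callback interface. */
  interface PreCheckpointListener {
    void beforeCheckpoint(long windowId);
  }

  /** Hypothetical stand-in for durable storage; it only counts flush calls. */
  static class CountingSink {
    final List<String> persisted = new ArrayList<>();
    int flushCount = 0;

    void flush(List<String> buffered) {
      persisted.addAll(buffered);
      flushCount++;
    }
  }

  /** Hypothetical operator that buffers tuples and persists them only before a checkpoint. */
  static class BufferingOutputOperator implements PreCheckpointListener {
    private final CountingSink sink;
    private final List<String> buffer = new ArrayList<>();

    BufferingOutputOperator(CountingSink sink) {
      this.sink = sink;
    }

    void process(String tuple) {
      buffer.add(tuple);
    }

    void endWindow() {
      // Intentionally no flush here: persistence is deferred to beforeCheckpoint.
    }

    @Override
    public void beforeCheckpoint(long windowId) {
      // Persist everything buffered since the last checkpoint in a single flush.
      if (!buffer.isEmpty()) {
        sink.flush(buffer);
        buffer.clear();
      }
    }
  }

  public static void main(String[] args) {
    CountingSink sink = new CountingSink();
    BufferingOutputOperator op = new BufferingOutputOperator(sink);
    // Three streaming windows pass before the engine announces a checkpoint.
    for (long window = 1; window <= 3; window++) {
      op.process("tuple-" + window);
      op.endWindow();
    }
    op.beforeCheckpoint(3);
    System.out.println(sink.flushCount + " flush, " + sink.persisted.size() + " tuples persisted");
  }
}
```

In this sketch three windows produce a single flush rather than three. Data written after the last checkpoint may still be unflushed at the time of a failure, which is acceptable under the assumption that recovery restarts from the checkpointed state and replays the later windows.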
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)