[ 
https://issues.apache.org/jira/browse/YARN-5915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15678475#comment-15678475
 ] 

Atul Sikaria edited comment on YARN-5915 at 11/19/16 3:13 AM:
--------------------------------------------------------------

This was seen previously as well, in YARN-4814. 

The issue is with the writeEntities method in FileSystemTimelineWriter 
(https://github.com/apache/hadoop/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/client/api/impl/FileSystemTimelineWriter.java#L317).
 It calls getObjectMapper().writeValue(…), which, with the default config, does 
a flush() after every write.

{noformat}
@Override
public void writeValue(JsonGenerator jgen, Object value)
    throws IOException, JsonGenerationException, JsonMappingException
{
    SerializationConfig config = copySerializationConfig();
    if (config.isEnabled(SerializationConfig.Feature.CLOSE_CLOSEABLE) && (value instanceof Closeable)) {
        _writeCloseableValue(jgen, value, config);
    } else {
        _serializerProvider.serializeValue(config, jgen, value, _serializerFactory);
        if (config.isEnabled(SerializationConfig.Feature.FLUSH_AFTER_WRITE_VALUE)) {
            jgen.flush();
        }
    }
}
{noformat}

On filesystems that map flush() to a no-op or a trivial operation, this is not 
a big deal. But on filesystems where flush() incurs a larger cost, it becomes a 
bottleneck for the flow of timeline events.

The fix is to set the property above (FLUSH_AFTER_WRITE_VALUE) to false, so the 
JsonGenerator does not do a flush after every JSON write.
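
To illustrate the effect of that switch, here is a minimal sketch that uses only the JDK. FlushCountingStream, EventWriter and FlushPerWriteDemo are hypothetical names made up for this example, not Hadoop or Jackson classes; the flushAfterWrite flag only models what FLUSH_AFTER_WRITE_VALUE controls. With Jackson 1.x the real change would presumably be a call along the lines of mapper.configure(SerializationConfig.Feature.FLUSH_AFTER_WRITE_VALUE, false) on the ObjectMapper, but that is not what the code below runs.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

/** Hypothetical stand-in for a filesystem stream on which flush() is costly. */
class FlushCountingStream extends OutputStream {
    private final ByteArrayOutputStream sink = new ByteArrayOutputStream();
    int flushCount = 0;

    @Override
    public void write(int b) {
        sink.write(b);
    }

    @Override
    public void flush() {
        flushCount++; // each call would be a round trip on an expensive filesystem
    }
}

/** Hypothetical model of a serializer with a flush-after-write switch. */
class EventWriter {
    private final OutputStream out;
    private final boolean flushAfterWrite; // models FLUSH_AFTER_WRITE_VALUE

    EventWriter(OutputStream out, boolean flushAfterWrite) {
        this.out = out;
        this.flushAfterWrite = flushAfterWrite;
    }

    void writeValue(String json) throws IOException {
        out.write(json.getBytes(StandardCharsets.UTF_8));
        out.write('\n');
        if (flushAfterWrite) {
            out.flush();
        }
    }
}

class FlushPerWriteDemo {
    /** Writes the given number of events and returns how many flushes reached the stream. */
    static int flushesFor(boolean flushAfterWrite, int events) {
        FlushCountingStream stream = new FlushCountingStream();
        EventWriter writer = new EventWriter(stream, flushAfterWrite);
        try {
            for (int i = 0; i < events; i++) {
                writer.writeValue("{\"event\":" + i + "}");
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return stream.flushCount;
    }

    public static void main(String[] args) {
        System.out.println("flush after every write: " + flushesFor(true, 1000) + " flushes");
        System.out.println("flush disabled:          " + flushesFor(false, 1000) + " flushes");
    }
}
```

With the flag on, every event costs one flush on the underlying stream; with it off, nothing is flushed until some other code does so explicitly.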

The flush of the stream is done in a timer thread at a configurable interval 
(10 seconds by default). As [~jlowe] pointed out in YARN-4814, the timer thread 
also needs to do a flush() on the JsonGenerator, to make sure the JSON 
serializer does not have any buffered data, so that the hflush() in the timer 
thread actually flushes all the data seen so far.
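
The ordering matters, and a small JDK-only sketch can show why. All names here (DurableStream, PeriodicFlusher, flushStreamOnly, flushAll, startTimer) are hypothetical; a BufferedWriter stands in for the JsonGenerator's internal buffer, and DurableStream.hflush() only imitates the idea of making written bytes durable. It is an illustration under those assumptions, not the actual FileSystemTimelineWriter code.

```java
import java.io.BufferedWriter;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

/** Hypothetical stand-in for a stream whose bytes only become durable on hflush(). */
class DurableStream extends ByteArrayOutputStream {
    private int durableBytes = 0;

    synchronized void hflush() {
        durableBytes = size(); // everything that has reached the stream so far
    }

    synchronized int durableBytes() {
        return durableBytes;
    }
}

class PeriodicFlusher {
    private final DurableStream stream;
    private final BufferedWriter generator; // stands in for the JsonGenerator buffer

    PeriodicFlusher(DurableStream stream) {
        this.stream = stream;
        this.generator = new BufferedWriter(
            new OutputStreamWriter(stream, StandardCharsets.UTF_8));
    }

    /** Writes one event without flushing, as with flush-after-write disabled. */
    synchronized void write(String json) {
        try {
            generator.write(json);
            generator.write('\n');
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Incomplete: data still buffered in the generator never reaches the stream. */
    synchronized void flushStreamOnly() {
        stream.hflush();
    }

    /** Drain the generator buffer first, then flush the stream. */
    synchronized void flushAll() {
        try {
            generator.flush();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        stream.hflush();
    }

    /** Runs flushAll() periodically; the caller is responsible for shutting the timer down. */
    ScheduledExecutorService startTimer(long intervalSeconds) {
        ScheduledExecutorService timer = Executors.newSingleThreadScheduledExecutor();
        timer.scheduleAtFixedRate(this::flushAll, intervalSeconds, intervalSeconds,
            TimeUnit.SECONDS);
        return timer;
    }

    public static void main(String[] args) {
        DurableStream stream = new DurableStream();
        PeriodicFlusher flusher = new PeriodicFlusher(stream);
        flusher.write("{\"event\":1}");
        flusher.flushStreamOnly();
        System.out.println("durable after stream-only flush: " + stream.durableBytes());
        flusher.flushAll();
        System.out.println("durable after full flush: " + stream.durableBytes());
    }
}
```

In this model the stream-only flush leaves the event sitting in the writer's buffer, so nothing becomes durable; only flushing the buffer first and then the stream pushes out everything written so far.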


> ATS 1.5 FileSystemTimelineWriter causes flush() to be called after every 
> event write
> ------------------------------------------------------------------------------------
>
>                 Key: YARN-5915
>                 URL: https://issues.apache.org/jira/browse/YARN-5915
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: timelineserver
>    Affects Versions: 3.0.0-alpha1
>            Reporter: Atul Sikaria
>
