[ https://issues.apache.org/jira/browse/HADOOP-12671?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15069467#comment-15069467 ]
Thomas Demoor commented on HADOOP-12671:
----------------------------------------

Good catches, [~tianyin]!

From inspecting the history (https://github.com/apache/hadoop/commits/trunk/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/Constants.java), most of these seem to have been in since the beginning, and subsequent editors (including myself :() didn't spot the inconsistencies. Thanks for bringing these up.

Milliseconds indeed (see https://github.com/aws/aws-sdk-java/blob/1.10.6/aws-java-sdk-core/src/main/java/com/amazonaws/ClientConfiguration.java#L140)

Some remarks:
* I feel we should set {{fs.s3a.connection.establish.timeout}} and {{fs.s3a.connection.timeout}} to the same value (20k or 50k).
* I would do the change for {{fs.s3a.multipart.purge.age}} the other way around: set it to 86400 in the code. This setting aborts all ongoing multipart uploads older than the configured value as a means of "garbage collection". For people with slow connections 4h is too short imho.
* FYI: The s3 documentation (https://github.com/apache/hadoop/blob/trunk/hadoop-tools/hadoop-aws/src/site/markdown/tools/hadoop-aws/index.md) also holds these values, so it needs to be kept in sync (state in 3 different places -> trouble :P)

> Inconsistent configuration values and incorrect comments
> --------------------------------------------------------
>
>                 Key: HADOOP-12671
>                 URL: https://issues.apache.org/jira/browse/HADOOP-12671
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: conf, documentation, fs/s3
>    Affects Versions: 2.7.1, 2.6.2
>            Reporter: Tianyin Xu
>            Assignee: Tianyin Xu
>         Attachments: HADOOP-12671.000.patch
>
>
> The following values in [core-default.xml | https://hadoop.apache.org/docs/r2.7.1/hadoop-project-dist/hadoop-common/core-default.xml] are wrong.
> {{fs.s3a.multipart.purge.age}}
> {{fs.s3a.connection.timeout}}
> {{fs.s3a.connection.establish.timeout}}
> \\
> \\
> *1. {{fs.s3a.multipart.purge.age}}*
> (in both {{2.6.2}} and {{2.7.1}})
> In [core-default.xml | https://hadoop.apache.org/docs/r2.7.1/hadoop-project-dist/hadoop-common/core-default.xml], the value is {{86400}} ({{24}} hours), while in the code it is {{14400}} ({{4}} hours).
> \\
> \\
> *2. {{fs.s3a.connection.timeout}}*
> (only appears in {{2.6.2}})
> In [core-default.xml (2.6.2) | https://hadoop.apache.org/docs/r2.6.2/hadoop-project-dist/hadoop-common/core-default.xml], the value is {{5000}}, while in the code it is {{50000}}.
> {code}
>   // seconds until we give up on a connection to s3
>   public static final String SOCKET_TIMEOUT = "fs.s3a.connection.timeout";
>   public static final int DEFAULT_SOCKET_TIMEOUT = 50000;
> {code}
> \\
> *3. {{fs.s3a.connection.establish.timeout}}*
> (only appears in {{2.7.1}})
> In [core-default.xml (2.7.1) | https://hadoop.apache.org/docs/r2.7.1/hadoop-project-dist/hadoop-common/core-default.xml], the value is {{5000}}, while in the code it is {{50000}}.
> {code}
>   // seconds until we give up trying to establish a connection to s3
>   public static final String ESTABLISH_TIMEOUT = "fs.s3a.connection.establish.timeout";
>   public static final int DEFAULT_ESTABLISH_TIMEOUT = 50000;
> {code}
> \\
> btw, the code comments are wrong! The two parameters are in units of *milliseconds*, not *seconds*...
> {code}
> -  // seconds until we give up on a connection to s3
> +  // milliseconds until we give up on a connection to s3
> ...
> -  // seconds until we give up trying to establish a connection to s3
> +  // milliseconds until we give up trying to establish a connection to s3
> {code}

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
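
For anyone double-checking the units discussed in this thread, here is a minimal sketch that converts the quoted values into human-readable durations. The class {{S3aDefaultUnits}} and its method and field names are hypothetical and not part of hadoop-aws; only the numeric values come from the issue. It assumes, as stated above, that the two timeouts are in milliseconds and the purge age is in seconds.

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical helper (not part of hadoop-aws) that makes the units of the quoted values explicit. */
final class S3aDefaultUnits {
    // Values quoted in the issue. Both timeouts are milliseconds; the purge age is seconds.
    static final long CODE_TIMEOUT_MS = 50000;        // value in Constants.java
    static final long XML_TIMEOUT_MS = 5000;          // value in core-default.xml
    static final long CODE_PURGE_AGE_SECONDS = 14400; // value in Constants.java
    static final long XML_PURGE_AGE_SECONDS = 86400;  // value in core-default.xml

    /** Converts a millisecond timeout to whole seconds. */
    static long millisToSeconds(long millis) {
        return TimeUnit.MILLISECONDS.toSeconds(millis);
    }

    /** Converts a purge age in seconds to whole hours. */
    static long secondsToHours(long seconds) {
        return TimeUnit.SECONDS.toHours(seconds);
    }

    public static void main(String[] args) {
        System.out.println("timeout (code): " + millisToSeconds(CODE_TIMEOUT_MS) + " s");
        System.out.println("timeout (xml):  " + millisToSeconds(XML_TIMEOUT_MS) + " s");
        System.out.println("purge age (code): " + secondsToHours(CODE_PURGE_AGE_SECONDS) + " h");
        System.out.println("purge age (xml):  " + secondsToHours(XML_PURGE_AGE_SECONDS) + " h");
    }
}
```

Read as seconds, a value of 50000 would mean a timeout of almost 14 hours, which is why the comment fix matters independently of which default is eventually chosen.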