It should be safe to apply this setting to all machine sizes. This setting is mostly to work around S3 connector timeout failures that look like the one below.
The default value is too low to reliably run single-user queries.

I1227 19:29:41.471863 1490 AmazonHttpClient.java:496] Unable to execute HTTP request: Timeout waiting for connection from pool
Java exception follows:
com.cloudera.org.apache.http.conn.ConnectionPoolTimeoutException: Timeout waiting for connection from pool
    at com.cloudera.org.apache.http.impl.conn.PoolingClientConnectionManager.leaseConnection(PoolingClientConnectionManager.java:232)
    at com.cloudera.org.apache.http.impl.conn.PoolingClientConnectionManager$1.getConnection(PoolingClientConnectionManager.java:199)
    at sun.reflect.GeneratedMethodAccessor13.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at com.cloudera.com.amazonaws.http.conn.ClientConnectionRequestFactory$Handler.invoke(ClientConnectionRequestFactory.java:70)
    at com.cloudera.com.amazonaws.http.conn.$Proxy21.getConnection(Unknown Source)
    at com.cloudera.org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:456)
    at com.cloudera.org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:906)
    at com.cloudera.org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:805)
    at com.cloudera.com.amazonaws.http.AmazonHttpClient.executeOneRequest(AmazonHttpClient.java:728)
    at com.cloudera.com.amazonaws.http.AmazonHttpClient.executeHelper(AmazonHttpClient.java:489)
    at com.cloudera.com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:310)
    at com.cloudera.com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:3785)
    at com.cloudera.com.amazonaws.services.s3.AmazonS3Client.getObjectMetadata(AmazonS3Client.java:1050)
    at com.cloudera.com.amazonaws.services.s3.AmazonS3Client.getObjectMetadata(AmazonS3Client.java:1027)
    at org.apache.hadoop.fs.s3a.S3AFileSystem.getFileStatus(S3AFileSystem.java:913)
    at org.apache.hadoop.fs.s3a.S3AFileSystem.open(S3AFileSystem.java:394)

On Wed, Nov 8, 2017 at 9:12 AM, Jim Apple <jbap...@cloudera.com> wrote:
> http://impala.apache.org/docs/build/html/topics/impala_s3.html
> recommends "Set the safety valve fs.s3a.connection.maximum to 1500 for
> impalad." For best performance, should this be increased for nodes
> with very high CPU, RAM, or bandwidth? Or decreased for less-beefy
> nodes?
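
For reference, the setting under discussion might look like the following when written as a Hadoop configuration property. This is only a sketch: it assumes the safety valve for impalad is a core-site.xml snippet, which the thread does not spell out. The property name and the value 1500 are the ones quoted from the Impala S3 documentation above.

```xml
<!-- Sketch only: assumes the impalad safety valve accepts core-site.xml properties. -->
<property>
  <!-- Maximum number of simultaneous connections the S3A connector may hold to S3. -->
  <name>fs.s3a.connection.maximum</name>
  <!-- Value recommended in the Impala S3 documentation quoted above. -->
  <value>1500</value>
</property>
```

A larger pool is meant to avoid the "Timeout waiting for connection from pool" failure shown in the log above.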