Hi Max,

the cluster isn't running inside a VPC. It's a web crawl which is then
published as a public data set at s3://commoncrawl/. No need for a VPC, since
all the data is open.

But thanks for the pointer, a VPC with an S3 endpoint could be an option
for the future.
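
In case we do move to a VPC later, the gateway endpoint for S3 can apparently
be created with a single AWS CLI call, roughly along these lines (the VPC and
route table IDs are placeholders):

  # sketch: attach a gateway VPC endpoint for S3 to the VPC's route table
  aws ec2 create-vpc-endpoint \
      --vpc-id vpc-xxxxxxxx \
      --service-name com.amazonaws.us-east-1.s3 \
      --route-table-ids rtb-xxxxxxxx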

> Any new policies put in place for your S3 bucket
> as others have mentioned something about throttling?

No, the policies have been unchanged for several months,
since long before the problems appeared. And no throttling
is configured for that bucket.

The only "throttling" I've observed during the last week is a period of low
bandwidth (120 kbit/s) between nodes for about 20 seconds; only under higher
demand does the bandwidth increase:
  https://forums.aws.amazon.com/thread.jspa?threadID=237530
Hope to get an answer from AWS about this phenomenon.

Whether this also applies to the connections between
cluster nodes and the S3 front-end servers, I don't know.
It could be related, of course.
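
For reference, the inter-node numbers come from plain iperf runs between two
instances, roughly like the following (20 seconds to catch the slow ramp-up;
the hostname is only an example):

  # on the receiving node
  iperf -s
  # on the sending node: run for 20 seconds, report every second
  iperf -c ip-10-91-235-121 -t 20 -i 1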

Thanks,
Sebastian


On 08/20/2016 04:35 PM, max scalf wrote:
> Just out of curiosity, have you enabled an S3 endpoint for this? Hopefully you
> are running this cluster inside a VPC; if so, an endpoint would help, as the S3
> traffic will not go out to the Internet...
> 
> Any new policies put in place for your S3 bucket as others have mentioned
> something about throttling?
> 
> On Wed, Aug 17, 2016, 3:22 PM Sebastian Nagel <wastl.na...@googlemail.com> wrote:
> 
>     Hi Dheeren, hi Chris,
> 
> 
>     >> Are you able to share a bit more about your deployment architecture?
>     >> Are these EC2 VMs?  If so, are they co-located in the same AWS region
>     >> as the S3 bucket?
> 
>     Running a cluster of 100 m1.xlarge EC2 instances with Ubuntu 14.04 (ami-41a20f2a).
>     The cluster is running in a single availability zone (us-east-1d), the S3 bucket
>     is in the same region (us-east-1).
> 
>     % lsb_release -d
>     Description:    Ubuntu 14.04.3 LTS
> 
>     % uname -a
>     Linux ip-10-91-235-121 3.13.0-61-generic #100-Ubuntu SMP Wed Jul 29 11:21:34 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux
> 
>     > Did you change the Java JDK version as well, as part of the upgrade?
> 
>     Java is taken as provided by Ubuntu:
> 
>     % java -version
>     java version "1.7.0_111"
>     OpenJDK Runtime Environment (IcedTea 2.6.7) (7u111-2.6.7-0ubuntu0.14.04.3)
>     OpenJDK 64-Bit Server VM (build 24.111-b01, mixed mode)
> 
>     Cloudera CDH is installed from
>       http://archive.cloudera.com/cdh5/one-click-install/trusty/amd64/cdh5-repository_1.0_all.deb
> 
>     After the jobs are done, the cluster is shut down and bootstrapped (bash + cloud-init)
>     anew on demand. A new launch of the cluster may, of course, include updates of
>      - the underlying Amazon machine image
>      - Ubuntu packages
>      - Cloudera packages
> 
>     And the real reason for the problem may come from any of these changes.
>     The update to Cloudera CDH 5.8.0 was just the most obvious one, since that is
>     when the problems appeared (first seen 2016-08-01).
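> 
>     To rule out package updates as the cause, one option would be to pin the
>     bootstrap to a known-good CDH release instead of always taking the latest
>     packages, roughly like this (the package name and version string are only
>     illustrative):
> 
>       # sketch: install the Cloudera repository package, then a fixed CDH version
>       wget http://archive.cloudera.com/cdh5/one-click-install/trusty/amd64/cdh5-repository_1.0_all.deb
>       sudo dpkg -i cdh5-repository_1.0_all.deb
>       sudo apt-get update
>       sudo apt-get install -y hadoop-yarn-nodemanager=<known-good-cdh5.7.x-version>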
> 
>     >> If the cluster is not running in EC2 (e.g. on-premises physical hardware),
>     >> then are there any notable differences on nodes that experienced this
>     >> problem (e.g. smaller capacity on the outbound NIC)?
> 
>     Probably not, although I cannot exclude it. Over the last few days I've run into
>     problems which could be related: a few tasks are slow, and some even seem to hang,
>     e.g., reducers during the copy phase. But that also looks more like a Hadoop
>     (configuration) problem. Network throughput between nodes measured with iperf is
>     not super-performant but generally ok (5-20 Mbit/s).
> 
>     >> This is just a theory, but if your bandwidth to the S3 service is
>     >> intermittently saturated or throttled or somehow compromised, then I could
>     >> see how longer timeouts and more retries might increase overall job time.
>     >> With the shorter settings, it might cause individual task attempts to fail
>     >> sooner.  Then, if the next attempt gets scheduled to a different node with
>     >> better bandwidth to S3, it would start making progress faster in the second
>     >> attempt.  Then, the effect on overall job execution might be faster.
> 
>     That's also my assumption. When connecting to S3, a server is selected which is
>     fast at that moment. While copying 1 GB, which takes a couple of minutes simply
>     because of the general network throughput, that server may become more loaded.
>     On reconnecting, a better server is chosen.
> 
>     Btw., tasks are not failing when a moderate timeout is chosen - 30 sec. is ok;
>     with lower values (a few seconds) the file uploads frequently fail.
> 
>     I've seen this behavior with a simple distcp from S3: with the default values,
>     it took 1 day to copy 300 GB from S3 to HDFS. After choosing a shorter timeout
>     the job finished within 5 hours.
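> 
>     For a one-off job like that distcp, the two settings can also be passed on the
>     command line instead of the cluster-wide configuration; the paths below are
>     only placeholders:
> 
>       hadoop distcp \
>           -Dfs.s3a.connection.timeout=30000 \
>           -Dfs.s3a.attempts.maximum=5 \
>           s3a://commoncrawl/some/path/ hdfs:///some/target/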
> 
>     Thanks,
>     Sebastian
> 
>     On 08/16/2016 09:11 PM, Dheeren Bebortha wrote:
>     > Did you change the Java JDK version as well, as part of the upgrade?
>     > Dheeren
>     >
>     >> On Aug 16, 2016, at 11:59 AM, Chris Nauroth <cnaur...@hortonworks.com> wrote:
>     >>
>     >> Hello Sebastian,
>     >>
>     >> This is an interesting finding.  Thank you for reporting it.
>     >>
>     >> Are you able to share a bit more about your deployment architecture?  Are
>     >> these EC2 VMs?  If so, are they co-located in the same AWS region as the S3
>     >> bucket?  If the cluster is not running in EC2 (e.g. on-premises physical
>     >> hardware), then are there any notable differences on nodes that experienced
>     >> this problem (e.g. smaller capacity on the outbound NIC)?
>     >>
>     >> This is just a theory, but if your bandwidth to the S3 service is
>     >> intermittently saturated or throttled or somehow compromised, then I could
>     >> see how longer timeouts and more retries might increase overall job time.
>     >> With the shorter settings, it might cause individual task attempts to fail
>     >> sooner.  Then, if the next attempt gets scheduled to a different node with
>     >> better bandwidth to S3, it would start making progress faster in the second
>     >> attempt.  Then, the effect on overall job execution might be faster.
>     >>
>     >> --Chris Nauroth
>     >>
>     >> On 8/7/16, 12:12 PM, "Sebastian Nagel" <wastl.na...@googlemail.com> wrote:
>     >>
>     >>    Hi,
>     >>
>     >>    recently, after upgrading to CDH 5.8.0, I've run into a performance
>     >>    issue when reading data from AWS S3 (via s3a).
>     >>
>     >>    A job [1] reads tens of thousands of files ("objects") from S3 and writes
>     >>    extracted data back to S3. Every file/object is about 1 GB in size; processing
>     >>    is CPU-intensive and takes a couple of minutes per file/object. Each
>     >>    file/object is processed by one task using FilenameInputFormat.
>     >>
>     >>    After the upgrade to CDH 5.8.0, the job showed slow progress, 5-6
>     >>    times slower overall than in previous runs. A significant number
>     >>    of tasks hung without progress for up to one hour. These tasks were
>     >>    dominating, and most nodes in the cluster showed little or no CPU
>     >>    utilization. Tasks are not killed/restarted because the task timeout
>     >>    is set to a very large value (because S3 is known to be slow
>     >>    sometimes). Attaching to a couple of the hung tasks with jstack
>     >>    showed that these tasks hang while reading from S3 [3].
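>     >>
>     >>    To check where a task was stuck, jstack was attached on the node roughly
>     >>    as follows (the grep pattern and pid are only examples):
>     >>
>     >>      # find the pid of the hung task JVM on the node
>     >>      jps -ml | grep YarnChild
>     >>      # dump its stack traces
>     >>      jstack -l <pid>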
>     >>
>     >>    The problem was finally fixed by setting
>     >>      fs.s3a.connection.timeout = 30000  (default: 200000 ms)
>     >>      fs.s3a.attempts.maximum = 5        (default: 20)
>     >>    Tasks now take 20 min in the worst case; the majority finish within minutes.
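>     >>
>     >>    In core-site.xml the two overrides would look roughly like this:
>     >>
>     >>      <property>
>     >>        <name>fs.s3a.connection.timeout</name>
>     >>        <value>30000</value>
>     >>      </property>
>     >>      <property>
>     >>        <name>fs.s3a.attempts.maximum</name>
>     >>        <value>5</value>
>     >>      </property>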
>     >>
>     >>    Is this the correct way to fix the problem?
>     >>    These settings have been increased recently in HADOOP-12346 [2].
>     >>    What could be the drawbacks of a lower timeout?
>     >>
>     >>    Thanks,
>     >>    Sebastian
>     >>
>     >>    [1] https://github.com/commoncrawl/ia-hadoop-tools/blob/master/src/main/java/org/archive/hadoop/jobs/WEATGenerator.java
>     >>
>     >>    [2] https://issues.apache.org/jira/browse/HADOOP-12346
>     >>
>     >>    [3] "main" prio=10 tid=0x00007fad64013000 nid=0x4ab5 runnable [0x00007fad6b274000]
>     >>       java.lang.Thread.State: RUNNABLE
>     >>            at java.net.SocketInputStream.socketRead0(Native Method)
>     >>            at java.net.SocketInputStream.read(SocketInputStream.java:152)
>     >>            at java.net.SocketInputStream.read(SocketInputStream.java:122)
>     >>            at com.cloudera.org.apache.http.impl.io.AbstractSessionInputBuffer.read(AbstractSessionInputBuffer.java:204)
>     >>            at com.cloudera.org.apache.http.impl.io.ContentLengthInputStream.read(ContentLengthInputStream.java:182)
>     >>            at com.cloudera.org.apache.http.conn.EofSensorInputStream.read(EofSensorInputStream.java:138)
>     >>            at com.cloudera.com.amazonaws.internal.SdkFilterInputStream.read(SdkFilterInputStream.java:72)
>     >>            at com.cloudera.com.amazonaws.event.ProgressInputStream.read(ProgressInputStream.java:151)
>     >>            at com.cloudera.com.amazonaws.internal.SdkFilterInputStream.read(SdkFilterInputStream.java:72)
>     >>            at com.cloudera.com.amazonaws.internal.SdkFilterInputStream.read(SdkFilterInputStream.java:72)
>     >>            at com.cloudera.com.amazonaws.internal.SdkFilterInputStream.read(SdkFilterInputStream.java:72)
>     >>            at com.cloudera.com.amazonaws.event.ProgressInputStream.read(ProgressInputStream.java:151)
>     >>            at com.cloudera.com.amazonaws.internal.SdkFilterInputStream.read(SdkFilterInputStream.java:72)
>     >>            at com.cloudera.com.amazonaws.util.LengthCheckInputStream.read(LengthCheckInputStream.java:108)
>     >>            at com.cloudera.com.amazonaws.internal.SdkFilterInputStream.read(SdkFilterInputStream.java:72)
>     >>            at org.apache.hadoop.fs.s3a.S3AInputStream.read(S3AInputStream.java:160)
>     >>            - locked <0x00000007765604f8> (a org.apache.hadoop.fs.s3a.S3AInputStream)
>     >>            at java.io.DataInputStream.read(DataInputStream.java:149)
>     >>            ...
>     >>
>     >
> 
> 

