[
https://issues.apache.org/jira/browse/MAPREDUCE-7465?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17801918#comment-17801918
]
Steve Loughran commented on MAPREDUCE-7465:
-------------------------------------------
bq. I understand that developers are reluctant to integrate the PR, but it
does solve my problem correctly. I would appreciate it being disabled by
default but configurable with a property, so I can enable it and still use the
official release without applying a local patch to my jars.
I really, really, really don't want to do this, as (a) it's a critical piece
of code and (b) we'd be committing to maintaining it.
bq. There is no problem of "controlling the throttling" in the Hadoop Azure
ABFS code... it is already very bad, using the legacy class
java.net.HttpURLConnection and establishing very slow HTTPS connections one by
one without keeping TCP sockets alive.
bq. We do have throttling, however: not from a single JVM, but because we have
so many (>= 1000) Spark applications running concurrently, and so many useless
"prefetching threads"! Trying to control throttling from a single JVM is in my
opinion useless. Azure ABFS can support 20 million operations per hour (and
per storage account), and Microsoft Azure was even able to increase that
further.
bq. Trying to control throttling from a single JVM is in my opinion useless.
We see scale issues in job commits from renames... if you aren't seeing them,
it's because of our attempts to handle this (HADOOP-18002 and related).
ABFS also supports 100-continue, self-throttling and other mechanisms for
cross-JVM throttling. That's better than the S3A code by a large margin, where
even large directory deletions can blow up and make a mess of retries within
the AWS SDK itself.
It's precisely because of rename scale issues during job commit that the
manifest committer was written; it is used in production ABFS deployments.
If you are targeting ABFS storage, it's the way to get robust job commit even
on heavily loaded stores, and it is shipping in current releases. If you are
having problems getting it working, I'm happy to help you get set up.
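
For reference, binding abfs:// output to the manifest committer is a
per-scheme factory setting. The following is a minimal sketch, assuming the
manifest committer classes (hadoop-mapreduce-client-core) are on the
classpath; Spark jobs additionally need the committer bindings from the
spark-hadoop-cloud module.
{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;

/**
 * Minimal sketch: route committer creation for abfs:// destinations through
 * the manifest committer factory via the per-scheme factory property.
 */
public class EnableManifestCommitter {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // Bind the abfs:// scheme to the manifest committer factory.
    conf.set("mapreduce.outputcommitter.factory.scheme.abfs",
        "org.apache.hadoop.mapreduce.lib.output.committer.manifest."
            + "ManifestCommitterFactory");
    Job job = Job.getInstance(conf, "manifest-committer-demo");
    // ... set input/output formats and paths, then submit as usual.
  }
}
{code}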
> performance problem in FileOutputCommitter for a big list processed by a
> single thread
> ----------------------------------------------------------------------------------
>
> Key: MAPREDUCE-7465
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7465
> Project: Hadoop Map/Reduce
> Issue Type: Improvement
> Components: performance
> Affects Versions: 3.2.3, 3.3.2, 3.2.4, 3.3.5, 3.3.3, 3.3.4, 3.3.6
> Reporter: Arnaud Nauwynck
> Priority: Minor
> Labels: pull-request-available
>
> When committing a big Hadoop job (for example via Spark) with many
> partitions, the class FileOutputCommitter processes thousands of dirs/files
> to rename with a single thread. This is a performance issue, caused by many
> waits on FileSystem storage operations.
> I propose that above a configurable threshold (default=3, set via the
> property 'mapreduce.fileoutputcommitter.parallel.threshold'), the class
> FileOutputCommitter processes the list of files to rename using parallel
> threads, via the default JVM ExecutorService (ForkJoinPool.commonPool()).
> See Pull-Request:
> [https://github.com/apache/hadoop/pull/6378|https://github.com/apache/hadoop/pull/6378]
> Notice that sub-class instances of FileOutputCommitter are supposed to be
> created at runtime depending on a configurable property
> ([PathOutputCommitterFactory.java|https://github.com/apache/hadoop/blob/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/lib/output/PathOutputCommitterFactory.java]).
> But for example with Parquet + Spark, this is buggy and cannot be changed at
> runtime.
> There is an ongoing Jira and PR to fix it in Parquet + Spark:
> [https://issues.apache.org/jira/browse/PARQUET-2416|https://issues.apache.org/jira/browse/PARQUET-2416]
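
For readers of this thread, the approach the quoted description proposes
amounts roughly to the following. This is an illustrative sketch only, not the
PR's actual code; the class and method names (ParallelRenameSketch, renameAll)
and the threshold handling are hypothetical.
{code:java}
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.List;

import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

/**
 * Sketch: rename a batch of task outputs into the destination directory,
 * going parallel once the list exceeds a threshold.
 */
public class ParallelRenameSketch {

  static void renameAll(FileSystem fs, List<FileStatus> sources, Path destDir,
      int threshold) throws IOException {
    if (sources.size() <= threshold) {
      // Small lists: keep the existing single-threaded behaviour.
      for (FileStatus st : sources) {
        fs.rename(st.getPath(), new Path(destDir, st.getPath().getName()));
      }
      return;
    }
    // Above the threshold, fan the renames out over the JVM's common
    // ForkJoinPool, which backs parallel streams.
    try {
      sources.parallelStream().forEach(st -> {
        try {
          fs.rename(st.getPath(), new Path(destDir, st.getPath().getName()));
        } catch (IOException e) {
          throw new UncheckedIOException(e);
        }
      });
    } catch (UncheckedIOException e) {
      throw e.getCause();
    }
  }
}
{code}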