Paco Chan created MAPREDUCE-7519:
------------------------------------

             Summary: Loosen TestSplitPlacementTestUpdate since it is not 
guaranteed in implementation
                 Key: MAPREDUCE-7519
                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-7519
             Project: Hadoop Map/Reduce
          Issue Type: Improvement
          Components: mapreduce-client
    Affects Versions: 3.4.1
            Reporter: Paco Chan


The Hadoop documentation for {{CombineFileInputFormat}} indicates that the 
number of paths per split is not fixed and can vary. 

 
{quote}"If a maxSplitSize is specified, then blocks on the same node are 
combined to form a single split. Blocks that are left over are then combined 
with other blocks in the same rack." 
[hadoop.apache.org|https://hadoop.apache.org/docs/stable/api/org/apache/hadoop/mapred/lib/CombineFileInputFormat.html?utm_source=chatgpt.com]
{quote}
This means that the number of paths in a split is determined by the block 
placement and the configuration settings, leading to potential variations in 
the number of paths per split.

 

As a result, the test fails intermittently, depending on how the blocks are 
grouped into splits. The test could be reworked so that it no longer asserts 
an exact number of paths in each split. 
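
One possible shape for a looser check is sketched below. It is only an illustration, not the existing test code: the class {{SplitCoverageCheck}}, the method {{coversExactlyOnce}}, and the sample paths are hypothetical, and each split is modeled as a plain list of path strings rather than a real {{CombineFileSplit}}. It also assumes every file fits in a single block, so each path is expected to appear in exactly one split. The check verifies that all expected paths are covered, without caring how they are grouped.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper: models each split as a list of path strings.
public class SplitCoverageCheck {

    /**
     * Returns true when every expected path appears in exactly one split
     * and no unexpected path appears, regardless of how paths are grouped.
     * Assumes one block per file, so a path should never repeat.
     */
    public static boolean coversExactlyOnce(List<List<String>> splits,
                                            Collection<String> expected) {
        // Flatten all splits into one list of paths.
        List<String> seen = new ArrayList<>();
        for (List<String> split : splits) {
            seen.addAll(split);
        }
        // No duplicates, and the set of paths matches the expected set.
        Set<String> unique = new HashSet<>(seen);
        return seen.size() == unique.size()
            && unique.equals(new HashSet<>(expected));
    }

    public static void main(String[] args) {
        List<String> expected =
            Arrays.asList("/dir/file1", "/dir/file2", "/dir/file3");

        // Two different groupings of the same files.
        List<List<String>> groupingA = Arrays.asList(
            Arrays.asList("/dir/file1", "/dir/file2"),
            Arrays.asList("/dir/file3"));
        List<List<String>> groupingB = Arrays.asList(
            Arrays.asList("/dir/file1"),
            Arrays.asList("/dir/file2", "/dir/file3"));

        System.out.println(coversExactlyOnce(groupingA, expected));
        System.out.println(coversExactlyOnce(groupingB, expected));
    }
}
```

Both groupings pass this check, whereas an assertion on the per-split path count would accept only one of them. A real test would need to extract the paths from the actual split objects and handle multi-block files, which this sketch does not attempt.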



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
