[ 
https://issues.apache.org/jira/browse/TEZ-4577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17875335#comment-17875335
 ] 

Chenyu Zheng commented on TEZ-4577:
-----------------------------------

[~yigress]

If the maxItems of a new span is 1, the kvmeta of the new span will be very 
small. PipelinedSorter::sort will then be triggered very frequently, which 
makes the job slow. Am I right? If so, I think this needs to be fixed. Do you 
have any plans to fix it?

In addition, I am curious: since the first span size is 16*1024*1024, why does 
maxItems become 1? Could you add some logging to the affected application that 
prints the relevant calls to PipelinedSorter::sort?
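
To make the concern concrete, here is a toy model of the behaviour described 
in the report. It is not the Tez source. It assumes, hypothetically, that each 
new span's maxItems is seeded from the number of records the previous span 
held, and that a span is sorted once it is full. METASIZE = 16 is taken from 
the reserved.metasize=16 value in the sample logs; the class and method names 
are invented for illustration.

```java
// Toy model only: names and the sizing rule are assumptions, not Tez code.
class SpanSizingSketch {
    // Metadata bytes per record; the sample logs show reserved.metasize=16
    // for a span of length 1.
    static final int METASIZE = 16;

    /** Metadata bytes reserved for a span sized for maxItems records. */
    static long metaBytes(int maxItems) {
        return (long) METASIZE * maxItems;
    }

    /**
     * Assumed sizing rule: the next span is sized for as many records as
     * the previous span actually held (never fewer than one).
     */
    static int nextMaxItems(int previousSpanLength) {
        return Math.max(1, previousSpanLength);
    }

    /**
     * Counts how many spans (and therefore sort calls) are needed to
     * absorb the given number of records under the assumed rule.
     */
    static long spansNeeded(long records, int firstMaxItems) {
        long spans = 0;
        long left = records;
        int maxItems = firstMaxItems;
        while (left > 0) {
            long taken = Math.min(left, (long) maxItems);
            left -= taken;
            spans++;
            maxItems = nextMaxItems((int) taken);
        }
        return spans;
    }

    public static void main(String[] args) {
        // Compare a healthy first span with one that has degenerated to 1.
        System.out.println("spans with maxItems=1024: "
                + spansNeeded(1000, 1024));
        System.out.println("spans with maxItems=1:    "
                + spansNeeded(1000, 1));
        System.out.println("meta bytes for maxItems=1: " + metaBytes(1));
    }
}
```

Under this assumed rule a span of length 1 can only produce another span of 
length 1, so the number of sort calls grows linearly with the record count. 
That would match the one-span-per-record pattern in the logs, but the real 
sizing logic should be checked against the logging requested above.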

> SortSpan could be created real small, resulting in eventual job failure
> -----------------------------------------------------------------------
>
>                 Key: TEZ-4577
>                 URL: https://issues.apache.org/jira/browse/TEZ-4577
>             Project: Apache Tez
>          Issue Type: Bug
>    Affects Versions: 0.10.4
>            Reporter: Yi Zhang
>            Priority: Major
>
> We ran into an overflow issue as in TEZ-4542. With TEZ-4542 applied, the job 
> then runs into very small sort spans (one record per span in this case) and 
> eventually fails due to a timeout.
> From the sample logs, it looks like 
>  
> SortSpan(ByteBuffer source, int maxItems, int perItem, RawComparator 
> comparator)
>  
> once it gets into a situation where maxItems=1, it persists with maxItems=1.
>  
> (As a side issue, the logging in this situation becomes huge.)
>  
> sample logs:
> 2024-08-19 19:02:28,157 [INFO] [TezChild] |impl.PipelinedSorter|: scope-302 
> -> scope-308: Span260.length = 1, perItem = 139
> 2024-08-19 19:02:28,157 [INFO] [TezChild] |impl.PipelinedSorter|: scope-302 
> -> scope-308: reserved.remaining()=268396925, reserved.metasize=16
> 2024-08-19 19:02:28,157 [INFO] [TezChild] |impl.PipelinedSorter|: scope-302 
> -> scope-308: New Span261.length = 1, perItem = 139, counter:5307003
> 2024-08-19 19:02:28,157 [INFO] [Sorter {scope_302 -> scope_308} #1|#1] 
> |impl.PipelinedSorter|: scope-302 -> scope-308: done sorting span=260, 
> length=1, time=0
> 2024-08-19 19:02:28,157 [INFO] [TezChild] |impl.PipelinedSorter|: scope-302 
> -> scope-308: Span261.length = 1, perItem = 128
> 2024-08-19 19:02:28,157 [INFO] [TezChild] |impl.PipelinedSorter|: scope-302 
> -> scope-308: reserved.remaining()=268396781, reserved.metasize=16
> 2024-08-19 19:02:28,157 [INFO] [TezChild] |impl.PipelinedSorter|: scope-302 
> -> scope-308: New Span262.length = 1, perItem = 128, counter:5307004
> 2024-08-19 19:02:28,158 [INFO] [Sorter {scope_302 -> scope_308} #0|#0] 
> |impl.PipelinedSorter|: scope-302 -> scope-308: done sorting span=261, 
> length=1, time=0
> 2024-08-19 19:02:28,158 [INFO] [TezChild] |impl.PipelinedSorter|: scope-302 
> -> scope-308: Span262.length = 1, perItem = 145
> 2024-08-19 19:02:28,158 [INFO] [TezChild] |impl.PipelinedSorter|: scope-302 
> -> scope-308: reserved.remaining()=268396620, reserved.metasize=16
> 2024-08-19 19:02:28,158 [INFO] [TezChild] |impl.PipelinedSorter|: scope-302 
> -> scope-308: New Span263.length = 1, perItem = 145, counter:5307005
> 2024-08-19 19:02:28,158 [INFO] [Sorter {scope_302 -> scope_308} #1|#1] 
> |impl.PipelinedSorter|: scope-302 -> scope-308: done sorting span=262, 
> length=1, time=0
> 2024-08-19 19:02:28,158 [INFO] [TezChild] |impl.PipelinedSorter|: scope-302 
> -> scope-308: Span263.length = 1, perItem = 139
> 2024-08-19 19:02:28,158 [INFO] [TezChild] |impl.PipelinedSorter|: scope-302 
> -> scope-308: reserved.remaining()=268396465, reserved.metasize=16
> 2024-08-19 19:02:28,158 [INFO] [TezChild] |impl.PipelinedSorter|: scope-302 
> -> scope-308: New Span264.length = 1, perItem = 139, counter:5307006
> 2024-08-19 19:02:28,158 [INFO] [Sorter {scope_302 -> scope_308} #0|#0] 
> |impl.PipelinedSorter|: scope-302 -> scope-308: done sorting span=263, 
> length=1, time=0
> 2024-08-19 19:02:28,158 [INFO] [TezChild] |impl.PipelinedSorter|: scope-302 
> -> scope-308: Span264.length = 1, perItem = 129
> 2024-08-19 19:02:28,158 [INFO] [TezChild] |impl.PipelinedSorter|: scope-302 
> -> scope-308: reserved.remaining()=268396320, reserved.metasize=16
> 2024-08-19 19:02:28,158 [INFO] [TezChild] |impl.PipelinedSorter|: scope-302 
> -> scope-308: New Span265.length = 1, perItem = 129, counter:5307007
> 2024-08-19 19:02:28,158 [INFO] [Sorter {scope_302 -> scope_308} #1|#1] 
> |impl.PipelinedSorter|: scope-302 -> scope-308: done sorting span=264, 
> length=1, time=0
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
