awdavidson commented on issue #731:
URL: 
https://github.com/apache/incubator-uniffle/issues/731#issuecomment-1515869583

   > @awdavidson
   > 
   > In PR #749, @advancedxy propose a new possibility to solving this. refer 
[#749 
(comment)](https://github.com/apache/incubator-uniffle/pull/749#issuecomment-1486540366)
   > 
   > But for my thought, it sounds good to me. But in the current codebase, the 
taskAttemptId must be a real value, some code will retrieve this from this 
blockId and do some validation. I'm not sure whether removing this part or 
optimize.
   > 
   > > What’s the feasibility of using a combination of the stageId and taskId? 
What else can be done?
   > 
   > I'm not sure what do you mean. Could you help describe more?
   
   @zuston Thank you for the reply. Yes, I did notice that proposal, and I 
agree with comments such as:
   
   ```
   Also, I'm more concerned about the compatibility issues. To make the length 
of fields in block ids adaptive itself is appealing,
   it's just that we really have to think it through and make sure it wouldn't 
break things.
   ```
   In my opinion, exposing this to the end user introduces risk and assumes 
they know what they are doing. It would be ideal to have an implementation 
that just works without the end user needing to worry.
   
   Sure. Firstly, by taskId I mean the taskId index. Currently the taskId 
increments across the life of the application and represents the total number 
of tasks, which Uniffle caps at an upper limit:
   
   ```
   Can't support taskAttemptId[3786263], the max value should be 2097151
   ```
   
   The taskId index, by contrast, is reset in each stage, as the index relates 
only to the task set of that stage. Using a combination of the stageId and the 
taskId index would help raise the upper limit, especially for applications 
running iterative algorithms: rather than having `[1, ..., 20]` you would have 
`[1_1, ..., 1_10, 2_1, ..., 2_10]`. The blockId would be made up of something 
like: `long partitionId, long stageId, long taskIdIndex, long atomicInt`.
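
   To make the idea concrete, here is a rough sketch of how those four fields 
could be packed into a single `long`. It is only an illustration: the class 
name `StageScopedBlockId`, the field order, and all four bit widths are 
assumptions chosen for the example, not the existing Uniffle layout. The only 
number taken from above is the current limit, 2097151, which equals 2^21 - 1.

```java
// Hypothetical sketch: the class name, field order and bit widths below are
// assumptions for illustration and are not taken from the Uniffle codebase.
final class StageScopedBlockId {
    // Assumed widths; they sum to 63 so the packed id is never negative.
    static final int ATOMIC_INT_BITS = 18;
    static final int PARTITION_BITS = 20;
    static final int STAGE_BITS = 10;
    static final int TASK_INDEX_BITS = 15;

    private StageScopedBlockId() {}

    // Layout, from high bits to low bits:
    // atomicInt | partitionId | stageId | taskIdIndex
    static long pack(long partitionId, long stageId, long taskIdIndex, long atomicInt) {
        check("partitionId", partitionId, PARTITION_BITS);
        check("stageId", stageId, STAGE_BITS);
        check("taskIdIndex", taskIdIndex, TASK_INDEX_BITS);
        check("atomicInt", atomicInt, ATOMIC_INT_BITS);
        return (atomicInt << (PARTITION_BITS + STAGE_BITS + TASK_INDEX_BITS))
            | (partitionId << (STAGE_BITS + TASK_INDEX_BITS))
            | (stageId << TASK_INDEX_BITS)
            | taskIdIndex;
    }

    static long atomicInt(long blockId) {
        return (blockId >>> (PARTITION_BITS + STAGE_BITS + TASK_INDEX_BITS))
            & maxValue(ATOMIC_INT_BITS);
    }

    static long partitionId(long blockId) {
        return (blockId >>> (STAGE_BITS + TASK_INDEX_BITS)) & maxValue(PARTITION_BITS);
    }

    static long stageId(long blockId) {
        return (blockId >>> TASK_INDEX_BITS) & maxValue(STAGE_BITS);
    }

    static long taskIdIndex(long blockId) {
        return blockId & maxValue(TASK_INDEX_BITS);
    }

    // Largest value that fits in the given number of bits.
    static long maxValue(int bits) {
        return (1L << bits) - 1;
    }

    // Reject values that would overflow into a neighbouring field.
    private static void check(String name, long value, int bits) {
        if (value < 0 || value > maxValue(bits)) {
            throw new IllegalArgumentException(
                "Can't support " + name + "[" + value + "], the max value should be "
                    + maxValue(bits));
        }
    }
}
```

   For example, `StageScopedBlockId.pack(5, 2, 10, 7)` encodes partition 5, 
stage 2, task index 10 and sequence number 7, and `stageId(...)` and 
`taskIdIndex(...)` recover 2 and 10 from the result. The trade-off is that the 
bits given to the stageId have to come out of the other fields, so the widths 
would need tuning against real workloads.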


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
