awdavidson commented on issue #731: URL: https://github.com/apache/incubator-uniffle/issues/731#issuecomment-1515869583
> @awdavidson
>
> In PR #749, @advancedxy proposes a new possibility for solving this; see [#749 (comment)](https://github.com/apache/incubator-uniffle/pull/749#issuecomment-1486540366)
>
> Personally, it sounds good to me. But in the current codebase, the taskAttemptId must be a real value; some code retrieves it from the blockId and does some validation. I'm not sure whether to remove this part or optimize it.
>
> > What’s the feasibility of using a combination of the stageId and taskId? What else can be done?
>
> I'm not sure what you mean. Could you describe it in more detail?

@zuston Thank you for the reply. Yes, I did notice that proposal, and I agree with comments such as

```
Also, I'm more concerned about the compatibility issues. To make the length of fields in block ids adaptive itself is appealing, it's just that we really have to think it through and make sure it wouldn't break things.
```

In my opinion, exposing this to the end user introduces risk and assumes they know what they are doing. It would be ideal to have an implementation that just works without the end user needing to worry.

Sure. First, I mean the taskId index. Currently, taskId increments across the life of the application, so it effectively counts the total number of tasks, which in Uniffle has an upper limit:

```
Can't support taskAttemptId[3786263], the max value should be 2097151
```

The taskId index, by contrast, resets in each stage, because the index relates only to that stage's task set. Using a combination of the stageId and the taskId index would help increase the upper limit, especially for applications running iterative algorithms: rather than having `[1, ..., 20]` you would have `[1_1, ..., 1_10, 2_1, ..., 2_10]`.

The blockId would be made up of something like: `long partitionId, long stageId, long taskIdIndex, long atomicInt`
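As a rough illustration of the layout suggested above, the four fields could be packed into a single `long` as sketched below. The limit in the error message, 2097151, equals 2^21 - 1, which suggests the current taskAttemptId field is 21 bits wide. Everything else here is an assumption made for the example: the class name `StageScopedBlockId`, the helper names, and in particular the bit widths (18 + 10 + 17 + 18 = 63) are hypothetical and are not taken from the Uniffle codebase.

```java
// Hypothetical sketch: packs partitionId, stageId, taskIdIndex and atomicInt
// into one non-negative long. Not Uniffle's actual block id layout.
final class StageScopedBlockId {
    // Assumed field widths; they sum to 63 so the sign bit stays unused.
    static final int PARTITION_BITS = 18;
    static final int STAGE_BITS = 10;
    static final int TASK_INDEX_BITS = 17;
    static final int ATOMIC_INT_BITS = 18;

    private StageScopedBlockId() {}

    // Largest value that fits in the given number of bits.
    static long maxValue(int bits) {
        return (1L << bits) - 1;
    }

    // Rejects values that would overflow their field, similar in spirit
    // to the "Can't support taskAttemptId[...]" error quoted above.
    private static long check(String name, long value, int bits) {
        if (value < 0 || value > maxValue(bits)) {
            throw new IllegalArgumentException(
                "Can't support " + name + "[" + value + "], the max value should be " + maxValue(bits));
        }
        return value;
    }

    // Layout, high bits to low bits: partitionId | stageId | taskIdIndex | atomicInt.
    static long pack(long partitionId, long stageId, long taskIdIndex, long atomicInt) {
        check("partitionId", partitionId, PARTITION_BITS);
        check("stageId", stageId, STAGE_BITS);
        check("taskIdIndex", taskIdIndex, TASK_INDEX_BITS);
        check("atomicInt", atomicInt, ATOMIC_INT_BITS);
        return (partitionId << (STAGE_BITS + TASK_INDEX_BITS + ATOMIC_INT_BITS))
            | (stageId << (TASK_INDEX_BITS + ATOMIC_INT_BITS))
            | (taskIdIndex << ATOMIC_INT_BITS)
            | atomicInt;
    }

    static long partitionId(long blockId) {
        return blockId >>> (STAGE_BITS + TASK_INDEX_BITS + ATOMIC_INT_BITS);
    }

    static long stageId(long blockId) {
        return (blockId >>> (TASK_INDEX_BITS + ATOMIC_INT_BITS)) & maxValue(STAGE_BITS);
    }

    static long taskIdIndex(long blockId) {
        return (blockId >>> ATOMIC_INT_BITS) & maxValue(TASK_INDEX_BITS);
    }

    static long atomicInt(long blockId) {
        return blockId & maxValue(ATOMIC_INT_BITS);
    }
}
```

With these assumed widths, `StageScopedBlockId.pack(5, 2, 7, 3)` yields an id from which each field can be read back through the matching accessor. Note that a fixed 63-bit budget still has to be divided among the fields, so how many bits go to stageId versus taskIdIndex is a trade-off that would need to be checked against real workloads and against the compatibility concerns raised in #749.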
