Hi,
If you call endpoint.ask[CommitFilesResponse](message), you should wait for
the response. If the response is successful, you can be sure that the file
commit succeeded. Please refer to CommitHandler.requestCommitFilesWithRetry.
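As a rough illustration of the "ask, wait for the reply, retry if needed" pattern, here is a minimal, self-contained Java sketch. It is not Celeborn's actual code: the Response class, the commitWithRetry helper, and its parameters are hypothetical stand-ins for CommitFilesResponse and the RPC call, and the real logic lives in CommitHandler.requestCommitFilesWithRetry.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

final class CommitRetrySketch {
    /** Hypothetical stand-in for a CommitFilesResponse; it only carries a success flag. */
    static final class Response {
        final boolean success;
        Response(boolean success) { this.success = success; }
    }

    /**
     * Issues the request up to maxAttempts times and blocks on each reply.
     * Returns true only after a reply has arrived and reports success.
     */
    static boolean commitWithRetry(Supplier<CompletableFuture<Response>> ask,
                                   int maxAttempts, long timeoutMs) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                // Block until this attempt's reply arrives or the timeout expires.
                Response reply = ask.get().get(timeoutMs, TimeUnit.MILLISECONDS);
                if (reply.success) {
                    return true;
                }
            } catch (InterruptedException e) {
                // Preserve the interrupt flag and give up.
                Thread.currentThread().interrupt();
                return false;
            } catch (Exception e) {
                // Timeout or failed future: fall through to the next attempt.
            }
        }
        return false;
    }
}
```

A caller would pass a supplier that sends one request per invocation, for example commitWithRetry(() -> sendCommit(), 3, 1000), where sendCommit is whatever (hypothetical) method returns the future of the reply.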
Thanks,
Keyong Zhou
Wrote on Thu, Jul 13, 2023 at 15:54:
> > Following are the main steps fo
Is there some way to use the Celeborn API to check whether CommitFiles succeeds in
step 6? Currently we are testing with TPC-DS 10TB data, and one heavy query
(query 24) occasionally fails with:
Caused by: java.io.IOException: Premature EOF from inputStream
We are speculating that this error occurs b
Following are the main steps for a shuffle stage:
1. LifecycleManager sends RequestSlots to Master to request slots for the
current shuffle;
2. Master allocates slots among workers for the shuffle and
returns RequestSlotsResponse;
3. LifecycleManager sends ReserveSlots to workers; workers do
initi
Hi Sungwoo,
Glad to know about your progress! For your questions,
1. In Celeborn's default implementation, ShuffleClient is a singleton in
the Executor and Driver processes; I suggest following this practice.
It's recommended to call ShuffleClient.cleanup(int shuffleId, int
mapId, int attemptId
Hi Keyong,
Thanks for your quick reply. We found the Celeborn API clean and
very intuitive, and have not yet encountered serious problems in getting
our system up and running. We are unsure about only a few points that
are not immediately obvious from the Celeborn API (e.g., whether or n
Hi Sungwoo,
Thanks for your effort in integrating Celeborn into MR3!
For your question, currently a reducer does wait until the completion of
all mappers before starting to fetch shuffle data.
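To make the ordering concrete, here is a small Java sketch of such a barrier. It is only an illustration using a plain CountDownLatch, not how Celeborn implements the wait; the class name, the run method, and the event strings are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

final class MapperBarrierSketch {
    /**
     * Runs numMappers hypothetical mapper tasks, lets the "reducer" proceed only
     * after all of them have finished, and returns the recorded event order.
     */
    static List<String> run(int numMappers) {
        ConcurrentLinkedQueue<String> events = new ConcurrentLinkedQueue<>();
        CountDownLatch allMappersDone = new CountDownLatch(numMappers);
        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, numMappers));
        try {
            for (int i = 0; i < numMappers; i++) {
                final int id = i;
                pool.submit(() -> {
                    events.add("mapper-" + id + " done"); // stands in for pushing map output
                    allMappersDone.countDown();
                });
            }
            // The reducer blocks here until every mapper has counted down.
            allMappersDone.await();
            events.add("reducer fetch");
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for mappers", e);
        } finally {
            pool.shutdown();
        }
        return new ArrayList<>(events);
    }
}
```

Whatever order the mappers finish in, "reducer fetch" is always the last recorded event, which is the guarantee described above.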
Briefly speaking, the Celeborn client contains two modules:
1. ShuffleClient for push/fetch data, mainly us
Hi Team,
We are currently implementing a Celeborn client for our application
(called MR3, which is similar to Tez), and have a question on the internals
of Celeborn.
The question is whether a reducer should wait until the completion of all
mappers before starting to fetch mapper output. From