Hi Sungwoo,

Thanks for your update.
Yes, this mailing list is the right place to discuss Celeborn. Please feel
free to ask any questions.

Thanks,
Keyong Zhou

<o...@pl.postech.ac.kr> wrote on Fri, Jul 21, 2023 at 13:54:

> Hi Keyong,
>
> Unlike the Spark/Flink clients, we had to modify the MR3 runtime code
> directly to support Celeborn, so we do not add any new code to Celeborn
> itself. We release the MR3 runtime code on GitHub, which could be used
> simply as an example of exploiting Celeborn.
>
> The API is clean and the code is clearly structured and easy to follow,
> but we still have a few questions as we test the MR3-Celeborn extension,
> especially about dealing with exceptions (e.g., Premature EOF from
> inputStream). I hope this mailing list is the right place to ask such
> questions.
>
> Best,
>
> --- Sungwoo
>
> On Mon, 17 Jul 2023, Keyong Zhou wrote:
>
> > Hi Sungwoo,
> >
> > It's really great to hear that! To be honest, we never expected such
> > things would happen.
> >
> > Just curious, is it possible for you to contribute the integration with
> > MR3 to the Celeborn community?
> > It would be a great feature for Celeborn, and the community could work
> > together to better support MR3 (and Hive).
> >
> > Thanks,
> > Keyong Zhou
> >
> >
> > <o...@pl.postech.ac.kr> wrote on Sun, Jul 16, 2023 at 22:33:
> >
> >> We have extended the implementation of MR3 so that all partition
> >> inputs can be fetched with a single call, e.g.:
> >>
> >> rssShuffleClient.readPartition(..., 0, 100)
> >>
> >> Now, Hive-MR3 with Celeborn runs as fast as Hive-MR3 with its own
> >> shuffle handlers when tested with the 10TB TPC-DS benchmark. For some
> >> queries, it is even noticeably faster.
> >>
> >> Thanks,
> >>
> >> --- Sungwoo
> >>
> >> On Thu, 13 Jul 2023, o...@pl.postech.ac.kr wrote:
> >>
> >>> Hi Team,
> >>>
> >>> I have a question on how a reducer should fetch the output of mappers.
> >>> As an example, consider this standard scenario:
> >>>
> >>> 1. There are 100 mappers and 50 reducers.
> >>> 2. Each mapper creates 50 partitions, each of which is to be fetched
> >>> by the corresponding reducer.
> >>> 3. Each reducer is responsible for a single partition and tries to
> >>> fetch 100 partitions (one from each mapper).
> >>>
> >>> In our current implementation, a reducer calls
> >>> rssShuffleClient.readPartition() 100 times (once for each mapper):
> >>>
> >>> rssShuffleClient.readPartition(..., mapIndex, mapIndex + 1)
> >>>
> >>> My question is: if reducers start after the completion of all mappers,
> >>> can we call (or should we try to call) rssShuffleClient.readPartition()
> >>> only once, as in:
> >>>
> >>> rssShuffleClient.readPartition(..., 0, 100)
> >>>
> >>> My understanding of a remote shuffle service (like Magnet for Spark)
> >>> is that all the partitions destined for the same reducer are
> >>> automatically merged by the shuffle service, so we thought that a
> >>> single call might be enough.
> >>>
> >>> Thanks,
> >>>
> >>> --- Sungwoo Park
> >>>
> >>
> >
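For readers following the thread, here is a minimal sketch of the two
reducer-side fetch patterns discussed above: one readPartition() call per
mapper versus a single call covering all mappers. The RssShuffleClient
interface below is hypothetical and only mirrors the calls quoted in the
thread; the full readPartition(shuffleId, partitionId, attemptNumber,
startMapIndex, endMapIndex) parameter list is an assumption, and the real
Celeborn ShuffleClient API may differ by version.

    import java.io.IOException;
    import java.io.InputStream;

    // Hypothetical minimal interface mirroring the rssShuffleClient calls in
    // the thread; the actual Celeborn ShuffleClient API may take different
    // parameters depending on the Celeborn version.
    interface RssShuffleClient {
        InputStream readPartition(int shuffleId, int partitionId, int attemptNumber,
                                  int startMapIndex, int endMapIndex) throws IOException;
    }

    class PartitionFetchSketch {

        // Pattern 1: one readPartition() call per mapper, i.e. 100 calls when
        // there are 100 mappers, each covering the map index range [i, i + 1).
        static void fetchPerMapper(RssShuffleClient client, int shuffleId,
                                   int partitionId, int attemptNumber,
                                   int numMappers) throws IOException {
            for (int mapIndex = 0; mapIndex < numMappers; mapIndex++) {
                try (InputStream in = client.readPartition(
                        shuffleId, partitionId, attemptNumber, mapIndex, mapIndex + 1)) {
                    consume(in);
                }
            }
        }

        // Pattern 2: a single readPartition() call covering map indexes
        // [0, numMappers), relying on the shuffle service to serve all data of
        // this partition in one stream once all mappers have finished.
        static void fetchAllAtOnce(RssShuffleClient client, int shuffleId,
                                   int partitionId, int attemptNumber,
                                   int numMappers) throws IOException {
            try (InputStream in = client.readPartition(
                    shuffleId, partitionId, attemptNumber, 0, numMappers)) {
                consume(in);
            }
        }

        // Placeholder for the reducer's merge/sort pipeline.
        private static void consume(InputStream in) throws IOException {
            byte[] buf = new byte[64 * 1024];
            while (in.read(buf) != -1) {
                // feed the bytes to the reducer
            }
        }
    }

As reported earlier in the thread, switching from the per-mapper pattern to
the single-call pattern is what brought Hive-MR3 with Celeborn to the same
performance as Hive-MR3 with its own shuffle handlers.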