Hi all,

I have tested quota management and made some minor improvements. It works well. Next, I will test worker quick restart.

apache <ethanf...@apache.org> wrote on Wed, Dec 21, 2022 at 10:00:

> Hi community:
>
> I've tested the load-aware slots allocation module on branch-0.2 with a
> mixed cluster of HDDs, SSDs, and ESSDs. As expected, workers with faster
> disk drives have more slots on them.
>
> The way I tested is as follows:
> I ran 1 TB TPC-DS while 1 TB Terasort was running repeatedly, and tested
> different parameter combinations on three node groups. The core group has
> 10x NVMe SSD drives. Task group 1 has 4x ESSD drives. Task group 2 has 12x
> SATA HDD drives.
>
> Here are some snapshots:
>
> Graph 1: Only NVMe SSD and HDD workers are involved and
> "celeborn.slots.assign.loadAware.numDiskGroups" is set to 2.
>
> Graph 2: All nodes are involved and
> "celeborn.slots.assign.loadAware.numDiskGroups" is set to 3.
>
> Thanks,
> Ethan Feng
>
> On Dec 14, 2022 at 19:41 +0800, Keyong Zhou <zho...@apache.org> wrote:
>
> > Hi celeborn (-incubating) community:
> >
> > We are currently preparing for the first release (branch-0.2). To ensure
> > code quality, I would like to test the core path for correctness and
> > stability. Could Angerszhuuuu <angers....@gmail.com> and nafiyaix
> > <nafiyai...@gmail.com> help test graceful shutdown and rolling upgrade?
> > And could Ethan Feng <ethan.aquarius....@gmail.com> help test load-aware
> > slots allocation?
> >
> > We would also be very happy if anyone could help test other modules
> > (k8s, HA, etc.).
> >
> > Thanks,
> > Keyong