Hi Ethan,

Thanks for your comments!

Regarding using Java/Scala for the CLI, I am fine with this. I had assumed
that Python would make for a simpler implementation, given that many CLIs
are written in Python, but the points you make are fair. Most of the
Celeborn community uses Java/Scala, so this would be more beneficial for
the development and evolution of the CLI.

Yes, I think the CLI should contain capabilities beyond the HTTP endpoints
Celeborn exposes. The Celeborn HTTP endpoints work great for
application-specific use cases, such as finding the applications or
shuffles on a particular worker. However, they would not work for
situations in which we need information on the cluster itself. For example,
we use K8s, and these are internal use cases I can foresee that require
communication with an external cluster manager:

   - Retrieve all pods running masters/workers and their statuses
   - Manually evict an unhealthy Celeborn pod
   - SSH into various Celeborn pods
   - Manage ACLs of the cluster
   - Manually restart pods
   - Wipe Ratis storage if state is messed up
   - Wipe shuffle directories if state is messed up
   - Add/remove nodes in our node pool
   - Perform any other manual arbitrary function on a Celeborn pod


These are just a few of the use cases I can think of, but I am sure more
will arise as more users adopt Celeborn :)

Given that users will have various cluster managers, I think, as I
mentioned before, there should be an abstraction layer that exposes the
different operations. Based on the cluster manager they use, users can
implement their own specific logic. We can include a few default
implementations (e.g. Kubernetes).
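
As a rough illustration of such an abstraction layer, a minimal Java sketch
might look like the following. All names here (ClusterManager,
InMemoryClusterManager, ClusterManagers, the "in-memory" key) are
hypothetical and are not taken from the CIP or the Celeborn codebase. The
in-memory implementation is only a stand-in so the sketch runs on its own; a
real Kubernetes implementation would call the Kubernetes API instead.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical abstraction: the operations the CLI would expose,
// independent of which cluster manager backs them.
interface ClusterManager {
    // Returns node (pod) name -> status for all masters/workers.
    Map<String, String> listNodeStatuses();

    // Removes an unhealthy node from the cluster.
    void evictNode(String nodeName);

    // Restarts a single node.
    void restartNode(String nodeName);
}

// Stand-in implementation that only tracks state in memory.
// A user-provided implementation (e.g. for Kubernetes) would
// talk to the actual cluster manager here.
final class InMemoryClusterManager implements ClusterManager {
    private final Map<String, String> statuses = new LinkedHashMap<>();

    InMemoryClusterManager(List<String> nodeNames) {
        for (String name : nodeNames) {
            statuses.put(name, "RUNNING");
        }
    }

    @Override
    public Map<String, String> listNodeStatuses() {
        // Return a copy so callers cannot mutate internal state.
        return new LinkedHashMap<>(statuses);
    }

    @Override
    public void evictNode(String nodeName) {
        requireKnown(nodeName);
        statuses.remove(nodeName);
    }

    @Override
    public void restartNode(String nodeName) {
        requireKnown(nodeName);
        statuses.put(nodeName, "RESTARTING");
    }

    private void requireKnown(String nodeName) {
        if (!statuses.containsKey(nodeName)) {
            throw new IllegalArgumentException("Unknown node: " + nodeName);
        }
    }
}

// Hypothetical factory: picks an implementation by name, e.g. from CLI config.
final class ClusterManagers {
    private ClusterManagers() {}

    static ClusterManager forName(String name, List<String> nodeNames) {
        switch (name) {
            case "in-memory":
                return new InMemoryClusterManager(nodeNames);
            default:
                throw new IllegalArgumentException(
                    "No cluster manager registered for: " + name);
        }
    }
}
```

With this shape, a CLI command would only depend on ClusterManager, and
supporting another cluster manager would mean adding one more implementation
and registering it in the factory.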

Hope this answers your questions. Let me know if you have any more!

Thanks,
Aravind

On Tue, Jun 11, 2024 at 11:57 PM Ethan Feng <ethanf...@apache.org> wrote:

> Hi Aravind,
>
> I hope this message finds you well. I wanted to express my
> appreciation for the energy and creativity you've invested in the
> Celeborn project; the proposal you submitted is intriguing.
>
> I apologize for the delayed feedback on your proposal — it took me a
> bit longer to get to it than anticipated. After reviewing it, I have a
> couple of inquiries that I'd like to discuss in order to gain a
> clearer understanding:
>
> I observed that you're planning to implement the CLI in Python. Could
> you elaborate on the choice behind not leveraging the Java stack for
> this purpose? The Java ecosystem already includes mature tools such as
> "commons-cli" or "Scala CLI," which are capable of facilitating CLI
> tool development. Given the prevalent familiarity with the Java stack
> within our community, I believe leveraging it could accelerate the
> CLI's development and evolution through wider collaboration.
>
> From email discussions, you've indicated an interest in offering a
> generic interface API for Celeborn, which is certainly exciting.
> However, I'm concerned that basing a CLI on HTTP API might not fully
> align with this vision. Could you provide additional insights into how
> you envision the CLI advancing beyond the capabilities of the current
> HTTP REST API?
>
> Based on previous exchanges, the CLI is expected to communicate with
> an external cluster manager. Is there an abstraction layer in place to
> interface uniformly with various external cluster managers, or is this
> something under consideration?
>
> I'm looking forward to learning more about your perspectives and the
> pathway you foresee for the CLI's development.
>
> regards,
> Ethan
>
> On Wed, Jun 12, 2024 at 14:36, Mridul Muralidharan <mri...@gmail.com> wrote:
> >
> > +1
> >
> > Regards,
> > Mridul
> >
> >
> > On Wed, Jun 12, 2024 at 1:08 AM Shaoyun Chen <c...@apache.org> wrote:
> >
> > > +1
> > >
> > > On Wed, Jun 12, 2024 at 13:47, Keyong Zhou <zho...@apache.org> wrote:
> > > >
> > > > +1
> > > >
> > > > Thanks for the proposal!
> > > >
> > > > Regards,
> > > > Keyong Zhou
> > > >
> > > > On Wed, Jun 12, 2024 at 13:02, Nicholas Jiang <nicholasji...@apache.org> wrote:
> > > >
> > > > > +1. Looking forward to Celeborn CLI.
> > > > >
> > > > >
> > > > >
> > > > >
> > > > > Regards,
> > > > >
> > > > > Nicholas Jiang
> > > > >
> > > > >
> > > > > > At 2024-06-12 12:26:34, "Aravind Patnam" <akpatna...@gmail.com> wrote:
> > > > > >Hi all,
> > > > > >
> > > > > > >Sorry, this is the correct link to the Celeborn CLI CIP
> > > > > > ><https://cwiki.apache.org/confluence/display/CELEBORN/CIP+7+-+Celeborn+CLI>.
> > > > > >
> > > > > >Thanks,
> > > > > >Aravind
> > > > > >
> > > > > > >On Tue, Jun 11, 2024 at 9:24 PM Aravind Patnam <akpatna...@gmail.com> wrote:
> > > > > >
> > > > > >> Hi all,
> > > > > >>
> > > > > > >> This is a call to vote to contribute the Celeborn CLI CIP
> > > > > > >> <https://cwiki.apache.org/confluence/display/CELEBORN/Celeborn+Improvement+Proposals>
> > > > > > >> to Apache Celeborn.
> > > > > >>
> > > > > >> Please do vote accordingly:
> > > > > >> [ ] +1 approve
> > > > > >> [ ] +0 no opinion
> > > > > >> [ ] -1 disapprove (and the reason)
> > > > > >>
> > > > > >> Thanks once again!!
> > > > > >>
> > > > > >> Aravind
> > > > > >>
> > > > > >
> > > > > >
> > > > > >--
> > > > > >Aravind K. Patnam
> > > > >
> > >
>


-- 
Aravind K. Patnam
