We have support for both modes: standalone, or embedded within the driver.

We run a Javalin timeline server today during the Spark write. All the file
listings during the write are actually Spark executors talking to this
driver service. This way, we don't keep listing S3/HDFS repeatedly during
the write process. This timeline server can also be run in a long-running
mode as a separate process. That is what the hudi-timeline-service module
does.
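
To make the idea concrete, here is a rough sketch of the caching behaviour.
It is not the actual timeline server code: the class name ListingCache, its
methods, and the pluggable storageLister function are all made up for
illustration. The point is only that callers ask one service for a listing,
and storage is hit once per partition path instead of once per caller.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

// Hypothetical sketch, not Hudi's real implementation.
final class ListingCache {
    // Stand-in for the expensive S3/HDFS listing call (assumption).
    private final Function<String, List<String>> storageLister;
    private final Map<String, List<String>> cache = new ConcurrentHashMap<>();
    private final AtomicInteger storageCalls = new AtomicInteger();

    ListingCache(Function<String, List<String>> storageLister) {
        this.storageLister = storageLister;
    }

    // Returns the cached listing, going to storage only on the first request.
    List<String> list(String partitionPath) {
        return cache.computeIfAbsent(partitionPath, p -> {
            storageCalls.incrementAndGet();
            return Collections.unmodifiableList(new ArrayList<>(storageLister.apply(p)));
        });
    }

    // Drops a cached listing, e.g. after a write changes the partition.
    void invalidate(String partitionPath) {
        cache.remove(partitionPath);
    }

    // Number of times storage was actually listed.
    int storageCalls() {
        return storageCalls.get();
    }
}
```

In the embedded case an object like this would live in the driver behind a
REST endpoint; in the standalone case the same object would live in the
separate process.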

I was suggesting something similar for the UI itself. We can just get a
working UI/service first, maybe, and pulling this into the Spark driver won't
be a big deal. Someone else may also be interested in taking it up.
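
For the standalone service, the map<user_session_id,fs> with auto
invalidation proposed earlier in this thread could look roughly like the
sketch below. All names here (ExpiringSessionMap, ttlMillis, the injected
clock) are hypothetical. The value type is left generic, so nothing here
depends on a particular file system class.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.LongSupplier;

// Hypothetical sketch of a session-scoped map whose entries expire after a
// period of inactivity. A real service would also close the evicted value.
final class ExpiringSessionMap<V> {
    private static final class Entry<V> {
        final V value;
        final long lastAccessMillis;

        Entry(V value, long lastAccessMillis) {
            this.value = value;
            this.lastAccessMillis = lastAccessMillis;
        }
    }

    private final Map<String, Entry<V>> entries = new ConcurrentHashMap<>();
    private final long ttlMillis;
    private final LongSupplier clock; // injected so expiry can be tested

    ExpiringSessionMap(long ttlMillis, LongSupplier clock) {
        this.ttlMillis = ttlMillis;
        this.clock = clock;
    }

    void put(String sessionId, V value) {
        entries.put(sessionId, new Entry<>(value, clock.getAsLong()));
    }

    // Returns the value if the session is still live and refreshes its
    // last-access time; removes the entry and returns empty if it expired.
    Optional<V> get(String sessionId) {
        long now = clock.getAsLong();
        Entry<V> live = entries.computeIfPresent(sessionId, (id, e) ->
                now - e.lastAccessMillis >= ttlMillis ? null : new Entry<>(e.value, now));
        return Optional.ofNullable(live).map(e -> e.value);
    }
}
```

Expiry here is lazy (checked on access), so idle sessions that are never
touched again would still need a periodic sweep in a real deployment.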

Overall +1 from me


On Wed, Jul 15, 2020 at 9:10 PM tanu dua <tanu.dua...@gmail.com> wrote:

> Sure, we can go ahead with a rudimentary UI.
> On running it as part of the Spark driver itself, we should first decide
> what the end goals of this service should be. I was rather thinking of
> hosting it as a separate service, so that I can browse the table metadata
> without running a Spark program, as I can in the Hudi CLI. Since the CLI
> is shell based and not everyone will have access to a shell, such a
> service can help there.
> I noticed that we start a Javalin server when we start a Spark program, but
> honestly I don't know where we use it. Do we use it in the Hudi Spark code?
> Is it a good idea to access REST services from Spark code?
>
> On Thu, 16 Jul 2020 at 9:11 AM, Vinoth Chandar <vin...@apache.org> wrote:
>
> > Hi,
> > Sorry, did not realize my response was still stuck in my outbox.
> >
> > At a high level, that sounds good to me. I would start with a rudimentary
> > UI to begin with if possible. Having a service alone may not make this
> > very readily consumable?
> >
> > Another random thought: can we host this UI service as part of the
> > Spark driver itself? In terms of deployment model, it would be nice,
> > at least for Spark Streaming/DeltaStreamer continuous mode, to have the
> > UI hosted by the Spark driver. This way people don't have to run a
> > separate server per se.
> > (We already have a timeline-server, which we have not pursued actively as
> > a separate running service for the same reasons.)
> >
> > Any thoughts on this?
> >
> > Thanks for driving this forward
> > Vinoth
> >
> >
> >
> > On Fri, Jul 10, 2020 at 7:02 AM Tanuj <tanu.dua...@gmail.com> wrote:
> >
> > > This is my high-level thought and design; please correct me if I am
> > > wrong.
> > > 1) We are using Spring Shell for the Hudi CLI, and for each command we
> > > have a class and methods annotated with CliCommand
> > > 2) We initiate the static file system fs once we connect to the table,
> > > and then all operations interact with that fs
> > >
> > > Along similar lines, we can write a Spring Boot app:
> > > 1) It will spin up a new microservices server, and in place of
> > > CliCommand we will have Spring Boot endpoints
> > > 2) Since microservices are stateless, we can't rely on the static file
> > > system variable fs. In place of that we can have a
> > > map<user_session_id,fs> with auto invalidation after a specified time
> > > 3) We will integrate this service with LDAP using Spring Security etc.,
> > > with authorisation at the table and command/endpoint level
> > >
> > > So we should be able to leverage most of the CLI code with some
> > > modification.
> > >
> > > I am deferring the UI for now if we are OK with the service design, but
> > > if we go with a basic UI, we can just have a tree of tables on the left
> > > with all greyed out. Once the user connects to a table, the relevant
> > > context menu options will be enabled depending on user authorisation.
> > > The output of the command can be printed on the right panel, leveraging
> > > the CLI output format.
> > >
> > >
> > > On 2020/07/07 23:52:15, Vinoth Chandar <vin...@apache.org> wrote:
> > > > Nope. We can begin on a fresh slate. Feel free to even create a new
> > > > RFC, if that does not fit with what you have in mind.
> > > >
> > > >
> > > >
> > > > On Mon, Jul 6, 2020 at 6:31 AM tanu dua <tanu.dua...@gmail.com> wrote:
> > > >
> > > > > Sure, my team and I can think about contributing here. May I know
> > > > > if something has already kicked off, and the technologies that are
> > > > > used to build the services and UI?
> > > > >
> > > > > On Mon, 6 Jul 2020 at 5:26 PM, Vinoth Chandar <vin...@apache.org> wrote:
> > > > >
> > > > > > Hi Tanuj,
> > > > > >
> > > > > > Good idea to have a service/UI. There is an inactive proposal
> > > > > > around this, if you want to revive and drive it forward.
> > > > > >
> > > > > > https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=130027233
> > > > > >
> > > > > > Thanks
> > > > > > Vinoth
> > > > > >
> > > > > > On Sun, Jul 5, 2020 at 11:07 PM Tanuj <tanu.dua...@gmail.com> wrote:
> > > > > >
> > > > > > > Hi all,
> > > > > > > HUDI CLI is a great tool, but I believe its biggest limitation
> > > > > > > is that you can only access it from a shell, and in the higher
> > > > > > > environments we may not get a shell to execute the commands.
> > > > > > >
> > > > > > > How about exposing HUDI CLI as a service backed by LDAP, with
> > > > > > > all proper authorisation, maybe as a Spring Boot service?
> > > > > > >
> > > > > > > Thanks.
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>
