Great. Please feel free to post more followup thoughts here or on an RFC, as you prefer.

On Thu, Jul 16, 2020 at 9:46 PM tanu dua <[email protected]> wrote:

> Thanks Vinoth. I understand now. I would also look timeline server to
> understand more how it works.
>
> On Fri, Jul 17, 2020 at 9:33 AM Vinoth Chandar <[email protected]> wrote:
>
> > We have support both modes. Standalone or embedded within driver?
> >
> > We run javalin timeline server today during the spark write. All the file
> > listings during the writing actually are spark executors talking to this
> > driver service. This way, we don't keep listing S3/HDFS repeatedly during
> > the write process. This timeline server can also be run in a long running
> > mode as a separate process. That is what the hudi-timeline-service module
> > does..
> >
> > I was suggesting something similar for the UI itself.. We can just get a
> > working UI/service first may be.. and pulling this into Spark driver won't
> > be a big deal. Someone else may also be interested in taking it up..
> >
> > Overall +1 from me
> >
> > On Wed, Jul 15, 2020 at 9:10 PM tanu dua <[email protected]> wrote:
> >
> > > Sure we can go ahead with a rudimentary UI.
> > > On running it as a part of Spark Driver itself we should first conclude
> > > what are end goals of this service should be. I was rather thinking of
> > > hosting it as separate service so that without running a spark program I
> > > can browse through the table metadata as I can do in Hudi CLI but since
> > > CLI is shell based and everyone will not have an access to shell so those
> > > service can help there.
> > > I noticed that we start a javalin server when we start Spark program but
> > > honestly I don’t know where do we use it . Do we use it in hudi spark
> > > code ? Is it a good idea to access rest services from spark code ?
> > >
> > > On Thu, 16 Jul 2020 at 9:11 AM, Vinoth Chandar <[email protected]> wrote:
> > >
> > > > Hi,
> > > > Sorry, did not realize my response was still stuck in my outbox.
> > > >
> > > > At a high level, that sounds good to me. I would start with a
> > > > rudimentary UI to begin with if possible. Having a service alone may
> > > > not make this very readily consumable?
> > > >
> > > > Other random thought is, if we can host this UI service as a part of
> > > > the spark driver itself? In terms of deployment model - it would be
> > > > nice atleast for spark streaming/DeltaStreamer continuous mode to
> > > > atleast have UI hosted by the spark driver. This way people don’t have
> > > > to run a separate server per se..
> > > > (we already have a timeline-server which we have not pursued actively
> > > > as a separate running service for the same reasons. )
> > > >
> > > > Any thoughts on this?
> > > >
> > > > Thanks for driving this forward
> > > > Vinoth
> > > >
> > > > On Fri, Jul 10, 2020 at 7:02 AM Tanuj <[email protected]> wrote:
> > > >
> > > > > This is what my high level thought and design, please correct me if
> > > > > I am wrong.
> > > > > 1) We are using Spring Shell for hudi cli and for each command we
> > > > > have class and methods annotated with CliCommand
> > > > > 2) We initiate the static file system fs once we connect to the
> > > > > table and then all operations interact with that fs
> > > > >
> > > > > On the similar lines, we can write a Spring Boot app -
> > > > > 1) Which will spin up a new microservices server and in place of
> > > > > CliCommand we will have Spring Boot end point
> > > > > 2) Since microservices are stateless, we can't rely on static
> > > > > filesytem variable fs. So in place of that we can have a
> > > > > map<user_session_id,fs> with auto invalidation after specified time
> > > > > 3) We will integrate this service with LDAP using Spring Security
> > > > > etc and authorisation at table and commands/endpoint level
> > > > >
> > > > > So we should be able to leverage most of the CLI code with some
> > > > > modification.
> > > > >
> > > > > I am deferring UI as of now if we are ok with the service design but
> > > > > if we go with the basic UI, we can just have a tree of tables on the
> > > > > left with all greyed out. Once user connects to the table, then
> > > > > relevant context menu options will be enabled depending upon user
> > > > > authorisation. The output of the command can be printed on the right
> > > > > panel leveraging the CLI output format.
> > > > >
> > > > > On 2020/07/07 23:52:15, Vinoth Chandar <[email protected]> wrote:
> > > > >
> > > > > > Nope. We can begin on a fresh slate. Feel free to even create a
> > > > > > new RFC, if that does not fit with what you have in mind..
> > > > > >
> > > > > > On Mon, Jul 6, 2020 at 6:31 AM tanu dua <[email protected]> wrote:
> > > > > >
> > > > > > > Sure me and my team can think of in contributing here. May I
> > > > > > > know if something has already kicked off and the technologies
> > > > > > > that are used to build the services and UI ?
> > > > > > >
> > > > > > > On Mon, 6 Jul 2020 at 5:26 PM, Vinoth Chandar <[email protected]> wrote:
> > > > > > >
> > > > > > > > Hi Tanuj,
> > > > > > > >
> > > > > > > > Good idea to have a service/UI.. There is an inactive proposal
> > > > > > > > around this, if you want to revive and drive it forward.
> > > > > > > >
> > > > > > > > https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=130027233
> > > > > > > >
> > > > > > > > Thanks
> > > > > > > > Vinoth
> > > > > > > >
> > > > > > > > On Sun, Jul 5, 2020 at 11:07 PM Tanuj <[email protected]> wrote:
> > > > > > > >
> > > > > > > > > Hi all,
> > > > > > > > > HUDI CLI is a great tool but I believe the biggest limitation
> > > > > > > > > of HUDI CLI is that you can only access it from shell and in
> > > > > > > > > the higher environments we may not get a shell to execute the
> > > > > > > > > commands.
> > > > > > > > >
> > > > > > > > > How about exposing HUDI CLI as a service backed by LDAP and
> > > > > > > > > with all proper authorisation may be as a Spring Boot service ?
> > > > > > > > >
> > > > > > > > > Thanks.
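
For reference, the map<user_session_id,fs> with auto invalidation proposed in the thread could look roughly like the sketch below. This is only an illustration under assumptions: SessionHandleCache, getOrCreate, evictExpired and the injected clock are hypothetical names, not part of Hudi or Spring, and the cached value is left generic so that it could hold a Hadoop FileSystem or any other per-session handle.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;
import java.util.function.LongSupplier;

// Hypothetical helper (not a Hudi class): maps a user session id to a
// per-session handle and invalidates entries idle for ttlMillis or longer.
final class SessionHandleCache<V> {

    // One cached handle plus the time it was last used.
    private static final class Entry<T> {
        final T value;
        volatile long lastAccessMillis;

        Entry(T value, long nowMillis) {
            this.value = value;
            this.lastAccessMillis = nowMillis;
        }
    }

    private final Map<String, Entry<V>> entries = new ConcurrentHashMap<>();
    private final long ttlMillis;
    private final LongSupplier clock; // injected so expiry can be tested without sleeping

    SessionHandleCache(long ttlMillis, LongSupplier clock) {
        this.ttlMillis = ttlMillis;
        this.clock = clock;
    }

    // Returns the handle for this session, creating it when it is missing or expired.
    V getOrCreate(String sessionId, Function<String, V> factory) {
        long now = clock.getAsLong();
        Entry<V> entry = entries.compute(sessionId, (id, old) -> {
            if (old == null || now - old.lastAccessMillis >= ttlMillis) {
                return new Entry<>(factory.apply(id), now);
            }
            old.lastAccessMillis = now; // a hit refreshes the idle timer
            return old;
        });
        return entry.value;
    }

    // Drops idle entries; intended to be called periodically by the service.
    // The returned count is approximate if other threads modify the map concurrently.
    int evictExpired() {
        long now = clock.getAsLong();
        int before = entries.size();
        entries.values().removeIf(e -> now - e.lastAccessMillis >= ttlMillis);
        return before - entries.size();
    }

    int size() {
        return entries.size();
    }
}
```

A service would typically construct this with System::currentTimeMillis as the clock and call evictExpired() from a scheduled task. Releasing whatever resources an evicted handle holds is left out of the sketch and would need handling in a real service.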
