Can I please have access to Confluence to post an RFC?

On Sun, Jul 19, 2020 at 6:05 AM Vinoth Chandar <vin...@apache.org> wrote:
> Great. Please feel free to post more follow-up thoughts here or on an RFC,
> as you prefer.
>
> On Thu, Jul 16, 2020 at 9:46 PM tanu dua <tanu.dua...@gmail.com> wrote:
>
> > Thanks Vinoth. I understand now. I will also look at the timeline server
> > to understand more about how it works.
> >
> > On Fri, Jul 17, 2020 at 9:33 AM Vinoth Chandar <vin...@apache.org> wrote:
> >
> > > We can support both modes, standalone or embedded within the driver?
> > >
> > > We run a Javalin timeline server today during the Spark write. All the
> > > file listings during the write are actually Spark executors talking to
> > > this driver service. This way, we don't keep listing S3/HDFS repeatedly
> > > during the write process. This timeline server can also be run in a
> > > long-running mode as a separate process. That is what the
> > > hudi-timeline-service module does.
> > >
> > > I was suggesting something similar for the UI itself. We can just get a
> > > working UI/service first, maybe, and pulling this into the Spark driver
> > > won't be a big deal. Someone else may also be interested in taking it up.
> > >
> > > Overall +1 from me
> > >
> > > On Wed, Jul 15, 2020 at 9:10 PM tanu dua <tanu.dua...@gmail.com> wrote:
> > >
> > > > Sure, we can go ahead with a rudimentary UI.
> > > > On running it as part of the Spark driver itself, we should first
> > > > conclude what the end goals of this service should be. I was rather
> > > > thinking of hosting it as a separate service so that, without running a
> > > > Spark program, I can browse through the table metadata as I can in the
> > > > Hudi CLI. Since the CLI is shell based and not everyone will have
> > > > access to a shell, this service can help there.
> > > > I noticed that we start a Javalin server when we start a Spark program,
> > > > but honestly I don't know where we use it. Do we use it in the Hudi
> > > > Spark code? Is it a good idea to access REST services from Spark code?
> > > >
> > > > On Thu, 16 Jul 2020 at 9:11 AM, Vinoth Chandar <vin...@apache.org> wrote:
> > > >
> > > > > Hi,
> > > > > Sorry, did not realize my response was still stuck in my outbox.
> > > > >
> > > > > At a high level, that sounds good to me. I would start with a
> > > > > rudimentary UI to begin with if possible. Having a service alone may
> > > > > not make this very readily consumable?
> > > > >
> > > > > Another random thought: can we host this UI service as part of the
> > > > > Spark driver itself? In terms of deployment model, it would be nice,
> > > > > at least for Spark streaming/DeltaStreamer continuous mode, to have
> > > > > the UI hosted by the Spark driver. This way people don't have to run a
> > > > > separate server per se.
> > > > > (We already have a timeline-server, which we have not pursued actively
> > > > > as a separate running service for the same reasons.)
> > > > >
> > > > > Any thoughts on this?
> > > > >
> > > > > Thanks for driving this forward
> > > > > Vinoth
> > > > >
> > > > > On Fri, Jul 10, 2020 at 7:02 AM Tanuj <tanu.dua...@gmail.com> wrote:
> > > > >
> > > > > > This is my high-level thought and design; please correct me if I am
> > > > > > wrong.
> > > > > > 1) We are using Spring Shell for the Hudi CLI, and for each command we
> > > > > > have a class and methods annotated with CliCommand
> > > > > > 2) We initiate the static file system fs once we connect to the table,
> > > > > > and then all operations interact with that fs
> > > > > >
> > > > > > Along similar lines, we can write a Spring Boot app:
> > > > > > 1) Which will spin up a new microservices server, and in place of
> > > > > > CliCommand we will have Spring Boot endpoints
> > > > > > 2) Since microservices are stateless, we can't rely on the static
> > > > > > filesystem variable fs.
> > > > > > So in place of that we can have a map<user_session_id,fs> with
> > > > > > auto invalidation after a specified time
> > > > > > 3) We will integrate this service with LDAP using Spring Security etc.
> > > > > > and authorisation at the table and command/endpoint level
> > > > > >
> > > > > > So we should be able to leverage most of the CLI code with some
> > > > > > modification.
> > > > > >
> > > > > > I am deferring the UI for now if we are OK with the service design, but
> > > > > > if we go with the basic UI, we can just have a tree of tables on the
> > > > > > left with all greyed out. Once the user connects to a table, the
> > > > > > relevant context menu options will be enabled depending upon user
> > > > > > authorisation. The output of the command can be printed on the right
> > > > > > panel, leveraging the CLI output format.
> > > > > >
> > > > > > On 2020/07/07 23:52:15, Vinoth Chandar <vin...@apache.org> wrote:
> > > > > > > Nope. We can begin on a fresh slate. Feel free to even create a new
> > > > > > > RFC, if that does not fit with what you have in mind.
> > > > > > >
> > > > > > > On Mon, Jul 6, 2020 at 6:31 AM tanu dua <tanu.dua...@gmail.com> wrote:
> > > > > > >
> > > > > > > > Sure, my team and I can think of contributing here. May I know if
> > > > > > > > something has already kicked off, and the technologies that are
> > > > > > > > used to build the services and UI?
> > > > > > > >
> > > > > > > > On Mon, 6 Jul 2020 at 5:26 PM, Vinoth Chandar <vin...@apache.org>
> > > > > > > > wrote:
> > > > > > > >
> > > > > > > > > Hi Tanuj,
> > > > > > > > >
> > > > > > > > > Good idea to have a service/UI. There is an inactive proposal
> > > > > > > > > around this, if you want to revive and drive it forward.
> > > > > > > > >
> > > > > > > > > https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=130027233
> > > > > > > > >
> > > > > > > > > Thanks
> > > > > > > > > Vinoth
> > > > > > > > >
> > > > > > > > > On Sun, Jul 5, 2020 at 11:07 PM Tanuj <tanu.dua...@gmail.com> wrote:
> > > > > > > > >
> > > > > > > > > > Hi all,
> > > > > > > > > > HUDI CLI is a great tool, but I believe its biggest limitation is
> > > > > > > > > > that you can only access it from a shell, and in the higher
> > > > > > > > > > environments we may not get a shell to execute the commands.
> > > > > > > > > >
> > > > > > > > > > How about exposing HUDI CLI as a service backed by LDAP, with all
> > > > > > > > > > proper authorisation, maybe as a Spring Boot service?
> > > > > > > > > >
> > > > > > > > > > Thanks.
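
For reference, point 2 of the Spring Boot design quoted above (a map<user_session_id,fs> with auto invalidation after a specified time) could look roughly like the sketch below. It is written in Python only for brevity; the thread proposes a Spring Boot (Java) service, so treat this as an illustration of the idea rather than the intended implementation. SessionCache, ttl_seconds, and the value standing in for fs are all hypothetical names, not Hudi or Spring APIs.

```python
import time
from typing import Callable, Dict, Generic, Optional, Tuple, TypeVar

T = TypeVar("T")


class SessionCache(Generic[T]):
    """Maps a user session id to a per-session handle (e.g. an fs object)
    and invalidates entries that have been idle longer than ttl_seconds."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be exercised without sleeping
        self._entries: Dict[str, Tuple[T, float]] = {}

    def put(self, session_id: str, handle: T) -> None:
        """Store (or replace) the handle for a session and stamp it with the current time."""
        self._entries[session_id] = (handle, self._clock())

    def get(self, session_id: str) -> Optional[T]:
        """Return the handle if present and not expired; refresh its timestamp on access."""
        entry = self._entries.get(session_id)
        if entry is None:
            return None
        handle, last_used = entry
        now = self._clock()
        if now - last_used > self._ttl:
            del self._entries[session_id]  # idle too long: invalidate
            return None
        self._entries[session_id] = (handle, now)
        return handle

    def evict_expired(self) -> int:
        """Remove every expired entry; return how many were removed."""
        now = self._clock()
        stale = [sid for sid, (_, used) in self._entries.items()
                 if now - used > self._ttl]
        for sid in stale:
            del self._entries[sid]
        return len(stale)
```

A real service would also need locking around the map and would have to close the underlying file system handle on eviction; neither is shown here.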