Thanks Zhang Yue for drafting the RFC. It's an interesting read! I have left some comments.
While exposing certain info such as "sample_hoodie_key", we have to consider masking/obfuscation. Looking forward to the implementation.

Regards,
Sagar

On Wed, Sep 7, 2022 at 1:49 PM Yue Zhang <zhangyue921...@163.com> wrote:

> Hi Hudi,
>
> I just raised an RFC about this diagnostic reporter:
> https://github.com/apache/hudi/pull/6600. Please feel free to leave any
> comments or concerns if you are interested!
>
> Yue Zhang
> zhangyue921...@163.com
>
> On 08/4/2022 19:38, Yue Zhang <zhangyue921...@163.com> wrote:
>
> Hi Shiyan and everyone,
>
> This is a great idea! As a Hudi user, I also struggle with Hudi
> troubleshooting sometimes. This feature will definitely reduce the burden.
> So I volunteer to draft a discussion and maybe raise an RFC about it, if you
> don't mind. Thanks :)
>
> Yue Zhang
> zhangyue921...@163.com
>
> On 08/3/2022 00:44, 冯健 <fengjian...@gmail.com> wrote:
>
> Maybe we can start this with an audit feature? Since we need some sort of
> "images" to represent "facts", we can create an identity for each writer to
> link them. In this audit file, we can label each operation with IP,
> environment, platform, version, write config, etc.
>
> On Sun, 31 Jul 2022 at 12:18, Shiyan Xu <xu.shiyan.raym...@gmail.com> wrote:
>
> To bubble this up
>
> On Wed, Jun 15, 2022 at 11:47 PM Vinoth Chandar <vin...@apache.org> wrote:
>
> +1 from me.
>
> It will be very useful if we can have something that can gather
> troubleshooting info easily. This part takes a while currently.
>
> On Mon, May 30, 2022 at 9:52 AM Shiyan Xu <xu.shiyan.raym...@gmail.com> wrote:
>
> Hi all,
>
> When troubleshooting Hudi jobs in users' environments, we always ask users
> to share configs and environment info, check the Spark UI, etc. Here is an
> RFC idea: can we extend the Hudi metrics system and make a diagnostic
> reporter? It can be turned on like a normal metrics reporter. It should
> collect common troubleshooting info and save it to JSON or another
> human-readable text format. Users should be able to run with it and share
> the diagnosis file. The RFC should discuss what info should / can be
> collected.
>
> Does this make sense? Anyone interested in driving the RFC design and
> implementation work?
>
> --
> Best,
> Shiyan
>
> --
> Best,
> Shiyan