Yes, we write user data to the append log, which is used for compaction. My
guess is that Kabeer's concern was about actually sending user data into
debug logs (slf4j/log4j), which we don't do.

On the second part: yes, we want the option to write Parquet data inline
instead of Avro. Once we harden this, any other format, e.g. ORC, would also
be easy to do. That's my thinking. WDYT?
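
To make the inline idea concrete, here is a minimal sketch of the underlying
mechanism, assuming the inner file is just a byte range (offset + length)
inside the outer log. It is not the PoC code: the class and method names
(InlineEmbedSketch, Slice, append, read) are hypothetical, the payload is
stand-in bytes rather than a real Parquet file, and an actual implementation
would expose the range through a custom FileSystem rather than copying bytes.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical illustration only: an "outer" log modeled as an in-memory buffer.
public class InlineEmbedSketch {

    /** Where an embedded payload sits inside the outer file. */
    static final class Slice {
        final int offset;
        final int length;

        Slice(int offset, int length) {
            this.offset = offset;
            this.length = length;
        }
    }

    /** Appends the payload to the outer buffer and records where it landed. */
    static Slice append(ByteArrayOutputStream outer, byte[] payload) {
        int offset = outer.size();
        outer.write(payload, 0, payload.length);
        return new Slice(offset, payload.length);
    }

    /** Returns only the embedded bytes, ignoring everything around them. */
    static byte[] read(byte[] outer, Slice slice) {
        return Arrays.copyOfRange(outer, slice.offset, slice.offset + slice.length);
    }

    public static void main(String[] args) {
        ByteArrayOutputStream log = new ByteArrayOutputStream();

        // Stand-in bytes for whatever the outer log writes before the payload.
        append(log, "outer-header".getBytes(StandardCharsets.UTF_8));

        // Stand-in bytes for an inner file (e.g. one produced by a format writer).
        byte[] inner = "inner-file-bytes".getBytes(StandardCharsets.UTF_8);
        Slice where = append(log, inner);

        // Stand-in bytes for whatever the outer log writes after the payload.
        append(log, "outer-footer".getBytes(StandardCharsets.UTF_8));

        // A reader that knows the slice sees exactly the inner file.
        byte[] roundTrip = read(log.toByteArray(), where);
        System.out.println(Arrays.equals(inner, roundTrip));
    }
}
```

The point of the sketch is only that a reader given the offset and length can
treat the embedded range as a standalone file, which is why the same approach
should carry over to other formats.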

On Wed, Oct 23, 2019 at 6:28 AM Jaimin Shah <shahjaimin0...@gmail.com>
wrote:

> Hi Vinoth,
>    Aren’t we writing user data to the append log currently? The way I
> understand it, data is currently written in Avro, which you want to move
> to inline Parquet. Please correct me if I am missing something.
>
> Thanks,
> Jaimin
>
> On Wednesday, 23 October 2019, Vinoth Chandar <vin...@apache.org> wrote:
>
> > Sure. Take your time! Just to clarify, here "log" refers to the Hudi append
> > log, not the user's log4j or such logs. Yes, that would be very strange to
> > do. :)
> >
> > On Wed, Oct 23, 2019 at 3:06 AM Kabeer Ahmed <kab...@linuxmail.org> wrote:
> >
> > > Hi Vinoth,
> > >
> > > I'm having a crazy week, and the next 2 to 3 weeks are going to be very
> > > busy. I haven't had a chance to look into this.
> > > My thoughts are around security. The idea of building external indexes
> > > comes with loads of advantages, but throwing user data into the logs etc.
> > > makes me anxious. Let me do a deep dive and come back to you.
> > > Thanks
> > > Kabeer.
> > >
> > > On Oct 21 2019, at 3:07 pm, Vinoth Chandar <vin...@apache.org> wrote:
> > > > Any thoughts? :) anyone?
> > > >
> > > > > On Wed, Oct 9, 2019 at 11:06 AM Vinoth Chandar <vin...@apache.org> wrote:
> > > > > > Hi all,
> > > > > > Wanted to share some prototyping I was doing for HUDI-46. The idea
> > > > > > here is to see if we can embed a parquet file "inline" into an outer
> > > > > > file (our log), so that if the user chooses to they can also get
> > > > > > parquet data in the logs to speed up real-time view queries. We would
> > > > > > be using the standard ParquetWriter and ParquetReader on top of a
> > > > > > custom FileSystem implementation.
> > > > >
> > > > >
> > > > >
> > > > > > https://github.com/vinothchandar/incubator-hudi/commit/c60f4578f794d0f0d0e194b3e509cc0c5f132576
> > > > > > Wrote a small PoC with TODOs and gaps annotated. Wanted to see if you
> > > > > > all can poke more holes here and see if we can generalize to
> > > > > > embedding any file, e.g. HFile.
> > > > >
> > > > > I believe we can generalize it and thus build things like external
> > > > > indexing very easily on the existing log format.
> > > > >
> > > > > Thanks
> > > > > Vinoth
> > > >
> > > >
> > >
> > >
> >
>
