Hi Vinoth,

I'm having a crazy week, and the next 2 to 3 weeks are going to be very busy.
I haven't had a chance to look into this yet.
My thoughts are around security. The idea of building external indexes comes
with loads of advantages, but throwing user data into the logs etc. makes me
anxious. Let me do a deep dive and come back to you.
Thanks
Kabeer.

On Oct 21 2019, at 3:07 pm, Vinoth Chandar <[email protected]> wrote:
> Any thoughts? :) anyone?
>
> On Wed, Oct 9, 2019 at 11:06 AM Vinoth Chandar <[email protected]> wrote:
> > Hi all,
> > Wanted to share some prototyping I was doing for HUDI-46. The idea here is
> > to see if we can embed a parquet file "inline" into an outer file (our
> > log), so that if the user chooses to they can also get parquet data in the
> > logs to speed up real-time view queries. We would be using the standard
> > ParquetWriter and ParquetReader on top of a custom FileSystem
> > implementation.
> >
> >
> > https://github.com/vinothchandar/incubator-hudi/commit/c60f4578f794d0f0d0e194b3e509cc0c5f132576
> > Wrote a small PoC with TODOs and gaps annotated. Wanted to see if you all
> > can poke more holes here and see if we can generalize to embedding any
> > file, e.g. HFile.
> >
> > I believe we can generalize it and thus build things like external
> > indexing very easily on the existing log format.
> >
> > Thanks
> > Vinoth
>
>
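
For anyone following the thread, here is a rough sketch of the "inline file" idea described above: a byte range of an outer file (the log) is exposed as if it were a standalone file, so an unmodified reader can seek and read within it. This is not the code from the linked PoC commit, and it does not use Hadoop's FileSystem API. The class name InlineSlice, the header/footer markers, and the payload bytes are all made up for illustration, and the payload is placeholder bytes, not a real parquet file. Seeking relative to the end is included because parquet readers locate the footer from the end of the file.

```python
import io


class InlineSlice(io.RawIOBase):
    """Read-only view of bytes [start, start + length) of an outer stream.

    Hypothetical helper: positions reported to the caller are relative to
    the embedded file, not to the outer log.
    """

    def __init__(self, outer, start, length):
        self._outer = outer
        self._start = start
        self._length = length
        self._pos = 0

    def readable(self):
        return True

    def seekable(self):
        return True

    def tell(self):
        return self._pos

    def seek(self, offset, whence=io.SEEK_SET):
        # All three modes are resolved against the slice, never the outer file.
        if whence == io.SEEK_SET:
            pos = offset
        elif whence == io.SEEK_CUR:
            pos = self._pos + offset
        elif whence == io.SEEK_END:
            pos = self._length + offset
        else:
            raise ValueError("invalid whence")
        if pos < 0:
            raise ValueError("negative seek position")
        self._pos = pos
        return self._pos

    def readinto(self, buf):
        # Never read past the end of the embedded range.
        remaining = max(0, self._length - self._pos)
        n = min(len(buf), remaining)
        if n == 0:
            return 0
        self._outer.seek(self._start + self._pos)
        data = self._outer.read(n)
        buf[:len(data)] = data
        self._pos += len(data)
        return len(data)


# Build a toy "log": header, embedded payload, footer (all placeholder bytes).
outer = io.BytesIO()
outer.write(b"LOGHEADER")
start = outer.tell()
payload = b"PAR1...embedded bytes...PAR1"
outer.write(payload)
outer.write(b"LOGFOOTER")

inner = InlineSlice(outer, start, len(payload))
inner.seek(-4, io.SEEK_END)   # jump to the last 4 bytes of the embedded file
tail = inner.read(4)
inner.seek(0)
whole = inner.read()          # reads only the embedded range
```

The same translation of offsets is what a custom FileSystem would have to do for the standard reader and writer; generalizing to other embedded formats should only need the (start, length) pair recorded somewhere in the log block.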
