Hi,
We wanted a row-based format so we could quickly log changes to the base
files and flexibly compact only the file groups we chose. If we wrote Parquet
for the log instead, for example, we would incur the cost of writing Parquet
(which can be up to 10x higher) twice: once during ingest and once again
during compaction.
Of course, this trades off query latency for ingest cost. There is also
ongoing work to flexibly keep log block data in Parquet. See
InlineFileSystem/tests if interested.
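
To make the write-cost argument concrete, here is a rough back-of-the-envelope
sketch. The unit costs are made-up placeholders; only the "up to 10x" ratio
comes from the point above, and the function name is purely illustrative, not
anything in Hudi.

```python
# Hypothetical relative write costs per unit of data (assumptions, not
# measurements). Only the "up to 10x" ratio is taken from the message.
ROW_LOG_WRITE_COST = 1.0   # assumed cost of appending to a row-based (Avro) log
PARQUET_WRITE_COST = 10.0  # assumed worst case: up to 10x the row-based cost


def total_write_cost(ingest_cost: float, compaction_cost: float) -> float:
    """Data is written twice: once to the log at ingest, once at compaction."""
    return ingest_cost + compaction_cost


# Row-based log at ingest, Parquet base file written at compaction.
avro_log_total = total_write_cost(ROW_LOG_WRITE_COST, PARQUET_WRITE_COST)

# Parquet log at ingest, Parquet base file written again at compaction.
parquet_log_total = total_write_cost(PARQUET_WRITE_COST, PARQUET_WRITE_COST)

print(f"row-based log: {avro_log_total}, parquet log: {parquet_log_total}")
```

Under these assumed numbers the Parquet write is paid once with a row-based
log and twice with a Parquet log. The sketch ignores the read side, which is
where the query latency trade-off shows up.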
Thanks
Vinoth
On Mon, Jun 14, 2021 at 1:54 AM LakeShen wrote:
> Hi community,
>
> I have a question: why does Hudi use Avro as the MOR log format?
>
>
> Best,
> LakeShen
>