Can you see pictures here? https://github.com/apache/hudi/issues/3755
Thanks! Let me read that article. I'm trying to create another Bloom-index
MOR table to see if the problem still exists.
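
For reference, the file-name layout described in the reply quoted below
(file ID, write token, instant) can be sketched in Python against the file
name from my original report. This is only an illustration, not Hudi code:
the helper name parse_base_file_name and the field names are hypothetical,
and reading the three write-token parts as task partition ID, stage ID, and
task attempt ID is an assumption based on my reading of
FSUtils.makeWriteToken.

```python
from typing import NamedTuple


class BaseFileName(NamedTuple):
    """Parts of a base file name (field names are illustrative)."""
    file_id: str
    task_partition_id: int  # assumed meaning of the 1st write-token part
    stage_id: int           # assumed meaning of the 2nd write-token part
    task_attempt_id: int    # assumed meaning of the 3rd write-token part
    instant_time: str


def parse_base_file_name(name: str) -> BaseFileName:
    """Split '<fileId>_<writeToken>_<instant>.<ext>' into its parts."""
    # Drop the extension, e.g. ".parquet".
    stem, _, _ext = name.rpartition(".")
    # The file ID contains hyphens but no underscores, so "_" is a safe
    # separator between the three top-level parts.
    file_id, write_token, instant_time = stem.split("_")
    partition_id, stage_id, attempt_id = write_token.split("-")
    return BaseFileName(
        file_id, int(partition_id), int(stage_id), int(attempt_id), instant_time
    )


parts = parse_base_file_name(
    "813800cd-1aaf-43ea-829f-4feef4a51cb3-0_19-2672-4427765_20211006051032.parquet"
)
print(parts.stage_id, parts.task_attempt_id)  # prints: 2672 4427765
```

Under that assumption, 2672-4427765 and 2657-4368242 would be the stage ID
and task attempt ID of two different attempts writing to the same file group
at the same instant.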

On Wed, Oct 6, 2021 at 2:54 PM 管梓越 <[email protected]> wrote:

> Hi JianFeng
>     It seems that something is wrong with the image, so I'm not able to
> see it on my side. I'm pleased to share some info about your first
> question.
>     The name of a base file is composed as {fileId}_{writeToken}_{instant}.
> For the write token, the method makeWriteToken in
> org.apache.hudi.common.fs.FSUtils shows how it is generated from three
> pieces of Spark task information. As far as I know, the write token is
> designed to distinguish files in the same file group that were generated by
> different task attempts.
>     Let me share a scenario. In a Spark compaction job, speculation is
> allowed. Two task attempts may try to generate a base file for the same
> file group, so only the file written by the attempt that succeeded can
> finally be picked up by Hudi. We use the file name returned by the
> successful task to get the one we want. The reconcileAgainstMarkers method
> in class HoodieTable shows how this process works.
>     I have no idea how this problem occurs; it should not happen with the
> default config and HDFS. I hope this info helps.
>     By the way, there is a WeChat account that has shared some excellent
> articles in Chinese about Hudi. For those who read Chinese, the following
> article may provide more information. Many thanks to the author.
>
>
> https://mp.weixin.qq.com/s?__biz=MzIyMzQ0NjA0MQ==&mid=2247484306&idx=1&sn=1d853469159a600d82050c17e6a2a075&chksm=e81f56e4df68dff2da417109c4a971aef54f056bc0519558c58e23fe60b90dc6e4f8d7e92774&token=1688466117&lang=zh_CN#rd
>
> On Wed, Oct 6, 2021 at 1:35 PM Jian Feng <[email protected]> wrote:
>
> > When I run DeltaStreamer (version 0.9) to ingest data from Kafka into an
> > HBase-indexed MOR table, after a few commits I hit this error while
> > compaction was running:
> > [image: image.png]
> >
> > In HDFS there is a file that has the same fileId and commit instant but
> > differs in the middle part of its name:
> >
> > hdfs://tl5/projects/data_vite/mysql_ingestion/rti_vite/shopee_item_v4_db__item_v4_tab_newHbase/BR/2021-10/813800cd-1aaf-43ea-829f-4feef4a51cb3-0_19-2672-4427765_20211006051032.parquet
> >
> > Below is the content of 20211006051032.commit:
> >
> >
> > [image: image.png]
> >
> >
> > What do 2672-4427765 and 2657-4368242 mean, and how can I fix this
> > error?
> >
> > I tried recreating the table, but it happens again.
> >
> >
> > --
> > *Jian Feng,冯健*
> > Shopee | Engineer | Data Infrastructure
> >
>


-- 
*Jian Feng,冯健*
Shopee | Engineer | Data Infrastructure
