r it?
> >>
> >> On Tue, Oct 18, 2022 at 10:20 AM Bingeng Huang
> >> wrote:
> >>
> >>> Hi all,
> >>>
> >>> Do we have a plan to integrate data TTL into Hudi, so we don't have to
> >>> schedule an offline Spark job to delete outdated data? Just set a TTL
> >>> config, and the writer or some offline service will delete old data as
> >>> expected.
> >>>
> >>
>
>
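Hudi does not expose such a TTL config today (that is what this thread is asking for), so below is a minimal sketch of what the offline cleanup logic could look like, assuming daily partitions named `yyyy/MM/dd`. The helper name `expired_partitions` is hypothetical; the returned list could then be fed to a partition-delete write.

```python
from datetime import datetime, timedelta

def expired_partitions(partitions, ttl_days, today=None):
    """Return partition paths (yyyy/MM/dd) older than the TTL window."""
    today = today or datetime.utcnow()
    cutoff = today - timedelta(days=ttl_days)
    return [p for p in partitions
            if datetime.strptime(p, "%Y/%m/%d") < cutoff]

parts = ["2022/10/01", "2022/10/15", "2022/10/18"]
# With a 7-day TTL evaluated on 2022-10-18, only the oldest partition expires.
print(expired_partitions(parts, ttl_days=7, today=datetime(2022, 10, 18)))
# → ['2022/10/01']
```

A built-in TTL would essentially run this selection inside the writer or a table service instead of a user-scheduled job.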
--
*Jian Feng,冯健*
Shopee | Engineer | Data Infrastructure
o delta logs. This is a
> >lock-free process if we can make sure they don't write data to the same
> >log file (we plan to create multiple marker files to achieve this). And
> >with the log merge API (preCombine logic in the Payload class), data in
> >log files can be read properly.
> >-
> >
> >Since Hudi already has an index type like the Bucket index, which can
> >map keys to buckets in a consistent way, data duplicates can be
> >eliminated.
> >
> >
> > Thanks,
> > Jian Feng
> >
>
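The two ideas in that proposal, a consistent key-to-bucket mapping and a preCombine-style merge that deduplicates by key, can be sketched as follows. This is an illustrative sketch only: the hash choice and the `key`/`ts` field names are assumptions, not Hudi's actual Bucket index or Payload implementation.

```python
import hashlib

def bucket_of(record_key: str, num_buckets: int) -> int:
    """Consistent key->bucket mapping: the same key always lands in the same bucket."""
    h = int(hashlib.md5(record_key.encode()).hexdigest(), 16)
    return h % num_buckets

def pre_combine(records):
    """Keep the record with the highest ordering value per key (preCombine-style merge)."""
    latest = {}
    for rec in records:
        key = rec["key"]
        if key not in latest or rec["ts"] > latest[key]["ts"]:
            latest[key] = rec
    return list(latest.values())

# The mapping is deterministic, so concurrent writers agree on bucket placement.
assert bucket_of("itemid:42", 8) == bucket_of("itemid:42", 8)
merged = pre_combine([{"key": "a", "ts": 1, "v": "old"},
                      {"key": "a", "ts": 2, "v": "new"}])
print(merged)  # → [{'key': 'a', 'ts': 2, 'v': 'new'}]
```

Because placement is deterministic, duplicate versions of a key always meet in the same bucket, where the merge logic can resolve them at read time.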
odb.io/blog/2020/08/04/prestodb-and-hudi
> > [3] https://github.com/trinodb/trino/pull/9641
> > [4]
> >
> https://cwiki.apache.org/confluence/display/HUDI/RFC+-+33++Hudi+supports+more+comprehensive+Schema+Evolution
> > [5]
> >
> https://hudi.apache.org/blog/2021/07/21/streaming-data-lake-platform#timeline-metaserver
> > [6] https://github.com/codope/trino/tree/hudi-plugin
> > [7] https://trino.io/docs/current/develop/connectors.html
> >
>
at 12:50 AM Vinoth Chandar wrote:
> Yeah, all the rate limiting code in HBaseIndex is a workaround for these
> large bulk writes.
>
> On Tue, Oct 5, 2021 at 11:16 AM Jian Feng wrote:
>
> > actually I met this problem when bootstrapping a huge table, after changed
> >
cle may
> provide more information. Great thanks to the author.
>
>
> https://mp.weixin.qq.com/s?__biz=MzIyMzQ0NjA0MQ==&mid=2247484306&idx=1&sn=1d853469159a600d82050c17e6a2a075&chksm=e81f56e4df68dff2da417109c4a971aef54f056bc0519558c58e23fe60b90dc6e4f8d7e92774&token=168846
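The rate limiting mentioned above can be illustrated with a simple batching throttle. This is a generic sketch, not the actual HBaseIndex code; the function name and parameters are illustrative.

```python
import time

def throttled_batches(puts, batch_size, max_batches_per_sec):
    """Yield fixed-size batches of puts, sleeping between batches to cap throughput."""
    interval = 1.0 / max_batches_per_sec
    for i in range(0, len(puts), batch_size):
        yield puts[i:i + batch_size]
        time.sleep(interval)  # back off so bulk writes don't overwhelm region servers

batches = list(throttled_batches(list(range(10)), batch_size=4, max_batches_per_sec=100))
print([len(b) for b in batches])  # → [4, 4, 2]
```

The point is that a bulk bootstrap write gets smeared out over time instead of hitting the index store all at once.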
r?
I tried recreating the table; it happens again.
tion of
> the
> > whole hudi project.
> >
> > On Mon, Oct 4, 2021, 11:29 PM wrote:
> > when I bootstrapped a huge HBase index table, I found all keys had the
> > prefix 'itemid:', which caused data skew: there are 100 region servers in
> > HBase but only one was handling data. Is there any way to avoid this issue
> > on the Hudi side?
> >
>
When I bootstrapped a huge HBase index table, I found all keys had the prefix
'itemid:', which caused data skew: there are 100 region servers in HBase
but only one was handling data.
Is there any way to avoid this issue on the Hudi side?
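A common workaround for a shared row-key prefix like 'itemid:' (independent of Hudi itself) is to salt the HBase row key so writes spread across regions. A minimal sketch; the salt width and the `salt|key` format are illustrative choices:

```python
import hashlib

def salted_key(row_key: str, num_salts: int = 100) -> str:
    """Prefix the key with a stable hash-derived salt so a shared prefix
    like 'itemid:' no longer funnels every write into one region."""
    salt = int(hashlib.md5(row_key.encode()).hexdigest(), 16) % num_salts
    return f"{salt:02d}|{row_key}"

keys = [f"itemid:{i}" for i in range(1000)]
salts = {salted_key(k).split("|")[0] for k in keys}
print(len(salts))  # keys now spread over many salt prefixes instead of one
```

The trade-off is that point lookups must recompute the salt (cheap, since it is derived from the key itself), and scans must fan out across all salt prefixes.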
Hi all, can anyone give me a sample?
--
FengJian
Data Infrastructure Team
Mobile +65 90388153
Address 5 Science Park Drive, Shopee Building, Singapore 118265
Can anyone help look at this issue?
https://github.com/apache/hudi/issues/3327
When ingesting data into a Hudi table, I found that if the Avro schema has an
array field with a nested array field and no other fields, an error happens;
but if I add a dummy field, ingestion works fine.
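A minimal reproduction of the schema shape described above, a record whose only field is an array of arrays, built here as plain JSON. The `Item`/`matrix`/`dummy` names are illustrative; the actual failing schema is in the linked issue.

```python
import json

# Record with ONLY a nested-array field: the shape reported to fail.
failing = {
    "type": "record", "name": "Item", "fields": [
        {"name": "matrix",
         "type": {"type": "array",
                  "items": {"type": "array", "items": "string"}}},
    ],
}

# Reported workaround: adding any extra "dummy" field makes ingestion work.
working = json.loads(json.dumps(failing))  # deep copy via JSON round-trip
working["fields"].append(
    {"name": "dummy", "type": ["null", "string"], "default": None})

print([f["name"] for f in working["fields"]])  # → ['matrix', 'dummy']
```

Comparing the two schemas side by side is a quick way to confirm the dummy field is the only difference when reporting the bug.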
I saw a PR here: https://github.com/apache/hudi/pull/3252