Prabhu Joseph created HUDI-7354:
-----------------------------------

             Summary: Flink Batch Read from Hudi table does not return any rows
                 Key: HUDI-7354
                 URL: https://issues.apache.org/jira/browse/HUDI-7354
             Project: Apache Hudi
          Issue Type: Bug
          Components: flink-sql
    Affects Versions: 0.14.1
            Reporter: Prabhu Joseph


A Flink batch read from a Hudi table does not return any rows. The same Flink
SQL script returns 8 rows, as expected, on Hudi 0.14.0.


*Repro Steps*

1. Flink 1.18.1 and Hudi 0.14.1

2. Open Flink YARN Session
{code}
flink-yarn-session -d \
  -D execution.checkpointing.interval=10s \
  -D state.checkpoint-storage=filesystem \
  -D state.checkpoints.dir=s3://prabhuflinks3/test-output/flink/output/20eab3b1-d58a-491c-8819-15e451a549eb
{code}

3. Place CSV Input Data
{code}
cat > data <<EOF
1,Danny,23
2,Stephen,33
3,Julian,53
4,Fabian,31
5,Sophia,18
6,Emma,20
7,Bob,44
8,Han,56
EOF

hadoop fs -mkdir -p \
  s3://prabhuflinks3/test-output/flink/output/8d007d79-913d-4ed4-a6e4-9af591f24c36/csvinput/
hadoop fs -put data \
  s3://prabhuflinks3/test-output/flink/output/8d007d79-913d-4ed4-a6e4-9af591f24c36/csvinput/
{code}

4. Run the attached Flink SQL script
{code}
/usr/lib/flink/bin/sql-client.sh -f flink-hudi-hive.sql
{code}


The script creates a Flink filesystem table over the 8-row CSV data. It then
creates a Hudi table and inserts the rows from the filesystem table into it.
Finally, it runs a SELECT query against the Hudi table. The SELECT query does
not return any rows.
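
The attached script is not reproduced in this message. For readers without
the attachment, a script of the shape described above might look roughly like
the sketch below. The table names (csv_source, hudi_table), the column names
and types, the Hudi table path, and the table type are all assumptions made
for illustration; only the CSV input path comes from step 3.

{code:sql}
-- Hypothetical sketch, NOT the attached flink-hudi-hive.sql.
-- Filesystem table over the CSV file uploaded in step 3.
-- Column names and types are guessed from the sample rows.
CREATE TABLE csv_source (
  id   INT,
  name STRING,
  age  INT
) WITH (
  'connector' = 'filesystem',
  'path' = 's3://prabhuflinks3/test-output/flink/output/8d007d79-913d-4ed4-a6e4-9af591f24c36/csvinput/',
  'format' = 'csv'
);

-- Hudi table; the path is a placeholder and the table type is an assumption.
CREATE TABLE hudi_table (
  id   INT PRIMARY KEY NOT ENFORCED,
  name STRING,
  age  INT
) WITH (
  'connector' = 'hudi',
  'path' = 's3://<bucket>/<path>/hudi_table/',
  'table.type' = 'COPY_ON_WRITE'
);

-- Load the 8 rows, then read them back.
INSERT INTO hudi_table SELECT id, name, age FROM csv_source;

SELECT * FROM hudi_table;
{code}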


*Analysis*

The SELECT query and the INSERT query run concurrently. On the affected
version, the SELECT query finishes immediately because the Hudi table has no
data yet. On Hudi 0.14.0, the SELECT query waits until the data has been
loaded and then returns it.
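
If the two statements do overlap as described, one way to check the analysis
is to make the SQL client wait for the INSERT job before it submits the
SELECT. The sketch below uses the Flink option table.dml-sync, which defaults
to false (DML jobs are submitted asynchronously). The table names are
hypothetical stand-ins for the ones in the attached script. This is only a
diagnostic or workaround sketch; it does not explain why 0.14.0 and 0.14.1
behave differently.

{code:sql}
-- Block the SQL client until each DML job (INSERT) has finished.
SET 'table.dml-sync' = 'true';

-- Hypothetical table names; substitute the ones from the attached script.
INSERT INTO hudi_table SELECT * FROM csv_source;

-- Submitted only after the INSERT job above has completed.
SELECT * FROM hudi_table;
{code}

If the SELECT returns the 8 rows with this setting, that would be consistent
with the query having raced the insert rather than with the data being
missing.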

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
