bithw1 opened a new issue, #17734:
URL: https://github.com/apache/hudi/issues/17734
### Describe the problem you faced
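In Spark SQL, I create a table and insert three rows that share the same primary key (the full script is in the second code block below).
When I then run `select * from hudi_cow_20251229_07`, only one row comes back (first code block below).
I wonder why (1,2,3) and (1,3,6) are gone. I am using the `insert` operation, so no duplicates should
be dropped.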
```
spark-sql> select * from hudi_cow_20251229_07;
_hoodie_commit_time _hoodie_commit_seqno _hoodie_record_key _hoodie_partition_path _hoodie_file_name a b c
20251229154740370 20251229154740370_0_0 1
```
```
set hoodie.spark.sql.insert.into.operation=insert;
set hoodie.datasource.write.insert.drop.duplicates=false;
set hoodie.datasource.write.insert.dup.policy=none;
CREATE TABLE IF NOT EXISTS hudi_cow_20251229_07 (
a INT,
b INT,
c INT
)
USING hudi
tblproperties(
type='cow',
primaryKey='a',
hoodie.datasource.write.precombine.field='c'
);
insert into hudi_cow_20251229_07(a,b,c) values(1,2,3),(1,4,7),(1,3,6);
```
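To make the symptom concrete, here is a small Python sketch of what the result looks like from the outside. It is not Hudi code and says nothing about what Hudi does internally. It only models the assumption that rows sharing the primary key `a` were collapsed into one record, keeping the row with the largest value of the precombine field `c`. The function name `collapse_by_key` is hypothetical.

```python
from typing import Dict, List, Tuple

Row = Tuple[int, int, int]  # (a, b, c), matching the table columns


def collapse_by_key(rows: List[Row]) -> List[Row]:
    """Keep one row per key `a`, preferring the largest `c`.

    Hypothetical model of the observed behavior, not Hudi's implementation.
    """
    best: Dict[int, Row] = {}
    for row in rows:
        key, _, ordering = row
        # Replace the stored row only if this one has a larger ordering value.
        if key not in best or ordering > best[key][2]:
            best[key] = row
    return list(best.values())


# The three rows from the INSERT statement above.
rows = [(1, 2, 3), (1, 4, 7), (1, 3, 6)]
print(collapse_by_key(rows))
```

Under that assumption the only survivor is (1,4,7), which is consistent with (1,2,3) and (1,3,6) being the rows that are missing. With `insert` and the duplicate-related settings above, I expected all three rows to be kept instead.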
### To Reproduce
1.
2.
3.
4.
### Expected behavior
1
### Environment Description
* Hudi version:1
* Spark version:
* Flink version:
* Hive version:
* Hadoop version:
* Storage (HDFS/S3/GCS..):
* Running on Docker? (yes/no):
### Additional context
1
### Stacktrace
```shell
```