GitHub user geserdugarov edited a comment on the discussion: Spark DataSource
V2 read and write benchmarks?
For queries that read from and write to the same table:
```sql
-- 1st example
INSERT INTO hudi_tbl
SELECT * FROM hudi_tbl WHERE ...
-- 2nd example
UPDATE hudi_tbl t
SET somecol = somecol + 100
WHERE EXISTS (
  SELECT 1
  FROM hudi_tbl s
  WHERE s.id = t.id
    AND s.anothercol > 100
);
```
combining V1 write with V2 read could be tricky.
I suppose we could shift the focus to full DataSource V2 support without a
performance drop (for both read and write), instead of trying to support V1
write and V2 read simultaneously. In that case, we would still have to resolve
compatibility issues, but only from the V1 -> V2 migration point of view, rather
than handling a complex hybrid migration with a lot of edge cases.
GitHub link:
https://github.com/apache/hudi/discussions/13955#discussioncomment-15059073