hudi-bot opened a new issue, #15583:
URL: https://github.com/apache/hudi/issues/15583
Currently, to check how many rows a Spark SQL DML statement affected, users
have to look up the commit stats through the Hudi CLI or a stored procedure.
We can improve the user experience by returning num_affected_rows after an
INSERT INTO command, so that Spark SQL users can see how many rows were
inserted without having to inspect the commits themselves.
num_affected_rows can be extracted in the writer itself from commitMetadata.
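A rough sketch of that extraction, assuming the commit metadata exposes per-file write stats grouped by partition path. `WriteStat` and `num_affected_rows` below are hypothetical stand-ins written for illustration, not Hudi classes, and which counters count as "affected" (inserts, updates, deletes) is also an assumption:

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class WriteStat:
    """Hypothetical stand-in for a per-file write stat; not Hudi's own class."""
    num_inserts: int = 0
    num_update_writes: int = 0
    num_deletes: int = 0


def num_affected_rows(partition_to_write_stats: Dict[str, List[WriteStat]]) -> int:
    """Sum inserted, updated, and deleted records over every partition's stats."""
    return sum(
        stat.num_inserts + stat.num_update_writes + stat.num_deletes
        for stats in partition_to_write_stats.values()
        for stat in stats
    )


# Assumed shape: partition path -> write stats. Here seven inserted rows are
# split across two files of a non-partitioned table (empty partition path).
commit_metadata = {"": [WriteStat(num_inserts=4), WriteStat(num_inserts=3)]}
print(num_affected_rows(commit_metadata))  # prints 7
```

For a plain INSERT INTO only the insert counter would be non-zero; summing all three counters is one possible definition that could also cover UPDATE and DELETE statements.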
Example:
{code:java}
spark.sql("""
create table test_mor (id int, name string)
using hudi
tblproperties (primaryKey = 'id', type='mor');
""")
spark.sql(
"""
INSERT INTO test_mor
VALUES
(1, "a"),
(2, "b"),
(3, "c"),
(4, "d"),
(5, "e"),
(6, "f"),
(7, "g")
""").show()
{code}
returns:
{code}
+-----------------+
|num_affected_rows|
+-----------------+
|                7|
+-----------------+
{code}
## JIRA info
- Link: https://issues.apache.org/jira/browse/HUDI-5243
- Type: Improvement