[I] Return num_affected_rows from sql INSERT statement [hudi]

via GitHub Sat, 29 Nov 2025 22:07:57 -0800


hudi-bot opened a new issue, #15583:
URL: https://github.com/apache/hudi/issues/15583


   Currently, when running Spark SQL DML, users who want to check how many rows 
were affected have to look up the commit stats using the Hudi CLI or a stored 
procedure.
   
   We can improve the user experience by returning num_affected_rows after an 
INSERT INTO command, so that Spark SQL users can easily see how many rows were 
inserted without having to inspect the commits themselves.
   
   num_affected_rows can be extracted in the writer itself from commitMetadata.
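
   To illustrate the extraction step, here is a minimal sketch. It assumes the commit metadata exposes per-file write stats grouped by partition path, each carrying an insert count. `WriteStat`, `num_inserts`, and the `num_affected_rows` helper are hypothetical stand-ins for illustration, not Hudi's actual classes or methods.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class WriteStat:
    """Hypothetical stand-in for one per-file write stat in a commit."""
    num_inserts: int
    num_updates: int = 0


def num_affected_rows(partition_to_write_stats: Dict[str, List[WriteStat]]) -> int:
    """Sum the inserted-record counts over every file written by the commit."""
    return sum(
        stat.num_inserts
        for stats in partition_to_write_stats.values()
        for stat in stats
    )


# Hypothetical commit for a 7-row INSERT into a non-partitioned table,
# where the rows happened to land in two files (4 + 3).
commit_metadata = {"": [WriteStat(num_inserts=4), WriteStat(num_inserts=3)]}
print(num_affected_rows(commit_metadata))
```

   Summing in the writer means the count comes from the same metadata that is committed, so the INSERT command could return it without a second read of the timeline.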
   
   Example:
   {code:java}
   spark.sql("""
   create table test_mor (id int, name string) 
   using hudi 
   tblproperties (primaryKey = 'id', type='mor');
   """)
   
   spark.sql(
   """
   INSERT INTO test_mor
   VALUES 
   (1, "a"),
   (2, "b"),
   (3, "c"),
   (4, "d"),
   (5, "e"),
   (6, "f"),
   (7, "g")
   """).show()
   
   returns:
   +-----------------+
   |num_affected_rows|
   +-----------------+
   |                7|
   +-----------------+
   {code}
   
   ## JIRA info
   
   - Link: https://issues.apache.org/jira/browse/HUDI-5243
   - Type: Improvement


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
