hudi-bot opened a new issue, #15583:
URL: https://github.com/apache/hudi/issues/15583

   Currently, when running Spark SQL DML, users who want to check how many 
rows were affected have to look up the commit stats using the Hudi CLI or a 
stored procedure.
   
   We can improve the user experience by returning num_affected_rows after an 
INSERT INTO command, so that Spark SQL users can easily see how many rows were 
inserted without having to inspect the commits themselves.
   
   num_affected_rows can be extracted in the writer itself from commitMetadata.
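
As a rough illustration of that extraction, the sketch below sums insert counts over per-partition write stats. It assumes the commit metadata can be viewed as a map from partition path to a list of write stats; `WriteStat`, `AffectedRows`, and `numInserted` are hypothetical stand-ins written for this example, not Hudi classes or the proposed implementation.

```java
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a per-file write stat; not the real Hudi class.
final class WriteStat {
    final long numInserts;
    final long numUpdates;
    final long numDeletes;

    WriteStat(long numInserts, long numUpdates, long numDeletes) {
        this.numInserts = numInserts;
        this.numUpdates = numUpdates;
        this.numDeletes = numDeletes;
    }
}

// Hypothetical helper: derives num_affected_rows for an INSERT INTO.
final class AffectedRows {
    private AffectedRows() {}

    // Sums the insert counts of every write stat in every partition.
    static long numInserted(Map<String, List<WriteStat>> partitionToWriteStats) {
        long total = 0;
        for (List<WriteStat> stats : partitionToWriteStats.values()) {
            for (WriteStat stat : stats) {
                total += stat.numInserts;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        // Assumed layout: one non-partitioned table whose 7 rows landed in two files.
        Map<String, List<WriteStat>> stats = Map.of(
            "", List.of(new WriteStat(4, 0, 0), new WriteStat(3, 0, 0)));
        System.out.println(numInserted(stats)); // prints 7
    }
}
```

For UPDATE or DELETE statements the same loop could sum the update or delete counters instead; which counters the real writer should report is left open here.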
   
   Example:
   {code:java}
   spark.sql("""
   create table test_mor (id int, name string) 
   using hudi 
   tblproperties (primaryKey = 'id', type='mor');
   """)
   
   spark.sql(
   """
   INSERT INTO test_mor
   VALUES 
   (1, "a"),
   (2, "b"),
   (3, "c"),
   (4, "d"),
   (5, "e"),
   (6, "f"),
   (7, "g")
   """).show()
   
   // returns:
   // +-----------------+
   // |num_affected_rows|
   // +-----------------+
   // |                7|
   // +-----------------+
   {code}
   
   ## JIRA info
   
   - Link: https://issues.apache.org/jira/browse/HUDI-5243
   - Type: Improvement

