This is an automated email from the ASF dual-hosted git repository.
lzljs3620320 pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/paimon.git
The following commit(s) were added to refs/heads/master by this push:
new 70359e1649 [doc][spark] Add `__paimon_row_index` metadata column (#5127)
70359e1649 is described below
commit 70359e16494e55605db009e9dfa947081b1a0fa6
Author: Yubin Li <[email protected]>
AuthorDate: Fri Feb 21 11:26:12 2025 +0800
[doc][spark] Add `__paimon_row_index` metadata column (#5127)
---
docs/content/spark/sql-query.md | 5 +++--
.../test/scala/org/apache/paimon/spark/sql/PaimonQueryTest.scala | 7 +++++--
2 files changed, 8 insertions(+), 4 deletions(-)
diff --git a/docs/content/spark/sql-query.md b/docs/content/spark/sql-query.md
index cc420b4534..581e81c361 100644
--- a/docs/content/spark/sql-query.md
+++ b/docs/content/spark/sql-query.md
@@ -42,10 +42,11 @@ Paimon also supports reading some hidden metadata columns, currently supporting
- `__paimon_file_path`: the file path of the record.
- `__paimon_partition`: the partition of the record.
- `__paimon_bucket`: the bucket of the record.
+- `__paimon_row_index`: the row index of the record.
```sql
--- read all columns and the corresponding file path, partition, bucket of the record
-SELECT *, __paimon_file_path, __paimon_partition, __paimon_bucket FROM t;
+-- read all columns and the corresponding file path, partition, bucket, rowIndex of the record
+SELECT *, __paimon_file_path, __paimon_partition, __paimon_bucket, __paimon_row_index FROM t;
```
### Batch Time Travel
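Editor's note, not part of the commit: a rough standalone sketch of how the newly documented column can be read from Spark. The table name `t` and the session setup are assumptions mirroring the doc snippet above.

```scala
// Hypothetical sketch (not part of this commit): reading Paimon's hidden
// metadata columns, including the newly documented __paimon_row_index,
// from a Spark session that already has the Paimon catalog configured.
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .appName("paimon-row-index-example")
  .getOrCreate()

// Per the doc change above, __paimon_row_index exposes the row index of the
// record; combined with __paimon_file_path it can help pinpoint a single row.
val rows = spark.sql(
  """SELECT *, __paimon_file_path, __paimon_partition, __paimon_bucket,
    |       __paimon_row_index
    |FROM t
    |ORDER BY __paimon_file_path, __paimon_row_index
    |""".stripMargin)

rows.show(truncate = false)
```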
diff --git a/paimon-spark/paimon-spark-ut/src/test/scala/org/apache/paimon/spark/sql/PaimonQueryTest.scala b/paimon-spark/paimon-spark-ut/src/test/scala/org/apache/paimon/spark/sql/PaimonQueryTest.scala
index d8d621a0e6..ae90b5b1f3 100644
--- a/paimon-spark/paimon-spark-ut/src/test/scala/org/apache/paimon/spark/sql/PaimonQueryTest.scala
+++ b/paimon-spark/paimon-spark-ut/src/test/scala/org/apache/paimon/spark/sql/PaimonQueryTest.scala
@@ -80,11 +80,14 @@ class PaimonQueryTest extends PaimonSparkTestBase {
|""".stripMargin)
val res = spark.sql("""
- |SELECT __paimon_partition, __paimon_bucket FROM T
+ |SELECT __paimon_partition, __paimon_bucket,
+ |min(__paimon_row_index) as min_paimon_row_index,
+ |max(__paimon_row_index) as max_paimon_row_index
+ |FROM T
|GROUP BY __paimon_partition, __paimon_bucket
|ORDER BY __paimon_partition, __paimon_bucket
|""".stripMargin)
- checkAnswer(res, Row(Row(), 0) :: Row(Row(), 1) :: Row(Row(), 2) :: Nil)
+ checkAnswer(res, Row(Row(), 0, 0, 1) :: Row(Row(), 1, 0, 1) :: Row(Row(), 2, 0, 1) :: Nil)
}
}
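Editor's note on the updated assertion, not part of the commit: each bucket in this test holds two records, so the row indices run from 0 to 1, which is why every group expects (min, max) = (0, 1). Below is a minimal, hypothetical sketch of the same style of check, assuming an existing Spark session and a Paimon table named `T` as in the test.

```scala
// Hypothetical check (not part of this commit): no (partition, bucket) group
// should report a minimum __paimon_row_index other than 0, consistent with
// the expected values asserted in the test above.
val offending = spark.sql(
  """SELECT __paimon_partition, __paimon_bucket
    |FROM T
    |GROUP BY __paimon_partition, __paimon_bucket
    |HAVING min(__paimon_row_index) <> 0
    |""".stripMargin)

assert(offending.isEmpty, "expected row indices to start at 0 in each group")
```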