Re: [PR] [VL] parquet file metadata columns support in velox [incubator-gluten]

2024-03-14 Thread via GitHub


GlutenPerfBot commented on PR #3870:
URL: 
https://github.com/apache/incubator-gluten/pull/3870#issuecomment-1996831371

   = Performance report for TPCH SF2000 with Velox backend, for reference 
only 
   
   
   
   query
   log/native_3870_time.csv
   log/native_master_03_13_2024_d7ed0844e_time.csv
   difference
   percentage
   
   
   q1
   35.41
   38.81
   3.399
   109.60%
   
   
   q2
   25.71
   24.06
   -1.644
   93.60%
   
   
   q3
   37.01
   38.18
   1.169
   103.16%
   
   
   q4
   40.14
   38.49
   -1.657
   95.87%
   
   
   q5
   68.03
   69.71
   1.671
   102.46%
   
   
   q6
   7.43
   7.45
   0.026
   100.35%
   
   
   q7
   83.15
   82.46
   -0.695
   99.16%
   
   
   q8
   85.00
   83.21
   -1.785
   97.90%
   
   
   q9
   122.38
   121.83
   -0.549
   99.55%
   
   
   q10
   44.07
   44.36
   0.294
   100.67%
   
   
   q11
   19.80
   20.90
   1.101
   105.56%
   
   
   q12
   26.65
   28.06
   1.413
   105.30%
   
   
   q13
   48.44
   46.88
   -1.561
   96.78%
   
   
   q14
   22.24
   21.98
   -0.259
   98.83%
   
   
   q15
   32.04
   33.08
   1.046
   103.26%
   
   
   q16
   14.72
   13.84
   -0.883
   94.00%
   
   
   q17
   100.17
   101.80
   1.629
   101.63%
   
   
   q18
   142.74
   141.05
   -1.683
   98.82%
   
   
   q19
   13.54
   15.07
   1.529
   111.30%
   
   
 

Re: [PR] [VL] parquet file metadata columns support in velox [incubator-gluten]

2024-03-14 Thread via GitHub


zhli1142015 merged PR #3870:
URL: https://github.com/apache/incubator-gluten/pull/3870


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@gluten.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: commits-unsubscr...@gluten.apache.org
For additional commands, e-mail: commits-h...@gluten.apache.org



Re: [PR] [VL] parquet file metadata columns support in velox [incubator-gluten]

2024-03-13 Thread via GitHub


gaoyangxiaozhu commented on PR #3870:
URL: 
https://github.com/apache/incubator-gluten/pull/3870#issuecomment-1996582788

   > @gaoyangxiaozhu seems there is some conflicts and you need a rebase. 
Thanks.
   
   done


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@gluten.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: commits-unsubscr...@gluten.apache.org
For additional commands, e-mail: commits-h...@gluten.apache.org



Re: [PR] [VL] parquet file metadata columns support in velox [incubator-gluten]

2024-03-13 Thread via GitHub


github-actions[bot] commented on PR #3870:
URL: 
https://github.com/apache/incubator-gluten/pull/3870#issuecomment-1996273415

   Run Gluten Clickhouse CI


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@gluten.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: commits-unsubscr...@gluten.apache.org
For additional commands, e-mail: commits-h...@gluten.apache.org



Re: [PR] [VL] parquet file metadata columns support in velox [incubator-gluten]

2024-03-13 Thread via GitHub


yma11 commented on PR #3870:
URL: 
https://github.com/apache/incubator-gluten/pull/3870#issuecomment-1996167070

   @gaoyangxiaozhu seems there is some conflicts and you need a rebase. Thanks.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@gluten.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: commits-unsubscr...@gluten.apache.org
For additional commands, e-mail: commits-h...@gluten.apache.org



Re: [PR] [VL] parquet file metadata columns support in velox [incubator-gluten]

2024-03-13 Thread via GitHub


zhouyuan commented on PR #3870:
URL: 
https://github.com/apache/incubator-gluten/pull/3870#issuecomment-1996119904

   @gaoyangxiaozhu This is in good state to me. Will try with internal delta 
lake jenkins job 
   CC @zzcclp as this will also change the API for CK backend


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@gluten.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: commits-unsubscr...@gluten.apache.org
For additional commands, e-mail: commits-h...@gluten.apache.org



Re: [PR] [VL] parquet file metadata columns support in velox [incubator-gluten]

2024-03-13 Thread via GitHub


gaoyangxiaozhu commented on PR #3870:
URL: 
https://github.com/apache/incubator-gluten/pull/3870#issuecomment-1994440146

   can we merge the PR ? @zhli1142015 / @yma11 / @zhouyuan 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@gluten.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: commits-unsubscr...@gluten.apache.org
For additional commands, e-mail: commits-h...@gluten.apache.org



Re: [PR] [VL] parquet file metadata columns support in velox [incubator-gluten]

2024-03-13 Thread via GitHub


github-actions[bot] commented on PR #3870:
URL: 
https://github.com/apache/incubator-gluten/pull/3870#issuecomment-1993960733

   Run Gluten Clickhouse CI


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@gluten.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: commits-unsubscr...@gluten.apache.org
For additional commands, e-mail: commits-h...@gluten.apache.org



Re: [PR] [VL] parquet file metadata columns support in velox [incubator-gluten]

2024-03-13 Thread via GitHub


gaoyangxiaozhu commented on PR #3870:
URL: 
https://github.com/apache/incubator-gluten/pull/3870#issuecomment-1993655548

   can we re-trigger again for failed job which all due to below
   
   
![image](https://github.com/apache/incubator-gluten/assets/9858245/ccac3728-f0d2-41f1-89be-85b40758518f)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@gluten.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: commits-unsubscr...@gluten.apache.org
For additional commands, e-mail: commits-h...@gluten.apache.org



Re: [PR] [VL] parquet file metadata columns support in velox [incubator-gluten]

2024-03-12 Thread via GitHub


github-actions[bot] commented on PR #3870:
URL: 
https://github.com/apache/incubator-gluten/pull/3870#issuecomment-1993211907

   Run Gluten Clickhouse CI


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@gluten.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: commits-unsubscr...@gluten.apache.org
For additional commands, e-mail: commits-h...@gluten.apache.org



Re: [PR] [VL] parquet file metadata columns support in velox [incubator-gluten]

2024-03-12 Thread via GitHub


gaoyangxiaozhu commented on PR #3870:
URL: 
https://github.com/apache/incubator-gluten/pull/3870#issuecomment-1992105649

   various build issue , can you help re-trigger @yma11 / @zhouyuan / 
@zhli1142015 
   
![image](https://github.com/apache/incubator-gluten/assets/9858245/9138a799-b008-4fa5-9600-43b2a0b5ebc6)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@gluten.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: commits-unsubscr...@gluten.apache.org
For additional commands, e-mail: commits-h...@gluten.apache.org



Re: [PR] [VL] parquet file metadata columns support in velox [incubator-gluten]

2024-03-12 Thread via GitHub


github-actions[bot] commented on PR #3870:
URL: 
https://github.com/apache/incubator-gluten/pull/3870#issuecomment-1992018595

   Run Gluten Clickhouse CI


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@gluten.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: commits-unsubscr...@gluten.apache.org
For additional commands, e-mail: commits-h...@gluten.apache.org



Re: [PR] [VL] parquet file metadata columns support in velox [incubator-gluten]

2024-03-12 Thread via GitHub


yma11 commented on code in PR #3870:
URL: https://github.com/apache/incubator-gluten/pull/3870#discussion_r1521381033


##
cpp/velox/substrait/SubstraitParser.cc:
##
@@ -104,31 +104,41 @@ std::vector 
SubstraitParser::parseNamedStruct(const ::substrait::NamedS
   return typeList;
 }
 
-std::vector SubstraitParser::parsePartitionColumns(const 
::substrait::NamedStruct& namedStruct) {
+void SubstraitParser::parsePartitionAndMetadataColumns(
+const ::substrait::NamedStruct& namedStruct,
+std::vector& isPartitionColumns,
+std::vector& isMetadataColumns) {
   const auto& columnsTypes = namedStruct.column_types();
-  std::vector isPartitionColumns;
   if (columnsTypes.size() == 0) {
-// Regard all columns as non-partitioned columns.
+// Regard all columns as regular columns.

Review Comment:
   I think so but we'd better confirm first.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@gluten.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: commits-unsubscr...@gluten.apache.org
For additional commands, e-mail: commits-h...@gluten.apache.org



Re: [PR] [VL] parquet file metadata columns support in velox [incubator-gluten]

2024-03-12 Thread via GitHub


github-actions[bot] commented on PR #3870:
URL: 
https://github.com/apache/incubator-gluten/pull/3870#issuecomment-1991460438

   Run Gluten Clickhouse CI


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@gluten.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: commits-unsubscr...@gluten.apache.org
For additional commands, e-mail: commits-h...@gluten.apache.org



Re: [PR] [VL] parquet file metadata columns support in velox [incubator-gluten]

2024-03-12 Thread via GitHub


gaoyangxiaozhu commented on code in PR #3870:
URL: https://github.com/apache/incubator-gluten/pull/3870#discussion_r1521217320


##
cpp/velox/substrait/SubstraitParser.cc:
##
@@ -104,31 +104,41 @@ std::vector 
SubstraitParser::parseNamedStruct(const ::substrait::NamedS
   return typeList;
 }
 
-std::vector SubstraitParser::parsePartitionColumns(const 
::substrait::NamedStruct& namedStruct) {
+void SubstraitParser::parsePartitionAndMetadataColumns(
+const ::substrait::NamedStruct& namedStruct,
+std::vector& isPartitionColumns,
+std::vector& isMetadataColumns) {
   const auto& columnsTypes = namedStruct.column_types();
-  std::vector isPartitionColumns;
   if (columnsTypes.size() == 0) {
-// Regard all columns as non-partitioned columns.
+// Regard all columns as regular columns.

Review Comment:
   so if is true for write then we only process if `columnsTypes.size() > 0`  
otherwise return directly?



##
cpp/velox/substrait/SubstraitParser.cc:
##
@@ -104,31 +104,41 @@ std::vector 
SubstraitParser::parseNamedStruct(const ::substrait::NamedS
   return typeList;
 }
 
-std::vector SubstraitParser::parsePartitionColumns(const 
::substrait::NamedStruct& namedStruct) {
+void SubstraitParser::parsePartitionAndMetadataColumns(
+const ::substrait::NamedStruct& namedStruct,
+std::vector& isPartitionColumns,
+std::vector& isMetadataColumns) {
   const auto& columnsTypes = namedStruct.column_types();
-  std::vector isPartitionColumns;
   if (columnsTypes.size() == 0) {
-// Regard all columns as non-partitioned columns.
+// Regard all columns as regular columns.

Review Comment:
   so if is true for write then we only process if `columnsTypes.size() > 0`  
otherwise do nothing?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@gluten.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: commits-unsubscr...@gluten.apache.org
For additional commands, e-mail: commits-h...@gluten.apache.org



Re: [PR] [VL] parquet file metadata columns support in velox [incubator-gluten]

2024-03-12 Thread via GitHub


gaoyangxiaozhu commented on code in PR #3870:
URL: https://github.com/apache/incubator-gluten/pull/3870#discussion_r1521217320


##
cpp/velox/substrait/SubstraitParser.cc:
##
@@ -104,31 +104,41 @@ std::vector 
SubstraitParser::parseNamedStruct(const ::substrait::NamedS
   return typeList;
 }
 
-std::vector SubstraitParser::parsePartitionColumns(const 
::substrait::NamedStruct& namedStruct) {
+void SubstraitParser::parsePartitionAndMetadataColumns(
+const ::substrait::NamedStruct& namedStruct,
+std::vector& isPartitionColumns,
+std::vector& isMetadataColumns) {
   const auto& columnsTypes = namedStruct.column_types();
-  std::vector isPartitionColumns;
   if (columnsTypes.size() == 0) {
-// Regard all columns as non-partitioned columns.
+// Regard all columns as regular columns.

Review Comment:
   so if is true for write then we only process under `columnsTypes.size() > 0` 
?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@gluten.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: commits-unsubscr...@gluten.apache.org
For additional commands, e-mail: commits-h...@gluten.apache.org



Re: [PR] [VL] parquet file metadata columns support in velox [incubator-gluten]

2024-03-12 Thread via GitHub


github-actions[bot] commented on PR #3870:
URL: 
https://github.com/apache/incubator-gluten/pull/3870#issuecomment-1991300996

   Run Gluten Clickhouse CI


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@gluten.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: commits-unsubscr...@gluten.apache.org
For additional commands, e-mail: commits-h...@gluten.apache.org



Re: [PR] [VL] parquet file metadata columns support in velox [incubator-gluten]

2024-03-12 Thread via GitHub


gaoyangxiaozhu commented on code in PR #3870:
URL: https://github.com/apache/incubator-gluten/pull/3870#discussion_r1521209340


##
gluten-core/src/main/scala/io/glutenproject/execution/BasicScanExecTransformer.scala:
##
@@ -37,12 +37,15 @@ import com.google.protobuf.StringValue
 import scala.collection.JavaConverters._
 
 trait BasicScanExecTransformer extends LeafTransformSupport with 
BaseDataSource {
+  import org.apache.spark.sql.catalyst.util._
 
   /** Returns the filters that can be pushed down to native file scan */
   def filterExprs(): Seq[Expression]
 
   def outputAttributes(): Seq[Attribute]
 
+  def getMetadataColumns(): Seq[AttributeReference]

Review Comment:
   this is for compatible spark33 purpose which already defineds 
`metadataColumns` fields 
https://github.com/apache/spark/blob/e98872fb5d07d570e6d0516b49a5d2e58876d1a6/sql/core/src/main/scala/org/apache/spark/sql/execution/DataSourceScanExec.scala#L208C2-L210C1



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@gluten.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: commits-unsubscr...@gluten.apache.org
For additional commands, e-mail: commits-h...@gluten.apache.org



Re: [PR] [VL] parquet file metadata columns support in velox [incubator-gluten]

2024-03-12 Thread via GitHub


yma11 commented on code in PR #3870:
URL: https://github.com/apache/incubator-gluten/pull/3870#discussion_r1521186042


##
gluten-ut/spark33/src/test/scala/org/apache/spark/sql/execution/datasources/GlutenFileMetadataStructSuite.scala:
##
@@ -16,6 +16,156 @@
  */
 package org.apache.spark.sql.execution.datasources
 
+import io.glutenproject.execution.FilterExecTransformerBase
+import io.glutenproject.utils.BackendTestUtils
+
+import org.apache.spark.sql.{Column, DataFrame, Row}
 import org.apache.spark.sql.GlutenSQLTestsBaseTrait
+import org.apache.spark.sql.GlutenTestConstants.GLUTEN_TEST
+import org.apache.spark.sql.execution.FileSourceScanExec
+import org.apache.spark.sql.types.{IntegerType, LongType, StringType, 
StructField, StructType}
+
+import java.io.File
+import java.sql.Timestamp
+
+class GlutenFileMetadataStructSuite extends FileMetadataStructSuite with 
GlutenSQLTestsBaseTrait {
+
+  val schemaWithFilePathField: StructType = new StructType()
+.add(StructField("file_path", StringType))
+.add(StructField("age", IntegerType))
+.add(
+  StructField(
+"info",
+new StructType()
+  .add(StructField("id", LongType))
+  .add(StructField("university", StringType
+
+  private val METADATA_FILE_PATH = "_metadata.file_path"
+  private val METADATA_FILE_NAME = "_metadata.file_name"
+  private val METADATA_FILE_SIZE = "_metadata.file_size"
+  private val METADATA_FILE_MODIFICATION_TIME = 
"_metadata.file_modification_time"
+
+  private def getMetadataForFile(f: File): Map[String, Any] = {
+Map(
+  METADATA_FILE_PATH -> f.toURI.toString,
+  METADATA_FILE_NAME -> f.getName,
+  METADATA_FILE_SIZE -> f.length(),
+  METADATA_FILE_MODIFICATION_TIME -> new Timestamp(f.lastModified())
+)
+  }
+
+  private def metadataColumnsNativeTest(testName: String, fileSchema: 
StructType)(
+  f: (DataFrame, Map[String, Any], Map[String, Any]) => Unit): Unit = {
+Seq("parquet").foreach {
+  testFileFormat =>
+test(s"$GLUTEN_TEST metadata struct ($testFileFormat): " + testName) {
+  withTempDir {
+dir =>
+  import scala.collection.JavaConverters._
+
+  // 1. create df0 and df1 and save under /data/f0 and /data/f1
+  val df0 = spark.createDataFrame(data0.asJava, fileSchema)
+  val f0 = new File(dir, "data/f0").getCanonicalPath
+  df0.coalesce(1).write.format(testFileFormat).save(f0)
+
+  val df1 = spark.createDataFrame(data1.asJava, fileSchema)
+  val f1 = new File(dir, "data/f1 gluten").getCanonicalPath
+  df1.coalesce(1).write.format(testFileFormat).save(f1)
+
+  // 2. read both f0 and f1
+  val df = spark.read
+.format(testFileFormat)
+.schema(fileSchema)
+.load(new File(dir, "data").getCanonicalPath + "/*")
+  val realF0 = new File(dir, "data/f0")
+.listFiles()
+.filter(_.getName.endsWith(s".$testFileFormat"))
+.head
+  val realF1 = new File(dir, "data/f1 gluten")
+.listFiles()
+.filter(_.getName.endsWith(s".$testFileFormat"))
+.head
+  f(df, getMetadataForFile(realF0), getMetadataForFile(realF1))
+  }
+}
+}
+  }
+
+  metadataColumnsNativeTest(
+"plan check with metadata and user data select",
+schemaWithFilePathField) {
+(df, f0, f1) =>
+  var dfWithMetadata = df.select(
+METADATA_FILE_NAME,
+METADATA_FILE_PATH,
+METADATA_FILE_SIZE,
+METADATA_FILE_MODIFICATION_TIME,
+"age")
+  dfWithMetadata.collect
+  var fileScan = dfWithMetadata.queryExecution.executedPlan.collect {
+case f: FileSourceScanExec => f
+  }
+  assert(fileScan.size == 1)
+  if (BackendTestUtils.isVeloxBackendLoaded()) {
+assert(fileScan(0).nodeNamePrefix == "NativeFile")
+  } else {
+assert(fileScan(0).nodeNamePrefix == "File")
+  }
+
+  // would fallback
+  dfWithMetadata = df.select(METADATA_FILE_PATH, "file_path")
+  checkAnswer(
+dfWithMetadata,
+Seq(
+  Row(f0(METADATA_FILE_PATH), "jack"),
+  Row(f1(METADATA_FILE_PATH), "lily")
+)
+  )
+  fileScan = dfWithMetadata.queryExecution.executedPlan.collect {
+case f: FileSourceScanExec => f
+  }
+  assert(fileScan(0).nodeNamePrefix == "File")
+  }
+
+  metadataColumnsNativeTest("plan check with metadata filter", 
schemaWithFilePathField) {
+(df, f0, f1) =>
+  var filterDF = df
+.select("file_path", "age", METADATA_FILE_NAME)
+.where(Column(METADATA_FILE_NAME) === f0((METADATA_FILE_NAME)))
+  val ret = filterDF.collect
+  assert(ret.size == 1)
+  var fileScan = filterDF.queryExecution.executedPlan.collect {
+case f: FileSourceScanExec => f
+

Re: [PR] [VL] parquet file metadata columns support in velox [incubator-gluten]

2024-03-12 Thread via GitHub


yma11 commented on code in PR #3870:
URL: https://github.com/apache/incubator-gluten/pull/3870#discussion_r1521185608


##
cpp/velox/substrait/SubstraitParser.cc:
##
@@ -104,31 +104,41 @@ std::vector 
SubstraitParser::parseNamedStruct(const ::substrait::NamedS
   return typeList;
 }
 
-std::vector SubstraitParser::parsePartitionColumns(const 
::substrait::NamedStruct& namedStruct) {
+void SubstraitParser::parsePartitionAndMetadataColumns(
+const ::substrait::NamedStruct& namedStruct,
+std::vector& isPartitionColumns,
+std::vector& isMetadataColumns) {
   const auto& columnsTypes = namedStruct.column_types();
-  std::vector isPartitionColumns;
   if (columnsTypes.size() == 0) {
-// Regard all columns as non-partitioned columns.
+// Regard all columns as regular columns.

Review Comment:
   @JkSelf will it happen for parquet write?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@gluten.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: commits-unsubscr...@gluten.apache.org
For additional commands, e-mail: commits-h...@gluten.apache.org



Re: [PR] [VL] parquet file metadata columns support in velox [incubator-gluten]

2024-03-12 Thread via GitHub


yma11 commented on code in PR #3870:
URL: https://github.com/apache/incubator-gluten/pull/3870#discussion_r1521062380


##
cpp/velox/compute/WholeStageResultIterator.cc:
##
@@ -145,15 +145,31 @@ WholeStageResultIterator::WholeStageResultIterator(
 const auto& lengths = scanInfo->lengths;
 const auto& format = scanInfo->format;
 const auto& partitionColumns = scanInfo->partitionColumns;
+const auto& metadataColumns = scanInfo->metadataColumns;
 
 std::vector> 
connectorSplits;
 connectorSplits.reserve(paths.size());
 for (int idx = 0; idx < paths.size(); idx++) {
   auto partitionColumn = partitionColumns[idx];
+  auto metadataColumn = metadataColumns[idx];
   std::unordered_map> 
partitionKeys;
   constructPartitionColumns(partitionKeys, partitionColumn);
+  const std::unordered_map& customSplitInfo = {};
+  const std::shared_ptr& extraFileInfo = {};
+  const std::unordered_map& serdeParameters = {};

Review Comment:
   Do you plan to fill these variables in late PR? if not, I think we can avoid 
creating them but just use `{}` in the constructor.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@gluten.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: commits-unsubscr...@gluten.apache.org
For additional commands, e-mail: commits-h...@gluten.apache.org



Re: [PR] [VL] parquet file metadata columns support in velox [incubator-gluten]

2024-03-12 Thread via GitHub


gaoyangxiaozhu commented on code in PR #3870:
URL: https://github.com/apache/incubator-gluten/pull/3870#discussion_r1521142100


##
cpp/velox/substrait/SubstraitParser.cc:
##
@@ -104,31 +104,41 @@ std::vector 
SubstraitParser::parseNamedStruct(const ::substrait::NamedS
   return typeList;
 }
 
-std::vector SubstraitParser::parsePartitionColumns(const 
::substrait::NamedStruct& namedStruct) {
+void SubstraitParser::parsePartitionAndMetadataColumns(
+const ::substrait::NamedStruct& namedStruct,
+std::vector& isPartitionColumns,
+std::vector& isMetadataColumns) {
   const auto& columnsTypes = namedStruct.column_types();
-  std::vector isPartitionColumns;
   if (columnsTypes.size() == 0) {
-// Regard all columns as non-partitioned columns.
+// Regard all columns as regular columns.

Review Comment:
   for read node it should be true, but not sure for write node if 
`namedStruct.names()` also is 0 size. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@gluten.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: commits-unsubscr...@gluten.apache.org
For additional commands, e-mail: commits-h...@gluten.apache.org



Re: [PR] [VL] parquet file metadata columns support in velox [incubator-gluten]

2024-03-12 Thread via GitHub


yma11 commented on code in PR #3870:
URL: https://github.com/apache/incubator-gluten/pull/3870#discussion_r1521135199


##
shims/spark34/src/main/scala/io/glutenproject/sql/shims/spark34/Spark34Shims.scala:
##
@@ -153,6 +157,34 @@ class Spark34Shims extends SparkShims {
 }
   }
 
+  override def generateMetadataColumns(
+  file: PartitionedFile,
+  metadataColumnNames: Seq[String]): JMap[String, String] = {
+val metadataColumn = new JHashMap[String, String]()
+val path = new Path(file.filePath.toString)
+for (columnName <- metadataColumnNames) {
+  columnName match {
+case FileFormat.FILE_PATH => metadataColumn.put(FileFormat.FILE_PATH, 
path.toString)
+case FileFormat.FILE_NAME => metadataColumn.put(FileFormat.FILE_NAME, 
path.getName)
+case FileFormat.FILE_SIZE =>
+  metadataColumn.put(FileFormat.FILE_SIZE, file.fileSize.toString)
+case FileFormat.FILE_MODIFICATION_TIME =>
+  val fileModifyTime = TimestampFormatter
+.getFractionFormatter(ZoneOffset.UTC)
+.format(file.modificationTime * 1000L)
+  metadataColumn.put(FileFormat.FILE_MODIFICATION_TIME, fileModifyTime)
+case FileFormat.FILE_BLOCK_START =>
+  metadataColumn.put(FileFormat.FILE_BLOCK_START, file.start.toString)
+case FileFormat.FILE_BLOCK_LENGTH =>
+  metadataColumn.put(FileFormat.FILE_BLOCK_LENGTH, 
file.length.toString)
+case _ =>
+  }
+}
+
+// TODO row_index metadata support

Review Comment:
   `TODO: `



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@gluten.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: commits-unsubscr...@gluten.apache.org
For additional commands, e-mail: commits-h...@gluten.apache.org



Re: [PR] [VL] parquet file metadata columns support in velox [incubator-gluten]

2024-03-12 Thread via GitHub


zhouyuan commented on PR #3870:
URL: 
https://github.com/apache/incubator-gluten/pull/3870#issuecomment-1991116962

   Quick note: This patch adds the native impl for parquet metadata read 
support which is a requirement from delta lake connector. The general idea is 
to add a metadata list param for filescan/batchscan which will be filled in 
Velox native scan.
   
   Thanks,
   -yuan


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@gluten.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: commits-unsubscr...@gluten.apache.org
For additional commands, e-mail: commits-h...@gluten.apache.org



Re: [PR] [VL] parquet file metadata columns support in velox [incubator-gluten]

2024-03-12 Thread via GitHub


yma11 commented on code in PR #3870:
URL: https://github.com/apache/incubator-gluten/pull/3870#discussion_r1521102914


##
gluten-core/src/main/scala/io/glutenproject/execution/BasicScanExecTransformer.scala:
##
@@ -37,12 +37,15 @@ import com.google.protobuf.StringValue
 import scala.collection.JavaConverters._
 
 trait BasicScanExecTransformer extends LeafTransformSupport with 
BaseDataSource {
+  import org.apache.spark.sql.catalyst.util._
 
   /** Returns the filters that can be pushed down to native file scan */
   def filterExprs(): Seq[Expression]
 
   def outputAttributes(): Seq[Attribute]
 
+  def getMetadataColumns(): Seq[AttributeReference]

Review Comment:
   rename to `metadataColums()`?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@gluten.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: commits-unsubscr...@gluten.apache.org
For additional commands, e-mail: commits-h...@gluten.apache.org



Re: [PR] [VL] parquet file metadata columns support in velox [incubator-gluten]

2024-03-12 Thread via GitHub


zhouyuan commented on code in PR #3870:
URL: https://github.com/apache/incubator-gluten/pull/3870#discussion_r1521101767


##
gluten-ut/spark33/src/test/scala/org/apache/spark/sql/execution/datasources/GlutenFileMetadataStructSuite.scala:
##
@@ -16,6 +16,156 @@
  */
 package org.apache.spark.sql.execution.datasources
 
+import io.glutenproject.execution.FilterExecTransformerBase
+import io.glutenproject.utils.BackendTestUtils
+
+import org.apache.spark.sql.{Column, DataFrame, Row}
 import org.apache.spark.sql.GlutenSQLTestsBaseTrait
+import org.apache.spark.sql.GlutenTestConstants.GLUTEN_TEST
+import org.apache.spark.sql.execution.FileSourceScanExec
+import org.apache.spark.sql.types.{IntegerType, LongType, StringType, 
StructField, StructType}
+
+import java.io.File
+import java.sql.Timestamp
+
+class GlutenFileMetadataStructSuite extends FileMetadataStructSuite with 
GlutenSQLTestsBaseTrait {
+
+  val schemaWithFilePathField: StructType = new StructType()
+.add(StructField("file_path", StringType))
+.add(StructField("age", IntegerType))
+.add(
+  StructField(
+"info",
+new StructType()
+  .add(StructField("id", LongType))
+  .add(StructField("university", StringType
+
+  private val METADATA_FILE_PATH = "_metadata.file_path"
+  private val METADATA_FILE_NAME = "_metadata.file_name"
+  private val METADATA_FILE_SIZE = "_metadata.file_size"
+  private val METADATA_FILE_MODIFICATION_TIME = 
"_metadata.file_modification_time"
+
+  private def getMetadataForFile(f: File): Map[String, Any] = {
+Map(
+  METADATA_FILE_PATH -> f.toURI.toString,
+  METADATA_FILE_NAME -> f.getName,
+  METADATA_FILE_SIZE -> f.length(),
+  METADATA_FILE_MODIFICATION_TIME -> new Timestamp(f.lastModified())
+)
+  }
+
+  private def metadataColumnsNativeTest(testName: String, fileSchema: 
StructType)(
+  f: (DataFrame, Map[String, Any], Map[String, Any]) => Unit): Unit = {
+Seq("parquet").foreach {
+  testFileFormat =>
+test(s"$GLUTEN_TEST metadata struct ($testFileFormat): " + testName) {
+  withTempDir {
+dir =>
+  import scala.collection.JavaConverters._
+
+  // 1. create df0 and df1 and save under /data/f0 and /data/f1
+  val df0 = spark.createDataFrame(data0.asJava, fileSchema)
+  val f0 = new File(dir, "data/f0").getCanonicalPath
+  df0.coalesce(1).write.format(testFileFormat).save(f0)
+
+  val df1 = spark.createDataFrame(data1.asJava, fileSchema)
+  val f1 = new File(dir, "data/f1 gluten").getCanonicalPath
+  df1.coalesce(1).write.format(testFileFormat).save(f1)
+
+  // 2. read both f0 and f1
+  val df = spark.read
+.format(testFileFormat)
+.schema(fileSchema)
+.load(new File(dir, "data").getCanonicalPath + "/*")
+  val realF0 = new File(dir, "data/f0")
+.listFiles()
+.filter(_.getName.endsWith(s".$testFileFormat"))
+.head
+  val realF1 = new File(dir, "data/f1 gluten")
+.listFiles()
+.filter(_.getName.endsWith(s".$testFileFormat"))
+.head
+  f(df, getMetadataForFile(realF0), getMetadataForFile(realF1))
+  }
+}
+}
+  }
+
+  metadataColumnsNativeTest(
+"plan check with metadata and user data select",
+schemaWithFilePathField) {
+(df, f0, f1) =>
+  var dfWithMetadata = df.select(
+METADATA_FILE_NAME,
+METADATA_FILE_PATH,
+METADATA_FILE_SIZE,
+METADATA_FILE_MODIFICATION_TIME,
+"age")
+  dfWithMetadata.collect
+  var fileScan = dfWithMetadata.queryExecution.executedPlan.collect {
+case f: FileSourceScanExec => f
+  }
+  assert(fileScan.size == 1)
+  if (BackendTestUtils.isVeloxBackendLoaded()) {
+assert(fileScan(0).nodeNamePrefix == "NativeFile")
+  } else {
+assert(fileScan(0).nodeNamePrefix == "File")
+  }
+
+  // would fallback
+  dfWithMetadata = df.select(METADATA_FILE_PATH, "file_path")
+  checkAnswer(
+dfWithMetadata,
+Seq(
+  Row(f0(METADATA_FILE_PATH), "jack"),
+  Row(f1(METADATA_FILE_PATH), "lily")
+)
+  )
+  fileScan = dfWithMetadata.queryExecution.executedPlan.collect {
+case f: FileSourceScanExec => f
+  }
+  assert(fileScan(0).nodeNamePrefix == "File")
+  }
+
+  metadataColumnsNativeTest("plan check with metadata filter", 
schemaWithFilePathField) {
+(df, f0, f1) =>
+  var filterDF = df
+.select("file_path", "age", METADATA_FILE_NAME)
+.where(Column(METADATA_FILE_NAME) === f0((METADATA_FILE_NAME)))
+  val ret = filterDF.collect
+  assert(ret.size == 1)
+  var fileScan = filterDF.queryExecution.executedPlan.collect {
+case f: FileSourceScanExec => f
+ 

Re: [PR] [VL] parquet file metadata columns support in velox [incubator-gluten]

2024-03-12 Thread via GitHub


yma11 commented on code in PR #3870:
URL: https://github.com/apache/incubator-gluten/pull/3870#discussion_r1521090112


##
cpp/velox/substrait/SubstraitParser.cc:
##
@@ -104,31 +104,41 @@ std::vector 
SubstraitParser::parseNamedStruct(const ::substrait::NamedS
   return typeList;
 }
 
-std::vector SubstraitParser::parsePartitionColumns(const 
::substrait::NamedStruct& namedStruct) {
+void SubstraitParser::parsePartitionAndMetadataColumns(
+const ::substrait::NamedStruct& namedStruct,
+std::vector& isPartitionColumns,
+std::vector& isMetadataColumns) {
   const auto& columnsTypes = namedStruct.column_types();
-  std::vector isPartitionColumns;
   if (columnsTypes.size() == 0) {
-// Regard all columns as non-partitioned columns.
+// Regard all columns as regular columns.

Review Comment:
   If `columnsTypes` has 0 as size, does it mean that there is no columns in 
this table? if so, maybe we should do a check `columnsTypes.size() > 0` instead?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@gluten.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: commits-unsubscr...@gluten.apache.org
For additional commands, e-mail: commits-h...@gluten.apache.org



Re: [PR] [VL] parquet file metadata columns support in velox [incubator-gluten]

2024-03-12 Thread via GitHub


yma11 commented on code in PR #3870:
URL: https://github.com/apache/incubator-gluten/pull/3870#discussion_r1521062380


##
cpp/velox/compute/WholeStageResultIterator.cc:
##
@@ -145,15 +145,31 @@ WholeStageResultIterator::WholeStageResultIterator(
 const auto& lengths = scanInfo->lengths;
 const auto& format = scanInfo->format;
 const auto& partitionColumns = scanInfo->partitionColumns;
+const auto& metadataColumns = scanInfo->metadataColumns;
 
 std::vector> 
connectorSplits;
 connectorSplits.reserve(paths.size());
 for (int idx = 0; idx < paths.size(); idx++) {
   auto partitionColumn = partitionColumns[idx];
+  auto metadataColumn = metadataColumns[idx];
   std::unordered_map> 
partitionKeys;
   constructPartitionColumns(partitionKeys, partitionColumn);
+  const std::unordered_map& customSplitInfo = {};
+  const std::shared_ptr& extraFileInfo = {};
+  const std::unordered_map& serdeParameters = {};

Review Comment:
   Do you plan to fill these variables in late PR? if not, I think it's avoid 
to create them but just use `{}` in the constructor.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@gluten.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: commits-unsubscr...@gluten.apache.org
For additional commands, e-mail: commits-h...@gluten.apache.org



Re: [PR] [VL] parquet file metadata columns support in velox [incubator-gluten]

2024-03-12 Thread via GitHub


zhouyuan commented on PR #3870:
URL: 
https://github.com/apache/incubator-gluten/pull/3870#issuecomment-1991044843

   CC: @yma11  


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@gluten.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: commits-unsubscr...@gluten.apache.org
For additional commands, e-mail: commits-h...@gluten.apache.org



Re: [PR] [VL] parquet file metadata columns support in velox [incubator-gluten]

2024-03-12 Thread via GitHub


gaoyangxiaozhu commented on PR #3870:
URL: 
https://github.com/apache/incubator-gluten/pull/3870#issuecomment-1991024316

   @zhli1142015 and @zhouyuan could you help review ?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@gluten.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: commits-unsubscr...@gluten.apache.org
For additional commands, e-mail: commits-h...@gluten.apache.org



Re: [PR] [VL] parquet file metadata columns support in velox [incubator-gluten]

2024-03-11 Thread via GitHub


github-actions[bot] commented on PR #3870:
URL: 
https://github.com/apache/incubator-gluten/pull/3870#issuecomment-1990046048

   Run Gluten Clickhouse CI


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@gluten.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: commits-unsubscr...@gluten.apache.org
For additional commands, e-mail: commits-h...@gluten.apache.org



Re: [PR] [VL] parquet file metadata columns support in velox [incubator-gluten]

2024-03-11 Thread via GitHub


github-actions[bot] commented on PR #3870:
URL: 
https://github.com/apache/incubator-gluten/pull/3870#issuecomment-1990049174

   Run Gluten Clickhouse CI


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@gluten.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: commits-unsubscr...@gluten.apache.org
For additional commands, e-mail: commits-h...@gluten.apache.org



Re: [PR] [VL] parquet file metadata columns support in velox [incubator-gluten]

2024-03-11 Thread via GitHub


github-actions[bot] commented on PR #3870:
URL: 
https://github.com/apache/incubator-gluten/pull/3870#issuecomment-1988949586

   Run Gluten Clickhouse CI


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@gluten.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: commits-unsubscr...@gluten.apache.org
For additional commands, e-mail: commits-h...@gluten.apache.org



Re: [PR] [VL] parquet file metadata columns support in velox [incubator-gluten]

2024-03-11 Thread via GitHub


github-actions[bot] commented on PR #3870:
URL: 
https://github.com/apache/incubator-gluten/pull/3870#issuecomment-1988851240

   Run Gluten Clickhouse CI


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@gluten.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: commits-unsubscr...@gluten.apache.org
For additional commands, e-mail: commits-h...@gluten.apache.org



Re: [PR] [VL] parquet file metadata columns support in velox [incubator-gluten]

2024-03-11 Thread via GitHub


github-actions[bot] commented on PR #3870:
URL: 
https://github.com/apache/incubator-gluten/pull/3870#issuecomment-1988664197

   Run Gluten Clickhouse CI


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@gluten.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: commits-unsubscr...@gluten.apache.org
For additional commands, e-mail: commits-h...@gluten.apache.org



Re: [PR] [VL] parquet file metadata columns support in velox [incubator-gluten]

2024-03-11 Thread via GitHub


github-actions[bot] commented on PR #3870:
URL: 
https://github.com/apache/incubator-gluten/pull/3870#issuecomment-1988662251

   Run Gluten Clickhouse CI


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@gluten.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: commits-unsubscr...@gluten.apache.org
For additional commands, e-mail: commits-h...@gluten.apache.org



Re: [PR] [VL] parquet file metadata columns support in velox [incubator-gluten]

2024-03-11 Thread via GitHub


github-actions[bot] commented on PR #3870:
URL: 
https://github.com/apache/incubator-gluten/pull/3870#issuecomment-1988477007

   Run Gluten Clickhouse CI


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@gluten.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: commits-unsubscr...@gluten.apache.org
For additional commands, e-mail: commits-h...@gluten.apache.org



Re: [PR] [VL] parquet file metadata columns support in velox [incubator-gluten]

2024-03-11 Thread via GitHub


zhli1142015 commented on PR #3870:
URL: 
https://github.com/apache/incubator-gluten/pull/3870#issuecomment-1987856904

   THe CI faulire looks related to your change @gaoyangxiaozhu, please check, 
thanks.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@gluten.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: commits-unsubscr...@gluten.apache.org
For additional commands, e-mail: commits-h...@gluten.apache.org



Re: [PR] [VL] parquet file metadata columns support in velox [incubator-gluten]

2024-03-07 Thread via GitHub


github-actions[bot] commented on PR #3870:
URL: 
https://github.com/apache/incubator-gluten/pull/3870#issuecomment-1984028416

   Run Gluten Clickhouse CI


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@gluten.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: commits-unsubscr...@gluten.apache.org
For additional commands, e-mail: commits-h...@gluten.apache.org



Re: [PR] [VL] parquet file metadata columns support in velox [incubator-gluten]

2024-03-07 Thread via GitHub


github-actions[bot] commented on PR #3870:
URL: 
https://github.com/apache/incubator-gluten/pull/3870#issuecomment-1983982623

   Run Gluten Clickhouse CI


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@gluten.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: commits-unsubscr...@gluten.apache.org
For additional commands, e-mail: commits-h...@gluten.apache.org