[GitHub] [iceberg] rdblue commented on a diff in pull request #6058: Core,Spark: Add metadata to Scan Report

GitBox Wed, 02 Nov 2022 09:00:44 -0700


rdblue commented on code in PR #6058:
URL: https://github.com/apache/iceberg/pull/6058#discussion_r1011987876



##########
spark/v3.3/spark/src/main/java/org/apache/iceberg/spark/Spark3Util.java:
##########
@@ -968,4 +970,16 @@ public String unknown(
       return String.format("%s(%s) %s %s", transform, sourceName, direction, nullOrder);
     }
   }
+
+  public static <T extends org.apache.iceberg.Scan<T, ?, ?>> T addSparkMetadataToScan(
+      SparkSession spark, T scan) {
+    String executionId = spark.sparkContext().getLocalProperty(SQLExecution.EXECUTION_ID_KEY());
+    T updatedScan = scan;
+    if (null != executionId) {
+      updatedScan = updatedScan.option(SQLExecution.EXECUTION_ID_KEY(), executionId);
+    }
+    return updatedScan
+        .option(CatalogProperties.APP_ID, spark.sparkContext().applicationId())
+        .option(CatalogProperties.USER, spark.sparkContext().sparkUser());

Review Comment:
   I don't think this is a correct use of options. These aren't scan options; 
the code is only using options to pass other data through, because options are 
sent in the scan report. I think it would be better to pass the app ID 
separately.
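
   One possible reading of "pass the app ID separately" is sketched below as 
standalone Java. It does not use Iceberg's classes: `SketchScan`, 
`withMetadata`, `ScanMetadataSketch.addEngineMetadata`, and the key names 
`app-id`, `user`, and `execution-id` are all hypothetical and exist only for 
illustration. The point is that engine context travels in its own map, so the 
options map holds nothing but real scan options.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical stand-in for a scan; this is NOT the Iceberg Scan API. */
final class SketchScan {
  // Settings that change how the scan is planned or executed.
  private final Map<String, String> options;
  // Descriptive context (who or what issued the scan), carried only for reporting.
  private final Map<String, String> metadata;

  SketchScan() {
    this(new LinkedHashMap<>(), new LinkedHashMap<>());
  }

  private SketchScan(Map<String, String> options, Map<String, String> metadata) {
    this.options = options;
    this.metadata = metadata;
  }

  /** Returns a copy of this scan with one more scan option set. */
  SketchScan option(String key, String value) {
    Map<String, String> updated = new LinkedHashMap<>(options);
    updated.put(key, value);
    return new SketchScan(updated, metadata);
  }

  /** Returns a copy of this scan with one more metadata entry (hypothetical method). */
  SketchScan withMetadata(String key, String value) {
    Map<String, String> updated = new LinkedHashMap<>(metadata);
    updated.put(key, value);
    return new SketchScan(options, updated);
  }

  Map<String, String> options() {
    return Collections.unmodifiableMap(options);
  }

  Map<String, String> metadata() {
    return Collections.unmodifiableMap(metadata);
  }
}

final class ScanMetadataSketch {
  private ScanMetadataSketch() {}

  /**
   * Attaches engine context as metadata and leaves the scan options untouched.
   * The execution ID is optional, mirroring the null check in the diff above.
   */
  static SketchScan addEngineMetadata(
      SketchScan scan, String appId, String user, String executionId) {
    SketchScan updated = scan.withMetadata("app-id", appId).withMetadata("user", user);
    if (executionId != null) {
      updated = updated.withMetadata("execution-id", executionId);
    }
    return updated;
  }
}
```

   With this shape, a reporter could read `metadata()` for the app ID and user 
while `options()` stays limited to settings that affect the scan itself. Whether 
the metadata should live on the scan, on the report, or somewhere else is a 
design question this sketch does not settle.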



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

