Ryan Johnson created SPARK-43226:
------------------------------------

             Summary: Define extractors for file-constant metadata columns
                 Key: SPARK-43226
                 URL: https://issues.apache.org/jira/browse/SPARK-43226
             Project: Spark
          Issue Type: New Feature
          Components: Spark Core
    Affects Versions: 3.4.0
            Reporter: Ryan Johnson


File-source constant metadata columns are often derived indirectly from 
file-level metadata values rather than exposing those values directly. For 
example, {{_metadata.file_name}} is currently hard-coded in 
{{FileFormat.updateMetadataInternalRow}} as:

{code:java}
UTF8String.fromString(filePath.getName)
{code}

We should add support for metadata extractors: functions that map a 
{{PartitionedFile}} to a {{Literal}}, so that such columns can be expressed 
generically instead of being hard-coded.

We can't simply add the derived values to the metadata map, because they 
would then have to be pre-computed even if the query never selects the 
corresponding field.
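
As a rough illustration of the idea (not the proposed Spark API), the sketch 
below registers one extractor per metadata field and evaluates only the ones 
a query selects. {{PartitionedFile}} and {{Literal}} here are simplified, 
hypothetical stand-ins for the Spark classes, and the registry, the field 
names, and the call counter are assumptions made for the example.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical sketch: none of these types are Spark's real classes.
final class MetadataExtractorSketch {

    /** Simplified stand-in for PartitionedFile: a path and a size in bytes. */
    static final class PartitionedFile {
        final String filePath;
        final long fileSize;

        PartitionedFile(String filePath, long fileSize) {
            this.filePath = filePath;
            this.fileSize = fileSize;
        }
    }

    /** Simplified stand-in for Literal: wraps a constant value. */
    static final class Literal {
        final Object value;

        Literal(Object value) {
            this.value = value;
        }
    }

    /** Counts extractor invocations so lazy evaluation can be observed. */
    static int extractorCalls = 0;

    /** Extractors keyed by metadata field name (names chosen for the example). */
    static final Map<String, Function<PartitionedFile, Literal>> EXTRACTORS =
        new LinkedHashMap<>();

    static {
        EXTRACTORS.put("file_path", f -> {
            extractorCalls++;
            return new Literal(f.filePath);
        });
        EXTRACTORS.put("file_name", f -> {
            extractorCalls++;
            // Derived value: the last path segment, computed only on demand.
            return new Literal(f.filePath.substring(f.filePath.lastIndexOf('/') + 1));
        });
        EXTRACTORS.put("file_size", f -> {
            extractorCalls++;
            return new Literal(f.fileSize);
        });
    }

    /** Runs only the extractors for the fields the query actually selected. */
    static Map<String, Literal> extract(PartitionedFile file, List<String> selectedFields) {
        Map<String, Literal> row = new LinkedHashMap<>();
        for (String field : selectedFields) {
            Function<PartitionedFile, Literal> extractor = EXTRACTORS.get(field);
            if (extractor == null) {
                throw new IllegalArgumentException("Unknown metadata field: " + field);
            }
            row.put(field, extractor.apply(file));
        }
        return row;
    }

    public static void main(String[] args) {
        PartitionedFile file =
            new PartitionedFile("/data/events/part-00000.parquet", 1024L);
        Map<String, Literal> row = extract(file, List.of("file_name"));
        System.out.println(row.get("file_name").value); // part-00000.parquet
        System.out.println(extractorCalls);             // 1
    }
}
```

Selecting only {{file_name}} invokes a single extractor; the other two are 
never run, which is the cost a pre-computed metadata map could not avoid.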



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
