[ 
https://issues.apache.org/jira/browse/SPARK-43226?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wenchen Fan reassigned SPARK-43226:
-----------------------------------

    Assignee: Ryan Johnson

> Define extractors for file-constant metadata columns
> ----------------------------------------------------
>
>                 Key: SPARK-43226
>                 URL: https://issues.apache.org/jira/browse/SPARK-43226
>             Project: Spark
>          Issue Type: New Feature
>          Components: Spark Core
>    Affects Versions: 3.4.0
>            Reporter: Ryan Johnson
>            Assignee: Ryan Johnson
>            Priority: Major
>
> File-source constant metadata columns are often derived indirectly from 
> file-level metadata values rather than exposing those values directly. For 
> example, {{_metadata.file_name}} is currently hard-coded in 
> {{FileFormat.updateMetadataInternalRow}} as:
>  
> {code:java}
> UTF8String.fromString(filePath.getName){code}
>  
> We should add support for metadata extractors, functions that map from 
> {{PartitionedFile}} to {{Literal}}, so that we can express such columns 
> in a generic way instead of hard-coding them.
> We can't simply add these derived values to the metadata map, because they 
> would then have to be pre-computed even when the query does not select that field.
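
To illustrate the extractor idea described above, here is a minimal, self-contained sketch in plain Java. It is not Spark's implementation: {{FileInfo}} is a hypothetical stand-in for {{PartitionedFile}}, plain objects stand in for {{Literal}}, and the names {{EXTRACTORS}} and {{extract}} are invented for this example. The point it shows is that an extractor is only invoked for fields the query actually selects, so nothing has to be pre-computed.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

public class MetadataExtractorSketch {
    // Hypothetical stand-in for PartitionedFile; holds only raw file-level values.
    public static final class FileInfo {
        final String path;
        final long size;

        public FileInfo(String path, long size) {
            this.path = path;
            this.size = size;
        }
    }

    // One extractor per metadata field. Each maps a file to a value and runs
    // only when called, so derived values are never computed up front.
    static final Map<String, Function<FileInfo, Object>> EXTRACTORS = new LinkedHashMap<>();
    static {
        EXTRACTORS.put("file_path", f -> f.path);
        // Derived value: the last path segment, computed on demand.
        EXTRACTORS.put("file_name", f -> f.path.substring(f.path.lastIndexOf('/') + 1));
        EXTRACTORS.put("file_size", f -> f.size);
    }

    // Evaluates only the extractors for the fields the query selected.
    public static Map<String, Object> extract(FileInfo file, List<String> selectedFields) {
        Map<String, Object> row = new LinkedHashMap<>();
        for (String field : selectedFields) {
            Function<FileInfo, Object> extractor = EXTRACTORS.get(field);
            if (extractor == null) {
                throw new IllegalArgumentException("Unknown metadata field: " + field);
            }
            row.put(field, extractor.apply(file));
        }
        return row;
    }

    public static void main(String[] args) {
        FileInfo file = new FileInfo("/data/part-0001.parquet", 1024L);
        // Only file_name is selected, so only that extractor runs.
        System.out.println(extract(file, List.of("file_name")));
    }
}
```

Keeping the extractors in a map keyed by field name means a new derived column is one more map entry rather than another hard-coded branch, which is the generic form the description asks for.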



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
