SteveLauC commented on issue #6051:
URL: 
https://github.com/apache/arrow-datafusion/issues/6051#issuecomment-1791864158

   I am interested in implementing this, as requested in 
[discussion#7979](https://github.com/apache/arrow-datafusion/discussions/7979). 
I just checked the previous PRs, and here are my thoughts:
   
   1. What are the correct semantics of this `input_file_name()` function?
      
       1. Return all the files registered for a table
       2. Return the file that a specific row comes from (at runtime/execution time)
   
       > [arrow#9944](https://github.com/apache/arrow/pull/9944) seems to 
choose option 1, if I understand correctly. For my use case, I would like to have 
option 2.
   
   2. For option 1, this info is stored in each file type's `ExecutionPlan` 
node, so we can fetch it after generating the physical plan.
   
   3. For option 2, this info becomes available when each file type's 
`XXXOpener` actually opens a file (whether on the local file system or in an 
object store).
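
To make the difference between the two options concrete, here is a minimal sketch in plain Rust. It does not use any DataFusion API: `FileBackedTable`, `registered_files`, `rows_with_source`, and `sample_table` are hypothetical names made up for illustration, and the file names are sample data. Option 1 only needs the list of files (known once the plan exists), while option 2 needs the file name to travel with each row as the file is read.

```rust
// Hypothetical stand-in for a table backed by several files.
// This is NOT a DataFusion type; it only illustrates the two semantics.
struct FileBackedTable {
    // Each entry is (file name, rows read from that file).
    files: Vec<(String, Vec<i64>)>,
}

impl FileBackedTable {
    // Option 1: all files registered for the table.
    // No row data is needed, so this can be answered from the plan alone.
    fn registered_files(&self) -> Vec<&str> {
        self.files.iter().map(|(name, _)| name.as_str()).collect()
    }

    // Option 2: every row tagged with the file it came from.
    // The name is only known while a particular file is being scanned.
    fn rows_with_source(&self) -> Vec<(i64, &str)> {
        self.files
            .iter()
            .flat_map(|(name, rows)| rows.iter().map(move |row| (*row, name.as_str())))
            .collect()
    }
}

// Sample data (made up): two files holding three rows in total.
fn sample_table() -> FileBackedTable {
    FileBackedTable {
        files: vec![
            ("a.parquet".to_string(), vec![1, 2]),
            ("b.parquet".to_string(), vec![3]),
        ],
    }
}

fn main() {
    let table = sample_table();
    println!("option 1: {:?}", table.registered_files());
    println!("option 2: {:?}", table.rows_with_source());
}
```

Under option 2 the result has one entry per row rather than one per file, which is why the information has to be attached during the scan instead of being looked up afterwards.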
   
   ------
   
   Friendly ping @alamb and @jorgecarleitao, since you were the reviewers of 
the previous PR. I would like to hear your thoughts! :)
   
   Also, I haven't made any serious contributions to DataFusion yet, so guidance 
would be highly appreciated!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

