[ https://issues.apache.org/jira/browse/HIVE-837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12790263#action_12790263 ]
Carl Steinbach commented on HIVE-837:
-------------------------------------

I think this can be implemented as either a UDF or virtual column as long as we choose to define FILE() as the directory path that contains the table's/partition's data files, but I think things get significantly more complicated if we want FILE() to evaluate to the actual *file* that the current row is drawn from. At compile-time we have access to the table/partition directory and can retrieve a list of data files, but I don't think it is possible to make the row<->File association until runtime. Does this make sense or am I missing something?

> virtual column support (filename) in hive
> -----------------------------------------
>
>                 Key: HIVE-837
>                 URL: https://issues.apache.org/jira/browse/HIVE-837
>             Project: Hadoop Hive
>          Issue Type: New Feature
>          Components: Query Processor
>            Reporter: Namit Jain
>
> Copying from some mails:
> I am dumping files into a hive partition on five minute intervals. I am using
> LOAD DATA into a partition.
> weblogs
> web1.00
> web1.05
> web1.10
> ...
> web2.00
> web2.05
> web1.10
> ....
> Things that would be useful..
> Select files from the folder with a regex or exact name
> select * FROM logs where FILENAME LIKE(WEB1*)
> select * FROM LOGS WHERE FILENAME=web2.00
> Also it would be nice to be able to select offsets in a file, this would make
> sense with appends
> select * from logs WHERE FILENAME=web2.00 FROMOFFSET=454644 [TOOFFSET=]
> select
> substr(filename, 4, 7) as class_A,
> substr(filename, 8, 10) as class_B,
> count( x ) as cnt
> from FOO
> group by
> substr(filename, 4, 7),
> substr(filename, 8, 10) ;
> Hive should support virtual columns

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
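
To illustrate the compile-time versus runtime distinction raised in the comment above, here is a minimal Python sketch. It is not Hive code: the function names `plan_time_files` and `scan_with_filename` are hypothetical, and a local temporary directory stands in for a partition directory. The point is only that the list of files is known before any row is read, while the file a given row came from is known only inside the reader loop.

```python
import os
import tempfile


def plan_time_files(partition_dir):
    """What is knowable before reading any row: the directory and its files."""
    return sorted(os.listdir(partition_dir))


def scan_with_filename(partition_dir):
    """Tag each row with its source file; only possible while reading."""
    for name in sorted(os.listdir(partition_dir)):
        with open(os.path.join(partition_dir, name)) as fh:
            for line in fh:
                # The row-to-file association exists only at this point.
                yield name, line.rstrip("\n")


def demo():
    """Build a toy partition directory (assumed layout) and scan it."""
    with tempfile.TemporaryDirectory() as d:
        for name, rows in {"web1.00": ["a", "b"], "web2.00": ["c"]}.items():
            with open(os.path.join(d, name), "w") as fh:
                fh.write("\n".join(rows) + "\n")
        return plan_time_files(d), list(scan_with_filename(d))
```

A filter such as `FILENAME=web2.00` could in principle prune the plan-time file list, whereas exposing the filename as a per-row value requires the reader to supply it during the scan.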