[ https://issues.apache.org/jira/browse/IMPALA-9723?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zoltán Borók-Nagy updated IMPALA-9723:
--------------------------------------
    Priority: Minor  (was: Major)

> Read files created by Hive Streaming Ingestion V2
> -------------------------------------------------
>
>                 Key: IMPALA-9723
>                 URL: https://issues.apache.org/jira/browse/IMPALA-9723
>             Project: IMPALA
>          Issue Type: Sub-task
>          Components: Frontend
>            Reporter: Zoltán Borók-Nagy
>            Priority: Minor
>
> Impala should be able to read files created by Hive Streaming Ingestion V2.
> Hive Streaming only writes full ACID ORC files. Such files might contain row
> stripes that Impala shouldn't read based on its validWriteIdList.
> Also, Hive Streaming might append to the end of such files. In that case it
> writes a "side file" next to the file that contains the last committed file
> end (name of it is file name + _flush_length). Impala should take that into
> consideration when it reads such files. Everything after "flush length" must
> be ignored.
> OrcAcidUtils.getLastFlushLength(fileSystem, filePath) can be used to
> determine the committed file size.
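
The flush-length rule described in the issue can be illustrated with a minimal sketch. This is not Impala's implementation and it does not call OrcAcidUtils; it uses plain java.nio on a local file system instead of a Hadoop FileSystem. It assumes the side file is a sequence of 8-byte big-endian longs in which the last complete entry is the committed length; that layout should be verified against the ORC sources. The names FlushLengthSketch, sideFileOf and committedLength are hypothetical.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper, for illustration only.
final class FlushLengthSketch {
    // Suffix named in the issue: side file = data file name + "_flush_length".
    static final String SIDE_FILE_SUFFIX = "_flush_length";

    // Returns the path of the side file that sits next to the data file.
    static Path sideFileOf(Path dataFile) {
        return dataFile.resolveSibling(dataFile.getFileName() + SIDE_FILE_SUFFIX);
    }

    // Returns how many leading bytes of dataFile a reader may trust.
    // Bytes past this offset must be ignored.
    static long committedLength(Path dataFile) throws IOException {
        long physicalLength = Files.size(dataFile);
        Path sideFile = sideFileOf(dataFile);
        if (!Files.exists(sideFile)) {
            // No side file: the file is not being appended to, so all of it counts.
            return physicalLength;
        }
        byte[] raw = Files.readAllBytes(sideFile);
        int completeEntries = raw.length / Long.BYTES;
        if (completeEntries == 0) {
            // Assumption: a side file without a complete entry means nothing
            // has been committed yet.
            return 0L;
        }
        // ASSUMED layout: 8-byte big-endian longs, last complete one wins.
        // A partially written trailing entry is skipped.
        long flushLength =
            ByteBuffer.wrap(raw).getLong((completeEntries - 1) * Long.BYTES);
        // Never report more bytes than physically exist.
        return Math.min(flushLength, physicalLength);
    }
}
```

A scanner would then restrict itself to the byte range [0, committedLength(file)) and, within that range, still skip rows whose write IDs are not valid according to its validWriteIdList.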