[jira] [Commented] (IMPALA-8600) Reload partition does not work for transactional tables
[ https://issues.apache.org/jira/browse/IMPALA-8600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16902575#comment-16902575 ]

ASF subversion and git services commented on IMPALA-8600:
---------------------------------------------------------

Commit df7730f7fea948409958db2dfdce2c52b0333c2f in impala's branch refs/heads/master from Gabor Kaszab
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=df7730f ]

IMPALA-8600: Fix test_acid_compaction

Apparently Hive doesn't update the writeIds on a transactional table after compaction. This breaks an assumption made in the REFRESH table logic that only does an actual refresh when the HMS writeId is different than the one cached locally. As a result the non-partitioned ACID tables aren't refreshed when a REFRESH table is invoked in Impala right after a major compaction.

Change-Id: I58b79f8864b31e18eca818032ad5a9af954913f6
Reviewed-on: http://gerrit.cloudera.org:8080/14027
Reviewed-by: Impala Public Jenkins
Tested-by: Impala Public Jenkins

> Reload partition does not work for transactional tables
> -------------------------------------------------------
>
>                 Key: IMPALA-8600
>                 URL: https://issues.apache.org/jira/browse/IMPALA-8600
>             Project: IMPALA
>          Issue Type: Bug
>            Reporter: Vihang Karajgaonkar
>            Assignee: Gabor Kaszab
>            Priority: Major
>              Labels: impala-acid
>             Fix For: Impala 3.3.0
>
>
> If a table is transactional, a reload partition call should fetch the valid
> writeIds. Without doing this, the reload will skip adding all the newly
> created delta files of the transactional table pertaining to the new writeIds.

--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Commented] (IMPALA-8600) Reload partition does not work for transactional tables
[ https://issues.apache.org/jira/browse/IMPALA-8600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16901863#comment-16901863 ]

Gabor Kaszab commented on IMPALA-8600:
--------------------------------------

Apparently, Hive doesn't update the writeIds during a major compaction. This breaks the minor optimisation I implemented: For non-partitioned tables during a refresh I first get the table-level writeId and compare it with the cached writeId. If they match then I don't reload the table from HMS. However, this way Impala won't see the new base directory after a major compaction even if the user ran a REFRESH on the table. So I have to remove that optimisation and simply reload the whole table if it's transactional.
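The failure mode described in this comment can be illustrated with a toy model. This is only a sketch and not Impala's catalog code: `HmsTable`, `CachedTable`, `major_compaction`, and both refresh functions are hypothetical names, and the directory labels are placeholders rather than Hive's real directory naming. The one behaviour taken from the comment is that a major compaction replaces the deltas with a base directory without changing the writeId.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class HmsTable:
    """Hypothetical stand-in for the table state held by HMS and the filesystem."""
    write_id: int
    dirs: List[str]


@dataclass
class CachedTable:
    """Hypothetical stand-in for the locally cached table metadata."""
    write_id: int
    dirs: List[str]


def major_compaction(table: HmsTable) -> None:
    # Assumption taken from the comment: the deltas are merged into a new
    # base directory, but the table-level writeId stays the same.
    table.dirs = ["base"]


def refresh_with_write_id_check(cache: CachedTable, hms: HmsTable) -> bool:
    """The removed optimisation: reload only if the writeId changed."""
    if cache.write_id == hms.write_id:
        return False  # skipped, the cache keeps its old directory list
    cache.write_id, cache.dirs = hms.write_id, list(hms.dirs)
    return True


def refresh_unconditionally(cache: CachedTable, hms: HmsTable) -> bool:
    """The replacement behaviour: always reload a transactional table."""
    cache.write_id, cache.dirs = hms.write_id, list(hms.dirs)
    return True


hms = HmsTable(write_id=5, dirs=["delta_4", "delta_5"])
cache = CachedTable(write_id=5, dirs=list(hms.dirs))
major_compaction(hms)

skipped = not refresh_with_write_id_check(cache, hms)
stale_dirs = list(cache.dirs)      # still the pre-compaction deltas
refresh_unconditionally(cache, hms)
fresh_dirs = list(cache.dirs)      # now the new base directory
```

In this model the writeId check skips the reload after compaction, so the cache keeps pointing at the old deltas, while the unconditional reload picks up the new base directory.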
[jira] [Commented] (IMPALA-8600) Reload partition does not work for transactional tables
[ https://issues.apache.org/jira/browse/IMPALA-8600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16901492#comment-16901492 ]

ASF subversion and git services commented on IMPALA-8600:
---------------------------------------------------------

Commit 972104b6d6611ba0c1667671f9c25061fbe19b55 in impala's branch refs/heads/master from Gabor Kaszab
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=972104b ]

IMPALA-8600: AnalyzerTest.TestAnalyzeTransactional() test fix

Adjusts expected error message in AnalyzerTest.TestAnalyzeTransactional() after rewriting the message.

Change-Id: I7f1ed5da8cd3511eae4db12fb5ce1235aee50fd6
Reviewed-on: http://gerrit.cloudera.org:8080/14017
Reviewed-by: Impala Public Jenkins
Tested-by: Impala Public Jenkins
[jira] [Commented] (IMPALA-8600) Reload partition does not work for transactional tables
[ https://issues.apache.org/jira/browse/IMPALA-8600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16898332#comment-16898332 ]

ASF subversion and git services commented on IMPALA-8600:
---------------------------------------------------------

Commit 2d819655118c8c6e82649e3c3821311f3dd01174 in impala's branch refs/heads/master from Gabor Kaszab
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=2d81965 ]

IMPALA-8600: Refresh transactional tables

Refreshing a subset of partitions in a transactional table might leave that table in an inconsistent state. As a fix, user-initiated partition refreshes are no longer allowed on ACID tables. Additionally, a refresh partition Metastore event now triggers a refresh of the whole ACID table.

An optimisation is implemented for non-partitioned tables: check the latest table-level writeId cached locally, fetch the same from HMS, and do a refresh only if they don't match. This couldn't be done for partitioned tables, as apparently Hive doesn't update the table-level writeId if the transactional table is partitioned. Similarly, checking the writeId for each partition and refreshing only the ones whose writeId is not up to date is not feasible either, as there is no writeId update on either the table level or the partition level when Hive makes schema changes such as adding a column. So after adding a column in Hive to a partitioned ACID table and refreshing that table in Impala, Impala still wouldn't see the new column. Hence, I unconditionally refresh the whole table if it's ACID and partitioned. Note that for non-partitioned ACID tables Hive updates the table-level writeId even for schema changes.
Change-Id: I1851da22452074dbe253bcdd97145e06c7552cd3
Reviewed-on: http://gerrit.cloudera.org:8080/13938
Reviewed-by: Csaba Ringhofer
Tested-by: Impala Public Jenkins
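The refresh rules stated in this commit message can be summarised as a small decision function. This is a rough sketch under assumptions, not the code from the commit: `RefreshAction`, `CachedTable`, `decide_refresh`, and their parameters are hypothetical names chosen for illustration, and the non-ACID branch simply assumes the ordinary table or partition reload.

```python
from dataclasses import dataclass
from enum import Enum


class RefreshAction(Enum):
    REJECT = "reject"                      # request is refused
    SKIP = "skip"                          # cache is considered up to date
    RELOAD_PARTITION = "reload_partition"  # reload only the named partition
    RELOAD_TABLE = "reload_table"          # reload the whole table


@dataclass
class CachedTable:
    """Hypothetical view of the locally cached table metadata."""
    is_acid: bool
    is_partitioned: bool
    cached_write_id: int


def decide_refresh(table: CachedTable, hms_write_id: int,
                   partition_only: bool = False,
                   user_initiated: bool = True) -> RefreshAction:
    if not table.is_acid:
        # Assumption: non-transactional tables keep their usual behaviour.
        if partition_only:
            return RefreshAction.RELOAD_PARTITION
        return RefreshAction.RELOAD_TABLE
    if partition_only and user_initiated:
        # User-initiated partition refreshes are not allowed on ACID tables.
        return RefreshAction.REJECT
    if table.is_partitioned:
        # Covers partition-level Metastore events too: the whole table is
        # refreshed, because the writeIds are not a reliable change signal.
        return RefreshAction.RELOAD_TABLE
    # Non-partitioned ACID table: compare table-level writeIds.
    if hms_write_id == table.cached_write_id:
        return RefreshAction.SKIP
    return RefreshAction.RELOAD_TABLE


acid_partitioned = CachedTable(is_acid=True, is_partitioned=True, cached_write_id=7)
acid_flat = CachedTable(is_acid=True, is_partitioned=False, cached_write_id=7)
```

For example, `decide_refresh(acid_partitioned, 7, partition_only=True)` gives `REJECT`, the same call with `user_initiated=False` (a Metastore event) gives `RELOAD_TABLE`, and `decide_refresh(acid_flat, 7)` gives `SKIP`.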
[jira] [Commented] (IMPALA-8600) Reload partition does not work for transactional tables
[ https://issues.apache.org/jira/browse/IMPALA-8600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16898212#comment-16898212 ]

Gabor Kaszab commented on IMPALA-8600:
--------------------------------------

Created a follow-up Jira to refresh only a subset of partitions based on writeId change in case of a full table refresh.
https://issues.apache.org/jira/browse/IMPALA-8809
[jira] [Commented] (IMPALA-8600) Reload partition does not work for transactional tables
[ https://issues.apache.org/jira/browse/IMPALA-8600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16896178#comment-16896178 ]

Gabor Kaszab commented on IMPALA-8600:
--------------------------------------

I ran into a Hive issue during the implementation: Hive doesn't increment any writeIds if I change the schema of a partitioned ACID table. It increments the high watermark, however. I opened a Jira: https://issues.apache.org/jira/browse/HIVE-22062