[jira] [Updated] (IMPALA-9664) Insert events on transactional tables need to call addWriteNotificationLog API
[ https://issues.apache.org/jira/browse/IMPALA-9664?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dinesh Garg updated IMPALA-9664:
--------------------------------
    Priority: Critical  (was: Major)

> Insert events on transactional tables need to call addWriteNotificationLog API
> ------------------------------------------------------------------------------
>
> Key: IMPALA-9664
> URL: https://issues.apache.org/jira/browse/IMPALA-9664
> Project: IMPALA
> Issue Type: Bug
> Reporter: Vihang Karajgaonkar
> Assignee: Xiaomeng Zhang
> Priority: Critical
>
> According to the Hive source code, insert events for transactional tables
> are fired with a different API, {{addWriteNotificationLog}}.
> Currently Impala fires {{fireListenerEvent}} for both transactional and
> non-transactional tables. We should determine how the two APIs differ
> and whether we need to handle transactional tables differently.
> References:
> https://github.com/apache/hive/blob/c3afb57bdb1041f566fbbd896f625328fc9656a0/ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java#L2402
> https://github.com/apache/hive/blob/c3afb57bdb1041f566fbbd896f625328fc9656a0/ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java#L2236

--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Assigned] (IMPALA-9253) Blacklist additional posix error codes for failed DataStreamService RPCs
[ https://issues.apache.org/jira/browse/IMPALA-9253?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dinesh Garg reassigned IMPALA-9253:
-----------------------------------
    Assignee: Csaba Ringhofer

> Blacklist additional posix error codes for failed DataStreamService RPCs
> ------------------------------------------------------------------------
>
> Key: IMPALA-9253
> URL: https://issues.apache.org/jira/browse/IMPALA-9253
> Project: IMPALA
> Issue Type: Sub-task
> Reporter: Sahil Takiar
> Assignee: Csaba Ringhofer
> Priority: Major
>
> Filing as a follow-up to
> [IMPALA-9137|http://issues.cloudera.org/browse/IMPALA-9137].
> [IMPALA-9137|http://issues.cloudera.org/browse/IMPALA-9137] blacklists a node
> if an RPC fails with specific posix error codes:
> * 107 = ENOTCONN: Transport endpoint is not connected
> * 108 = ESHUTDOWN: Cannot send after transport endpoint shutdown
> * 111 = ECONNREFUSED: Connection refused
> These codes were produced by running a query, killing a node running that
> query, and then seeing what error codes the query failed with.
> There may be other error codes that are worth using for node blacklisting as
> well. One way to come up with more error codes is to use iptables to
> introduce network faults between Impala processes and see how RPCs fail.
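The check described in this issue can be sketched as a simple set-membership test. This is only an illustration, not Impala's actual implementation: the names `BLACKLISTABLE_ERRNOS` and `should_blacklist` are hypothetical, and the numeric values quoted in the issue (107, 108, 111) are the Linux numbering of the symbolic constants used here.

```python
import errno

# Error codes named in the issue. On Linux these are 107, 108 and 111;
# the symbolic names are used so the sketch does not depend on the numbering.
BLACKLISTABLE_ERRNOS = frozenset({
    errno.ENOTCONN,      # Transport endpoint is not connected
    errno.ESHUTDOWN,     # Cannot send after transport endpoint shutdown
    errno.ECONNREFUSED,  # Connection refused
})


def should_blacklist(rpc_errno):
    """Return True if a failed RPC's posix error code is in the blacklist set.

    Hypothetical helper: extending the set is how additional codes found
    through fault injection would be taken into account.
    """
    return rpc_errno in BLACKLISTABLE_ERRNOS
```

Keeping the codes in one set means that any additional codes discovered with iptables-based fault injection only require a one-line change.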
[jira] [Assigned] (IMPALA-9253) Blacklist additional posix error codes for failed DataStreamService RPCs
[ https://issues.apache.org/jira/browse/IMPALA-9253?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dinesh Garg reassigned IMPALA-9253:
-----------------------------------
    Assignee:     (was: Csaba Ringhofer)

> Blacklist additional posix error codes for failed DataStreamService RPCs
> ------------------------------------------------------------------------
>
> Key: IMPALA-9253
> URL: https://issues.apache.org/jira/browse/IMPALA-9253
> Project: IMPALA
> Issue Type: Sub-task
> Reporter: Sahil Takiar
> Priority: Major
>
> Filing as a follow-up to
> [IMPALA-9137|http://issues.cloudera.org/browse/IMPALA-9137].
> [IMPALA-9137|http://issues.cloudera.org/browse/IMPALA-9137] blacklists a node
> if an RPC fails with specific posix error codes:
> * 107 = ENOTCONN: Transport endpoint is not connected
> * 108 = ESHUTDOWN: Cannot send after transport endpoint shutdown
> * 111 = ECONNREFUSED: Connection refused
> These codes were produced by running a query, killing a node running that
> query, and then seeing what error codes the query failed with.
> There may be other error codes that are worth using for node blacklisting as
> well. One way to come up with more error codes is to use iptables to
> introduce network faults between Impala processes and see how RPCs fail.
[jira] [Work started] (IMPALA-9625) Impala's COMPUTE STATS statement generates duplicate ALTER events
[ https://issues.apache.org/jira/browse/IMPALA-9625?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Work on IMPALA-9625 started by Dinesh Garg.
-------------------------------------------
> Impala's COMPUTE STATS statement generates duplicate ALTER events
> -----------------------------------------------------------------
>
> Key: IMPALA-9625
> URL: https://issues.apache.org/jira/browse/IMPALA-9625
> Project: IMPALA
> Issue Type: Bug
> Components: Frontend
> Reporter: Fang-Yu Rao
> Assignee: Dinesh Garg
> Priority: Critical
>
> Impala's COMPUTE STATS statement results in the registration of the ALTER
> event twice. One is in {{Analyzer#registerAuthAndAuditEvent()}} at
> [https://github.com/apache/impala/blob/master/fe/src/main/java/org/apache/impala/analysis/Analyzer.java#L3131-L3133]
> and the other is in {{Analyzer#getTable()}} at
> [https://github.com/apache/impala/blob/master/fe/src/main/java/org/apache/impala/analysis/Analyzer.java#L2862-L2863].
> In {{registerAuthAndAuditEvent()}}, the corresponding full table name
> {{table.getFullName()}} is produced by a call to
> {{Analyzer#resolveTableRef()}}
> ([https://github.com/apache/impala/blob/master/fe/src/main/java/org/apache/impala/analysis/ComputeStatsStmt.java#L352]).
> The resulting database and table names are both in lowercase.
> However, in {{getTable()}}, the fully-qualified table name is produced by a
> call to {{Analyzer#getFqTableName()}} at
> [https://github.com/apache/impala/blob/master/fe/src/main/java/org/apache/impala/analysis/Analyzer.java#L2836].
> The resulting database and table names are in the original, unconverted
> form provided by the user from the Impala shell. Hence, there is no guarantee
> that the database and table names are both in lowercase.
> Therefore, if a user does not provide lowercase database and table names, the
> full table names returned from {{registerAuthAndAuditEvent()}} and
> {{getTable()}} would differ, resulting in duplicate ALTER events for the same
> table.
> We should at least make the full table name consistent every time we
> register such an audit event to avoid duplicate entries in the log.
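The effect described in this issue can be illustrated with a small sketch. It assumes, for illustration only, that audit events are deduplicated by the pair (full table name, privilege); the class `AuditLog` and the function `normalize_table_name` are hypothetical names and are not part of Impala's Analyzer.

```python
def normalize_table_name(db, table):
    """Build a fully-qualified table name in one canonical (lowercase) form."""
    return f"{db.lower()}.{table.lower()}"


class AuditLog:
    """Hypothetical stand-in for an audit-event registry keyed by (name, privilege)."""

    def __init__(self):
        self._events = set()

    def register(self, db, table, privilege):
        # Normalizing before insertion makes both code paths produce the same key.
        self._events.add((normalize_table_name(db, table), privilege))

    def events(self):
        return sorted(self._events)


log = AuditLog()
log.register("functional", "alltypes", "ALTER")   # name already lowercased
log.register("Functional", "AllTypes", "ALTER")   # name as typed by the user
```

Without the normalization step, the two `register` calls would store two different keys, which is the duplicate-entry behavior the issue reports.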
[jira] [Assigned] (IMPALA-9625) Impala's COMPUTE STATS statement generates duplicate ALTER events
[ https://issues.apache.org/jira/browse/IMPALA-9625?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dinesh Garg reassigned IMPALA-9625:
-----------------------------------
    Assignee: Fang-Yu Rao  (was: Dinesh Garg)

> Impala's COMPUTE STATS statement generates duplicate ALTER events
> -----------------------------------------------------------------
>
> Key: IMPALA-9625
> URL: https://issues.apache.org/jira/browse/IMPALA-9625
> Project: IMPALA
> Issue Type: Bug
> Components: Frontend
> Reporter: Fang-Yu Rao
> Assignee: Fang-Yu Rao
> Priority: Critical
>
> Impala's COMPUTE STATS statement results in the registration of the ALTER
> event twice. One is in {{Analyzer#registerAuthAndAuditEvent()}} at
> [https://github.com/apache/impala/blob/master/fe/src/main/java/org/apache/impala/analysis/Analyzer.java#L3131-L3133]
> and the other is in {{Analyzer#getTable()}} at
> [https://github.com/apache/impala/blob/master/fe/src/main/java/org/apache/impala/analysis/Analyzer.java#L2862-L2863].
> In {{registerAuthAndAuditEvent()}}, the corresponding full table name
> {{table.getFullName()}} is produced by a call to
> {{Analyzer#resolveTableRef()}}
> ([https://github.com/apache/impala/blob/master/fe/src/main/java/org/apache/impala/analysis/ComputeStatsStmt.java#L352]).
> The resulting database and table names are both in lowercase.
> However, in {{getTable()}}, the fully-qualified table name is produced by a
> call to {{Analyzer#getFqTableName()}} at
> [https://github.com/apache/impala/blob/master/fe/src/main/java/org/apache/impala/analysis/Analyzer.java#L2836].
> The resulting database and table names are in the original, unconverted
> form provided by the user from the Impala shell. Hence, there is no guarantee
> that the database and table names are both in lowercase.
> Therefore, if a user does not provide lowercase database and table names, the
> full table names returned from {{registerAuthAndAuditEvent()}} and
> {{getTable()}} would differ, resulting in duplicate ALTER events for the same
> table.
> We should at least make the full table name consistent every time we
> register such an audit event to avoid duplicate entries in the log.
[jira] [Assigned] (IMPALA-8632) Add support for self-event detection for insert events
[ https://issues.apache.org/jira/browse/IMPALA-8632?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dinesh Garg reassigned IMPALA-8632:
-----------------------------------
    Assignee: Xiaomeng Zhang  (was: Dinesh Garg)

> Add support for self-event detection for insert events
> ------------------------------------------------------
>
> Key: IMPALA-8632
> URL: https://issues.apache.org/jira/browse/IMPALA-8632
> Project: IMPALA
> Issue Type: Sub-task
> Reporter: Vihang Karajgaonkar
> Assignee: Xiaomeng Zhang
> Priority: Critical
>
> In the case of {{INSERT_EVENTS}}, if Impala inserts into a table, it causes a
> refresh of the underlying table/partition. This could be unnecessary when
> there is only one Impala cluster in the system. The existing self-event
> detection framework cannot identify such events because they do not send
> HMS objects like tables and partitions to the HMS. Instead, in the case of
> {{INSERT_EVENT}}, the HMS API only asks for a table name or partition value to
> fire an insert event on it.
> We can detect a self-event in such cases if the HMS API to fire a listener
> event is improved to return the event id. This would be used by
> EventProcessor to ignore the event when it is fetched later in the next
> polling cycle. In order to support this, we will need to make a change to
> Hive as well so that the enhanced API can be used.
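The proposal in this issue can be sketched as follows. This is a minimal illustration under stated assumptions, not the actual EventProcessor code: the class `SelfEventFilter`, its methods, and the dictionary shape of an event are all hypothetical, and it assumes the enhanced HMS API returns the id of the event it fired.

```python
class SelfEventFilter:
    """Hypothetical sketch: remember ids of events this cluster fired itself."""

    def __init__(self):
        self._self_event_ids = set()

    def record_fired_event(self, event_id):
        # Called with the event id returned by the (assumed) enhanced fire API.
        self._self_event_ids.add(event_id)

    def filter_polled(self, polled_events):
        """Return only the events that still require a table/partition refresh."""
        to_refresh = []
        for event in polled_events:
            if event["id"] in self._self_event_ids:
                # Self-event: the insert was made by this cluster, so skip the
                # refresh and forget the id, since it will not be seen again.
                self._self_event_ids.discard(event["id"])
                continue
            to_refresh.append(event)
        return to_refresh


events_filter = SelfEventFilter()
events_filter.record_fired_event(101)
remaining = events_filter.filter_polled([
    {"id": 100, "table": "db.t1"},
    {"id": 101, "table": "db.t2"},  # fired by this cluster
])
```

Matching on the event id avoids any dependence on HMS objects such as tables or partitions, which insert events do not carry.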
[jira] [Work started] (IMPALA-8632) Add support for self-event detection for insert events
[ https://issues.apache.org/jira/browse/IMPALA-8632?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Work on IMPALA-8632 started by Dinesh Garg.
-------------------------------------------
> Add support for self-event detection for insert events
> ------------------------------------------------------
>
> Key: IMPALA-8632
> URL: https://issues.apache.org/jira/browse/IMPALA-8632
> Project: IMPALA
> Issue Type: Sub-task
> Reporter: Vihang Karajgaonkar
> Assignee: Dinesh Garg
> Priority: Critical
>
> In the case of {{INSERT_EVENTS}}, if Impala inserts into a table, it causes a
> refresh of the underlying table/partition. This could be unnecessary when
> there is only one Impala cluster in the system. The existing self-event
> detection framework cannot identify such events because they do not send
> HMS objects like tables and partitions to the HMS. Instead, in the case of
> {{INSERT_EVENT}}, the HMS API only asks for a table name or partition value to
> fire an insert event on it.
> We can detect a self-event in such cases if the HMS API to fire a listener
> event is improved to return the event id. This would be used by
> EventProcessor to ignore the event when it is fetched later in the next
> polling cycle. In order to support this, we will need to make a change to
> Hive as well so that the enhanced API can be used.
[jira] [Assigned] (IMPALA-9529) OR predicates not applied correctly on table masking view
[ https://issues.apache.org/jira/browse/IMPALA-9529?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dinesh Garg reassigned IMPALA-9529:
-----------------------------------
    Assignee: Quanlong Huang

> OR predicates not applied correctly on table masking view
> ---------------------------------------------------------
>
> Key: IMPALA-9529
> URL: https://issues.apache.org/jira/browse/IMPALA-9529
> Project: IMPALA
> Issue Type: Bug
> Reporter: Quanlong Huang
> Assignee: Quanlong Huang
> Priority: Critical
> Labels: correctness
>
> Create a column masking policy on functional_parquet.complextypestbl table:
> id => 100 * id. The following query has incorrect results:
> {code:sql}
> select id, nested_struct.a from functional_parquet.complextypestbl t
> where id = 100 or nested_struct.a = 1;
> +-----+-----------------+
> | id  | nested_struct.a |
> +-----+-----------------+
> | 100 | 1               |
> | 200 | NULL            |
> | 300 | NULL            |
> | 400 | NULL            |
> | 500 | NULL            |
> | 600 | NULL            |
> | 700 | 7               |
> | 800 | -1              |
> +-----+-----------------+
> {code}
> Explaining the query shows somehow the predicates are not assigned:
> {code}
> Query: explain select id, nested_struct.a from functional_parquet.complextypestbl t
> where id = 100 or nested_struct.a = 1
>
> Max Per-Host Resource Reservation: Memory=16.00KB Threads=3
> Per-Host Resource Estimates: Memory=32MB
> WARNING: The following tables are missing relevant table and/or column statistics.
> functional_parquet.complextypestbl
> Analyzed query: SELECT id, nested_struct.a FROM (SELECT CAST(CAST(100 AS BIGINT)
> * id AS BIGINT) id FROM functional_parquet.complextypestbl t) WHERE id =
> CAST(100 AS BIGINT) OR nested_struct.a = CAST(1 AS INT)
>
> F01:PLAN FRAGMENT [UNPARTITIONED] hosts=1 instances=1
> Per-Host Resources: mem-estimate=57.78KB mem-reservation=0B thread-reservation=1
> PLAN-ROOT SINK
> |  output exprs: CAST(CAST(100 AS BIGINT) * id AS BIGINT), nested_struct.a
> |  mem-estimate=0B mem-reservation=0B thread-reservation=0
> |
> 01:EXCHANGE [UNPARTITIONED]
>    mem-estimate=57.78KB mem-reservation=0B thread-reservation=0
>    tuple-ids=0 row-size=12B cardinality=4.40K
>    in pipelines: 00(GETNEXT)
>
> F00:PLAN FRAGMENT [RANDOM] hosts=2 instances=2
> Per-Host Resources: mem-estimate=32.00MB mem-reservation=16.00KB thread-reservation=2
> DATASTREAM SINK [FRAGMENT=F01, EXCHANGE=01, UNPARTITIONED]
> |  mem-estimate=0B mem-reservation=0B thread-reservation=0
> 00:SCAN HDFS [functional_parquet.complextypestbl t, RANDOM]
>    HDFS partitions=1/1 files=2 size=6.92KB
>    stored statistics:
>      table: rows=unavailable size=unavailable
>      columns missing stats: id
>    extrapolated-rows=disabled max-scan-range-rows=unavailable
>    mem-estimate=32.00MB mem-reservation=16.00KB thread-reservation=1
>    tuple-ids=0 row-size=12B cardinality=4.40K
>    in pipelines: 00(GETNEXT)
> {code}
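For reference, the result the query should return can be worked out with a short sketch. It assumes the unmasked ids are 1 through 8 (derived by dividing the masked ids in the output above by 100) and that `nested_struct.a` is not masked; the function names are hypothetical and only model the semantics, not Impala's planner.

```python
# (id, nested_struct.a) rows before masking; ids are an assumption derived
# from the masked output in the issue (100..800 divided by 100).
rows = [(1, 1), (2, None), (3, None), (4, None),
        (5, None), (6, None), (7, 7), (8, -1)]


def masked_view(base_rows):
    """Apply the column masking policy id => 100 * id."""
    return [(100 * row_id, a) for row_id, a in base_rows]


def apply_where(view_rows):
    """Apply `id = 100 or nested_struct.a = 1` on top of the masked view.

    A NULL (None) value of nested_struct.a never satisfies the equality.
    """
    return [(row_id, a) for row_id, a in view_rows
            if row_id == 100 or (a is not None and a == 1)]


expected = apply_where(masked_view(rows))
```

Only the row with masked id 100 satisfies the OR predicate, whereas the output in the issue contains all eight rows, which matches the observation that the scan node in the plan carries no predicates.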
[jira] [Assigned] (IMPALA-9332) Investigate and use the new batch listing API from HDFS-13616
[ https://issues.apache.org/jira/browse/IMPALA-9332?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dinesh Garg reassigned IMPALA-9332: --- Assignee: Quanlong Huang (was: Vihang Karajgaonkar) > Investigate and use the new batch listing API from HDFS-13616 > - > > Key: IMPALA-9332 > URL: https://issues.apache.org/jira/browse/IMPALA-9332 > Project: IMPALA > Issue Type: Improvement >Reporter: Vihang Karajgaonkar >Assignee: Quanlong Huang >Priority: Critical > > HDFS-13616 provides a new batch listing API which can potentially speed up > the file listing on HDFS tables when reloading the table file metadata. We > should investigate if this API is helpful for Impala and use it if there are > any performance benefits. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Assigned] (IMPALA-8870) Bump guava version when building against Hive 3
[ https://issues.apache.org/jira/browse/IMPALA-8870?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dinesh Garg reassigned IMPALA-8870:
-----------------------------------
    Assignee: Fang-Yu Rao  (was: Vihang Karajgaonkar)

> Bump guava version when building against Hive 3
> -----------------------------------------------
>
> Key: IMPALA-8870
> URL: https://issues.apache.org/jira/browse/IMPALA-8870
> Project: IMPALA
> Issue Type: Improvement
> Components: Infrastructure
> Affects Versions: Impala 3.3.0
> Reporter: Tim Armstrong
> Assignee: Fang-Yu Rao
> Priority: Blocker
>
> Guava is pinned to 14.0.1:
> https://github.com/apache/impala/blob/8094811/impala-parent/pom.xml#L59
> {code}
> 14.0.1
> {code}
> I think this has likely changed in Hive 3 and we probably want to revisit
> this.
[jira] [Assigned] (IMPALA-9350) Ranger audits for column masking not produced
[ https://issues.apache.org/jira/browse/IMPALA-9350?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dinesh Garg reassigned IMPALA-9350: --- Assignee: Fang-Yu Rao (was: Dinesh Garg) > Ranger audits for column masking not produced > - > > Key: IMPALA-9350 > URL: https://issues.apache.org/jira/browse/IMPALA-9350 > Project: IMPALA > Issue Type: Bug >Reporter: Quanlong Huang >Assignee: Fang-Yu Rao >Priority: Critical > Attachments: Ranger Audit Events.png > > > The audits for applying Ranger column masking policies are missing. > Here are audit events for a query "SELECT * FROM default.sample_07" executed > by Hive and Impala. > !Ranger Audit Events.png|width=1259,height=327! > Policy 37 is a column masking policy on table default.sample_07. We should > produce the audit event when it's applied. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Work started] (IMPALA-9350) Ranger audits for column masking not produced
[ https://issues.apache.org/jira/browse/IMPALA-9350?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on IMPALA-9350 started by Dinesh Garg. --- > Ranger audits for column masking not produced > - > > Key: IMPALA-9350 > URL: https://issues.apache.org/jira/browse/IMPALA-9350 > Project: IMPALA > Issue Type: Bug >Reporter: Quanlong Huang >Assignee: Dinesh Garg >Priority: Critical > Attachments: Ranger Audit Events.png > > > The audits for applying Ranger column masking policies are missing. > Here are audit events for a query "SELECT * FROM default.sample_07" executed > by Hive and Impala. > !Ranger Audit Events.png|width=1259,height=327! > Policy 37 is a column masking policy on table default.sample_07. We should > produce the audit event when it's applied. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Assigned] (IMPALA-9350) Ranger audits for column masking not produced
[ https://issues.apache.org/jira/browse/IMPALA-9350?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dinesh Garg reassigned IMPALA-9350: --- Assignee: Fang-Yu Rao (was: Quanlong Huang) > Ranger audits for column masking not produced > - > > Key: IMPALA-9350 > URL: https://issues.apache.org/jira/browse/IMPALA-9350 > Project: IMPALA > Issue Type: Bug >Reporter: Quanlong Huang >Assignee: Fang-Yu Rao >Priority: Critical > > The audits for applying Ranger column masking policies are missing. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Assigned] (IMPALA-8632) Add support for self-event detection for insert events
[ https://issues.apache.org/jira/browse/IMPALA-8632?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dinesh Garg reassigned IMPALA-8632:
-----------------------------------
    Assignee: Xiaomeng Zhang  (was: Vihang Karajgaonkar)

> Add support for self-event detection for insert events
> ------------------------------------------------------
>
> Key: IMPALA-8632
> URL: https://issues.apache.org/jira/browse/IMPALA-8632
> Project: IMPALA
> Issue Type: Sub-task
> Reporter: Vihang Karajgaonkar
> Assignee: Xiaomeng Zhang
> Priority: Critical
>
> In the case of {{INSERT_EVENTS}}, if Impala inserts into a table, it causes a
> refresh of the underlying table/partition. This could be unnecessary when
> there is only one Impala cluster in the system. The existing self-event
> detection framework cannot identify such events because they do not send
> HMS objects like tables and partitions to the HMS. Instead, in the case of
> {{INSERT_EVENT}}, the HMS API only asks for a table name or partition value to
> fire an insert event on it.
> We can detect a self-event in such cases if the HMS API to fire a listener
> event is improved to return the event id. This would be used by
> EventProcessor to ignore the event when it is fetched later in the next
> polling cycle. In order to support this, we will need to make a change to
> Hive as well so that the enhanced API can be used.
[jira] [Updated] (IMPALA-8444) Analysis perf regression after IMPALA-7616
[ https://issues.apache.org/jira/browse/IMPALA-8444?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dinesh Garg updated IMPALA-8444:
--------------------------------
    Description: 
* The patch for IMPALA-7616 caused a performance regression in analysis time when run in an environment with ~1k roles and 10.5k privileges. The regression is evident when run as a role that has a large number of privileges. Following is the stack to look for when jstacking the coordinator.

{noformat}
"Thread-21" #49 prio=5 os_prio=0 tid=0x0cd6e000 nid=0x6a3d runnable [0x7fa28e4a]
   java.lang.Thread.State: RUNNABLE
	at java.lang.String.toLowerCase(String.java:2670)
	at org.apache.impala.catalog.PrincipalPrivilege.buildPrivilegeName(PrincipalPrivilege.java:82)
	at org.apache.impala.catalog.PrincipalPrivilege.getName(PrincipalPrivilege.java:143)
	at org.apache.impala.catalog.AuthorizationPolicy.listPrivileges(AuthorizationPolicy.java:423)
	- locked <0x7fa376987100> (a org.apache.impala.catalog.AuthorizationPolicy)
	at org.apache.impala.catalog.AuthorizationPolicy.listPrivileges(AuthorizationPolicy.java:443)
	- locked <0x7fa376987100> (a org.apache.impala.catalog.AuthorizationPolicy)
	at org.apache.sentry.provider.cache.SimpleCacheProviderBackend.getPrivileges(SimpleCacheProviderBackend.java:75)
	at org.apache.sentry.policy.db.SimpleDBPolicyEngine.getPrivileges(SimpleDBPolicyEngine.java:98)
	at org.apache.sentry.provider.common.ResourceAuthorizationProvider.getPrivileges(ResourceAuthorizationProvider.java:147)
	at org.apache.sentry.provider.common.ResourceAuthorizationProvider.doHasAccess(ResourceAuthorizationProvider.java:120)
	at org.apache.sentry.provider.common.ResourceAuthorizationProvider.hasAccess(ResourceAuthorizationProvider.java:107)
	at org.apache.impala.authorization.AuthorizationChecker.hasAccess(AuthorizationChecker.java:215)
	at org.apache.impala.authorization.AuthorizationChecker.checkAccess(AuthorizationChecker.java:128)
	at org.apache.impala.analysis.AnalysisContext.authorizePrivilegeRequest(AnalysisContext.java:592)
	at org.apache.impala.analysis.AnalysisContext.authorize(AnalysisContext.java:564)
	at org.apache.impala.analysis.AnalysisContext.analyzeAndAuthorize(AnalysisContext.java:415)
	at org.apache.impala.service.Frontend.doCreateExecRequest(Frontend.java:1240)
	at org.apache.impala.service.Frontend.getTExecRequest(Frontend.java:1210)
	at org.apache.impala.service.Frontend.createExecRequest(Frontend.java:1182)
	at org.apache.impala.service.JniFrontend.createExecRequest(JniFrontend.java:158)
{noformat}

The issue worsens when running concurrent workloads, because the underlying Sentry {{listPrivileges()}} is synchronized, which serializes all the query analysis requests.

{noformat}
  public synchronized Set listPrivileges(Set groups,
      ActiveRoleSet roleSet) {
    Set privileges = Sets.newHashSet();
    if (roleSet != ActiveRoleSet.ALL) {
      throw new UnsupportedOperationException("Impala does not support role subsets.");
    }
{noformat}

Notes:
 - If the authorization metadata footprint is small, this issue can be ignored.
 - One workaround is to run the query using a role that has a very small number of privileges (revoke privileges that are not necessary to run a given query).
 - Another workaround is to disable Sentry authorization.

  was:
The patch for IMPALA-7616 caused a performance regression in analysis time when run in an environment with ~1k roles and 10.5k privileges. The regression is evident when run as a role that has a large number of privileges. Following is the stack to look for when jstacking the coordinator.

{noformat}
"Thread-21" #49 prio=5 os_prio=0 tid=0x0cd6e000 nid=0x6a3d runnable [0x7fa28e4a]
   java.lang.Thread.State: RUNNABLE
	at java.lang.String.toLowerCase(String.java:2670)
	at org.apache.impala.catalog.PrincipalPrivilege.buildPrivilegeName(PrincipalPrivilege.java:82)
	at org.apache.impala.catalog.PrincipalPrivilege.getName(PrincipalPrivilege.java:143)
	at org.apache.impala.catalog.AuthorizationPolicy.listPrivileges(AuthorizationPolicy.java:423)
	- locked <0x7fa376987100> (a org.apache.impala.catalog.AuthorizationPolicy)
	at org.apache.impala.catalog.AuthorizationPolicy.listPrivileges(AuthorizationPolicy.java:443)
	- locked <0x7fa376987100> (a org.apache.impala.catalog.AuthorizationPolicy)
	at org.apache.sentry.provider.cache.SimpleCacheProviderBackend.getPrivileges(SimpleCacheProviderBackend.java:75)
	at org.apache.sentry.policy.db.SimpleDBPolicyEngine.getPrivileges(SimpleDBPolicyEngine.java:98)
	at org.apache.sentry.provider.common.ResourceAuthorizationProvider.getPrivileges(ResourceAuthorizationProvider.java:147)
	at
[jira] [Updated] (IMPALA-9072) Advanced features and write support in ORC support
[ https://issues.apache.org/jira/browse/IMPALA-9072?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dinesh Garg updated IMPALA-9072: Summary: Advanced features and write support in ORC support (was: Advanced features in ORC support) > Advanced features and write support in ORC support > -- > > Key: IMPALA-9072 > URL: https://issues.apache.org/jira/browse/IMPALA-9072 > Project: IMPALA > Issue Type: Epic >Reporter: Quanlong Huang >Priority: Major > > Support full functionality for read/write ORC file format tables. JIRAs in > this epic may have lower priority unless they're highly voted. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Updated] (IMPALA-9175) Revisit the error handling logics in ORC scanner
[ https://issues.apache.org/jira/browse/IMPALA-9175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dinesh Garg updated IMPALA-9175: Priority: Critical (was: Major) > Revisit the error handling logics in ORC scanner > > > Key: IMPALA-9175 > URL: https://issues.apache.org/jira/browse/IMPALA-9175 > Project: IMPALA > Issue Type: Task >Reporter: Quanlong Huang >Priority: Critical > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Updated] (IMPALA-6772) Enable test_scanners_fuzz for ORC format
[ https://issues.apache.org/jira/browse/IMPALA-6772?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dinesh Garg updated IMPALA-6772:
--------------------------------
    Priority: Critical  (was: Major)

> Enable test_scanners_fuzz for ORC format
> ----------------------------------------
>
> Key: IMPALA-6772
> URL: https://issues.apache.org/jira/browse/IMPALA-6772
> Project: IMPALA
> Issue Type: Bug
> Reporter: Quanlong Huang
> Assignee: Quanlong Huang
> Priority: Critical
>
> Currently, we haven't enabled test_scanners_fuzz for ORC yet, since the ORC
> library (release-1.4.3) is not robust against corrupt files (ORC-315). We should
> enable it after a new version of the ORC library is released.
[jira] [Updated] (IMPALA-9074) Add support for zstd in ORC
[ https://issues.apache.org/jira/browse/IMPALA-9074?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dinesh Garg updated IMPALA-9074:
--------------------------------
    Priority: Critical  (was: Major)

> Add support for zstd in ORC
> ---------------------------
>
> Key: IMPALA-9074
> URL: https://issues.apache.org/jira/browse/IMPALA-9074
> Project: IMPALA
> Issue Type: New Feature
> Reporter: Quanlong Huang
> Priority: Critical
> Attachments: id_name_zstd.orc
>
> The ORC lib already supports reading/writing zstd-compressed ORC files.
> However, a quick attempt to read one in Impala failed:
> {code:sql}
> hive> create table orc_zstd (id int, name string) stored as orc;
> $ hdfs dfs -put id_name_zstd.orc hdfs://localhost:20500/test-warehouse/orc_zstd
> impala-shell> invalidate metadata orc_zstd;
> impala-shell> select * from orc_zstd;
> ERROR: Encountered parse error in tail of ORC file hdfs://localhost:20500/test-warehouse/orc_zstd/id_name_zstd.orc: Unknown compression codec 5
> {code}
> The ORC file is generated by the csv-import tool:
> https://github.com/apache/orc/blob/rel/release-1.6.0/tools/src/CSVFileImport.cc
> (manually changing the compression from ZLIB to ZSTD in it)
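To read the error message in this issue, it helps to decode the numeric codec id. The sketch below assumes the `CompressionKind` numbering of the ORC file format's protobuf definition (NONE = 0 through ZSTD = 5); the dictionary and the function `describe_codec` are hypothetical helpers for illustration and not part of Impala or the ORC library.

```python
# CompressionKind values, assumed from the ORC format's protobuf schema.
ORC_COMPRESSION_KINDS = {
    0: "NONE",
    1: "ZLIB",
    2: "SNAPPY",
    3: "LZO",
    4: "LZ4",
    5: "ZSTD",
}


def describe_codec(codec_id):
    """Map a numeric ORC compression codec id to a readable name."""
    return ORC_COMPRESSION_KINDS.get(codec_id, f"UNKNOWN({codec_id})")
```

Under that assumption, `describe_codec(5)` yields `ZSTD`, which is consistent with the file having been written with ZSTD compression.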
[jira] [Updated] (IMPALA-6943) ORC support with full functionality
[ https://issues.apache.org/jira/browse/IMPALA-6943?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dinesh Garg updated IMPALA-6943: Priority: Critical (was: Major) > ORC support with full functionality > --- > > Key: IMPALA-6943 > URL: https://issues.apache.org/jira/browse/IMPALA-6943 > Project: IMPALA > Issue Type: Epic >Reporter: Quanlong Huang >Priority: Critical > > Support basic functionality for reading ORC file format tables including > stability works for edge cases. This is the first milestone to make ORC > support GA. Other advanced features will be tracked in IMPALA-9072. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Updated] (IMPALA-9174) Revisit the memory pattern of the ORC scanner
[ https://issues.apache.org/jira/browse/IMPALA-9174?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dinesh Garg updated IMPALA-9174: Priority: Critical (was: Major) > Revisit the memory pattern of the ORC scanner > - > > Key: IMPALA-9174 > URL: https://issues.apache.org/jira/browse/IMPALA-9174 > Project: IMPALA > Issue Type: Task >Reporter: Quanlong Huang >Priority: Critical > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Updated] (IMPALA-8184) Add timestamp validation to Orc scanner
[ https://issues.apache.org/jira/browse/IMPALA-8184?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dinesh Garg updated IMPALA-8184: Priority: Critical (was: Minor) > Add timestamp validation to Orc scanner > --- > > Key: IMPALA-8184 > URL: https://issues.apache.org/jira/browse/IMPALA-8184 > Project: IMPALA > Issue Type: Bug > Components: Backend >Reporter: Csaba Ringhofer >Priority: Critical > > Similarly to Parquet, Orc can also contain timestamps that are not valid in > Impala, e.g. Hive can insert timestamps before 1400 while these are invalid > in Impala. These invalid timestamps are often handled similarly to NULL, but > are actually not "real" NULLs, which can lead to some weird behavior: > Hive: > create table orcts (ts timestamp) stored as orc; > insert into orcts values ("1200-01-01"); > Impala: > select * from orcts where ts is not null; > Returns 1 row: > NULL
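The behavior described in IMPALA-8184 can be illustrated with a small sketch of range validation. It assumes Impala's documented TIMESTAMP range of 1400-01-01 through 9999-12-31; the function names are hypothetical and this is not the scanner's actual code:

```python
from datetime import datetime
from typing import List, Optional

# Impala's documented TIMESTAMP range; values outside it are invalid.
MIN_TS = datetime(1400, 1, 1)
MAX_TS = datetime(9999, 12, 31, 23, 59, 59, 999999)


def validate_timestamp(ts: Optional[datetime]) -> Optional[datetime]:
    """Turn an out-of-range timestamp into a real NULL (None)."""
    if ts is None or not (MIN_TS <= ts <= MAX_TS):
        return None
    return ts


def where_ts_is_not_null(rows: List[Optional[datetime]]) -> List[datetime]:
    """Hypothetical scan step: validate first, then apply 'ts IS NOT NULL'."""
    validated = (validate_timestamp(ts) for ts in rows)
    return [ts for ts in validated if ts is not None]
```

If validation happens before the predicate is evaluated, the row holding 1200-01-01 is filtered out by "ts is not null" instead of surfacing as a NULL that passed the filter.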
[jira] [Updated] (IMPALA-8046) Support CREATE TABLE from an ORC file
[ https://issues.apache.org/jira/browse/IMPALA-8046?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dinesh Garg updated IMPALA-8046: Priority: Critical (was: Major) > Support CREATE TABLE from an ORC file > - > > Key: IMPALA-8046 > URL: https://issues.apache.org/jira/browse/IMPALA-8046 > Project: IMPALA > Issue Type: New Feature >Reporter: Quanlong Huang >Priority: Critical > > Impala supports creating a table using the schema of a file. However, only > parquet is supported currently. This ticket tracks adding support for > creating table from ORC files.
[jira] [Assigned] (IMPALA-9009) Core support for column mask transformation in select list
[ https://issues.apache.org/jira/browse/IMPALA-9009?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dinesh Garg reassigned IMPALA-9009: --- Assignee: Quanlong Huang (was: Dinesh Garg) > Core support for column mask transformation in select list > -- > > Key: IMPALA-9009 > URL: https://issues.apache.org/jira/browse/IMPALA-9009 > Project: IMPALA > Issue Type: New Feature > Components: Frontend >Reporter: Kurt Deschler >Assignee: Quanlong Huang >Priority: Critical > > Identify masked columns from SELECT list. > Support custom (user supplied) mask SQL from Ranger. > Parse column mask expressions and substitute into original statement
[jira] [Work started] (IMPALA-9009) Core support for column mask transformation in select list
[ https://issues.apache.org/jira/browse/IMPALA-9009?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on IMPALA-9009 started by Dinesh Garg. --- > Core support for column mask transformation in select list > -- > > Key: IMPALA-9009 > URL: https://issues.apache.org/jira/browse/IMPALA-9009 > Project: IMPALA > Issue Type: New Feature > Components: Frontend >Reporter: Kurt Deschler >Assignee: Dinesh Garg >Priority: Critical > > Identify masked columns from SELECT list. > Support custom (user supplied) mask SQL from Ranger. > Parse column mask expressions and substitute into original statement
[jira] [Assigned] (IMPALA-9010) Support pre-defined mask types from Ranger UI
[ https://issues.apache.org/jira/browse/IMPALA-9010?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dinesh Garg reassigned IMPALA-9010: --- Assignee: Fang-Yu Rao > Support pre-defined mask types from Ranger UI > - > > Key: IMPALA-9010 > URL: https://issues.apache.org/jira/browse/IMPALA-9010 > Project: IMPALA > Issue Type: New Feature > Components: Frontend >Reporter: Kurt Deschler >Assignee: Fang-Yu Rao >Priority: Critical > > Review Hive implementation/behavior. > Redact/Partial/Hash/Nullify/Unmasked/Date > These will be implemented as static SQL transforms in Impala
[jira] [Work started] (IMPALA-9110) Add table loading time break-down metrics for HdfsTable
[ https://issues.apache.org/jira/browse/IMPALA-9110?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on IMPALA-9110 started by Dinesh Garg. --- > Add table loading time break-down metrics for HdfsTable > --- > > Key: IMPALA-9110 > URL: https://issues.apache.org/jira/browse/IMPALA-9110 > Project: IMPALA > Issue Type: New Feature > Components: Catalog, Frontend >Reporter: Jiawei Wang >Assignee: Dinesh Garg >Priority: Critical > > We are only able to get total table loading time right now, which makes it > really hard for us to debug why table loading is sometimes slow. Therefore, > it would be good to have break-down metrics showing how much time each function > takes when loading tables.
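One way to picture the break-down metrics requested in IMPALA-9110 is a timer that accumulates wall-clock time per named loading phase. This is only a sketch; the LoadTimer class and the phase names in the usage note are hypothetical and do not come from the catalog code:

```python
import time
from collections import defaultdict
from contextlib import contextmanager


class LoadTimer:
    """Accumulates wall-clock seconds per named table-loading phase."""

    def __init__(self):
        self.durations = defaultdict(float)

    @contextmanager
    def phase(self, name):
        """Time the enclosed block and add it to the phase's running total."""
        start = time.monotonic()
        try:
            yield
        finally:
            self.durations[name] += time.monotonic() - start

    def total(self):
        """Total time across all recorded phases."""
        return sum(self.durations.values())

    def report(self):
        """Return (phase, seconds) pairs, slowest phase first."""
        return sorted(self.durations.items(), key=lambda kv: kv[1], reverse=True)
```

A loader could wrap each step, for example "with timer.phase('load_file_metadata'): ...", and log timer.report() next to the existing total so that a slow load can be attributed to a specific step.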
[jira] [Updated] (IMPALA-9013) Column Masking DML support
[ https://issues.apache.org/jira/browse/IMPALA-9013?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dinesh Garg updated IMPALA-9013: Priority: Critical (was: Major) > Column Masking DML support > -- > > Key: IMPALA-9013 > URL: https://issues.apache.org/jira/browse/IMPALA-9013 > Project: IMPALA > Issue Type: New Feature > Components: Frontend >Reporter: Kurt Deschler >Priority: Critical > > Review Hive implementation to see if anything special needs to be done for > DML. The Hive column masking design doc does not reflect the current code.
[jira] [Updated] (IMPALA-9011) Support column masking on CTEs, views, and derived column names
[ https://issues.apache.org/jira/browse/IMPALA-9011?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dinesh Garg updated IMPALA-9011: Priority: Critical (was: Major) > Support column masking on CTEs, views, and derived column names > --- > > Key: IMPALA-9011 > URL: https://issues.apache.org/jira/browse/IMPALA-9011 > Project: IMPALA > Issue Type: New Feature > Components: Frontend >Reporter: Kurt Deschler >Priority: Critical > > CTE/views: dig out underlying column and table names > derived column names i.e. select * from (select 1) as foo - Handle > appropriately. > Also negative cases where the query has an invalid reference. i.e. > WITH foo AS (SELECT c1 FROM t1) SELECT c1 FROM FOO;
[jira] [Updated] (IMPALA-9012) Allow access to columns with column masks and update tests
[ https://issues.apache.org/jira/browse/IMPALA-9012?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dinesh Garg updated IMPALA-9012: Priority: Critical (was: Major) > Allow access to columns with column masks and update tests > -- > > Key: IMPALA-9012 > URL: https://issues.apache.org/jira/browse/IMPALA-9012 > Project: IMPALA > Issue Type: New Feature > Components: Frontend >Reporter: Kurt Deschler >Priority: Critical > > Remove check in RangerAuthorizationChecker::authorizeTableAccess > Remove testcase in RangerAuditLogTest.java
[jira] [Updated] (IMPALA-9010) Support pre-defined mask types from Ranger UI
[ https://issues.apache.org/jira/browse/IMPALA-9010?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dinesh Garg updated IMPALA-9010: Priority: Critical (was: Major) > Support pre-defined mask types from Ranger UI > - > > Key: IMPALA-9010 > URL: https://issues.apache.org/jira/browse/IMPALA-9010 > Project: IMPALA > Issue Type: New Feature > Components: Frontend >Reporter: Kurt Deschler >Priority: Critical > > Review Hive implementation/behavior. > Redact/Partial/Hash/Nullify/Unmasked/Date > These will be implemented as static SQL transforms in Impala
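The idea of "static SQL transforms" in IMPALA-9010 can be sketched as a lookup from mask type to an expression template. Everything here is an assumption for illustration: the template table is hypothetical, the function names mask, mask_show_last_n and mask_hash are Hive's built-in masking UDFs (whether and how Impala exposes equivalents is not stated in the ticket), "Partial" is shown as show-last-4 only, and the Date type is left out:

```python
# Hypothetical templates keyed by the mask types named in the ticket.
MASK_TEMPLATES = {
    "redact": "mask({col})",
    "partial": "mask_show_last_n({col}, 4)",
    "hash": "mask_hash({col})",
    "nullify": "NULL",
    "unmasked": "{col}",
}


def masked_select_item(col: str, mask_type: str) -> str:
    """Return the select-list item that replaces a masked column reference."""
    template = MASK_TEMPLATES.get(mask_type.lower())
    if template is None:
        raise ValueError(f"Unsupported mask type: {mask_type}")
    expr = template.format(col=col)
    # Keep the original column name so the result schema is unchanged.
    return expr if expr == col else f"{expr} AS {col}"
```

Because the transform is a pure string-to-string rewrite, it can be applied while the select list is analyzed, before the statement is planned.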
[jira] [Updated] (IMPALA-9079) Add Auth Interfaces to retrieve column masks and implement for Ranger
[ https://issues.apache.org/jira/browse/IMPALA-9079?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dinesh Garg updated IMPALA-9079: Priority: Critical (was: Major) > Add Auth Interfaces to retrieve column masks and implement for Ranger > - > > Key: IMPALA-9079 > URL: https://issues.apache.org/jira/browse/IMPALA-9079 > Project: IMPALA > Issue Type: New Feature >Reporter: Kurt Deschler >Priority: Critical > > Mask definitions can be retrieved from the Ranger plugin. Analyzer has > access to AuthorizationFactory via Analyzer::getAuthzFactory(). There are > currently no interfaces through AuthorizationFactory or AuthorizationChecker > to access the column masks from the plugin. These will need to be added and > then implemented for the Ranger plugin.
[jira] [Updated] (IMPALA-9009) Core support for column mask transformation in select list
[ https://issues.apache.org/jira/browse/IMPALA-9009?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dinesh Garg updated IMPALA-9009: Priority: Critical (was: Major) > Core support for column mask transformation in select list > -- > > Key: IMPALA-9009 > URL: https://issues.apache.org/jira/browse/IMPALA-9009 > Project: IMPALA > Issue Type: New Feature > Components: Frontend >Reporter: Kurt Deschler >Priority: Critical > > Identify masked columns from SELECT list. > Support custom (user supplied) mask SQL from Ranger. > Parse column mask expressions and substitute into original statement
[jira] [Assigned] (IMPALA-9108) Unused leveldbjni dependency triggers some security scanners
[ https://issues.apache.org/jira/browse/IMPALA-9108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dinesh Garg reassigned IMPALA-9108: --- Assignee: Tim Armstrong (was: Dinesh Garg) > Unused leveldbjni dependency triggers some security scanners > > > Key: IMPALA-9108 > URL: https://issues.apache.org/jira/browse/IMPALA-9108 > Project: IMPALA > Issue Type: Bug > Components: Infrastructure >Reporter: Tim Armstrong >Assignee: Tim Armstrong >Priority: Critical > > A windows dll in leveldbjni-all-1.8.jar is flagged by some security scanners. > We shouldn't have a dependency on leveldb, so we should exclude this and not > pull in the jar.
[jira] [Updated] (IMPALA-9108) Unused leveldbjni dependency triggers some security scanners
[ https://issues.apache.org/jira/browse/IMPALA-9108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dinesh Garg updated IMPALA-9108: Priority: Critical (was: Major) > Unused leveldbjni dependency triggers some security scanners > > > Key: IMPALA-9108 > URL: https://issues.apache.org/jira/browse/IMPALA-9108 > Project: IMPALA > Issue Type: Bug > Components: Infrastructure >Reporter: Tim Armstrong >Assignee: Tim Armstrong >Priority: Critical > > A windows dll in leveldbjni-all-1.8.jar is flagged by some security scanners. > We shouldn't have a dependency on leveldb, so we should exclude this and not > pull in the jar.
[jira] [Work started] (IMPALA-9108) Unused leveldbjni dependency triggers some security scanners
[ https://issues.apache.org/jira/browse/IMPALA-9108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on IMPALA-9108 started by Dinesh Garg. --- > Unused leveldbjni dependency triggers some security scanners > > > Key: IMPALA-9108 > URL: https://issues.apache.org/jira/browse/IMPALA-9108 > Project: IMPALA > Issue Type: Bug > Components: Infrastructure >Reporter: Tim Armstrong >Assignee: Dinesh Garg >Priority: Critical > > A windows dll in leveldbjni-all-1.8.jar is flagged by some security scanners. > We shouldn't have a dependency on leveldb, so we should exclude this and not > pull in the jar.
[jira] [Assigned] (IMPALA-9092) Fix "show create table" tests on USE_CDP_HIVE=true to account for HIVE-22158
[ https://issues.apache.org/jira/browse/IMPALA-9092?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dinesh Garg reassigned IMPALA-9092: --- Assignee: Vihang Karajgaonkar > Fix "show create table" tests on USE_CDP_HIVE=true to account for HIVE-22158 > > > Key: IMPALA-9092 > URL: https://issues.apache.org/jira/browse/IMPALA-9092 > Project: IMPALA > Issue Type: Bug > Components: Frontend >Affects Versions: Impala 3.4.0 >Reporter: Joe McDonnell >Assignee: Vihang Karajgaonkar >Priority: Blocker > > Hive changed behavior with HIVE-22158 so that only transactional tables are > considered managed and all other are considered external. This means that a > regular "create table" will result in an external table with table properties > of 'TRANSLATED_TO_EXTERNAL'='TRUE', 'external.table.purge'='TRUE'. This > breaks our tests that rely on "show create table", because the table is newly > external and has extra table properties. For example: > {noformat} > query_test/test_kudu.py:842: in test_primary_key_and_distribution > db=cursor.conn.db_name, kudu_addr=KUDU_MASTER_HOSTS)) > query_test/test_kudu.py:824: in assert_show_create_equals > assert cursor.fetchall()[0][0] == \ > E assert "CREATE EXTER...='localhost')" == "CREATE TABLE ...='localhost')" > E - CREATE EXTERNAL TABLE testshowcreatetable_15312_ggn1hk.nvbpxfuxze > E ?- > E + CREATE TABLE testshowcreatetable_15312_ggn1hk.nvbpxfuxze ( > E ? ++ > E + c INT NOT NULL ENCODING AUTO_ENCODING COMPRESSION > DEFAULT_COMPRESSION, > E + PRIMARY KEY (c) > E + ) > E + PARTITION BY HASH (c) PARTITIONS 3 > E STORED AS KUDU > E - TBLPROPERTIES ('TRANSLATED_TO_EXTERNAL'='TRUE', > 'external.table.purge'='TRUE', 'kudu.master_addresses'='localhost') > E + TBLPROPERTIES ('kudu.master_addresses'='localhost'){noformat} > We need to decide on the right behavior for "show create table" and update > the tests. 
> For Kudu tables, tables with TRANSLATED_TO_EXTERNAL=true and > external.table.purge=TRUE should be equivalent to a non-external Kudu table, > and we can just detect this case and generate the same SQL as before. > Other cases may need new logic. I think it makes sense to also address other > tests due to MANAGED vs EXTERNAL distinction or extra table properties with > this JIRA. Here is a list of tests that seem to have this problem: > {noformat} > metadata/test_ddl.py TestDdlStatements.test_create_alter_tbl_properties > metadata/test_show_create_table.py * > query_test/test_kudu.py TestShowCreateTable* > org.apache.impala.catalog.CatalogTest.testCreateTableMetadata > org.apache.impala.catalog.local.LocalCatalogTest.testKuduTable{noformat}
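The detection proposed in IMPALA-9092 for Kudu tables can be sketched as a check on the two table properties quoted in the ticket. The helper names, the "EXTERNAL_TABLE" type string, and the decision to hide both properties from the output are assumptions made for illustration, not the behavior that was eventually implemented:

```python
# Properties added by the translation described in HIVE-22158.
TRANSLATION_PROPS = ("TRANSLATED_TO_EXTERNAL", "external.table.purge")


def is_translated_managed(tbl_type: str, props: dict) -> bool:
    """True if an external table carries both translation markers set to TRUE."""
    return tbl_type == "EXTERNAL_TABLE" and all(
        props.get(key, "").upper() == "TRUE" for key in TRANSLATION_PROPS
    )


def show_create_parts(name: str, tbl_type: str, props: dict):
    """Return the CREATE clause and the table properties to print."""
    if is_translated_managed(tbl_type, props):
        # Treat it like a managed table and hide the translation markers.
        visible = {k: v for k, v in props.items() if k not in TRANSLATION_PROPS}
        return f"CREATE TABLE {name}", visible
    if tbl_type == "EXTERNAL_TABLE":
        return f"CREATE EXTERNAL TABLE {name}", dict(props)
    return f"CREATE TABLE {name}", dict(props)
```

With this check, the failing assertion quoted in the ticket would compare equal again, because the generated statement drops both the EXTERNAL keyword and the two extra properties.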
[jira] [Updated] (IMPALA-8937) Fine grained table metadata loading on Catalog server
[ https://issues.apache.org/jira/browse/IMPALA-8937?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dinesh Garg updated IMPALA-8937: Priority: Critical (was: Major) > Fine grained table metadata loading on Catalog server > - > > Key: IMPALA-8937 > URL: https://issues.apache.org/jira/browse/IMPALA-8937 > Project: IMPALA > Issue Type: Improvement > Components: Catalog, Frontend >Affects Versions: Impala 2.12.0, Impala 3.3.0 >Reporter: Bharath Vissapragada >Priority: Critical > > *Background*: > Currently the table _on the Catalog server_ is either in a loaded or unloaded > state (IncompleteTable). When Catalog server starts for the first time, we > first fetch a list of table names for each database and every table in this > list starts as an unloaded table. The table lists are propagated to the > coordinators so that they know whether a table with a given name exists or > not and they can start analyzing the queries. No metadata is loaded in the > incomplete tables (like schema/ownership, comments etc.) > The table metadata is loaded lazily (and the table moves into a loaded state) > when it is referenced in any query. When a load request comes in, all the > table metadata is loaded including file block information. > *Problem:* > Coordinators need some additional information when analyzing unloaded tables. > For example: IMPALA-8228. The ownership information is a part of the HMS > table schema which is not loaded until the table is marked fully loaded. > While this is not a problem for regular queries (like select * from ), > it is an issue with queries like "show tables" which do not trigger a table > load. In this particular case, due to the lack of ownership information, the > output of the table listing could be different depending on whether the table > is loaded. Another example is IMPALA-8606 where the GET_TABLES request does > not return the table comments because they are not available for unloaded > tables. 
> *Ask:* > We need to consider finer grained loading on the Catalog server in general. > Instead of having a binary state (loaded vs unloaded), the table could be in > a partially loaded state. We could also start with aggressively fetching > certain pieces of information that we think could aid with analysis and > lazily load the remaining pieces of metadata. Finer grained loading also > integrates well with the LocalCatalog implementation on the coordinators > where the entire table need not be loaded on the Catalog server to serve > partial meta information (e.g: show partitions ).
[jira] [Updated] (IMPALA-4025) add functions PERCENTILE_DISC(), PERCENTILE_CONT(), and MEDIAN()
[ https://issues.apache.org/jira/browse/IMPALA-4025?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dinesh Garg updated IMPALA-4025: Priority: Critical (was: Major) > add functions PERCENTILE_DISC(), PERCENTILE_CONT(), and MEDIAN() > > > Key: IMPALA-4025 > URL: https://issues.apache.org/jira/browse/IMPALA-4025 > Project: IMPALA > Issue Type: New Feature > Components: Backend, Frontend >Affects Versions: Impala 2.2.4 >Reporter: Greg Rahn >Assignee: Yongzhi Chen >Priority: Critical > Labels: built-in-function, sql-language > > Add the following functions as both an aggregate function and window/analytic > function: > * PERCENTILE_CONT > * PERCENTILE_DISC > * MEDIAN (implemented as PERCENTILE_CONT(0.5)) > h6. Syntax > {code} > PERCENTILE_CONT() WITHIN GROUP (ORDER BY [ASC|DESC] > [NULLS {FIRST | LAST}]) [ OVER ([])] > PERCENTILE_DISC() WITHIN GROUP (ORDER BY [ASC|DESC] > [NULLS {FIRST | LAST}]) [ OVER ([])] > MEDIAN(expr) [ OVER () ] > {code} > h6. Notes from other systems > *Greenplum* > {code} > PERCENTILE_CONT(_percentage_) WITHIN GROUP (ORDER BY _expression_) > {code} > http://gpdb.docs.pivotal.io/4320/admin_guide/query.html > Greenplum Database provides the MEDIAN aggregate function, which returns the > fiftieth percentile of the PERCENTILE_CONT result and special aggregate > expressions for inverse distribution functions as follows: > Currently you can use only these two expressions with the keyword WITHIN > GROUP. > Note: aggregation function only > *Oracle* > {code} > PERCENTILE_CONT(expr) WITHIN GROUP (ORDER BY expr [ DESC | ASC ]) [ OVER > (query_partition_clause) ] > {code} > http://docs.oracle.com/database/121/SQLRF/functions141.htm#SQLRF00687 > Note: implemented as both an aggregate and window function > *Vertica* > {code} > PERCENTILE_CONT ( %_number ) WITHIN GROUP (... ORDER BY expression [ ASC | > DESC ] ) OVER (... 
[ window-partition-clause ] ) > {code} > https://my.vertica.com/docs/7.2.x/HTML/index.htm#Authoring/SQLReferenceManual/Functions/Analytic/PERCENTILE_CONTAnalytic.htm > Note: window function only > *Teradata* > {code} > PERCENTILE_CONT() WITHIN GROUP (ORDER BY > [asc | desc] [nulls {first | last}]) > {code} > Note: aggregation function only > *Netezza* > {code} > SELECT fn() WITHIN GROUP (ORDER BY [asc|desc] [nulls > {first | last}]) FROM [GROUP BY ]; > {code} > https://www.ibm.com/support/knowledgecenter/SSULQD_7.2.1/com.ibm.nz.dbu.doc/c_dbuser_inverse_distribution_funcs_family_syntax.html > Note: aggregation function only > *Redshift* > {code} > PERCENTILE_CONT ( percentile ) WITHIN GROUP (ORDER BY expr) OVER ( [ > PARTITION BY expr_list ] ) > {code} > https://www.ibm.com/support/knowledgecenter/SSULQD_7.2.1/com.ibm.nz.dbu.doc/c_dbuser_inverse_distribution_funcs_family_syntax.html > Note: window function only
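For reference, the usual SQL semantics of the three functions in IMPALA-4025 can be sketched over a plain list of values. This assumes the common definitions (linear interpolation for PERCENTILE_CONT, first value whose cumulative distribution reaches the percentile for PERCENTILE_DISC, NULLs ignored, ascending order); it says nothing about how Impala implements them:

```python
import math


def _sorted_non_null(values):
    """Sort ascending and drop NULLs (None)."""
    return sorted(v for v in values if v is not None)


def percentile_cont(values, p):
    """Continuous percentile: interpolate linearly between neighbouring rows."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("percentile must be between 0 and 1")
    data = _sorted_non_null(values)
    if not data:
        return None
    rank = p * (len(data) - 1)          # zero-based fractional row position
    lo, hi = math.floor(rank), math.ceil(rank)
    return data[lo] + (data[hi] - data[lo]) * (rank - lo)


def percentile_disc(values, p):
    """Discrete percentile: first value whose cumulative distribution >= p."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("percentile must be between 0 and 1")
    data = _sorted_non_null(values)
    if not data:
        return None
    position = max(1, math.ceil(p * len(data)))  # one-based row number
    return data[position - 1]


def median(values):
    """MEDIAN expressed as PERCENTILE_CONT(0.5), as the ticket proposes."""
    return percentile_cont(values, 0.5)
```

The difference shows up on an even number of rows: for the values 1, 2, 3, 4 the continuous form interpolates between the two middle rows, while the discrete form always returns a value that is actually present in the input.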
[jira] [Assigned] (IMPALA-8954) Support uncorrelated subqueries in the select list
[ https://issues.apache.org/jira/browse/IMPALA-8954?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dinesh Garg reassigned IMPALA-8954: --- Assignee: Shant Hovsepian > Support uncorrelated subqueries in the select list > -- > > Key: IMPALA-8954 > URL: https://issues.apache.org/jira/browse/IMPALA-8954 > Project: IMPALA > Issue Type: Sub-task > Components: Frontend >Affects Versions: Impala 3.3.0 >Reporter: Tim Armstrong >Assignee: Shant Hovsepian >Priority: Critical > Labels: tpc-ds > > {noformat} > [localhost:21000] default> select 'foo', (select 'bar'); > Query: select 'foo', (select 'bar') > Query submitted at: 2019-09-18 13:44:43 (Coordinator: > http://tarmstrong-box:25000) > ERROR: AnalysisException: Subqueries are not supported in the select list. > {noformat} > I think we can support these, implemented as a nested loop join with a > cardinality check node if needed.
[jira] [Updated] (IMPALA-3531) Implement deferrable and optionally enforced PK/FK constraints
[ https://issues.apache.org/jira/browse/IMPALA-3531?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dinesh Garg updated IMPALA-3531: Priority: Critical (was: Minor) > Implement deferrable and optionally enforced PK/FK constraints > -- > > Key: IMPALA-3531 > URL: https://issues.apache.org/jira/browse/IMPALA-3531 > Project: IMPALA > Issue Type: New Feature > Components: Catalog, Frontend, Perf Investigation >Affects Versions: Impala 2.5.0, Impala 2.6.0 > Environment: CDH >Reporter: Ruslan Dautkhanov >Assignee: Anurag Mantripragada >Priority: Critical > Labels: CBO, performance, ramp-up > > Oracle has a "RELY NOVALIDATE" option for constraints. It could be easier for > Hive to start with something like that for PK/FK constraints, so the CBO has more > information for optimizations. It does not have to actually check whether the > constraint relationship is true; it can just "rely" on that constraint. > https://docs.oracle.com/database/121/SQLRF/clauses002.htm#sthref2289 > So it would be helpful with join cardinality estimates, and with cases like > IMPALA-2929. > https://docs.oracle.com/database/121/DWHSG/schemas.htm#DWHSG9053 > "Overview of Constraint States": > - Enforcement > - Validation > - Belief > So FK/PK with "rely novalidate" will have Enforcement disabled but > Belief = RELY as it is possible to do in Oracle and now in Hive (HIVE-13076). > This opens up many additional ways to optimize execution plans. > As explained in Tom Kyte's "Metadata matters" > http://www.peoug.org/wp-content/uploads/2009/12/MetadataMatters_PEOUG_Day2009_TKyte.pdf > pp.30 - "Tell us how the tables relate and we can remove them from the > plan...". > pp.35 - "Tell us how the tables relate and we have more access paths > available...". > Also it might be helpful when Impala is being integrated with Kudu as the > latter has to have a PK.
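To make the "join cardinality estimates" point in IMPALA-3531 concrete, here is a sketch that contrasts a textbook NDV-based estimate with one that relies on a trusted FK/PK relationship. The formulas are generic planner heuristics and the function and parameter names are hypothetical; none of this is taken from Impala's planner:

```python
def estimate_join_rows(child_rows, parent_rows, child_ndv, parent_ndv,
                       fk_pk_rely=False, parent_selectivity=1.0):
    """Estimate the output rows of child JOIN parent on the key columns.

    fk_pk_rely: the child key is a foreign key to the parent's primary key
    and the constraint is declared RELY, so it is trusted without validation.
    parent_selectivity: fraction of parent rows surviving other predicates.
    """
    if fk_pk_rely:
        # Each child row matches at most one parent row, so the join cannot
        # grow beyond the child side; parent filters shrink it proportionally.
        return child_rows * parent_selectivity
    # Generic estimate: assume keys are uniformly distributed over the
    # larger number of distinct values.
    return child_rows * parent_rows / max(child_ndv, parent_ndv, 1)
```

With inaccurate NDV statistics the generic formula can overestimate or underestimate the result, while the FK/PK variant stays bounded by the child row count, which is the kind of extra information the ticket wants the CBO to have.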
[jira] [Updated] (IMPALA-8291) 'DESCRIBE EXTENDED ..' does not display constraint information
[ https://issues.apache.org/jira/browse/IMPALA-8291?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dinesh Garg updated IMPALA-8291: Priority: Critical (was: Major) > 'DESCRIBE EXTENDED ..' does not display constraint information > -- > > Key: IMPALA-8291 > URL: https://issues.apache.org/jira/browse/IMPALA-8291 > Project: IMPALA > Issue Type: Sub-task > Components: Frontend >Reporter: Anurag Mantripragada >Assignee: Anurag Mantripragada >Priority: Critical > > Currently, DESCRIBE EXTENDED table_name command does not display constraint > information like primary key / Foreign key information for tables created > through Hive. > This work must also be extended to tables created through Impala once we have > support for pk/fk in create table syntax.
[jira] [Updated] (IMPALA-2112) Support primary key/foreign key constraint as part of create table in Impala
[ https://issues.apache.org/jira/browse/IMPALA-2112?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dinesh Garg updated IMPALA-2112: Priority: Critical (was: Minor) > Support primary key/foreign key constraint as part of create table in Impala > > > Key: IMPALA-2112 > URL: https://issues.apache.org/jira/browse/IMPALA-2112 > Project: IMPALA > Issue Type: Sub-task > Components: Catalog, Frontend >Affects Versions: Impala 2.2 >Reporter: Marcel Kinard >Assignee: Anurag Mantripragada >Priority: Critical > Labels: planner > > These would be advisory, i.e., Impala would not attempt to enforce them. > However, they could be used for cardinality estimation during query planning. > To be compatible with Hive: > * We neither enforce nor validate integrity constraints. Hence, DISABLE and > NOVALIDATE options are mandatory. > * RELY/NORELY is optional. The CBO is expected to use this information when > a user specifies “RELY”. The default is NORELY. > * Since we do not yet have UNIQUE in Hive, the FK mentioned must be a primary > key column in the parent table. > Support create table syntax like Hive does: > * {{create table pk(id1 integer, id2 integer, }}{{primary key(id1, id2) > DISABLE NOVALIDATE);}} > * {{create table fk(id1 integer, id2 integer, }}{{constraint c1 foreign > key(id1, id2) references pk(id2, id1) DISABLE NOVALIDATE);}} > * {{create table T1(id integer, name string, primary key(id) DISABLE > NOVALIDATE RELY);}}
[jira] [Updated] (IMPALA-8290) Display constraint information in 'SHOW CREATE' statement
[ https://issues.apache.org/jira/browse/IMPALA-8290?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dinesh Garg updated IMPALA-8290: Priority: Critical (was: Minor) > Display constraint information in 'SHOW CREATE' statement > - > > Key: IMPALA-8290 > URL: https://issues.apache.org/jira/browse/IMPALA-8290 > Project: IMPALA > Issue Type: Sub-task > Components: Frontend >Reporter: Anurag Mantripragada >Assignee: Anurag Mantripragada >Priority: Critical > > Show create statement should display primary key and foreign key information.
[jira] [Updated] (IMPALA-8954) Support uncorrelated subqueries in the select list
[ https://issues.apache.org/jira/browse/IMPALA-8954?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dinesh Garg updated IMPALA-8954: Priority: Critical (was: Major) > Support uncorrelated subqueries in the select list > -- > > Key: IMPALA-8954 > URL: https://issues.apache.org/jira/browse/IMPALA-8954 > Project: IMPALA > Issue Type: Sub-task > Components: Frontend >Affects Versions: Impala 3.3.0 >Reporter: Tim Armstrong >Priority: Critical > Labels: tpc-ds > > {noformat} > [localhost:21000] default> select 'foo', (select 'bar'); > Query: select 'foo', (select 'bar') > Query submitted at: 2019-09-18 13:44:43 (Coordinator: > http://tarmstrong-box:25000) > ERROR: AnalysisException: Subqueries are not supported in the select list. > {noformat} > I think we can support these, implemented as a nested loop join with a > cardinality check node if needed.
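The plan shape suggested in IMPALA-8954 (a nested loop join fed by a cardinality check) can be sketched on in-memory rows. The function names and the error text are hypothetical; the sketch only illustrates the semantics of an uncorrelated scalar subquery in the select list:

```python
def cardinality_check(subquery_rows):
    """Allow at most one row from the subquery; an empty result becomes NULL."""
    rows = list(subquery_rows)
    if len(rows) > 1:
        raise RuntimeError("Subquery must not return more than one row")
    return rows[0] if rows else None


def select_with_scalar_subquery(outer_rows, subquery_rows):
    """Nested loop join of every outer row with the single subquery value."""
    value = cardinality_check(subquery_rows)  # evaluated once: uncorrelated
    return [(*outer, value) for outer in outer_rows]
```

For the query in the ticket, the outer side is the single row ('foo',) and the subquery yields 'bar', so the join produces one row with both values.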
[jira] [Updated] (IMPALA-891) Add support for intersect and except set operations
[ https://issues.apache.org/jira/browse/IMPALA-891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dinesh Garg updated IMPALA-891: --- Priority: Critical (was: Major) > Add support for intersect and except set operations > --- > > Key: IMPALA-891 > URL: https://issues.apache.org/jira/browse/IMPALA-891 > Project: IMPALA > Issue Type: Improvement > Components: Backend >Affects Versions: Impala 1.2.4, Impala 2.5.0, Impala 2.6.0, Impala 2.7.0 >Reporter: Jonathan Seidman >Priority: Critical > Labels: sql-language, usability > > Set functionality includes the below. Today, Impala has just {{UNION}} & > {{UNION ALL}}. > {code} > UNION [DISTINCT] > UNION ALL > INTERSECT [DISTINCT] > INTERSECT ALL > EXCEPT [DISTINCT] > EXCEPT ALL > * MINUS is an alias for EXCEPT > {code}
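The difference between the DISTINCT and ALL variants listed in IMPALA-891 is easiest to see with multisets. The sketch below uses Python's Counter to model standard SQL semantics on lists of hashable rows; the function names are illustrative and the sorting is only there to make results deterministic:

```python
from collections import Counter


def intersect_all(left, right):
    """INTERSECT ALL: each row appears min(count_left, count_right) times."""
    return sorted((Counter(left) & Counter(right)).elements())


def except_all(left, right):
    """EXCEPT ALL: each row appears max(count_left - count_right, 0) times."""
    return sorted((Counter(left) - Counter(right)).elements())


def intersect_distinct(left, right):
    """INTERSECT [DISTINCT]: rows present on both sides, without duplicates."""
    return sorted(set(left) & set(right))


def except_distinct(left, right):
    """EXCEPT [DISTINCT] (alias MINUS): left rows absent from the right side."""
    return sorted(set(left) - set(right))
```

With left = [1, 1, 2, 3] and right = [1, 2, 2], EXCEPT ALL keeps one of the two 1 rows, whereas EXCEPT DISTINCT drops the value 1 entirely because it appears on the right side.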
[jira] [Assigned] (IMPALA-5098) Correct handling of DISTINCT in the select list
[ https://issues.apache.org/jira/browse/IMPALA-5098?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dinesh Garg reassigned IMPALA-5098:
-----------------------------------
    Assignee: Kurt Deschler

> Correct handling of DISTINCT in the select list
> -----------------------------------------------
>
> Key: IMPALA-5098
> URL: https://issues.apache.org/jira/browse/IMPALA-5098
> Project: IMPALA
> Issue Type: Bug
> Components: Frontend
> Affects Versions: Impala 2.6.0
> Reporter: N Campbell
> Assignee: Kurt Deschler
> Priority: Critical
> Labels: ansi-sql, sql-language
>
> DB2, Oracle, and various other systems support the following statement,
> but Impala does not:
> {noformat}
> [Simba][ImpalaJDBCDriver](500051) ERROR processing query/statement. Error Code: 0,
> SQL state: TStatus(statusCode:ERROR_STATUS, sqlState:HY000,
> errorMessage:AnalysisException: cannot combine SELECT DISTINCT with analytic functions
> ), Query: SELECT DISTINCT
> `sno` AS `c1`,
> `pno` AS `c2`,
> SUM(`qty`)
> OVER(
> ) AS `c3`
> FROM
> `cert`.`tsupply`
> ORDER BY
> `sno` ASC NULLS LAST,
> `pno` ASC NULLS LAST.
> {noformat}
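To make the expected result of the rejected query concrete, here is a small Python sketch of the evaluation order used by systems that accept it: the analytic function is computed over all input rows first, and DISTINCT is applied to the projected rows afterwards. The sample rows are invented for illustration, NULL handling is omitted, and this is not a description of Impala internals.

```python
def select_distinct_with_window_sum(rows):
    """rows are (sno, pno, qty) tuples from a hypothetical tsupply table."""
    # SUM(qty) OVER (): one total over the whole input, repeated on each row.
    total = sum(qty for _, _, qty in rows)
    projected = [(sno, pno, total) for sno, pno, _ in rows]
    # DISTINCT is applied after the analytic function; ORDER BY sno, pno.
    return sorted(set(projected))


sample = [("S1", "P1", 300), ("S1", "P1", 200), ("S2", "P1", 100)]
print(select_distinct_with_window_sum(sample))
# [('S1', 'P1', 600), ('S2', 'P1', 600)]
```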
[jira] [Updated] (IMPALA-5098) Correct handling of DISTINCT in the select list
[ https://issues.apache.org/jira/browse/IMPALA-5098?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dinesh Garg updated IMPALA-5098:
--------------------------------
    Priority: Critical (was: Major)

> Correct handling of DISTINCT in the select list
> -----------------------------------------------
>
> Key: IMPALA-5098
> URL: https://issues.apache.org/jira/browse/IMPALA-5098
> Project: IMPALA
> Issue Type: Bug
> Components: Frontend
> Affects Versions: Impala 2.6.0
> Reporter: N Campbell
> Priority: Critical
> Labels: ansi-sql, sql-language
>
> DB2, Oracle, and various other systems support the following statement,
> but Impala does not:
> {noformat}
> [Simba][ImpalaJDBCDriver](500051) ERROR processing query/statement. Error Code: 0,
> SQL state: TStatus(statusCode:ERROR_STATUS, sqlState:HY000,
> errorMessage:AnalysisException: cannot combine SELECT DISTINCT with analytic functions
> ), Query: SELECT DISTINCT
> `sno` AS `c1`,
> `pno` AS `c2`,
> SUM(`qty`)
> OVER(
> ) AS `c3`
> FROM
> `cert`.`tsupply`
> ORDER BY
> `sno` ASC NULLS LAST,
> `pno` ASC NULLS LAST.
> {noformat}
[jira] [Assigned] (IMPALA-4025) add functions PERCENTILE_DISC(), PERCENTILE_CONT(), and MEDIAN()
[ https://issues.apache.org/jira/browse/IMPALA-4025?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dinesh Garg reassigned IMPALA-4025:
-----------------------------------
    Assignee: Yongzhi Chen (was: Tianyi Wang)

> add functions PERCENTILE_DISC(), PERCENTILE_CONT(), and MEDIAN()
> ----------------------------------------------------------------
>
> Key: IMPALA-4025
> URL: https://issues.apache.org/jira/browse/IMPALA-4025
> Project: IMPALA
> Issue Type: New Feature
> Components: Backend, Frontend
> Affects Versions: Impala 2.2.4
> Reporter: Greg Rahn
> Assignee: Yongzhi Chen
> Priority: Major
> Labels: built-in-function, sql-language
>
> Add the following functions as both an aggregate function and window/analytic
> function:
> * PERCENTILE_CONT
> * PERCENTILE_DISC
> * MEDIAN (implemented as PERCENTILE_CONT(0.5))
> h6. Syntax
> {code}
> PERCENTILE_CONT() WITHIN GROUP (ORDER BY [ASC|DESC]
> [NULLS {FIRST | LAST}]) [ OVER ([])]
> PERCENTILE_DISC() WITHIN GROUP (ORDER BY [ASC|DESC]
> [NULLS {FIRST | LAST}]) [ OVER ([])]
> MEDIAN(expr) [ OVER () ]
> {code}
> h6. Notes from other systems
> *Greenplum*
> {code}
> PERCENTILE_CONT(_percentage_) WITHIN GROUP (ORDER BY _expression_)
> {code}
> http://gpdb.docs.pivotal.io/4320/admin_guide/query.html
> Greenplum Database provides the MEDIAN aggregate function, which returns the
> fiftieth percentile of the PERCENTILE_CONT result and special aggregate
> expressions for inverse distribution functions as follows:
> Currently you can use only these two expressions with the keyword WITHIN
> GROUP.
> Note: aggregation function only
> *Oracle*
> {code}
> PERCENTILE_CONT(expr) WITHIN GROUP (ORDER BY expr [ DESC | ASC ]) [ OVER
> (query_partition_clause) ]
> {code}
> http://docs.oracle.com/database/121/SQLRF/functions141.htm#SQLRF00687
> Note: implemented as both an aggregate and window function
> *Vertica*
> {code}
> PERCENTILE_CONT ( %_number ) WITHIN GROUP (... ORDER BY expression [ ASC |
> DESC ] ) OVER (... [ window-partition-clause ] )
> {code}
> https://my.vertica.com/docs/7.2.x/HTML/index.htm#Authoring/SQLReferenceManual/Functions/Analytic/PERCENTILE_CONTAnalytic.htm
> Note: window function only
> *Teradata*
> {code}
> PERCENTILE_CONT() WITHIN GROUP (ORDER BY
> [asc | desc] [nulls {first | last}])
> {code}
> Note: aggregation function only
> *Netezza*
> {code}
> SELECT fn() WITHIN GROUP (ORDER BY [asc|desc] [nulls
> {first | last}]) FROM [GROUP BY ];
> {code}
> https://www.ibm.com/support/knowledgecenter/SSULQD_7.2.1/com.ibm.nz.dbu.doc/c_dbuser_inverse_distribution_funcs_family_syntax.html
> Note: aggregation function only
> *Redshift*
> {code}
> PERCENTILE_CONT ( percentile ) WITHIN GROUP (ORDER BY expr) OVER ( [
> PARTITION BY expr_list ] )
> {code}
> https://www.ibm.com/support/knowledgecenter/SSULQD_7.2.1/com.ibm.nz.dbu.doc/c_dbuser_inverse_distribution_funcs_family_syntax.html
> Note: window function only
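As a reference for what the three functions in IMPALA-4025 are expected to return, here is a minimal Python sketch of the usual inverse-distribution semantics, assuming ascending order and that NULLs (None) are ignored. It models the aggregate form only and is not taken from any Impala patch; the function names simply mirror the SQL ones.

```python
import math


def _sorted_non_null(values):
    return sorted(v for v in values if v is not None)


def percentile_cont(p, values):
    """Continuous percentile: linear interpolation between neighbours."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("percentile must be between 0 and 1")
    data = _sorted_non_null(values)
    if not data:
        return None
    pos = p * (len(data) - 1)          # fractional row position, 0-based
    lo, hi = math.floor(pos), math.ceil(pos)
    if lo == hi:
        return float(data[lo])
    return data[lo] + (data[hi] - data[lo]) * (pos - lo)


def percentile_disc(p, values):
    """Discrete percentile: first value whose cumulative share is >= p."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("percentile must be between 0 and 1")
    data = _sorted_non_null(values)
    if not data:
        return None
    return data[max(math.ceil(p * len(data)) - 1, 0)]


def median(values):
    """MEDIAN is defined in the issue as PERCENTILE_CONT(0.5)."""
    return percentile_cont(0.5, values)


print(percentile_cont(0.5, [10, 20, 30, 40]))  # 25.0
print(percentile_disc(0.5, [10, 20, 30, 40]))  # 20
print(median([3, None, 1, 2]))                 # 2.0
```

The example shows the practical difference: PERCENTILE_CONT may return a value that does not occur in the input, while PERCENTILE_DISC always returns an actual input value.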
[jira] [Updated] (IMPALA-8981) Support column masking in Impala
[ https://issues.apache.org/jira/browse/IMPALA-8981?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dinesh Garg updated IMPALA-8981:
--------------------------------
    Priority: Critical (was: Major)

> Support column masking in Impala
> --------------------------------
>
> Key: IMPALA-8981
> URL: https://issues.apache.org/jira/browse/IMPALA-8981
> Project: IMPALA
> Issue Type: New Feature
> Affects Versions: Impala 3.4.0
> Reporter: Kurt Deschler
> Assignee: Kurt Deschler
> Priority: Critical
>
> Related Hive Jira https://issues.apache.org/jira/browse/HIVE-13125
[jira] [Updated] (IMPALA-8994) Support Row Filtering in Impala
[ https://issues.apache.org/jira/browse/IMPALA-8994?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dinesh Garg updated IMPALA-8994:
--------------------------------
    Priority: Critical (was: Major)

> Support Row Filtering in Impala
> -------------------------------
>
> Key: IMPALA-8994
> URL: https://issues.apache.org/jira/browse/IMPALA-8994
> Project: IMPALA
> Issue Type: New Feature
> Reporter: Kurt Deschler
> Assignee: Kurt Deschler
> Priority: Critical
> Fix For: Impala 3.4.0
>
> Related Hive Jira https://issues.apache.org/jira/browse/HIVE-13125
[jira] [Updated] (IMPALA-5049) Switch Parquet timestamp format to INT64
[ https://issues.apache.org/jira/browse/IMPALA-5049?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dinesh Garg updated IMPALA-5049:
--------------------------------
    Priority: Critical (was: Major)

> Switch Parquet timestamp format to INT64
> ----------------------------------------
>
> Key: IMPALA-5049
> URL: https://issues.apache.org/jira/browse/IMPALA-5049
> Project: IMPALA
> Issue Type: Epic
> Components: Backend
> Affects Versions: Impala 2.9.0
> Reporter: Lars Volker
> Priority: Critical
> Labels: parquet
>
> We currently use INT96 to store Timestamp values in Parquet files which will
> be deprecated in
> [PARQUET-323|https://issues.apache.org/jira/browse/PARQUET-323].
> We need to add read and write support for INT64-based logical types
> (TIMESTAMP_MILLIS, TIMESTAMP_MICROS) to our Parquet scanner and writer.
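The relationship between the two encodings mentioned in IMPALA-5049 can be illustrated with a short Python sketch. It assumes the commonly used INT96 layout (8 bytes of nanoseconds within the day, little-endian, followed by a 4-byte Julian day number) and takes TIMESTAMP_MICROS to mean microseconds since the Unix epoch; the helper names are hypothetical and this is not Impala's scanner or writer code.

```python
import struct

JULIAN_DAY_OF_UNIX_EPOCH = 2440588      # Julian day number of 1970-01-01
MICROS_PER_DAY = 86_400 * 1_000_000


def int96_to_micros(raw):
    """Decode a 12-byte INT96 timestamp into an INT64 microsecond value."""
    if len(raw) != 12:
        raise ValueError("INT96 timestamps are exactly 12 bytes")
    nanos_of_day, julian_day = struct.unpack("<qI", raw)
    days = julian_day - JULIAN_DAY_OF_UNIX_EPOCH
    # Sub-microsecond precision is lost when narrowing to TIMESTAMP_MICROS.
    return days * MICROS_PER_DAY + nanos_of_day // 1000


def micros_to_int96(micros):
    """Encode an INT64 microsecond value in the assumed INT96 layout."""
    days, micros_of_day = divmod(micros, MICROS_PER_DAY)
    return struct.pack("<qI", micros_of_day * 1000,
                       days + JULIAN_DAY_OF_UNIX_EPOCH)


epoch_raw = micros_to_int96(0)
print(len(epoch_raw), int96_to_micros(epoch_raw))  # 12 0
```

One consequence worth noting is that INT96 carries nanosecond precision, so a move to TIMESTAMP_MICROS or TIMESTAMP_MILLIS truncates values that use the extra digits.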
[jira] [Assigned] (IMPALA-7506) Support global INVALIDATE METADATA on fetch-on-demand impalad
[ https://issues.apache.org/jira/browse/IMPALA-7506?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dinesh Garg reassigned IMPALA-7506:
-----------------------------------
    Assignee: Quanlong Huang (was: Dinesh Garg)

> Support global INVALIDATE METADATA on fetch-on-demand impalad
> -------------------------------------------------------------
>
> Key: IMPALA-7506
> URL: https://issues.apache.org/jira/browse/IMPALA-7506
> Project: IMPALA
> Issue Type: Sub-task
> Reporter: Todd Lipcon
> Assignee: Quanlong Huang
> Priority: Major
> Labels: catalog-v2
>
> There is some complexity with how this is implemented in the original code:
> it depends on maintaining the minimum version of any object in the impalad's
> local cache. We can't determine that in an on-demand impalad, so INVALIDATE
> METADATA is not supported currently on "fetch-on-demand".
[jira] [Work started] (IMPALA-7506) Support global INVALIDATE METADATA on fetch-on-demand impalad
[ https://issues.apache.org/jira/browse/IMPALA-7506?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Work on IMPALA-7506 started by Dinesh Garg.
-------------------------------------------

> Support global INVALIDATE METADATA on fetch-on-demand impalad
> -------------------------------------------------------------
>
> Key: IMPALA-7506
> URL: https://issues.apache.org/jira/browse/IMPALA-7506
> Project: IMPALA
> Issue Type: Sub-task
> Reporter: Todd Lipcon
> Assignee: Dinesh Garg
> Priority: Major
> Labels: catalog-v2
>
> There is some complexity with how this is implemented in the original code:
> it depends on maintaining the minimum version of any object in the impalad's
> local cache. We can't determine that in an on-demand impalad, so INVALIDATE
> METADATA is not supported currently on "fetch-on-demand".
[jira] [Assigned] (IMPALA-8587) Show inherited privileges in show grant w/ Ranger
[ https://issues.apache.org/jira/browse/IMPALA-8587?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dinesh Garg reassigned IMPALA-8587:
-----------------------------------
    Assignee: Fang-Yu Rao (was: Austin Nobis)

> Show inherited privileges in show grant w/ Ranger
> -------------------------------------------------
>
> Key: IMPALA-8587
> URL: https://issues.apache.org/jira/browse/IMPALA-8587
> Project: IMPALA
> Issue Type: Sub-task
> Components: Frontend
> Reporter: Austin Nobis
> Assignee: Fang-Yu Rao
> Priority: Critical
>
> If an admin has privileges from:
> *grant all on server to user admin;*
>
> Currently the command below will show no results:
> *show grant user admin on database functional;*
>
> After the change, the user should see server level privileges from:
> *show grant user admin on database functional;*
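The behaviour requested in IMPALA-8587 amounts to also returning grants made at an enclosing scope. The Python sketch below models that with an invented `Grant` tuple and a made-up lookup function; it does not use or imitate the Ranger API, and it covers only the server and database scopes from the example.

```python
from typing import NamedTuple, Optional


class Grant(NamedTuple):
    user: str
    privilege: str
    scope: str               # "server" or "database" in this sketch
    name: Optional[str]      # None for server-level grants


def show_grant_on_database(grants, user, database):
    """Return grants on the database plus inherited server-level grants."""
    return [
        g for g in grants
        if g.user == user
        and (g.scope == "server"
             or (g.scope == "database" and g.name == database))
    ]


# grant all on server to user admin;
grants = [Grant("admin", "all", "server", None)]
# show grant user admin on database functional;
visible = show_grant_on_database(grants, "admin", "functional")
print(len(visible))  # 1
```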
[jira] [Assigned] (IMPALA-8877) CatalogException during stress test: Table modified while operation was in progress
[ https://issues.apache.org/jira/browse/IMPALA-8877?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dinesh Garg reassigned IMPALA-8877:
-----------------------------------
    Assignee: Anurag Mantripragada (was: Vihang Karajgaonkar)

> CatalogException during stress test: Table modified while operation was in progress
> -----------------------------------------------------------------------------------
>
> Key: IMPALA-8877
> URL: https://issues.apache.org/jira/browse/IMPALA-8877
> Project: IMPALA
> Issue Type: Bug
> Components: Catalog
> Affects Versions: Impala 3.3.0
> Reporter: David Knupp
> Assignee: Anurag Mantripragada
> Priority: Critical
> Labels: catalog-v2
> Attachments: catalogd.INFO.tar.gz, impalad.INFO.tar.gz
>
> This was hit while running the stress tests to get a baseline on a deployed
> cluster.
> /* Mem: 12850 MB. Coordinator: quasar-mzmnbe-6.vpc.cloudera.com. */
> COMPUTE STATS catalog_sales
> {noformat}
> Query (id=924a50178a5a6146:29d58a73)
> Summary
> Session ID: 5543fb9029e2b71f:f446381b1f59ed81
> Session Type: HIVESERVER2
> HiveServer2 Protocol Version: V6
> Start Time: 2019-08-19 01:26:07.292866000
> End Time: 2019-08-19 01:26:27.248053000
> Query Type: DDL
> Query State: EXCEPTION
> Query Status: CatalogException: Table 'tpcds_300_decimal_parquet.catalog_sales' was modified while operation was in progress, aborting execution.
> Impala Version: impalad version 3.3.0-SNAPSHOT RELEASE (build df3e7c051e2641524fc53a0cd07c2a14decd55f7)
> User: syst...@vpc.cloudera.com
> Connected User: syst...@vpc.cloudera.com
> Delegated User:
> Network Address: :::10.65.6.19:39174
> Default Db: tpcds_300_decimal_parquet
> Sql Statement: /* Mem: 12850 MB. Coordinator: quasar-mzmnbe-6.vpc.cloudera.com. */
> COMPUTE STATS catalog_sales
> Coordinator: quasar-mzmnbe-6.vpc.cloudera.com:22000
> Query Options (set by configuration): ABORT_ON_ERROR=1,MEM_LIMIT=13474201600,MT_DOP=4,EXEC_TIME_LIMIT_S=2147483647,TIMEZONE=America/Los_Angeles,DEFAULT_FILE_FORMAT=4,DEFAULT_TRANSACTIONAL_TYPE=1
> Query Options (set by configuration and planner): ABORT_ON_ERROR=1,MEM_LIMIT=13474201600,MT_DOP=4,EXEC_TIME_LIMIT_S=2147483647,TIMEZONE=America/Los_Angeles,DEFAULT_FILE_FORMAT=4,DEFAULT_TRANSACTIONAL_TYPE=1
> DDL Type: COMPUTE_STATS
> Query Compilation
> Metadata of all 1 tables cached: 5.62s (5622372318)
> Analysis finished: 5.62s (5622560027)
> Authorization finished (noop): 5.62s (5622568284)
> Retried query planning due to inconsistent metadata 7 of 40 times: Catalog object TCatalogObject(type:TABLE, catalog_version:94204, table:TTable(db_name:tpcds_300_decimal_parquet, tbl_name:catalog_sales)) changed version between accesses.: 5.95s (5949859598)
> Planning finished: 5.95s (5949861145)
> Query Timeline
> Query submitted: 0ns (0)
> Planning finished: 5.95s (5950024020)
> Child queries finished: 17.85s (17849072057)
> Rows available: 19.82s (19825080035)
> Unregister query: 19.95s (19955080560)
> Frontend
> - CatalogFetch.ColumnStats.Misses: 34 (34)
> - CatalogFetch.ColumnStats.Requests: 34 (34)
> - CatalogFetch.ColumnStats.Time: 0 (0)
> - CatalogFetch.Config.Hits: 1 (1)
> - CatalogFetch.Config.Requests: 1 (1)
> - CatalogFetch.Config.Time: 0 (0)
> - CatalogFetch.DatabaseList.Hits: 8 (8)
> - CatalogFetch.DatabaseList.Requests: 8 (8)
> - CatalogFetch.DatabaseList.Time: 0 (0)
> - CatalogFetch.PartitionLists.Misses: 1 (1)
> - CatalogFetch.PartitionLists.Requests: 1 (1)
> - CatalogFetch.PartitionLists.Time: 7 (7)
> - CatalogFetch.Partitions.Hits: 1837 (1837)
> - CatalogFetch.Partitions.Misses: 1837 (1837)
> - CatalogFetch.Partitions.Requests: 3674 (3674)
> - CatalogFetch.Partitions.Time: 325 (325)
> - CatalogFetch.RPCs.Bytes: 4.7 MiB (4936030)
> - CatalogFetch.RPCs.Requests: 22 (22)
> - CatalogFetch.RPCs.Time: 343 (343)
> - CatalogFetch.TableNames.Hits: 4 (4)
> - CatalogFetch.TableNames.Misses: 4 (4)
> - CatalogFetch.TableNames.Requests: 8 (8)
> - CatalogFetch.TableNames.Time: 0 (0)
> - CatalogFetch.Tables.Misses: 8 (8)
> - CatalogFetch.Tables.Requests: 8 (8)
> - CatalogFetch.Tables.Time: 74 (74)
> - InactiveTotalTime: 0ns (0)
> - TotalTime: 0ns (0)
> ImpalaServer
> - CatalogOpExecTimer: 1.97s (1972007962)
> - ClientFetchWaitTimer: 0ns (0)
> - InactiveTotalTime: 0ns (0)
> - RowMaterializationTimer: 0ns (0)
> - TotalTime: 0ns (0)
> Child Queries
> Table Stats Query (id=db4821e4aa5bb04d:d4a5ae45)
> Column Stats Query (id=0444367557e3496d:f9435111)
> {noformat}
[jira] [Updated] (IMPALA-8877) CatalogException during stress test: Table modified while operation was in progress
[ https://issues.apache.org/jira/browse/IMPALA-8877?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dinesh Garg updated IMPALA-8877:
--------------------------------
    Labels: catalog-v2 (was: )

> CatalogException during stress test: Table modified while operation was in progress
> -----------------------------------------------------------------------------------
>
> Key: IMPALA-8877
> URL: https://issues.apache.org/jira/browse/IMPALA-8877
> Project: IMPALA
> Issue Type: Bug
> Components: Catalog
> Affects Versions: Impala 3.3.0
> Reporter: David Knupp
> Assignee: Vihang Karajgaonkar
> Priority: Critical
> Labels: catalog-v2
> Attachments: catalogd.INFO.tar.gz, impalad.INFO.tar.gz
>
> This was hit while running the stress tests to get a baseline on a deployed
> cluster.
> /* Mem: 12850 MB. Coordinator: quasar-mzmnbe-6.vpc.cloudera.com. */
> COMPUTE STATS catalog_sales
> {noformat}
> Query (id=924a50178a5a6146:29d58a73)
> Summary
> Session ID: 5543fb9029e2b71f:f446381b1f59ed81
> Session Type: HIVESERVER2
> HiveServer2 Protocol Version: V6
> Start Time: 2019-08-19 01:26:07.292866000
> End Time: 2019-08-19 01:26:27.248053000
> Query Type: DDL
> Query State: EXCEPTION
> Query Status: CatalogException: Table 'tpcds_300_decimal_parquet.catalog_sales' was modified while operation was in progress, aborting execution.
> Impala Version: impalad version 3.3.0-SNAPSHOT RELEASE (build df3e7c051e2641524fc53a0cd07c2a14decd55f7)
> User: syst...@vpc.cloudera.com
> Connected User: syst...@vpc.cloudera.com
> Delegated User:
> Network Address: :::10.65.6.19:39174
> Default Db: tpcds_300_decimal_parquet
> Sql Statement: /* Mem: 12850 MB. Coordinator: quasar-mzmnbe-6.vpc.cloudera.com. */
> COMPUTE STATS catalog_sales
> Coordinator: quasar-mzmnbe-6.vpc.cloudera.com:22000
> Query Options (set by configuration): ABORT_ON_ERROR=1,MEM_LIMIT=13474201600,MT_DOP=4,EXEC_TIME_LIMIT_S=2147483647,TIMEZONE=America/Los_Angeles,DEFAULT_FILE_FORMAT=4,DEFAULT_TRANSACTIONAL_TYPE=1
> Query Options (set by configuration and planner): ABORT_ON_ERROR=1,MEM_LIMIT=13474201600,MT_DOP=4,EXEC_TIME_LIMIT_S=2147483647,TIMEZONE=America/Los_Angeles,DEFAULT_FILE_FORMAT=4,DEFAULT_TRANSACTIONAL_TYPE=1
> DDL Type: COMPUTE_STATS
> Query Compilation
> Metadata of all 1 tables cached: 5.62s (5622372318)
> Analysis finished: 5.62s (5622560027)
> Authorization finished (noop): 5.62s (5622568284)
> Retried query planning due to inconsistent metadata 7 of 40 times: Catalog object TCatalogObject(type:TABLE, catalog_version:94204, table:TTable(db_name:tpcds_300_decimal_parquet, tbl_name:catalog_sales)) changed version between accesses.: 5.95s (5949859598)
> Planning finished: 5.95s (5949861145)
> Query Timeline
> Query submitted: 0ns (0)
> Planning finished: 5.95s (5950024020)
> Child queries finished: 17.85s (17849072057)
> Rows available: 19.82s (19825080035)
> Unregister query: 19.95s (19955080560)
> Frontend
> - CatalogFetch.ColumnStats.Misses: 34 (34)
> - CatalogFetch.ColumnStats.Requests: 34 (34)
> - CatalogFetch.ColumnStats.Time: 0 (0)
> - CatalogFetch.Config.Hits: 1 (1)
> - CatalogFetch.Config.Requests: 1 (1)
> - CatalogFetch.Config.Time: 0 (0)
> - CatalogFetch.DatabaseList.Hits: 8 (8)
> - CatalogFetch.DatabaseList.Requests: 8 (8)
> - CatalogFetch.DatabaseList.Time: 0 (0)
> - CatalogFetch.PartitionLists.Misses: 1 (1)
> - CatalogFetch.PartitionLists.Requests: 1 (1)
> - CatalogFetch.PartitionLists.Time: 7 (7)
> - CatalogFetch.Partitions.Hits: 1837 (1837)
> - CatalogFetch.Partitions.Misses: 1837 (1837)
> - CatalogFetch.Partitions.Requests: 3674 (3674)
> - CatalogFetch.Partitions.Time: 325 (325)
> - CatalogFetch.RPCs.Bytes: 4.7 MiB (4936030)
> - CatalogFetch.RPCs.Requests: 22 (22)
> - CatalogFetch.RPCs.Time: 343 (343)
> - CatalogFetch.TableNames.Hits: 4 (4)
> - CatalogFetch.TableNames.Misses: 4 (4)
> - CatalogFetch.TableNames.Requests: 8 (8)
> - CatalogFetch.TableNames.Time: 0 (0)
> - CatalogFetch.Tables.Misses: 8 (8)
> - CatalogFetch.Tables.Requests: 8 (8)
> - CatalogFetch.Tables.Time: 74 (74)
> - InactiveTotalTime: 0ns (0)
> - TotalTime: 0ns (0)
> ImpalaServer
> - CatalogOpExecTimer: 1.97s (1972007962)
> - ClientFetchWaitTimer: 0ns (0)
> - InactiveTotalTime: 0ns (0)
> - RowMaterializationTimer: 0ns (0)
> - TotalTime: 0ns (0)
> Child Queries
> Table Stats Query (id=db4821e4aa5bb04d:d4a5ae45)
> Column Stats Query (id=0444367557e3496d:f9435111)
> {noformat}
[jira] [Resolved] (IMPALA-8889) Incorrect exception message when trying unsupported option for acid tables
[ https://issues.apache.org/jira/browse/IMPALA-8889?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dinesh Garg resolved IMPALA-8889.
---------------------------------
    Resolution: Fixed

> Incorrect exception message when trying unsupported option for acid tables
> --------------------------------------------------------------------------
>
> Key: IMPALA-8889
> URL: https://issues.apache.org/jira/browse/IMPALA-8889
> Project: IMPALA
> Issue Type: Bug
> Components: Frontend
> Affects Versions: Impala 3.3.0
> Reporter: Yongzhi Chen
> Assignee: Yongzhi Chen
> Priority: Critical
>
> When we try an unsupported option, say alter table on acid tables from , it
> throws an exception, which is expected, but it gives a wrong message:
> it says we only support Read for insert-only tables, which is not true
> anymore, since we now also support insert and drop (and soon truncate).
[jira] [Resolved] (IMPALA-8793) Implement TRUNCATE for insert-only ACID tables
[ https://issues.apache.org/jira/browse/IMPALA-8793?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dinesh Garg resolved IMPALA-8793.
---------------------------------
    Resolution: Fixed

> Implement TRUNCATE for insert-only ACID tables
> ----------------------------------------------
>
> Key: IMPALA-8793
> URL: https://issues.apache.org/jira/browse/IMPALA-8793
> Project: IMPALA
> Issue Type: Improvement
> Reporter: Zoltán Borók-Nagy
> Assignee: Zoltán Borók-Nagy
> Priority: Major
> Labels: impala-acid
>
> Impala currently cannot TRUNCATE insert-only tables.
> TRUNCATE is a DDL statement that deletes all the files and drops all column
> and table statistics. (Impala currently cannot truncate specific partitions,
> only the whole table. Truncating specific partitions is out of scope of this
> Jira.)
> TRUNCATE does not just create a new empty base directory; it really
> removes all the files, which is the behavior of Hive as well.
> To implement TRUNCATE Impala must acquire an EXCLUSIVE lock on the table.
> After that Impala must recursively delete all the data files belonging to the
> table.
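The two steps described in IMPALA-8793 (take an exclusive lock, then recursively delete the data files and drop statistics) can be sketched on a local directory. Everything here is a stand-in chosen for illustration: a `threading.Lock` plays the role of the EXCLUSIVE table lock, a plain dict plays the role of the table and column statistics, and the `base_`/`delta_` directory names are only examples. It is not the code from the linked patch.

```python
import tempfile
import threading
from pathlib import Path


def truncate_table(table_dir, table_lock, stats):
    """Delete every data file under table_dir while holding the lock."""
    with table_lock:                      # stand-in for the EXCLUSIVE lock
        data_files = [p for p in Path(table_dir).rglob("*") if p.is_file()]
        for path in data_files:
            path.unlink()                 # really remove the files
        stats.clear()                     # drop table and column statistics
        return len(data_files)


with tempfile.TemporaryDirectory() as table_dir:
    for rel in ("base_0000001/part-0.parq",
                "delta_0000002_0000002/part-0.parq"):
        target = Path(table_dir, rel)
        target.parent.mkdir(parents=True)
        target.write_text("rows")
    stats = {"numRows": 42}
    removed = truncate_table(table_dir, threading.Lock(), stats)
    remaining = [p for p in Path(table_dir).rglob("*") if p.is_file()]

print(removed, remaining, stats)  # 2 [] {}
```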
[jira] [Assigned] (IMPALA-8793) Implement TRUNCATE for insert-only ACID tables
[ https://issues.apache.org/jira/browse/IMPALA-8793?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dinesh Garg reassigned IMPALA-8793:
-----------------------------------
    Assignee: Zoltán Borók-Nagy

> Implement TRUNCATE for insert-only ACID tables
> ----------------------------------------------
>
> Key: IMPALA-8793
> URL: https://issues.apache.org/jira/browse/IMPALA-8793
> Project: IMPALA
> Issue Type: Improvement
> Reporter: Zoltán Borók-Nagy
> Assignee: Zoltán Borók-Nagy
> Priority: Major
> Labels: impala-acid
>
> Impala currently cannot TRUNCATE insert-only tables.
> TRUNCATE is a DDL statement that deletes all the files and drops all column
> and table statistics. (Impala currently cannot truncate specific partitions,
> only the whole table. Truncating specific partitions is out of scope of this
> Jira.)
> TRUNCATE does not just create a new empty base directory; it really
> removes all the files, which is the behavior of Hive as well.
> To implement TRUNCATE Impala must acquire an EXCLUSIVE lock on the table.
> After that Impala must recursively delete all the data files belonging to the
> table.
[jira] [Assigned] (IMPALA-8793) Implement TRUNCATE for insert-only ACID tables
[ https://issues.apache.org/jira/browse/IMPALA-8793?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dinesh Garg reassigned IMPALA-8793:
-----------------------------------
    Assignee: (was: Dinesh Garg)

> Implement TRUNCATE for insert-only ACID tables
> ----------------------------------------------
>
> Key: IMPALA-8793
> URL: https://issues.apache.org/jira/browse/IMPALA-8793
> Project: IMPALA
> Issue Type: Improvement
> Reporter: Zoltán Borók-Nagy
> Priority: Major
> Labels: impala-acid
>
> Impala currently cannot TRUNCATE insert-only tables.
> TRUNCATE is a DDL statement that deletes all the files and drops all column
> and table statistics. (Impala currently cannot truncate specific partitions,
> only the whole table. Truncating specific partitions is out of scope of this
> Jira.)
> TRUNCATE does not just create a new empty base directory; it really
> removes all the files, which is the behavior of Hive as well.
> To implement TRUNCATE Impala must acquire an EXCLUSIVE lock on the table.
> After that Impala must recursively delete all the data files belonging to the
> table.
[jira] [Commented] (IMPALA-8793) Implement TRUNCATE for insert-only ACID tables
[ https://issues.apache.org/jira/browse/IMPALA-8793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16916393#comment-16916393 ]

Dinesh Garg commented on IMPALA-8793:
-------------------------------------

[https://gerrit.cloudera.org/c/14071/]

> Implement TRUNCATE for insert-only ACID tables
> ----------------------------------------------
>
> Key: IMPALA-8793
> URL: https://issues.apache.org/jira/browse/IMPALA-8793
> Project: IMPALA
> Issue Type: Improvement
> Reporter: Zoltán Borók-Nagy
> Assignee: Dinesh Garg
> Priority: Major
> Labels: impala-acid
>
> Impala currently cannot TRUNCATE insert-only tables.
> TRUNCATE is a DDL statement that deletes all the files and drops all column
> and table statistics. (Impala currently cannot truncate specific partitions,
> only the whole table. Truncating specific partitions is out of scope of this
> Jira.)
> TRUNCATE does not just create a new empty base directory; it really
> removes all the files, which is the behavior of Hive as well.
> To implement TRUNCATE Impala must acquire an EXCLUSIVE lock on the table.
> After that Impala must recursively delete all the data files belonging to the
> table.
[jira] [Work started] (IMPALA-8793) Implement TRUNCATE for insert-only ACID tables
[ https://issues.apache.org/jira/browse/IMPALA-8793?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Work on IMPALA-8793 started by Dinesh Garg.
-------------------------------------------

> Implement TRUNCATE for insert-only ACID tables
> ----------------------------------------------
>
> Key: IMPALA-8793
> URL: https://issues.apache.org/jira/browse/IMPALA-8793
> Project: IMPALA
> Issue Type: Improvement
> Reporter: Zoltán Borók-Nagy
> Assignee: Dinesh Garg
> Priority: Major
> Labels: impala-acid
>
> Impala currently cannot TRUNCATE insert-only tables.
> TRUNCATE is a DDL statement that deletes all the files and drops all column
> and table statistics. (Impala currently cannot truncate specific partitions,
> only the whole table. Truncating specific partitions is out of scope of this
> Jira.)
> TRUNCATE does not just create a new empty base directory; it really
> removes all the files, which is the behavior of Hive as well.
> To implement TRUNCATE Impala must acquire an EXCLUSIVE lock on the table.
> After that Impala must recursively delete all the data files belonging to the
> table.
[jira] [Assigned] (IMPALA-7506) Support global INVALIDATE METADATA on fetch-on-demand impalad
[ https://issues.apache.org/jira/browse/IMPALA-7506?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dinesh Garg reassigned IMPALA-7506:
-----------------------------------
    Assignee: Quanlong Huang

> Support global INVALIDATE METADATA on fetch-on-demand impalad
> -------------------------------------------------------------
>
> Key: IMPALA-7506
> URL: https://issues.apache.org/jira/browse/IMPALA-7506
> Project: IMPALA
> Issue Type: Sub-task
> Reporter: Todd Lipcon
> Assignee: Quanlong Huang
> Priority: Major
> Labels: catalog-v2
>
> There is some complexity with how this is implemented in the original code:
> it depends on maintaining the minimum version of any object in the impalad's
> local cache. We can't determine that in an on-demand impalad, so INVALIDATE
> METADATA is not supported currently on "fetch-on-demand".
[jira] [Updated] (IMPALA-8875) TestHmsIntegration.test_drop_column_maintains_stats seems flaky
[ https://issues.apache.org/jira/browse/IMPALA-8875?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dinesh Garg updated IMPALA-8875:
--------------------------------
    Labels: broken-build impala-stats (was: broken-build)

> TestHmsIntegration.test_drop_column_maintains_stats seems flaky
> ---------------------------------------------------------------
>
> Key: IMPALA-8875
> URL: https://issues.apache.org/jira/browse/IMPALA-8875
> Project: IMPALA
> Issue Type: Bug
> Reporter: Fang-Yu Rao
> Assignee: Gabor Kaszab
> Priority: Blocker
> Labels: broken-build, impala-stats
>
> The test TestHmsIntegration.test_drop_column_maintains_stats seems flaky.
> The related test file was updated recently due to
> https://issues.apache.org/jira/browse/IMPALA-8823. Creating this JIRA to track
> the failed test. Maybe [~gaborkaszab] you could take a brief look at this?
> Thanks!
> The error message is as follows.
> {code:java}
> Error Message
> assert {'avg_col_len...ializer', ...} == {'COLUMN_STATS...me': 'x', ...}
> Common items: {'avg_col_len': '', 'bitVector': '', 'col_name': 'x',
> 'comment': 'from deserializer', 'data_type': 'int', 'distinct_count': '0',
> 'max': '0', 'max_col_len': '', 'min': '0', 'num_falses': '', 'num_nulls':
> '0', 'num_trues': ''} Right contains more items: {'COLUMN_STATS_ACCURATE':
> '{}'} Full diff: + {'COLUMN_STATS_ACCURATE': '{}', - {'avg_col_len': '', ? ^
> + 'avg_col_len': '', ? ^ 'bitVector': '', 'col_name': 'x', 'comment': 'from
> deserializer', 'data_type': 'int', 'distinct_count': '0', 'max': '0',
> 'max_col_len': '', 'min': '0', 'num_falses': '', 'num_nulls': '0',
> 'num_trues': ''}
> {code}
> The stack trace is as follows.
> {code:java}
> Stacktrace
> metadata/test_hms_integration.py:390: in test_drop_column_maintains_stats
> assert hive_x_stats == self.hive_column_stats(table_name, 'x')
> E assert {'avg_col_len...ializer', ...} == {'COLUMN_STATS...me': 'x', ...}
> E Common items:
> E {'avg_col_len': '',
> E 'bitVector': '',
> E 'col_name': 'x',
> E 'comment': 'from deserializer',
> E 'data_type': 'int',
> E 'distinct_count': '0',
> E 'max': '0',
> E 'max_col_len': '',
> E 'min': '0',
> E 'num_falses': '',
> E 'num_nulls': '0',
> E 'num_trues': ''}
> E Right contains more items:
> E {'COLUMN_STATS_ACCURATE': '{}'}
> E Full diff:
> E + {'COLUMN_STATS_ACCURATE': '{}',
> E - {'avg_col_len': '',
> E ? ^
> E + 'avg_col_len': '',
> E ? ^
> E 'bitVector': '',
> E 'col_name': 'x',
> E 'comment': 'from deserializer',
> E 'data_type': 'int',
> E 'distinct_count': '0',
> E 'max': '0',
> E 'max_col_len': '',
> E 'min': '0',
> E 'num_falses': '',
> E 'num_nulls': '0',
> E 'num_trues': ''}
> {code}
[jira] [Assigned] (IMPALA-8572) Move query hook execution to before query unregistration
[ https://issues.apache.org/jira/browse/IMPALA-8572?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dinesh Garg reassigned IMPALA-8572: --- Assignee: bharath v (was: radford nguyen) > Move query hook execution to before query unregistration > > > Key: IMPALA-8572 > URL: https://issues.apache.org/jira/browse/IMPALA-8572 > Project: IMPALA > Issue Type: Sub-task > Components: Backend >Reporter: radford nguyen >Assignee: bharath v >Priority: Critical > > The backend currently executes query event hooks during > {{ImpalaServer::UnregisterQuery}}, which may actually only happen a long time > after the query has actually executed. We depend on either the client closing > the query/session, the client's connection dropping, or an idle session > timing out. > e.g. the following sequence is possible. > # User executes query from Hue. > # User goes home for weekend, leaving Hue tab open in browser > # If we're lucky, the session timeout expires after some amount of idle time. > # The query gets unregistered, hooks get executed > It would generally be desirable to move the lineage logger earlier in the > query lifecycle, so it occurs as soon as all of the required data is > available. -- This message was sent by Atlassian Jira (v8.3.2#803003) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Updated] (IMPALA-8572) Move query hook execution to before query unregistration
[ https://issues.apache.org/jira/browse/IMPALA-8572?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dinesh Garg updated IMPALA-8572: Priority: Critical (was: Major) > Move query hook execution to before query unregistration > > > Key: IMPALA-8572 > URL: https://issues.apache.org/jira/browse/IMPALA-8572 > Project: IMPALA > Issue Type: Sub-task > Components: Backend >Reporter: radford nguyen >Assignee: radford nguyen >Priority: Critical > > The backend currently executes query event hooks during > {{ImpalaServer::UnregisterQuery}}, which may actually only happen a long time > after the query has actually executed. We depend on either the client closing > the query/session, the client's connection dropping, or an idle session > timing out. > e.g. the following sequence is possible. > # User executes query from Hue. > # User goes home for weekend, leaving Hue tab open in browser > # If we're lucky, the session timeout expires after some amount of idle time. > # The query gets unregistered, hooks get executed > It would generally be desirable to move the lineage logger earlier in the > query lifecycle, so it occurs as soon as all of the required data is > available. -- This message was sent by Atlassian JIRA (v7.6.14#76016) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Assigned] (IMPALA-8572) Move query hook execution to before query unregistration
[ https://issues.apache.org/jira/browse/IMPALA-8572?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dinesh Garg reassigned IMPALA-8572: --- Assignee: radford nguyen > Move query hook execution to before query unregistration > > > Key: IMPALA-8572 > URL: https://issues.apache.org/jira/browse/IMPALA-8572 > Project: IMPALA > Issue Type: Sub-task > Components: Backend >Reporter: radford nguyen >Assignee: radford nguyen >Priority: Major > > The backend currently executes query event hooks during > {{ImpalaServer::UnregisterQuery}}, which may actually only happen a long time > after the query has actually executed. We depend on either the client closing > the query/session, the client's connection dropping, or an idle session > timing out. > e.g. the following sequence is possible. > # User executes query from Hue. > # User goes home for weekend, leaving Hue tab open in browser > # If we're lucky, the session timeout expires after some amount of idle time. > # The query gets unregistered, hooks get executed > It would generally be desirable to move the lineage logger earlier in the > query lifecycle, so it occurs as soon as all of the required data is > available. -- This message was sent by Atlassian JIRA (v7.6.14#76016) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Resolved] (IMPALA-8823) Implement DROP TABLE for insert-only ACID tables
[ https://issues.apache.org/jira/browse/IMPALA-8823?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dinesh Garg resolved IMPALA-8823. - Resolution: Fixed > Implement DROP TABLE for insert-only ACID tables > > > Key: IMPALA-8823 > URL: https://issues.apache.org/jira/browse/IMPALA-8823 > Project: IMPALA > Issue Type: Improvement >Reporter: Zoltán Borók-Nagy >Assignee: Gabor Kaszab >Priority: Major > Labels: impala-acid > > Impala currently cannot drop insert-only ACID tables. > To implement DROP TABLE for insert-only tables at first we need to acquire an > exclusive lock from HMS, then proceed with the usual DROP TABLE process. > Heartbeating the lock might be also needed. -- This message was sent by Atlassian JIRA (v7.6.14#76016) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Updated] (IMPALA-8717) impala-shell support for HiveServer2 HTTP endpoint
[ https://issues.apache.org/jira/browse/IMPALA-8717?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dinesh Garg updated IMPALA-8717: Priority: Critical (was: Major) > impala-shell support for HiveServer2 HTTP endpoint > -- > > Key: IMPALA-8717 > URL: https://issues.apache.org/jira/browse/IMPALA-8717 > Project: IMPALA > Issue Type: Sub-task > Components: Clients >Affects Versions: Impala 3.3.0 >Reporter: bharath v >Assignee: bharath v >Priority: Critical > > Having impala-shell support to connect to the HTTP HS2 endpoints should be > super helpful. -- This message was sent by Atlassian JIRA (v7.6.14#76016) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Updated] (IMPALA-8636) Implement INSERT for insert-only ACID tables
[ https://issues.apache.org/jira/browse/IMPALA-8636?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dinesh Garg updated IMPALA-8636: Priority: Critical (was: Major) > Implement INSERT for insert-only ACID tables > > > Key: IMPALA-8636 > URL: https://issues.apache.org/jira/browse/IMPALA-8636 > Project: IMPALA > Issue Type: New Feature >Reporter: Zoltán Borók-Nagy >Assignee: Zoltán Borók-Nagy >Priority: Critical > Labels: impala-acid > > Impala should support insertion for insert-only ACID tables. > For this we need to allocate a write ID for the target table, and write the > data into the base/delta directories. > INSERT operation should create a new delta directory with the allocated write > ID. > INSERT OVERWRITE should create a new base directory with the allocated write > ID. This new base directory will only contain the data coming from this > operation. -- This message was sent by Atlassian JIRA (v7.6.14#76016) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
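The directory rule described in IMPALA-8636 (INSERT writes a new delta directory, INSERT OVERWRITE writes a new base directory, both named after the allocated write ID) can be sketched as below. This is only an illustration: the helper names are hypothetical, and the zero-padded `delta_<writeId>_<writeId>` / `base_<writeId>` naming with seven digits is an assumption based on the usual Hive ACID layout, not something stated in the issue.

```python
# Illustrative sketch only; helper names and the 7-digit padding are assumptions.

def delta_dir(write_id: int) -> str:
    """Directory name for a plain INSERT: a delta covering a single write ID."""
    return "delta_{0:07d}_{0:07d}".format(write_id)


def base_dir(write_id: int) -> str:
    """Directory name for INSERT OVERWRITE: a new base holding only this write."""
    return "base_{0:07d}".format(write_id)


def target_dir(table_location: str, write_id: int, overwrite: bool) -> str:
    """Pick the output directory for an insert into an insert-only ACID table."""
    name = base_dir(write_id) if overwrite else delta_dir(write_id)
    return table_location.rstrip("/") + "/" + name
```

With these assumptions, a plain INSERT with write ID 5 would write under `<table>/delta_0000005_0000005`, while INSERT OVERWRITE with the same write ID would write under `<table>/base_0000005`.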
[jira] [Updated] (IMPALA-8725) Improve usability when HMS is configured with strict managed tables
[ https://issues.apache.org/jira/browse/IMPALA-8725?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dinesh Garg updated IMPALA-8725: Priority: Critical (was: Major) > Improve usability when HMS is configured with strict managed tables > --- > > Key: IMPALA-8725 > URL: https://issues.apache.org/jira/browse/IMPALA-8725 > Project: IMPALA > Issue Type: Improvement > Components: Frontend >Reporter: Anurag Mantripragada >Priority: Critical > > Users tend to create and query managed tables often and when HMS is > configured with strict managed tables they get: > {code:java} > Table default.foo failed strict managed table checks due to the following > reason: Table is marked as a managed table but is not transactional{code} > We should improve usability in these scenarios. -- This message was sent by Atlassian JIRA (v7.6.14#76016) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Assigned] (IMPALA-8486) test_udf_update_via_drop and test_udf_update_via_create fail on local catalog
[ https://issues.apache.org/jira/browse/IMPALA-8486?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dinesh Garg reassigned IMPALA-8486: --- Assignee: (was: Todd Lipcon) > test_udf_update_via_drop and test_udf_update_via_create fail on local catalog > - > > Key: IMPALA-8486 > URL: https://issues.apache.org/jira/browse/IMPALA-8486 > Project: IMPALA > Issue Type: Improvement > Components: Catalog >Affects Versions: Impala 3.3.0 >Reporter: Tim Armstrong >Priority: Critical > Labels: catalog-v2 > > {noformat} > TestUdfTargeted.test_udf_update_via_drop[protocol: beeswax | exec_option: > {'batch_size': 0, 'num_nodes': 0, 'disable_codegen_rows_threshold': 0, > 'disable_codegen': False, 'abort_on_error': 1, > 'exec_single_node_rows_threshold': 0} | table_format: text/none] > tests/query_test/test_udfs.py:541: in test_udf_update_via_drop > self._run_query_all_impalads(exec_options, query_stmt, ["New UDF"]) > tests/query_test/test_udfs.py:52: in _run_query_all_impalads > assert result.data == expected > E assert ['Old UDF'] == ['New UDF'] > E At index 0 diff: 'Old UDF' != 'New UDF' > E Full diff: > E - ['Old UDF'] > E + ['New UDF'] > > {noformat} > The tests are checking that the local UDF caches on each impalad get > invalidated by a drop/create of a function referencing the HDFS file > containing the UDF. The test fails because the local catalog, unlike the > regular catalog, doesn't invalidate LibCache entries upon receiving a catalog > update. > I looked at this for long enough to realise that the invalidation mechanism > is fundamentally broken - it doesn't work with dedicated executors. It also > creates a race between the statestore updates and queries referencing the > UDFs - if the queries win the race, then they can incorrectly use the old > version that should have been invalidated. > I think this is a potentially problematic issue because old JAR/SO versions > could persist in the cache indefinitely if old versions are overwritten in > place. 
-- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Assigned] (IMPALA-7506) Support global INVALIDATE METADATA on fetch-on-demand impalad
[ https://issues.apache.org/jira/browse/IMPALA-7506?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dinesh Garg reassigned IMPALA-7506: --- Assignee: (was: Todd Lipcon) > Support global INVALIDATE METADATA on fetch-on-demand impalad > - > > Key: IMPALA-7506 > URL: https://issues.apache.org/jira/browse/IMPALA-7506 > Project: IMPALA > Issue Type: Sub-task >Reporter: Todd Lipcon >Priority: Major > Labels: catalog-v2 > > There is some complexity with how this is implemented in the original code: > it depends on maintaining the minimum version of any object in the impalad's > local cache. We can't determine that in an on-demand impalad, so INVALIDATE > METADATA is not supported currently on "fetch-on-demand". -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Assigned] (IMPALA-7615) Partition metadata mismatch should be handled gracefully in local catalog mode.
[ https://issues.apache.org/jira/browse/IMPALA-7615?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dinesh Garg reassigned IMPALA-7615: --- Assignee: bharath v > Partition metadata mismatch should be handled gracefully in local catalog > mode. > --- > > Key: IMPALA-7615 > URL: https://issues.apache.org/jira/browse/IMPALA-7615 > Project: IMPALA > Issue Type: Improvement > Components: Frontend >Affects Versions: Impala 3.1.0 >Reporter: bharath v >Assignee: bharath v >Priority: Major > Labels: catalog-v2 > > *This is a Catalog v2 only improvement* > An RPC to fetch partition metadata for a partition ID that does not exist on > the Catalog server currently throws IAE. > {noformat} > @Override > public TGetPartialCatalogObjectResponse getPartialInfo( > TGetPartialCatalogObjectRequest req) throws TableLoadingException { > for (long partId : partIds) { > HdfsPartition part = partitionMap_.get(partId); > Preconditions.checkArgument(part != null, "Partition id %s does not > exist", <-- > partId); > TPartialPartitionInfo partInfo = new TPartialPartitionInfo(partId); > if (req.table_info_selector.want_partition_names) { > partInfo.setName(part.getPartitionName()); > } > if (req.table_info_selector.want_partition_metadata) { > partInfo.hms_partition = part.toHmsPartition(); > {noformat} > This is undesirable since such exceptions are not transparently retried in > the frontend. Instead we should fix this code path to throw > InconsistentMetadataException, similar to what we do for other code paths > that handle such inconsistent metadata like version changes. 
> An example stack trace that hits this issue looks like follows, > {noformat} > org.apache.impala.catalog.local.LocalCatalogException: Could not load > partitions for table partition_level_tests.store_sales > at > org.apache.impala.catalog.local.LocalFsTable.loadPartitions(LocalFsTable.java:399) > at > org.apache.impala.catalog.FeCatalogUtils.loadAllPartitions(FeCatalogUtils.java:207) > at > org.apache.impala.catalog.local.LocalFsTable.getMajorityFormat(LocalFsTable.java:244) > at > org.apache.impala.planner.HdfsTableSink.computeResourceProfile(HdfsTableSink.java:75) > at > org.apache.impala.planner.PlanFragment.computeResourceProfile(PlanFragment.java:233) > at org.apache.impala.planner.Planner.computeResourceReqs(Planner.java:365) > at org.apache.impala.service.Frontend.createExecRequest(Frontend.java:1020) > at org.apache.impala.service.Frontend.doCreateExecRequest(Frontend.java:1162) > at org.apache.impala.service.Frontend.createExecRequest(Frontend.java:1077) > at > org.apache.impala.service.JniFrontend.createExecRequest(JniFrontend.java:156) > Caused by: org.apache.thrift.TException: > TGetPartialCatalogObjectResponse(status:TStatus(status_code:GENERAL, > error_msgs:[IllegalArgumentException: Partition id 10084 does not exist]), > lookup_status:OK) > at > org.apache.impala.catalog.local.CatalogdMetaProvider.sendRequest(CatalogdMetaProvider.java:322) > at > org.apache.impala.catalog.local.CatalogdMetaProvider.loadPartitionsFromCatalogd(CatalogdMetaProvider.java:644) > at > org.apache.impala.catalog.local.CatalogdMetaProvider.loadPartitionsByRefs(CatalogdMetaProvider.java:610) > at > org.apache.impala.catalog.local.LocalFsTable.loadPartitions(LocalFsTable.java:395) > ... 9 more{noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
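The change proposed in IMPALA-7615 amounts to signalling a missing partition ID with a dedicated, retryable error rather than a generic argument error. A minimal sketch of that idea follows, written in Python for brevity; `InconsistentMetadataError`, `get_partial_info` and `with_retries` are hypothetical names chosen for illustration and do not mirror Impala's actual classes or retry logic.

```python
# Illustrative sketch only; all names here are hypothetical.

class InconsistentMetadataError(Exception):
    """Raised when the caller's view of the catalog is stale; safe to retry."""


def get_partial_info(partition_map, part_ids):
    """Return partition info, raising a retryable error for unknown IDs."""
    result = []
    for part_id in part_ids:
        part = partition_map.get(part_id)
        if part is None:
            # A generic argument error here would not be retried by the caller.
            raise InconsistentMetadataError(
                "Partition id %s does not exist" % part_id)
        result.append({"id": part_id, "name": part})
    return result


def with_retries(fn, max_attempts=3):
    """Re-run fn while it reports inconsistent metadata, up to max_attempts."""
    last_error = None
    for _ in range(max_attempts):
        try:
            return fn()
        except InconsistentMetadataError as e:
            last_error = e  # caller would refresh its metadata before retrying
    raise last_error
```

The point of the separate exception type is that the retry wrapper can tell "my metadata is stale, try again" apart from a genuine programming error, which should still fail fast.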
[jira] [Assigned] (IMPALA-7864) TestLocalCatalogRetries::test_replan_limit is flaky
[ https://issues.apache.org/jira/browse/IMPALA-7864?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dinesh Garg reassigned IMPALA-7864: --- Assignee: bharath v (was: Vihang Karajgaonkar) > TestLocalCatalogRetries::test_replan_limit is flaky > --- > > Key: IMPALA-7864 > URL: https://issues.apache.org/jira/browse/IMPALA-7864 > Project: IMPALA > Issue Type: Bug > Components: Catalog >Affects Versions: Impala 3.0, Impala 2.12.0 > Environment: Ubuntu 16.04 >Reporter: Jim Apple >Assignee: bharath v >Priority: Critical > Labels: broken-build, catalog-v2, flaky > Fix For: Impala 3.2.0 > > > In https://jenkins.impala.io/job/ubuntu-16.04-from-scratch/3605/, > TestLocalCatalogRetries::test_replan_limit failed on an unrelated patch. On > my development machine, the test passed. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Assigned] (IMPALA-8648) Impala ACID read stress tests
[ https://issues.apache.org/jira/browse/IMPALA-8648?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dinesh Garg reassigned IMPALA-8648: --- Assignee: Csaba Ringhofer (was: Todd Lipcon) > Impala ACID read stress tests > - > > Key: IMPALA-8648 > URL: https://issues.apache.org/jira/browse/IMPALA-8648 > Project: IMPALA > Issue Type: Test >Reporter: Dinesh Garg >Assignee: Csaba Ringhofer >Priority: Critical > Labels: impala-acid > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Assigned] (IMPALA-8369) Impala should be able to interoperate with Hive 3.1.0
[ https://issues.apache.org/jira/browse/IMPALA-8369?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dinesh Garg reassigned IMPALA-8369: --- Assignee: Vihang Karajgaonkar (was: Dinesh Garg) > Impala should be able to interoperate with Hive 3.1.0 > - > > Key: IMPALA-8369 > URL: https://issues.apache.org/jira/browse/IMPALA-8369 > Project: IMPALA > Issue Type: Improvement >Reporter: Vihang Karajgaonkar >Assignee: Vihang Karajgaonkar >Priority: Major > Labels: impala-acid > > Currently, Impala only works with Hive 2.1.1. Since Hive 3.1.0 has been > out for a while it would be good to add support for Hive 3.1.0 (HMS > 3.1.0). This patch will focus on the ability to connect to HMS 3.1.0 and run > existing tests. It will not focus on adding support for newer features like > ACID in Hive 3.1.0, which can be taken up as a separate JIRA. > It would be good to make changes to Impala source code such that it can work > with both Hive 2.1.0 and Hive 3.1.0 without the need to create a separate > branch. However, this should be an aspirational goal. If we hit a blocker we > should investigate alternative approaches. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Updated] (IMPALA-8663) FileMetadataLoader should skip listing files in hidden and tmp directories
[ https://issues.apache.org/jira/browse/IMPALA-8663?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dinesh Garg updated IMPALA-8663: Labels: impala-acid (was: ) > FileMetadataLoader should skip listing files in hidden and tmp directories > -- > > Key: IMPALA-8663 > URL: https://issues.apache.org/jira/browse/IMPALA-8663 > Project: IMPALA > Issue Type: Bug >Reporter: Vihang Karajgaonkar >Assignee: Vihang Karajgaonkar >Priority: Major > Labels: impala-acid > > Currently, the file metadata loader recursively lists the table and partition > directories to get the fileStatuses. For each filestatus we ignore the hidden > files in {{FileSystemUtil.isValidDataFile}}(). However that is not > sufficient. For instance, if Hive is inserting data into a table when the > refresh is called, it is possible the staging directory is present within the > table directory. This staging directory is a hidden directory of the naming > {{.hive-staging_*}}. It is possible that this directory has files which are > not hidden (starting from a . or _). Such files should be considered > temporary files and should not be considered as valid data files. > > Another instance where we see this happen is in transactional tables which > has a {{.manifest}} which is located in a {{_tmp}} directory within the table > directory. This file should also be skipped and not considered as a valid > data file. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
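The filtering rule described in IMPALA-8663 is that a file should be skipped not only when its own name is hidden, but also when any directory between the table root and the file is hidden or temporary. A minimal sketch of that check follows, in Python for brevity. The function names are hypothetical, and treating every path component starting with `.` or `_` as hidden is an assumption drawn from the examples in the issue (`.hive-staging_*`, `_tmp`), not Impala's actual implementation.

```python
# Illustrative sketch only; names and the exact prefix rule are assumptions.
from pathlib import PurePosixPath


def is_hidden_or_tmp(name: str) -> bool:
    """True for names such as '.hive-staging_x', '_tmp' or '.manifest'."""
    return name.startswith(".") or name.startswith("_")


def is_valid_data_file(table_dir: str, file_path: str) -> bool:
    """Check every path component below the table directory, not just the file."""
    rel = PurePosixPath(file_path).relative_to(PurePosixPath(table_dir))
    return not any(is_hidden_or_tmp(part) for part in rel.parts)
```

Only components below the table directory are inspected, so a hidden component in the warehouse path itself does not disqualify every file in the table.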
[jira] [Updated] (IMPALA-8663) FileMetadataLoader should skip listing files in hidden and tmp directories
[ https://issues.apache.org/jira/browse/IMPALA-8663?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dinesh Garg updated IMPALA-8663: Priority: Critical (was: Major) > FileMetadataLoader should skip listing files in hidden and tmp directories > -- > > Key: IMPALA-8663 > URL: https://issues.apache.org/jira/browse/IMPALA-8663 > Project: IMPALA > Issue Type: Bug >Reporter: Vihang Karajgaonkar >Assignee: Vihang Karajgaonkar >Priority: Critical > Labels: impala-acid > > Currently, the file metadata loader recursively lists the table and partition > directories to get the fileStatuses. For each filestatus we ignore the hidden > files in {{FileSystemUtil.isValidDataFile}}(). However that is not > sufficient. For instance, if Hive is inserting data into a table when the > refresh is called, it is possible the staging directory is present within the > table directory. This staging directory is a hidden directory of the naming > {{.hive-staging_*}}. It is possible that this directory has files which are > not hidden (starting from a . or _). Such files should be considered > temporary files and should not be considered as valid data files. > > Another instance where we see this happen is in transactional tables which > has a {{.manifest}} which is located in a {{_tmp}} directory within the table > directory. This file should also be skipped and not considered as a valid > data file. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Updated] (IMPALA-8339) Coordinator should be more resilient to fragment instances startup failure
[ https://issues.apache.org/jira/browse/IMPALA-8339?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dinesh Garg updated IMPALA-8339: Priority: Critical (was: Major) > Coordinator should be more resilient to fragment instances startup failure > -- > > Key: IMPALA-8339 > URL: https://issues.apache.org/jira/browse/IMPALA-8339 > Project: IMPALA > Issue Type: Improvement > Components: Distributed Exec >Reporter: Michael Ho >Assignee: Thomas Tauber-Marshall >Priority: Critical > Labels: Availability, resilience > > Impala currently relies on statestore for cluster membership. When an Impala > executor goes offline, it may take a while for statestore to declare that > node as unavailable and for that information to be propagated to all > coordinator nodes. Within this window, some coordinator nodes may still > attempt to issue RPCs to the faulty node, resulting in RPC failures that > in turn cause query failures. In other words, many queries may fail to start > within this window until all coordinator nodes get the latest information on > cluster membership. > Going forward, the coordinator may need to fall back to using backup executors > for each fragment in case some of the executors are not available. Moreover, > *coordinator should treat the cluster membership information from statestore > (or any external source of truth e.g. etcd) as hints instead of ground truth* > and adjust the scheduling of fragment instances based on the availability of > the executors from the coordinator's perspective. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
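The idea in IMPALA-8339 of treating statestore membership as a hint can be sketched as a coordinator that keeps its own record of executors it failed to reach and falls back to the next candidate. This is a rough illustration in Python; `start_fragment`, `send_rpc` and `locally_failed` are hypothetical names, and real scheduling in Impala is considerably more involved.

```python
# Illustrative sketch only; all names here are hypothetical.

def start_fragment(candidates, send_rpc, locally_failed):
    """Try executors in order, skipping ones this coordinator saw fail.

    candidates: hosts the statestore currently reports as members (a hint).
    send_rpc: callable that raises ConnectionError if the host is unreachable.
    locally_failed: set of hosts this coordinator could not reach recently.
    """
    for host in candidates:
        if host in locally_failed:
            continue  # statestore still lists it, but we know better
        try:
            send_rpc(host)
            return host
        except ConnectionError:
            # Remember the failure so later fragments avoid this host.
            locally_failed.add(host)
    raise RuntimeError("no executor available for fragment")
```

The locally maintained failure set is what lets later fragments avoid a dead node before the statestore update arrives.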
[jira] [Assigned] (IMPALA-8663) FileMetadataLoader should skip listing files in hidden and tmp directories
[ https://issues.apache.org/jira/browse/IMPALA-8663?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dinesh Garg reassigned IMPALA-8663: --- Assignee: Vihang Karajgaonkar (was: Dinesh Garg) > FileMetadataLoader should skip listing files in hidden and tmp directories > -- > > Key: IMPALA-8663 > URL: https://issues.apache.org/jira/browse/IMPALA-8663 > Project: IMPALA > Issue Type: Bug >Reporter: Vihang Karajgaonkar >Assignee: Vihang Karajgaonkar >Priority: Major > > Currently, the file metadata loader recursively lists the table and partition > directories to get the fileStatuses. For each filestatus we ignore the hidden > files in {{FileSystemUtil.isValidDataFile}}(). However that is not > sufficient. For instance, if Hive is inserting data into a table when the > refresh is called, it is possible the staging directory is present within the > table directory. This staging directory is a hidden directory of the naming > {{.hive-staging_*}}. It is possible that this directory has files which are > not hidden (starting from a . or _). Such files should be considered > temporary files and should not be considered as valid data files. > > Another instance where we see this happen is in transactional tables which > has a {{.manifest}} which is located in a {{_tmp}} directory within the table > directory. This file should also be skipped and not considered as a valid > data file. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Work started] (IMPALA-8663) FileMetadataLoader should skip listing files in hidden and tmp directories
[ https://issues.apache.org/jira/browse/IMPALA-8663?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on IMPALA-8663 started by Dinesh Garg. --- > FileMetadataLoader should skip listing files in hidden and tmp directories > -- > > Key: IMPALA-8663 > URL: https://issues.apache.org/jira/browse/IMPALA-8663 > Project: IMPALA > Issue Type: Bug >Reporter: Vihang Karajgaonkar >Assignee: Dinesh Garg >Priority: Major > > Currently, the file metadata loader recursively lists the table and partition > directories to get the fileStatuses. For each filestatus we ignore the hidden > files in {{FileSystemUtil.isValidDataFile}}(). However that is not > sufficient. For instance, if Hive is inserting data into a table when the > refresh is called, it is possible the staging directory is present within the > table directory. This staging directory is a hidden directory of the naming > {{.hive-staging_*}}. It is possible that this directory has files which are > not hidden (starting from a . or _). Such files should be considered > temporary files and should not be considered as valid data files. > > Another instance where we see this happen is in transactional tables which > has a {{.manifest}} which is located in a {{_tmp}} directory within the table > directory. This file should also be skipped and not considered as a valid > data file. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Resolved] (IMPALA-8436) Disallow write/alter to materialized views
[ https://issues.apache.org/jira/browse/IMPALA-8436?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dinesh Garg resolved IMPALA-8436. - Resolution: Fixed > Disallow write/alter to materialized views > -- > > Key: IMPALA-8436 > URL: https://issues.apache.org/jira/browse/IMPALA-8436 > Project: IMPALA > Issue Type: New Feature > Components: Frontend >Reporter: Sudhanshu Arora >Priority: Critical > Labels: impala-acid > > Block write/alter into materialized views, but allow select -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Assigned] (IMPALA-8585) Impala ACID tests
[ https://issues.apache.org/jira/browse/IMPALA-8585?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dinesh Garg reassigned IMPALA-8585: --- Assignee: Csaba Ringhofer (was: Dinesh Garg) > Impala ACID tests > - > > Key: IMPALA-8585 > URL: https://issues.apache.org/jira/browse/IMPALA-8585 > Project: IMPALA > Issue Type: Improvement >Reporter: Zoltán Borók-Nagy >Assignee: Csaba Ringhofer >Priority: Critical > Labels: impala-acid > > Umbrella Jira for adding tests about ACID functionality, e.g.: > * Ordinary table that was upgraded to ACID table > * Inserting data in hive and querying it in Impala concurrently > * Compute stats interoperability between Hive and Impala > * Partitioned tables, dynamic partitioning -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Work started] (IMPALA-8585) Impala ACID tests
[ https://issues.apache.org/jira/browse/IMPALA-8585?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on IMPALA-8585 started by Dinesh Garg. --- > Impala ACID tests > - > > Key: IMPALA-8585 > URL: https://issues.apache.org/jira/browse/IMPALA-8585 > Project: IMPALA > Issue Type: Improvement >Reporter: Zoltán Borók-Nagy >Assignee: Dinesh Garg >Priority: Critical > Labels: impala-acid > > Umbrella Jira for adding tests about ACID functionality, e.g.: > * Ordinary table that was upgraded to ACID table > * Inserting data in hive and querying it in Impala concurrently > * Compute stats interoperability between Hive and Impala > * Partitioned tables, dynamic partitioning -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Assigned] (IMPALA-8439) Add Hive ACID tables during dataload if Hive 3.1 is enabled
[ https://issues.apache.org/jira/browse/IMPALA-8439?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dinesh Garg reassigned IMPALA-8439: --- Assignee: Yongzhi Chen (was: Dinesh Garg) > Add Hive ACID tables during dataload if Hive 3.1 is enabled > --- > > Key: IMPALA-8439 > URL: https://issues.apache.org/jira/browse/IMPALA-8439 > Project: IMPALA > Issue Type: Story > Components: Infrastructure >Affects Versions: Impala 3.2.0 >Reporter: Csaba Ringhofer >Assignee: Yongzhi Chen >Priority: Critical > Labels: impala-acid > > Test warehouse should include a few transactional tables (insert-only, not > insert-only, partitioned, not partitioned, bucketed, not-bucketed, compacted > and uncompacted) to enable the testing of ACID features.
[jira] [Work started] (IMPALA-8439) Add Hive ACID tables during dataload if Hive 3.1 is enabled
[ https://issues.apache.org/jira/browse/IMPALA-8439?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on IMPALA-8439 started by Dinesh Garg. --- > Add Hive ACID tables during dataload if Hive 3.1 is enabled > --- > > Key: IMPALA-8439 > URL: https://issues.apache.org/jira/browse/IMPALA-8439 > Project: IMPALA > Issue Type: Story > Components: Infrastructure >Affects Versions: Impala 3.2.0 >Reporter: Csaba Ringhofer >Assignee: Dinesh Garg >Priority: Critical > Labels: impala-acid > > Test warehouse should include a few transactional tables (insert-only, not > insert-only, partitioned, not partitioned, bucketed, not-bucketed, compacted > and uncompacted) to enable the testing of ACID features.
[jira] [Reopened] (IMPALA-8439) Add Hive ACID tables during dataload if Hive 3.1 is enabled
[ https://issues.apache.org/jira/browse/IMPALA-8439?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dinesh Garg reopened IMPALA-8439: - > Add Hive ACID tables during dataload if Hive 3.1 is enabled > --- > > Key: IMPALA-8439 > URL: https://issues.apache.org/jira/browse/IMPALA-8439 > Project: IMPALA > Issue Type: Story > Components: Infrastructure >Affects Versions: Impala 3.2.0 >Reporter: Csaba Ringhofer >Assignee: Csaba Ringhofer >Priority: Critical > Labels: impala-acid > > Test warehouse should include a few transactional tables (insert-only, not > insert-only, partitioned, not partitioned, bucketed, not-bucketed, compacted > and uncompacted) to enable the testing of ACID features.
[jira] [Updated] (IMPALA-8600) Reload partition does not work for transactional tables
[ https://issues.apache.org/jira/browse/IMPALA-8600?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dinesh Garg updated IMPALA-8600: Labels: impala-acid (was: ) > Reload partition does not work for transactional tables > --- > > Key: IMPALA-8600 > URL: https://issues.apache.org/jira/browse/IMPALA-8600 > Project: IMPALA > Issue Type: Bug >Reporter: Vihang Karajgaonkar >Assignee: Vihang Karajgaonkar >Priority: Major > Labels: impala-acid > > If a table is transactional, a reload partition call should fetch the valid > writeIds. Without doing this, the reload will skip adding all the newly > created delta files of the transactional table pertaining to the new writeIds.
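The failure mode described in IMPALA-8600 can be illustrated with a small model. This is only a sketch, not Impala's catalog code: `ValidWriteIds` and `visible_deltas` are hypothetical names, the write ID list is reduced to a high watermark plus a set of invalid IDs, and directory names are assumed to follow the `delta_<minWriteId>_<maxWriteId>` pattern with an optional statement suffix.

```python
import re
from dataclasses import dataclass

# Assumed naming pattern: delta_<minWriteId>_<maxWriteId>[_<stmtId>]
DELTA_RE = re.compile(r"^delta_(\d+)_(\d+)(?:_\d+)?$")


@dataclass(frozen=True)
class ValidWriteIds:
    """Simplified stand-in for a valid write ID list (hypothetical type)."""
    high_watermark: int
    invalid: frozenset = frozenset()  # open or aborted write IDs

    def is_valid(self, write_id: int) -> bool:
        return write_id <= self.high_watermark and write_id not in self.invalid


def visible_deltas(dir_names, valid: ValidWriteIds):
    """Return the delta directories whose whole write ID range is valid."""
    visible = []
    for name in dir_names:
        match = DELTA_RE.match(name)
        if match is None:
            continue  # not a delta directory in this simplified model
        lo, hi = int(match.group(1)), int(match.group(2))
        if all(valid.is_valid(w) for w in range(lo, hi + 1)):
            visible.append(name)
    return visible


dirs = ["delta_0000001_0000001", "delta_0000002_0000002", "delta_0000003_0000003"]
stale = ValidWriteIds(high_watermark=1)  # snapshot taken before the reload
fresh = ValidWriteIds(high_watermark=3)  # snapshot fetched during the reload
print(visible_deltas(dirs, stale))
print(visible_deltas(dirs, fresh))
```

With the stale snapshot only the first delta directory passes the filter, which is the skipped-files symptom the issue reports; fetching the snapshot again during the reload makes all three visible.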
[jira] [Assigned] (IMPALA-8631) Ensure that cached data is always up to date to avoid reads based on stale metadata for transactional read only tables
[ https://issues.apache.org/jira/browse/IMPALA-8631?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dinesh Garg reassigned IMPALA-8631: --- Assignee: Todd Lipcon > Ensure that cached data is always up to date to avoid reads based on stale > metadata for transactional read only tables > --- > > Key: IMPALA-8631 > URL: https://issues.apache.org/jira/browse/IMPALA-8631 > Project: IMPALA > Issue Type: Improvement >Reporter: Dinesh Garg >Assignee: Todd Lipcon >Priority: Major > Labels: impala-acid > > Acquire the latest validWriteIdList in the coordinator and validate that the > cached data is up to date. If it is not, automatically force a refresh as part of the query.
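The check proposed in IMPALA-8631 could look roughly like the following. It is a minimal sketch under stated assumptions: `CachedTable`, `get_fresh_table`, and the two callbacks are hypothetical names, and the write ID list is treated as an opaque string that can be compared for equality.

```python
from dataclasses import dataclass, field


@dataclass
class CachedTable:
    """Hypothetical cache entry: metadata plus the snapshot it was loaded under."""
    name: str
    write_id_list: str
    files: list = field(default_factory=list)


def get_fresh_table(cached, fetch_latest_write_ids, reload_table):
    """Return cached metadata only if its write ID snapshot is still current.

    fetch_latest_write_ids(name) and reload_table(name, write_ids) are
    caller-supplied callbacks standing in for metastore and catalog calls.
    """
    latest = fetch_latest_write_ids(cached.name)
    if latest == cached.write_id_list:
        return cached  # cache is up to date, no refresh needed
    # The snapshot moved on, so refresh as part of serving the query.
    return reload_table(cached.name, latest)


cached = CachedTable("db.t", "hwm=5", ["delta_0000005_0000005"])
fresh = get_fresh_table(
    cached,
    fetch_latest_write_ids=lambda name: "hwm=6",
    reload_table=lambda name, ids: CachedTable(name, ids, ["delta_0000006_0000006"]),
)
print(fresh.write_id_list)
```

Comparing snapshots on every query costs one extra round trip to fetch the latest list; whether that cost is acceptable is a design question the issue leaves open.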
[jira] [Work started] (IMPALA-8369) Impala should be able to interoperate with Hive 3.1.0
[ https://issues.apache.org/jira/browse/IMPALA-8369?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on IMPALA-8369 started by Dinesh Garg. --- > Impala should be able to interoperate with Hive 3.1.0 > - > > Key: IMPALA-8369 > URL: https://issues.apache.org/jira/browse/IMPALA-8369 > Project: IMPALA > Issue Type: Improvement >Reporter: Vihang Karajgaonkar >Assignee: Dinesh Garg >Priority: Major > Labels: impala-acid > > Currently, Impala only works with Hive 2.1.1. Since Hive 3.1.0 has been > released for a while, it would be good to add support for Hive 3.1.0 (HMS > 3.1.0). This patch will focus on the ability to connect to HMS 3.1.0 and run > existing tests. It will not focus on adding support for newer features like > ACID in Hive 3.1.0, which can be taken up as a separate JIRA. > It would be good to make changes to Impala source code such that it can work > with both Hive 2.1.0 and Hive 3.1.0 without the need to create a separate > branch. However, this should be an aspirational goal. If we hit a blocker we > should investigate alternative approaches.
[jira] [Resolved] (IMPALA-8439) Add Hive ACID tables during dataload if Hive 3.1 is enabled
[ https://issues.apache.org/jira/browse/IMPALA-8439?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dinesh Garg resolved IMPALA-8439. - Resolution: Fixed > Add Hive ACID tables during dataload if Hive 3.1 is enabled > --- > > Key: IMPALA-8439 > URL: https://issues.apache.org/jira/browse/IMPALA-8439 > Project: IMPALA > Issue Type: Story > Components: Infrastructure >Affects Versions: Impala 3.2.0 >Reporter: Csaba Ringhofer >Assignee: Csaba Ringhofer >Priority: Critical > Labels: impala-acid > > Test warehouse should include a few transactional tables (insert-only, not > insert-only, partitioned, not partitioned, bucketed, not-bucketed, compacted > and uncompacted) to enable the testing of ACID features.
[jira] [Assigned] (IMPALA-8440) Add "post upgrade" ACID tables to test data
[ https://issues.apache.org/jira/browse/IMPALA-8440?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dinesh Garg reassigned IMPALA-8440: --- Assignee: Csaba Ringhofer > Add "post upgrade" ACID tables to test data > --- > > Key: IMPALA-8440 > URL: https://issues.apache.org/jira/browse/IMPALA-8440 > Project: IMPALA > Issue Type: Story > Components: Infrastructure >Affects Versions: Impala 3.2.0 >Reporter: Csaba Ringhofer >Assignee: Csaba Ringhofer >Priority: Critical > Labels: impala-acid > Fix For: Impala 3.3.0 > > > Include a transactional table in the test data which is in post-upgrade > format (what an old table looks like after it becomes transactional).
[jira] [Assigned] (IMPALA-8636) Implement INSERT for insert-only ACID tables
[ https://issues.apache.org/jira/browse/IMPALA-8636?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dinesh Garg reassigned IMPALA-8636: --- Assignee: Zoltán Borók-Nagy (was: Dinesh Garg) > Implement INSERT for insert-only ACID tables > > > Key: IMPALA-8636 > URL: https://issues.apache.org/jira/browse/IMPALA-8636 > Project: IMPALA > Issue Type: New Feature >Reporter: Zoltán Borók-Nagy >Assignee: Zoltán Borók-Nagy >Priority: Major > Labels: impala-acid > > Impala should support insertion for insert-only ACID tables. > For this we need to allocate a write ID for the target table, and write the > data into the base/delta directories. > INSERT operation should create a new delta directory with the allocated write > ID. > INSERT OVERWRITE should create a new base directory with the allocated write > ID. This new base directory will only contain the data coming from this > operation.
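The directory choice described in IMPALA-8636 can be sketched as a single function. The function name `write_dir_name` is hypothetical, and the seven-digit zero padding and the omission of a statement ID suffix are assumptions made for illustration rather than details taken from the issue.

```python
def write_dir_name(write_id: int, overwrite: bool) -> str:
    """Pick the target directory name for a write to an insert-only ACID table.

    INSERT           -> a new delta directory covering just this write ID.
    INSERT OVERWRITE -> a new base directory holding only this operation's data.
    """
    if write_id <= 0:
        raise ValueError("write_id must be a positive, allocated write ID")
    if overwrite:
        return f"base_{write_id:07d}"
    # A single-statement insert covers the range [write_id, write_id].
    return f"delta_{write_id:07d}_{write_id:07d}"


print(write_dir_name(5, overwrite=False))  # delta_0000005_0000005
print(write_dir_name(5, overwrite=True))   # base_0000005
```

Allocating the write ID itself would be a metastore call and is deliberately left outside this sketch.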
[jira] [Assigned] (IMPALA-8637) Implement transaction handling and locking for ACID queries
[ https://issues.apache.org/jira/browse/IMPALA-8637?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dinesh Garg reassigned IMPALA-8637: --- Assignee: Zoltán Borók-Nagy (was: Dinesh Garg) > Implement transaction handling and locking for ACID queries > --- > > Key: IMPALA-8637 > URL: https://issues.apache.org/jira/browse/IMPALA-8637 > Project: IMPALA > Issue Type: Improvement >Reporter: Zoltán Borók-Nagy >Assignee: Zoltán Borók-Nagy >Priority: Major > Labels: impala-acid > > * Start a transaction before planning > * Lock tables > * Send heartbeats during execution > * Mark the transaction committed after execution finishes
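The four steps listed in IMPALA-8637 can be sketched as one control flow. Everything here is an assumption made for illustration: `FakeTxnClient` is an in-memory stand-in rather than a real metastore client, `run_acid_query` is a hypothetical name, planning before locking is one possible ordering, and aborting on failure is an addition the issue does not mention.

```python
import threading


class FakeTxnClient:
    """In-memory stand-in for a metastore client; every method is hypothetical."""

    def __init__(self):
        self.log = []
        self._next_id = 1

    def open_txn(self):
        txn_id = self._next_id
        self._next_id += 1
        self.log.append(("open", txn_id))
        return txn_id

    def lock(self, txn_id, tables):
        self.log.append(("lock", txn_id, tuple(tables)))

    def heartbeat(self, txn_id):
        self.log.append(("heartbeat", txn_id))

    def commit(self, txn_id):
        self.log.append(("commit", txn_id))

    def abort(self, txn_id):
        self.log.append(("abort", txn_id))


def run_acid_query(client, tables, plan, execute, heartbeat_interval_s=1.0):
    txn_id = client.open_txn()  # start the transaction before planning
    stop = threading.Event()

    def beat():
        # Event.wait returns False on timeout, so this loops until stop is set.
        while not stop.wait(heartbeat_interval_s):
            client.heartbeat(txn_id)

    beater = threading.Thread(target=beat, daemon=True)
    try:
        query_plan = plan(tables)
        client.lock(txn_id, tables)   # lock tables
        beater.start()                # heartbeat during execution
        result = execute(query_plan)
    except Exception:
        stop.set()
        if beater.is_alive():
            beater.join()
        client.abort(txn_id)          # assumption: failed queries abort
        raise
    stop.set()
    beater.join()
    client.commit(txn_id)             # mark committed after execution finishes
    return result


client = FakeTxnClient()
rows = run_acid_query(client, ["db.t"], plan=lambda t: ("scan", t), execute=lambda p: 42)
print(rows, client.log)
```

Running the heartbeat on a separate thread keeps the transaction alive without the executor having to call back into the client; stopping it before commit avoids sending a heartbeat for a transaction that has already ended.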