[jira] [Updated] (IMPALA-9095) Alter table events generated by renames are not renaming the table to a different DB.

2019-10-25 Thread Anurag Mantripragada (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-9095?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anurag Mantripragada updated IMPALA-9095:
-
Description: 
Alter table rename handling was recently refactored. This introduced a bug where 
a rename to a different database is not applied correctly.

Steps to reproduce:

From Hive:
{code:java}
create database bug1;

create table bug1.foo (id int);

create database bug2;

alter table bug1.foo rename to bug2.foo;{code}
 

From Impala:
{code:java}
use bug2;

show tables;{code}
 

Expected: foo shows up in bug2. Actual: it does not.

  was:
Alter table renames was recently refactored. This introduced a bug where rename 
to a different database is not applied correctly.

 

 


> Alter table events generated by renames are not renaming the table to a 
> different DB.
> -
>
> Key: IMPALA-9095
> URL: https://issues.apache.org/jira/browse/IMPALA-9095
> Project: IMPALA
>  Issue Type: Bug
>  Components: Frontend
>Reporter: Anurag Mantripragada
>Assignee: Anurag Mantripragada
>Priority: Critical
>
> Alter table renames was recently refactored. This introduced a bug where 
> rename to a different database is not applied correctly.
> Steps to reproduce:
> From Hive:
> {code:java}
> create database bug1;
> create table bug1.foo (id int);
> create database bug2;
> alter table bug1.foo rename to bug2.foo;{code}
>  
> From Impala:
> {code:java}
> use bug2;
> show tables;{code}
>  
> Expect foo to show up in bug2, it doesn't. 
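
For illustration, the expected behavior can be sketched with a toy in-memory catalog. This is a minimal sketch, not Impala's catalog code: `ToyCatalog`, `TableName`, and `apply_rename` are hypothetical names, and the only point is that a rename event has to remove the table from the source database and register it under the target database.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class TableName:
    db: str
    tbl: str


@dataclass
class ToyCatalog:
    # Hypothetical stand-in for the catalog: database name -> set of table names.
    dbs: dict = field(default_factory=dict)

    def add_table(self, name: TableName) -> None:
        self.dbs.setdefault(name.db, set()).add(name.tbl)

    def apply_rename(self, before: TableName, after: TableName) -> bool:
        """Apply a rename event. Returns False if the source table is unknown."""
        tables = self.dbs.get(before.db, set())
        if before.tbl not in tables:
            return False
        tables.remove(before.tbl)
        # Register the table under the *target* database, which may differ
        # from the source database.
        self.add_table(after)
        return True


catalog = ToyCatalog()
catalog.add_table(TableName("bug1", "foo"))
renamed = catalog.apply_rename(TableName("bug1", "foo"), TableName("bug2", "foo"))
```

With the reproduction steps from this issue, `foo` should end up listed under `bug2` and no longer under `bug1`.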



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Created] (IMPALA-9096) Create external table ddls should send column lineages.

2019-10-25 Thread Anurag Mantripragada (Jira)
Anurag Mantripragada created IMPALA-9096:


 Summary: Create external table ddls should send column lineages.
 Key: IMPALA-9096
 URL: https://issues.apache.org/jira/browse/IMPALA-9096
 Project: IMPALA
  Issue Type: Improvement
Reporter: Anurag Mantripragada


Creating an external table with specified columns should generate column lineages for 
tools like Atlas to consume.

 

For example:

create EXTERNAL TABLE IF NOT EXISTS friday_ext6
(STUD_ID int,
DEPT_ID int,
NAME string
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
STORED AS TEXTFILE;

Currently we send a lineage like:
{code:java}
 {
 "queryText":"create EXTERNAL TABLE IF NOT EXISTS friday_ext5 (STUD_ID int, DEPT_ID int, NAME string ) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' STORED AS TEXTFILE LOCATION '/warehouse/tablespace/external/hive/testdb.db/friday_ext5'",
 "queryId":"4b471ac0ca2b0f93:029db79c",
 "hash":"867fae20bc6c8254c05774cc923a99fa",
 "user":"admin",
 "timestamp":1572028716,
 "endTime":1572028716,
 "edges":[],
 "vertices":[],
 "tableLocation":"hdfs://sid-cdp-2-1.gce.cloudera.com:8020/warehouse/tablespace/external/hive/testdb.db/friday_ext"
}
 {code}
Atlas needs the fully qualified table name to create the lineage.
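
As a rough sketch of what "fully qualified" means here, the snippet below builds `db.table.column` identifiers for the columns in the example. The helper name `qualified_columns` and the lowercasing of identifiers are assumptions made for illustration; the exact vertex format Impala should emit is not specified in this report.

```python
def qualified_columns(db, table, columns):
    """Return db.table.column identifiers, lowercased (an assumption for this sketch)."""
    return [f"{db}.{table}.{col}".lower() for col in columns]


# Database name taken from the table location path in the lineage above.
vertices = qualified_columns("testdb", "friday_ext5", ["STUD_ID", "DEPT_ID", "NAME"])
```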

 

 

 






[jira] [Assigned] (IMPALA-9095) Alter table events generated by renames are not renaming the table to a different DB.

2019-10-25 Thread Anurag Mantripragada (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-9095?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anurag Mantripragada reassigned IMPALA-9095:


Assignee: (was: Anurag Mantripragada)

> Alter table events generated by renames are not renaming the table to a 
> different DB.
> -
>
> Key: IMPALA-9095
> URL: https://issues.apache.org/jira/browse/IMPALA-9095
> Project: IMPALA
>  Issue Type: Bug
>  Components: Frontend
>Reporter: Anurag Mantripragada
>Priority: Critical
>
> Alter table renames was recently refactored. This introduced a bug where 
> rename to a different database is not applied correctly.
> Steps to reproduce:
> From Hive:
> {code:java}
> create database bug1;
> create table bug1.foo (id int);
> create database bug2;
> alter table bug1.foo rename to bug2.foo;{code}
>  
> From Impala:
> {code:java}
> use bug2;
> show tables;{code}
>  
> Expect foo to show up in bug2, it doesn't. 






[jira] [Assigned] (IMPALA-9095) Alter table events generated by renames are not renaming the table to a different DB.

2019-10-25 Thread Anurag Mantripragada (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-9095?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anurag Mantripragada reassigned IMPALA-9095:


Assignee: Anurag Mantripragada

> Alter table events generated by renames are not renaming the table to a 
> different DB.
> -
>
> Key: IMPALA-9095
> URL: https://issues.apache.org/jira/browse/IMPALA-9095
> Project: IMPALA
>  Issue Type: Bug
>  Components: Frontend
>Reporter: Anurag Mantripragada
>Assignee: Anurag Mantripragada
>Priority: Critical
>
> Alter table renames was recently refactored. This introduced a bug where 
> rename to a different database is not applied correctly.
> Steps to reproduce:
> From Hive:
> {code:java}
> create database bug1;
> create table bug1.foo (id int);
> create database bug2;
> alter table bug1.foo rename to bug2.foo;{code}
>  
> From Impala:
> {code:java}
> use bug2;
> show tables;{code}
>  
> Expect foo to show up in bug2, it doesn't. 






[jira] [Created] (IMPALA-9172) Load DB/Table ownership info on start-up.

2019-11-19 Thread Anurag Mantripragada (Jira)
Anurag Mantripragada created IMPALA-9172:


 Summary: Load DB/Table ownership info on start-up.
 Key: IMPALA-9172
 URL: https://issues.apache.org/jira/browse/IMPALA-9172
 Project: IMPALA
  Issue Type: Bug
  Components: Frontend
Affects Versions: Impala 3.3.0
Reporter: Anurag Mantripragada


Ranger uses the OWNER user to check ownership of databases and tables. Ownership 
is part of the HMS thrift object. Since we do not aggressively load HMS 
schemas during start-up, coordinators with cold caches can return incomplete 
table listings, or omit other information that depends on ownership, because 
the metadata needed to verify ownership has not been loaded yet. This should be 
fixed to make the behavior more consistent and user-friendly.
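
The effect can be sketched with a toy model. This is not Ranger's or Impala's code: `visible_tables` and the `owners` mapping are hypothetical, and the sketch only assumes that a table whose owner is not yet loaded cannot pass an ownership check.

```python
def visible_tables(user, tables, owners):
    """Return the tables that `user` owns according to the loaded owner metadata.

    `owners` maps table name -> owner and only has entries for tables whose
    metadata is already loaded; unloaded tables cannot pass the ownership check.
    """
    return [t for t in tables if owners.get(t) == user]


tables = ["t1", "t2"]
cold_cache = {}                              # nothing loaded yet after start-up
warm_cache = {"t1": "alice", "t2": "alice"}  # owner metadata fully loaded

cold_listing = visible_tables("alice", tables, cold_cache)
warm_listing = visible_tables("alice", tables, warm_cache)
```

In this model the same user sees an empty listing on a cold coordinator and both tables on a warm one, which is the inconsistency described above.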






[jira] [Updated] (IMPALA-9152) AuthorizationStmtTest.testColumnMaskEnabled failed in precommits.

2019-11-12 Thread Anurag Mantripragada (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-9152?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anurag Mantripragada updated IMPALA-9152:
-
Summary: AuthorizationStmtTest.testColumnMaskEnabled failed in precommits.  
(was: 
org.apache.impala.authorization.AuthorizationStmtTest.testColumnMaskEnabled 
failed in precommits.)

> AuthorizationStmtTest.testColumnMaskEnabled failed in precommits.
> -
>
> Key: IMPALA-9152
> URL: https://issues.apache.org/jira/browse/IMPALA-9152
> Project: IMPALA
>  Issue Type: Bug
>Reporter: Anurag Mantripragada
>Assignee: Quanlong Huang
>Priority: Major
>
> [~stigahuang] since you are going to work on this stuff, assigning it to you. 
> Please feel free to reassign it. Thank you!
> Encountered here:
> [https://jenkins.impala.io/job/ubuntu-16.04-from-scratch/8738]
> Looks like the test was expecting column masking error but encountered row 
> filtering error.
> {code:java}
> got error:
> Impala does not support row filtering yet. Row filtering is enabled on table: 
> functional.alltypes_view
> expected:
> Impala does not support column masking yet. Column masking is enabled on 
> column: functional.alltypes_view.string_col {code}






[jira] [Created] (IMPALA-9152) org.apache.impala.authorization.AuthorizationStmtTest.testColumnMaskEnabled failed in precommits.

2019-11-12 Thread Anurag Mantripragada (Jira)
Anurag Mantripragada created IMPALA-9152:


 Summary: 
org.apache.impala.authorization.AuthorizationStmtTest.testColumnMaskEnabled 
failed in precommits.
 Key: IMPALA-9152
 URL: https://issues.apache.org/jira/browse/IMPALA-9152
 Project: IMPALA
  Issue Type: Bug
Reporter: Anurag Mantripragada
Assignee: Quanlong Huang


[~stigahuang] since you are going to work on this stuff, assigning it to you. 
Please feel free to reassign it. Thank you!

Encountered here:

[https://jenkins.impala.io/job/ubuntu-16.04-from-scratch/8738]

It looks like the test was expecting a column masking error but encountered a 
row filtering error.
{code:java}
got error:
Impala does not support row filtering yet. Row filtering is enabled on table: 
functional.alltypes_view
expected:
Impala does not support column masking yet. Column masking is enabled on 
column: functional.alltypes_view.string_col {code}






[jira] [Closed] (IMPALA-8761) Configuration validation introduced in IMPALA-8559 can be improved

2019-09-20 Thread Anurag Mantripragada (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8761?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anurag Mantripragada closed IMPALA-8761.

Resolution: Fixed

> Configuration validation introduced in IMPALA-8559 can be improved
> --
>
> Key: IMPALA-8761
> URL: https://issues.apache.org/jira/browse/IMPALA-8761
> Project: IMPALA
>  Issue Type: Sub-task
>Reporter: Vihang Karajgaonkar
>Assignee: Anurag Mantripragada
>Priority: Major
>
> The issue with the configuration validation in IMPALA-8559 is that it validates 
> one configuration at a time and fails as soon as there is a validation error. 
> Since there is more than one configuration key to validate, the user may have 
> to restart HMS again and again if multiple configuration changes are needed. 
> This is not a great user experience. A simple improvement is to run all the 
> configuration validations together and then present the results together in 
> case of failures, so that the user can make all the required changes in one go.
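
The proposed improvement can be sketched as follows. This is a minimal illustration, not the validation code from IMPALA-8559: `validate_all` is a hypothetical helper and the `example.key.*` names are placeholders rather than real HMS configuration keys.

```python
def validate_all(actual, expected):
    """Check every expected key and collect all mismatches.

    Unlike a fail-fast check, this keeps going after the first error so the
    caller can report every problem at once.
    """
    errors = []
    for key, want in expected.items():
        got = actual.get(key)
        if got != want:
            errors.append(f"{key}: expected {want!r}, found {got!r}")
    return errors


# Placeholder keys and values, for illustration only.
expected = {"example.key.a": "true", "example.key.b": "false", "example.key.c": "true"}
actual = {"example.key.a": "true", "example.key.b": "true"}
errors = validate_all(actual, expected)
```

Here both the wrong value for `example.key.b` and the missing `example.key.c` are reported in a single pass, so one restart is enough to fix them.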






[jira] [Comment Edited] (IMPALA-8968) Fix self-event detection on database events.

2019-09-24 Thread Anurag Mantripragada (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-8968?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16937283#comment-16937283
 ] 

Anurag Mantripragada edited comment on IMPALA-8968 at 9/25/19 12:23 AM:


Hi Quanlong, 

I should have been more elaborate in the description. The idea is to receive an 
ALTER_DATABASE event (created by create/drop function in this case) on a DB 
that is already dropped. This is a permanent issue and should be reproducible 
consistently.  I suggest changing the event polling frequency like so:
{code:java}
 /bin/start-impala-cluster.py --catalogd_args=--hms_event_polling_interval_s=10{code}
Also, I ran all the commands using a file like in a test:
{code:java}
 bin/impala-shell.sh -f my_file.sql{code}
Let me know if this doesn't help you reproduce. 


was (Author: anuragmantri):
Hi Quanlong, 

I should have been more elaborate in the description. The idea is to receive an 
ALTER_DATABASE event (created by create/drop function in this case) while the 
DB is already dropped. This is a permanent issue and should be reproducible 
consistently.  I suggest changing the event polling frequency like so:
{code:java}
 /bin/start-impala-cluster.py --catalogd_args=--hms_event_polling_interval_s=10{code}
Also, I ran all the commands using a file like in a test:
{code:java}
 bin/impala-shell.sh -f my_file.sql{code}
Let me know if this doesn't help you reproduce. 

> Fix self-event detection on database events.
> 
>
> Key: IMPALA-8968
> URL: https://issues.apache.org/jira/browse/IMPALA-8968
> Project: IMPALA
>  Issue Type: Bug
>  Components: Frontend
>Reporter: Anurag Mantripragada
>Assignee: Anurag Mantripragada
>Priority: Major
>
> When event processing is turned on, self-events are not detected for DATABASE 
> level events (create/alter/drop database). Reproduced using the below 
> statements:
> {code:java}
> CREATE DATABASE test;
> CREATE FUNCTION test.fn_test (INT, STRING) RETURNS BOOLEAN LOCATION 
> '/test-warehouse/dummy.jar' SYMBOL='com.cloudera.impala.TestUdf';
> DROP FUNCTION test.fn_test(INT, STRING);
> DROP DATABASE test CASCADE; {code}
> Since the events processor could not detect self-events, it will try to 
> process the ALTER_DATABASE event created by dropping the function. However, 
> it doesn't find the DB and errors out like below:
> {code:java}
> I0923 16:09:46.042317  6727 MetastoreEventsProcessor.java:480] Received 2 
> events. Start event id : 30077
> I0923 16:09:46.042317  6727 MetastoreEventsProcessor.java:480] Received 2 
> events. Start event id : 30077
> I0923 16:09:46.042853  6727 MetastoreEvents.java:382] EventId: 30078 
> EventType: ALTER_DATABASE Creating event 30078 of type ALTER_DATABASE on 
> database test
> I0923 16:09:46.01  6727 MetastoreEvents.java:382] EventId: 30079 
> EventType: DROP_DATABASE Creating event 30079 of type DROP_DATABASE on 
> database test
> I0923 16:09:46.050657  6727 MetastoreEvents.java:231] Total number of events 
> received: 2 Total number of events filtered out: 0
> I0923 16:09:46.051167  6727 MetastoreEvents.java:382] EventId: 30078 
> EventType: ALTER_DATABASE Received exception Database test not found. 
> Ignoring self-event evaluation
> E0923 16:09:46.056273  6727 MetastoreEventsProcessor.java:525] Unexpected 
> exception received while processing event
> Java exception 
> follows:org.apache.impala.catalog.events.MetastoreNotificationException: 
> Unable to process event 30078 of type ALTER_DATABASE. Event processing will 
> be stopped.
>  at 
> org.apache.impala.catalog.events.MetastoreEventsProcessor.processEvents(MetastoreEventsProcessor.java:614)
>  at 
> org.apache.impala.catalog.events.MetastoreEventsProcessor.processEvents(MetastoreEventsProcessor.java:511)
>  at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>  at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
>  at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
>  at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
>  at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>  at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>  at java.lang.Thread.run(Thread.java:748)
> Caused by: org.apache.impala.catalog.DatabaseNotFoundException: Database test 
> not found
>  at 
> org.apache.impala.catalog.CatalogServiceCatalog.updateDb(CatalogServiceCatalog.java:1060)
>  at 
> org.apache.impala.catalog.events.MetastoreEvents$AlterDatabaseEvent.process(MetastoreEvents.java:1225)
>  at 
> org.apache.impala.catalog.events.MetastoreEvents$MetastoreEvent.processIfEnabled(MetastoreEvents.java:316)
>  at 

[jira] [Commented] (IMPALA-8968) Fix self-event detection on database events.

2019-09-24 Thread Anurag Mantripragada (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-8968?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16937283#comment-16937283
 ] 

Anurag Mantripragada commented on IMPALA-8968:
--

Hi Quanlong, 

I should have been more elaborate in the description. The idea is to receive an 
ALTER_DATABASE event (created by create/drop function in this case) while the 
DB is already dropped. This is a permanent issue and should be reproducible 
consistently.  I suggest changing the event polling frequency like so:
{code:java}
 /bin/start-impala-cluster.py --catalogd_args=--hms_event_polling_interval_s=10{code}
Also, I ran all the commands using a file like in a test:
{code:java}
 bin/impala-shell.sh -f my_file.sql{code}
Let me know if this doesn't help you reproduce. 

> Fix self-event detection on database events.
> 
>
> Key: IMPALA-8968
> URL: https://issues.apache.org/jira/browse/IMPALA-8968
> Project: IMPALA
>  Issue Type: Bug
>  Components: Frontend
>Reporter: Anurag Mantripragada
>Assignee: Anurag Mantripragada
>Priority: Major
>
> When event processing is turned on, self-events are not detected for DATABASE 
> level events (create/alter/drop database). Reproduced using the below 
> statements:
> {code:java}
> CREATE DATABASE test;
> CREATE FUNCTION test.fn_test (INT, STRING) RETURNS BOOLEAN LOCATION 
> '/test-warehouse/dummy.jar' SYMBOL='com.cloudera.impala.TestUdf';
> DROP FUNCTION test.fn_test(INT, STRING);
> DROP DATABASE test CASCADE; {code}
> Since the events processor could not detect self-events, it will try to 
> process the ALTER_DATABASE event created by dropping the function. However, 
> it doesn't find the DB and errors out like below:
> {code:java}
> I0923 16:09:46.042317  6727 MetastoreEventsProcessor.java:480] Received 2 
> events. Start event id : 30077
> I0923 16:09:46.042317  6727 MetastoreEventsProcessor.java:480] Received 2 
> events. Start event id : 30077
> I0923 16:09:46.042853  6727 MetastoreEvents.java:382] EventId: 30078 
> EventType: ALTER_DATABASE Creating event 30078 of type ALTER_DATABASE on 
> database test
> I0923 16:09:46.01  6727 MetastoreEvents.java:382] EventId: 30079 
> EventType: DROP_DATABASE Creating event 30079 of type DROP_DATABASE on 
> database test
> I0923 16:09:46.050657  6727 MetastoreEvents.java:231] Total number of events 
> received: 2 Total number of events filtered out: 0
> I0923 16:09:46.051167  6727 MetastoreEvents.java:382] EventId: 30078 
> EventType: ALTER_DATABASE Received exception Database test not found. 
> Ignoring self-event evaluation
> E0923 16:09:46.056273  6727 MetastoreEventsProcessor.java:525] Unexpected 
> exception received while processing event
> Java exception 
> follows:org.apache.impala.catalog.events.MetastoreNotificationException: 
> Unable to process event 30078 of type ALTER_DATABASE. Event processing will 
> be stopped.
>  at 
> org.apache.impala.catalog.events.MetastoreEventsProcessor.processEvents(MetastoreEventsProcessor.java:614)
>  at 
> org.apache.impala.catalog.events.MetastoreEventsProcessor.processEvents(MetastoreEventsProcessor.java:511)
>  at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>  at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
>  at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
>  at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
>  at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>  at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>  at java.lang.Thread.run(Thread.java:748)
> Caused by: org.apache.impala.catalog.DatabaseNotFoundException: Database test 
> not found
>  at 
> org.apache.impala.catalog.CatalogServiceCatalog.updateDb(CatalogServiceCatalog.java:1060)
>  at 
> org.apache.impala.catalog.events.MetastoreEvents$AlterDatabaseEvent.process(MetastoreEvents.java:1225)
>  at 
> org.apache.impala.catalog.events.MetastoreEvents$MetastoreEvent.processIfEnabled(MetastoreEvents.java:316)
>  at 
> org.apache.impala.catalog.events.MetastoreEventsProcessor.processEvents(MetastoreEventsProcessor.java:609)
>  ... 8 more
> E0923 16:09:46.056489  6727 MetastoreEventsProcessor.java:631] Notification 
> event is null
> W0923 16:09:56.057376  6727 MetastoreEventsProcessor.java:504] Event 
> processing is skipped since status is ERROR. Last synced event id is 
> 30077{code}
>  
>  






[jira] [Comment Edited] (IMPALA-8968) Fix self-event detection on database events.

2019-09-24 Thread Anurag Mantripragada (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-8968?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16937283#comment-16937283
 ] 

Anurag Mantripragada edited comment on IMPALA-8968 at 9/25/19 12:21 AM:


Hi Quanlong, 

I should have been more elaborate in the description. The idea is to receive an 
ALTER_DATABASE event (created by create/drop function in this case) while the 
DB is already dropped. This is a permanent issue and should be reproducible 
consistently.  I suggest changing the event polling frequency like so:
{code:java}
 /bin/start-impala-cluster.py --catalogd_args=--hms_event_polling_interval_s=10{code}
Also, I ran all the commands using a file like in a test:
{code:java}
 bin/impala-shell.sh -f my_file.sql{code}
Let me know if this doesn't help you reproduce. 


was (Author: anuragmantri):
Hi Quanlong, 

I should have been more elaborate in the description. The idea is to receive an 
ALTER_DATABASE event (created by create/drop function in this case) while the 
DB is already dropped. This is a permanent issue and should be reproducible 
consistently.  I suggest changing the event polling frequency like so:
{code:java}
 /bin/start-impala-cluster.py --catalogd_args=--hms_event_polling_interval_s=10{code}
Also, I ran all the commands using a file like in a test:
{code:java}
 bin/impala-shell.sh -f my_file.sql{code}
Let me know if this doesn't help you reproduce. 

> Fix self-event detection on database events.
> 
>
> Key: IMPALA-8968
> URL: https://issues.apache.org/jira/browse/IMPALA-8968
> Project: IMPALA
>  Issue Type: Bug
>  Components: Frontend
>Reporter: Anurag Mantripragada
>Assignee: Anurag Mantripragada
>Priority: Major
>
> When event processing is turned on, self-events are not detected for DATABASE 
> level events (create/alter/drop database). Reproduced using the below 
> statements:
> {code:java}
> CREATE DATABASE test;
> CREATE FUNCTION test.fn_test (INT, STRING) RETURNS BOOLEAN LOCATION 
> '/test-warehouse/dummy.jar' SYMBOL='com.cloudera.impala.TestUdf';
> DROP FUNCTION test.fn_test(INT, STRING);
> DROP DATABASE test CASCADE; {code}
> Since the events processor could not detect self-events, it will try to 
> process the ALTER_DATABASE event created by dropping the function. However, 
> it doesn't find the DB and errors out like below:
> {code:java}
> I0923 16:09:46.042317  6727 MetastoreEventsProcessor.java:480] Received 2 
> events. Start event id : 30077
> I0923 16:09:46.042317  6727 MetastoreEventsProcessor.java:480] Received 2 
> events. Start event id : 30077
> I0923 16:09:46.042853  6727 MetastoreEvents.java:382] EventId: 30078 
> EventType: ALTER_DATABASE Creating event 30078 of type ALTER_DATABASE on 
> database test
> I0923 16:09:46.01  6727 MetastoreEvents.java:382] EventId: 30079 
> EventType: DROP_DATABASE Creating event 30079 of type DROP_DATABASE on 
> database test
> I0923 16:09:46.050657  6727 MetastoreEvents.java:231] Total number of events 
> received: 2 Total number of events filtered out: 0
> I0923 16:09:46.051167  6727 MetastoreEvents.java:382] EventId: 30078 
> EventType: ALTER_DATABASE Received exception Database test not found. 
> Ignoring self-event evaluation
> E0923 16:09:46.056273  6727 MetastoreEventsProcessor.java:525] Unexpected 
> exception received while processing event
> Java exception 
> follows:org.apache.impala.catalog.events.MetastoreNotificationException: 
> Unable to process event 30078 of type ALTER_DATABASE. Event processing will 
> be stopped.
>  at 
> org.apache.impala.catalog.events.MetastoreEventsProcessor.processEvents(MetastoreEventsProcessor.java:614)
>  at 
> org.apache.impala.catalog.events.MetastoreEventsProcessor.processEvents(MetastoreEventsProcessor.java:511)
>  at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>  at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
>  at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
>  at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
>  at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>  at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>  at java.lang.Thread.run(Thread.java:748)
> Caused by: org.apache.impala.catalog.DatabaseNotFoundException: Database test 
> not found
>  at 
> org.apache.impala.catalog.CatalogServiceCatalog.updateDb(CatalogServiceCatalog.java:1060)
>  at 
> org.apache.impala.catalog.events.MetastoreEvents$AlterDatabaseEvent.process(MetastoreEvents.java:1225)
>  at 
> org.apache.impala.catalog.events.MetastoreEvents$MetastoreEvent.processIfEnabled(MetastoreEvents.java:316)
>  at 

[jira] [Assigned] (IMPALA-8795) Enable event polling by default in tests

2019-10-07 Thread Anurag Mantripragada (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8795?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anurag Mantripragada reassigned IMPALA-8795:


Assignee: Anurag Mantripragada  (was: Vihang Karajgaonkar)

> Enable event polling by default in tests
> 
>
> Key: IMPALA-8795
> URL: https://issues.apache.org/jira/browse/IMPALA-8795
> Project: IMPALA
>  Issue Type: Sub-task
>Reporter: Vihang Karajgaonkar
>Assignee: Anurag Mantripragada
>Priority: Major
>
> We should turn on event processing by default in all the tests to make sure 
> that there are no regressions when we turn ON the feature by default.






[jira] [Created] (IMPALA-9017) Alter table events should be skipped if the table/db is not found in the catalog.

2019-10-07 Thread Anurag Mantripragada (Jira)
Anurag Mantripragada created IMPALA-9017:


 Summary: Alter table events should be skipped if the table/db is 
not found in the catalog.
 Key: IMPALA-9017
 URL: https://issues.apache.org/jira/browse/IMPALA-9017
 Project: IMPALA
  Issue Type: Bug
  Components: Frontend
Reporter: Anurag Mantripragada
Assignee: Anurag Mantripragada


Currently, a sequence of alter_table followed by drop_table/db puts the events 
processor in an error state, since the table/database has already been dropped. 
However, it is safe to skip such events, because the catalog is unaware of the 
table for one of two reasons:
 # The table creation event was missed by the catalog, which means the events 
processor is already in an error state.
 # The table was later dropped/renamed.

In either case, it doesn't make sense to try to add the table to the catalog.
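
The proposed behavior can be sketched as below. This is a toy model under stated assumptions, not the MetastoreEventsProcessor implementation: `process_events`, the tuple-shaped events, and the `catalog_tables` set are all hypothetical.

```python
def process_events(events, catalog_tables):
    """Process (event_type, db, table) tuples against a set of known (db, table) pairs.

    ALTER_TABLE events on tables the catalog does not know are skipped instead
    of being treated as errors. Returns (applied, skipped) lists of events.
    """
    applied, skipped = [], []
    for event in events:
        event_type, db, table = event
        if event_type == "ALTER_TABLE" and (db, table) not in catalog_tables:
            skipped.append(event)
            continue
        if event_type == "DROP_TABLE":
            catalog_tables.discard((db, table))
        applied.append(event)
    return applied, skipped


# The table was already dropped locally before the events were polled.
events = [("ALTER_TABLE", "db1", "t1"), ("DROP_TABLE", "db1", "t1")]
applied, skipped = process_events(events, catalog_tables=set())
```

The alter event lands in `skipped` and processing continues with the drop event, rather than stopping the whole event stream.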






[jira] [Resolved] (IMPALA-9017) Alter table events should be skipped if the table/db is not found in the catalog.

2019-10-08 Thread Anurag Mantripragada (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-9017?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anurag Mantripragada resolved IMPALA-9017.
--
Resolution: Fixed

> Alter table events should be skipped if the table/db is not found in the 
> catalog.
> -
>
> Key: IMPALA-9017
> URL: https://issues.apache.org/jira/browse/IMPALA-9017
> Project: IMPALA
>  Issue Type: Bug
>  Components: Frontend
>Reporter: Anurag Mantripragada
>Assignee: Anurag Mantripragada
>Priority: Major
>
> Currently, a sequence of alter_table, drop_table/db puts the events processor 
> in an error state since the table/database are already dropped. However, it 
> is safe to skip such events because the catalog is not aware of such tables 
> because of either of the reasons:
>  # The table creation event is missed by the catalog - this means the events 
> processor is already in error state.
>  # The table was later dropped/renamed.
> In either case, it doesn't make sense to try to add the table to catalog.






[jira] [Created] (IMPALA-9419) Show create table should show enforcement and belief information.

2020-02-24 Thread Anurag Mantripragada (Jira)
Anurag Mantripragada created IMPALA-9419:


 Summary: Show create table should show enforcement and belief 
information.
 Key: IMPALA-9419
 URL: https://issues.apache.org/jira/browse/IMPALA-9419
 Project: IMPALA
  Issue Type: Sub-task
  Components: Frontend
Affects Versions: Impala 3.4.0
Reporter: Anurag Mantripragada


Currently, for tables with PK/FK information, "SHOW CREATE" statements do not 
show enforcement and belief information (DISABLE, NOVALIDATE, RELY/NORELY). 
This information is useful for reconstructing the table from the SHOW CREATE TABLE output.
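
As a sketch of the kind of clause the output could carry, the hypothetical helper below renders a constraint together with its enforcement and belief keywords. The function name, its parameters, and the exact output layout are assumptions; ENABLE and VALIDATE are included only as the assumed counterparts of the DISABLE and NOVALIDATE keywords named above.

```python
def constraint_clause(kind, columns, enable=False, validate=False, rely=False):
    """Render a constraint with its enforcement and belief keywords."""
    parts = [
        f"{kind} ({', '.join(columns)})",
        "ENABLE" if enable else "DISABLE",
        "VALIDATE" if validate else "NOVALIDATE",
        "RELY" if rely else "NORELY",
    ]
    return " ".join(parts)


clause = constraint_clause("PRIMARY KEY", ["id"], rely=True)
```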






[jira] [Commented] (IMPALA-9119) query_test.test_tpch_queries.TestTpchQuery.test_tpch failed with "Memory limit Exceeded."

2020-03-02 Thread Anurag Mantripragada (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-9119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17049791#comment-17049791
 ] 

Anurag Mantripragada commented on IMPALA-9119:
--

Since the Jenkins runs are short-lived, would it be a good idea to attach full 
logs when creating such JIRAs? I missed capturing the actual query that caused 
these errors in the JIRA :( 

> query_test.test_tpch_queries.TestTpchQuery.test_tpch failed with "Memory 
> limit Exceeded."
> -
>
> Key: IMPALA-9119
> URL: https://issues.apache.org/jira/browse/IMPALA-9119
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 3.3.0
>Reporter: Anurag Mantripragada
>Assignee: Tim Armstrong
>Priority: Critical
>  Labels: broken-build
>
> Observed during a GVD run here: 
> [https://jenkins.impala.io/job/ubuntu-16.04-dockerised-tests/1529/]
> Part of the error message:
> {code:java}
> E   ImpalaBeeswaxException: ImpalaBeeswaxException:
> E   Query aborted:Memory limit exceeded: Cannot perform aggregation at 
> aggregator with id 4. Failed to allocate 25 bytes for intermediate tuple.
> E   Fragment 85444bc502604143:43739df8000b could not allocate 25.00 B 
> without exceeding limit.
> E   Error occurred on backend e6a3e789bfca:22000 by fragment 
> 85444bc502604143:43739df8000b
> E   Memory left in process limit: 6.82 GB
> E   Memory left in query limit: 10.48 KB {code}
>  
> Aggregation node id 4:
> {code:java}
> E Fragment 85444bc502604143:43739df8000b: Reservation=42.00 MB 
> OtherMemory=1.64 MB Total=43.64 MB Peak=43.70 MB
> E   AGGREGATION_NODE (id=4): Reservation=42.00 MB OtherMemory=42.12 KB 
> Total=42.04 MB Peak=42.04 MB
> E GroupingAggregator 0: Reservation=42.00 MB OtherMemory=17.12 KB 
> Total=42.02 MB Peak=42.02 MB
> E   Exprs: Total=17.12 KB Peak=17.12 KB
> E   KUDU_SCAN_NODE (id=3): Total=1.49 MB Peak=1.51 MB
> E Queued Batches: Total=1.49 MB Peak=1.51 MB
> E   KrpcDataStreamSender (dst_id=13): Total=83.35 KB Peak=99.35 KB
> E   CodeGen: Total=5.29 KB Peak=596.50 KB {code}
>  






[jira] [Commented] (IMPALA-9325) Describe statement fails if its a Hive Materialized view

2020-01-23 Thread Anurag Mantripragada (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-9325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17022728#comment-17022728
 ] 

Anurag Mantripragada commented on IMPALA-9325:
--

A related Jira on more user-friendly messages: 
https://issues.apache.org/jira/browse/IMPALA-7507

> Describe statement fails if its a Hive Materialized view 
> -
>
> Key: IMPALA-9325
> URL: https://issues.apache.org/jira/browse/IMPALA-9325
> Project: IMPALA
>  Issue Type: Bug
>Affects Versions: Impala 3.4.0
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
>Priority: Minor
>
> Based on what I see here 
> [https://github.com/apache/impala/blob/master/fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java#L349]
>  Impala treats Hive's materialized views as views.
>  
> However, when we issue a describe statement on a materialized view in catalog-v2 
> mode (--use-local-catalog=true), we see the following exception:
>  
> Error: LocalCatalogException: Could not load partition names for table 
> default.mv3 CAUSED BY: TException: Invalid response from catalogd for request 
> TGetPartialCatalogObjectRequest(protocol_version:V1, 
> object_desc:TCatalogObject(type:TABLE, catalog_version:16, 
> table:TTable(db_name:default, tbl_name:mv3)), 
> table_info_selector:TTableInfoSelector(want_hms_table:false, 
> want_partition_names:true, want_partition_metadata:false, 
> want_partition_files:false, want_partition_stats:false, 
> want_table_constraints:false)): missing partition list result
>  
> We should either have a more user-friendly error message or treat it as a 
> view until we add full support for materialized views.
>  
>  






[jira] [Updated] (IMPALA-9311) test_show_create_table failed with primary key mismatch

2020-01-27 Thread Anurag Mantripragada (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-9311?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anurag Mantripragada updated IMPALA-9311:
-
Priority: Critical  (was: Major)

> test_show_create_table failed with primary key mismatch
> ---
>
> Key: IMPALA-9311
> URL: https://issues.apache.org/jira/browse/IMPALA-9311
> Project: IMPALA
>  Issue Type: Test
>Affects Versions: Impala 3.4.0
>Reporter: Xiaomeng Zhang
>Assignee: Anurag Mantripragada
>Priority: Critical
>  Labels: broken-build
>
> {code:java}
> Error Message
> metadata/test_show_create_table.py:62: in test_show_create_table
> unique_database)
> metadata/test_show_create_table.py:110: in __run_show_create_table_test_case
> self.__compare_result(expected_result, create_table_result)
> metadata/test_show_create_table.py:146: in __compare_result
> assert expected_sql_filtered == actual_sql_filtered
> E   assert "CREATE EXTER...parent_table'" == "CREATE EXTERN...parent_table'"
> E Skipping 71 identical leading characters in diff, use -v to show
> E Skipping 126 identical trailing characters in diff, use -v to show
> E - MARY KEY (year, id)) ROW FO
> E ?
> E + MARY KEY (id, year)) ROW FO
> E ?
> Stacktrace
> metadata/test_show_create_table.py:62: in test_show_create_table
> unique_database)
> metadata/test_show_create_table.py:110: in __run_show_create_table_test_case
> self.__compare_result(expected_result, create_table_result)
> metadata/test_show_create_table.py:146: in __compare_result
> assert expected_sql_filtered == actual_sql_filtered
> E   assert "CREATE EXTER...parent_table'" == "CREATE EXTERN...parent_table'"
> E Skipping 71 identical leading characters in diff, use -v to show
> E Skipping 126 identical trailing characters in diff, use -v to show
> E - MARY KEY (year, id)) ROW FO
> E ?   
> E + MARY KEY (id, year)) ROW FO
> E ?   {code}
> I think this is due to commit 
> [https://github.com/apache/impala/commit/cfe60858da110cf1256bd3aa5d4f8d374578a33d]






[jira] [Created] (IMPALA-9336) Impala doc: Document the create table syntax for Primary Key and Foreign Keys spec.

2020-01-28 Thread Anurag Mantripragada (Jira)
Anurag Mantripragada created IMPALA-9336:


 Summary: Impala doc: Document the create table syntax for Primary 
Key and Foreign Keys spec.
 Key: IMPALA-9336
 URL: https://issues.apache.org/jira/browse/IMPALA-9336
 Project: IMPALA
  Issue Type: Sub-task
  Components: Docs
Reporter: Anurag Mantripragada
Assignee: Laurel Hale


This Jira tracks the documentation needed for defining primary keys and foreign 
keys as part of create table in Impala. See IMPALA-2112.






[jira] [Work stopped] (IMPALA-8795) Enable event polling by default in tests

2020-02-03 Thread Anurag Mantripragada (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8795?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on IMPALA-8795 stopped by Anurag Mantripragada.

> Enable event polling by default in tests
> 
>
> Key: IMPALA-8795
> URL: https://issues.apache.org/jira/browse/IMPALA-8795
> Project: IMPALA
>  Issue Type: Sub-task
>Reporter: Vihang Karajgaonkar
>Assignee: Anurag Mantripragada
>Priority: Major
>
> We should turn on event processing by default in all the tests to make sure 
> that there are no regressions when we turn ON the feature by default.






[jira] [Commented] (IMPALA-9311) test_show_create_table failed with primary key mismatch

2020-01-24 Thread Anurag Mantripragada (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-9311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17023342#comment-17023342
 ] 

Anurag Mantripragada commented on IMPALA-9311:
--

CR here: [https://gerrit.cloudera.org/#/c/15106/]

> test_show_create_table failed with primary key mismatch
> ---
>
> Key: IMPALA-9311
> URL: https://issues.apache.org/jira/browse/IMPALA-9311
> Project: IMPALA
>  Issue Type: Test
>Affects Versions: Impala 3.4.0
>Reporter: Xiaomeng Zhang
>Assignee: Anurag Mantripragada
>Priority: Major
>  Labels: broken-build
>
> {code:java}
> Error Message
> metadata/test_show_create_table.py:62: in test_show_create_table
> unique_database)
> metadata/test_show_create_table.py:110: in __run_show_create_table_test_case
> self.__compare_result(expected_result, create_table_result)
> metadata/test_show_create_table.py:146: in __compare_result
> assert expected_sql_filtered == actual_sql_filtered
> E   assert "CREATE EXTER...parent_table'" == "CREATE EXTERN...parent_table'"
> E Skipping 71 identical leading characters in diff, use -v to show
> E Skipping 126 identical trailing characters in diff, use -v to show
> E - MARY KEY (year, id)) ROW FO
> E ?
> E + MARY KEY (id, year)) ROW FO
> E ?
> Stacktrace
> metadata/test_show_create_table.py:62: in test_show_create_table
> unique_database)
> metadata/test_show_create_table.py:110: in __run_show_create_table_test_case
> self.__compare_result(expected_result, create_table_result)
> metadata/test_show_create_table.py:146: in __compare_result
> assert expected_sql_filtered == actual_sql_filtered
> E   assert "CREATE EXTER...parent_table'" == "CREATE EXTERN...parent_table'"
> E Skipping 71 identical leading characters in diff, use -v to show
> E Skipping 126 identical trailing characters in diff, use -v to show
> E - MARY KEY (year, id)) ROW FO
> E ?   
> E + MARY KEY (id, year)) ROW FO
> E ?   {code}
> I think this is due to commit 
> [https://github.com/apache/impala/commit/cfe60858da110cf1256bd3aa5d4f8d374578a33d]






[jira] [Assigned] (IMPALA-9336) Impala doc: Document the create table syntax for Primary Key and Foreign Keys spec.

2020-01-28 Thread Anurag Mantripragada (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-9336?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anurag Mantripragada reassigned IMPALA-9336:


Assignee: (was: Anurag Mantripragada)

> Impala doc: Document the create table syntax for Primary Key and Foreign Keys 
> spec.
> ---
>
> Key: IMPALA-9336
> URL: https://issues.apache.org/jira/browse/IMPALA-9336
> Project: IMPALA
>  Issue Type: Sub-task
>  Components: Docs
>Reporter: Anurag Mantripragada
>Priority: Major
>
> This Jira tracks the documentation needed for defining primary keys and 
> foreign keys as part of create table in Impala. See IMPALA-2112.






[jira] [Assigned] (IMPALA-9336) Impala doc: Document the create table syntax for Primary Key and Foreign Keys spec.

2020-01-28 Thread Anurag Mantripragada (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-9336?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anurag Mantripragada reassigned IMPALA-9336:


Assignee: Anurag Mantripragada  (was: Laurel Hale)

> Impala doc: Document the create table syntax for Primary Key and Foreign Keys 
> spec.
> ---
>
> Key: IMPALA-9336
> URL: https://issues.apache.org/jira/browse/IMPALA-9336
> Project: IMPALA
>  Issue Type: Sub-task
>  Components: Docs
>Reporter: Anurag Mantripragada
>Assignee: Anurag Mantripragada
>Priority: Major
>
> This Jira tracks the documentation needed for defining primary keys and 
> foreign keys as part of create table in Impala. See IMPALA-2112.






[jira] [Updated] (IMPALA-9336) Impala doc: Document the create table syntax for Primary Key and Foreign Keys spec.

2020-01-28 Thread Anurag Mantripragada (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-9336?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anurag Mantripragada updated IMPALA-9336:
-
Priority: Critical  (was: Major)

> Impala doc: Document the create table syntax for Primary Key and Foreign Keys 
> spec.
> ---
>
> Key: IMPALA-9336
> URL: https://issues.apache.org/jira/browse/IMPALA-9336
> Project: IMPALA
>  Issue Type: Sub-task
>  Components: Docs
>Reporter: Anurag Mantripragada
>Priority: Critical
>
> This Jira tracks the documentation needed for defining primary keys and 
> foreign keys as part of create table in Impala. See IMPALA-2112.






[jira] [Commented] (IMPALA-9336) Impala doc: Document the create table syntax for Primary Key and Foreign Keys spec.

2020-01-28 Thread Anurag Mantripragada (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-9336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17025525#comment-17025525
 ] 

Anurag Mantripragada commented on IMPALA-9336:
--

I was unable to assign it to you. Looks like you are not added to the Impala 
project. Could you please email d...@impala.apache.org requesting 
access? Thanks!

> Impala doc: Document the create table syntax for Primary Key and Foreign Keys 
> spec.
> ---
>
> Key: IMPALA-9336
> URL: https://issues.apache.org/jira/browse/IMPALA-9336
> Project: IMPALA
>  Issue Type: Sub-task
>  Components: Docs
>Reporter: Anurag Mantripragada
>Priority: Critical
>
> This Jira tracks the documentation needed for defining primary keys and 
> foreign keys as part of create table in Impala. See IMPALA-2112.






[jira] [Resolved] (IMPALA-9311) test_show_create_table failed with primary key mismatch

2020-01-28 Thread Anurag Mantripragada (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-9311?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anurag Mantripragada resolved IMPALA-9311.
--
Fix Version/s: Impala 3.4.0
   Resolution: Fixed

> test_show_create_table failed with primary key mismatch
> ---
>
> Key: IMPALA-9311
> URL: https://issues.apache.org/jira/browse/IMPALA-9311
> Project: IMPALA
>  Issue Type: Test
>Affects Versions: Impala 3.4.0
>Reporter: Xiaomeng Zhang
>Assignee: Anurag Mantripragada
>Priority: Critical
>  Labels: broken-build
> Fix For: Impala 3.4.0
>
>
> {code:java}
> Error Message
> metadata/test_show_create_table.py:62: in test_show_create_table
> unique_database)
> metadata/test_show_create_table.py:110: in __run_show_create_table_test_case
> self.__compare_result(expected_result, create_table_result)
> metadata/test_show_create_table.py:146: in __compare_result
> assert expected_sql_filtered == actual_sql_filtered
> E   assert "CREATE EXTER...parent_table'" == "CREATE EXTERN...parent_table'"
> E Skipping 71 identical leading characters in diff, use -v to show
> E Skipping 126 identical trailing characters in diff, use -v to show
> E - MARY KEY (year, id)) ROW FO
> E ?
> E + MARY KEY (id, year)) ROW FO
> E ?
> Stacktrace
> metadata/test_show_create_table.py:62: in test_show_create_table
> unique_database)
> metadata/test_show_create_table.py:110: in __run_show_create_table_test_case
> self.__compare_result(expected_result, create_table_result)
> metadata/test_show_create_table.py:146: in __compare_result
> assert expected_sql_filtered == actual_sql_filtered
> E   assert "CREATE EXTER...parent_table'" == "CREATE EXTERN...parent_table'"
> E Skipping 71 identical leading characters in diff, use -v to show
> E Skipping 126 identical trailing characters in diff, use -v to show
> E - MARY KEY (year, id)) ROW FO
> E ?   
> E + MARY KEY (id, year)) ROW FO
> E ?   {code}
> I think this is due to commit 
> [https://github.com/apache/impala/commit/cfe60858da110cf1256bd3aa5d4f8d374578a33d]






[jira] [Updated] (IMPALA-2112) Support primary key/foreign key constraint as part of create table in Impala

2020-02-18 Thread Anurag Mantripragada (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-2112?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anurag Mantripragada updated IMPALA-2112:
-
Description: 
These would be advisory, i.e., Impala would not attempt to enforce them. However, 
they could be used for cardinality estimation during query planning.

To be compatible with Hive:
 * We neither enforce nor validate integrity constraints. Hence, the DISABLE and 
NOVALIDATE options are mandatory.
 * RELY/NORELY is optional. The CBO is expected to use this information when a 
user specifies “RELY”. The default is NORELY.
 * Since we do not yet have UNIQUE in Hive, the FK must reference a primary 
key column in the parent table.

Support create table syntax like Hive does:
 * {{create table pk(id1 integer, id2 integer, primary key(id1, id2) 
DISABLE NOVALIDATE);}}
 * {{create table fk(id1 integer, id2 integer, foreign key(id1, id2) 
references pk(id2, id1) DISABLE NOVALIDATE);}}
 * {{create table T1(id integer, name string, primary key(id) DISABLE 
NOVALIDATE RELY);}}

  was:
These would be advisory, i.e., Impala would not attempt to enforce them. However, 
they could be used for cardinality estimation during query planning.

To be compatible with Hive:
 * We neither enforce nor validate integrity constraints. Hence, the DISABLE and 
NOVALIDATE options are mandatory.
 * RELY/NORELY is optional. The CBO is expected to use this information when a 
user specifies “RELY”. The default is NORELY.
 * Since we do not yet have UNIQUE in Hive, the FK must reference a primary 
key column in the parent table.

Support create table syntax like Hive does:
 * {{create table pk(id1 integer, id2 integer, primary key(id1, id2) 
DISABLE NOVALIDATE);}}
 * {{create table fk(id1 integer, id2 integer, constraint c1 foreign 
key(id1, id2) references pk(id2, id1) DISABLE NOVALIDATE);}}
 * {{create table T1(id integer, name string, primary key(id) DISABLE 
NOVALIDATE RELY);}}


> Support primary key/foreign key constraint as part of create table in Impala
> 
>
> Key: IMPALA-2112
> URL: https://issues.apache.org/jira/browse/IMPALA-2112
> Project: IMPALA
>  Issue Type: Sub-task
>  Components: Catalog, Frontend
>Affects Versions: Impala 2.2
>Reporter: Marcel Kinard
>Assignee: Anurag Mantripragada
>Priority: Critical
>  Labels: planner
>
> These would be advisory, i.e., Impala would not attempt to enforce them. 
> However, they could be used for cardinality estimation during query planning.
> To be compatible with Hive:
>  * We neither enforce nor validate integrity constraints. Hence, the DISABLE and 
> NOVALIDATE options are mandatory.
>  * RELY/NORELY is optional. The CBO is expected to use this information when 
> a user specifies “RELY”. The default is NORELY.
>  * Since we do not yet have UNIQUE in Hive, the FK must reference a primary 
> key column in the parent table.
> Support create table syntax like Hive does:
>  * {{create table pk(id1 integer, id2 integer, primary key(id1, id2) 
> DISABLE NOVALIDATE);}}
>  * {{create table fk(id1 integer, id2 integer, foreign key(id1, id2) 
> references pk(id2, id1) DISABLE NOVALIDATE);}}
>  * {{create table T1(id integer, name string, primary key(id) DISABLE 
> NOVALIDATE RELY);}}
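
The option rules listed above (DISABLE and NOVALIDATE mandatory, RELY/NORELY optional with NORELY as the default) could be checked roughly as follows. This is only a sketch under stated assumptions: the class name ConstraintOptionsSketch and the method parseRely are hypothetical and are not taken from the Impala code base, and a real implementation would live in the SQL parser rather than operate on a raw string.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

/** Hypothetical helper illustrating the constraint option rules; not Impala code. */
public class ConstraintOptionsSketch {

  /**
   * Validates a whitespace-separated option string such as "DISABLE NOVALIDATE RELY".
   * Returns true if RELY was requested, false otherwise (NORELY is the default).
   * Throws IllegalArgumentException if a mandatory option is missing.
   */
  public static boolean parseRely(String options) {
    Set<String> tokens = new HashSet<>(
        Arrays.asList(options.trim().toUpperCase(Locale.ROOT).split("\\s+")));
    // Constraints are advisory only, so both of these options must be present.
    if (!tokens.contains("DISABLE") || !tokens.contains("NOVALIDATE")) {
      throw new IllegalArgumentException("DISABLE and NOVALIDATE are mandatory");
    }
    // RELY and NORELY are mutually exclusive.
    if (tokens.contains("RELY") && tokens.contains("NORELY")) {
      throw new IllegalArgumentException("RELY and NORELY cannot be combined");
    }
    return tokens.contains("RELY");
  }

  public static void main(String[] args) {
    System.out.println(parseRely("DISABLE NOVALIDATE"));
    System.out.println(parseRely("DISABLE NOVALIDATE RELY"));
  }
}
```

With this sketch, an option string that omits DISABLE or NOVALIDATE is rejected up front instead of being silently accepted as an enforced constraint.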






[jira] [Assigned] (IMPALA-9369) Inserts on large tables could be very slow when event processing is turned on

2020-02-20 Thread Anurag Mantripragada (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-9369?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anurag Mantripragada reassigned IMPALA-9369:


Assignee: Anurag Mantripragada

> Inserts on large tables could be very slow when event processing is turned on
> -
>
> Key: IMPALA-9369
> URL: https://issues.apache.org/jira/browse/IMPALA-9369
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Catalog
>Reporter: Vihang Karajgaonkar
>Assignee: Anurag Mantripragada
>Priority: Critical
>
> In cases where a large number of files are being inserted into a table, the 
> {{createInsertEvents}} method fires insert events to HMS for each partition 
> one at a time. This could be very slow for an insert statement that 
> adds hundreds or thousands of files.
> We should see if we can fire the insert events asynchronously instead of 
> blocking the query from returning to the user until all the insert events are 
> fired.
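
One way the asynchronous approach suggested above might look is sketched below. It is a minimal illustration, not the actual fix: EventSink is a hypothetical stand-in for the HMS client call, fireInsertEventsAsync is an invented name, and the pool size is arbitrary. The idea is simply that the caller submits one task per partition and gets a future back instead of blocking on each event in turn.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

/** Hypothetical sketch of firing per-partition insert events off the query path. */
public class AsyncInsertEvents {

  /** Stand-in for the call that fires a single insert event to HMS (assumption). */
  public interface EventSink {
    void fireInsertEvent(String partitionName);
  }

  /**
   * Submits one task per partition to the pool and returns immediately.
   * The returned future completes once every event has been fired.
   */
  public static CompletableFuture<Void> fireInsertEventsAsync(
      List<String> partitions, EventSink sink, ExecutorService pool) {
    List<CompletableFuture<Void>> futures = new ArrayList<>();
    for (String partition : partitions) {
      futures.add(CompletableFuture.runAsync(() -> sink.fireInsertEvent(partition), pool));
    }
    return CompletableFuture.allOf(futures.toArray(new CompletableFuture[0]));
  }

  public static void main(String[] args) {
    ExecutorService pool = Executors.newFixedThreadPool(4);
    CompletableFuture<Void> done = fireInsertEventsAsync(
        Arrays.asList("year=2009/month=1", "year=2009/month=2"),
        p -> System.out.println("fired insert event for " + p),
        pool);
    // The query could return to the user here; join() is only for the demo.
    done.join();
    pool.shutdown();
  }
}
```

A real change would also need to decide what happens when an event fails after the query has already returned, which this sketch does not address.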






[jira] [Commented] (IMPALA-9372) PartialCatalogInfoTest.testGetSqlConstraints fails in comparison of PK table name

2020-02-11 Thread Anurag Mantripragada (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-9372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17034824#comment-17034824
 ] 

Anurag Mantripragada commented on IMPALA-9372:
--

I will fix this as part of IMPALA-9256.

> PartialCatalogInfoTest.testGetSqlConstraints fails in comparison of PK table 
> name
> -
>
> Key: IMPALA-9372
> URL: https://issues.apache.org/jira/browse/IMPALA-9372
> Project: IMPALA
>  Issue Type: Bug
>  Components: Frontend
>Affects Versions: Impala 3.3.0
>Reporter: Aman Sinha
>Assignee: Anurag Mantripragada
>Priority: Major
>  Labels: build
>
> PartialCatalogInfoTest.testGetSqlConstraints encounters the following failure 
> on a recent build.
> {noformat}
> org.junit.ComparisonFailure: expected: but 
> was:
>   at org.junit.Assert.assertEquals(Assert.java:115)
>   at org.junit.Assert.assertEquals(Assert.java:144)
>   at 
> org.apache.impala.catalog.PartialCatalogInfoTest.testGetSqlConstraints(PartialCatalogInfoTest.java:291)
> {noformat}






[jira] [Comment Edited] (IMPALA-3531) Implement deferrable and optionally enforced PK/FK constraints

2020-02-16 Thread Anurag Mantripragada (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-3531?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17037966#comment-17037966
 ] 

Anurag Mantripragada edited comment on IMPALA-3531 at 2/16/20 8:39 PM:
---

Changing priority to major as the initial DDL support is in place. I will 
continue to work on the rest of the items.


was (Author: anuragmantri):
Changing priority to major as the initial DDL support is in place. I will 
continue to work on the rest of the support.

> Implement deferrable and optionally enforced PK/FK constraints
> --
>
> Key: IMPALA-3531
> URL: https://issues.apache.org/jira/browse/IMPALA-3531
> Project: IMPALA
>  Issue Type: New Feature
>  Components: Catalog, Frontend, Perf Investigation
>Affects Versions: Impala 2.5.0, Impala 2.6.0
> Environment: CDH
>Reporter: Ruslan Dautkhanov
>Assignee: Anurag Mantripragada
>Priority: Major
>  Labels: CBO, performance, ramp-up, sql-language
>
> Oracle has a "RELY NOVALIDATE" option for constraints. It could be easier for 
> Hive to start with something like that for PK/FK constraints, so that the CBO has more 
> information for optimizations. It does not have to actually check whether the 
> constraint relationship is true; it can just "rely" on that constraint.
> https://docs.oracle.com/database/121/SQLRF/clauses002.htm#sthref2289
> So it would be helpful with join cardinality estimates, and with cases like 
> IMPALA-2929.
> https://docs.oracle.com/database/121/DWHSG/schemas.htm#DWHSG9053
> "Overview of Constraint States":
> - Enforcement
> - Validation
> - Belief
> So FK/PK with "rely novalidate" will have Enforcement disabled but 
> Belief = RELY, as is possible in Oracle and now in Hive (HIVE-13076).
> This opens up many additional ways to optimize execution plans, 
> as explained in Tom Kyte's "Metadata matters":
> http://www.peoug.org/wp-content/uploads/2009/12/MetadataMatters_PEOUG_Day2009_TKyte.pdf
> p. 30 - "Tell us how the tables relate and we can remove them from the 
> plan...".
> p. 35 - "Tell us how the tables relate and we have more access paths 
> available...".
> It might also be helpful when Impala is being integrated with Kudu, as the 
> latter has to have a PK.






[jira] [Commented] (IMPALA-3531) Implement deferrable and optionally enforced PK/FK constraints

2020-02-16 Thread Anurag Mantripragada (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-3531?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17037966#comment-17037966
 ] 

Anurag Mantripragada commented on IMPALA-3531:
--

Changing priority to major as the initial DDL support is in place. I will 
continue to work on the rest of the support.

> Implement deferrable and optionally enforced PK/FK constraints
> --
>
> Key: IMPALA-3531
> URL: https://issues.apache.org/jira/browse/IMPALA-3531
> Project: IMPALA
>  Issue Type: New Feature
>  Components: Catalog, Frontend, Perf Investigation
>Affects Versions: Impala 2.5.0, Impala 2.6.0
> Environment: CDH
>Reporter: Ruslan Dautkhanov
>Assignee: Anurag Mantripragada
>Priority: Major
>  Labels: CBO, performance, ramp-up, sql-language
>
> Oracle has a "RELY NOVALIDATE" option for constraints. It could be easier for 
> Hive to start with something like that for PK/FK constraints, so that the CBO has more 
> information for optimizations. It does not have to actually check whether the 
> constraint relationship is true; it can just "rely" on that constraint.
> https://docs.oracle.com/database/121/SQLRF/clauses002.htm#sthref2289
> So it would be helpful with join cardinality estimates, and with cases like 
> IMPALA-2929.
> https://docs.oracle.com/database/121/DWHSG/schemas.htm#DWHSG9053
> "Overview of Constraint States":
> - Enforcement
> - Validation
> - Belief
> So FK/PK with "rely novalidate" will have Enforcement disabled but 
> Belief = RELY, as is possible in Oracle and now in Hive (HIVE-13076).
> This opens up many additional ways to optimize execution plans, 
> as explained in Tom Kyte's "Metadata matters":
> http://www.peoug.org/wp-content/uploads/2009/12/MetadataMatters_PEOUG_Day2009_TKyte.pdf
> p. 30 - "Tell us how the tables relate and we can remove them from the 
> plan...".
> p. 35 - "Tell us how the tables relate and we have more access paths 
> available...".
> It might also be helpful when Impala is being integrated with Kudu, as the 
> latter has to have a PK.






[jira] [Resolved] (IMPALA-9372) PartialCatalogInfoTest.testGetSqlConstraints fails in comparison of PK table name

2020-02-16 Thread Anurag Mantripragada (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-9372?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anurag Mantripragada resolved IMPALA-9372.
--
Fix Version/s: Impala 3.4.0
   Resolution: Fixed

Fixed as part of IMPALA-9256

> PartialCatalogInfoTest.testGetSqlConstraints fails in comparison of PK table 
> name
> -
>
> Key: IMPALA-9372
> URL: https://issues.apache.org/jira/browse/IMPALA-9372
> Project: IMPALA
>  Issue Type: Bug
>  Components: Frontend
>Affects Versions: Impala 3.3.0
>Reporter: Aman Sinha
>Assignee: Anurag Mantripragada
>Priority: Major
>  Labels: build
> Fix For: Impala 3.4.0
>
>
> PartialCatalogInfoTest.testGetSqlConstraints encounters the following failure 
> on a recent build.
> {noformat}
> org.junit.ComparisonFailure: expected: but 
> was:
>   at org.junit.Assert.assertEquals(Assert.java:115)
>   at org.junit.Assert.assertEquals(Assert.java:144)
>   at 
> org.apache.impala.catalog.PartialCatalogInfoTest.testGetSqlConstraints(PartialCatalogInfoTest.java:291)
> {noformat}






[jira] [Resolved] (IMPALA-9256) Refactor constraint information into a separate class.

2020-02-16 Thread Anurag Mantripragada (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-9256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anurag Mantripragada resolved IMPALA-9256.
--
Fix Version/s: Impala 3.4.0
   Resolution: Fixed

> Refactor constraint information into a separate class.
> --
>
> Key: IMPALA-9256
> URL: https://issues.apache.org/jira/browse/IMPALA-9256
> Project: IMPALA
>  Issue Type: Sub-task
>  Components: Frontend
>Reporter: Anurag Mantripragada
>Assignee: Anurag Mantripragada
>Priority: Major
> Fix For: Impala 3.4.0
>
>
> We recently added support for primary key and foreign key information for 
> tables. However, it is cleaner to have an SQLConstraint class as a container 
> for these constraints, just like Hive does here: 
> https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/ddl/table/constraint/Constraints.java
> This can be extended to support other kinds of constraints in the future.
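
A container of the kind described above might look roughly like the sketch below. The class name SqlConstraintsSketch and its fields are hypothetical, and constraints are represented as plain strings purely for illustration; the actual Impala class and the Hive class linked above may be shaped quite differently.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Hypothetical immutable container grouping a table's constraint metadata. */
public final class SqlConstraintsSketch {
  // Plain strings stand in for richer constraint objects (assumption for brevity).
  private final List<String> primaryKeys;
  private final List<String> foreignKeys;

  public SqlConstraintsSketch(List<String> primaryKeys, List<String> foreignKeys) {
    // Defensive copies keep the container independent of the caller's lists.
    this.primaryKeys = Collections.unmodifiableList(new ArrayList<>(primaryKeys));
    this.foreignKeys = Collections.unmodifiableList(new ArrayList<>(foreignKeys));
  }

  public List<String> getPrimaryKeys() { return primaryKeys; }

  public List<String> getForeignKeys() { return foreignKeys; }

  public static void main(String[] args) {
    SqlConstraintsSketch constraints = new SqlConstraintsSketch(
        Arrays.asList("id1", "id2"), Collections.<String>emptyList());
    System.out.println(constraints.getPrimaryKeys());
  }
}
```

Keeping the constraint lists behind one type means that supporting another kind of constraint later would only require a new field and accessor here, rather than a new parameter on every method that passes table metadata around.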






[jira] [Updated] (IMPALA-7910) COMPUTE STATS does an unnecessary REFRESH after writing to the Metastore

2020-02-16 Thread Anurag Mantripragada (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7910?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anurag Mantripragada updated IMPALA-7910:
-
Priority: Major  (was: Critical)

> COMPUTE STATS does an unnecessary REFRESH after writing to the Metastore
> 
>
> Key: IMPALA-7910
> URL: https://issues.apache.org/jira/browse/IMPALA-7910
> Project: IMPALA
>  Issue Type: Bug
>  Components: Catalog
>Affects Versions: Impala 2.9.0, Impala 2.11.0, Impala 2.12.0
>Reporter: Michael Brown
>Assignee: Anurag Mantripragada
>Priority: Major
>
> COMPUTE STATS and possibly other DDL operations unnecessarily do the 
> equivalent of a REFRESH after writing to the Hive Metastore. This unnecessary 
> operation can be very expensive, so should be avoided.
> The behavior can be confirmed from the catalogd logs:
> {code}
> compute stats functional_parquet.alltypes;
> +---+
> | summary   |
> +---+
> | Updated 24 partition(s) and 11 column(s). |
> +---+
> Relevant catalogd.INFO snippet
> I0413 14:40:24.210749 27295 HdfsTable.java:1263] Incrementally loading table 
> metadata for: functional_parquet.alltypes
> I0413 14:40:24.242122 27295 HdfsTable.java:555] Refreshed file metadata for 
> functional_parquet.alltypes Path: 
> hdfs://localhost:20500/test-warehouse/alltypes_parquet/year=2009/month=1: 
> Loaded files: 1 Hidden files: 0 Skipped files: 0 Unknown diskIDs: 0
> I0413 14:40:24.244634 27295 HdfsTable.java:555] Refreshed file metadata for 
> functional_parquet.alltypes Path: 
> hdfs://localhost:20500/test-warehouse/alltypes_parquet/year=2009/month=10: 
> Loaded files: 1 Hidden files: 0 Skipped files: 0 Unknown diskIDs: 0
> I0413 14:40:24.247174 27295 HdfsTable.java:555] Refreshed file metadata for 
> functional_parquet.alltypes Path: 
> hdfs://localhost:20500/test-warehouse/alltypes_parquet/year=2009/month=11: 
> Loaded files: 1 Hidden files: 0 Skipped files: 0 Unknown diskIDs: 0
> I0413 14:40:24.249713 27295 HdfsTable.java:555] Refreshed file metadata for 
> functional_parquet.alltypes Path: 
> hdfs://localhost:20500/test-warehouse/alltypes_parquet/year=2009/month=12: 
> Loaded files: 1 Hidden files: 0 Skipped files: 0 Unknown diskIDs: 0
> I0413 14:40:24.252288 27295 HdfsTable.java:555] Refreshed file metadata for 
> functional_parquet.alltypes Path: 
> hdfs://localhost:20500/test-warehouse/alltypes_parquet/year=2009/month=2: 
> Loaded files: 1 Hidden files: 0 Skipped files: 0 Unknown diskIDs: 0
> I0413 14:40:24.254629 27295 HdfsTable.java:555] Refreshed file metadata for 
> functional_parquet.alltypes Path: 
> hdfs://localhost:20500/test-warehouse/alltypes_parquet/year=2009/month=3: 
> Loaded files: 1 Hidden files: 0 Skipped files: 0 Unknown diskIDs: 0
> I0413 14:40:24.256991 27295 HdfsTable.java:555] Refreshed file metadata for 
> functional_parquet.alltypes Path: 
> hdfs://localhost:20500/test-warehouse/alltypes_parquet/year=2009/month=4: 
> Loaded files: 1 Hidden files: 0 Skipped files: 0 Unknown diskIDs: 0
> I0413 14:40:24.259464 27295 HdfsTable.java:555] Refreshed file metadata for 
> functional_parquet.alltypes Path: 
> hdfs://localhost:20500/test-warehouse/alltypes_parquet/year=2009/month=5: 
> Loaded files: 1 Hidden files: 0 Skipped files: 0 Unknown diskIDs: 0
> I0413 14:40:24.262197 27295 HdfsTable.java:555] Refreshed file metadata for 
> functional_parquet.alltypes Path: 
> hdfs://localhost:20500/test-warehouse/alltypes_parquet/year=2009/month=6: 
> Loaded files: 1 Hidden files: 0 Skipped files: 0 Unknown diskIDs: 0
> I0413 14:40:24.264463 27295 HdfsTable.java:555] Refreshed file metadata for 
> functional_parquet.alltypes Path: 
> hdfs://localhost:20500/test-warehouse/alltypes_parquet/year=2009/month=7: 
> Loaded files: 1 Hidden files: 0 Skipped files: 0 Unknown diskIDs: 0
> I0413 14:40:24.266736 27295 HdfsTable.java:555] Refreshed file metadata for 
> functional_parquet.alltypes Path: 
> hdfs://localhost:20500/test-warehouse/alltypes_parquet/year=2009/month=8: 
> Loaded files: 1 Hidden files: 0 Skipped files: 0 Unknown diskIDs: 0
> I0413 14:40:24.269210 27295 HdfsTable.java:555] Refreshed file metadata for 
> functional_parquet.alltypes Path: 
> hdfs://localhost:20500/test-warehouse/alltypes_parquet/year=2009/month=9: 
> Loaded files: 1 Hidden files: 0 Skipped files: 0 Unknown diskIDs: 0
> I0413 14:40:24.271800 27295 HdfsTable.java:555] Refreshed file metadata for 
> functional_parquet.alltypes Path: 
> hdfs://localhost:20500/test-warehouse/alltypes_parquet/year=2010/month=1: 
> Loaded files: 1 Hidden files: 0 Skipped files: 0 Unknown diskIDs: 0
> I0413 14:40:24.274348 27295 HdfsTable.java:555] Refreshed file metadata for 
> 

[jira] [Commented] (IMPALA-8877) CatalogException during stress test: Table modified while operation was in progress

2020-02-16 Thread Anurag Mantripragada (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-8877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17037965#comment-17037965
 ] 

Anurag Mantripragada commented on IMPALA-8877:
--

[~vihangk1], IIRC, you were working on this. Assigning it back to you. Feel 
free to reassign.

> CatalogException during stress test: Table  modified while operation was 
> in progress
> -
>
> Key: IMPALA-8877
> URL: https://issues.apache.org/jira/browse/IMPALA-8877
> Project: IMPALA
>  Issue Type: Bug
>  Components: Catalog
>Affects Versions: Impala 3.3.0
>Reporter: David Knupp
>Assignee: Vihang Karajgaonkar
>Priority: Critical
>  Labels: catalog-v2
> Attachments: catalogd.INFO.tar.gz, impalad.INFO.tar.gz
>
>
> This was hit while running the stress tests to get a baseline on a deployed 
> cluster.
> /* Mem: 12850 MB. Coordinator: quasar-mzmnbe-6.vpc.cloudera.com. */
> COMPUTE STATS catalog_sales
> {noformat}
> Query (id=924a50178a5a6146:29d58a73)
>   Summary
> Session ID: 5543fb9029e2b71f:f446381b1f59ed81
> Session Type: HIVESERVER2
> HiveServer2 Protocol Version: V6
> Start Time: 2019-08-19 01:26:07.292866000
> End Time: 2019-08-19 01:26:27.248053000
> Query Type: DDL
> Query State: EXCEPTION
> Query Status: CatalogException: Table 
> 'tpcds_300_decimal_parquet.catalog_sales' was modified while operation was in 
> progress, aborting execution.
> Impala Version: impalad version 3.3.0-SNAPSHOT RELEASE (build 
> df3e7c051e2641524fc53a0cd07c2a14decd55f7)
> User: syst...@vpc.cloudera.com
> Connected User: syst...@vpc.cloudera.com
> Delegated User: 
> Network Address: :::10.65.6.19:39174
> Default Db: tpcds_300_decimal_parquet
> Sql Statement: /* Mem: 12850 MB. Coordinator: 
> quasar-mzmnbe-6.vpc.cloudera.com. */
> COMPUTE STATS catalog_sales
> Coordinator: quasar-mzmnbe-6.vpc.cloudera.com:22000
> Query Options (set by configuration): 
> ABORT_ON_ERROR=1,MEM_LIMIT=13474201600,MT_DOP=4,EXEC_TIME_LIMIT_S=2147483647,TIMEZONE=America/Los_Angeles,DEFAULT_FILE_FORMAT=4,DEFAULT_TRANSACTIONAL_TYPE=1
> Query Options (set by configuration and planner): 
> ABORT_ON_ERROR=1,MEM_LIMIT=13474201600,MT_DOP=4,EXEC_TIME_LIMIT_S=2147483647,TIMEZONE=America/Los_Angeles,DEFAULT_FILE_FORMAT=4,DEFAULT_TRANSACTIONAL_TYPE=1
> DDL Type: COMPUTE_STATS
> Query Compilation
>   Metadata of all 1 tables cached: 5.62s (5622372318)
>   Analysis finished: 5.62s (5622560027)
>   Authorization finished (noop): 5.62s (5622568284)
>   Retried query planning due to inconsistent metadata 7 of 40 times: 
> Catalog object TCatalogObject(type:TABLE, catalog_version:94204, 
> table:TTable(db_name:tpcds_300_decimal_parquet, tbl_name:catalog_sales)) 
> changed version between accesses.: 5.95s (5949859598)
>   Planning finished: 5.95s (5949861145)
> Query Timeline
>   Query submitted: 0ns (0)
>   Planning finished: 5.95s (5950024020)
>   Child queries finished: 17.85s (17849072057)
>   Rows available: 19.82s (19825080035)
>   Unregister query: 19.95s (19955080560)
> Frontend
>   - CatalogFetch.ColumnStats.Misses: 34 (34)
>   - CatalogFetch.ColumnStats.Requests: 34 (34)
>   - CatalogFetch.ColumnStats.Time: 0 (0)
>   - CatalogFetch.Config.Hits: 1 (1)
>   - CatalogFetch.Config.Requests: 1 (1)
>   - CatalogFetch.Config.Time: 0 (0)
>   - CatalogFetch.DatabaseList.Hits: 8 (8)
>   - CatalogFetch.DatabaseList.Requests: 8 (8)
>   - CatalogFetch.DatabaseList.Time: 0 (0)
>   - CatalogFetch.PartitionLists.Misses: 1 (1)
>   - CatalogFetch.PartitionLists.Requests: 1 (1)
>   - CatalogFetch.PartitionLists.Time: 7 (7)
>   - CatalogFetch.Partitions.Hits: 1837 (1837)
>   - CatalogFetch.Partitions.Misses: 1837 (1837)
>   - CatalogFetch.Partitions.Requests: 3674 (3674)
>   - CatalogFetch.Partitions.Time: 325 (325)
>   - CatalogFetch.RPCs.Bytes: 4.7 MiB (4936030)
>   - CatalogFetch.RPCs.Requests: 22 (22)
>   - CatalogFetch.RPCs.Time: 343 (343)
>   - CatalogFetch.TableNames.Hits: 4 (4)
>   - CatalogFetch.TableNames.Misses: 4 (4)
>   - CatalogFetch.TableNames.Requests: 8 (8)
>   - CatalogFetch.TableNames.Time: 0 (0)
>   - CatalogFetch.Tables.Misses: 8 (8)
>   - CatalogFetch.Tables.Requests: 8 (8)
>   - CatalogFetch.Tables.Time: 74 (74)
>   - InactiveTotalTime: 0ns (0)
>   - TotalTime: 0ns (0)
>   ImpalaServer
> - CatalogOpExecTimer: 1.97s (1972007962)
> - ClientFetchWaitTimer: 0ns (0)
> - InactiveTotalTime: 0ns (0)
> - RowMaterializationTimer: 0ns (0)
> - TotalTime: 0ns (0)
>   Child Queries
> Table Stats Query (id=db4821e4aa5bb04d:d4a5ae45)
>   

[jira] [Updated] (IMPALA-3531) Implement deferrable and optionally enforced PK/FK constraints

2020-02-16 Thread Anurag Mantripragada (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-3531?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anurag Mantripragada updated IMPALA-3531:
-
Priority: Major  (was: Critical)

> Implement deferrable and optionally enforced PK/FK constraints
> --
>
> Key: IMPALA-3531
> URL: https://issues.apache.org/jira/browse/IMPALA-3531
> Project: IMPALA
>  Issue Type: New Feature
>  Components: Catalog, Frontend, Perf Investigation
>Affects Versions: Impala 2.5.0, Impala 2.6.0
> Environment: CDH
>Reporter: Ruslan Dautkhanov
>Assignee: Anurag Mantripragada
>Priority: Major
>  Labels: CBO, performance, ramp-up, sql-language
>
> Oracle has a "RELY NOVALIDATE" option for constraints. It could be easier for 
> Hive to start with something like that for PK/FK constraints, so that the CBO has 
> more information for optimizations. It does not have to actually check whether 
> the constraint relationship is true; it can just "rely" on that constraint.
> https://docs.oracle.com/database/121/SQLRF/clauses002.htm#sthref2289
> So it would be helpful with join cardinality estimates, and with cases like 
> IMPALA-2929.
> https://docs.oracle.com/database/121/DWHSG/schemas.htm#DWHSG9053
> "Overview of Constraint States":
> - Enforcement
> - Validation
> - Belief
> So FK/PK with "rely novalidate" will have Enforcement disabled but 
> Belief = RELY as it is possible to do in Oracle and now in Hive (HIVE-13076).
> This opens up many additional ways to optimize execution plans,
> as explained in Tom Kyte's "Metadata matters":
> http://www.peoug.org/wp-content/uploads/2009/12/MetadataMatters_PEOUG_Day2009_TKyte.pdf
> pp.30 - "Tell us how the tables relate and we can remove them from the 
> plan...".
> pp.35 - "Tell us how the tables relate and we have more access paths 
> available...".
> Also it might be helpful when Impala is being integrated with Kudu as the 
> latter has to have a PK.






[jira] [Assigned] (IMPALA-8877) CatalogException during stress test: Table modified while operation was in progress

2020-02-16 Thread Anurag Mantripragada (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8877?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anurag Mantripragada reassigned IMPALA-8877:


Assignee: Vihang Karajgaonkar  (was: Anurag Mantripragada)

> CatalogException during stress test: Table  modified while operation was 
> in progress
> -
>
> Key: IMPALA-8877
> URL: https://issues.apache.org/jira/browse/IMPALA-8877
> Project: IMPALA
>  Issue Type: Bug
>  Components: Catalog
>Affects Versions: Impala 3.3.0
>Reporter: David Knupp
>Assignee: Vihang Karajgaonkar
>Priority: Critical
>  Labels: catalog-v2
> Attachments: catalogd.INFO.tar.gz, impalad.INFO.tar.gz
>
>
> This was hit while running the stress tests to get a baseline on a deployed 
> cluster.
> /* Mem: 12850 MB. Coordinator: quasar-mzmnbe-6.vpc.cloudera.com. */
> COMPUTE STATS catalog_sales
> {noformat}
> Query (id=924a50178a5a6146:29d58a73)
>   Summary
> Session ID: 5543fb9029e2b71f:f446381b1f59ed81
> Session Type: HIVESERVER2
> HiveServer2 Protocol Version: V6
> Start Time: 2019-08-19 01:26:07.292866000
> End Time: 2019-08-19 01:26:27.248053000
> Query Type: DDL
> Query State: EXCEPTION
> Query Status: CatalogException: Table 
> 'tpcds_300_decimal_parquet.catalog_sales' was modified while operation was in 
> progress, aborting execution.
> Impala Version: impalad version 3.3.0-SNAPSHOT RELEASE (build 
> df3e7c051e2641524fc53a0cd07c2a14decd55f7)
> User: syst...@vpc.cloudera.com
> Connected User: syst...@vpc.cloudera.com
> Delegated User: 
> Network Address: :::10.65.6.19:39174
> Default Db: tpcds_300_decimal_parquet
> Sql Statement: /* Mem: 12850 MB. Coordinator: 
> quasar-mzmnbe-6.vpc.cloudera.com. */
> COMPUTE STATS catalog_sales
> Coordinator: quasar-mzmnbe-6.vpc.cloudera.com:22000
> Query Options (set by configuration): 
> ABORT_ON_ERROR=1,MEM_LIMIT=13474201600,MT_DOP=4,EXEC_TIME_LIMIT_S=2147483647,TIMEZONE=America/Los_Angeles,DEFAULT_FILE_FORMAT=4,DEFAULT_TRANSACTIONAL_TYPE=1
> Query Options (set by configuration and planner): 
> ABORT_ON_ERROR=1,MEM_LIMIT=13474201600,MT_DOP=4,EXEC_TIME_LIMIT_S=2147483647,TIMEZONE=America/Los_Angeles,DEFAULT_FILE_FORMAT=4,DEFAULT_TRANSACTIONAL_TYPE=1
> DDL Type: COMPUTE_STATS
> Query Compilation
>   Metadata of all 1 tables cached: 5.62s (5622372318)
>   Analysis finished: 5.62s (5622560027)
>   Authorization finished (noop): 5.62s (5622568284)
>   Retried query planning due to inconsistent metadata 7 of 40 times: 
> Catalog object TCatalogObject(type:TABLE, catalog_version:94204, 
> table:TTable(db_name:tpcds_300_decimal_parquet, tbl_name:catalog_sales)) 
> changed version between accesses.: 5.95s (5949859598)
>   Planning finished: 5.95s (5949861145)
> Query Timeline
>   Query submitted: 0ns (0)
>   Planning finished: 5.95s (5950024020)
>   Child queries finished: 17.85s (17849072057)
>   Rows available: 19.82s (19825080035)
>   Unregister query: 19.95s (19955080560)
> Frontend
>   - CatalogFetch.ColumnStats.Misses: 34 (34)
>   - CatalogFetch.ColumnStats.Requests: 34 (34)
>   - CatalogFetch.ColumnStats.Time: 0 (0)
>   - CatalogFetch.Config.Hits: 1 (1)
>   - CatalogFetch.Config.Requests: 1 (1)
>   - CatalogFetch.Config.Time: 0 (0)
>   - CatalogFetch.DatabaseList.Hits: 8 (8)
>   - CatalogFetch.DatabaseList.Requests: 8 (8)
>   - CatalogFetch.DatabaseList.Time: 0 (0)
>   - CatalogFetch.PartitionLists.Misses: 1 (1)
>   - CatalogFetch.PartitionLists.Requests: 1 (1)
>   - CatalogFetch.PartitionLists.Time: 7 (7)
>   - CatalogFetch.Partitions.Hits: 1837 (1837)
>   - CatalogFetch.Partitions.Misses: 1837 (1837)
>   - CatalogFetch.Partitions.Requests: 3674 (3674)
>   - CatalogFetch.Partitions.Time: 325 (325)
>   - CatalogFetch.RPCs.Bytes: 4.7 MiB (4936030)
>   - CatalogFetch.RPCs.Requests: 22 (22)
>   - CatalogFetch.RPCs.Time: 343 (343)
>   - CatalogFetch.TableNames.Hits: 4 (4)
>   - CatalogFetch.TableNames.Misses: 4 (4)
>   - CatalogFetch.TableNames.Requests: 8 (8)
>   - CatalogFetch.TableNames.Time: 0 (0)
>   - CatalogFetch.Tables.Misses: 8 (8)
>   - CatalogFetch.Tables.Requests: 8 (8)
>   - CatalogFetch.Tables.Time: 74 (74)
>   - InactiveTotalTime: 0ns (0)
>   - TotalTime: 0ns (0)
>   ImpalaServer
> - CatalogOpExecTimer: 1.97s (1972007962)
> - ClientFetchWaitTimer: 0ns (0)
> - InactiveTotalTime: 0ns (0)
> - RowMaterializationTimer: 0ns (0)
> - TotalTime: 0ns (0)
>   Child Queries
> Table Stats Query (id=db4821e4aa5bb04d:d4a5ae45)
> Column Stats Query (id=0444367557e3496d:f9435111)
> 

[jira] [Commented] (IMPALA-9096) Create external table ddls should send column lineages.

2020-02-12 Thread Anurag Mantripragada (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-9096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17035815#comment-17035815
 ] 

Anurag Mantripragada commented on IMPALA-9096:
--

This is not very critical at the moment. Changed the priority to Minor.

> Create external table ddls should send column lineages.
> ---
>
> Key: IMPALA-9096
> URL: https://issues.apache.org/jira/browse/IMPALA-9096
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Frontend
>Reporter: Anurag Mantripragada
>Priority: Minor
>
> Create external table with specified columns should create column lineages 
> for tools like Atlas to consume.
>  
> For example:
> create EXTERNAL TABLE IF NOT EXISTS friday_ext6
> (STUD_ID int,
> DEPT_ID int,
> NAME string
> )
> ROW FORMAT DELIMITED
> FIELDS TERMINATED BY ','
> STORED AS TEXTFILE;
> Currently we send a lineage like:
> {code:java}
>  {
>  "queryText":"create EXTERNAL TABLE IF NOT EXISTS friday_ext5 (STUD_ID int, 
> DEPT_IDint, NAME string ) ROW FORMAT DELIMITED FIELDS TERMINATED BY ‘,’ 
> STORED AS TEXTFILE LOCATION 
> ‘/warehouse/tablespace/external/hive/testdb.db/friday_ext5’",
>  "queryId":"4b471ac0ca2b0f93:029db79c",
>  "hash":"867fae20bc6c8254c05774cc923a99fa",
>  "user":"admin",
>  "timestamp":1572028716,
>  "endTime":1572028716,
>  "edges":[],
>  "vertices":[],
>  
> "tableLocation":"hdfs://sid-cdp-2-1.gce.cloudera.com:8020/warehouse/tablespace/external/hive/testdb.db/friday_ext"
> }
>  {code}
> Atlas needs the fully qualified table name to create lineage.
>  
>  
>  






[jira] [Updated] (IMPALA-9096) Create external table ddls should send column lineages.

2020-02-12 Thread Anurag Mantripragada (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-9096?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anurag Mantripragada updated IMPALA-9096:
-
Description: 
Create external table with specified columns should create column lineages for 
tools like Atlas to consume.

 

For example:

create EXTERNAL TABLE IF NOT EXISTS friday_ext6
 (STUD_ID int,
 DEPT_ID int,
 NAME string
 )
 ROW FORMAT DELIMITED
 FIELDS TERMINATED BY ','
 STORED AS TEXTFILE;

Currently we send a lineage like:
{code:java}
 {
 "queryText":"create EXTERNAL TABLE IF NOT EXISTS friday_ext5 (STUD_ID int, 
DEPT_IDint, NAME string ) ROW FORMAT DELIMITED FIELDS TERMINATED BY ‘,’ 
STORED AS TEXTFILE LOCATION 
‘/warehouse/tablespace/external/hive/testdb.db/friday_ext5’",
 "queryId":"4b471ac0ca2b0f93:029db79c",
 "hash":"867fae20bc6c8254c05774cc923a99fa",
 "user":"admin",
 "timestamp":1572028716,
 "endTime":1572028716,
 "edges":[],
 "vertices":[],
 
"tableLocation":"hdfs://sid-cdp-2-1.gce.cloudera.com:8020/warehouse/tablespace/external/hive/testdb.db/friday_ext"
}
 {code}
Atlas needs the fully qualified table name to create lineage. 

  was:
Create external table with specified columns should create column lineages for 
tools like Altas to consume.

 

For example:

create EXTERNAL TABLE IF NOT EXISTS friday_ext6
(STUD_ID int,
DEPT_ID int,
NAME string
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ‘,’
STORED AS TEXTFILE;

Currently we send a lineage like:
{code:java}
 {
 "queryText":"create EXTERNAL TABLE IF NOT EXISTS friday_ext5 (STUD_ID int, 
DEPT_IDint, NAME string ) ROW FORMAT DELIMITED FIELDS TERMINATED BY ‘,’ 
STORED AS TEXTFILE LOCATION 
‘/warehouse/tablespace/external/hive/testdb.db/friday_ext5’",
 "queryId":"4b471ac0ca2b0f93:029db79c",
 "hash":"867fae20bc6c8254c05774cc923a99fa",
 "user":"admin",
 "timestamp":1572028716,
 "endTime":1572028716,
 "edges":[],
 "vertices":[],
 
"tableLocation":"hdfs://sid-cdp-2-1.gce.cloudera.com:8020/warehouse/tablespace/external/hive/testdb.db/friday_ext"
}
 {code}
Atlas needs fully qualified table name to create lineage.

 

 

 


> Create external table ddls should send column lineages.
> ---
>
> Key: IMPALA-9096
> URL: https://issues.apache.org/jira/browse/IMPALA-9096
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Frontend
>Reporter: Anurag Mantripragada
>Priority: Minor
>
> Create external table with specified columns should create column lineages 
> for tools like Atlas to consume.
>  
> For example:
> create EXTERNAL TABLE IF NOT EXISTS friday_ext6
>  (STUD_ID int,
>  DEPT_ID int,
>  NAME string
>  )
>  ROW FORMAT DELIMITED
>  FIELDS TERMINATED BY ','
>  STORED AS TEXTFILE;
> Currently we send a lineage like:
> {code:java}
>  {
>  "queryText":"create EXTERNAL TABLE IF NOT EXISTS friday_ext5 (STUD_ID int, 
> DEPT_IDint, NAME string ) ROW FORMAT DELIMITED FIELDS TERMINATED BY ‘,’ 
> STORED AS TEXTFILE LOCATION 
> ‘/warehouse/tablespace/external/hive/testdb.db/friday_ext5’",
>  "queryId":"4b471ac0ca2b0f93:029db79c",
>  "hash":"867fae20bc6c8254c05774cc923a99fa",
>  "user":"admin",
>  "timestamp":1572028716,
>  "endTime":1572028716,
>  "edges":[],
>  "vertices":[],
>  
> "tableLocation":"hdfs://sid-cdp-2-1.gce.cloudera.com:8020/warehouse/tablespace/external/hive/testdb.db/friday_ext"
> }
>  {code}
> Atlas needs the fully qualified table name to create lineage. 
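
For illustration only, the column vertices that are missing from the lineage record quoted above could be derived from the fully qualified table name roughly as follows. This is a sketch, not Impala's implementation: the function name `column_vertices` is hypothetical, the database name `testdb` is inferred from the table location, and the vertex fields (`id`, `vertexType`, `vertexId`) are an assumption about the lineage record layout rather than something stated in this issue.

```python
import json


def column_vertices(db_name, table_name, columns):
    """Builds one vertex per column, identified as db.table.column.

    Hypothetical helper; the field names below are assumed, not confirmed
    by the issue text.
    """
    return [
        {
            "id": index,
            "vertexType": "COLUMN",
            "vertexId": "{}.{}.{}".format(db_name, table_name, column),
        }
        for index, column in enumerate(columns)
    ]


# Columns taken from the CREATE EXTERNAL TABLE example in the description.
vertices = column_vertices("testdb", "friday_ext5", ["stud_id", "dept_id", "name"])
print(json.dumps(vertices, indent=1))
```

The point of the sketch is that each vertex identifier carries the database and table name, which is what a consumer such as Atlas would need to attach the columns to the right table.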






[jira] [Resolved] (IMPALA-9336) Impala doc: Document the create table syntax for Primary Key and Foreign Keys spec.

2020-02-12 Thread Anurag Mantripragada (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-9336?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anurag Mantripragada resolved IMPALA-9336.
--
Fix Version/s: Impala 3.4.0
   Resolution: Fixed

> Impala doc: Document the create table syntax for Primary Key and Foreign Keys 
> spec.
> ---
>
> Key: IMPALA-9336
> URL: https://issues.apache.org/jira/browse/IMPALA-9336
> Project: IMPALA
>  Issue Type: Sub-task
>  Components: Docs
>Reporter: Anurag Mantripragada
>Assignee: Kris Hahn
>Priority: Critical
> Fix For: Impala 3.4.0
>
>
> This Jira tracks the documentation needed for defining primary keys and 
> foreign keys as part of create table in Impala. See IMPALA-2112.






[jira] [Work started] (IMPALA-9256) Refactor constraint information into a separate class.

2020-02-12 Thread Anurag Mantripragada (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-9256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on IMPALA-9256 started by Anurag Mantripragada.

> Refactor constraint information into a separate class.
> --
>
> Key: IMPALA-9256
> URL: https://issues.apache.org/jira/browse/IMPALA-9256
> Project: IMPALA
>  Issue Type: Sub-task
>  Components: Frontend
>Reporter: Anurag Mantripragada
>Assignee: Anurag Mantripragada
>Priority: Major
>
> We recently added support for primary key and foreign key information for 
> tables. However, it would be cleaner to have an SQLConstraint class as a 
> container for these constraints, just as Hive does here: 
> https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/ddl/table/constraint/Constraints.java
> This can be extended to support other kinds of constraints in the future.
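
As a rough illustration of the container idea described above (written in Python only for brevity; the actual refactoring would live in the Java frontend), the sketch below groups primary and foreign keys in one object. All class and field names are hypothetical and do not reflect Impala's or Hive's actual API.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class PrimaryKey:
    # Hypothetical fields: the owning table plus its key columns.
    table: str
    columns: List[str]


@dataclass
class ForeignKey:
    # Hypothetical fields: child columns that reference parent columns.
    table: str
    columns: List[str]
    parent_table: str
    parent_columns: List[str]


@dataclass
class SqlConstraints:
    """Single container for all constraints of a table (illustrative only)."""
    primary_keys: List[PrimaryKey] = field(default_factory=list)
    foreign_keys: List[ForeignKey] = field(default_factory=list)

    def is_empty(self):
        # True when the table has neither primary nor foreign keys.
        return not self.primary_keys and not self.foreign_keys


constraints = SqlConstraints()
constraints.primary_keys.append(PrimaryKey("db.parent", ["id", "year"]))
constraints.foreign_keys.append(
    ForeignKey("db.child", ["pid"], "db.parent", ["id"]))
print(constraints.is_empty())
```

Keeping the constraint lists behind one type means a new constraint kind would only add a field to the container instead of another parameter on every table-loading method.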






[jira] [Updated] (IMPALA-9096) Create external table ddls should send column lineages.

2020-02-12 Thread Anurag Mantripragada (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-9096?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anurag Mantripragada updated IMPALA-9096:
-
Priority: Minor  (was: Critical)

> Create external table ddls should send column lineages.
> ---
>
> Key: IMPALA-9096
> URL: https://issues.apache.org/jira/browse/IMPALA-9096
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Frontend
>Reporter: Anurag Mantripragada
>Priority: Minor
>
> Create external table with specified columns should create column lineages 
> for tools like Atlas to consume.
>  
> For example:
> create EXTERNAL TABLE IF NOT EXISTS friday_ext6
> (STUD_ID int,
> DEPT_ID int,
> NAME string
> )
> ROW FORMAT DELIMITED
> FIELDS TERMINATED BY ','
> STORED AS TEXTFILE;
> Currently we send a lineage like:
> {code:java}
>  {
>  "queryText":"create EXTERNAL TABLE IF NOT EXISTS friday_ext5 (STUD_ID int, 
> DEPT_IDint, NAME string ) ROW FORMAT DELIMITED FIELDS TERMINATED BY ‘,’ 
> STORED AS TEXTFILE LOCATION 
> ‘/warehouse/tablespace/external/hive/testdb.db/friday_ext5’",
>  "queryId":"4b471ac0ca2b0f93:029db79c",
>  "hash":"867fae20bc6c8254c05774cc923a99fa",
>  "user":"admin",
>  "timestamp":1572028716,
>  "endTime":1572028716,
>  "edges":[],
>  "vertices":[],
>  
> "tableLocation":"hdfs://sid-cdp-2-1.gce.cloudera.com:8020/warehouse/tablespace/external/hive/testdb.db/friday_ext"
> }
>  {code}
> Atlas needs the fully qualified table name to create lineage.
>  
>  
>  






[jira] [Comment Edited] (IMPALA-9311) test_show_create_table failed with primary key mismatch

2020-01-21 Thread Anurag Mantripragada (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-9311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17020716#comment-17020716
 ] 

Anurag Mantripragada edited comment on IMPALA-9311 at 1/22/20 2:04 AM:
---

Thanks for creating the JIRA. In my local tests, HMS always returned the 
primary keys in the reverse order of their definition. Obviously there is some 
inconsistency here. I will dig deeper into the HMS code to see if there is a 
specific pattern. If not, this could also make a bunch of the catalog tests 
that I wrote for PK/FK flaky.
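
One possible way to keep such a comparison stable while the HMS ordering is investigated is to compare the PRIMARY KEY column list without regard to order. The sketch below is only an illustration: the helper name `normalize_primary_key` and the sample statements are hypothetical and not taken from test_show_create_table.py, and it assumes that column order inside the PRIMARY KEY clause is not what the test intends to verify.

```python
import re

# Hypothetical helper, not part of Impala's test suite: rewrites the column
# list of every PRIMARY KEY clause into sorted order so that two statements
# differing only in key-column order compare equal.
_PK_CLAUSE = re.compile(r"PRIMARY KEY\s*\(([^)]*)\)", re.IGNORECASE)


def normalize_primary_key(sql):
    def _sort_columns(match):
        columns = sorted(col.strip() for col in match.group(1).split(","))
        return "PRIMARY KEY (" + ", ".join(columns) + ")"

    return _PK_CLAUSE.sub(_sort_columns, sql)


expected = "CREATE EXTERNAL TABLE t (id INT, year INT, PRIMARY KEY (id, year))"
actual = "CREATE EXTERNAL TABLE t (id INT, year INT, PRIMARY KEY (year, id))"

# The raw strings differ, but the normalized forms are identical.
print(expected == actual)
print(normalize_primary_key(expected) == normalize_primary_key(actual))
```

This only masks the ordering difference in the test; whether the order returned by HMS should itself be made deterministic is a separate question.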


was (Author: anuragmantri):
Thanks for creating the JIRA. In my local tests, hms always returned the 
primary keys in the reverse order of their definition. Obviously there is some 
inconsistency here. I will dig deeper into the HMS code to see if there is a 
specific pattern if not, this could make a bunch of catalog tests that I wrote 
for PK/FK also flaky.

> test_show_create_table failed with primary key mismatch
> ---
>
> Key: IMPALA-9311
> URL: https://issues.apache.org/jira/browse/IMPALA-9311
> Project: IMPALA
>  Issue Type: Test
>Affects Versions: Impala 3.4.0
>Reporter: Xiaomeng Zhang
>Assignee: Anurag Mantripragada
>Priority: Major
>  Labels: broken-build
>
> {code:java}
> Error Message
> metadata/test_show_create_table.py:62: in test_show_create_table
> unique_database)
> metadata/test_show_create_table.py:110: in __run_show_create_table_test_case
> self.__compare_result(expected_result, create_table_result)
> metadata/test_show_create_table.py:146: in __compare_result
> assert expected_sql_filtered == actual_sql_filtered
> E   assert "CREATE EXTER...parent_table'" == "CREATE EXTERN...parent_table'"
> E Skipping 71 identical leading characters in diff, use -v to show
> E Skipping 126 identical trailing characters in diff, use -v to show
> E - MARY KEY (year, id)) ROW FO
> E ?
> E + MARY KEY (id, year)) ROW FO
> E ?
> Stacktrace
> metadata/test_show_create_table.py:62: in test_show_create_table
> unique_database)
> metadata/test_show_create_table.py:110: in __run_show_create_table_test_case
> self.__compare_result(expected_result, create_table_result)
> metadata/test_show_create_table.py:146: in __compare_result
> assert expected_sql_filtered == actual_sql_filtered
> E   assert "CREATE EXTER...parent_table'" == "CREATE EXTERN...parent_table'"
> E Skipping 71 identical leading characters in diff, use -v to show
> E Skipping 126 identical trailing characters in diff, use -v to show
> E - MARY KEY (year, id)) ROW FO
> E ?   
> E + MARY KEY (id, year)) ROW FO
> E ?   {code}
> I think this is due to commit 
> [https://github.com/apache/impala/commit/cfe60858da110cf1256bd3aa5d4f8d374578a33d]






[jira] [Commented] (IMPALA-9311) test_show_create_table failed with primary key mismatch

2020-01-21 Thread Anurag Mantripragada (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-9311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17020716#comment-17020716
 ] 

Anurag Mantripragada commented on IMPALA-9311:
--

Thanks for creating the JIRA. In my local tests, HMS always returned the 
primary keys in the reverse order of their definition. Obviously there is some 
inconsistency here. I will dig deeper into the HMS code to see if there is a 
specific pattern. If not, this could also make a bunch of the catalog tests 
that I wrote for PK/FK flaky.

> test_show_create_table failed with primary key mismatch
> ---
>
> Key: IMPALA-9311
> URL: https://issues.apache.org/jira/browse/IMPALA-9311
> Project: IMPALA
>  Issue Type: Test
>Affects Versions: Impala 3.4.0
>Reporter: Xiaomeng Zhang
>Assignee: Anurag Mantripragada
>Priority: Major
>  Labels: broken-build
>
> {code:java}
> Error Message
> metadata/test_show_create_table.py:62: in test_show_create_table
> unique_database)
> metadata/test_show_create_table.py:110: in __run_show_create_table_test_case
> self.__compare_result(expected_result, create_table_result)
> metadata/test_show_create_table.py:146: in __compare_result
> assert expected_sql_filtered == actual_sql_filtered
> E   assert "CREATE EXTER...parent_table'" == "CREATE EXTERN...parent_table'"
> E Skipping 71 identical leading characters in diff, use -v to show
> E Skipping 126 identical trailing characters in diff, use -v to show
> E - MARY KEY (year, id)) ROW FO
> E ?
> E + MARY KEY (id, year)) ROW FO
> E ?
> Stacktrace
> metadata/test_show_create_table.py:62: in test_show_create_table
> unique_database)
> metadata/test_show_create_table.py:110: in __run_show_create_table_test_case
> self.__compare_result(expected_result, create_table_result)
> metadata/test_show_create_table.py:146: in __compare_result
> assert expected_sql_filtered == actual_sql_filtered
> E   assert "CREATE EXTER...parent_table'" == "CREATE EXTERN...parent_table'"
> E Skipping 71 identical leading characters in diff, use -v to show
> E Skipping 126 identical trailing characters in diff, use -v to show
> E - MARY KEY (year, id)) ROW FO
> E ?   
> E + MARY KEY (id, year)) ROW FO
> E ?   {code}
> I think this is due to commit 
> [https://github.com/apache/impala/commit/cfe60858da110cf1256bd3aa5d4f8d374578a33d]






[jira] [Commented] (IMPALA-9311) test_show_create_table failed with primary key mismatch

2020-01-21 Thread Anurag Mantripragada (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-9311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17020680#comment-17020680
 ] 

Anurag Mantripragada commented on IMPALA-9311:
--

Found in: 
[https://master-02.jenkins.cloudera.com/job/impala-asf-master-core-s3/557]

 

> test_show_create_table failed with primary key mismatch
> ---
>
> Key: IMPALA-9311
> URL: https://issues.apache.org/jira/browse/IMPALA-9311
> Project: IMPALA
>  Issue Type: Bug
>Reporter: Xiaomeng Zhang
>Assignee: Anurag Mantripragada
>Priority: Major
>
> {code:java}
> Error Message
> metadata/test_show_create_table.py:62: in test_show_create_table
> unique_database)
> metadata/test_show_create_table.py:110: in __run_show_create_table_test_case
> self.__compare_result(expected_result, create_table_result)
> metadata/test_show_create_table.py:146: in __compare_result
> assert expected_sql_filtered == actual_sql_filtered
> E   assert "CREATE EXTER...parent_table'" == "CREATE EXTERN...parent_table'"
> E Skipping 71 identical leading characters in diff, use -v to show
> E Skipping 126 identical trailing characters in diff, use -v to show
> E - MARY KEY (year, id)) ROW FO
> E ?
> E + MARY KEY (id, year)) ROW FO
> E ?
> Stacktrace
> metadata/test_show_create_table.py:62: in test_show_create_table
> unique_database)
> metadata/test_show_create_table.py:110: in __run_show_create_table_test_case
> self.__compare_result(expected_result, create_table_result)
> metadata/test_show_create_table.py:146: in __compare_result
> assert expected_sql_filtered == actual_sql_filtered
> E   assert "CREATE EXTER...parent_table'" == "CREATE EXTERN...parent_table'"
> E Skipping 71 identical leading characters in diff, use -v to show
> E Skipping 126 identical trailing characters in diff, use -v to show
> E - MARY KEY (year, id)) ROW FO
> E ?   
> E + MARY KEY (id, year)) ROW FO
> E ?   {code}
> I think this is due to commit 
> [https://github.com/apache/impala/commit/cfe60858da110cf1256bd3aa5d4f8d374578a33d]






[jira] [Resolved] (IMPALA-8290) Display constraint information in 'SHOW CREATE' statement

2020-01-22 Thread Anurag Mantripragada (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8290?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anurag Mantripragada resolved IMPALA-8290.
--
Resolution: Fixed

Resolved as part of IMPALA-9158

> Display constraint information in 'SHOW CREATE' statement
> -
>
> Key: IMPALA-8290
> URL: https://issues.apache.org/jira/browse/IMPALA-8290
> Project: IMPALA
>  Issue Type: Sub-task
>  Components: Frontend
>Reporter: Anurag Mantripragada
>Assignee: Anurag Mantripragada
>Priority: Critical
>
> Show create statement should display primary key and foreign key information.
>  
>  






[jira] [Resolved] (IMPALA-9158) Support loading PK/FK constraints in LocalCatalog.

2020-01-22 Thread Anurag Mantripragada (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-9158?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anurag Mantripragada resolved IMPALA-9158.
--
Fix Version/s: Impala 3.4.0
   Resolution: Fixed

> Support loading PK/FK constraints in LocalCatalog.
> --
>
> Key: IMPALA-9158
> URL: https://issues.apache.org/jira/browse/IMPALA-9158
> Project: IMPALA
>  Issue Type: Sub-task
>  Components: Frontend
>Reporter: Anurag Mantripragada
>Assignee: Anurag Mantripragada
>Priority: Critical
> Fix For: Impala 3.4.0
>
>
> Currently, PK/FK information is loaded only for Catalog V1. Supporting it in 
> LocalCatalog requires implementing the loading in CatalogMetaProvider and 
> DirectMetaProvider.






[jira] [Updated] (IMPALA-8290) Display constraint information in 'SHOW CREATE' statement

2020-01-22 Thread Anurag Mantripragada (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8290?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anurag Mantripragada updated IMPALA-8290:
-
Fix Version/s: Impala 3.4.0

> Display constraint information in 'SHOW CREATE' statement
> -
>
> Key: IMPALA-8290
> URL: https://issues.apache.org/jira/browse/IMPALA-8290
> Project: IMPALA
>  Issue Type: Sub-task
>  Components: Frontend
>Reporter: Anurag Mantripragada
>Assignee: Anurag Mantripragada
>Priority: Critical
> Fix For: Impala 3.4.0
>
>
> Show create statement should display primary key and foreign key information.
>  
>  






[jira] [Work started] (IMPALA-9369) Inserts on large tables could be very slow when event processing it turned on

2020-03-09 Thread Anurag Mantripragada (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-9369?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on IMPALA-9369 started by Anurag Mantripragada.

> Inserts on large tables could be very slow when event processing it turned on
> -
>
> Key: IMPALA-9369
> URL: https://issues.apache.org/jira/browse/IMPALA-9369
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Catalog
>Reporter: Vihang Karajgaonkar
>Assignee: Anurag Mantripragada
>Priority: Critical
>
> When a large number of files are inserted into a table, the 
> {{createInsertEvents}} method fires insert events to HMS for each partition 
> one at a time. This can be very slow for an insert statement that adds 
> hundreds or thousands of files.
> We should see if we can fire the insert events asynchronously instead of 
> blocking the query from returning to the user until all the insert events are 
> fired.
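
The asynchronous approach suggested above can be pictured with a minimal
sketch. This is not Impala code: the real createInsertEvents method lives in
the Java frontend, and fire_insert_event, fire_insert_events_async, the
partition names, and the pool size below are all hypothetical names chosen
for illustration. The sketch only shows the idea of handing the per-partition
event calls to a background pool so that the caller is not blocked.

```python
from concurrent.futures import ThreadPoolExecutor, wait

# Hypothetical record of events that reached the (pretend) metastore.
fired = []


def fire_insert_event(partition, files):
    """Hypothetical stand-in for the per-partition HMS notification call."""
    fired.append((partition, tuple(files)))
    return partition


def fire_insert_events_async(executor, files_by_partition):
    """Submit one event per partition and return the futures without blocking."""
    return [executor.submit(fire_insert_event, partition, files)
            for partition, files in files_by_partition.items()]


executor = ThreadPoolExecutor(max_workers=4)
futures = fire_insert_events_async(
    executor,
    {"year=2019": ["a.parq", "b.parq"], "year=2020": ["c.parq"]})
# The query could return to the user at this point. The sketch waits only so
# that the outcome can be inspected.
wait(futures)
executor.shutdown()
```

Once the query no longer waits for the events, a real implementation would
also have to decide what happens when an event fails to fire, for example
logging or retrying it.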






[jira] [Work started] (IMPALA-8005) Randomize partitioning exchanges destinations

2020-03-09 Thread Anurag Mantripragada (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8005?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on IMPALA-8005 started by Anurag Mantripragada.

> Randomize partitioning exchanges destinations
> -
>
> Key: IMPALA-8005
> URL: https://issues.apache.org/jira/browse/IMPALA-8005
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Distributed Exec
>Affects Versions: Impala 3.1.0
>Reporter: Michael Ho
>Assignee: Anurag Mantripragada
>Priority: Major
>  Labels: ramp-up
>
> Currently, we use the same hash seed for partitioning exchanges at the 
> sender. For a table with skew in the distribution of the shuffling keys, 
> multiple queries using the same shuffling keys for exchanges will end up 
> hashing to the same destination fragments running on a particular host, 
> potentially overloading that host.
> We should consider using the query id or other query-specific information to 
> seed the hashing function to randomize the destinations for different 
> queries. Thanks to [~tlipcon] for pointing this problem out.
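
A small sketch of the seeding idea, under stated assumptions: Impala's
exchange code is C++ and uses its own hash functions, so the BLAKE2 hash, the
query_seed and destination helpers, and the example query ids below are
hypothetical and only illustrate how a per-query seed changes where a hot key
lands.

```python
import hashlib


def query_seed(query_id):
    """Derive a 16-byte seed from a query id string (hypothetical helper)."""
    return hashlib.sha256(query_id.encode("utf-8")).digest()[:16]


def destination(key, num_destinations, seed=b""):
    """Map a shuffle key to a destination index using a seeded hash."""
    digest = hashlib.blake2b(
        key.encode("utf-8"), digest_size=8, salt=seed).digest()
    return int.from_bytes(digest, "big") % num_destinations


hot_key = "customer_42"

# Shared seed: every query sends the hot key to the same destination.
shared = {destination(hot_key, 8) for _ in range(100)}

# Per-query seed: the hot key is spread across destinations between queries.
per_query = {destination(hot_key, 8, query_seed("query-%d" % i))
             for i in range(100)}
```

Within a single query the mapping stays deterministic, which partitioning
requires. Only the mapping across different queries changes.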






[jira] [Work started] (IMPALA-9663) Insert overwrites should not throw NPE.

2020-04-16 Thread Anurag Mantripragada (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-9663?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on IMPALA-9663 started by Anurag Mantripragada.

> Insert overwrites should not throw NPE.
> ---
>
> Key: IMPALA-9663
> URL: https://issues.apache.org/jira/browse/IMPALA-9663
> Project: IMPALA
>  Issue Type: Bug
>Affects Versions: Impala 4.0, Impala 3.4.0
>Reporter: Anurag Mantripragada
>Assignee: Anurag Mantripragada
>Priority: Critical
>
> Insert overwrite can throw an NPE when there are no insert events, because 
> there is no null check for this case.
> https://github.com/apache/impala/blob/dc410a2cf47bcf06a0f4563d05a9d0a339af5fb2/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java#L4517






[jira] [Created] (IMPALA-9663) Insert overwrites should not throw NPE.

2020-04-16 Thread Anurag Mantripragada (Jira)
Anurag Mantripragada created IMPALA-9663:


 Summary: Insert overwrites should not throw NPE.
 Key: IMPALA-9663
 URL: https://issues.apache.org/jira/browse/IMPALA-9663
 Project: IMPALA
  Issue Type: Bug
Affects Versions: Impala 4.0, Impala 3.4.0
Reporter: Anurag Mantripragada
Assignee: Anurag Mantripragada


Insert overwrite can throw an NPE when there are no insert events, because 
there is no null check for this case.

https://github.com/apache/impala/blob/dc410a2cf47bcf06a0f4563d05a9d0a339af5fb2/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java#L4517






[jira] [Work started] (IMPALA-9433) Change FileHandleCache from using a multimap to an unordered_map

2020-04-07 Thread Anurag Mantripragada (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-9433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on IMPALA-9433 started by Anurag Mantripragada.

> Change FileHandleCache from using a multimap to an unordered_map
> 
>
> Key: IMPALA-9433
> URL: https://issues.apache.org/jira/browse/IMPALA-9433
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Backend
>Affects Versions: Impala 3.4.0
>Reporter: Joe McDonnell
>Assignee: Anurag Mantripragada
>Priority: Minor
>  Labels: ramp-up
>
> The file handle cache can contain multiple file handles per filename. 
> Currently it uses a std::multimap, where the file handles for each filename 
> are a contiguous set of entries. A lookup will find the beginning of that 
> range and then iterate through it to find a free one.
> A multimap is implemented as a red-black tree with O(log(N)) lookup, so we 
> should be able to improve this by using a hashtable-based structure such as 
> unordered_map/unordered_multimap with O(1) lookup.
> Another optimization would be to add an intermediary structure for each 
> filename and hold all the file handles for that file name in a linked list. 
> Lookup would find this intermediary structure by looking up the filename, 
> then it would iterate. In the current method, the key/value pair for each 
> file handle must store a copy of the filename string as the key, even for 
> duplicates. With the intermediary structure, it would store the filename once 
> per unique filename.
> It also looks like the LRU list would benefit from being a Boost intrusive 
> list ([https://www.boost.org/doc/libs/1_64_0/doc/html/intrusive.html]). Every 
> file handle is always in the LRU list, so a std::list has a higher memory 
> overhead and requires more memory accesses. It also complicates the code, 
> because the FileHandleEntry needs to store a LruListType::iterator to its 
> location in the LRU list.
> These optimizations are low priority, but they provide good ramp-up for some 
> C++ concepts/APIs.
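
The intermediary-structure idea described above can be sketched as follows.
This is an illustration in Python, not the C++ FileHandleCache: the class and
method names are hypothetical, a Python dict stands in for the
hashtable-based map, a plain list stands in for the per-filename linked list,
and LRU eviction is left out entirely.

```python
class FileHandle:
    """Hypothetical cached handle; a real one would wrap an open file."""

    def __init__(self, filename):
        self.filename = filename
        self.in_use = False


class FileHandleCacheSketch:
    def __init__(self):
        # One hash-table entry per unique filename. The value holds every
        # cached handle for that file, so the filename key is stored once.
        self._handles_by_file = {}

    def get(self, filename):
        """Return a free handle for filename, creating one on a miss."""
        handles = self._handles_by_file.setdefault(filename, [])
        for handle in handles:
            if not handle.in_use:
                handle.in_use = True
                return handle
        handle = FileHandle(filename)
        handle.in_use = True
        handles.append(handle)
        return handle

    def release(self, handle):
        """Mark a handle as free so a later get() can reuse it."""
        handle.in_use = False

    def num_files(self):
        return len(self._handles_by_file)
```

Lookup is a single average-case O(1) hash probe by filename followed by a
scan of that file's handles, instead of an O(log(N)) tree search over all
entries.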






[jira] [Updated] (IMPALA-9433) Change FileHandleCache from using a multimap to an unordered_map

2020-04-07 Thread Anurag Mantripragada (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-9433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anurag Mantripragada updated IMPALA-9433:
-
Priority: Critical  (was: Minor)

> Change FileHandleCache from using a multimap to an unordered_map
> 
>
> Key: IMPALA-9433
> URL: https://issues.apache.org/jira/browse/IMPALA-9433
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Backend
>Affects Versions: Impala 3.4.0
>Reporter: Joe McDonnell
>Assignee: Anurag Mantripragada
>Priority: Critical
>  Labels: frontend, ramp-up
>
> The file handle cache can contain multiple file handles per filename. 
> Currently it uses a std::multimap, where the file handles for each filename 
> are a contiguous set of entries. A lookup will find the beginning of that 
> range and then iterate through it to find a free one.
> A multimap is implemented as a red-black tree with O(log(N)) lookup, so we 
> should be able to improve this by using a hashtable-based structure such as 
> unordered_map/unordered_multimap with O(1) lookup.
> Another optimization would be to add an intermediary structure for each 
> filename and hold all the file handles for that file name in a linked list. 
> Lookup would find this intermediary structure by looking up the filename, 
> then it would iterate. In the current method, the key/value pair for each 
> file handle must store a copy of the filename string as the key, even for 
> duplicates. With the intermediary structure, it would store the filename once 
> per unique filename.
> It also looks like the LRU list would benefit from being a Boost intrusive 
> list ([https://www.boost.org/doc/libs/1_64_0/doc/html/intrusive.html]). Every 
> file handle is always in the LRU list, so a std::list has a higher memory 
> overhead and requires more memory accesses. It also complicates the code, 
> because the FileHandleEntry needs to store a LruListType::iterator to its 
> location in the LRU list.
> These optimizations are low priority, but they provide good ramp-up for some 
> C++ concepts/APIs.






[jira] [Work stopped] (IMPALA-9433) Change FileHandleCache from using a multimap to an unordered_map

2020-04-07 Thread Anurag Mantripragada (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-9433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on IMPALA-9433 stopped by Anurag Mantripragada.

> Change FileHandleCache from using a multimap to an unordered_map
> 
>
> Key: IMPALA-9433
> URL: https://issues.apache.org/jira/browse/IMPALA-9433
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Backend
>Affects Versions: Impala 3.4.0
>Reporter: Joe McDonnell
>Assignee: Anurag Mantripragada
>Priority: Minor
>  Labels: ramp-up
>
> The file handle cache can contain multiple file handles per filename. 
> Currently it uses a std::multimap, where the file handles for each filename 
> are a contiguous set of entries. A lookup will find the beginning of that 
> range and then iterate through it to find a free one.
> A multimap is implemented as a red-black tree with O(log(N)) lookup, so we 
> should be able to improve this by using a hashtable-based structure such as 
> unordered_map/unordered_multimap with O(1) lookup.
> Another optimization would be to add an intermediary structure for each 
> filename and hold all the file handles for that file name in a linked list. 
> Lookup would find this intermediary structure by looking up the filename, 
> then it would iterate. In the current method, the key/value pair for each 
> file handle must store a copy of the filename string as the key, even for 
> duplicates. With the intermediary structure, it would store the filename once 
> per unique filename.
> It also looks like the LRU list would benefit from being a Boost intrusive 
> list ([https://www.boost.org/doc/libs/1_64_0/doc/html/intrusive.html]). Every 
> file handle is always in the LRU list, so a std::list has a higher memory 
> overhead and requires more memory accesses. It also complicates the code, 
> because the FileHandleEntry needs to store a LruListType::iterator to its 
> location in the LRU list.
> These optimizations are low priority, but they provide good ramp-up for some 
> C++ concepts/APIs.






[jira] [Updated] (IMPALA-9433) Change FileHandleCache from using a multimap to an unordered_map

2020-04-07 Thread Anurag Mantripragada (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-9433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anurag Mantripragada updated IMPALA-9433:
-
Labels: frontend ramp-up  (was: ramp-up)

> Change FileHandleCache from using a multimap to an unordered_map
> 
>
> Key: IMPALA-9433
> URL: https://issues.apache.org/jira/browse/IMPALA-9433
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Backend
>Affects Versions: Impala 3.4.0
>Reporter: Joe McDonnell
>Assignee: Anurag Mantripragada
>Priority: Minor
>  Labels: frontend, ramp-up
>
> The file handle cache can contain multiple file handles per filename. 
> Currently it uses a std::multimap, where the file handles for each filename 
> are a contiguous set of entries. A lookup will find the beginning of that 
> range and then iterate through it to find a free one.
> A multimap is implemented as a red-black tree with O(log(N)) lookup, so we 
> should be able to improve this by using a hashtable-based structure such as 
> unordered_map/unordered_multimap with O(1) lookup.
> Another optimization would be to add an intermediary structure for each 
> filename and hold all the file handles for that file name in a linked list. 
> Lookup would find this intermediary structure by looking up the filename, 
> then it would iterate. In the current method, the key/value pair for each 
> file handle must store a copy of the filename string as the key, even for 
> duplicates. With the intermediary structure, it would store the filename once 
> per unique filename.
> It also looks like the LRU list would benefit from being a Boost intrusive 
> list ([https://www.boost.org/doc/libs/1_64_0/doc/html/intrusive.html]). Every 
> file handle is always in the LRU list, so a std::list has a higher memory 
> overhead and requires more memory accesses. It also complicates the code, 
> because the FileHandleEntry needs to store a LruListType::iterator to its 
> location in the LRU list.
> These optimizations are low priority, but they provide good ramp-up for some 
> C++ concepts/APIs.






[jira] [Updated] (IMPALA-9433) Change FileHandleCache from using a multimap to an unordered_map

2020-04-07 Thread Anurag Mantripragada (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-9433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anurag Mantripragada updated IMPALA-9433:
-
Labels: ramp-up  (was: frontend ramp-up)

> Change FileHandleCache from using a multimap to an unordered_map
> 
>
> Key: IMPALA-9433
> URL: https://issues.apache.org/jira/browse/IMPALA-9433
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Backend
>Affects Versions: Impala 3.4.0
>Reporter: Joe McDonnell
>Assignee: Anurag Mantripragada
>Priority: Minor
>  Labels: ramp-up
>
> The file handle cache can contain multiple file handles per filename. 
> Currently it uses a std::multimap, where the file handles for each filename 
> are a contiguous set of entries. A lookup will find the beginning of that 
> range and then iterate through it to find a free one.
> A multimap is implemented as a red-black tree with O(log(N)) lookup, so we 
> should be able to improve this by using a hashtable-based structure such as 
> unordered_map/unordered_multimap with O(1) lookup.
> Another optimization would be to add an intermediary structure for each 
> filename and hold all the file handles for that file name in a linked list. 
> Lookup would find this intermediary structure by looking up the filename, 
> then it would iterate. In the current method, the key/value pair for each 
> file handle must store a copy of the filename string as the key, even for 
> duplicates. With the intermediary structure, it would store the filename once 
> per unique filename.
> It also looks like the LRU list would benefit from being a Boost intrusive 
> list ([https://www.boost.org/doc/libs/1_64_0/doc/html/intrusive.html]). Every 
> file handle is always in the LRU list, so a std::list has a higher memory 
> overhead and requires more memory accesses. It also complicates the code, 
> because the FileHandleEntry needs to store a LruListType::iterator to its 
> location in the LRU list.
> These optimizations are low priority, but they provide good ramp-up for some 
> C++ concepts/APIs.






[jira] [Updated] (IMPALA-9433) Change FileHandleCache from using a multimap to an unordered_map

2020-04-07 Thread Anurag Mantripragada (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-9433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anurag Mantripragada updated IMPALA-9433:
-
Priority: Minor  (was: Critical)

> Change FileHandleCache from using a multimap to an unordered_map
> 
>
> Key: IMPALA-9433
> URL: https://issues.apache.org/jira/browse/IMPALA-9433
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Backend
>Affects Versions: Impala 3.4.0
>Reporter: Joe McDonnell
>Assignee: Anurag Mantripragada
>Priority: Minor
>  Labels: frontend, ramp-up
>
> The file handle cache can contain multiple file handles per filename. 
> Currently it uses a std::multimap, where the file handles for each filename 
> are a contiguous set of entries. A lookup will find the beginning of that 
> range and then iterate through it to find a free one.
> A multimap is implemented as a red-black tree with O(log(N)) lookup, so we 
> should be able to improve this by using a hashtable-based structure such as 
> unordered_map/unordered_multimap with O(1) lookup.
> Another optimization would be to add an intermediary structure for each 
> filename and hold all the file handles for that file name in a linked list. 
> Lookup would find this intermediary structure by looking up the filename, 
> then it would iterate. In the current method, the key/value pair for each 
> file handle must store a copy of the filename string as the key, even for 
> duplicates. With the intermediary structure, it would store the filename once 
> per unique filename.
> It also looks like the LRU list would benefit from being a Boost intrusive 
> list ([https://www.boost.org/doc/libs/1_64_0/doc/html/intrusive.html]). Every 
> file handle is always in the LRU list, so a std::list has a higher memory 
> overhead and requires more memory accesses. It also complicates the code, 
> because the FileHandleEntry needs to store a LruListType::iterator to its 
> location in the LRU list.
> These optimizations are low priority, but they provide good ramp-up for some 
> C++ concepts/APIs.






[jira] [Resolved] (IMPALA-9369) Inserts on large tables could be very slow when event processing it turned on

2020-03-13 Thread Anurag Mantripragada (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-9369?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anurag Mantripragada resolved IMPALA-9369.
--
Target Version: Impala 3.4.0
Resolution: Fixed

> Inserts on large tables could be very slow when event processing it turned on
> -
>
> Key: IMPALA-9369
> URL: https://issues.apache.org/jira/browse/IMPALA-9369
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Catalog
>Reporter: Vihang Karajgaonkar
>Assignee: Anurag Mantripragada
>Priority: Critical
>
> When a large number of files are inserted into a table, the 
> {{createInsertEvents}} method fires insert events to HMS for each partition 
> one at a time. This can be very slow for an insert statement that adds 
> hundreds or thousands of files.
> We should see if we can fire the insert events asynchronously instead of 
> blocking the query from returning to the user until all the insert events are 
> fired.






[jira] [Work started] (IMPALA-9433) Change FileHandleCache from using a multimap to an unordered_map

2020-03-26 Thread Anurag Mantripragada (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-9433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on IMPALA-9433 started by Anurag Mantripragada.

> Change FileHandleCache from using a multimap to an unordered_map
> 
>
> Key: IMPALA-9433
> URL: https://issues.apache.org/jira/browse/IMPALA-9433
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Backend
>Affects Versions: Impala 3.4.0
>Reporter: Joe McDonnell
>Assignee: Anurag Mantripragada
>Priority: Minor
>  Labels: ramp-up
>
> The file handle cache can contain multiple file handles per filename. 
> Currently it uses a std::multimap, where the file handles for each filename 
> are a contiguous set of entries. A lookup will find the beginning of that 
> range and then iterate through it to find a free one.
> A multimap is implemented as a red-black tree with O(log(N)) lookup, so we 
> should be able to improve this by using a hashtable-based structure such as 
> unordered_map/unordered_multimap with O(1) lookup.
> Another optimization would be to add an intermediary structure for each 
> filename and hold all the file handles for that file name in a linked list. 
> Lookup would find this intermediary structure by looking up the filename, 
> then it would iterate. In the current method, the key/value pair for each 
> file handle must store a copy of the filename string as the key, even for 
> duplicates. With the intermediary structure, it would store the filename once 
> per unique filename.
> It also looks like the LRU list would benefit from being a Boost intrusive 
> list ([https://www.boost.org/doc/libs/1_64_0/doc/html/intrusive.html]). Every 
> file handle is always in the LRU list, so a std::list has a higher memory 
> overhead and requires more memory accesses. It also complicates the code, 
> because the FileHandleEntry needs to store a LruListType::iterator to its 
> location in the LRU list.
> These optimizations are low priority, but they provide good ramp-up for some 
> C++ concepts/APIs.






[jira] [Assigned] (IMPALA-9673) Tests expecting results to be in test-warehouse/managed but find test-warehouse

2020-05-01 Thread Anurag Mantripragada (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-9673?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anurag Mantripragada reassigned IMPALA-9673:


Assignee: Anurag Mantripragada

> Tests expecting results to be in test-warehouse/managed but find  
> test-warehouse
> 
>
> Key: IMPALA-9673
> URL: https://issues.apache.org/jira/browse/IMPALA-9673
> Project: IMPALA
>  Issue Type: Bug
>Reporter: Andrew Sherman
>Assignee: Anurag Mantripragada
>Priority: Critical
>
> TestDdlStatements.test_create_database
> {code}
> ERROR:test_configuration:Comparing QueryTestResults (expected vs actual):
> 'test_create_database_b10825a1_2','hdfs://localhost:20500/test-warehouse/managed/test_create_database_b10825a1_2.db','For testing' !=
> 'test_create_database_b10825a1_2','hdfs://localhost:20500/test-warehouse/test_create_database_b10825a1_2.db','For testing'
> {code}
> TestMetadataQueryStatements.test_describe_db
> {code}
> ERROR:test_configuration:Comparing QueryTestResults (expected vs actual):
> 'default','hdfs://localhost:20500/test-warehouse/managed','Default Hive database' !=
> 'default','hdfs://localhost:20500/test-warehouse','Default Hive database'
> {code}






[jira] [Assigned] (IMPALA-9673) Tests expecting results to be in test-warehouse/managed but find test-warehouse

2020-05-01 Thread Anurag Mantripragada (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-9673?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anurag Mantripragada reassigned IMPALA-9673:


Assignee: Xiaomeng Zhang  (was: Anurag Mantripragada)

> Tests expecting results to be in test-warehouse/managed but find  
> test-warehouse
> 
>
> Key: IMPALA-9673
> URL: https://issues.apache.org/jira/browse/IMPALA-9673
> Project: IMPALA
>  Issue Type: Bug
>Reporter: Andrew Sherman
>Assignee: Xiaomeng Zhang
>Priority: Critical
>
> TestDdlStatements.test_create_database
> {code}
> ERROR:test_configuration:Comparing QueryTestResults (expected vs actual):
> 'test_create_database_b10825a1_2','hdfs://localhost:20500/test-warehouse/managed/test_create_database_b10825a1_2.db','For testing' !=
> 'test_create_database_b10825a1_2','hdfs://localhost:20500/test-warehouse/test_create_database_b10825a1_2.db','For testing'
> {code}
> TestMetadataQueryStatements.test_describe_db
> {code}
> ERROR:test_configuration:Comparing QueryTestResults (expected vs actual):
> 'default','hdfs://localhost:20500/test-warehouse/managed','Default Hive database' !=
> 'default','hdfs://localhost:20500/test-warehouse','Default Hive database'
> {code}






[jira] [Commented] (IMPALA-9673) Tests expecting results to be in test-warehouse/managed but find test-warehouse

2020-05-01 Thread Anurag Mantripragada (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-9673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17097733#comment-17097733
 ] 

Anurag Mantripragada commented on IMPALA-9673:
--

[~Xiaomeng Zhang], randomly assigning to you. Please feel free to reassign. 

> Tests expecting results to be in test-warehouse/managed but find  
> test-warehouse
> 
>
> Key: IMPALA-9673
> URL: https://issues.apache.org/jira/browse/IMPALA-9673
> Project: IMPALA
>  Issue Type: Bug
>Reporter: Andrew Sherman
>Assignee: Xiaomeng Zhang
>Priority: Critical
>
> TestDdlStatements.test_create_database
> {code}
> ERROR:test_configuration:Comparing QueryTestResults (expected vs actual):
> 'test_create_database_b10825a1_2','hdfs://localhost:20500/test-warehouse/managed/test_create_database_b10825a1_2.db','For testing' !=
> 'test_create_database_b10825a1_2','hdfs://localhost:20500/test-warehouse/test_create_database_b10825a1_2.db','For testing'
> {code}
> TestMetadataQueryStatements.test_describe_db
> {code}
> ERROR:test_configuration:Comparing QueryTestResults (expected vs actual):
> 'default','hdfs://localhost:20500/test-warehouse/managed','Default Hive database' !=
> 'default','hdfs://localhost:20500/test-warehouse','Default Hive database'
> {code}






[jira] [Updated] (IMPALA-8725) Improve usability when HMS is configured with strict managed tables

2020-05-19 Thread Anurag Mantripragada (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8725?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anurag Mantripragada updated IMPALA-8725:
-
Affects Version/s: Impala 3.3.0

> Improve usability when HMS is configured with strict managed tables
> ---
>
> Key: IMPALA-8725
> URL: https://issues.apache.org/jira/browse/IMPALA-8725
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Frontend
>Affects Versions: Impala 3.3.0
>Reporter: Anurag Mantripragada
>Assignee: Anurag Mantripragada
>Priority: Critical
>
> Users tend to create and query managed tables often and when HMS is 
> configured with strict managed tables they get: 
> {code:java}
> Table default.foo failed strict managed table checks due to the following 
> reason: Table is marked as a managed table but is not transactional{code}
> We should improve usability in these scenarios.






[jira] [Commented] (IMPALA-8725) Improve usability when HMS is configured with strict managed tables

2020-05-19 Thread Anurag Mantripragada (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-8725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17110887#comment-17110887
 ] 

Anurag Mantripragada commented on IMPALA-8725:
--

This is no longer an issue. Support for the DEFAULT_TRANSACTIONAL_TYPE query 
option was added in IMPALA-8808. I'm trying to find the Jira that made 
"insert_only" the default type.

> Improve usability when HMS is configured with strict managed tables
> ---
>
> Key: IMPALA-8725
> URL: https://issues.apache.org/jira/browse/IMPALA-8725
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Frontend
>Reporter: Anurag Mantripragada
>Assignee: Anurag Mantripragada
>Priority: Critical
>
> Users tend to create and query managed tables often and when HMS is 
> configured with strict managed tables they get: 
> {code:java}
> Table default.foo failed strict managed table checks due to the following 
> reason: Table is marked as a managed table but is not transactional{code}
> We should improve usability in these scenarios.






[jira] [Updated] (IMPALA-8725) Improve usability when HMS is configured with strict managed tables

2020-05-19 Thread Anurag Mantripragada (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8725?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anurag Mantripragada updated IMPALA-8725:
-
Fix Version/s: Impala 3.3.0

> Improve usability when HMS is configured with strict managed tables
> ---
>
> Key: IMPALA-8725
> URL: https://issues.apache.org/jira/browse/IMPALA-8725
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Frontend
>Affects Versions: Impala 3.3.0
>Reporter: Anurag Mantripragada
>Assignee: Anurag Mantripragada
>Priority: Critical
> Fix For: Impala 3.3.0
>
>
> Users tend to create and query managed tables often and when HMS is 
> configured with strict managed tables they get: 
> {code:java}
> Table default.foo failed strict managed table checks due to the following 
> reason: Table is marked as a managed table but is not transactional{code}
> We should improve usability in these scenarios.






[jira] [Updated] (IMPALA-8725) Improve usability when HMS is configured with strict managed tables

2020-05-19 Thread Anurag Mantripragada (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8725?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anurag Mantripragada updated IMPALA-8725:
-
Fix Version/s: (was: Impala 3.3.0)

> Improve usability when HMS is configured with strict managed tables
> ---
>
> Key: IMPALA-8725
> URL: https://issues.apache.org/jira/browse/IMPALA-8725
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Frontend
>Affects Versions: Impala 3.3.0
>Reporter: Anurag Mantripragada
>Assignee: Anurag Mantripragada
>Priority: Critical
>
> Users tend to create and query managed tables often and when HMS is 
> configured with strict managed tables they get: 
> {code:java}
> Table default.foo failed strict managed table checks due to the following 
> reason: Table is marked as a managed table but is not transactional{code}
> We should improve usability in these scenarios.






[jira] [Created] (IMPALA-9749) ASAN builds should not run FE Tests

2020-05-14 Thread Anurag Mantripragada (Jira)
Anurag Mantripragada created IMPALA-9749:


 Summary: ASAN builds should not run FE Tests
 Key: IMPALA-9749
 URL: https://issues.apache.org/jira/browse/IMPALA-9749
 Project: IMPALA
  Issue Type: Bug
Affects Versions: Impala 3.4.0
Reporter: Anurag Mantripragada
Assignee: Anurag Mantripragada









[jira] [Updated] (IMPALA-9749) ASAN builds should not run FE Tests

2020-05-14 Thread Anurag Mantripragada (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-9749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anurag Mantripragada updated IMPALA-9749:
-
Description: [https://gerrit.cloudera.org/#/c/15778/] inadvertently changed 
the behavior of ASAN builds to run FE tests. The behavior should be to skip 
FE tests.

> ASAN builds should not run FE Tests
> ---
>
> Key: IMPALA-9749
> URL: https://issues.apache.org/jira/browse/IMPALA-9749
> Project: IMPALA
>  Issue Type: Bug
>Affects Versions: Impala 3.4.0
>Reporter: Anurag Mantripragada
>Assignee: Anurag Mantripragada
>Priority: Major
>  Labels: build-failure
>
> [https://gerrit.cloudera.org/#/c/15778/] inadvertently changed the behavior 
> of ASAN builds to run FE tests. The behavior should be to skip FE tests.






[jira] [Closed] (IMPALA-8725) Improve usability when HMS is configured with strict managed tables

2020-05-19 Thread Anurag Mantripragada (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8725?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anurag Mantripragada closed IMPALA-8725.

Target Version: Impala 3.3.0
Resolution: Not A Problem

> Improve usability when HMS is configured with strict managed tables
> ---
>
> Key: IMPALA-8725
> URL: https://issues.apache.org/jira/browse/IMPALA-8725
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Frontend
>Affects Versions: Impala 3.3.0
>Reporter: Anurag Mantripragada
>Assignee: Anurag Mantripragada
>Priority: Critical
>
> Users tend to create and query managed tables often and when HMS is 
> configured with strict managed tables they get: 
> {code:java}
> Table default.foo failed strict managed table checks due to the following 
> reason: Table is marked as a managed table but is not transactional{code}
> We should improve usability in these scenarios.






[jira] [Resolved] (IMPALA-8005) Randomize partitioning exchanges destinations

2020-05-28 Thread Anurag Mantripragada (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8005?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anurag Mantripragada resolved IMPALA-8005.
--
Fix Version/s: Impala 3.4.0
   Resolution: Fixed

> Randomize partitioning exchanges destinations
> -
>
> Key: IMPALA-8005
> URL: https://issues.apache.org/jira/browse/IMPALA-8005
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Distributed Exec
>Affects Versions: Impala 3.1.0
>Reporter: Michael Ho
>Assignee: Anurag Mantripragada
>Priority: Major
>  Labels: ramp-up
> Fix For: Impala 3.4.0
>
>
> Currently, we use the same hash seed for partitioning exchanges at the 
> sender. For a table whose shuffling keys have a skewed distribution, multiple 
> queries using the same shuffling keys for exchanges will end up hashing to 
> the same destination fragments running on a particular host, potentially 
> overloading that host.
> We should consider using the query id or other query-specific information to 
> seed the hashing function to randomize the destinations for different 
> queries. Thanks to [~tlipcon] for pointing this problem out.
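
The idea in the description above can be sketched in a few lines of Python. This is only an illustration, not Impala's exchange code: the `destination` and `hot_key_destinations` helpers, the use of SHA-256 as the hash, and the `query-N` ids are all assumptions made for the example. It shows that with one fixed seed a hot shuffle key lands on the same destination for every query, while a seed derived from the query id spreads that key across destinations.

```python
import hashlib
from collections import Counter


def destination(key: str, query_seed: str, num_destinations: int) -> int:
    """Map a shuffle key to a destination index, mixing in a seed string.

    SHA-256 is a stand-in hash chosen for the example (an assumption).
    """
    digest = hashlib.sha256(f"{query_seed}:{key}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_destinations


def hot_key_destinations(query_ids, key, num_destinations, per_query_seed):
    """Count how often each destination receives `key` across the queries."""
    counts = Counter()
    for query_id in query_ids:
        # Either every query shares one seed, or each query uses its own id.
        seed = query_id if per_query_seed else "fixed-seed"
        counts[destination(key, seed, num_destinations)] += 1
    return counts


queries = [f"query-{i}" for i in range(200)]  # hypothetical query ids
same_seed = hot_key_destinations(queries, "hot_key", 8, per_query_seed=False)
query_seeded = hot_key_destinations(queries, "hot_key", 8, per_query_seed=True)

# Number of distinct destinations that received the hot key in each mode.
print(len(same_seed), len(query_seeded))
```

Within a single query the mapping stays deterministic, so all senders of that query still agree on where a given key goes; only the mapping across queries changes.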






[jira] [Resolved] (IMPALA-9749) ASAN builds should not run FE Tests

2020-06-01 Thread Anurag Mantripragada (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-9749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anurag Mantripragada resolved IMPALA-9749.
--
Fix Version/s: Impala 3.4.0
   Resolution: Fixed

> ASAN builds should not run FE Tests
> ---
>
> Key: IMPALA-9749
> URL: https://issues.apache.org/jira/browse/IMPALA-9749
> Project: IMPALA
>  Issue Type: Bug
>Affects Versions: Impala 3.4.0
>Reporter: Anurag Mantripragada
>Assignee: Anurag Mantripragada
>Priority: Major
>  Labels: build-failure
> Fix For: Impala 3.4.0
>
>
> [https://gerrit.cloudera.org/#/c/15778/] inadvertently changed the behavior 
> of ASAN builds to run FE tests. The behavior should be to skip FE tests.






[jira] [Updated] (IMPALA-9749) ASAN builds should not run FE Tests

2020-06-01 Thread Anurag Mantripragada (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-9749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anurag Mantripragada updated IMPALA-9749:
-
Fix Version/s: (was: Impala 3.4.0)
   Impala 4.0

> ASAN builds should not run FE Tests
> ---
>
> Key: IMPALA-9749
> URL: https://issues.apache.org/jira/browse/IMPALA-9749
> Project: IMPALA
>  Issue Type: Bug
>Affects Versions: Impala 3.4.0
>Reporter: Anurag Mantripragada
>Assignee: Anurag Mantripragada
>Priority: Major
>  Labels: build-failure
> Fix For: Impala 4.0
>
>
> [https://gerrit.cloudera.org/#/c/15778/] inadvertently changed the behavior 
> of ASAN builds to run FE tests. The behavior should be to skip FE tests.






[jira] [Resolved] (IMPALA-9980) Remove jersey* jars from exclusions.

2020-07-21 Thread Anurag Mantripragada (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-9980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anurag Mantripragada resolved IMPALA-9980.
--
Fix Version/s: Impala 4.0
   Resolution: Fixed

> Remove jersey* jars from exclusions.
> 
>
> Key: IMPALA-9980
> URL: https://issues.apache.org/jira/browse/IMPALA-9980
> Project: IMPALA
>  Issue Type: Bug
>  Components: Frontend
>Affects Versions: Impala 4.0
>Reporter: Anurag Mantripragada
>Assignee: Anurag Mantripragada
>Priority: Critical
> Fix For: Impala 4.0
>
>
> IMPALA-9679 removed the jersey* jars from the docker images. 
> These jars are required by the Impala Ranger plugin, so ClassNotFound 
> errors are thrown in the Impala docker container when the Ranger plugin is 
> enabled. Part of the exception is below:
> {code:java}
> I0720 03:41:41.647934 7 jni-util.cc:288] org.apache.impala.common.InternalException: Unable to instantiate authorization provider: org.apache.impala.authorization.ranger.RangerAuthorizationFactory
>     at org.apache.impala.util.AuthorizationUtil.authzFactoryFrom(AuthorizationUtil.java:88)
>     at org.apache.impala.service.JniCatalog.<init>(JniCatalog.java:121)
> Caused by: java.lang.reflect.InvocationTargetException
>     at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>     at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
>     at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>     at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
>     at org.apache.impala.util.AuthorizationUtil.authzFactoryFrom(AuthorizationUtil.java:86)
>     ... 1 more
> Caused by: java.lang.ExceptionInInitializerError
>     at com.sun.jersey.core.spi.factory.MessageBodyFactory.initReaders(MessageBodyFactory.java:182)
>     at com.sun.jersey.core.spi.factory.MessageBodyFactory.initReaders(MessageBodyFactory.java:175)
>     at com.sun.jersey.core.spi.factory.MessageBodyFactory.init(MessageBodyFactory.java:162)
>     at com.sun.jersey.api.client.Client.init(Client.java:343)
>     at com.sun.jersey.api.client.Client.access$000(Client.java:119)
>     at com.sun.jersey.api.client.Client$1.f(Client.java:192)
>     at com.sun.jersey.api.client.Client$1.f(Client.java:188)
>     at com.sun.jersey.spi.inject.Errors.processWithErrors(Errors.java:193)
>     at com.sun.jersey.api.client.Client.<init>(Client.java:188)
>     at com.sun.jersey.api.client.Client.<init>(Client.java:171)
>     at com.sun.jersey.api.client.Client.create(Client.java:683)
>     at org.apache.ranger.plugin.util.RangerRESTClient.buildClient(RangerRESTClient.java:214)
>     at org.apache.ranger.plugin.util.RangerRESTClient.getClient(RangerRESTClient.java:187)
>     at org.apache.ranger.plugin.util.RangerRESTClient.get(RangerRESTClient.java:448)
>     at org.apache.ranger.admin.client.RangerAdminRESTClient$4.run(RangerAdminRESTClient.java:228)
>     at org.apache.ranger.admin.client.RangerAdminRESTClient$4.run(RangerAdminRESTClient.java:223)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:360)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1856)
>     at org.apache.ranger.admin.client.RangerAdminRESTClient.getRolesIfUpdated(RangerAdminRESTClient.java:235)
>     at org.apache.ranger.plugin.util.RangerRolesProvider.loadUserGroupRolesFromAdmin(RangerRolesProvider.java:183)
>     at org.apache.ranger.plugin.util.RangerRolesProvider.loadUserGroupRoles(RangerRolesProvider.java:123)
>     at org.apache.ranger.plugin.util.PolicyRefresher.loadRoles(PolicyRefresher.java:493)
>     at org.apache.ranger.plugin.util.PolicyRefresher.startRefresher(PolicyRefresher.java:143)
>     at org.apache.ranger.plugin.service.RangerBasePlugin.init(RangerBasePlugin.java:182)
>     at org.apache.impala.authorization.ranger.RangerImpalaPlugin.getInstance(RangerImpalaPlugin.java:53)
>     at org.apache.impala.authorization.ranger.RangerAuthorizationChecker.<init>(RangerAuthorizationChecker.java:80)
>     at org.apache.impala.authorization.ranger.RangerAuthorizationFactory.<init>(RangerAuthorizationFactory.java:44)
>     ... 6 more
> Caused by: java.lang.RuntimeException: java.lang.ClassNotFoundException: org.glassfish.jersey.internal.RuntimeDelegateImpl
>     at javax.ws.rs.ext.RuntimeDelegate.findDelegate(RuntimeDelegate.java:130)
>     at javax.ws.rs.ext.RuntimeDelegate.getInstance(RuntimeDelegate.java:97)
> at 

[jira] [Assigned] (IMPALA-9848) Coordinator unnecessarily invalidating locally cached table metadata

2020-07-09 Thread Anurag Mantripragada (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-9848?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anurag Mantripragada reassigned IMPALA-9848:


Assignee: Anurag Mantripragada

> Coordinator unnecessarily invalidating locally cached table metadata
> 
>
> Key: IMPALA-9848
> URL: https://issues.apache.org/jira/browse/IMPALA-9848
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Catalog, Frontend
>Reporter: Sahil Takiar
>Assignee: Anurag Mantripragada
>Priority: Major
> Attachments: IMPALA-9848-catalogd.INFO, IMPALA-9848-impalad.INFO
>
>
> The following fails when run locally on master:
> {code:java}
> ./bin/start-impala-cluster.py --catalogd_args='--catalog_topic_mode=minimal' --impalad_args='--use_local_catalog'
> ./bin/impala-shell.sh
> [localhost:21000] default> select count(l_comment) from tpch.lineitem; <--- THIS WORKS
> # kill the catalogd process
> [localhost:21000] default> select count(l_comment) from tpch.lineitem; <--- THIS FAILS
> ERROR: AnalysisException: Failed to load metadata for table: 'tpch.lineitem'
> CAUSED BY: TableLoadingException: Could not load table tpch.lineitem from catalog
> CAUSED BY: TException: org.apache.impala.common.InternalException: Couldn't open transport for localhost:26000 (connect() failed: Connection refused)
> CAUSED BY: InternalException: Couldn't open transport for localhost:26000 (connect() failed: Connection refused {code}
> The above experiment works with catalog v1: if you remove the startup 
> flags from the {{./bin/start-impala-cluster.py}} command, everything works.
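
A toy model of the behavior the reporter appears to expect is sketched below in Python. It is not Impala's LocalCatalog implementation; `LocalMetadataCache`, `CatalogUnavailable`, and the `fetch` stub are hypothetical names invented for the illustration. The point is only that a table already cached on the coordinator can be served without contacting the catalog service, so losing that service affects uncached tables alone.

```python
class CatalogUnavailable(Exception):
    """Raised by the hypothetical catalog stub when the service is down."""


class LocalMetadataCache:
    """Caches table metadata and contacts the catalog only on a miss."""

    def __init__(self, fetch_from_catalog):
        self._fetch = fetch_from_catalog  # callable: table name -> metadata
        self._cache = {}

    def get_table(self, name):
        # A cached entry is returned without any call to the catalog.
        if name in self._cache:
            return self._cache[name]
        metadata = self._fetch(name)
        self._cache[name] = metadata
        return metadata


state = {"catalog_up": True}  # stands in for the catalogd process


def fetch(name):
    if not state["catalog_up"]:
        raise CatalogUnavailable(f"could not reach catalog for {name}")
    return {"table": name, "columns": ["l_comment"]}


cache = LocalMetadataCache(fetch)
first = cache.get_table("tpch.lineitem")   # loaded from the catalog stub
state["catalog_up"] = False                # "kill the catalogd process"
second = cache.get_table("tpch.lineitem")  # served from the local cache
```

A real coordinator also has to decide when a cached entry is stale, which this sketch deliberately leaves out.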






[jira] [Commented] (IMPALA-9095) Alter table events generated by renames are not renaming the table to a different DB.

2020-06-14 Thread Anurag Mantripragada (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-9095?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17135358#comment-17135358
 ] 

Anurag Mantripragada commented on IMPALA-9095:
--

Hi [~vihangk1], you are right: this doesn't seem to be caused by IMPALA-9017, 
which is related to ALTER TABLES, whereas this Jira is about ALTER DB renames. I 
will remove the linked JIRA.

> Alter table events generated by renames are not renaming the table to a 
> different DB.
> -
>
> Key: IMPALA-9095
> URL: https://issues.apache.org/jira/browse/IMPALA-9095
> Project: IMPALA
>  Issue Type: Bug
>  Components: Frontend
>Reporter: Anurag Mantripragada
>Assignee: Anurag Mantripragada
>Priority: Critical
> Fix For: Impala 3.4.0
>
>
> The handling of alter table renames was recently refactored. This introduced 
> a bug where a rename to a different database is not applied correctly.
> Steps to reproduce:
> From Hive:
> {code:java}
> create database bug1;
> create table bug1.foo (id int);
> create database bug2;
> alter table bug1.foo rename to bug2.foo;{code}
>  
> From Impala:
> {code:java}
> use bug2;
> show tables;{code}
>  
> Expected foo to show up in bug2, but it doesn't.
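
The expected outcome of the reproduction steps can be modeled with a small Python sketch. The in-memory `catalog` dictionary and the `apply_rename_event` helper are hypothetical and are not Impala's event-processing code; they only show that a cross-database rename should remove the table from the old database and add it to the new one.

```python
def apply_rename_event(catalog, old_db, old_name, new_db, new_name):
    """Apply a table rename to an in-memory catalog (dict: db -> set of tables)."""
    if old_name not in catalog.get(old_db, set()):
        raise KeyError(f"{old_db}.{old_name} not found")
    if new_db not in catalog:
        raise KeyError(f"database {new_db} not found")
    # Drop the table from the source database, then register it under the
    # target database, which may differ from the source.
    catalog[old_db].remove(old_name)
    catalog[new_db].add(new_name)


# Mirrors the Hive statements: bug1.foo is renamed to bug2.foo.
catalog = {"bug1": {"foo"}, "bug2": set()}
apply_rename_event(catalog, "bug1", "foo", "bug2", "foo")
print(sorted(catalog["bug2"]))  # ['foo']
```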






[jira] [Work started] (IMPALA-9980) Remove jersey* jars from exclusions.

2020-07-20 Thread Anurag Mantripragada (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-9980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on IMPALA-9980 started by Anurag Mantripragada.

> Remove jersey* jars from exclusions.
> 
>
> Key: IMPALA-9980
> URL: https://issues.apache.org/jira/browse/IMPALA-9980
> Project: IMPALA
>  Issue Type: Bug
>  Components: Frontend
>Affects Versions: Impala 4.0
>Reporter: Anurag Mantripragada
>Assignee: Anurag Mantripragada
>Priority: Major
>
> IMPALA-9679 removed the jersey* jars from the docker images. 
> These jars are required by the Impala Ranger plugin, so ClassNotFound 
> errors are thrown in the Impala docker container when the Ranger plugin is 
> enabled. Part of the exception is below:
> {code:java}
> I0720 03:41:41.647934 7 jni-util.cc:288] org.apache.impala.common.InternalException: Unable to instantiate authorization provider: org.apache.impala.authorization.ranger.RangerAuthorizationFactory
>     at org.apache.impala.util.AuthorizationUtil.authzFactoryFrom(AuthorizationUtil.java:88)
>     at org.apache.impala.service.JniCatalog.<init>(JniCatalog.java:121)
> Caused by: java.lang.reflect.InvocationTargetException
>     at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>     at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
>     at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>     at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
>     at org.apache.impala.util.AuthorizationUtil.authzFactoryFrom(AuthorizationUtil.java:86)
>     ... 1 more
> Caused by: java.lang.ExceptionInInitializerError
>     at com.sun.jersey.core.spi.factory.MessageBodyFactory.initReaders(MessageBodyFactory.java:182)
>     at com.sun.jersey.core.spi.factory.MessageBodyFactory.initReaders(MessageBodyFactory.java:175)
>     at com.sun.jersey.core.spi.factory.MessageBodyFactory.init(MessageBodyFactory.java:162)
>     at com.sun.jersey.api.client.Client.init(Client.java:343)
>     at com.sun.jersey.api.client.Client.access$000(Client.java:119)
>     at com.sun.jersey.api.client.Client$1.f(Client.java:192)
>     at com.sun.jersey.api.client.Client$1.f(Client.java:188)
>     at com.sun.jersey.spi.inject.Errors.processWithErrors(Errors.java:193)
>     at com.sun.jersey.api.client.Client.<init>(Client.java:188)
>     at com.sun.jersey.api.client.Client.<init>(Client.java:171)
>     at com.sun.jersey.api.client.Client.create(Client.java:683)
>     at org.apache.ranger.plugin.util.RangerRESTClient.buildClient(RangerRESTClient.java:214)
>     at org.apache.ranger.plugin.util.RangerRESTClient.getClient(RangerRESTClient.java:187)
>     at org.apache.ranger.plugin.util.RangerRESTClient.get(RangerRESTClient.java:448)
>     at org.apache.ranger.admin.client.RangerAdminRESTClient$4.run(RangerAdminRESTClient.java:228)
>     at org.apache.ranger.admin.client.RangerAdminRESTClient$4.run(RangerAdminRESTClient.java:223)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:360)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1856)
>     at org.apache.ranger.admin.client.RangerAdminRESTClient.getRolesIfUpdated(RangerAdminRESTClient.java:235)
>     at org.apache.ranger.plugin.util.RangerRolesProvider.loadUserGroupRolesFromAdmin(RangerRolesProvider.java:183)
>     at org.apache.ranger.plugin.util.RangerRolesProvider.loadUserGroupRoles(RangerRolesProvider.java:123)
>     at org.apache.ranger.plugin.util.PolicyRefresher.loadRoles(PolicyRefresher.java:493)
>     at org.apache.ranger.plugin.util.PolicyRefresher.startRefresher(PolicyRefresher.java:143)
>     at org.apache.ranger.plugin.service.RangerBasePlugin.init(RangerBasePlugin.java:182)
>     at org.apache.impala.authorization.ranger.RangerImpalaPlugin.getInstance(RangerImpalaPlugin.java:53)
>     at org.apache.impala.authorization.ranger.RangerAuthorizationChecker.<init>(RangerAuthorizationChecker.java:80)
>     at org.apache.impala.authorization.ranger.RangerAuthorizationFactory.<init>(RangerAuthorizationFactory.java:44)
>     ... 6 more
> Caused by: java.lang.RuntimeException: java.lang.ClassNotFoundException: org.glassfish.jersey.internal.RuntimeDelegateImpl
>     at javax.ws.rs.ext.RuntimeDelegate.findDelegate(RuntimeDelegate.java:130)
>     at javax.ws.rs.ext.RuntimeDelegate.getInstance(RuntimeDelegate.java:97)
>     at javax.ws.rs.core.MediaType.valueOf(MediaType.java:172)
> at 

[jira] [Created] (IMPALA-9980) Remove jersey* jars from exclusions.

2020-07-20 Thread Anurag Mantripragada (Jira)
Anurag Mantripragada created IMPALA-9980:


 Summary: Remove jersey* jars from exclusions.
 Key: IMPALA-9980
 URL: https://issues.apache.org/jira/browse/IMPALA-9980
 Project: IMPALA
  Issue Type: Bug
  Components: Frontend
Affects Versions: Impala 4.0
Reporter: Anurag Mantripragada
Assignee: Anurag Mantripragada


IMPALA-9679 removed the jersey* jars from the docker images. These jars are 
required by the Impala Ranger plugin, so ClassNotFound errors are thrown in the 
Impala docker container when the Ranger plugin is enabled. Part of the 
exception is below:
{code:java}
I0720 03:41:41.647934 7 jni-util.cc:288] org.apache.impala.common.InternalException: Unable to instantiate authorization provider: org.apache.impala.authorization.ranger.RangerAuthorizationFactory
    at org.apache.impala.util.AuthorizationUtil.authzFactoryFrom(AuthorizationUtil.java:88)
    at org.apache.impala.service.JniCatalog.<init>(JniCatalog.java:121)
Caused by: java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
    at org.apache.impala.util.AuthorizationUtil.authzFactoryFrom(AuthorizationUtil.java:86)
    ... 1 more
Caused by: java.lang.ExceptionInInitializerError
    at com.sun.jersey.core.spi.factory.MessageBodyFactory.initReaders(MessageBodyFactory.java:182)
    at com.sun.jersey.core.spi.factory.MessageBodyFactory.initReaders(MessageBodyFactory.java:175)
    at com.sun.jersey.core.spi.factory.MessageBodyFactory.init(MessageBodyFactory.java:162)
    at com.sun.jersey.api.client.Client.init(Client.java:343)
    at com.sun.jersey.api.client.Client.access$000(Client.java:119)
    at com.sun.jersey.api.client.Client$1.f(Client.java:192)
    at com.sun.jersey.api.client.Client$1.f(Client.java:188)
    at com.sun.jersey.spi.inject.Errors.processWithErrors(Errors.java:193)
    at com.sun.jersey.api.client.Client.<init>(Client.java:188)
    at com.sun.jersey.api.client.Client.<init>(Client.java:171)
    at com.sun.jersey.api.client.Client.create(Client.java:683)
    at org.apache.ranger.plugin.util.RangerRESTClient.buildClient(RangerRESTClient.java:214)
    at org.apache.ranger.plugin.util.RangerRESTClient.getClient(RangerRESTClient.java:187)
    at org.apache.ranger.plugin.util.RangerRESTClient.get(RangerRESTClient.java:448)
    at org.apache.ranger.admin.client.RangerAdminRESTClient$4.run(RangerAdminRESTClient.java:228)
    at org.apache.ranger.admin.client.RangerAdminRESTClient$4.run(RangerAdminRESTClient.java:223)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:360)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1856)
    at org.apache.ranger.admin.client.RangerAdminRESTClient.getRolesIfUpdated(RangerAdminRESTClient.java:235)
    at org.apache.ranger.plugin.util.RangerRolesProvider.loadUserGroupRolesFromAdmin(RangerRolesProvider.java:183)
    at org.apache.ranger.plugin.util.RangerRolesProvider.loadUserGroupRoles(RangerRolesProvider.java:123)
    at org.apache.ranger.plugin.util.PolicyRefresher.loadRoles(PolicyRefresher.java:493)
    at org.apache.ranger.plugin.util.PolicyRefresher.startRefresher(PolicyRefresher.java:143)
    at org.apache.ranger.plugin.service.RangerBasePlugin.init(RangerBasePlugin.java:182)
    at org.apache.impala.authorization.ranger.RangerImpalaPlugin.getInstance(RangerImpalaPlugin.java:53)
    at org.apache.impala.authorization.ranger.RangerAuthorizationChecker.<init>(RangerAuthorizationChecker.java:80)
    at org.apache.impala.authorization.ranger.RangerAuthorizationFactory.<init>(RangerAuthorizationFactory.java:44)
    ... 6 more
Caused by: java.lang.RuntimeException: java.lang.ClassNotFoundException: org.glassfish.jersey.internal.RuntimeDelegateImpl
    at javax.ws.rs.ext.RuntimeDelegate.findDelegate(RuntimeDelegate.java:130)
    at javax.ws.rs.ext.RuntimeDelegate.getInstance(RuntimeDelegate.java:97)
    at javax.ws.rs.core.MediaType.valueOf(MediaType.java:172)
    at com.sun.jersey.core.header.MediaTypes.<clinit>(MediaTypes.java:65)
    ... 34 more
Caused by: java.lang.ClassNotFoundException: org.glassfish.jersey.internal.RuntimeDelegateImpl
    at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:418)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:352)

[jira] [Updated] (IMPALA-9980) Remove jersey* jars from exclusions.

2020-07-20 Thread Anurag Mantripragada (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-9980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anurag Mantripragada updated IMPALA-9980:
-
Priority: Critical  (was: Major)

> Remove jersey* jars from exclusions.
> 
>
> Key: IMPALA-9980
> URL: https://issues.apache.org/jira/browse/IMPALA-9980
> Project: IMPALA
>  Issue Type: Bug
>  Components: Frontend
>Affects Versions: Impala 4.0
>Reporter: Anurag Mantripragada
>Assignee: Anurag Mantripragada
>Priority: Critical
>
> IMPALA-9679 removed the jersey* jars from the docker images. 
> These jars are required by the Impala Ranger plugin, so ClassNotFound 
> errors are thrown in the Impala docker container when the Ranger plugin is 
> enabled. Part of the exception is below:
> {code:java}
> I0720 03:41:41.647934 7 jni-util.cc:288] org.apache.impala.common.InternalException: Unable to instantiate authorization provider: org.apache.impala.authorization.ranger.RangerAuthorizationFactory
>     at org.apache.impala.util.AuthorizationUtil.authzFactoryFrom(AuthorizationUtil.java:88)
>     at org.apache.impala.service.JniCatalog.<init>(JniCatalog.java:121)
> Caused by: java.lang.reflect.InvocationTargetException
>     at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>     at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
>     at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>     at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
>     at org.apache.impala.util.AuthorizationUtil.authzFactoryFrom(AuthorizationUtil.java:86)
>     ... 1 more
> Caused by: java.lang.ExceptionInInitializerError
>     at com.sun.jersey.core.spi.factory.MessageBodyFactory.initReaders(MessageBodyFactory.java:182)
>     at com.sun.jersey.core.spi.factory.MessageBodyFactory.initReaders(MessageBodyFactory.java:175)
>     at com.sun.jersey.core.spi.factory.MessageBodyFactory.init(MessageBodyFactory.java:162)
>     at com.sun.jersey.api.client.Client.init(Client.java:343)
>     at com.sun.jersey.api.client.Client.access$000(Client.java:119)
>     at com.sun.jersey.api.client.Client$1.f(Client.java:192)
>     at com.sun.jersey.api.client.Client$1.f(Client.java:188)
>     at com.sun.jersey.spi.inject.Errors.processWithErrors(Errors.java:193)
>     at com.sun.jersey.api.client.Client.<init>(Client.java:188)
>     at com.sun.jersey.api.client.Client.<init>(Client.java:171)
>     at com.sun.jersey.api.client.Client.create(Client.java:683)
>     at org.apache.ranger.plugin.util.RangerRESTClient.buildClient(RangerRESTClient.java:214)
>     at org.apache.ranger.plugin.util.RangerRESTClient.getClient(RangerRESTClient.java:187)
>     at org.apache.ranger.plugin.util.RangerRESTClient.get(RangerRESTClient.java:448)
>     at org.apache.ranger.admin.client.RangerAdminRESTClient$4.run(RangerAdminRESTClient.java:228)
>     at org.apache.ranger.admin.client.RangerAdminRESTClient$4.run(RangerAdminRESTClient.java:223)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:360)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1856)
>     at org.apache.ranger.admin.client.RangerAdminRESTClient.getRolesIfUpdated(RangerAdminRESTClient.java:235)
>     at org.apache.ranger.plugin.util.RangerRolesProvider.loadUserGroupRolesFromAdmin(RangerRolesProvider.java:183)
>     at org.apache.ranger.plugin.util.RangerRolesProvider.loadUserGroupRoles(RangerRolesProvider.java:123)
>     at org.apache.ranger.plugin.util.PolicyRefresher.loadRoles(PolicyRefresher.java:493)
>     at org.apache.ranger.plugin.util.PolicyRefresher.startRefresher(PolicyRefresher.java:143)
>     at org.apache.ranger.plugin.service.RangerBasePlugin.init(RangerBasePlugin.java:182)
>     at org.apache.impala.authorization.ranger.RangerImpalaPlugin.getInstance(RangerImpalaPlugin.java:53)
>     at org.apache.impala.authorization.ranger.RangerAuthorizationChecker.<init>(RangerAuthorizationChecker.java:80)
>     at org.apache.impala.authorization.ranger.RangerAuthorizationFactory.<init>(RangerAuthorizationFactory.java:44)
>     ... 6 more
> Caused by: java.lang.RuntimeException: java.lang.ClassNotFoundException: org.glassfish.jersey.internal.RuntimeDelegateImpl
>     at javax.ws.rs.ext.RuntimeDelegate.findDelegate(RuntimeDelegate.java:130)
>     at javax.ws.rs.ext.RuntimeDelegate.getInstance(RuntimeDelegate.java:97)
>     at javax.ws.rs.core.MediaType.valueOf(MediaType.java:172)
> at 

[jira] [Assigned] (IMPALA-9848) Coordinator unnecessarily invalidating locally cached table metadata

2020-07-22 Thread Anurag Mantripragada (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-9848?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anurag Mantripragada reassigned IMPALA-9848:


Assignee: (was: Anurag Mantripragada)

> Coordinator unnecessarily invalidating locally cached table metadata
> 
>
> Key: IMPALA-9848
> URL: https://issues.apache.org/jira/browse/IMPALA-9848
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Catalog, Frontend
>Reporter: Sahil Takiar
>Priority: Major
> Attachments: IMPALA-9848-catalogd.INFO, IMPALA-9848-impalad.INFO
>
>
> The following fails when run locally on master:
> {code:java}
> ./bin/start-impala-cluster.py --catalogd_args='--catalog_topic_mode=minimal' 
> --impalad_args='--use_local_catalog'
> ./bin/impala-shell.sh
> [localhost:21000] default> select count(l_comment) from tpch.lineitem; <--- 
> THIS WORKS
> # kill the catalogd process
> [localhost:21000] default> select count(l_comment) from tpch.lineitem; <--- 
> THIS FAILS
> ERROR: AnalysisException: Failed to load metadata for table: 'tpch.lineitem'
> CAUSED BY: TableLoadingException: Could not load table tpch.lineitem from 
> catalog
> CAUSED BY: TException: org.apache.impala.common.InternalException: Couldn't 
> open transport for localhost:26000 (connect() failed: Connection refused)
> CAUSED BY: InternalException: Couldn't open transport for localhost:26000 
> (connect() failed: Connection refused {code}
> The above experiment works with catalog v1, e.g. if you remove the startup 
> flags from the {{./bin/start-impala-cluster.py}} command, everything works.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Assigned] (IMPALA-7507) Clean up user-facing error messages in LocalCatalog mode

2020-07-23 Thread Anurag Mantripragada (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7507?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anurag Mantripragada reassigned IMPALA-7507:


Assignee: (was: Anurag Mantripragada)

> Clean up user-facing error messages in LocalCatalog mode
> 
>
> Key: IMPALA-7507
> URL: https://issues.apache.org/jira/browse/IMPALA-7507
> Project: IMPALA
>  Issue Type: Sub-task
>  Components: Catalog
>Affects Versions: Impala 3.1.0
>Reporter: Todd Lipcon
>Priority: Major
>  Labels: catalog-v2
>
> Currently even normal error messages for things like missing databases are 
> quite ugly when running with LocalCatalog:
> {code}
> ERROR: LocalCatalogException: Could not load table names for database 
> 'test_minimal_topic_updates_b246004e' from HMS
> CAUSED BY: TException: 
> TGetPartialCatalogObjectResponse(status:TStatus(status_code:GENERAL, 
> error_msgs:[CatalogException: Database not found: 
> test_minimal_topic_updates_b246004e]))
> {code}






[jira] [Assigned] (IMPALA-8291) 'DESCRIBE EXTENDED ..' does not display constraint information

2020-07-23 Thread Anurag Mantripragada (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8291?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anurag Mantripragada reassigned IMPALA-8291:


Assignee: (was: Anurag Mantripragada)

> 'DESCRIBE EXTENDED ..' does not display constraint information
> --
>
> Key: IMPALA-8291
> URL: https://issues.apache.org/jira/browse/IMPALA-8291
> Project: IMPALA
>  Issue Type: Sub-task
>  Components: Frontend
>Reporter: Anurag Mantripragada
>Priority: Major
>
> Currently, the DESCRIBE EXTENDED table_name command does not display constraint 
> information, such as primary key / foreign key information, for tables created 
> through Hive.
> This work must also be extended to tables created through Impala once we have 
> support for PK/FK in the CREATE TABLE syntax.
>  






[jira] [Assigned] (IMPALA-3531) Implement deferrable and optionally enforced PK/FK constraints

2020-07-23 Thread Anurag Mantripragada (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-3531?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anurag Mantripragada reassigned IMPALA-3531:


Assignee: (was: Anurag Mantripragada)

> Implement deferrable and optionally enforced PK/FK constraints
> --
>
> Key: IMPALA-3531
> URL: https://issues.apache.org/jira/browse/IMPALA-3531
> Project: IMPALA
>  Issue Type: New Feature
>  Components: Catalog, Frontend, Perf Investigation
>Affects Versions: Impala 2.5.0, Impala 2.6.0
> Environment: CDH
>Reporter: Ruslan Dautkhanov
>Priority: Major
>  Labels: CBO, performance, ramp-up, sql-language
>
> Oracle has a "RELY NOVALIDATE" option for constraints. It could be easier for 
> Hive to start with something like that for PK/FK constraints, so that the CBO has 
> more information for optimizations. It does not have to actually check whether 
> the constraint relationship is true; it can just "rely" on that constraint.
> https://docs.oracle.com/database/121/SQLRF/clauses002.htm#sthref2289
> So it would be helpful with join cardinality estimates, and with cases like 
> IMPALA-2929.
> https://docs.oracle.com/database/121/DWHSG/schemas.htm#DWHSG9053
> "Overview of Constraint States":
> - Enforcement
> - Validation
> - Belief
> So FK/PK with "rely novalidate" will have Enforcement disabled but 
> Belief = RELY, as is possible in Oracle and now in Hive (HIVE-13076).
> It opens up many additional ways to optimize execution plans.
> As explained in Tom Kyte's "Metadata matters":
> http://www.peoug.org/wp-content/uploads/2009/12/MetadataMatters_PEOUG_Day2009_TKyte.pdf
> pp.30 - "Tell us how the tables relate and we can remove them from the 
> plan...".
> pp.35 - "Tell us how the tables relate and we have more access paths 
> available...".
> It might also be helpful when Impala is integrated with Kudu, as the 
> latter requires a PK.






[jira] [Assigned] (IMPALA-8592) Add support for insert events for 'LOAD DATA..' statements from Impala.

2020-07-23 Thread Anurag Mantripragada (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8592?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anurag Mantripragada reassigned IMPALA-8592:


Assignee: (was: Anurag Mantripragada)

> Add support for insert events for 'LOAD DATA..' statements from Impala.
> ---
>
> Key: IMPALA-8592
> URL: https://issues.apache.org/jira/browse/IMPALA-8592
> Project: IMPALA
>  Issue Type: Sub-task
>Reporter: Anurag Mantripragada
>Priority: Major
>
> Hive generates INSERT events for LOAD DATA.. statements. We should support 
> the same in Impala.






[jira] [Assigned] (IMPALA-8795) Enable event polling by default in tests

2020-07-23 Thread Anurag Mantripragada (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8795?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anurag Mantripragada reassigned IMPALA-8795:


Assignee: (was: Anurag Mantripragada)

> Enable event polling by default in tests
> 
>
> Key: IMPALA-8795
> URL: https://issues.apache.org/jira/browse/IMPALA-8795
> Project: IMPALA
>  Issue Type: Sub-task
>Reporter: Vihang Karajgaonkar
>Priority: Major
>
> We should turn on event processing by default in all the tests to make sure 
> that there are no regressions when we turn ON the feature by default.






[jira] [Assigned] (IMPALA-9433) Change FileHandleCache from using a multimap to an unordered_map

2020-07-23 Thread Anurag Mantripragada (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-9433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anurag Mantripragada reassigned IMPALA-9433:


Assignee: (was: Anurag Mantripragada)

> Change FileHandleCache from using a multimap to an unordered_map
> 
>
> Key: IMPALA-9433
> URL: https://issues.apache.org/jira/browse/IMPALA-9433
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Backend
>Affects Versions: Impala 3.4.0
>Reporter: Joe McDonnell
>Priority: Minor
>  Labels: ramp-up
>
> The file handle cache can contain multiple file handles per filename. 
> Currently it uses a std::multimap, where the file handles for each filename 
> are a contiguous set of entries. A lookup will find the beginning of that 
> range and then iterate through it to find a free one.
> A multimap is implemented as a red-black tree with O(log(N)) lookup, so we 
> should be able to improve this by using a hashtable-based structure such as 
> unordered_map/unordered_multimap with O(1) lookup.
> Another optimization would be to add an intermediary structure for each 
> filename and hold all the file handles for that file name in a linked list. 
> Lookup would find this intermediary structure by looking up the filename, 
> then it would iterate. In the current method, the key/value pair for each 
> file handle must store a copy of the filename string as the key, even for 
> duplicates. With the intermediary structure, it would store the filename once 
> per unique filename.
> It also looks like the LRU list would benefit from being a Boost intrusive 
> list ([https://www.boost.org/doc/libs/1_64_0/doc/html/intrusive.html]). Every 
> file handle is always in the LRU list, so a std::list has a higher memory 
> overhead and requires more memory accesses. It also complicates the code, 
> because the FileHandleEntry needs to store a LruListType::iterator to its 
> location in the LRU list.
> These optimizations are low priority, but they provide good ramp-up for some 
> C++ concepts/APIs.
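
The intermediary-structure idea described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not Impala's actual FileHandleCache: the names FileHandle, FileHandleCacheSketch, Acquire, Insert and Release are hypothetical, the LRU list and eviction are omitted, and a plain std::list stands in for the per-filename linked list.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <list>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for a cached file handle; not Impala's real type.
struct FileHandle {
  int64_t id;
  bool in_use;
};

// Hypothetical sketch: one bucket per unique filename, located through a
// hash table (O(1) on average), holding every handle opened for that file.
// The filename string is stored once per bucket rather than once per handle.
class FileHandleCacheSketch {
 public:
  // Returns a free handle for 'filename' and marks it in use, or nullptr if
  // no free handle is cached for that file.
  FileHandle* Acquire(const std::string& filename) {
    auto it = buckets_.find(filename);
    if (it == buckets_.end()) return nullptr;
    for (FileHandle& fh : it->second) {
      if (!fh.in_use) {
        fh.in_use = true;
        return &fh;
      }
    }
    return nullptr;
  }

  // Adds a new handle for 'filename'; the handle starts out in use.
  // Pointers stay valid because std::list and std::unordered_map do not
  // relocate existing elements on insertion.
  FileHandle* Insert(const std::string& filename, int64_t id) {
    std::list<FileHandle>& handles = buckets_[filename];
    handles.push_back(FileHandle{id, true});
    return &handles.back();
  }

  // Marks a handle as free so that a later Acquire() can return it.
  void Release(FileHandle* fh) { fh->in_use = false; }

  // Number of unique filenames currently cached.
  std::size_t NumFiles() const { return buckets_.size(); }

 private:
  std::unordered_map<std::string, std::list<FileHandle>> buckets_;
};
```

With this layout a lookup hashes the filename once and then walks only the handles belonging to that file. Whether this is a measurable win over the multimap depends on how many handles per file are typically cached, which the issue does not state.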






[jira] [Assigned] (IMPALA-8208) Support ALTER commands for Rely/Norely novalidate for primary key/foreign key

2020-07-23 Thread Anurag Mantripragada (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8208?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anurag Mantripragada reassigned IMPALA-8208:


Assignee: (was: Anurag Mantripragada)

> Support ALTER commands for Rely/Norely novalidate for primary key/foreign key
> -
>
> Key: IMPALA-8208
> URL: https://issues.apache.org/jira/browse/IMPALA-8208
> Project: IMPALA
>  Issue Type: Sub-task
>  Components: Catalog, Frontend
>Reporter: Anurag Mantripragada
>Priority: Major
>
> Support for ALTER commands such as:
>  * {{ALTER TABLE T1 ADD CONSTRAINT pk2 primary key (a) disable novalidate;}}
>  * {{ALTER TABLE T2 ADD CONSTRAINT fk1 FOREIGN KEY ( x ) REFERENCES T1(a) 
> DISABLE NOVALIDATE RELY;}}
>  * {{ALTER TABLE T3 ADD CONSTRAINT fk4 FOREIGN KEY ( y ) REFERENCES T1(a) 
> DISABLE NOVALIDATE;}}
>  * {{ALTER TABLE sales DROP CONSTRAINT pk1;}}
>  * {{ALTER TABLE product DROP CONSTRAINT fk1;}}   






[jira] [Assigned] (IMPALA-6141) Re-enable support for BOOLEAN partitions

2020-07-23 Thread Anurag Mantripragada (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-6141?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anurag Mantripragada reassigned IMPALA-6141:


Assignee: (was: Anurag Mantripragada)

> Re-enable support for BOOLEAN partitions
> 
>
> Key: IMPALA-6141
> URL: https://issues.apache.org/jira/browse/IMPALA-6141
> Project: IMPALA
>  Issue Type: New Feature
>  Components: Catalog, Frontend
>Reporter: Zach Amsden
>Priority: Trivial
>
> Since HIVE-6590 has been fixed upstream in Hive 2.1.1, Impala should be able 
> to re-enable support for boolean partition columns. There may be some 
> additional work involved besides just backing out the analysis check in 
> InsertStmt.java, as this alone did not restore support.






[jira] [Assigned] (IMPALA-7910) COMPUTE STATS does an unnecessary REFRESH after writing to the Metastore

2020-07-23 Thread Anurag Mantripragada (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7910?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anurag Mantripragada reassigned IMPALA-7910:


Assignee: (was: Anurag Mantripragada)

> COMPUTE STATS does an unnecessary REFRESH after writing to the Metastore
> 
>
> Key: IMPALA-7910
> URL: https://issues.apache.org/jira/browse/IMPALA-7910
> Project: IMPALA
>  Issue Type: Bug
>  Components: Catalog
>Affects Versions: Impala 2.9.0, Impala 2.11.0, Impala 2.12.0
>Reporter: Michael Brown
>Priority: Major
>
> COMPUTE STATS and possibly other DDL operations unnecessarily do the 
> equivalent of a REFRESH after writing to the Hive Metastore. This unnecessary 
> operation can be very expensive, so it should be avoided.
> The behavior can be confirmed from the catalogd logs:
> {code}
> compute stats functional_parquet.alltypes;
> +---+
> | summary   |
> +---+
> | Updated 24 partition(s) and 11 column(s). |
> +---+
> Relevant catalogd.INFO snippet
> I0413 14:40:24.210749 27295 HdfsTable.java:1263] Incrementally loading table 
> metadata for: functional_parquet.alltypes
> I0413 14:40:24.242122 27295 HdfsTable.java:555] Refreshed file metadata for 
> functional_parquet.alltypes Path: 
> hdfs://localhost:20500/test-warehouse/alltypes_parquet/year=2009/month=1: 
> Loaded files: 1 Hidden files: 0 Skipped files: 0 Unknown diskIDs: 0
> I0413 14:40:24.244634 27295 HdfsTable.java:555] Refreshed file metadata for 
> functional_parquet.alltypes Path: 
> hdfs://localhost:20500/test-warehouse/alltypes_parquet/year=2009/month=10: 
> Loaded files: 1 Hidden files: 0 Skipped files: 0 Unknown diskIDs: 0
> I0413 14:40:24.247174 27295 HdfsTable.java:555] Refreshed file metadata for 
> functional_parquet.alltypes Path: 
> hdfs://localhost:20500/test-warehouse/alltypes_parquet/year=2009/month=11: 
> Loaded files: 1 Hidden files: 0 Skipped files: 0 Unknown diskIDs: 0
> I0413 14:40:24.249713 27295 HdfsTable.java:555] Refreshed file metadata for 
> functional_parquet.alltypes Path: 
> hdfs://localhost:20500/test-warehouse/alltypes_parquet/year=2009/month=12: 
> Loaded files: 1 Hidden files: 0 Skipped files: 0 Unknown diskIDs: 0
> I0413 14:40:24.252288 27295 HdfsTable.java:555] Refreshed file metadata for 
> functional_parquet.alltypes Path: 
> hdfs://localhost:20500/test-warehouse/alltypes_parquet/year=2009/month=2: 
> Loaded files: 1 Hidden files: 0 Skipped files: 0 Unknown diskIDs: 0
> I0413 14:40:24.254629 27295 HdfsTable.java:555] Refreshed file metadata for 
> functional_parquet.alltypes Path: 
> hdfs://localhost:20500/test-warehouse/alltypes_parquet/year=2009/month=3: 
> Loaded files: 1 Hidden files: 0 Skipped files: 0 Unknown diskIDs: 0
> I0413 14:40:24.256991 27295 HdfsTable.java:555] Refreshed file metadata for 
> functional_parquet.alltypes Path: 
> hdfs://localhost:20500/test-warehouse/alltypes_parquet/year=2009/month=4: 
> Loaded files: 1 Hidden files: 0 Skipped files: 0 Unknown diskIDs: 0
> I0413 14:40:24.259464 27295 HdfsTable.java:555] Refreshed file metadata for 
> functional_parquet.alltypes Path: 
> hdfs://localhost:20500/test-warehouse/alltypes_parquet/year=2009/month=5: 
> Loaded files: 1 Hidden files: 0 Skipped files: 0 Unknown diskIDs: 0
> I0413 14:40:24.262197 27295 HdfsTable.java:555] Refreshed file metadata for 
> functional_parquet.alltypes Path: 
> hdfs://localhost:20500/test-warehouse/alltypes_parquet/year=2009/month=6: 
> Loaded files: 1 Hidden files: 0 Skipped files: 0 Unknown diskIDs: 0
> I0413 14:40:24.264463 27295 HdfsTable.java:555] Refreshed file metadata for 
> functional_parquet.alltypes Path: 
> hdfs://localhost:20500/test-warehouse/alltypes_parquet/year=2009/month=7: 
> Loaded files: 1 Hidden files: 0 Skipped files: 0 Unknown diskIDs: 0
> I0413 14:40:24.266736 27295 HdfsTable.java:555] Refreshed file metadata for 
> functional_parquet.alltypes Path: 
> hdfs://localhost:20500/test-warehouse/alltypes_parquet/year=2009/month=8: 
> Loaded files: 1 Hidden files: 0 Skipped files: 0 Unknown diskIDs: 0
> I0413 14:40:24.269210 27295 HdfsTable.java:555] Refreshed file metadata for 
> functional_parquet.alltypes Path: 
> hdfs://localhost:20500/test-warehouse/alltypes_parquet/year=2009/month=9: 
> Loaded files: 1 Hidden files: 0 Skipped files: 0 Unknown diskIDs: 0
> I0413 14:40:24.271800 27295 HdfsTable.java:555] Refreshed file metadata for 
> functional_parquet.alltypes Path: 
> hdfs://localhost:20500/test-warehouse/alltypes_parquet/year=2010/month=1: 
> Loaded files: 1 Hidden files: 0 Skipped files: 0 Unknown diskIDs: 0
> I0413 14:40:24.274348 27295 HdfsTable.java:555] Refreshed file metadata for 
> functional_parquet.alltypes 

[jira] [Commented] (IMPALA-9433) Change FileHandleCache from using a multimap to an unordered_map

2020-07-23 Thread Anurag Mantripragada (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-9433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17164028#comment-17164028
 ] 

Anurag Mantripragada commented on IMPALA-9433:
--

I've done some work on this. Unfortunately, I cannot continue working on this.

I've posted a WIP code review -  [https://gerrit.cloudera.org/#/c/15699/]

> Change FileHandleCache from using a multimap to an unordered_map
> 
>
> Key: IMPALA-9433
> URL: https://issues.apache.org/jira/browse/IMPALA-9433
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Backend
>Affects Versions: Impala 3.4.0
>Reporter: Joe McDonnell
>Assignee: Anurag Mantripragada
>Priority: Minor
>  Labels: ramp-up
>
> The file handle cache can contain multiple file handles per filename. 
> Currently it uses a std::multimap, where the file handles for each filename 
> are a contiguous set of entries. A lookup will find the beginning of that 
> range and then iterate through it to find a free one.
> A multimap is implemented as a red-black tree with O(log(N)) lookup, so we 
> should be able to improve this by using a hashtable-based structure such as 
> unordered_map/unordered_multimap with O(1) lookup.
> Another optimization would be to add an intermediary structure for each 
> filename and hold all the file handles for that file name in a linked list. 
> Lookup would find this intermediary structure by looking up the filename, 
> then it would iterate. In the current method, the key/value pair for each 
> file handle must store a copy of the filename string as the key, even for 
> duplicates. With the intermediary structure, it would store the filename once 
> per unique filename.
> It also looks like the LRU list would benefit from being a Boost intrusive 
> list ([https://www.boost.org/doc/libs/1_64_0/doc/html/intrusive.html]). Every 
> file handle is always in the LRU list, so a std::list has a higher memory 
> overhead and requires more memory accesses. It also complicates the code, 
> because the FileHandleEntry needs to store a LruListType::iterator to its 
> location in the LRU list.
> These optimizations are low priority, but they provide good ramp-up for some 
> C++ concepts/APIs.





