[jira] [Updated] (IMPALA-12968) Early EndDataStream RPC could be responded earlier
[ https://issues.apache.org/jira/browse/IMPALA-12968?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Csaba Ringhofer updated IMPALA-12968: - Description: When a producer fragment sends no rows and finishes before the receiver is initialized the EndDataStream rpc is stored as early sender and is responded when the receiver is registered. [https://github.com/apache/impala/blob/effc9df933b46eb5b0acf55a858606415425505f/be/src/runtime/krpc-data-stream-mgr.cc#L150] While it is important to store the information that the EOS has happened to unregister the sender from the receiver, the RPC itself could be responded right after it was stored in the early sender map. was: When a producer fragment sends no rows and finishes before the receiver is initialized te e EndDataStream rpc is stored as early sender and is responded when the receiver is registered. [https://github.com/apache/impala/blob/effc9df933b46eb5b0acf55a858606415425505f/be/src/runtime/krpc-data-stream-mgr.cc#L150] While it is important to store the information that the EOS has happened to unregister the sender from the receiver, the RPC itself could be responded right after it was stored in the early sender map. > Early EndDataStream RPC could be responded earlier > -- > > Key: IMPALA-12968 > URL: https://issues.apache.org/jira/browse/IMPALA-12968 > Project: IMPALA > Issue Type: Improvement > Components: Backend >Reporter: Csaba Ringhofer >Priority: Minor > Labels: krpc > > When a producer fragment sends no rows and finishes before the receiver is > initialized the EndDataStream rpc is stored as early sender and is responded > when the receiver is registered. > [https://github.com/apache/impala/blob/effc9df933b46eb5b0acf55a858606415425505f/be/src/runtime/krpc-data-stream-mgr.cc#L150] > While it is important to store the information that the EOS has happened to > unregister the sender from the receiver, the RPC itself could be responded > right after it was stored in the early sender map. 
-- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
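The behaviour proposed in IMPALA-12968 can be illustrated with a small sketch. This is not Impala's C++ code: the class `EarlyEosSketch`, its methods, and the `respond` callback are hypothetical names chosen for illustration. It only assumes what the description states, namely that an early EndDataStream must be remembered so the sender can be unregistered later, while the RPC response itself does not have to wait for the receiver.

```python
class EarlyEosSketch:
    """Toy model of an early-sender map (hypothetical, not the Impala API)."""

    def __init__(self):
        self.receivers = {}   # receiver key -> set of sender ids still open
        self.early_eos = {}   # receiver key -> sender ids that sent EOS early

    def end_data_stream(self, key, sender_id, respond):
        if key in self.receivers:
            # Receiver already registered: unregister the sender directly.
            self.receivers[key].discard(sender_id)
        else:
            # Receiver not registered yet: remember that EOS happened.
            self.early_eos.setdefault(key, set()).add(sender_id)
        # Proposed change: respond right after storing the EOS,
        # instead of holding the RPC until the receiver registers.
        respond(sender_id)

    def register_receiver(self, key, sender_ids):
        # Senders whose early EOS was recorded are unregistered immediately.
        open_senders = set(sender_ids) - self.early_eos.pop(key, set())
        self.receivers[key] = open_senders
        return open_senders
```

In this sketch the stored EOS is still applied when the receiver registers, so the bookkeeping is unchanged; only the moment of the response moves earlier.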
[jira] [Created] (IMPALA-12968) Early EndDataStream RPC could be responded earlier
Csaba Ringhofer created IMPALA-12968: Summary: Early EndDataStream RPC could be responded earlier Key: IMPALA-12968 URL: https://issues.apache.org/jira/browse/IMPALA-12968 Project: IMPALA Issue Type: Improvement Components: Backend Reporter: Csaba Ringhofer When a producer fragment sends no rows and finishes before the receiver is initialized the EndDataStream rpc is stored as early sender and is responded when the receiver is registered. [https://github.com/apache/impala/blob/effc9df933b46eb5b0acf55a858606415425505f/be/src/runtime/krpc-data-stream-mgr.cc#L150] While it is important to store the information that the EOS has happened to unregister the sender from the receiver, the RPC itself could be responded right after it was stored in the early sender map.
[jira] [Resolved] (IMPALA-12782) Show info of the event processing in /events webUI
[ https://issues.apache.org/jira/browse/IMPALA-12782?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Quanlong Huang resolved IMPALA-12782. - Fix Version/s: Impala 4.4.0 Resolution: Fixed > Show info of the event processing in /events webUI > -- > > Key: IMPALA-12782 > URL: https://issues.apache.org/jira/browse/IMPALA-12782 > Project: IMPALA > Issue Type: Improvement > Components: Catalog >Reporter: Quanlong Huang >Assignee: Quanlong Huang >Priority: Major > Labels: observability > Fix For: Impala 4.4.0 > > Attachments: event-processor-info.png > > > Currently, we just show some metrics of the event-processor in the /events > page. It'd be helpful to show more info, e.g. lag time, what is the current > event batch/event that's being processed, etc.
[jira] [Commented] (IMPALA-12967) Testcase fails at test_migrated_table_field_id_resolution due to "Table does not exist"
[ https://issues.apache.org/jira/browse/IMPALA-12967?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17833320#comment-17833320 ] Yida Wu commented on IMPALA-12967: -- Hi [~daniel.becker], this Jira is assigned to you because the failure started happening after IMPALA-12809, and it is an Iceberg-related test case. Please feel free to reassign if it is not related. > Testcase fails at test_migrated_table_field_id_resolution due to "Table does > not exist" > --- > > Key: IMPALA-12967 > URL: https://issues.apache.org/jira/browse/IMPALA-12967 > Project: IMPALA > Issue Type: Bug > Components: Backend >Reporter: Yida Wu >Assignee: Daniel Becker >Priority: Major > Labels: broken-build > > Testcase test_migrated_table_field_id_resolution fails at exhaustive release > build with following messages: > *Regression* > {code:java} > query_test.test_iceberg.TestIcebergTable.test_migrated_table_field_id_resolution[protocol: > beeswax | exec_option: {'test_replan': 1, 'batch_size': 0, 'num_nodes': 0, > 'disable_codegen_rows_threshold': 0, 'disable_codegen': True, > 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | table_format: > parquet/none] (from pytest) > {code} > *Error Message* > {code:java} > query_test/test_iceberg.py:266: in test_migrated_table_field_id_resolution > "iceberg_migrated_alter_test_orc", "orc") common/file_utils.py:68: in > create_iceberg_table_from_directory file_format)) > common/impala_connection.py:215: in execute > fetch_profile_after_close=fetch_profile_after_close) > beeswax/impala_beeswax.py:191: in execute handle = > self.__execute_query(query_string.strip(), user=user) > beeswax/impala_beeswax.py:384: in __execute_query > self.wait_for_finished(handle) beeswax/impala_beeswax.py:405: in > wait_for_finished raise ImpalaBeeswaxException("Query aborted:" + > error_log, None) E ImpalaBeeswaxException: ImpalaBeeswaxException: E > Query aborted:ImpalaRuntimeException: Error making 'createTable' RPC to Hive > Metastore: E 
CAUSED BY: IcebergTableLoadingException: Table does not exist > at location: > hdfs://localhost:20500/test-warehouse/iceberg_migrated_alter_test_orc > Stacktrace > query_test/test_iceberg.py:266: in test_migrated_table_field_id_resolution > "iceberg_migrated_alter_test_orc", "orc") > common/file_utils.py:68: in create_iceberg_table_from_directory > file_format)) > common/impala_connection.py:215: in execute > fetch_profile_after_close=fetch_profile_after_close) > beeswax/impala_beeswax.py:191: in execute > handle = self.__execute_query(query_string.strip(), user=user) > beeswax/impala_beeswax.py:384: in __execute_query > self.wait_for_finished(handle) > beeswax/impala_beeswax.py:405: in wait_for_finished > raise ImpalaBeeswaxException("Query aborted:" + error_log, None) > E ImpalaBeeswaxException: ImpalaBeeswaxException: > EQuery aborted:ImpalaRuntimeException: Error making 'createTable' RPC to > Hive Metastore: > E CAUSED BY: IcebergTableLoadingException: Table does not exist at > location: > hdfs://localhost:20500/test-warehouse/iceberg_migrated_alter_test_orc > {code} > *Standard Error* > {code:java} > SET > client_identifier=query_test/test_iceberg.py::TestIcebergTable::()::test_migrated_table_field_id_resolution[protocol:beeswax|exec_option:{'test_replan':1;'batch_size':0;'num_nodes':0;'disable_codegen_rows_threshold':0;'disable_codegen':True;'abort_on_error':1;'exec_single_; > SET sync_ddl=False; > -- executing against localhost:21000 > DROP DATABASE IF EXISTS `test_migrated_table_field_id_resolution_b59d79db` > CASCADE; > -- 2024-04-02 00:56:55,137 INFO MainThread: Started query > f34399a8b7cddd67:031a3b96 > SET > client_identifier=query_test/test_iceberg.py::TestIcebergTable::()::test_migrated_table_field_id_resolution[protocol:beeswax|exec_option:{'test_replan':1;'batch_size':0;'num_nodes':0;'disable_codegen_rows_threshold':0;'disable_codegen':True;'abort_on_error':1;'exec_single_; > SET sync_ddl=False; > -- executing against localhost:21000 > CREATE 
DATABASE `test_migrated_table_field_id_resolution_b59d79db`; > -- 2024-04-02 00:56:57,302 INFO MainThread: Started query > 94465af69907eac5:e33f17e0 > -- 2024-04-02 00:56:57,353 INFO MainThread: Created database > "test_migrated_table_field_id_resolution_b59d79db" for test ID > "query_test/test_iceberg.py::TestIcebergTable::()::test_migrated_table_field_id_resolution[protocol: > beeswax | exec_option: {'test_replan': 1, 'batch_size': 0, 'num_nodes': 0, > 'disable_codegen_rows_threshold': 0, 'disable_codegen': True, > 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | table_format: > parquet/none]" > Picked up JAVA_TOOL_OPTI
[jira] [Created] (IMPALA-12967) Testcase fails at test_migrated_table_field_id_resolution due to "Table does not exist"
Yida Wu created IMPALA-12967: Summary: Testcase fails at test_migrated_table_field_id_resolution due to "Table does not exist" Key: IMPALA-12967 URL: https://issues.apache.org/jira/browse/IMPALA-12967 Project: IMPALA Issue Type: Bug Components: Backend Reporter: Yida Wu Assignee: Daniel Becker Testcase test_migrated_table_field_id_resolution fails at exhaustive release build with following messages: *Regression* {code:java} query_test.test_iceberg.TestIcebergTable.test_migrated_table_field_id_resolution[protocol: beeswax | exec_option: {'test_replan': 1, 'batch_size': 0, 'num_nodes': 0, 'disable_codegen_rows_threshold': 0, 'disable_codegen': True, 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | table_format: parquet/none] (from pytest) {code} *Error Message* {code:java} query_test/test_iceberg.py:266: in test_migrated_table_field_id_resolution "iceberg_migrated_alter_test_orc", "orc") common/file_utils.py:68: in create_iceberg_table_from_directory file_format)) common/impala_connection.py:215: in execute fetch_profile_after_close=fetch_profile_after_close) beeswax/impala_beeswax.py:191: in execute handle = self.__execute_query(query_string.strip(), user=user) beeswax/impala_beeswax.py:384: in __execute_query self.wait_for_finished(handle) beeswax/impala_beeswax.py:405: in wait_for_finished raise ImpalaBeeswaxException("Query aborted:" + error_log, None) E ImpalaBeeswaxException: ImpalaBeeswaxException: EQuery aborted:ImpalaRuntimeException: Error making 'createTable' RPC to Hive Metastore: E CAUSED BY: IcebergTableLoadingException: Table does not exist at location: hdfs://localhost:20500/test-warehouse/iceberg_migrated_alter_test_orc Stacktrace query_test/test_iceberg.py:266: in test_migrated_table_field_id_resolution "iceberg_migrated_alter_test_orc", "orc") common/file_utils.py:68: in create_iceberg_table_from_directory file_format)) common/impala_connection.py:215: in execute fetch_profile_after_close=fetch_profile_after_close) 
beeswax/impala_beeswax.py:191: in execute handle = self.__execute_query(query_string.strip(), user=user) beeswax/impala_beeswax.py:384: in __execute_query self.wait_for_finished(handle) beeswax/impala_beeswax.py:405: in wait_for_finished raise ImpalaBeeswaxException("Query aborted:" + error_log, None) E ImpalaBeeswaxException: ImpalaBeeswaxException: EQuery aborted:ImpalaRuntimeException: Error making 'createTable' RPC to Hive Metastore: E CAUSED BY: IcebergTableLoadingException: Table does not exist at location: hdfs://localhost:20500/test-warehouse/iceberg_migrated_alter_test_orc {code} *Standard Error* {code:java} SET client_identifier=query_test/test_iceberg.py::TestIcebergTable::()::test_migrated_table_field_id_resolution[protocol:beeswax|exec_option:{'test_replan':1;'batch_size':0;'num_nodes':0;'disable_codegen_rows_threshold':0;'disable_codegen':True;'abort_on_error':1;'exec_single_; SET sync_ddl=False; -- executing against localhost:21000 DROP DATABASE IF EXISTS `test_migrated_table_field_id_resolution_b59d79db` CASCADE; -- 2024-04-02 00:56:55,137 INFO MainThread: Started query f34399a8b7cddd67:031a3b96 SET client_identifier=query_test/test_iceberg.py::TestIcebergTable::()::test_migrated_table_field_id_resolution[protocol:beeswax|exec_option:{'test_replan':1;'batch_size':0;'num_nodes':0;'disable_codegen_rows_threshold':0;'disable_codegen':True;'abort_on_error':1;'exec_single_; SET sync_ddl=False; -- executing against localhost:21000 CREATE DATABASE `test_migrated_table_field_id_resolution_b59d79db`; -- 2024-04-02 00:56:57,302 INFO MainThread: Started query 94465af69907eac5:e33f17e0 -- 2024-04-02 00:56:57,353 INFO MainThread: Created database "test_migrated_table_field_id_resolution_b59d79db" for test ID "query_test/test_iceberg.py::TestIcebergTable::()::test_migrated_table_field_id_resolution[protocol: beeswax | exec_option: {'test_replan': 1, 'batch_size': 0, 'num_nodes': 0, 'disable_codegen_rows_threshold': 0, 'disable_codegen': True, 'abort_on_error': 1, 
'exec_single_node_rows_threshold': 0} | table_format: parquet/none]" Picked up JAVA_TOOL_OPTIONS: -javaagent:/data/jenkins/workspace/impala-asf-master-exhaustive-release/repos/Impala/fe/target/dependency/jamm-0.4.0.jar Picked up JAVA_TOOL_OPTIONS: -javaagent:/data/jenkins/workspace/impala-asf-master-exhaustive-release/repos/Impala/fe/target/dependency/jamm-0.4.0.jar -- executing against localhost:21000 create external table test_migrated_table_field_id_resolution_b59d79db.iceberg_migrated_alter_test stored as iceberg location '/test-warehouse/iceberg_migrated_alter_test' tblproperties('write.format.default'='parquet', 'iceberg.catalog'= 'hadoop.tables'); -- 2024-04-02 00:57:01,060 IN
[jira] [Commented] (IMPALA-12782) Show info of the event processing in /events webUI
[ https://issues.apache.org/jira/browse/IMPALA-12782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17833293#comment-17833293 ] ASF subversion and git services commented on IMPALA-12782: -- Commit effc9df933b46eb5b0acf55a858606415425505f in impala's branch refs/heads/master from stiga-huang [ https://gitbox.apache.org/repos/asf?p=impala.git;h=effc9df93 ] IMPALA-12782: Show info of the event processing in /events webUI The /events page of catalogd shows the metrics and status of the event-processor. This patch adds more info in this page, including - lag info - current event batch that's being processed See the screenshot attached in the JIRA for how it looks. Also moves the error message to the top to highlight the error status. Fixes the issue of not updating latest event id when event processor is stopped. Also fixes the issue of error message not cleared after global INVALIDATE METADATA. Adds a debug action, catalogd_event_processing_delay, to inject a sleep while processing an event. So the web page can be captured more easily. Also adds a missing test for showing the error message of event-processing in the /events page. Tests: - Add e2e test to verify the content of the page. Change-Id: I2e7d4952c7fd04ae89b6751204499bf9dd99f57c Reviewed-on: http://gerrit.cloudera.org:8080/20986 Reviewed-by: Impala Public Jenkins Tested-by: Impala Public Jenkins > Show info of the event processing in /events webUI > -- > > Key: IMPALA-12782 > URL: https://issues.apache.org/jira/browse/IMPALA-12782 > Project: IMPALA > Issue Type: Improvement > Components: Catalog >Reporter: Quanlong Huang >Assignee: Quanlong Huang >Priority: Major > Labels: observability > Attachments: event-processor-info.png > > > Currently, we just show some metrics of the event-processor in the /events > page. It'd be helpful to show more info, e.g. lag time, what is the current > event batch/event that's being processed, etc. 
[jira] [Commented] (IMPALA-12600) Support equality deletes when table has partition or schema evolution
[ https://issues.apache.org/jira/browse/IMPALA-12600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17833291#comment-17833291 ] ASF subversion and git services commented on IMPALA-12600: -- Commit 18b9c08c5203901da57e738afcc47f1f7a53b3bc in impala's branch refs/heads/master from Gabor Kaszab [ https://gitbox.apache.org/repos/asf?p=impala.git;h=18b9c08c5 ] IMPALA-12600: Schema evolution with equality delete files This patch adds test coverage for a table that has equality delete files and also schema evolution, where the schema changes didn't affect the primary key columns. Note, partition evolution on tables with equality deletes is still not supported. Testing: - Added a new test table for this use-case and some E2E tests on that table. Change-Id: I125f72bade5b79bad5aaa6b676d6afaf3ca98395 Reviewed-on: http://gerrit.cloudera.org:8080/21210 Reviewed-by: Gabor Kaszab Tested-by: Impala Public Jenkins > Support equality deletes when table has partition or schema evolution > - > > Key: IMPALA-12600 > URL: https://issues.apache.org/jira/browse/IMPALA-12600 > Project: IMPALA > Issue Type: Sub-task >Reporter: Gabor Kaszab >Assignee: Gabor Kaszab >Priority: Major > > With adding the basic equality delete read support, we reject queries for > Iceberg tables that have equality delete files and have partition or schema > evolution. This ticket is to enhance this functionality.
[jira] [Commented] (IMPALA-12611) Add support to MAP type Iceberg Metadata table columns
[ https://issues.apache.org/jira/browse/IMPALA-12611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17833292#comment-17833292 ] ASF subversion and git services commented on IMPALA-12611: -- Commit 63f52807f0641ced7560dac2f616f352e7b5a86c in impala's branch refs/heads/master from Daniel Becker [ https://gitbox.apache.org/repos/asf?p=impala.git;h=63f52807f ] IMPALA-12611: Add support to MAP type Iceberg Metadata table columns This change adds support for querying MAP types from Iceberg Metadata tables. The 'IcebergMetadataScanner.ArrayScanner' java class is renamed to 'CollectionScanner' and extended to be able to handle maps. For arrays the iteration returns the element as before, for maps it returns 'Map.Entry' objects. Note that collections in the FROM clause are still not supported. Testing: - Added E2E tests in iceberg-metadata-tables.test. Change-Id: I8a8b3a574ca45c893315c3b41b33ce4e0eff865a Reviewed-on: http://gerrit.cloudera.org:8080/21125 Reviewed-by: Impala Public Jenkins Tested-by: Impala Public Jenkins > Add support to MAP type Iceberg Metadata table columns > -- > > Key: IMPALA-12611 > URL: https://issues.apache.org/jira/browse/IMPALA-12611 > Project: IMPALA > Issue Type: Sub-task > Components: Backend, Frontend >Affects Versions: Impala 4.4.0 >Reporter: Tamas Mate >Assignee: Daniel Becker >Priority: Major > Labels: impala-iceberg > > MAP type columns are currently filled with NULLs, we should populate them.
[jira] [Updated] (IMPALA-12243) Add support for DROP PARTITION for Iceberg tables
[ https://issues.apache.org/jira/browse/IMPALA-12243?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zoltán Borók-Nagy updated IMPALA-12243: --- Fix Version/s: Impala 4.4.0 (was: Impala 4.3.0) > Add support for DROP PARTITION for Iceberg tables > - > > Key: IMPALA-12243 > URL: https://issues.apache.org/jira/browse/IMPALA-12243 > Project: IMPALA > Issue Type: New Feature > Components: Frontend >Reporter: Zoltán Borók-Nagy >Assignee: Peter Rozsa >Priority: Major > Labels: impala-iceberg > Fix For: Impala 4.4.0 > > > Add support for DROP PARTITION for Iceberg tables. > Users should be able to run statements like the following: > * alter table table_a drop partition (country = 'SG') > * alter table table_a drop partition (identity(country) = 'SG') > * alter table table_a drop partition (dt < '2023-01-01 00:00:00') > * alter table table_a drop partition (year(dt) < "2023") > * alter table table_a drop partition (year(dt) < "2023" and month(dt) < > "2023-06") > * alter table table_a drop partition (bucket(5, s) < 2) > * alter table table_a drop partition (truncate(5, s) = "trunc")
[jira] [Commented] (IMPALA-12709) Hierarchical metastore event processing
[ https://issues.apache.org/jira/browse/IMPALA-12709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17833178#comment-17833178 ] Maxwell Guo commented on IMPALA-12709: -- [~VenuReddy] Thanks very much , looking forward to your update. > Hierarchical metastore event processing > --- > > Key: IMPALA-12709 > URL: https://issues.apache.org/jira/browse/IMPALA-12709 > Project: IMPALA > Issue Type: Improvement > Components: Catalog >Reporter: Venugopal Reddy K >Assignee: Venugopal Reddy K >Priority: Major > Attachments: Hierarchical metastore event processing.docx > > > *Current Issue:* > At present, metastore event processor is single threaded. Notification events > are processed sequentially with a maximum limit of 1000 events fetched and > processed in a single batch. Multiple locks are used to address the > concurrency issues that may arise when catalog DDL operation processing and > metastore event processing tries to access/update the catalog objects > concurrently. Waiting for a lock or file metadata loading of a table can slow > the event processing and can affect the processing of other events following > it. Those events may not be dependent on the previous event. Altogether it > takes a very long time to synchronize all the HMS events. > *Proposal:* > Existing metastore event processing can be turned into multi-level event > processing. Idea is to segregate the events based on their dependency, > maintain the order of events as they occur within the dependency and process > them independently as much as possible: > # All the events of a table are processed in the same order they have > actually occurred. > # Events of different tables are processed in parallel. > # When a database is altered, all the events relating to the database(i.e., > for all its tables) occurring after the alter db event are processed only > after the alter database event is processed ensuring the order. 
> Have attached an initial proposal design document > https://docs.google.com/document/d/1KZ-ANko-qn5CYmY13m4OVJXAYjLaS1VP-c64Pumipq8/edit?pli=1#heading=h.qyk8qz8ez37b
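The three ordering rules in the IMPALA-12709 proposal can be sketched in a few lines. This is only an illustration under stated assumptions, not the design from the attached document: `Event`, `plan_db` and `process` are hypothetical names, a database-level event is modelled as an event whose `table` is `None`, and thread pools stand in for whatever executors the real implementation uses.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List, NamedTuple, Optional


class Event(NamedTuple):
    event_id: int
    db: str
    table: Optional[str]  # None marks a database-level event (e.g. ALTER DATABASE)


def plan_db(events: List[Event]) -> List[Dict[Optional[str], List[Event]]]:
    """Split one database's events (in arrival order) into ordered stages.

    A stage maps table name -> that table's events, kept in arrival order.
    A database-level event forms a stage of its own, so it acts as a barrier.
    """
    stages, current = [], {}
    for ev in events:
        if ev.table is None:
            if current:
                stages.append(current)
            stages.append({None: [ev]})
            current = {}
        else:
            current.setdefault(ev.table, []).append(ev)
    if current:
        stages.append(current)
    return stages


def process(events: List[Event], handle: Callable[[Event], None], workers: int = 4) -> None:
    by_db: Dict[str, List[Event]] = {}
    for ev in sorted(events, key=lambda e: e.event_id):
        by_db.setdefault(ev.db, []).append(ev)

    def run_queue(queue: List[Event]) -> None:
        for ev in queue:          # rule 1: per-table order is preserved
            handle(ev)

    def run_db(db_events: List[Event], table_pool: ThreadPoolExecutor) -> None:
        for stage in plan_db(db_events):
            # rule 2: tables within a stage run in parallel;
            # rule 3: the next stage starts only after this one has finished.
            list(table_pool.map(run_queue, stage.values()))

    # Separate pools: database tasks only wait on table tasks, which never block.
    with ThreadPoolExecutor(workers) as db_pool, ThreadPoolExecutor(workers) as table_pool:
        list(db_pool.map(lambda evs: run_db(evs, table_pool), by_db.values()))
```

The sketch makes the dependency structure explicit: an event waits only for earlier events of the same table and for earlier database-level events of the same database.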
[jira] [Comment Edited] (IMPALA-12709) Hierarchical metastore event processing
[ https://issues.apache.org/jira/browse/IMPALA-12709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17833161#comment-17833161 ] Venugopal Reddy K edited comment on IMPALA-12709 at 4/2/24 12:18 PM: - [~maxwellguo] Currently measuring and comparing the time taken with base and modified versions. Also tuning the configuration parameters added with the gerrit to see the change in the event processing time. Since there are no existing tests to measure the event processing time, I am adding some tests. [https://gerrit.cloudera.org/#/c/21031/8/fe/src/test/java/org/apache/impala/catalog/events/EventsProcessorPerfTest.java] has a test that creates 10 databases with 10 (non-transactional) tables in each db, inserts data into all these 100 tables, and then drops the tables and databases. All of these operations are run from Hive; Impala only does the event processing. Results showed that with Hierarchical Processing enabled, processing of insert into table (which generates ALTER and INSERT events) looks to be much faster (nearly 5 times). But event processing for create databases, create tables, drop tables and drop databases is not. I am checking it.
Also planning to add more perf tests with partitioned tables, transactional tables, different possible sequences of events, generating events from the Impala side, etc. Test output for 2 runs:
{noformat}
I0402 17:04:40.532240 712808 EventsProcessorPerfTest.java:131] [Performance] With Hierarchical Processing: false
I0402 17:04:40.705119 712808 EventsProcessorPerfTest.java:140] [Performance] Time taken to process create database events: 75.11 ms
I0402 17:04:43.643066 712808 EventsProcessorPerfTest.java:153] [Performance] Time taken to process create table events: 136.4 ms
I0402 17:04:44.130368 712808 EventsProcessorPerfTest.java:181] [Performance] Time taken to load table: 486.9 ms
I0402 17:05:15.474153 712808 EventsProcessorPerfTest.java:194] [Performance] Time taken to process insert events : 1.955 s
I0402 17:05:24.419824 712808 EventsProcessorPerfTest.java:206] [Performance] Time taken to process drop table events : 97.01 ms
I0402 17:05:24.684505 712808 EventsProcessorPerfTest.java:216] [Performance] Time taken to process database events : 26.55 ms
I0402 17:05:25.107113 712808 EventsProcessorPerfTest.java:131] [Performance] With Hierarchical Processing: true
I0402 17:05:25.196743 712808 EventsProcessorPerfTest.java:140] [Performance] Time taken to process create database events: 15.21 ms
I0402 17:05:28.118330 712808 EventsProcessorPerfTest.java:153] [Performance] Time taken to process create table events: 50.12 ms
I0402 17:05:28.473388 712808 EventsProcessorPerfTest.java:181] [Performance] Time taken to load table: 354.8 ms
I0402 17:05:52.529421 712808 EventsProcessorPerfTest.java:194] [Performance] Time taken to process insert events : 402.1 ms
I0402 17:06:01.460664 712808 EventsProcessorPerfTest.java:206] [Performance] Time taken to process drop table events : 132.2 ms
I0402 17:06:01.848369 712808 EventsProcessorPerfTest.java:216] [Performance] Time taken to process database events : 27.53 ms
I0402 17:06:02.227852 712808 EventsProcessorPerfTest.java:131] [Performance] With Hierarchical Processing: false
I0402 17:06:02.435050 712808 EventsProcessorPerfTest.java:140] [Performance] Time taken to process create database events: 18.10 ms
I0402 17:06:05.132701 712808 EventsProcessorPerfTest.java:153] [Performance] Time taken to process create table events: 110.8 ms
I0402 17:06:05.726616 712808 EventsProcessorPerfTest.java:181] [Performance] Time taken to load table: 593.7 ms
I0402 17:06:30.767912 712808 EventsProcessorPerfTest.java:194] [Performance] Time taken to process insert events : 2.246 s
I0402 17:06:40.019438 712808 EventsProcessorPerfTest.java:206] [Performance] Time taken to process drop table events : 122.7 ms
I0402 17:06:40.383190 712808 EventsProcessorPerfTest.java:216] [Performance] Time taken to process database events : 22.18 ms
I0402 17:06:40.801436 712808 EventsProcessorPerfTest.java:131] [Performance] With Hierarchical Processing: true
I0402 17:06:41.036427 712808 EventsProcessorPerfTest.java:140] [Performance] Time taken to process create database events: 21.29 ms
I0402 17:06:43.558152 712808 EventsProcessorPerfTest.java:153] [Performance] Time taken to process create table events: 101.3 ms
I0402 17:06:43.942732 712808 EventsProcessorPerfTest.java:181] [Performance] Time taken to load table: 384.1 ms
I0402 17:07:08.202667 712808 EventsProcessorPerfTest.java:194] [Performance] Time taken to process insert events : 465.2 ms
I0402 17:07:17.037060 712808 EventsProcessorPerfTest.java:206] [Performance] Time taken to process drop table events : 137.3 ms
I0402 17:07:17.377442 712808 EventsProcessorPerfTest.java:216] [Performance] Time taken to process database events : 20.56 ms
{noformat}
was (Author: venureddy): [~maxwellguo] Currently measuring and comparing the time taken
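The "nearly 5 times" figure for insert events can be checked directly against the test output above. The snippet below is just that arithmetic; the helper `to_ms` and the variable names are made up for illustration, and the four durations are copied from the "Time taken to process insert events" log lines of the two runs.

```python
import re

_UNIT_TO_MS = {"ms": 1.0, "s": 1000.0}


def to_ms(text: str) -> float:
    """Convert a duration such as '1.955 s' or '402.1 ms' to milliseconds."""
    match = re.fullmatch(r"\s*([0-9.]+)\s*(ms|s)\s*", text)
    if match is None:
        raise ValueError(f"unrecognised duration: {text!r}")
    return float(match.group(1)) * _UNIT_TO_MS[match.group(2)]


# (hierarchical processing disabled, enabled) for run 1 and run 2
insert_event_times = [("1.955 s", "402.1 ms"), ("2.246 s", "465.2 ms")]

speedups = [to_ms(base) / to_ms(hier) for base, hier in insert_event_times]
for run, speedup in enumerate(speedups, start=1):
    print(f"run {run}: insert events processed {speedup:.2f}x faster")
```

Both runs come out between 4.8x and 4.9x, which matches the "nearly 5 times" statement in the comment.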
[jira] [Comment Edited] (IMPALA-12709) Hierarchical metastore event processing
[ https://issues.apache.org/jira/browse/IMPALA-12709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17833161#comment-17833161 ] Venugopal Reddy K edited comment on IMPALA-12709 at 4/2/24 12:09 PM: - [~maxwellguo] Currently measuring and comparing the time taken with base and modified versions. Also tuning the configuration parameters added with the gerrit to see the change in the event processing time. Since there are no existing tests to measure the event processing time, I am adding some tests. [https://gerrit.cloudera.org/#/c/21031/8/fe/src/test/java/org/apache/impala/catalog/events/EventsProcessorPerfTest.java] has a test that creates 10 databases with 10 (non-transactional) tables in each db, inserts data into all these 100 tables, and then drops the tables and databases. Results showed that with Hierarchical Processing enabled, insert into table (which generates ALTER and INSERT events) looks to be much faster (nearly 5 times). But for create databases, create tables, drop tables and drop databases it is not. I am checking it. 
Also planning to add more perf tests with partitioned tables, transactional tables, different possible sequence of events etc Test output for 2 runs: {noformat} I0402 17:04:40.532240 712808 EventsProcessorPerfTest.java:131] [Performance] With Hierarchical Processing: false I0402 17:04:40.705119 712808 EventsProcessorPerfTest.java:140] [Performance] Time taken to process create database events: 75.11 ms I0402 17:04:43.643066 712808 EventsProcessorPerfTest.java:153] [Performance] Time taken to process create table events: 136.4 ms I0402 17:04:44.130368 712808 EventsProcessorPerfTest.java:181] [Performance] Time taken to load table: 486.9 ms I0402 17:05:15.474153 712808 EventsProcessorPerfTest.java:194] [Performance] Time taken to process insert events : 1.955 s I0402 17:05:24.419824 712808 EventsProcessorPerfTest.java:206] [Performance] Time taken to process drop table events : 97.01 ms I0402 17:05:24.684505 712808 EventsProcessorPerfTest.java:216] [Performance] Time taken to process database events : 26.55 ms I0402 17:05:25.107113 712808 EventsProcessorPerfTest.java:131] [Performance] With Hierarchical Processing: true I0402 17:05:25.196743 712808 EventsProcessorPerfTest.java:140] [Performance] Time taken to process create database events: 15.21 ms I0402 17:05:28.118330 712808 EventsProcessorPerfTest.java:153] [Performance] Time taken to process create table events: 50.12 ms I0402 17:05:28.473388 712808 EventsProcessorPerfTest.java:181] [Performance] Time taken to load table: 354.8 ms I0402 17:05:52.529421 712808 EventsProcessorPerfTest.java:194] [Performance] Time taken to process insert events : 402.1 ms I0402 17:06:01.460664 712808 EventsProcessorPerfTest.java:206] [Performance] Time taken to process drop table events : 132.2 ms I0402 17:06:01.848369 712808 EventsProcessorPerfTest.java:216] [Performance] Time taken to process database events : 27.53 ms I0402 17:06:02.227852 712808 EventsProcessorPerfTest.java:131] [Performance] With Hierarchical Processing: false 
I0402 17:06:02.435050 712808 EventsProcessorPerfTest.java:140] [Performance] Time taken to process create database events: 18.10 ms I0402 17:06:05.132701 712808 EventsProcessorPerfTest.java:153] [Performance] Time taken to process create table events: 110.8 ms I0402 17:06:05.726616 712808 EventsProcessorPerfTest.java:181] [Performance] Time taken to load table: 593.7 ms I0402 17:06:30.767912 712808 EventsProcessorPerfTest.java:194] [Performance] Time taken to process insert events : 2.246 s I0402 17:06:40.019438 712808 EventsProcessorPerfTest.java:206] [Performance] Time taken to process drop table events : 122.7 ms I0402 17:06:40.383190 712808 EventsProcessorPerfTest.java:216] [Performance] Time taken to process database events : 22.18 ms I0402 17:06:40.801436 712808 EventsProcessorPerfTest.java:131] [Performance] With Hierarchical Processing: true I0402 17:06:41.036427 712808 EventsProcessorPerfTest.java:140] [Performance] Time taken to process create database events: 21.29 ms I0402 17:06:43.558152 712808 EventsProcessorPerfTest.java:153] [Performance] Time taken to process create table events: 101.3 ms I0402 17:06:43.942732 712808 EventsProcessorPerfTest.java:181] [Performance] Time taken to load table: 384.1 ms I0402 17:07:08.202667 712808 EventsProcessorPerfTest.java:194] [Performance] Time taken to process insert events : 465.2 ms I0402 17:07:17.037060 712808 EventsProcessorPerfTest.java:206] [Performance] Time taken to process drop table events : 137.3 ms I0402 17:07:17.377442 712808 EventsProcessorPerfTest.java:216] [Performance] Time taken to process database events : 20.56 ms {noformat} was (Author: venureddy): [~maxwellguo] Currently measuring and comparing the time taken with base and modified versions. Also tuning the configuration paramters added with the gerrit to see the
[jira] [Commented] (IMPALA-12709) Hierarchical metastore event processing
[ https://issues.apache.org/jira/browse/IMPALA-12709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17833161#comment-17833161 ] Venugopal Reddy K commented on IMPALA-12709: [~maxwellguo] I am currently measuring and comparing the time taken with the base and modified versions, and tuning the configuration parameters added with the gerrit change to see how they affect the event processing time. Since there are no existing tests that measure event processing time, I am adding some. [https://gerrit.cloudera.org/#/c/21031/8/fe/src/test/java/org/apache/impala/catalog/events/EventsProcessorPerfTest.java] has a test that creates 10 databases with 10 non-transactional tables in each, inserts data into all 100 tables, and then drops the tables and databases. The results show that with Hierarchical Processing enabled, processing the inserts (each insert generates ALTER and INSERT events) is much faster (nearly 5 times). Creating databases and tables and dropping tables and databases does not show a similar speedup. I am checking this.
Also planning to add more perf tests with partitioned tables, transactional tables, different possible sequences of events, etc. Test output for 2 runs:
{noformat}
I0402 17:04:40.532240 712808 EventsProcessorPerfTest.java:131] [Performance] With Hierarchical Processing: false
I0402 17:04:40.705119 712808 EventsProcessorPerfTest.java:140] [Performance] Time taken to process create database events: 75.11 ms
I0402 17:04:43.643066 712808 EventsProcessorPerfTest.java:153] [Performance] Time taken to process create table events: 136.4 ms
I0402 17:04:44.130368 712808 EventsProcessorPerfTest.java:181] [Performance] Time taken to load table: 486.9 ms
I0402 17:05:15.474153 712808 EventsProcessorPerfTest.java:194] [Performance] Time taken to process insert events : 1.955 s
I0402 17:05:24.419824 712808 EventsProcessorPerfTest.java:206] [Performance] Time taken to process drop table events : 97.01 ms
I0402 17:05:24.684505 712808 EventsProcessorPerfTest.java:216] [Performance] Time taken to process database events : 26.55 ms
I0402 17:05:25.107113 712808 EventsProcessorPerfTest.java:131] [Performance] With Hierarchical Processing: true
I0402 17:05:25.196743 712808 EventsProcessorPerfTest.java:140] [Performance] Time taken to process create database events: 15.21 ms
I0402 17:05:28.118330 712808 EventsProcessorPerfTest.java:153] [Performance] Time taken to process create table events: 50.12 ms
I0402 17:05:28.473388 712808 EventsProcessorPerfTest.java:181] [Performance] Time taken to load table: 354.8 ms
I0402 17:05:52.529421 712808 EventsProcessorPerfTest.java:194] [Performance] Time taken to process insert events : 402.1 ms
I0402 17:06:01.460664 712808 EventsProcessorPerfTest.java:206] [Performance] Time taken to process drop table events : 132.2 ms
I0402 17:06:01.848369 712808 EventsProcessorPerfTest.java:216] [Performance] Time taken to process database events : 27.53 ms
I0402 17:06:02.227852 712808 EventsProcessorPerfTest.java:131] [Performance] With Hierarchical Processing: false
I0402 17:06:02.435050 712808 EventsProcessorPerfTest.java:140] [Performance] Time taken to process create database events: 18.10 ms
I0402 17:06:05.132701 712808 EventsProcessorPerfTest.java:153] [Performance] Time taken to process create table events: 110.8 ms
I0402 17:06:05.726616 712808 EventsProcessorPerfTest.java:181] [Performance] Time taken to load table: 593.7 ms
I0402 17:06:30.767912 712808 EventsProcessorPerfTest.java:194] [Performance] Time taken to process insert events : 2.246 s
I0402 17:06:40.019438 712808 EventsProcessorPerfTest.java:206] [Performance] Time taken to process drop table events : 122.7 ms
I0402 17:06:40.383190 712808 EventsProcessorPerfTest.java:216] [Performance] Time taken to process database events : 22.18 ms
I0402 17:06:40.801436 712808 EventsProcessorPerfTest.java:131] [Performance] With Hierarchical Processing: true
I0402 17:06:41.036427 712808 EventsProcessorPerfTest.java:140] [Performance] Time taken to process create database events: 21.29 ms
I0402 17:06:43.558152 712808 EventsProcessorPerfTest.java:153] [Performance] Time taken to process create table events: 101.3 ms
I0402 17:06:43.942732 712808 EventsProcessorPerfTest.java:181] [Performance] Time taken to load table: 384.1 ms
I0402 17:07:08.202667 712808 EventsProcessorPerfTest.java:194] [Performance] Time taken to process insert events : 465.2 ms
I0402 17:07:17.037060 712808 EventsProcessorPerfTest.java:206] [Performance] Time taken to process drop table events : 137.3 ms
I0402 17:07:17.377442 712808 EventsProcessorPerfTest.java:216] [Performance] Time taken to process database events : 20.56 ms
{noformat}
> Hierarchical metastore event processing > --- > > Key: IMPALA-12709 > URL: https://issues.apache.org/jira/browse/IMPALA-12709 > Project: IMPALA > Issue
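The "nearly 5 times" figure for insert events can be checked directly against the log output in the comment above. The following is a minimal sketch, not part of the gerrit change: the helper names `parse_runs` and `insert_speedups` are hypothetical, and the sample text copies only the mode and insert-event lines from the two runs.

```python
import re

# Matches e.g. "Time taken to process insert events : 1.955 s"
TIMING_RE = re.compile(
    r"Time taken to (?P<step>.+?)\s*:\s*(?P<value>[\d.]+) (?P<unit>ms|s)\b")
MODE_RE = re.compile(r"With Hierarchical Processing: (true|false)")


def parse_runs(log_text):
    """Return one (hierarchical, {step: milliseconds}) tuple per test pass."""
    runs = []
    for line in log_text.splitlines():
        mode = MODE_RE.search(line)
        if mode:
            # A mode line starts a new pass.
            runs.append((mode.group(1) == "true", {}))
            continue
        timing = TIMING_RE.search(line)
        if timing and runs:
            scale = 1000.0 if timing.group("unit") == "s" else 1.0
            runs[-1][1][timing.group("step")] = float(timing.group("value")) * scale
    return runs


def insert_speedups(runs, step="process insert events"):
    """Pair each non-hierarchical pass with the hierarchical pass that follows it."""
    speedups = []
    for (hier_a, base), (hier_b, modified) in zip(runs[0::2], runs[1::2]):
        if not hier_a and hier_b:
            speedups.append(base[step] / modified[step])
    return speedups


# Mode and insert-event lines copied from the test output above.
SAMPLE = """\
I0402 17:04:40.532240 712808 EventsProcessorPerfTest.java:131] [Performance] With Hierarchical Processing: false
I0402 17:05:15.474153 712808 EventsProcessorPerfTest.java:194] [Performance] Time taken to process insert events : 1.955 s
I0402 17:05:25.107113 712808 EventsProcessorPerfTest.java:131] [Performance] With Hierarchical Processing: true
I0402 17:05:52.529421 712808 EventsProcessorPerfTest.java:194] [Performance] Time taken to process insert events : 402.1 ms
I0402 17:06:02.227852 712808 EventsProcessorPerfTest.java:131] [Performance] With Hierarchical Processing: false
I0402 17:06:30.767912 712808 EventsProcessorPerfTest.java:194] [Performance] Time taken to process insert events : 2.246 s
I0402 17:06:40.801436 712808 EventsProcessorPerfTest.java:131] [Performance] With Hierarchical Processing: true
I0402 17:07:08.202667 712808 EventsProcessorPerfTest.java:194] [Performance] Time taken to process insert events : 465.2 ms
"""

speedups = insert_speedups(parse_runs(SAMPLE))
print([round(s, 2) for s in speedups])
```

Both runs give a ratio of about 4.8 to 4.9 for insert events, which matches the "nearly 5 times" estimate. Passing a different `step` (for example `"process create table events"`) to the same helper on the full log shows how much the other steps vary between runs.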
[jira] [Updated] (IMPALA-12966) Investigate expected error message in AuthorizationStmtTest.testShow after IMPALA-12609
[ https://issues.apache.org/jira/browse/IMPALA-12966?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Becker updated IMPALA-12966: --- Labels: impala-iceberg (was: ) > Investigate expected error message in AuthorizationStmtTest.testShow after > IMPALA-12609 > --- > > Key: IMPALA-12966 > URL: https://issues.apache.org/jira/browse/IMPALA-12966 > Project: IMPALA > Issue Type: Improvement >Reporter: Daniel Becker >Assignee: Daniel Becker >Priority: Major > Labels: impala-iceberg > > In the commit belonging to IMPALA-12609, in one of the test calls in > AuthorizationStmtTest.testShow() the error message kept varying between > "functional_parquet.iceberg_query_metadata" and "functional_parquet.*.*". In > the end the expected error message was limited to "functional_parquet" > because it fit both, but we should find out why it kept changing. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Created] (IMPALA-12966) Investigate expected error message in AuthorizationStmtTest.testShow after IMPALA-12609
Daniel Becker created IMPALA-12966: -- Summary: Investigate expected error message in AuthorizationStmtTest.testShow after IMPALA-12609 Key: IMPALA-12966 URL: https://issues.apache.org/jira/browse/IMPALA-12966 Project: IMPALA Issue Type: Improvement Reporter: Daniel Becker Assignee: Daniel Becker In the commit belonging to IMPALA-12609, in one of the test calls in AuthorizationStmtTest.testShow() the error message kept varying between "functional_parquet.iceberg_query_metadata" and "functional_parquet.*.*". In the end the expected error message was limited to "functional_parquet" because it fit both, but we should find out why it kept changing.
[jira] [Commented] (IMPALA-12609) Implement SHOW TABLES IN statement to list Iceberg Metadata tables
[ https://issues.apache.org/jira/browse/IMPALA-12609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17833136#comment-17833136 ] ASF subversion and git services commented on IMPALA-12609: -- Commit 72732da9d8d9211b9dc374af42340c56968fb86a in impala's branch refs/heads/master from Daniel Becker [ https://gitbox.apache.org/repos/asf?p=impala.git;h=72732da9d ] IMPALA-12609: Implement SHOW METADATA TABLES IN statement to list Iceberg Metadata tables After this change, the new SHOW METADATA TABLES IN statement can be used to list all the available metadata tables of an Iceberg table. Note that in contrast to querying the contents of Iceberg metadata tables, this does not require fully qualified paths, e.g. both SHOW METADATA TABLES IN functional_parquet.iceberg_query_metadata; and USE functional_parquet; SHOW METADATA TABLES IN iceberg_query_metadata; work. The available metadata tables for all Iceberg tables are the same, corresponding to the values of the enum "org.apache.iceberg.MetadataTableType", so there is actually no need to pass the name of the regular table for which the metadata table list is requested through Thrift. 
This change, however, does send the table name because this way
- if we add support for metadata tables for other table formats, the table name/path will be necessary to determine the correct list of metadata tables
- we could later add support for different authorisation policies for individual tables
- we can check also at the point of generating the list of metadata tables that the table is an Iceberg table

Testing:
- added and updated tests in ParserTest, AnalyzeDDLTest, ToSqlTest and AuthorizationStmtTest
- added a custom cluster test in test_authorization.py
- added functional tests in iceberg-metadata-tables.test

Change-Id: Ide10ccf10fc0abf5c270119ba7092c67e712ec49
Reviewed-on: http://gerrit.cloudera.org:8080/21026
Tested-by: Impala Public Jenkins
Reviewed-by: Zoltan Borok-Nagy

> Implement SHOW TABLES IN statement to list Iceberg Metadata tables > -- > > Key: IMPALA-12609 > URL: https://issues.apache.org/jira/browse/IMPALA-12609 > Project: IMPALA > Issue Type: Sub-task > Components: Frontend >Affects Versions: Impala 4.4.0 >Reporter: Tamas Mate >Assignee: Daniel Becker >Priority: Minor > Labels: impala-iceberg > > {{SHOW TABLES IN}} statement could be used to list all the available metadata > tables of an Iceberg table.
[jira] [Commented] (IMPALA-12852) Separate start_mini_dfs.sh and start_kudu.sh
[ https://issues.apache.org/jira/browse/IMPALA-12852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17833135#comment-17833135 ] ASF subversion and git services commented on IMPALA-12852: -- Commit 23a14a249c783a9dfb38586052513c6455ce3dad in impala's branch refs/heads/master from zhangyifan27 [ https://gitbox.apache.org/repos/asf?p=impala.git;h=23a14a249 ] IMPALA-12852: Make Kudu service start and stop independent This patch decouples run-kudu.sh and kill-kudu.sh from run-mini-dfs.sh and kill-mini-dfs.sh. These scripts can be useful for setting up test environments that require no or only Kudu service. Testing: - Ran the modified and new scripts and checked they worked as expected. Change-Id: I9624aaa61353bb4520e879570e5688d5e3493201 Reviewed-on: http://gerrit.cloudera.org:8080/21090 Reviewed-by: Impala Public Jenkins Tested-by: Impala Public Jenkins > Separate start_mini_dfs.sh and start_kudu.sh > > > Key: IMPALA-12852 > URL: https://issues.apache.org/jira/browse/IMPALA-12852 > Project: IMPALA > Issue Type: Improvement >Reporter: YifanZhang >Priority: Minor > > Currently, we run 'start_mini_dfs.sh' to start several services, including > hdfs, kms, yarn, and kudu. Considering that Kudu can start and stop > independently, we should have start_kudu.sh and stop_kudu.sh for convenient > start and stop for Kudu service.
[jira] [Resolved] (IMPALA-12609) Implement SHOW TABLES IN statement to list Iceberg Metadata tables
[ https://issues.apache.org/jira/browse/IMPALA-12609?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Becker resolved IMPALA-12609. Resolution: Fixed > Implement SHOW TABLES IN statement to list Iceberg Metadata tables > -- > > Key: IMPALA-12609 > URL: https://issues.apache.org/jira/browse/IMPALA-12609 > Project: IMPALA > Issue Type: Sub-task > Components: Frontend >Affects Versions: Impala 4.4.0 >Reporter: Tamas Mate >Assignee: Daniel Becker >Priority: Minor > Labels: impala-iceberg > > {{SHOW TABLES IN}} statement could be used to list all the available metadata > tables of an Iceberg table.
[jira] [Closed] (IMPALA-12760) Investigate missing insert table event for an insert operation on Iceberg table
[ https://issues.apache.org/jira/browse/IMPALA-12760?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manish Maheshwari closed IMPALA-12760. -- Resolution: Duplicate > Investigate missing insert table event for an insert operation on Iceberg > table > --- > > Key: IMPALA-12760 > URL: https://issues.apache.org/jira/browse/IMPALA-12760 > Project: IMPALA > Issue Type: Wish > Components: Catalog >Reporter: Sai Hemanth Gantasala >Assignee: Sai Hemanth Gantasala >Priority: Major > > Currently, we are not firing an insert table event to metastore when there is > an insert operation on an Iceberg table. As a result, other impala clusters > consuming the events from metastore are unaware of that insert operation > resulting in (meta)data sync issues. > Need to investigate why the insert table event is not fired to metastore for > an insert table operation on Iceberg table. > h4.