[ 
https://issues.apache.org/jira/browse/IMPALA-12426?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Fehr updated IMPALA-12426:
--------------------------------
    Description: 
Implement a way of querying (via SQL) information about completed
queries/DDLs/DMLs. Add coordinator startup flags that let users specify that
Impala will track completed queries in an internal table.

Impala will create and maintain an internal Iceberg table named
"impala_query_log" in the "system database" that contains all completed
queries. Each coordinator creates this table automatically at startup if it
does not exist. Each completed query is then queued in memory, and the queue
is flushed to the table when either a set number of minutes has elapsed or a
set number of records has been queued.
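
The queue-and-flush behavior described above can be sketched as follows. This
is only an illustration under stated assumptions, not the Impala
implementation: the class name CompletedQueryBuffer, its parameters, and the
write_batch callback are all hypothetical, and the real coordinator code is
not shown in this issue.

```python
import time
from typing import Callable, List


class CompletedQueryBuffer:
    """Hypothetical in-memory queue of completed-query records.

    Records are flushed either when max_records have been queued or when
    interval_s seconds have passed since the last flush.
    """

    def __init__(
        self,
        write_batch: Callable[[List[str]], None],
        max_records: int,
        interval_s: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._write_batch = write_batch  # stands in for the table insert
        self._max_records = max_records
        self._interval_s = interval_s
        self._clock = clock
        self._pending: List[str] = []
        self._last_flush = clock()

    def add(self, record: str) -> None:
        """Queue one completed query; flush early if the queue is full."""
        self._pending.append(record)
        if len(self._pending) >= self._max_records:
            self.flush()

    def tick(self) -> None:
        """Call periodically; flushes once the interval has elapsed."""
        elapsed = self._clock() - self._last_flush
        if self._pending and elapsed >= self._interval_s:
            self.flush()

    def flush(self) -> None:
        """Hand all queued records to the writer as a single batch."""
        batch, self._pending = self._pending, []
        self._last_flush = self._clock()
        if batch:
            self._write_batch(batch)
```

Writing one batch per flush rather than one insert per query keeps the number
of small files in the table down, which is the usual motivation for batching
inserts into an Iceberg table.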

Data in this table must match the corresponding data in the query profile.
Develop automated tests that assert this requirement holds.

Add the following metrics to the "impala-server" metrics group:
* Number of completed queries queued in memory waiting to be written to the 
table.
* Number of completed queries successfully written to the table.
* Number of attempts that failed to write completed queries to the table.
* Number of times completed queries were written at the regularly scheduled 
time.
* Number of times completed queries were written before the scheduled time 
because the max number of queued records was reached.
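
As a rough sketch of how the five metrics listed above relate to one another,
the following models them as plain counters. The field and function names are
hypothetical (the issue does not give the actual metric keys), and two
behaviors are assumptions made for illustration: the flush counters count
attempts regardless of outcome, and records from a failed write stay queued.

```python
from dataclasses import dataclass


@dataclass
class QueryLogMetrics:
    """Hypothetical counters mirroring the five metrics in the list above."""

    queued: int = 0               # completed queries waiting in memory
    written: int = 0              # completed queries written to the table
    write_failures: int = 0       # failed attempts to write to the table
    scheduled_flushes: int = 0    # writes triggered at the scheduled time
    max_records_flushes: int = 0  # writes triggered by a full queue


def record_enqueue(metrics: QueryLogMetrics) -> None:
    """Account for one completed query entering the in-memory queue."""
    metrics.queued += 1


def record_flush(
    metrics: QueryLogMetrics, batch_size: int, succeeded: bool, trigger: str
) -> None:
    """Account for one write attempt of batch_size queued records."""
    if trigger == "scheduled":
        metrics.scheduled_flushes += 1
    elif trigger == "max_records":
        metrics.max_records_flushes += 1
    else:
        raise ValueError(f"unknown trigger: {trigger}")

    if succeeded:
        metrics.written += batch_size
        metrics.queued -= batch_size
    else:
        # Assumption: a failed batch remains queued for a later attempt.
        metrics.write_failures += 1
```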

  was:
Implement a way of querying (via SQL) information about completed 
queries/ddls/dmls.

Design details:
# New Coordinator Startup Flags:
## store_query_history – string; a value of "impala" stores the query history
table as an Iceberg table. Using a string allows for future expansion to Kudu
as the table storage engine.
## query_history_table_name – string; name of the table where query history
will be stored. It can be fully qualified with a database name (e.g.
"mydb.my_table") or be just the table name (e.g. "my_table"), in which case
the table will be stored in the "information" database. Defaults to
"information.query_history".
## query_history_write_duration – number; seconds to wait before inserting
completed queries into the query history table. This allows for batching
inserts to avoid lots of small files. A value of 0 indicates immediate insert
with no batching. Default: 300.
# Not all queries will be recorded in this table. Use/set/get queries, as well
as insert DMLs into this table, will not be recorded.
# The query history table will include data from the query profile that
describes the overall query (e.g. query_id, session_id, user, sql, coordinator
name, start time, end time, etc.).


> SQL Interface to Completed Queries/DDLs/DMLs
> --------------------------------------------
>
>                 Key: IMPALA-12426
>                 URL: https://issues.apache.org/jira/browse/IMPALA-12426
>             Project: IMPALA
>          Issue Type: Improvement
>          Components: Backend, be
>            Reporter: Jason Fehr
>            Assignee: Jason Fehr
>            Priority: Major
>              Labels: impala, workload-management


