[jira] [Created] (IMPALA-12833) Enabled starting flag 'catalogd_ha_reset_metadata_on_failover' by default

2024-02-21 Thread Wenzhe Zhou (Jira)
Wenzhe Zhou created IMPALA-12833:


 Summary: Enabled starting flag 
'catalogd_ha_reset_metadata_on_failover' by default
 Key: IMPALA-12833
 URL: https://issues.apache.org/jira/browse/IMPALA-12833
 Project: IMPALA
  Issue Type: Bug
  Components: Backend
Reporter: Wenzhe Zhou
Assignee: Wenzhe Zhou


In the abnormal state, a table is loaded in the coordinator but unloaded in the new 
active catalogd after catalogd failover. This leads the coordinator to use stale 
metadata and produce wrong results or encounter query failures. The new active 
catalogd should reset metadata when it becomes active.
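
The mismatch described above can be pictured with a toy model. This is only a sketch under stated assumptions: the classes ToyCoordinator and ToyCatalogd, the method names, and the stale_tables helper are all hypothetical and do not reflect Impala's actual classes or RPCs. Only the idea of a reset-on-failover switch comes from the issue.

```python
# Toy model only: every class, method, and helper name here is hypothetical.
class ToyCoordinator:
    def __init__(self):
        self.cached_tables = {}  # table name -> cached metadata

    def on_catalog_reset(self):
        # Drop everything so metadata is reloaded from the active catalogd.
        self.cached_tables.clear()


class ToyCatalogd:
    def __init__(self, reset_metadata_on_failover):
        self.reset_metadata_on_failover = reset_metadata_on_failover
        self.loaded_tables = {}  # a freshly promoted standby has nothing loaded

    def become_active(self, coordinators):
        if self.reset_metadata_on_failover:
            for coordinator in coordinators:
                coordinator.on_catalog_reset()


def stale_tables(coordinator, catalogd):
    """Tables the coordinator still caches but the active catalogd has not loaded."""
    return sorted(t for t in coordinator.cached_tables
                  if t not in catalogd.loaded_tables)


def run_failover(reset_metadata_on_failover):
    coordinator = ToyCoordinator()
    coordinator.cached_tables["db.t"] = {"version": 1}
    new_active = ToyCatalogd(reset_metadata_on_failover)
    new_active.become_active([coordinator])
    return stale_tables(coordinator, new_active)
```

In this model, run_failover(False) leaves the coordinator holding a table the new active catalogd knows nothing about, while run_failover(True) leaves no such table.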



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Created] (IMPALA-12832) EventProcessor shouldn't stop for failures in single-table event

2024-02-21 Thread Quanlong Huang (Jira)
Quanlong Huang created IMPALA-12832:
---

 Summary: EventProcessor shouldn't stop for failures in 
single-table event
 Key: IMPALA-12832
 URL: https://issues.apache.org/jira/browse/IMPALA-12832
 Project: IMPALA
  Issue Type: Bug
  Components: Catalog
Reporter: Quanlong Huang


EventProcessor goes into the ERROR/NEEDS_INVALIDATE state when it hits 
unexpected failures while processing an event. The causes are usually bugs. When it 
stops, all tables that need syncing are impacted. However, the event might be 
just a single-table event (in contrast to multi-table events like RenameTable, 
DropDatabaseCascade, CommitTxn, AbortTxn). We can consider skipping the event 
and just invalidating the table.
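
A minimal sketch of the proposed policy, assuming a simplified event shape (a dict with "type" and "table" keys) and a hypothetical ToyEventProcessor class. None of these names are Impala's real API. Only the list of multi-table event types is taken from the description above.

```python
# Hypothetical sketch; not the real MetastoreEventsProcessor.
MULTI_TABLE_EVENTS = {"RenameTable", "DropDatabaseCascade", "CommitTxn", "AbortTxn"}


class ToyEventProcessor:
    def __init__(self, handler):
        self.handler = handler  # callable(event); may raise on unexpected failures
        self.state = "ACTIVE"
        self.invalidated_tables = []

    def process(self, events):
        for event in events:
            if self.state != "ACTIVE":
                break  # a stopped processor handles no further events
            try:
                self.handler(event)
            except Exception:
                if event["type"] in MULTI_TABLE_EVENTS:
                    # Several tables may be affected: stop, as today.
                    self.state = "ERROR"
                else:
                    # Single-table event: skip it and invalidate only that table.
                    self.invalidated_tables.append(event["table"])


def failing_handler(event):
    # Simulate a bug that breaks processing for some events.
    if event.get("fail"):
        raise RuntimeError("unexpected failure")


processor = ToyEventProcessor(failing_handler)
processor.process([
    {"type": "AlterTable", "table": "db.t1", "fail": True},
    {"type": "Reload", "table": "db.t2"},
    {"type": "RenameTable", "table": "db.t3", "fail": True},
    {"type": "Reload", "table": "db.t4", "fail": True},
])
```

Here the failing AlterTable event only invalidates db.t1 and processing continues, whereas the failing RenameTable event still stops the processor, so the final event is never reached.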






[jira] [Updated] (IMPALA-12831) HdfsTable.toMinimalTCatalogObject() should hold table read lock to generate incremental updates

2024-02-21 Thread Quanlong Huang (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-12831?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Quanlong Huang updated IMPALA-12831:

Description: 
When enable_incremental_metadata_updates=true (the default), catalogd sends 
incremental partition updates to coordinators, which goes through 
HdfsTable.toMinimalTCatalogObject():
{code:java}
  public TCatalogObject toMinimalTCatalogObject() {
    TCatalogObject catalogObject = super.toMinimalTCatalogObject();
    if (!BackendConfig.INSTANCE.isIncrementalMetadataUpdatesEnabled()) {
      return catalogObject;
    }
    catalogObject.getTable().setTable_type(TTableType.HDFS_TABLE);
    THdfsTable hdfsTable = new THdfsTable(hdfsBaseDir_, getColumnNames(),
        nullPartitionKeyValue_, nullColumnValue_,
        /*idToPartition=*/ new HashMap<>(),
        /*prototypePartition=*/ new THdfsPartition());
    for (HdfsPartition part : partitionMap_.values()) {
      hdfsTable.partitions.put(part.getId(), part.toMinimalTHdfsPartition());
    }
    hdfsTable.setHas_full_partitions(false);
    // The minimal catalog object of partitions contain the partition names.
    hdfsTable.setHas_partition_names(true);
    catalogObject.getTable().setHdfs_table(hdfsTable);
    return catalogObject;
  }{code}
Accessing table fields without holding the table read lock might fail due to 
concurrent DDLs. We've seen the event-processor fail while processing a RELOAD event 
that wants to invalidate an HdfsTable:
{noformat}
E0216 16:23:44.283689   253 MetastoreEventsProcessor.java:899] Unexpected exception received while processing event
Java exception follows:
java.util.ConcurrentModificationException
    at java.util.ArrayList$Itr.checkForComodification(ArrayList.java:911)
    at java.util.ArrayList$Itr.next(ArrayList.java:861)
    at org.apache.impala.catalog.Column.toColumnNames(Column.java:148)
    at org.apache.impala.catalog.Table.getColumnNames(Table.java:844)
    at org.apache.impala.catalog.HdfsTable.toMinimalTCatalogObject(HdfsTable.java:2132)
    at org.apache.impala.catalog.CatalogServiceCatalog.addIncompleteTable(CatalogServiceCatalog.java:2221)
    at org.apache.impala.catalog.CatalogServiceCatalog.addIncompleteTable(CatalogServiceCatalog.java:2202)
    at org.apache.impala.catalog.CatalogServiceCatalog.invalidateTable(CatalogServiceCatalog.java:2797)
    at org.apache.impala.catalog.events.MetastoreEvents$ReloadEvent.processTableInvalidate(MetastoreEvents.java:2734)
    at org.apache.impala.catalog.events.MetastoreEvents$ReloadEvent.process(MetastoreEvents.java:2656)
    at org.apache.impala.catalog.events.MetastoreEvents$MetastoreEvent.processIfEnabled(MetastoreEvents.java:522)
    at org.apache.impala.catalog.events.MetastoreEventsProcessor.processEvents(MetastoreEventsProcessor.java:1052)
    at org.apache.impala.catalog.events.MetastoreEventsProcessor.processEvents(MetastoreEventsProcessor.java:881)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:750){noformat}
I can reproduce the issue using the following test:
{code:python}
  @CustomClusterTestSuite.with_args(
    catalogd_args="--enable_incremental_metadata_updates=true")
  def test_concurrent_invalidate_metadata_with_refresh(self, unique_database):
    # Create a wide table with some partitions
    tbl = unique_database + ".wide_tbl"
    create_stmt = "create table {} (".format(tbl)
    for i in range(600):
      create_stmt += "col{} int, ".format(i)
    create_stmt += "col600 int) partitioned by (p int) stored as textfile"
    self.execute_query(create_stmt)
    for i in range(10):
      self.execute_query("alter table {} add partition (p={})".format(tbl, i))

    refresh_stmt = "refresh " + tbl
    handle = self.client.execute_async(refresh_stmt)
    for i in range(10):
      self.execute_query("invalidate metadata " + tbl)
      # Always keep a concurrent REFRESH statement running
      if self.client.get_state(handle) == self.client.QUERY_STATES['FINISHED']:
        handle = self.client.execute_async(refresh_stmt){code}
and see a similar exception:
{noformat}
E0222 10:44:40.912338  6833 JniUtil.java:183] 
da4099ef24bb1f03:01c8f5d2] Error in INVALIDATE TABLE 
test_concurrent_invalidate_metadata_with_refresh_65c57cb0.wide_tbl issued by 
quanlong. Time spent: 32ms 
I0222 10:44:40.912528  

[jira] [Created] (IMPALA-12831) HdfsTable.toMinimalTCatalogObject() should hold table read lock to generate incremental updates

2024-02-21 Thread Quanlong Huang (Jira)
Quanlong Huang created IMPALA-12831:
---

 Summary: HdfsTable.toMinimalTCatalogObject() should hold table 
read lock to generate incremental updates
 Key: IMPALA-12831
 URL: https://issues.apache.org/jira/browse/IMPALA-12831
 Project: IMPALA
  Issue Type: Bug
  Components: Catalog
Reporter: Quanlong Huang
Assignee: Quanlong Huang


When enable_incremental_metadata_updates=true (the default), catalogd sends 
incremental partition updates to coordinators, which goes through 
HdfsTable.toMinimalTCatalogObject():
{code:java}
  public TCatalogObject toMinimalTCatalogObject() {
    TCatalogObject catalogObject = super.toMinimalTCatalogObject();
    if (!BackendConfig.INSTANCE.isIncrementalMetadataUpdatesEnabled()) {
      return catalogObject;
    }
    catalogObject.getTable().setTable_type(TTableType.HDFS_TABLE);
    THdfsTable hdfsTable = new THdfsTable(hdfsBaseDir_, getColumnNames(),
        nullPartitionKeyValue_, nullColumnValue_,
        /*idToPartition=*/ new HashMap<>(),
        /*prototypePartition=*/ new THdfsPartition());
    for (HdfsPartition part : partitionMap_.values()) {
      hdfsTable.partitions.put(part.getId(), part.toMinimalTHdfsPartition());
    }
    hdfsTable.setHas_full_partitions(false);
    // The minimal catalog object of partitions contain the partition names.
    hdfsTable.setHas_partition_names(true);
    catalogObject.getTable().setHdfs_table(hdfsTable);
    return catalogObject;
  }{code}
 
Accessing table fields without holding the table read lock might fail due to 
concurrent DDLs. We've seen the event-processor fail while processing a RELOAD event 
that wants to invalidate an HdfsTable:
{noformat}
E0216 16:23:44.283689   253 MetastoreEventsProcessor.java:899] Unexpected exception received while processing event
Java exception follows:
java.util.ConcurrentModificationException
    at java.util.ArrayList$Itr.checkForComodification(ArrayList.java:911)
    at java.util.ArrayList$Itr.next(ArrayList.java:861)
    at org.apache.impala.catalog.Column.toColumnNames(Column.java:148)
    at org.apache.impala.catalog.Table.getColumnNames(Table.java:844)
    at org.apache.impala.catalog.HdfsTable.toMinimalTCatalogObject(HdfsTable.java:2132)
    at org.apache.impala.catalog.CatalogServiceCatalog.addIncompleteTable(CatalogServiceCatalog.java:2221)
    at org.apache.impala.catalog.CatalogServiceCatalog.addIncompleteTable(CatalogServiceCatalog.java:2202)
    at org.apache.impala.catalog.CatalogServiceCatalog.invalidateTable(CatalogServiceCatalog.java:2797)
    at org.apache.impala.catalog.events.MetastoreEvents$ReloadEvent.processTableInvalidate(MetastoreEvents.java:2734)
    at org.apache.impala.catalog.events.MetastoreEvents$ReloadEvent.process(MetastoreEvents.java:2656)
    at org.apache.impala.catalog.events.MetastoreEvents$MetastoreEvent.processIfEnabled(MetastoreEvents.java:522)
    at org.apache.impala.catalog.events.MetastoreEventsProcessor.processEvents(MetastoreEventsProcessor.java:1052)
    at org.apache.impala.catalog.events.MetastoreEventsProcessor.processEvents(MetastoreEventsProcessor.java:881)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:750){noformat}

I can reproduce the issue using the following test:
{code:python}
  @CustomClusterTestSuite.with_args(
    catalogd_args="--enable_incremental_metadata_updates=true")
  def test_concurrent_invalidate_metadata_with_refresh(self, unique_database):
    # Create a wide table with some partitions
    tbl = unique_database + ".wide_tbl"
    create_stmt = "create table {} (".format(tbl)
    for i in range(600):
      create_stmt += "col{} int, ".format(i)
    create_stmt += "col600 int) partitioned by (p int) stored as textfile"
    self.execute_query(create_stmt)
    for i in range(10):
      self.execute_query("alter table {} add partition (p={})".format(tbl, i))

    refresh_stmt = "refresh " + tbl
    handle = self.client.execute_async(refresh_stmt)
    for i in range(10):
      self.execute_query("invalidate metadata " + tbl)
      # Always keep a concurrent REFRESH statement running
      if self.client.get_state(handle) == self.client.QUERY_STATES['FINISHED']:
        handle = self.client.execute_async(refresh_stmt){code}
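
The race in the stack trace above has a close analogue in Python, which may help illustrate why the snapshot should be taken under a lock. The sketch below is an assumption-laden illustration, not the Impala fix: ToyTable and its methods are hypothetical, and a plain threading.Lock stands in for the table's read/write lock because the Python standard library has no reader-writer lock.

```python
import threading


def unguarded_iteration_fails():
    """Mutating a dict while iterating it raises RuntimeError in Python,
    much like ConcurrentModificationException for a Java ArrayList."""
    columns = {"col0": "int"}
    try:
        for name in columns:
            columns[name + "_copy"] = "int"  # simulates a concurrent DDL
    except RuntimeError:
        return True
    return False


class ToyTable:
    """Hypothetical stand-in for a catalog table; not Impala code."""

    def __init__(self, columns):
        self._lock = threading.Lock()  # stand-in for the table lock
        self._columns = dict(columns)

    def add_column(self, name, col_type):
        # DDL path: mutate only while holding the lock.
        with self._lock:
            self._columns[name] = col_type

    def to_minimal_object(self):
        # Reader path: copy the fields while holding the lock, so the
        # snapshot cannot observe a half-applied modification.
        with self._lock:
            return {"column_names": list(self._columns)}


table = ToyTable({"col0": "int"})
table.add_column("col1", "int")
snapshot = table.to_minimal_object()
```

The design point is that the reader builds its copy entirely inside the locked region and returns the copy, so callers never iterate the live collection.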




[jira] [Commented] (IMPALA-12573) Give configuration load_catalog_in_background more fine-grained configuration

2024-02-21 Thread Maxwell Guo (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-12573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17819452#comment-17819452
 ] 

Maxwell Guo commented on IMPALA-12573:
--

[~stigahuang] Thanks for your reply. I think load_dbs_in_background and 
load_tables_in_background may help. 

But maybe I didn't understand you clearly: should these two configurations be 
strings, like a table/db blacklist, for the case where someone wants certain 
tables to always be loaded? These configurations are not going to be boolean 
flags, am I right? 


> Give configuration load_catalog_in_background more fine-grained configuration
> -
>
> Key: IMPALA-12573
> URL: https://issues.apache.org/jira/browse/IMPALA-12573
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Catalog
>Reporter: Maxwell Guo
>Assignee: Maxwell Guo
>Priority: Minor
>
> As we know, if load_catalog_in_background is set to true, table metadata 
> is loaded asynchronously by catalogd.
> If the flag is set to true while catalogd starts up, all tables are loaded 
> asynchronously and the loading queue becomes large, so we may leave it set to 
> false by default. But if we invalidate some tables manually, we may want them 
> to be loaded. So I think we can introduce a new flag, 
> load_catalog_in_background_at_startup, and by default set 
> load_catalog_in_background_at_startup to false and 
> load_catalog_in_background to true. 
>  
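
One possible reading of the proposal quoted above, sketched as a small decision function. The function name and the at_startup parameter are hypothetical, and the exact interaction between the two flags is an assumption; the issue only states the flag names and the suggested defaults.

```python
def should_load_in_background(at_startup,
                              load_catalog_in_background=True,
                              load_catalog_in_background_at_startup=False):
    """Decide whether table metadata is loaded in the background.

    Assumed semantics: while catalogd is starting up, only the *_at_startup
    flag applies; afterwards (for example after a manual invalidate) the
    existing load_catalog_in_background flag applies.
    """
    if at_startup:
        return load_catalog_in_background_at_startup
    return load_catalog_in_background
```

With the suggested defaults, nothing is queued for background loading at startup, while tables invalidated later are still loaded in the background.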






[jira] [Comment Edited] (IMPALA-12830) test_webserver_hide_logs_link() could fail in the exhaustive build

2024-02-21 Thread Fang-Yu Rao (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-12830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17819426#comment-17819426
 ] 

Fang-Yu Rao edited comment on IMPALA-12830 at 2/22/24 12:43 AM:


This issue seems to be similar to IMPALA-12170.

cc: [~stigahuang]


was (Author: fangyurao):
This issue seems to be similar to IMPALA-12170.

> test_webserver_hide_logs_link() could fail in the exhaustive build
> --
>
> Key: IMPALA-12830
> URL: https://issues.apache.org/jira/browse/IMPALA-12830
> Project: IMPALA
>  Issue Type: Bug
>Reporter: Fang-Yu Rao
>Assignee: Saurabh Katiyal
>Priority: Major
>  Labels: broken-build
>
> We found in an internal Jenkins run that test_webserver_hide_logs_link() 
> could fail in the exhaustive build with the following error.
> +*Error Message*+
> {code:java}
> AssertionError: bad links from webui port 25020 assert ['/', 
> '/catal...g_level', ...] == ['/', '/catalo...g_level', ...]   At index 2 
> diff: u'/events' != '/hadoop-varz'   Full diff:   - [u'/',   ?  -   + ['/',   
> -  u'/catalog',   ?  -   +  '/catalog',   -  u'/events',   -  
> u'/hadoop-varz',   ?  -   +  '/hadoop-varz',   +  '/events',   -  u'/jmx',   
> ?  -   +  '/jmx',   -  u'/log_level',   ?  -   +  '/log_level',   -  
> u'/memz',   ?  -   +  '/memz',   -  u'/metrics',   ?  -   +  '/metrics',   -  
> u'/operations',   ?  -   +  '/operations',   -  u'/profile_docs',   ?  -   +  
> '/profile_docs',   -  u'/rpcz',   ?  -   +  '/rpcz',   -  u'/threadz',   ?  - 
>   +  '/threadz',   -  u'/varz']   ?  -   +  '/varz']
> {code}
> +*Stacktrace*+
> {code:java}
> custom_cluster/test_web_pages.py:248: in test_webserver_hide_logs_link
> assert found_links == expected_catalog_links, msg
> E   AssertionError: bad links from webui port 25020
> E   assert ['/', '/catal...g_level', ...] == ['/', '/catalo...g_level', ...]
> E At index 2 diff: u'/events' != '/hadoop-varz'
> E Full diff:
> E - [u'/',
> E ?  -
> E + ['/',
> E -  u'/catalog',
> E ?  -
> E +  '/catalog',
> E -  u'/events',
> E -  u'/hadoop-varz',
> E ?  -
> E +  '/hadoop-varz',
> E +  '/events',
> E -  u'/jmx',
> E ?  -
> E +  '/jmx',
> E -  u'/log_level',
> E ?  -
> E +  '/log_level',
> E -  u'/memz',
> E ?  -
> E +  '/memz',
> E -  u'/metrics',
> E ?  -
> E +  '/metrics',
> E -  u'/operations',
> E ?  -
> E +  '/operations',
> E -  u'/profile_docs',
> E ?  -
> E +  '/profile_docs',
> E -  u'/rpcz',
> E ?  -
> E +  '/rpcz',
> E -  u'/threadz',
> E ?  -
> E +  '/threadz',
> E -  u'/varz']
> E ?  -
> E +  '/varz']
> {code}






[jira] [Commented] (IMPALA-12830) test_webserver_hide_logs_link() could fail in the exhaustive build

2024-02-21 Thread Fang-Yu Rao (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-12830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17819426#comment-17819426
 ] 

Fang-Yu Rao commented on IMPALA-12830:
--

This issue seems to be similar to IMPALA-12170.

> test_webserver_hide_logs_link() could fail in the exhaustive build
> --
>
> Key: IMPALA-12830
> URL: https://issues.apache.org/jira/browse/IMPALA-12830
> Project: IMPALA
>  Issue Type: Bug
>Reporter: Fang-Yu Rao
>Assignee: Saurabh Katiyal
>Priority: Major
>  Labels: broken-build
>
> We found in an internal Jenkins run that test_webserver_hide_logs_link() 
> could fail in the exhaustive build with the following error.
> +*Error Message*+
> {code:java}
> AssertionError: bad links from webui port 25020 assert ['/', 
> '/catal...g_level', ...] == ['/', '/catalo...g_level', ...]   At index 2 
> diff: u'/events' != '/hadoop-varz'   Full diff:   - [u'/',   ?  -   + ['/',   
> -  u'/catalog',   ?  -   +  '/catalog',   -  u'/events',   -  
> u'/hadoop-varz',   ?  -   +  '/hadoop-varz',   +  '/events',   -  u'/jmx',   
> ?  -   +  '/jmx',   -  u'/log_level',   ?  -   +  '/log_level',   -  
> u'/memz',   ?  -   +  '/memz',   -  u'/metrics',   ?  -   +  '/metrics',   -  
> u'/operations',   ?  -   +  '/operations',   -  u'/profile_docs',   ?  -   +  
> '/profile_docs',   -  u'/rpcz',   ?  -   +  '/rpcz',   -  u'/threadz',   ?  - 
>   +  '/threadz',   -  u'/varz']   ?  -   +  '/varz']
> {code}
> +*Stacktrace*+
> {code:java}
> custom_cluster/test_web_pages.py:248: in test_webserver_hide_logs_link
> assert found_links == expected_catalog_links, msg
> E   AssertionError: bad links from webui port 25020
> E   assert ['/', '/catal...g_level', ...] == ['/', '/catalo...g_level', ...]
> E At index 2 diff: u'/events' != '/hadoop-varz'
> E Full diff:
> E - [u'/',
> E ?  -
> E + ['/',
> E -  u'/catalog',
> E ?  -
> E +  '/catalog',
> E -  u'/events',
> E -  u'/hadoop-varz',
> E ?  -
> E +  '/hadoop-varz',
> E +  '/events',
> E -  u'/jmx',
> E ?  -
> E +  '/jmx',
> E -  u'/log_level',
> E ?  -
> E +  '/log_level',
> E -  u'/memz',
> E ?  -
> E +  '/memz',
> E -  u'/metrics',
> E ?  -
> E +  '/metrics',
> E -  u'/operations',
> E ?  -
> E +  '/operations',
> E -  u'/profile_docs',
> E ?  -
> E +  '/profile_docs',
> E -  u'/rpcz',
> E ?  -
> E +  '/rpcz',
> E -  u'/threadz',
> E ?  -
> E +  '/threadz',
> E -  u'/varz']
> E ?  -
> E +  '/varz']
> {code}






[jira] [Commented] (IMPALA-12830) test_webserver_hide_logs_link() could fail in the exhaustive build

2024-02-21 Thread Fang-Yu Rao (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-12830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17819425#comment-17819425
 ] 

Fang-Yu Rao commented on IMPALA-12830:
--

Hi [~skatiyal], assigned the JIRA to you since you revised the test case in 
IMPALA-9086 (Show Hive configurations in /hadoop-varz page) and thus may be 
more familiar with the context. Please feel free to re-assign as you see 
appropriate. Thanks!

> test_webserver_hide_logs_link() could fail in the exhaustive build
> --
>
> Key: IMPALA-12830
> URL: https://issues.apache.org/jira/browse/IMPALA-12830
> Project: IMPALA
>  Issue Type: Bug
>Reporter: Fang-Yu Rao
>Assignee: Saurabh Katiyal
>Priority: Major
>  Labels: broken-build
>
> We found in an internal Jenkins run that test_webserver_hide_logs_link() 
> could fail in the exhaustive build with the following error.
> +*Error Message*+
> {code:java}
> AssertionError: bad links from webui port 25020 assert ['/', 
> '/catal...g_level', ...] == ['/', '/catalo...g_level', ...]   At index 2 
> diff: u'/events' != '/hadoop-varz'   Full diff:   - [u'/',   ?  -   + ['/',   
> -  u'/catalog',   ?  -   +  '/catalog',   -  u'/events',   -  
> u'/hadoop-varz',   ?  -   +  '/hadoop-varz',   +  '/events',   -  u'/jmx',   
> ?  -   +  '/jmx',   -  u'/log_level',   ?  -   +  '/log_level',   -  
> u'/memz',   ?  -   +  '/memz',   -  u'/metrics',   ?  -   +  '/metrics',   -  
> u'/operations',   ?  -   +  '/operations',   -  u'/profile_docs',   ?  -   +  
> '/profile_docs',   -  u'/rpcz',   ?  -   +  '/rpcz',   -  u'/threadz',   ?  - 
>   +  '/threadz',   -  u'/varz']   ?  -   +  '/varz']
> {code}
> +*Stacktrace*+
> {code:java}
> custom_cluster/test_web_pages.py:248: in test_webserver_hide_logs_link
> assert found_links == expected_catalog_links, msg
> E   AssertionError: bad links from webui port 25020
> E   assert ['/', '/catal...g_level', ...] == ['/', '/catalo...g_level', ...]
> E At index 2 diff: u'/events' != '/hadoop-varz'
> E Full diff:
> E - [u'/',
> E ?  -
> E + ['/',
> E -  u'/catalog',
> E ?  -
> E +  '/catalog',
> E -  u'/events',
> E -  u'/hadoop-varz',
> E ?  -
> E +  '/hadoop-varz',
> E +  '/events',
> E -  u'/jmx',
> E ?  -
> E +  '/jmx',
> E -  u'/log_level',
> E ?  -
> E +  '/log_level',
> E -  u'/memz',
> E ?  -
> E +  '/memz',
> E -  u'/metrics',
> E ?  -
> E +  '/metrics',
> E -  u'/operations',
> E ?  -
> E +  '/operations',
> E -  u'/profile_docs',
> E ?  -
> E +  '/profile_docs',
> E -  u'/rpcz',
> E ?  -
> E +  '/rpcz',
> E -  u'/threadz',
> E ?  -
> E +  '/threadz',
> E -  u'/varz']
> E ?  -
> E +  '/varz']
> {code}






[jira] [Created] (IMPALA-12830) test_web_pages() could fail in the exhaustive build

2024-02-21 Thread Fang-Yu Rao (Jira)
Fang-Yu Rao created IMPALA-12830:


 Summary: test_web_pages() could fail in the exhaustive build
 Key: IMPALA-12830
 URL: https://issues.apache.org/jira/browse/IMPALA-12830
 Project: IMPALA
  Issue Type: Bug
Reporter: Fang-Yu Rao
Assignee: Saurabh Katiyal


We found in an internal Jenkins run that test_web_pages() could fail in the 
exhaustive build with the following error.
+*Error Message*+
{code}
AssertionError: bad links from webui port 25020 assert ['/', 
'/catal...g_level', ...] == ['/', '/catalo...g_level', ...]   At index 2 diff: 
u'/events' != '/hadoop-varz'   Full diff:   - [u'/',   ?  -   + ['/',   -  
u'/catalog',   ?  -   +  '/catalog',   -  u'/events',   -  u'/hadoop-varz',   ? 
 -   +  '/hadoop-varz',   +  '/events',   -  u'/jmx',   ?  -   +  '/jmx',   -  
u'/log_level',   ?  -   +  '/log_level',   -  u'/memz',   ?  -   +  '/memz',   
-  u'/metrics',   ?  -   +  '/metrics',   -  u'/operations',   ?  -   +  
'/operations',   -  u'/profile_docs',   ?  -   +  '/profile_docs',   -  
u'/rpcz',   ?  -   +  '/rpcz',   -  u'/threadz',   ?  -   +  '/threadz',   -  
u'/varz']   ?  -   +  '/varz']
{code}

+*Stacktrace*+
{code}
custom_cluster/test_web_pages.py:248: in test_webserver_hide_logs_link
assert found_links == expected_catalog_links, msg
E   AssertionError: bad links from webui port 25020
E   assert ['/', '/catal...g_level', ...] == ['/', '/catalo...g_level', ...]
E At index 2 diff: u'/events' != '/hadoop-varz'
E Full diff:
E - [u'/',
E ?  -
E + ['/',
E -  u'/catalog',
E ?  -
E +  '/catalog',
E -  u'/events',
E -  u'/hadoop-varz',
E ?  -
E +  '/hadoop-varz',
E +  '/events',
E -  u'/jmx',
E ?  -
E +  '/jmx',
E -  u'/log_level',
E ?  -
E +  '/log_level',
E -  u'/memz',
E ?  -
E +  '/memz',
E -  u'/metrics',
E ?  -
E +  '/metrics',
E -  u'/operations',
E ?  -
E +  '/operations',
E -  u'/profile_docs',
E ?  -
E +  '/profile_docs',
E -  u'/rpcz',
E ?  -
E +  '/rpcz',
E -  u'/threadz',
E ?  -
E +  '/threadz',
E -  u'/varz']
E ?  -
E +  '/varz']
{code}
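
Reading the diff above, the two lists appear to hold the same links, with only '/events' and '/hadoop-varz' swapped. Whether the test is meant to check ordering is not stated in the issue, so the following is just a sketch of what an order-insensitive comparison could look like. The helper name same_links is hypothetical, and the two lists are transcribed from the diff.

```python
def same_links(found_links, expected_links):
    """Order-insensitive comparison that still detects missing,
    extra, or duplicated links."""
    return sorted(found_links) == sorted(expected_links)


# Lists transcribed from the assertion diff in the report.
found_links = ['/', '/catalog', '/events', '/hadoop-varz', '/jmx', '/log_level',
               '/memz', '/metrics', '/operations', '/profile_docs', '/rpcz',
               '/threadz', '/varz']
expected_catalog_links = ['/', '/catalog', '/hadoop-varz', '/events', '/jmx',
                          '/log_level', '/memz', '/metrics', '/operations',
                          '/profile_docs', '/rpcz', '/threadz', '/varz']
```

A plain list equality check fails on these inputs because of the swapped pair, while the sorted comparison passes and would still fail if a link were missing.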






[jira] [Updated] (IMPALA-12830) test_webserver_hide_logs_link() could fail in the exhaustive build

2024-02-21 Thread Fang-Yu Rao (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-12830?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fang-Yu Rao updated IMPALA-12830:
-
Summary: test_webserver_hide_logs_link() could fail in the exhaustive build 
 (was: test_web_pages() could fail in the exhaustive build)

> test_webserver_hide_logs_link() could fail in the exhaustive build
> --
>
> Key: IMPALA-12830
> URL: https://issues.apache.org/jira/browse/IMPALA-12830
> Project: IMPALA
>  Issue Type: Bug
>Reporter: Fang-Yu Rao
>Assignee: Saurabh Katiyal
>Priority: Major
>  Labels: broken-build
>
> We found in an internal Jenkins run that test_web_pages() could fail in the 
> exhaustive build with the following error.
> +*Error Message*+
> {code}
> AssertionError: bad links from webui port 25020 assert ['/', 
> '/catal...g_level', ...] == ['/', '/catalo...g_level', ...]   At index 2 
> diff: u'/events' != '/hadoop-varz'   Full diff:   - [u'/',   ?  -   + ['/',   
> -  u'/catalog',   ?  -   +  '/catalog',   -  u'/events',   -  
> u'/hadoop-varz',   ?  -   +  '/hadoop-varz',   +  '/events',   -  u'/jmx',   
> ?  -   +  '/jmx',   -  u'/log_level',   ?  -   +  '/log_level',   -  
> u'/memz',   ?  -   +  '/memz',   -  u'/metrics',   ?  -   +  '/metrics',   -  
> u'/operations',   ?  -   +  '/operations',   -  u'/profile_docs',   ?  -   +  
> '/profile_docs',   -  u'/rpcz',   ?  -   +  '/rpcz',   -  u'/threadz',   ?  - 
>   +  '/threadz',   -  u'/varz']   ?  -   +  '/varz']
> {code}
> +*Stacktrace*+
> {code}
> custom_cluster/test_web_pages.py:248: in test_webserver_hide_logs_link
> assert found_links == expected_catalog_links, msg
> E   AssertionError: bad links from webui port 25020
> E   assert ['/', '/catal...g_level', ...] == ['/', '/catalo...g_level', ...]
> E At index 2 diff: u'/events' != '/hadoop-varz'
> E Full diff:
> E - [u'/',
> E ?  -
> E + ['/',
> E -  u'/catalog',
> E ?  -
> E +  '/catalog',
> E -  u'/events',
> E -  u'/hadoop-varz',
> E ?  -
> E +  '/hadoop-varz',
> E +  '/events',
> E -  u'/jmx',
> E ?  -
> E +  '/jmx',
> E -  u'/log_level',
> E ?  -
> E +  '/log_level',
> E -  u'/memz',
> E ?  -
> E +  '/memz',
> E -  u'/metrics',
> E ?  -
> E +  '/metrics',
> E -  u'/operations',
> E ?  -
> E +  '/operations',
> E -  u'/profile_docs',
> E ?  -
> E +  '/profile_docs',
> E -  u'/rpcz',
> E ?  -
> E +  '/rpcz',
> E -  u'/threadz',
> E ?  -
> E +  '/threadz',
> E -  u'/varz']
> E ?  -
> E +  '/varz']
> {code}






[jira] [Updated] (IMPALA-12830) test_webserver_hide_logs_link() could fail in the exhaustive build

2024-02-21 Thread Fang-Yu Rao (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-12830?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fang-Yu Rao updated IMPALA-12830:
-
Description: 
We found in an internal Jenkins run that test_webserver_hide_logs_link() could 
fail in the exhaustive build with the following error.
+*Error Message*+
{code:java}
AssertionError: bad links from webui port 25020 assert ['/', 
'/catal...g_level', ...] == ['/', '/catalo...g_level', ...]   At index 2 diff: 
u'/events' != '/hadoop-varz'   Full diff:   - [u'/',   ?  -   + ['/',   -  
u'/catalog',   ?  -   +  '/catalog',   -  u'/events',   -  u'/hadoop-varz',   ? 
 -   +  '/hadoop-varz',   +  '/events',   -  u'/jmx',   ?  -   +  '/jmx',   -  
u'/log_level',   ?  -   +  '/log_level',   -  u'/memz',   ?  -   +  '/memz',   
-  u'/metrics',   ?  -   +  '/metrics',   -  u'/operations',   ?  -   +  
'/operations',   -  u'/profile_docs',   ?  -   +  '/profile_docs',   -  
u'/rpcz',   ?  -   +  '/rpcz',   -  u'/threadz',   ?  -   +  '/threadz',   -  
u'/varz']   ?  -   +  '/varz']
{code}
+*Stacktrace*+
{code:java}
custom_cluster/test_web_pages.py:248: in test_webserver_hide_logs_link
assert found_links == expected_catalog_links, msg
E   AssertionError: bad links from webui port 25020
E   assert ['/', '/catal...g_level', ...] == ['/', '/catalo...g_level', ...]
E At index 2 diff: u'/events' != '/hadoop-varz'
E Full diff:
E - [u'/',
E ?  -
E + ['/',
E -  u'/catalog',
E ?  -
E +  '/catalog',
E -  u'/events',
E -  u'/hadoop-varz',
E ?  -
E +  '/hadoop-varz',
E +  '/events',
E -  u'/jmx',
E ?  -
E +  '/jmx',
E -  u'/log_level',
E ?  -
E +  '/log_level',
E -  u'/memz',
E ?  -
E +  '/memz',
E -  u'/metrics',
E ?  -
E +  '/metrics',
E -  u'/operations',
E ?  -
E +  '/operations',
E -  u'/profile_docs',
E ?  -
E +  '/profile_docs',
E -  u'/rpcz',
E ?  -
E +  '/rpcz',
E -  u'/threadz',
E ?  -
E +  '/threadz',
E -  u'/varz']
E ?  -
E +  '/varz']
{code}

  was:
We found in an internal Jenkins run that test_web_pages() could fail in the 
exhaustive build with the following error.
+*Error Message*+
{code}
AssertionError: bad links from webui port 25020 assert ['/', 
'/catal...g_level', ...] == ['/', '/catalo...g_level', ...]   At index 2 diff: 
u'/events' != '/hadoop-varz'   Full diff:   - [u'/',   ?  -   + ['/',   -  
u'/catalog',   ?  -   +  '/catalog',   -  u'/events',   -  u'/hadoop-varz',   ? 
 -   +  '/hadoop-varz',   +  '/events',   -  u'/jmx',   ?  -   +  '/jmx',   -  
u'/log_level',   ?  -   +  '/log_level',   -  u'/memz',   ?  -   +  '/memz',   
-  u'/metrics',   ?  -   +  '/metrics',   -  u'/operations',   ?  -   +  
'/operations',   -  u'/profile_docs',   ?  -   +  '/profile_docs',   -  
u'/rpcz',   ?  -   +  '/rpcz',   -  u'/threadz',   ?  -   +  '/threadz',   -  
u'/varz']   ?  -   +  '/varz']
{code}

+*Stacktrace*+
{code}
custom_cluster/test_web_pages.py:248: in test_webserver_hide_logs_link
assert found_links == expected_catalog_links, msg
E   AssertionError: bad links from webui port 25020
E   assert ['/', '/catal...g_level', ...] == ['/', '/catalo...g_level', ...]
E At index 2 diff: u'/events' != '/hadoop-varz'
E Full diff:
E - [u'/',
E ?  -
E + ['/',
E -  u'/catalog',
E ?  -
E +  '/catalog',
E -  u'/events',
E -  u'/hadoop-varz',
E ?  -
E +  '/hadoop-varz',
E +  '/events',
E -  u'/jmx',
E ?  -
E +  '/jmx',
E -  u'/log_level',
E ?  -
E +  '/log_level',
E -  u'/memz',
E ?  -
E +  '/memz',
E -  u'/metrics',
E ?  -
E +  '/metrics',
E -  u'/operations',
E ?  -
E +  '/operations',
E -  u'/profile_docs',
E ?  -
E +  '/profile_docs',
E -  u'/rpcz',
E ?  -
E +  '/rpcz',
E -  u'/threadz',
E ?  -
E +  '/threadz',
E -  u'/varz']
E ?  -
E +  '/varz']
{code}


> test_webserver_hide_logs_link() could fail in the exhaustive build
> --
>
> Key: IMPALA-12830
> URL: https://issues.apache.org/jira/browse/IMPALA-12830
> Project: IMPALA
>  Issue Type: Bug
>Reporter: Fang-Yu Rao
>Assignee: Saurabh Katiyal
>Priority: Major
>  Labels: broken-build
>
> We found in an internal Jenkins run that test_webserver_hide_logs_link() 
> could fail in the exhaustive build with the following error.
> +*Error Message*+
> {code:java}
> AssertionError: bad links from webui port 25020 assert ['/', 
> '/catal...g_level', ...] == ['/', '/catalo...g_level', ...]   At index 2 
> diff: u'/events' != '/hadoop-varz'   Full diff:   - [u'/',   ?  -   + ['/',   
> -  u'/catalog',   ?  -   +  '/catalog',   -  u'/events',   -  
> u'/hadoop-varz',   ?  -   +  

[jira] [Updated] (IMPALA-12828) Remove Usage of "this->"

2024-02-21 Thread Jason Fehr (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-12828?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Fehr updated IMPALA-12828:

Description: The commit 
[408c606|https://github.com/apache/impala/commit/408c606dd9b891b3abc7f5ce95c95a92334abb8b]
 added code that used the pattern "this->" unnecessarily.  Remove instances of 
using "this->" from the code changes in this commit.  (was: Remove instances of 
using "this->" from the internal server code.)

> Remove Usage of "this->"
> 
>
> Key: IMPALA-12828
> URL: https://issues.apache.org/jira/browse/IMPALA-12828
> Project: IMPALA
>  Issue Type: Sub-task
>  Components: Backend
>Reporter: Jason Fehr
>Assignee: Jason Fehr
>Priority: Major
>
> The commit 
> [408c606|https://github.com/apache/impala/commit/408c606dd9b891b3abc7f5ce95c95a92334abb8b]
>  added code that used the pattern "this->" unnecessarily.  Remove instances 
> of using "this->" from the code changes in this commit.






[jira] [Updated] (IMPALA-12828) Remove Usage of "this->"

2024-02-21 Thread Jason Fehr (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-12828?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Fehr updated IMPALA-12828:

Parent: IMPALA-12426
Issue Type: Sub-task  (was: Task)

> Remove Usage of "this->"
> 
>
> Key: IMPALA-12828
> URL: https://issues.apache.org/jira/browse/IMPALA-12828
> Project: IMPALA
>  Issue Type: Sub-task
>  Components: Backend
>Reporter: Jason Fehr
>Assignee: Jason Fehr
>Priority: Major
>
> Remove instances of using "this->" from the internal server code.






[jira] [Updated] (IMPALA-12426) SQL Interface to Completed Queries/DDLs/DMLs

2024-02-21 Thread Jason Fehr (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-12426?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Fehr updated IMPALA-12426:

Issue Type: New Feature  (was: Story)

> SQL Interface to Completed Queries/DDLs/DMLs
> 
>
> Key: IMPALA-12426
> URL: https://issues.apache.org/jira/browse/IMPALA-12426
> Project: IMPALA
>  Issue Type: New Feature
>  Components: Backend, be
>Reporter: Jason Fehr
>Assignee: Jason Fehr
>Priority: Major
>  Labels: impala, workload-management
>
> Implement a way of querying (via SQL) information about completed 
> queries/DDLs/DMLs.  Add coordinator startup flags that let users specify that 
> Impala will track completed queries in an internal table.
> Impala will create and maintain an internal Iceberg table named 
> "impala_query_log" in the "system database" that contains all completed 
> queries. This table is automatically created at startup by each coordinator 
> if it does not exist. Then, each completed query is queued in memory and 
> flushed to the query history table either at a set interval (a user-specified 
> number of minutes) or when a user-specified number of completed queries is 
> queued in memory.  Partition this table by the hour of the query end time.
> Data in this table must match the corresponding data in the query profile.  
> Develop automated testing that asserts this requirement is true.
> Don't write use, show, and set queries to this table.
> Add the following metrics to the "impala-server" metrics group:
> * Number of completed queries queued in memory waiting to be written to the 
> table.
> * Number of completed queries successfully written to the table.
> * Number of attempts that failed to write completed queries to the table.
> * Number of times completed queries were written at the regularly scheduled 
> time.
> * Number of times completed queries were written before the scheduled time 
> because the max number of queued records was reached.
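
The queue-and-flush behaviour and the metrics described above can be sketched as follows. This is a minimal illustration under stated assumptions, not Impala's implementation: the class name CompletedQueryBuffer, its parameters, the metric keys, and the injected flush callable are all hypothetical, and the real coordinator writes to an Iceberg table rather than calling a Python function.

```python
import time


class CompletedQueryBuffer:
    """Queues completed-query records in memory and flushes them either when
    max_queued records are waiting or when interval_s seconds have passed."""

    def __init__(self, flush, max_queued, interval_s, clock=time.monotonic):
        self._flush = flush            # hypothetical writer: takes a list of records
        self._max_queued = max_queued
        self._interval_s = interval_s
        self._clock = clock
        self._queue = []
        self._last_flush = clock()
        # Hypothetical metric names mirroring the five metrics listed above.
        self.metrics = {
            "queued": 0,
            "written": 0,
            "write_failures": 0,
            "scheduled_flushes": 0,
            "max_queued_flushes": 0,
        }

    def add(self, record):
        """Queue one record; flush early if the queue limit is reached."""
        self._queue.append(record)
        self.metrics["queued"] = len(self._queue)
        if len(self._queue) >= self._max_queued:
            self._do_flush("max_queued_flushes")

    def tick(self):
        """Call periodically; flushes when the scheduled interval has elapsed."""
        if self._queue and self._clock() - self._last_flush >= self._interval_s:
            self._do_flush("scheduled_flushes")

    def _do_flush(self, reason):
        batch = list(self._queue)
        try:
            self._flush(batch)
        except Exception:
            # Keep the records queued so that a later flush can retry them.
            self.metrics["write_failures"] += 1
            return
        self._queue.clear()
        self.metrics["queued"] = 0
        self.metrics["written"] += len(batch)
        self.metrics[reason] += 1
        self._last_flush = self._clock()
```

Injecting the clock keeps the sketch testable without sleeping; a real implementation would also need locking, since queries complete on many threads.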






[jira] [Updated] (IMPALA-12426) SQL Interface to Completed Queries/DDLs/DMLs

2024-02-21 Thread Jason Fehr (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-12426?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Fehr updated IMPALA-12426:

Issue Type: New Feature  (was: Improvement)

> SQL Interface to Completed Queries/DDLs/DMLs
> 
>
> Key: IMPALA-12426
> URL: https://issues.apache.org/jira/browse/IMPALA-12426
> Project: IMPALA
>  Issue Type: New Feature
>  Components: Backend, be
>Reporter: Jason Fehr
>Assignee: Jason Fehr
>Priority: Major
>  Labels: impala, workload-management
>
> Implement a way of querying (via SQL) information about completed 
> queries/DDLs/DMLs.  Add coordinator startup flags that let users specify that 
> Impala will track completed queries in an internal table.
> Impala will create and maintain an internal Iceberg table named 
> "impala_query_log" in the "system database" that contains all completed 
> queries. This table is automatically created at startup by each coordinator 
> if it does not exist. Then, each completed query is queued in memory and 
> flushed to the query history table either at a set interval (a user-specified 
> number of minutes) or when a user-specified number of completed queries is 
> queued in memory.  Partition this table by the hour of the query end time.
> Data in this table must match the corresponding data in the query profile.  
> Develop automated testing that asserts this requirement is true.
> Don't write use, show, and set queries to this table.
> Add the following metrics to the "impala-server" metrics group:
> * Number of completed queries queued in memory waiting to be written to the 
> table.
> * Number of completed queries successfully written to the table.
> * Number of attempts that failed to write completed queries to the table.
> * Number of times completed queries were written at the regularly scheduled 
> time.
> * Number of times completed queries were written before the scheduled time 
> because the max number of queued records was reached.






[jira] [Updated] (IMPALA-12426) SQL Interface to Completed Queries/DDLs/DMLs

2024-02-21 Thread Jason Fehr (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-12426?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Fehr updated IMPALA-12426:

Issue Type: Story  (was: New Feature)

> SQL Interface to Completed Queries/DDLs/DMLs
> 
>
> Key: IMPALA-12426
> URL: https://issues.apache.org/jira/browse/IMPALA-12426
> Project: IMPALA
>  Issue Type: Story
>  Components: Backend, be
>Reporter: Jason Fehr
>Assignee: Jason Fehr
>Priority: Major
>  Labels: impala, workload-management
>
> Implement a way of querying (via SQL) information about completed 
> queries/DDLs/DMLs.  Add coordinator startup flags that let users specify that 
> Impala will track completed queries in an internal table.
> Impala will create and maintain an internal Iceberg table named 
> "impala_query_log" in the "system database" that contains all completed 
> queries. This table is automatically created at startup by each coordinator 
> if it does not exist. Then, each completed query is queued in memory and 
> flushed to the query history table either at a set interval (a user-specified 
> number of minutes) or when a user-specified number of completed queries is 
> queued in memory.  Partition this table by the hour of the query end time.
> Data in this table must match the corresponding data in the query profile.  
> Develop automated testing that asserts this requirement is true.
> Don't write use, show, and set queries to this table.
> Add the following metrics to the "impala-server" metrics group:
> * Number of completed queries queued in memory waiting to be written to the 
> table.
> * Number of completed queries successfully written to the table.
> * Number of attempts that failed to write completed queries to the table.
> * Number of times completed queries were written at the regularly scheduled 
> time.
> * Number of times completed queries were written before the scheduled time 
> because the max number of queued records was reached.






[jira] [Created] (IMPALA-12828) Remove Usage of "this->"

2024-02-21 Thread Jason Fehr (Jira)
Jason Fehr created IMPALA-12828:
---

 Summary: Remove Usage of "this->"
 Key: IMPALA-12828
 URL: https://issues.apache.org/jira/browse/IMPALA-12828
 Project: IMPALA
  Issue Type: Task
  Components: Backend
Reporter: Jason Fehr
Assignee: Jason Fehr


Remove instances of using "this->" from the internal server code.






[jira] [Closed] (IMPALA-12824) Add Built-in Functions to Pretty Print Duration and Bytes

2024-02-21 Thread Jason Fehr (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-12824?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Fehr closed IMPALA-12824.
---
Resolution: Fixed

> Add Built-in Functions to Pretty Print Duration and Bytes
> -
>
> Key: IMPALA-12824
> URL: https://issues.apache.org/jira/browse/IMPALA-12824
> Project: IMPALA
>  Issue Type: Improvement
>  Components: be
>Reporter: Jason Fehr
>Assignee: Jason Fehr
>Priority: Major
>  Labels: backend, functions
>
> Implement new built-in Impala string functions to pretty print time duration 
> from an input of nanoseconds and pretty print a memory value from an input of 
> bytes.
> For example, pretty printing a duration of 2147483648 nanoseconds would 
> output "2s147ms".  Pretty printing a memory value of 32768 would output 
> "32.00 KB".






[jira] [Updated] (IMPALA-12824) Add Built-in Functions to Pretty Print Duration and Bytes

2024-02-21 Thread Jason Fehr (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-12824?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Fehr updated IMPALA-12824:

Summary: Add Built-in Functions to Pretty Print Duration and Bytes  (was: 
Add Built-in Functions to Pretty Print Duration and Bytes.)

> Add Built-in Functions to Pretty Print Duration and Bytes
> -
>
> Key: IMPALA-12824
> URL: https://issues.apache.org/jira/browse/IMPALA-12824
> Project: IMPALA
>  Issue Type: Improvement
>  Components: be
>Reporter: Jason Fehr
>Assignee: Jason Fehr
>Priority: Major
>  Labels: backend, functions
>
> Implement new built-in Impala string functions to pretty print time duration 
> from an input of nanoseconds and pretty print a memory value from an input of 
> bytes.
> For example, pretty printing a duration of 2147483648 nanoseconds would 
> output "2s147ms".  Pretty printing a memory value of 32768 would output 
> "32.00 KB".






[jira] [Commented] (IMPALA-12609) Implement SHOW TABLES IN statement to list Iceberg Metadata tables

2024-02-21 Thread Daniel Becker (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-12609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17819255#comment-17819255
 ] 

Daniel Becker commented on IMPALA-12609:


https://gerrit.cloudera.org/#/c/21026/

> Implement SHOW TABLES IN statement to list Iceberg Metadata tables
> --
>
> Key: IMPALA-12609
> URL: https://issues.apache.org/jira/browse/IMPALA-12609
> Project: IMPALA
>  Issue Type: Sub-task
>  Components: Frontend
>Affects Versions: Impala 4.4.0
>Reporter: Tamas Mate
>Assignee: Daniel Becker
>Priority: Minor
>  Labels: impala-iceberg
>
> {{SHOW TABLES IN}} statement could be used to list all the available metadata 
> tables of an Iceberg table.






[jira] [Updated] (IMPALA-12827) Precondition was hit in MutableValidReaderWriteIdList

2024-02-21 Thread Csaba Ringhofer (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-12827?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Csaba Ringhofer updated IMPALA-12827:
-
Description: 
The call stack below caused the metastore event processor to stop while 
processing an abort transaction event:
{code}
MetastoreEventsProcessor.java:899] Unexpected exception received while processing event
Java exception follows:
java.lang.IllegalStateException
    at com.google.common.base.Preconditions.checkState(Preconditions.java:486)
    at org.apache.impala.hive.common.MutableValidReaderWriteIdList.addAbortedWriteIds(MutableValidReaderWriteIdList.java:274)
    at org.apache.impala.catalog.HdfsTable.addWriteIds(HdfsTable.java:3101)
    at org.apache.impala.catalog.CatalogServiceCatalog.addWriteIdsToTable(CatalogServiceCatalog.java:3885)
    at org.apache.impala.catalog.events.MetastoreEvents$AbortTxnEvent.addAbortedWriteIdsToTables(MetastoreEvents.java:2775)
    at org.apache.impala.catalog.events.MetastoreEvents$AbortTxnEvent.process(MetastoreEvents.java:2761)
    at org.apache.impala.catalog.events.MetastoreEvents$MetastoreEvent.processIfEnabled(MetastoreEvents.java:522)
    at org.apache.impala.catalog.events.MetastoreEventsProcessor.processEvents(MetastoreEventsProcessor.java:1052)
    at org.apache.impala.catalog.events.MetastoreEventsProcessor.processEvents(MetastoreEventsProcessor.java:881)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:750)
{code}

Precondition: 
https://github.com/apache/impala/blob/2f14fd29c0b47fc2c170a7f0eb1cecaf6b9704f4/fe/src/main/java/org/apache/impala/hive/common/MutableValidReaderWriteIdList.java#L274

I have not been able to reproduce this so far.



  was:
The call stack below caused the metastore event processor to stop while 
processing an abort transaction event:
{code}
MetastoreEventsProcessor.java:899] Unexpected exception received while processing event
Java exception follows:
java.lang.IllegalStateException
    at com.google.common.base.Preconditions.checkState(Preconditions.java:486)
    at org.apache.impala.hive.common.MutableValidReaderWriteIdList.addAbortedWriteIds(MutableValidReaderWriteIdList.java:274)
    at org.apache.impala.catalog.HdfsTable.addWriteIds(HdfsTable.java:3101)
    at org.apache.impala.catalog.CatalogServiceCatalog.addWriteIdsToTable(CatalogServiceCatalog.java:3885)
    at org.apache.impala.catalog.events.MetastoreEvents$AbortTxnEvent.addAbortedWriteIdsToTables(MetastoreEvents.java:2775)
    at org.apache.impala.catalog.events.MetastoreEvents$AbortTxnEvent.process(MetastoreEvents.java:2761)
    at org.apache.impala.catalog.events.MetastoreEvents$MetastoreEvent.processIfEnabled(MetastoreEvents.java:522)
    at org.apache.impala.catalog.events.MetastoreEventsProcessor.processEvents(MetastoreEventsProcessor.java:1052)
    at org.apache.impala.catalog.events.MetastoreEventsProcessor.processEvents(MetastoreEventsProcessor.java:881)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:750)
{code}

Precondition: 
https://github.com/apache/impala/blob/2f14fd29c0b47fc2c170a7f0eb1cecaf6b9704f4/fe/src/main/java/org/apache/impala/hive/common/MutableValidReaderWriteIdList.java#L274

I have not been able to reproduce this yet.




> Precondition was hit in MutableValidReaderWriteIdList
> -
>
> Key: IMPALA-12827
> URL: https://issues.apache.org/jira/browse/IMPALA-12827
> Project: IMPALA
>  Issue Type: Bug
>Reporter: Csaba Ringhofer
>Priority: Major
>  Labels: ACID, catalog
>
> The call stack below caused the metastore event processor to stop while 
> processing an abort transaction event:
> {code}
> 

[jira] [Updated] (IMPALA-12827) Precondition was hit in MutableValidReaderWriteIdList

2024-02-21 Thread Csaba Ringhofer (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-12827?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Csaba Ringhofer updated IMPALA-12827:
-
Labels: catalog  (was: )

> Precondition was hit in MutableValidReaderWriteIdList
> -
>
> Key: IMPALA-12827
> URL: https://issues.apache.org/jira/browse/IMPALA-12827
> Project: IMPALA
>  Issue Type: Bug
>Reporter: Csaba Ringhofer
>Priority: Major
>  Labels: catalog
>
> The call stack below caused the metastore event processor to stop while 
> processing an abort transaction event:
> {code}
> MetastoreEventsProcessor.java:899] Unexpected exception received while processing event
> Java exception follows:
> java.lang.IllegalStateException
>   at com.google.common.base.Preconditions.checkState(Preconditions.java:486)
>   at org.apache.impala.hive.common.MutableValidReaderWriteIdList.addAbortedWriteIds(MutableValidReaderWriteIdList.java:274)
>   at org.apache.impala.catalog.HdfsTable.addWriteIds(HdfsTable.java:3101)
>   at org.apache.impala.catalog.CatalogServiceCatalog.addWriteIdsToTable(CatalogServiceCatalog.java:3885)
>   at org.apache.impala.catalog.events.MetastoreEvents$AbortTxnEvent.addAbortedWriteIdsToTables(MetastoreEvents.java:2775)
>   at org.apache.impala.catalog.events.MetastoreEvents$AbortTxnEvent.process(MetastoreEvents.java:2761)
>   at org.apache.impala.catalog.events.MetastoreEvents$MetastoreEvent.processIfEnabled(MetastoreEvents.java:522)
>   at org.apache.impala.catalog.events.MetastoreEventsProcessor.processEvents(MetastoreEventsProcessor.java:1052)
>   at org.apache.impala.catalog.events.MetastoreEventsProcessor.processEvents(MetastoreEventsProcessor.java:881)
>   at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>   at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
>   at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
>   at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
>   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   at java.lang.Thread.run(Thread.java:750)
> {code}
> Precondition: 
> https://github.com/apache/impala/blob/2f14fd29c0b47fc2c170a7f0eb1cecaf6b9704f4/fe/src/main/java/org/apache/impala/hive/common/MutableValidReaderWriteIdList.java#L274
> I have not been able to reproduce this yet.






[jira] [Updated] (IMPALA-12827) Precondition was hit in MutableValidReaderWriteIdList

2024-02-21 Thread Csaba Ringhofer (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-12827?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Csaba Ringhofer updated IMPALA-12827:
-
Labels: ACID catalog  (was: catalog)

> Precondition was hit in MutableValidReaderWriteIdList
> -
>
> Key: IMPALA-12827
> URL: https://issues.apache.org/jira/browse/IMPALA-12827
> Project: IMPALA
>  Issue Type: Bug
>Reporter: Csaba Ringhofer
>Priority: Major
>  Labels: ACID, catalog
>
> The call stack below caused the metastore event processor to stop while 
> processing an abort transaction event:
> {code}
> MetastoreEventsProcessor.java:899] Unexpected exception received while processing event
> Java exception follows:
> java.lang.IllegalStateException
>   at com.google.common.base.Preconditions.checkState(Preconditions.java:486)
>   at org.apache.impala.hive.common.MutableValidReaderWriteIdList.addAbortedWriteIds(MutableValidReaderWriteIdList.java:274)
>   at org.apache.impala.catalog.HdfsTable.addWriteIds(HdfsTable.java:3101)
>   at org.apache.impala.catalog.CatalogServiceCatalog.addWriteIdsToTable(CatalogServiceCatalog.java:3885)
>   at org.apache.impala.catalog.events.MetastoreEvents$AbortTxnEvent.addAbortedWriteIdsToTables(MetastoreEvents.java:2775)
>   at org.apache.impala.catalog.events.MetastoreEvents$AbortTxnEvent.process(MetastoreEvents.java:2761)
>   at org.apache.impala.catalog.events.MetastoreEvents$MetastoreEvent.processIfEnabled(MetastoreEvents.java:522)
>   at org.apache.impala.catalog.events.MetastoreEventsProcessor.processEvents(MetastoreEventsProcessor.java:1052)
>   at org.apache.impala.catalog.events.MetastoreEventsProcessor.processEvents(MetastoreEventsProcessor.java:881)
>   at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>   at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
>   at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
>   at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
>   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   at java.lang.Thread.run(Thread.java:750)
> {code}
> Precondition: 
> https://github.com/apache/impala/blob/2f14fd29c0b47fc2c170a7f0eb1cecaf6b9704f4/fe/src/main/java/org/apache/impala/hive/common/MutableValidReaderWriteIdList.java#L274
> I have not been able to reproduce this yet.






[jira] [Created] (IMPALA-12827) Precondition was hit in MutableValidReaderWriteIdList

2024-02-21 Thread Csaba Ringhofer (Jira)
Csaba Ringhofer created IMPALA-12827:


 Summary: Precondition was hit in MutableValidReaderWriteIdList
 Key: IMPALA-12827
 URL: https://issues.apache.org/jira/browse/IMPALA-12827
 Project: IMPALA
  Issue Type: Bug
Reporter: Csaba Ringhofer


The call stack below caused the metastore event processor to stop while 
processing an abort transaction event:
{code}
MetastoreEventsProcessor.java:899] Unexpected exception received while processing event
Java exception follows:
java.lang.IllegalStateException
    at com.google.common.base.Preconditions.checkState(Preconditions.java:486)
    at org.apache.impala.hive.common.MutableValidReaderWriteIdList.addAbortedWriteIds(MutableValidReaderWriteIdList.java:274)
    at org.apache.impala.catalog.HdfsTable.addWriteIds(HdfsTable.java:3101)
    at org.apache.impala.catalog.CatalogServiceCatalog.addWriteIdsToTable(CatalogServiceCatalog.java:3885)
    at org.apache.impala.catalog.events.MetastoreEvents$AbortTxnEvent.addAbortedWriteIdsToTables(MetastoreEvents.java:2775)
    at org.apache.impala.catalog.events.MetastoreEvents$AbortTxnEvent.process(MetastoreEvents.java:2761)
    at org.apache.impala.catalog.events.MetastoreEvents$MetastoreEvent.processIfEnabled(MetastoreEvents.java:522)
    at org.apache.impala.catalog.events.MetastoreEventsProcessor.processEvents(MetastoreEventsProcessor.java:1052)
    at org.apache.impala.catalog.events.MetastoreEventsProcessor.processEvents(MetastoreEventsProcessor.java:881)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:750)
{code}

Precondition: 
https://github.com/apache/impala/blob/2f14fd29c0b47fc2c170a7f0eb1cecaf6b9704f4/fe/src/main/java/org/apache/impala/hive/common/MutableValidReaderWriteIdList.java#L274

I have not been able to reproduce this yet.








[jira] [Commented] (IMPALA-12771) Impala catalogd events-skipped may mark the wrong number

2024-02-21 Thread Maxwell Guo (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-12771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17819210#comment-17819210
 ] 

Maxwell Guo commented on IMPALA-12771:
--

Done, it is published now.

> Impala catalogd events-skipped may mark the wrong number
> 
>
> Key: IMPALA-12771
> URL: https://issues.apache.org/jira/browse/IMPALA-12771
> Project: IMPALA
>  Issue Type: Bug
>  Components: Catalog
>Reporter: Maxwell Guo
>Assignee: Maxwell Guo
>Priority: Minor
>
> See the description of the [events-skipped 
> metric|https://github.com/apache/impala/blob/master/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java#L237]
>  
> {code:java}
>   // total number of events which are skipped because of the flag setting or
>   // in case of [CREATE|DROP] events on [DATABASE|TABLE|PARTITION] which were ignored
>   // because the [DATABASE|TABLE|PARTITION] was already [PRESENT|ABSENT] in the catalogd.
> {code}
>  
> For CREATE and DROP events on a database, table, or partition (AddPartition 
> is also included), when the database or table is not found in the cache, we 
> skip processing the event and increment the events-skipped metric by 1.
> But I found some inconsistencies for the alter table and Reload events:
> * The Reload event is not mentioned in the description of events-skipped, 
> but the metric is incremented by 1 when the event is an old event;
> * In addition, if the table is in the blacklist, the metric is also 
> incremented by 1.
> In summary, I think this description is inconsistent with the actual 
> implementation.
> So can we also increment the events-skipped metric for alter partition events 
> and modify the description to cover all skipped events?






[jira] [Commented] (IMPALA-12771) Impala catalogd events-skipped may mark the wrong number

2024-02-21 Thread Quanlong Huang (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-12771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17819203#comment-17819203
 ] 

Quanlong Huang commented on IMPALA-12771:
-

[~maxwellguo] Thanks for uploading a patch. Please publish it by clicking the 
"publish" button so it can be visible to us. Or if you want to keep it private, 
add us to the reviewer list.

> Impala catalogd events-skipped may mark the wrong number
> 
>
> Key: IMPALA-12771
> URL: https://issues.apache.org/jira/browse/IMPALA-12771
> Project: IMPALA
>  Issue Type: Bug
>  Components: Catalog
>Reporter: Maxwell Guo
>Assignee: Maxwell Guo
>Priority: Minor
>
> See the description of the [events-skipped 
> metric|https://github.com/apache/impala/blob/master/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java#L237]
>  
> {code:java}
>   // total number of events which are skipped because of the flag setting or
>   // in case of [CREATE|DROP] events on [DATABASE|TABLE|PARTITION] which were ignored
>   // because the [DATABASE|TABLE|PARTITION] was already [PRESENT|ABSENT] in the catalogd.
> {code}
>  
> For CREATE and DROP events on a database, table, or partition (AddPartition 
> is also included), when the database or table is not found in the cache, we 
> skip processing the event and increment the events-skipped metric by 1.
> But I found some inconsistencies for the alter table and Reload events:
> * The Reload event is not mentioned in the description of events-skipped, 
> but the metric is incremented by 1 when the event is an old event;
> * In addition, if the table is in the blacklist, the metric is also 
> incremented by 1.
> In summary, I think this description is inconsistent with the actual 
> implementation.
> So can we also increment the events-skipped metric for alter partition events 
> and modify the description to cover all skipped events?






[jira] [Commented] (IMPALA-12824) Add Built-in Functions to Pretty Print Duration and Bytes.

2024-02-21 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-12824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17819174#comment-17819174
 ] 

ASF subversion and git services commented on IMPALA-12824:
--

Commit d03ffc70f2da0e313846d1595b1577824808f9da in impala's branch 
refs/heads/master from jasonmfehr
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=d03ffc70f ]

IMPALA-12824: Adds built-in functions prettyprint_duration and 
prettyprint_bytes.

The prettyprint_duration function takes an integer input containing a
number of nanoseconds and returns a human readable value breaking down
the input by hours, minutes, seconds, milliseconds, microseconds, and
nanoseconds.

The prettyprint_bytes function takes an integer input containing a
number of bytes and returns a human-readable value breaking down the
input by gigabytes, megabytes, kilobytes, and bytes.

Functionality tests were added to the existing expr-test suite that
tests built-in functions.

Functional-query workloads were added in two new .test files under the
testdata directory to exercise these two new functions. Corresponding
pytests were added to run the tests in these new .test files.

Benchmarks were added to expr-benchmark, and new benchmarks were
generated with a release build running on a machine with the cpu
Intel(R) Core(TM) i7-10700 CPU @ 2.90GHz.

Documentation was added to the built-in string functions docs.

Change-Id: I3e76632ce21ad2ca5df474160338699a542a6913
Reviewed-on: http://gerrit.cloudera.org:8080/21038
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 
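
The two functions described in the commit message can be illustrated with a short sketch. This is Python written for illustration only, not the Impala implementation: the function names are borrowed from the commit message, the two sample values come from the issue description, and the formatting rules for every other input (sub-second durations, dropping milliseconds once hours are present, the plain "B" suffix) are assumptions.

```python
def prettyprint_duration(nanos: int) -> str:
    """Render a nanosecond count such as 2147483648 as '2s147ms'."""
    # Assumption: sub-second inputs are shown in a single unit.
    if nanos < 1_000:
        return f"{nanos}ns"
    if nanos < 1_000_000:
        return f"{nanos / 1_000:.3f}us"
    if nanos < 1_000_000_000:
        return f"{nanos / 1_000_000:.3f}ms"

    # One second or more: break the value down into h/m/s/ms.
    total_ms = nanos // 1_000_000
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    seconds, millis = divmod(rest, 1_000)

    parts = []
    if hours:
        parts.append(f"{hours}h")
    if minutes:
        parts.append(f"{minutes}m")
    if seconds:
        parts.append(f"{seconds}s")
    # Assumption: milliseconds are omitted once the value reaches hours.
    if millis and not hours:
        parts.append(f"{millis}ms")
    return "".join(parts)


def prettyprint_bytes(num_bytes: int) -> str:
    """Render a byte count such as 32768 as '32.00 KB' (1 KB = 1024 bytes)."""
    for unit, size in (("GB", 1024 ** 3), ("MB", 1024 ** 2), ("KB", 1024)):
        if num_bytes >= size:
            return f"{num_bytes / size:.2f} {unit}"
    return f"{num_bytes} B"


print(prettyprint_duration(2147483648))  # 2s147ms
print(prettyprint_bytes(32768))          # 32.00 KB
```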


> Add Built-in Functions to Pretty Print Duration and Bytes.
> --
>
> Key: IMPALA-12824
> URL: https://issues.apache.org/jira/browse/IMPALA-12824
> Project: IMPALA
>  Issue Type: Improvement
>  Components: be
>Reporter: Jason Fehr
>Assignee: Jason Fehr
>Priority: Major
>  Labels: backend, functions
>
> Implement new built-in Impala string functions to pretty print time duration 
> from an input of nanoseconds and pretty print a memory value from an input of 
> bytes.
> For example, pretty printing a duration of 2147483648 nanoseconds would 
> output "2s147ms".  Pretty printing a memory value of 32768 would output 
> "32.00 KB".
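
The two examples quoted above can be reproduced with a small Python sketch.
This is not Impala's implementation: the helper names pretty_duration and
pretty_bytes are hypothetical, and the formatting rules (dropping anything
below a millisecond, 1024-based units with two decimals) are assumptions
inferred only from the "2s147ms" and "32.00 KB" examples.

```python
def pretty_duration(nanos: int) -> str:
    """Format a nanosecond count such as 2147483648 as '2s147ms'.

    Assumption: the sub-millisecond remainder is dropped, which matches the
    single example in the ticket but may differ from the real built-in.
    """
    if nanos < 0:
        raise ValueError("duration must be non-negative")
    total_ms = nanos // 1_000_000
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    seconds, millis = divmod(rem, 1_000)
    parts = []
    if hours:
        parts.append(f"{hours}h")
    if minutes:
        parts.append(f"{minutes}m")
    if seconds:
        parts.append(f"{seconds}s")
    if millis or not parts:
        parts.append(f"{millis}ms")
    return "".join(parts)


def pretty_bytes(num_bytes: int) -> str:
    """Format a byte count such as 32768 as '32.00 KB' (1024-based units)."""
    if num_bytes < 0:
        raise ValueError("byte count must be non-negative")
    for unit, size in (("GB", 1024 ** 3), ("MB", 1024 ** 2), ("KB", 1024)):
        if num_bytes >= size:
            return f"{num_bytes / size:.2f} {unit}"
    # Formatting of values below 1 KB is a guess; the ticket gives no example.
    return f"{num_bytes} B"


print(pretty_duration(2147483648))  # 2s147ms
print(pretty_bytes(32768))          # 32.00 KB
```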






[jira] [Commented] (IMPALA-12433) KrpcDataStreamSender could share some buffers between channels

2024-02-21 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-12433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17819175#comment-17819175
 ] 

ASF subversion and git services commented on IMPALA-12433:
--

Commit 2f14fd29c0b47fc2c170a7f0eb1cecaf6b9704f4 in impala's branch 
refs/heads/master from Csaba Ringhofer
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=2f14fd29c ]

IMPALA-12433: Share buffers among channels in KrpcDataStreamSender

Before this patch each KrpcDataStreamSender::Channel had 2
OutboundRowBatch objects, each with its own serialization and
compression buffers.

This patch switches to a single buffer per channel. This is
enough to store the in-flight data in KRPC, while the other buffers
are only used during serialization and compression, which is done for
just a single channel at a time, so they can be shared among channels.

Memory estimates in the planner are not changed because the existing
calculation has several issues (see IMPALA-12594).

Change-Id: I64854a350a9dae8bf3af11c871882ea4750e60b3
Reviewed-on: http://gerrit.cloudera.org:8080/20719
Tested-by: Impala Public Jenkins 
Reviewed-by: Kurt Deschler 
Reviewed-by: Zihao Ye 
Reviewed-by: Michael Smith 


> KrpcDataStreamSender could share some buffers between channels
> --
>
> Key: IMPALA-12433
> URL: https://issues.apache.org/jira/browse/IMPALA-12433
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Backend
>Reporter: Csaba Ringhofer
>Priority: Major
>  Labels: memory-saving, performance
>
> Currently each channel has two outbound row batches and each of those has 2 
> buffers, one for serialization and another for compression.
> https://github.com/apache/impala/blob/0f55e551bc98843c79a9ec82582ddca237aa4fe9/be/src/runtime/row-batch.h#L100
> https://github.com/apache/impala/blob/0f55e551bc98843c79a9ec82582ddca237aa4fe9/be/src/runtime/krpc-data-stream-sender.cc#L236
> https://github.com/apache/impala/blob/0f55e551bc98843c79a9ec82582ddca237aa4fe9/fe/src/main/java/org/apache/impala/planner/DataStreamSink.java#L81
> As serialization + compression is always done from the fragment instance 
> thread, only one compression is done at a time, so a single compression buffer 
> could be shared between channels. If this buffer is sent via KRPC then it 
> could be swapped with the per channel buffer. 
> As far as I understand at least one buffer per channel is needed because 
> async KRPC calls can use it from another thread (this is done to avoid an 
> extra copy of the buffer before RPCs). We can only reuse that buffer after 
> getting a callback from KRPC.
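
The swap idea in the description above can be illustrated with a minimal
Python sketch. It is a toy model, not the Impala code: the class names
Channel and SharedBufferSender and the methods send and on_rpc_done are
hypothetical, a bytearray copy stands in for serialization plus compression,
and the real sender is C++ working with KRPC callbacks.

```python
class Channel:
    """One destination; owns exactly one buffer that may be in flight."""

    def __init__(self) -> None:
        self.in_flight = bytearray()
        self.rpc_pending = False


class SharedBufferSender:
    """Toy model: num_channels + 1 buffers instead of several per channel."""

    def __init__(self, num_channels: int) -> None:
        self.channels = [Channel() for _ in range(num_channels)]
        # Scratch buffer used by the single serializing thread.
        self.shared = bytearray()

    def send(self, idx: int, payload: bytes) -> None:
        ch = self.channels[idx]
        if ch.rpc_pending:
            # The RPC layer may still read the channel buffer from another
            # thread, so it cannot be reused before the completion callback.
            raise RuntimeError("previous RPC on this channel has not completed")
        self.shared.clear()
        self.shared += payload  # stands in for serialize + compress
        # Swap: the filled buffer goes in flight, the channel's idle buffer
        # becomes the new shared scratch buffer. No copy is made.
        ch.in_flight, self.shared = self.shared, ch.in_flight
        ch.rpc_pending = True

    def on_rpc_done(self, idx: int) -> None:
        """Completion callback: the channel buffer may be reused again."""
        self.channels[idx].rpc_pending = False
```

Sending to channel 0 and then to channel 1 leaves both payloads intact in
their per-channel buffers, while a second send to channel 0 is rejected until
on_rpc_done(0) has been called.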






[jira] [Commented] (IMPALA-12594) KrpcDataStreamSender's mem estimate is different than real usage

2024-02-21 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-12594?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17819176#comment-17819176
 ] 

ASF subversion and git services commented on IMPALA-12594:
--

Commit 2f14fd29c0b47fc2c170a7f0eb1cecaf6b9704f4 in impala's branch 
refs/heads/master from Csaba Ringhofer
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=2f14fd29c ]

IMPALA-12433: Share buffers among channels in KrpcDataStreamSender

Before this patch each KrpcDataStreamSender::Channel had 2
OutboundRowBatch objects, each with its own serialization and
compression buffers.

This patch switches to a single buffer per channel. This is
enough to store the in-flight data in KRPC, while the other buffers
are only used during serialization and compression, which is done for
just a single channel at a time, so they can be shared among channels.

Memory estimates in the planner are not changed because the existing
calculation has several issues (see IMPALA-12594).

Change-Id: I64854a350a9dae8bf3af11c871882ea4750e60b3
Reviewed-on: http://gerrit.cloudera.org:8080/20719
Tested-by: Impala Public Jenkins 
Reviewed-by: Kurt Deschler 
Reviewed-by: Zihao Ye 
Reviewed-by: Michael Smith 


> KrpcDataStreamSender's mem estimate is different than real usage
> 
>
> Key: IMPALA-12594
> URL: https://issues.apache.org/jira/browse/IMPALA-12594
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend, Frontend
>Reporter: Csaba Ringhofer
>Priority: Major
>
> IMPALA-6684 added memory estimates for KrpcDataStreamSender, but there are a 
> few gaps between how the frontend estimates memory and how the backend 
> actually allocates it.
> The frontend uses the following formula:
> buffer_size = num_channels * 2 * (tuple_buffer_length + compressed_buffer_length)
> This accounts for the serialization and compression buffers of each 
> OutboundRowBatch.
> This can both underestimate and overestimate:
> 1. It doesn't account for the RowBatch used by channels during 
> partitioned exchange to collect rows belonging to a single channel: 
> https://github.com/apache/impala/blob/4c762725c707f8d150fe250c03faf486008702d4/be/src/runtime/krpc-data-stream-sender.cc#L232
> 2. It ignores the adjustment to the RowBatch capacity above based on the flag 
> data_stream_sender_buffer_size: 
> https://github.com/apache/impala/blob/4c762725c707f8d150fe250c03faf486008702d4/be/src/runtime/krpc-data-stream-sender.cc#L379
> This adjustment can either increase or decrease the capacity to reach the 
> desired total size (16K by default).
> Note that the adjustment above ignores var len data, so it can massively 
> underestimate in some cases. Meanwhile the frontend logic calculates string 
> sizes if stats are present. Ideally both sides would be improved and synced 
> to use both data_stream_sender_buffer_size and the string sizes for the 
> estimate (I am not sure about collection types).
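
As a worked example of the frontend formula quoted above, the following
Python sketch plugs in illustrative numbers. The function names and the
inputs (10 channels, 16 KiB buffers) are made up for illustration, and the
second function is only a rough reading of gap 1, assuming each channel's
RowBatch lands at the 16K default target and holds no var len data.

```python
def frontend_estimate(num_channels: int,
                      tuple_buffer_length: int,
                      compressed_buffer_length: int) -> int:
    """buffer_size = num_channels * 2 * (tuple_buffer + compressed_buffer)."""
    return num_channels * 2 * (tuple_buffer_length + compressed_buffer_length)


def unaccounted_row_batches(num_channels: int,
                            sender_buffer_size: int = 16 * 1024) -> int:
    """Rough size of the per-channel RowBatch memory the formula omits.

    Assumption: every channel's RowBatch is sized to the
    data_stream_sender_buffer_size target and var len data is ignored.
    """
    return num_channels * sender_buffer_size


KIB = 1024
print(frontend_estimate(10, 16 * KIB, 16 * KIB))  # 655360 bytes (640 KiB)
print(unaccounted_row_batches(10))                # 163840 bytes (160 KiB)
```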






[jira] [Created] (IMPALA-12826) Add better cardinality estimation for Iceberg V2 tables with equality deletes

2024-02-21 Thread Gabor Kaszab (Jira)
Gabor Kaszab created IMPALA-12826:
-

 Summary: Add better cardinality estimation for Iceberg V2 tables 
with equality deletes
 Key: IMPALA-12826
 URL: https://issues.apache.org/jira/browse/IMPALA-12826
 Project: IMPALA
  Issue Type: Sub-task
  Components: Frontend
Reporter: Gabor Kaszab


There is a similar ticket for positional deletes: 
https://issues.apache.org/jira/browse/IMPALA-12371

 






[jira] [Comment Edited] (IMPALA-12771) Impala catalogd events-skipped may mark the wrong number

2024-02-21 Thread Maxwell Guo (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-12771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17819161#comment-17819161
 ] 

Maxwell Guo edited comment on IMPALA-12771 at 2/21/24 9:48 AM:
---

The initial version is here
 https://gerrit.cloudera.org/#/c/21045/
and I am doing local testing at the same time. 

CC [~mylogi...@gmail.com] [~stigahuang] [~VenuReddy], let me know if there is 
something obviously wrong with my modifications.


was (Author: maxwellguo):
The initial version is here https://gerrit.cloudera.org/#/c/21045/
and I am doing local testing at the same time. 

CC [~mylogi...@gmail.com] [~stigahuang] [~VenuReddy], let me know if there is 
something obviously wrong with my modifications.

> Impala catalogd events-skipped may mark the wrong number
> 
>
> Key: IMPALA-12771
> URL: https://issues.apache.org/jira/browse/IMPALA-12771
> Project: IMPALA
>  Issue Type: Bug
>  Components: Catalog
>Reporter: Maxwell Guo
>Assignee: Maxwell Guo
>Priority: Minor
>
> See the description of the [events-skipped 
> metric|https://github.com/apache/impala/blob/master/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java#L237]
>  
> {code:java}
>   // total number of events which are skipped because of the flag setting or
>   // in case of [CREATE|DROP] events on [DATABASE|TABLE|PARTITION] which were ignored
>   // because the [DATABASE|TABLE|PARTITION] was already [PRESENT|ABSENT] in the catalogd.
> {code}
>  
> For CREATE and DROP events on Database/Table/Partition (AddPartition is also 
> included), when the database or table is not found in the cache, we skip 
> processing the event and increment the events-skipped metric by 1.
> But I found some inconsistencies for alter table and Reload events:
> * The Reload event is not mentioned in the description of events-skipped, 
> but the metric is incremented when the event is an old event;
> * Besides, if the table is in the blacklist, the metric is also incremented.
> In summary, I think this description is inconsistent with the actual 
> implementation.
> So can we also increment the events-skipped metric for alter partition events 
> and modify the description to cover all skipped events?






[jira] [Commented] (IMPALA-12771) Impala catalogd events-skipped may mark the wrong number

2024-02-21 Thread Maxwell Guo (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-12771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17819161#comment-17819161
 ] 

Maxwell Guo commented on IMPALA-12771:
--

The initial version is here https://gerrit.cloudera.org/#/c/21045/
and I am doing local testing at the same time. 

CC [~mylogi...@gmail.com] [~stigahuang] [~VenuReddy], let me know if there is 
something obviously wrong with my modifications.

> Impala catalogd events-skipped may mark the wrong number
> 
>
> Key: IMPALA-12771
> URL: https://issues.apache.org/jira/browse/IMPALA-12771
> Project: IMPALA
>  Issue Type: Bug
>  Components: Catalog
>Reporter: Maxwell Guo
>Assignee: Maxwell Guo
>Priority: Minor
>
> See the description of the [events-skipped 
> metric|https://github.com/apache/impala/blob/master/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java#L237]
>  
> {code:java}
>   // total number of events which are skipped because of the flag setting or
>   // in case of [CREATE|DROP] events on [DATABASE|TABLE|PARTITION] which were ignored
>   // because the [DATABASE|TABLE|PARTITION] was already [PRESENT|ABSENT] in the catalogd.
> {code}
>  
> For CREATE and DROP events on Database/Table/Partition (AddPartition is also 
> included), when the database or table is not found in the cache, we skip 
> processing the event and increment the events-skipped metric by 1.
> But I found some inconsistencies for alter table and Reload events:
> * The Reload event is not mentioned in the description of events-skipped, 
> but the metric is incremented when the event is an old event;
> * Besides, if the table is in the blacklist, the metric is also incremented.
> In summary, I think this description is inconsistent with the actual 
> implementation.
> So can we also increment the events-skipped metric for alter partition events 
> and modify the description to cover all skipped events?


