[jira] [Commented] (IMPALA-12365) Show fragment's memory and thread usage on the query timeline

2023-11-08 Thread Surya Hebbar (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-12365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17784288#comment-17784288
 ] 

Surya Hebbar commented on IMPALA-12365:
---

*Commit fed578580bb4dc688ab2843a07a8aa508b452fd8 in impala's branch 
refs/heads/master from Surya Hebbar*
*[ https://gitbox.apache.org/repos/asf?p=impala.git;h=fed578580 ]*

IMPALA-12364: Display memory, disk and network metrics in webUI's query timeline

This patch adds fragment-level metrics to the WebUI query timeline
display along with additional disk and network metrics.

The fragment's plan nodes are enlarged with an animated transition on
hovering over the fragment's row in query timeline's fragment diagram.

On clicking a plan node, the total thread and memory usage of the parent
fragment is displayed, accumulated from the memory and thread usage of
all child nodes. Thread usage is shown on the additional Y-axis.
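
The accumulation step described above could be sketched roughly as
follows. This is only an illustration, not the WebUI's actual code: the
function name and the assumption that every child node's samples are
taken at the same timestamps are hypothetical.

```python
def accumulate_usage(child_series):
    """Sum per-node sample lists element-wise into one fragment-level series.

    Assumes (hypothetically) that all child nodes were sampled at the same
    timestamps; the result is truncated to the shortest series.
    """
    if not child_series:
        return []
    length = min(len(series) for series in child_series)
    return [sum(series[i] for series in child_series) for i in range(length)]


# Two child plan nodes with three memory samples each (bytes).
fragment_memory = accumulate_usage([[100, 200, 300], [10, 20, 30]])
```

The same helper would apply to ThreadUsage samples, since both counters
are simple numeric time series.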

In this way, the memory and thread usage of multiple fragments can be
compared side by side. A fragment's usage can be hidden by clicking
on any of its child plan nodes again.

These counters are available within the profile under the following names:

- MemoryUsage
- ThreadUsage

Once a fragment's metrics are displayed, they are updated as they
are collected from the profile during a running query.

A grid-line is displayed along with a tooltip on hovering over the
fragment diagram, containing the instantaneous time at that position.
This grid-line also triggers tooltips and gridlines in other charts.

A warning is displayed on clicking a fragment that has only a few
samples available.

The RESOURCE_TRACE_RATIO query option must be set to provide periodic
metrics within the profile. This allows the following time series
counters to be displayed on the query timeline:

- HostDiskWriteThroughput
- HostDiskReadThroughput
- HostNetworkRx
- HostNetworkTx

The additional Y-axis within the utilization chart is used to represent
the average of these metrics.

The memory units in tooltips and ticks on co-ordinate axes are displayed
in human readable form such as KB, MB, GB and PB for convenience.
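
A minimal sketch of this kind of unit formatting is shown below. The
function name, the 1024-based scaling and the two-decimal precision are
assumptions made for illustration; the patch's own formatting code is
not shown in this message.

```python
def format_bytes(num_bytes, precision=2):
    """Return a human-readable string for a byte count (hypothetical helper)."""
    units = ["B", "KB", "MB", "GB", "TB", "PB"]
    value = float(num_bytes)
    unit_index = 0
    # Scale down by 1024 until the value fits the current unit
    # or the largest unit is reached.
    while value >= 1024 and unit_index < len(units) - 1:
        value /= 1024
        unit_index += 1
    return f"{value:.{precision}f} {units[unit_index]}"
```

For example, `format_bytes(1536)` yields `1.50 KB` under these
assumptions.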

Both charts contain controls to close the chart. The charts can also
be resized between a minimum and a maximum limit by dragging the
resize bar's handle.

Along with mouse wheel events, the diagrams can be horizontally
stretched with the help of buttons with horizontal zoom icons at the
top of the page. The zoom out button is disabled when further zooming
out is not possible.

Time ticks are autoscaled during the fragment diagram's horizontal zoom.

In addition to the scrollbar, hovering on edges of the window allows
horizontal scrolling.

Test cases have been added for the additional disk, network and
fragment-level memory metrics parsing functions.

Change-Id: Ifd25e6f0bc9fbd664ec98936daff3f27182dfc7f
Reviewed-on: http://gerrit.cloudera.org:8080/20355
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 

> Show fragment's memory and thread usage on the query timeline
> -
>
> Key: IMPALA-12365
> URL: https://issues.apache.org/jira/browse/IMPALA-12365
> Project: IMPALA
>  Issue Type: Improvement
>Reporter: Surya Hebbar
>Assignee: Surya Hebbar
>Priority: Major
> Attachments: aligned_gridlines.png, 
> aligned_gridlines_and_hovering_scroll.mkv, both_charts_resize.mkv, 
> both_charts_resize.png, clickable_plan_nodes.mkv, clickable_plan_nodes.png, 
> draggable_resize_handle.png, fragment_metrics_chart_resize.mkv, 
> fragment_metrics_close_button.png, fragment_metrics_resize_bar.png, 
> hor_zoom_buttons.png, horizontal_zoom_buttons.mkv, 
> multiple_fragment_metrics.mkv, multiple_fragment_metrics_cropped.png, 
> resize_drag_handle.mkv
>
>
> The query timeline's fragment diagram can be used to display different memory 
> and thread usage metrics, to support query planning and debugging purposes.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Work stopped] (IMPALA-12364) Display disk and network metrics in webUI's query timeline

2023-11-08 Thread Surya Hebbar (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-12364?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on IMPALA-12364 stopped by Surya Hebbar.
-
> Display disk and network metrics in webUI's query timeline
> --
>
> Key: IMPALA-12364
> URL: https://issues.apache.org/jira/browse/IMPALA-12364
> Project: IMPALA
>  Issue Type: Improvement
>Reporter: Surya Hebbar
>Assignee: Surya Hebbar
>Priority: Major
> Attachments: average_disk_network_metrics.mkv, 
> averaged_disk_network_metrics.png, both_charts_resize.mkv, 
> both_charts_resize.png, close_cpu_utilization_button.mkv, 
> draggable_resize_handle.png, hor_zoom_buttons.png, 
> horizontal_zoom_buttons.mkv, host_utilization_chart_resize.mkv, 
> host_utilization_close_button.png, host_utilization_resize_bar.png, 
> multiple_fragment_metrics.png, resize_drag_handle.mkv
>
>
> It would be helpful to display disk and network usage in human readable form 
> on the query timeline, aligning it along with the CPU utilization plot, below 
> the fragment timing diagram.






[jira] [Work stopped] (IMPALA-12365) Show fragment's memory and thread usage on the query timeline

2023-11-08 Thread Surya Hebbar (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-12365?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on IMPALA-12365 stopped by Surya Hebbar.
-
> Show fragment's memory and thread usage on the query timeline
> -
>
> Key: IMPALA-12365
> URL: https://issues.apache.org/jira/browse/IMPALA-12365
> Project: IMPALA
>  Issue Type: Improvement
>Reporter: Surya Hebbar
>Assignee: Surya Hebbar
>Priority: Major
> Attachments: aligned_gridlines.png, 
> aligned_gridlines_and_hovering_scroll.mkv, both_charts_resize.mkv, 
> both_charts_resize.png, clickable_plan_nodes.mkv, clickable_plan_nodes.png, 
> draggable_resize_handle.png, fragment_metrics_chart_resize.mkv, 
> fragment_metrics_close_button.png, fragment_metrics_resize_bar.png, 
> hor_zoom_buttons.png, horizontal_zoom_buttons.mkv, 
> multiple_fragment_metrics.mkv, multiple_fragment_metrics_cropped.png, 
> resize_drag_handle.mkv
>
>
> The query timeline's fragment diagram can be used to display different memory 
> and thread usage metrics, to support query planning and debugging purposes.






[jira] [Commented] (IMPALA-12550) test_statestored_auto_failover_with_disabling_network flaky

2023-11-08 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-12550?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17784268#comment-17784268
 ] 

ASF subversion and git services commented on IMPALA-12550:
--

Commit 44c85e85a51bad4faca95f8771b2c2c5a686ca90 in impala's branch 
refs/heads/master from wzhou-code
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=44c85e85a ]

IMPALA-12525: Fix flaky test test_statestored_manual_failover

In test_statestored_manual_failover, statestore service failover is
sometimes not triggered when the network of the active statestored is
disabled after a manually forced failover.

During the test, the network of the active statestored could be disabled
before all subscribers had re-registered with the restarted statestored.
As a result, some subscribers did not receive the notification of the
active statestored change, so they could not correctly report connection
states for the requests from the standby statestored.

This patch makes the following changes:
1) Updated the test case test_statestored_manual_failover to disable
the network of the active statestored only after all subscribers have
re-registered with the restarted statestored.

2) Defined a new mutex active_lock_ in class StatestoreStub to protect
is_active_, since the mutex lock_ could be held for a long time if the
subscriber loses its connection with statestored and enters recovery
mode.

3) Found one case that was not handled on statestore subscribers. The
subscribers could be started before both statestore instances are
ready to accept registration requests. This caused impalad to hit a
DCHECK. Changed the code to handle this case in this patch.
Added test cases to inject a real delay in statestored startup and
verify that impalads and catalogd are able to tolerate this delay.

4) Updated the address of the active catalogd in the metrics of
statestored after statestore service failover.

5) Another test, test_statestored_auto_failover_with_disabling_network,
failed occasionally due to a delay in the HA handshake RPC between the
two statestore instances. The issue is tracked in IMPALA-12550. The last
two lines of the test are commented out temporarily.
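
The idea behind change 2 (a dedicated mutex for the active flag, so
that readers are not blocked behind a long-held general lock) can be
illustrated with the Python sketch below. The actual change is in
Impala's C++ StatestoreStub; the class and method names here are
hypothetical stand-ins.

```python
import threading


class SubscriberStub:
    """Hypothetical stand-in illustrating a separate lock for one flag."""

    def __init__(self):
        # General lock; may be held for a long time during recovery.
        self._lock = threading.Lock()
        # Narrow lock that protects only the active flag.
        self._active_lock = threading.Lock()
        self._is_active = False

    def set_active(self, active):
        with self._active_lock:
            self._is_active = active

    def is_active(self):
        # Never waits on self._lock, so it stays responsive during recovery.
        with self._active_lock:
            return self._is_active

    def recover(self, reconnect):
        # Holds the general lock for the whole (possibly slow) reconnect.
        with self._lock:
            reconnect()
```

With this split, a caller checking `is_active()` returns promptly even
while `recover()` is still holding the general lock.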

Testing:
 - Ran test_statestored_manual_failover hundreds of times on Jenkins.
 - Ran test_statestored_manual_failover a thousand times on a local
   machine without failure.
 - Passed core tests

Change-Id: If03bf09d22a2875d2c1eec8a4f62eeefc5d855dc
Reviewed-on: http://gerrit.cloudera.org:8080/20657
Reviewed-by: Riza Suminto 
Reviewed-by: Michael Smith 
Tested-by: Impala Public Jenkins 


> test_statestored_auto_failover_with_disabling_network flaky
> ---
>
> Key: IMPALA-12550
> URL: https://issues.apache.org/jira/browse/IMPALA-12550
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 4.4.0
>Reporter: Wenzhe Zhou
>Assignee: Wenzhe Zhou
>Priority: Major
>
> TestStatestoredHA.test_statestored_auto_failover_with_disabling_network 
> failed with the following stack trace when the test was run repeatedly.
> tests/custom_cluster/test_statestored_ha.py:645: in 
> test_statestored_auto_failover_with_disabling_network
> "statestore.in-ha-recovery-mode", expected_value=False, timeout=120)
> tests/common/impala_service.py:144: in wait_for_metric_value
> self.__metric_timeout_assert(metric_name, expected_value, timeout)
> tests/common/impala_service.py:213: in __metric_timeout_assert
> assert 0, assert_string
> E   AssertionError: Metric statestore.in-ha-recovery-mode did not reach value 
> False in 120s.
> From the log messages, the issue was caused by a delay in the HA handshake 
> RPC between the two statestore instances. Sometimes the active statestore took 
> a few minutes to respond to the handshake requests from the standby statestore.
> This issue is different from IMPALA-12525, which was caused by a locking issue 
> on the subscriber side.






[jira] [Commented] (IMPALA-12525) statestore.active-status did not reach value True in 120s

2023-11-08 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-12525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17784267#comment-17784267
 ] 

ASF subversion and git services commented on IMPALA-12525:
--

Commit 44c85e85a51bad4faca95f8771b2c2c5a686ca90 in impala's branch 
refs/heads/master from wzhou-code
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=44c85e85a ]

IMPALA-12525: Fix flaky test test_statestored_manual_failover

In test_statestored_manual_failover, statestore service failover is
sometimes not triggered when the network of the active statestored is
disabled after a manually forced failover.

During the test, the network of the active statestored could be disabled
before all subscribers had re-registered with the restarted statestored.
As a result, some subscribers did not receive the notification of the
active statestored change, so they could not correctly report connection
states for the requests from the standby statestored.

This patch makes the following changes:
1) Updated the test case test_statestored_manual_failover to disable
the network of the active statestored only after all subscribers have
re-registered with the restarted statestored.

2) Defined a new mutex active_lock_ in class StatestoreStub to protect
is_active_, since the mutex lock_ could be held for a long time if the
subscriber loses its connection with statestored and enters recovery
mode.

3) Found one case that was not handled on statestore subscribers. The
subscribers could be started before both statestore instances are
ready to accept registration requests. This caused impalad to hit a
DCHECK. Changed the code to handle this case in this patch.
Added test cases to inject a real delay in statestored startup and
verify that impalads and catalogd are able to tolerate this delay.

4) Updated the address of the active catalogd in the metrics of
statestored after statestore service failover.

5) Another test, test_statestored_auto_failover_with_disabling_network,
failed occasionally due to a delay in the HA handshake RPC between the
two statestore instances. The issue is tracked in IMPALA-12550. The last
two lines of the test are commented out temporarily.

Testing:
 - Ran test_statestored_manual_failover hundreds of times on Jenkins.
 - Ran test_statestored_manual_failover a thousand times on a local
   machine without failure.
 - Passed core tests

Change-Id: If03bf09d22a2875d2c1eec8a4f62eeefc5d855dc
Reviewed-on: http://gerrit.cloudera.org:8080/20657
Reviewed-by: Riza Suminto 
Reviewed-by: Michael Smith 
Tested-by: Impala Public Jenkins 


> statestore.active-status did not reach value True in 120s
> -
>
> Key: IMPALA-12525
> URL: https://issues.apache.org/jira/browse/IMPALA-12525
> Project: IMPALA
>  Issue Type: Bug
>Reporter: Fang-Yu Rao
>Assignee: Wenzhe Zhou
>Priority: Major
>  Labels: broken-build, flaky-test
>
> We found that it's possible that 
> [statestore.active-status|https://github.com/apache/impala/blob/master/tests/custom_cluster/test_statestored_ha.py#L452]
>  could not reach value True in 120s.
> *+Error Message+*
> {code:java}
> AssertionError: Metric statestore.active-status did not reach value True in 
> 120s. Dumping debug webpages in JSON format... Dumped memz JSON to 
> $IMPALA_HOME/logs/metric_timeout_diags_20231026_01:53:53/json/memz.json 
> Dumped metrics JSON to 
> $IMPALA_HOME/logs/metric_timeout_diags_20231026_01:53:53/json/metrics.json 
> Dumped queries JSON to 
> $IMPALA_HOME/logs/metric_timeout_diags_20231026_01:53:53/json/queries.json 
> Dumped sessions JSON to 
> $IMPALA_HOME/logs/metric_timeout_diags_20231026_01:53:53/json/sessions.json 
> Dumped threadz JSON to 
> $IMPALA_HOME/logs/metric_timeout_diags_20231026_01:53:53/json/threadz.json 
> Dumped rpcz JSON to 
> $IMPALA_HOME/logs/metric_timeout_diags_20231026_01:53:53/json/rpcz.json 
> Dumping minidumps for impalads/catalogds... Dumped minidump for Impalad PID 
> 32539 Dumped minidump for Impalad PID 32543 Dumped minidump for Impalad PID 
> 32550 Dumped minidump for Catalogd PID 32460
> {code}
> *+Stacktrace+*
> {code:java}
> custom_cluster/test_statestored_ha.py:500: in test_statestored_manual_failover
> self.__test_statestored_manual_failover(second_failover=True)
> custom_cluster/test_statestored_ha.py:452: in 
> __test_statestored_manual_failover
> "statestore.active-status", expected_value=True, timeout=120)
> common/impala_service.py:144: in wait_for_metric_value
> self.__metric_timeout_assert(metric_name, expected_value, timeout)
> common/impala_service.py:213: in __metric_timeout_assert
> assert 0, assert_string
> E   AssertionError: Metric statestore.active-status did not reach value True 
> in 120s.
> E   Dumping debug webpages in JSON format...
> E   Dumped memz JSON to 
> $IMPALA_HOME/logs/metric_timeout_diags

[jira] [Commented] (IMPALA-12364) Display disk and network metrics in webUI's query timeline

2023-11-08 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-12364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17784269#comment-17784269
 ] 

ASF subversion and git services commented on IMPALA-12364:
--

Commit fed578580bb4dc688ab2843a07a8aa508b452fd8 in impala's branch 
refs/heads/master from Surya Hebbar
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=fed578580 ]

IMPALA-12364: Display memory, disk and network metrics in webUI's query timeline

This patch adds fragment-level metrics to the WebUI query timeline
display along with additional disk and network metrics.

The fragment's plan nodes are enlarged with an animated transition on
hovering over the fragment's row in query timeline's fragment diagram.

On clicking a plan node, the total thread and memory usage of the parent
fragment is displayed, accumulated from the memory and thread usage of
all child nodes. Thread usage is shown on the additional Y-axis.

In this way, the memory and thread usage of multiple fragments can be
compared side by side. A fragment's usage can be hidden by clicking
on any of its child plan nodes again.

These counters are available within the profile under the following names:

- MemoryUsage
- ThreadUsage

Once a fragment's metrics are displayed, they are updated as they
are collected from the profile during a running query.

A grid-line is displayed along with a tooltip on hovering over the
fragment diagram, containing the instantaneous time at that position.
This grid-line also triggers tooltips and gridlines in other charts.

A warning is displayed on clicking a fragment that has only a few
samples available.

The RESOURCE_TRACE_RATIO query option must be set to provide periodic
metrics within the profile. This allows the following time series
counters to be displayed on the query timeline:

- HostDiskWriteThroughput
- HostDiskReadThroughput
- HostNetworkRx
- HostNetworkTx

The additional Y-axis within the utilization chart is used to represent
the average of these metrics.

The memory units in tooltips and ticks on co-ordinate axes are displayed
in human readable form such as KB, MB, GB and PB for convenience.

Both charts contain controls to close the chart. The charts can also
be resized between a minimum and a maximum limit by dragging the
resize bar's handle.

Along with mouse wheel events, the diagrams can be horizontally
stretched with the help of buttons with horizontal zoom icons at the
top of the page. The zoom out button is disabled when further zooming
out is not possible.

Time ticks are autoscaled during the fragment diagram's horizontal zoom.

In addition to the scrollbar, hovering on edges of the window allows
horizontal scrolling.

Test cases have been added for the additional disk, network and
fragment-level memory metrics parsing functions.

Change-Id: Ifd25e6f0bc9fbd664ec98936daff3f27182dfc7f
Reviewed-on: http://gerrit.cloudera.org:8080/20355
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 


> Display disk and network metrics in webUI's query timeline
> --
>
> Key: IMPALA-12364
> URL: https://issues.apache.org/jira/browse/IMPALA-12364
> Project: IMPALA
>  Issue Type: Improvement
>Reporter: Surya Hebbar
>Assignee: Surya Hebbar
>Priority: Major
> Attachments: average_disk_network_metrics.mkv, 
> averaged_disk_network_metrics.png, both_charts_resize.mkv, 
> both_charts_resize.png, close_cpu_utilization_button.mkv, 
> draggable_resize_handle.png, hor_zoom_buttons.png, 
> horizontal_zoom_buttons.mkv, host_utilization_chart_resize.mkv, 
> host_utilization_close_button.png, host_utilization_resize_bar.png, 
> multiple_fragment_metrics.png, resize_drag_handle.mkv
>
>
> It would be helpful to display disk and network usage in human readable form 
> on the query timeline, aligning it along with the CPU utilization plot, below 
> the fragment timing diagram.






[jira] [Commented] (IMPALA-12513) The metadata should be reset when the CatalogD become active

2023-11-08 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-12513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17784266#comment-17784266
 ] 

ASF subversion and git services commented on IMPALA-12513:
--

Commit d2ae6594fbf313cec01217f620a5f3b20945 in impala's branch 
refs/heads/master from ttz
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=d2ae6 ]

IMPALA-12513: Allow to reset metadata when the CatalogD becomes active

When switching the active catalogd, the metadata loaded in the standby
catalogd may not be the latest. A backend flag should be provided
to control whether to reset metadata when the catalogd becomes active.
This patch adds the following startup flag for catalogd:
'catalogd_ha_reset_metadata_on_failover'. The default is false. If true,
all metadata is reset when the catalogd becomes active.

Testing:
- Added a test case to start both catalogds with flag
   'catalogd_ha_reset_metadata_on_failover' as true.
- Passed core tests

Change-Id: I2b54f36f96e7901499105e51790d8e2300d7fea9
Reviewed-on: http://gerrit.cloudera.org:8080/20614
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 


> The metadata should be reset when the CatalogD become active
> 
>
> Key: IMPALA-12513
> URL: https://issues.apache.org/jira/browse/IMPALA-12513
> Project: IMPALA
>  Issue Type: Bug
>Reporter: Zhi Tang
>Priority: Major
>
> When performing an automatic failover of CatalogD, we should ensure that the 
> metadata for active Catalogd is up to date.






[jira] [Commented] (IMPALA-12515) Allow building impala-shell tarballs for multiple Python 3 versions

2023-11-08 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-12515?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17784270#comment-17784270
 ] 

ASF subversion and git services commented on IMPALA-12515:
--

Commit 3e99dfcd169fe380356eeb74064372fc38201cdc in impala's branch 
refs/heads/master from Michael Smith
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=3e99dfcd1 ]

IMPALA-12515: Clarify behavior with redundant system python

Clarifies the behavior when building the impala-shell tarball if one of
the system pythons is also included in IMPALA_EXTRA_PACKAGE_PYTHONS. The
system python will always replace the same version from
IMPALA_EXTRA_PACKAGE_PYTHONS, as system pythons are appended to the end.

Updates make_shell_tarball to delete the old ext-py install when it
would be replaced rather than relying on 'pip --upgrade', and iterates
by python executable first to make that possible.
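
The "last one wins" rule described above can be sketched as follows.
This is only an illustration of the ordering logic: the function name,
the (version, executable) pairs and the paths are hypothetical, and
make_shell_tarball itself is not reproduced here.

```python
def resolve_pythons(extra_pythons, system_pythons):
    """Map each Python version to the executable that ends up being used.

    Both arguments are lists of (version, executable) pairs (an assumed
    representation). System pythons are appended to the end, so they
    replace an extra python of the same version.
    """
    chosen = {}
    for version, executable in list(extra_pythons) + list(system_pythons):
        # A later entry for the same version overwrites the earlier one.
        chosen[version] = executable
    return chosen


resolved = resolve_pythons(
    [("3.6", "/opt/py36/bin/python3"), ("3.8", "/opt/py38/bin/python3")],
    [("3.6", "/usr/bin/python3")],
)
```

Here the system python for version 3.6 replaces the extra one, while
the 3.8 entry is left untouched.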

Change-Id: I629bdab38d98c8c4232d4cae7b0429a5118d9ff7
Reviewed-on: http://gerrit.cloudera.org:8080/20687
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 


> Allow building impala-shell tarballs for multiple Python 3 versions
> ---
>
> Key: IMPALA-12515
> URL: https://issues.apache.org/jira/browse/IMPALA-12515
> Project: IMPALA
>  Issue Type: Task
>Reporter: Michael Smith
>Assignee: Michael Smith
>Priority: Major
> Fix For: Impala 4.4.0
>
>
> Impala builds an impala-shell tarball where dependencies are already 
> installed so they can be used with specific Python versions without needing 
> dev tools. Currently it produces two builds, for Python 2 and Python 3. 
> However, native binaries built for Python 3 only work with the specific 
> minor version they're built with (e.g. Python 3.6).
> Extend build tooling to support building for multiple Python 3 minor targets.






[jira] [Work started] (IMPALA-12515) Allow building impala-shell tarballs for multiple Python 3 versions

2023-11-08 Thread Michael Smith (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-12515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on IMPALA-12515 started by Michael Smith.
--
> Allow building impala-shell tarballs for multiple Python 3 versions
> ---
>
> Key: IMPALA-12515
> URL: https://issues.apache.org/jira/browse/IMPALA-12515
> Project: IMPALA
>  Issue Type: Task
>Reporter: Michael Smith
>Assignee: Michael Smith
>Priority: Major
> Fix For: Impala 4.4.0
>
>
> Impala builds an impala-shell tarball where dependencies are already 
> installed so they can be used with specific Python versions without needing 
> dev tools. Currently it produces two builds, for Python 2 and Python 3. 
> However, native binaries built for Python 3 only work with the specific 
> minor version they're built with (e.g. Python 3.6).
> Extend build tooling to support building for multiple Python 3 minor targets.






[jira] [Reopened] (IMPALA-12515) Allow building impala-shell tarballs for multiple Python 3 versions

2023-11-08 Thread Michael Smith (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-12515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Smith reopened IMPALA-12515:


> Allow building impala-shell tarballs for multiple Python 3 versions
> ---
>
> Key: IMPALA-12515
> URL: https://issues.apache.org/jira/browse/IMPALA-12515
> Project: IMPALA
>  Issue Type: Task
>Reporter: Michael Smith
>Assignee: Michael Smith
>Priority: Major
> Fix For: Impala 4.4.0
>
>
> Impala builds an impala-shell tarball where dependencies are already 
> installed so they can be used with specific Python versions without needing 
> dev tools. Currently it produces two builds, for Python 2 and Python 3. 
> However, native binaries built for Python 3 only work with the specific 
> minor version they're built with (e.g. Python 3.6).
> Extend build tooling to support building for multiple Python 3 minor targets.






[jira] [Created] (IMPALA-12551) impala-shell should handle broken pipe error gracefully

2023-11-08 Thread Riza Suminto (Jira)
Riza Suminto created IMPALA-12551:
-

 Summary: impala-shell should handle broken pipe error gracefully
 Key: IMPALA-12551
 URL: https://issues.apache.org/jira/browse/IMPALA-12551
 Project: IMPALA
  Issue Type: Improvement
Affects Versions: Impala 4.3.0
Reporter: Riza Suminto


I ran a query inline with impala-shell.sh and stumbled on a broken pipe error 
message like this:
{code:java}
$ impala-shell.sh -d tpcds_parquet -q "show partitions store_sales;" | head -n 10 > /tmp/test.out
Starting Impala Shell with no authentication using Python 2.7.16
Warning: live_progress only applies to interactive shell sessions, and is being skipped for now.
Opened TCP connection to localhost:21050
Connected to localhost:21050
Server version: impalad version 4.4.0-SNAPSHOT DEBUG (build 67ed00560d0df2282731dc47a58fb3d679ffb99b)
Query: use `tpcds_parquet`
Query: use `tpcds_parquet`
Query: show partitions store_sales
Unknown Exception : [Errno 32] Broken pipe
Traceback (most recent call last):
  File "/home/rsuminto/workspace/impala/shell/impala_shell.py", line 1389, in _execute_stmt
    self.output_stream.write(rows)
  File "/home/rsuminto/workspace/impala/shell/shell_output.py", line 187, in write
    print(formatted_data)
IOError: [Errno 32] Broken pipe
Could not execute command: show partitions store_sales
{code}
impala-shell should handle broken pipe errors like this gracefully.
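
One possible shape for such handling is sketched below. It is a
standalone illustration, not a patch to impala_shell.py: the function
name and the choice to return a flag instead of printing a traceback
are assumptions.

```python
import errno


def write_rows(stream, rows):
    """Write rows to stream; return False if the reader closed the pipe.

    Hypothetical helper. On Python 3, BrokenPipeError is a subclass of
    OSError (IOError is an alias); on Python 2 the same condition is an
    IOError with errno 32 (EPIPE).
    """
    try:
        for row in rows:
            stream.write(row + "\n")
        stream.flush()
    except IOError as e:
        if e.errno == errno.EPIPE:
            # The consumer (e.g. `head`) exited early; stop quietly.
            return False
        raise
    return True
```

A caller could stop fetching further results once `write_rows` returns
False, instead of reporting an unknown exception.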






[jira] [Commented] (IMPALA-12543) test_iceberg_self_events failed in JDK11 build

2023-11-08 Thread Riza Suminto (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-12543?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17784154#comment-17784154
 ] 

Riza Suminto commented on IMPALA-12543:
---

It does look flaky. It failed on the ASAN build last night.
[~hemanth619] [~boroknagyz] any idea why this test became flaky?

> test_iceberg_self_events failed in JDK11 build
> --
>
> Key: IMPALA-12543
> URL: https://issues.apache.org/jira/browse/IMPALA-12543
> Project: IMPALA
>  Issue Type: Bug
>Reporter: Riza Suminto
>Assignee: Gabor Kaszab
>Priority: Major
>  Labels: broken-build
>
> test_iceberg_self_events failed in the JDK11 build with the following error.
>  
> {code:java}
> Error Message
> assert 0 == 1
> Stacktrace
> custom_cluster/test_events_custom_configs.py:637: in test_iceberg_self_events
>     check_self_events("ALTER TABLE {0} ADD COLUMN j INT".format(tbl_name))
> custom_cluster/test_events_custom_configs.py:624: in check_self_events
>     assert tbls_refreshed_before == tbls_refreshed_after
> E   assert 0 == 1 {code}
> This test still passed before IMPALA-11387 was merged.
>  






[jira] [Commented] (IMPALA-12544) Replace scan progress with query progress as progress reporting for the shell

2023-11-08 Thread Michael Smith (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-12544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17784140#comment-17784140
 ] 

Michael Smith commented on IMPALA-12544:


That looks like a great improvement, thanks!

> Replace scan progress with query progress as progress reporting for the shell
> -
>
> Key: IMPALA-12544
> URL: https://issues.apache.org/jira/browse/IMPALA-12544
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Clients
>Affects Versions: Impala 4.3.0
>Reporter: Ye Zihao
>Assignee: Ye Zihao
>Priority: Major
> Attachments: POPO20231108-112750.jpg, POPO20231108-112808.jpg
>
>
> After IMPALA-12048 is resolved, we can see the query progress of a query on 
> the /queries page. Unlike scan progress, it uses the completion count of 
> query fragment instances to calculate progress, providing a more accurate 
> reflection of query execution progress, especially for computation-intensive 
> queries (such as TPC-DS query 78). Perhaps we should also use query progress 
> instead of scan progress in the dynamic progress reporting for the impala 
> shell, to provide more accurate query completion progress reports and avoid 
> cases where the scan is complete but the query still needs a long time to 
> finish computation even though the progress is reported as 100%.
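
The progress measure described in the quoted issue (completed fragment
instances over total fragment instances) might be computed along these
lines. The function and parameter names are hypothetical, and how the
shell would obtain the two counts is not covered here.

```python
def query_progress(completed_instances, total_instances):
    """Return the fraction of fragment instances completed, in [0.0, 1.0]."""
    if total_instances <= 0:
        # Nothing scheduled yet; report no progress rather than dividing by zero.
        return 0.0
    return min(completed_instances, total_instances) / float(total_instances)
```

Unlike scan progress, this value only reaches 1.0 once every fragment
instance has finished, which matches the motivation given in the issue.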






[jira] [Created] (IMPALA-12550) test_statestored_auto_failover_with_disabling_network flaky

2023-11-08 Thread Wenzhe Zhou (Jira)
Wenzhe Zhou created IMPALA-12550:


 Summary: test_statestored_auto_failover_with_disabling_network flaky
 Key: IMPALA-12550
 URL: https://issues.apache.org/jira/browse/IMPALA-12550
 Project: IMPALA
  Issue Type: Bug
  Components: Backend
Affects Versions: Impala 4.4.0
Reporter: Wenzhe Zhou
Assignee: Wenzhe Zhou


TestStatestoredHA.test_statestored_auto_failover_with_disabling_network failed 
with the following stack trace when the test was run repeatedly.
tests/custom_cluster/test_statestored_ha.py:645: in 
test_statestored_auto_failover_with_disabling_network
"statestore.in-ha-recovery-mode", expected_value=False, timeout=120)
tests/common/impala_service.py:144: in wait_for_metric_value
self.__metric_timeout_assert(metric_name, expected_value, timeout)
tests/common/impala_service.py:213: in __metric_timeout_assert
assert 0, assert_string
E   AssertionError: Metric statestore.in-ha-recovery-mode did not reach value 
False in 120s.


From the log messages, the issue was caused by a delay in the HA handshake RPC 
between the two statestore instances. Sometimes the active statestore took a few 
minutes to respond to the handshake requests from the standby statestore.

This issue is different from IMPALA-12525, which was caused by a locking issue on 
the subscriber side.
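
The failure pattern in the stack trace is a poll-until-timeout assertion. The sketch below is a simplified stand-in for that pattern, not the actual code in tests/common/impala_service.py; the function name wait_for_value and all of its parameters are assumptions made for illustration.

```python
import time


def wait_for_value(read_metric, expected, timeout_s, interval_s=1.0):
    """Poll read_metric() until it returns `expected` or timeout_s elapses.

    Returns the number of seconds waited. Raises AssertionError on timeout,
    mirroring the style of message seen in the stack trace above.
    """
    start = time.monotonic()
    while True:
        if read_metric() == expected:
            return time.monotonic() - start
        if time.monotonic() - start >= timeout_s:
            raise AssertionError(
                "Metric did not reach value %s in %ss." % (expected, timeout_s))
        time.sleep(interval_s)


# Hypothetical metric that leaves recovery mode on the third poll.
samples = iter([True, True, False])
waited = wait_for_value(lambda: next(samples), expected=False,
                        timeout_s=5, interval_s=0)
print("metric reached expected value after %.3fs" % waited)
```

With a fixed timeout like this, any handshake that takes longer than the timeout turns a slow but eventually successful recovery into a test failure, which matches the flakiness described here.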







[jira] [Created] (IMPALA-12549) Adjust estimations for small strings

2023-11-08 Thread Jira
Zoltán Borók-Nagy created IMPALA-12549:
--

 Summary: Adjust estimations for small strings
 Key: IMPALA-12549
 URL: https://issues.apache.org/jira/browse/IMPALA-12549
 Project: IMPALA
  Issue Type: Sub-task
  Components: Frontend
Reporter: Zoltán Borók-Nagy


With small strings, the queries consume less memory.

We should adjust the memory estimations / min reservations to take the small 
strings into account.

At first we can be conservative, i.e. only take them into account for columns 
with max size less than the small string limit.






[jira] [Work started] (IMPALA-12322) return wrong timestamp when scan kudu timestamp with timezone

2023-11-08 Thread Ye Zihao (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-12322?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on IMPALA-12322 started by Ye Zihao.
-
> return wrong timestamp when scan kudu timestamp with timezone
> -
>
> Key: IMPALA-12322
> URL: https://issues.apache.org/jira/browse/IMPALA-12322
> Project: IMPALA
>  Issue Type: Bug
>Affects Versions: Impala 4.1.1
> Environment: impala 4.1.1
>Reporter: daicheng
>Assignee: Ye Zihao
>Priority: Major
> Attachments: image-2022-04-24-00-01-05-746-1.png, 
> image-2022-04-24-00-01-05-746.png, image-2022-04-24-00-01-37-520.png, 
> image-2022-04-24-00-03-14-467-1.png, image-2022-04-24-00-03-14-467.png, 
> image-2022-04-24-00-04-16-240-1.png, image-2022-04-24-00-04-16-240.png, 
> image-2022-04-24-00-04-52-860-1.png, image-2022-04-24-00-04-52-860.png, 
> image-2022-04-24-00-05-52-086-1.png, image-2022-04-24-00-05-52-086.png, 
> image-2022-04-24-00-07-09-776-1.png, image-2022-04-24-00-07-09-776.png, 
> image-2023-07-28-20-31-09-457.png, image-2023-07-28-22-27-38-521.png, 
> image-2023-07-28-22-29-40-083.png, image-2023-07-28-22-36-17-460.png, 
> image-2023-07-28-22-36-37-884.png, image-2023-07-28-22-38-19-728.png
>
>
> The Impala version is 3.1.0-cdh6.1.
> I have set the system timezone to Asia/Shanghai:
> !image-2022-04-24-00-01-37-520.png!
> !image-2022-04-24-00-01-05-746.png!
> Here is the bug:
> *step 1*
> I have a Parquet file with two columns, shown below, and read it with both 
> impala-shell and Spark (timezone=Shanghai):
> !image-2022-04-24-00-03-14-467.png|width=1016,height=154!
> !image-2022-04-24-00-04-16-240.png|width=944,height=367!
> Both results are exactly right.
> *step 2*
> Create a Kudu table with impala-shell:
> CREATE TABLE default.test_{_}test{_}_test_time2 (id BIGINT,t 
> TIMESTAMP,PRIMARY KEY (id) ) STORED AS KUDU;
> Note: the Kudu version is 1.8.
> Then insert 2 rows into the table with Spark:
> !image-2022-04-24-00-04-52-860.png|width=914,height=279!
> *step 3*
> Read it with Spark (timezone=Shanghai). Spark reads the Kudu table through the 
> kudu-client API. Here is the result:
> !image-2022-04-24-00-05-52-086.png|width=914,height=301!
> The result is still exactly right.
> But when reading it with impala-shell: 
> !image-2022-04-24-00-07-09-776.png|width=915,height=154!
> the result is 8 hours late.
> *conclusion*
> It seems the Impala timezone setting has no effect when the Kudu column type 
> is TIMESTAMP, although it works fine for Parquet files. I don't know why.
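
An 8-hour difference is what one would expect if the same stored instant is rendered in UTC by one reader and in Asia/Shanghai by another. The sketch below only illustrates that arithmetic. It assumes the value is stored as microseconds since the Unix epoch in UTC (Kudu's UNIXTIME_MICROS representation), uses a fixed UTC+8 offset as a stand-in for Asia/Shanghai, and uses a made-up sample timestamp; whether this is the actual cause inside Impala is not established by the report.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)
# Fixed UTC+8 offset standing in for Asia/Shanghai (which has no DST today).
SHANGHAI = timezone(timedelta(hours=8), "Asia/Shanghai")


def to_epoch_micros(moment: datetime) -> int:
    """Convert a timezone-aware datetime to microseconds since the epoch."""
    return (moment - EPOCH) // timedelta(microseconds=1)


def render(micros: int, tz: timezone) -> str:
    """Render stored epoch microseconds as wall-clock time in the given zone."""
    moment = EPOCH + timedelta(microseconds=micros)
    return moment.astimezone(tz).strftime("%Y-%m-%d %H:%M:%S")


# Hypothetical row written by a client running in Shanghai local time.
written = datetime(2022, 4, 24, 0, 0, 0, tzinfo=SHANGHAI)
stored = to_epoch_micros(written)

print("reader converting to local time:", render(stored, SHANGHAI))
print("reader displaying raw UTC:      ", render(stored, timezone.utc))
```

The first reader shows 2022-04-24 00:00:00 and the second shows 2022-04-23 16:00:00, so two clients reading the same stored value disagree by exactly the zone offset.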


