[jira] [Created] (IMPALA-7934) Switch to using Java 8's Base64 impl for incremental stats encoding

2018-12-05 Thread bharath v (JIRA)
bharath v created IMPALA-7934:
-

 Summary: Switch to using Java 8's Base64 impl for incremental 
stats encoding
 Key: IMPALA-7934
 URL: https://issues.apache.org/jira/browse/IMPALA-7934
 Project: IMPALA
  Issue Type: Bug
  Components: Catalog
Affects Versions: Impala 3.1.0
Reporter: bharath v
 Attachments: base64.png

Incremental stats are compressed and Base64 encoded before they are chunked and
written to the HMS partition parameters map. When they are read back, we need
to Base64 decode and decompress them.

For certain incremental-stats-heavy tables, we noticed that a significant
amount of time is spent in these Base64 classes (see the attached image for the
stack; unfortunately, I don't have the text version of it).

Java 8 comes with its own Base64 implementation, which has shown much better
perf results [1] compared to Apache Commons Codec's impl. So consider switching
to Java 8's Base64 impl.

 [1] http://java-performance.info/base64-encoding-and-decoding-performance/
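
As a rough sketch of the proposed change (the class and method names below are
illustrative placeholders, not the actual catalog code, and the chunk is assumed
to be compressed before encoding):

{code}
import java.util.Base64;

// Illustrative sketch only; class and method names are placeholders, not the
// actual catalog code.
public class IncrementalStatsBase64Sketch {

  // Encode one already-compressed stats chunk before it is written into the
  // HMS partition parameters map.
  static String encodeChunk(byte[] compressedChunk) {
    // java.util.Base64 (JDK 8+) instead of org.apache.commons.codec.binary.Base64.
    return Base64.getEncoder().encodeToString(compressedChunk);
  }

  // Decode one chunk read back from HMS, before decompression.
  static byte[] decodeChunk(String encodedChunk) {
    return Base64.getDecoder().decode(encodedChunk);
  }

  public static void main(String[] args) {
    byte[] chunk = {1, 2, 3, 4, 5};
    String encoded = encodeChunk(chunk);
    System.out.println(encoded + " -> " + decodeChunk(encoded).length + " bytes");
  }
}
{code}

The encoder and decoder returned by {{java.util.Base64}} are documented as
thread-safe, so they can be held in static fields.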

 






[jira] [Updated] (IMPALA-7934) Switch to using Java 8's Base64 impl for incremental stats encoding

2018-12-05 Thread bharath v (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7934?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

bharath v updated IMPALA-7934:
--
Labels: ramp-up  (was: )

> Switch to using Java 8's Base64 impl for incremental stats encoding
> ---
>
> Key: IMPALA-7934
> URL: https://issues.apache.org/jira/browse/IMPALA-7934
> Project: IMPALA
>  Issue Type: Bug
>  Components: Catalog
>Affects Versions: Impala 3.1.0
>Reporter: bharath v
>Priority: Major
>  Labels: ramp-up
> Attachments: base64.png
>
>
> Incremental stats are compressed and Base64 encoded before they are chunked
> and written to the HMS partition parameters map. When they are read back, we
> need to Base64 decode and decompress them.
> For certain incremental-stats-heavy tables, we noticed that a significant
> amount of time is spent in these Base64 classes (see the attached image for
> the stack; unfortunately, I don't have the text version of it).
> Java 8 comes with its own Base64 implementation, which has shown much better
> perf results [1] compared to Apache Commons Codec's impl. So consider
> switching to Java 8's Base64 impl.
>  [1] http://java-performance.info/base64-encoding-and-decoding-performance/
>  






[jira] [Created] (IMPALA-7933) Consider using read-write locks for partial fetch requests.

2018-12-05 Thread bharath v (JIRA)
bharath v created IMPALA-7933:
-

 Summary: Consider using read-write locks for partial fetch 
requests.
 Key: IMPALA-7933
 URL: https://issues.apache.org/jira/browse/IMPALA-7933
 Project: IMPALA
  Issue Type: Sub-task
  Components: Catalog
Affects Versions: Impala 3.1.0
Reporter: bharath v


Partial table fetches currently take an exclusive lock on the table. Should we
switch to a read-write lock instead?

{code}
 // TODO(todd): consider a read-write lock here.
  table.getLock().lock();
  try {
return table.getPartialInfo(req);
  } finally {
table.getLock().unlock();
  }
{code}
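
A hypothetical sketch of what the TODO might look like with
{{java.util.concurrent.locks.ReentrantReadWriteLock}} (the class and field names
are stand-ins for illustration, not the actual catalog code):

{code}
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical stand-in for the table object; not the actual catalog code.
public class PartialFetchLockSketch {
  private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
  private String partialInfo = "initial metadata";

  // Partial fetches are read-only, so concurrent fetches could share the read lock.
  public String getPartialInfo() {
    lock.readLock().lock();
    try {
      return partialInfo;
    } finally {
      lock.readLock().unlock();
    }
  }

  // Metadata mutations would still take the write lock exclusively.
  public void updatePartialInfo(String newInfo) {
    lock.writeLock().lock();
    try {
      partialInfo = newInfo;
    } finally {
      lock.writeLock().unlock();
    }
  }
}
{code}

Read-only partial fetches could then proceed concurrently, while metadata updates
would still serialize on the write lock.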






[jira] [Updated] (IMPALA-7249) Cancel shutdown of impalad

2018-12-05 Thread Tim Armstrong (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7249?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Armstrong updated IMPALA-7249:
--
Description: 
Following on from IMPALA-1760, it could be useful to cancel shutdown for some 
use cases.

An extension would be to allow extending the deadline.

  was:Following on from IMPALA-1760, it could be useful to cancel shutdown for 
some use cases.


> Cancel shutdown of impalad
> --
>
> Key: IMPALA-7249
> URL: https://issues.apache.org/jira/browse/IMPALA-7249
> Project: IMPALA
>  Issue Type: New Feature
>  Components: Distributed Exec
>Reporter: Tim Armstrong
>Priority: Minor
>
> Following on from IMPALA-1760, it could be useful to cancel shutdown for some 
> use cases.
> An extension would be to allow extending the deadline.






[jira] [Resolved] (IMPALA-7932) Cannot change shutdown deadline after issuing initial shutdown command

2018-12-05 Thread Tim Armstrong (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Armstrong resolved IMPALA-7932.
---
Resolution: Duplicate

I think this is essentially the same use case as IMPALA-7249. I don't have 
plans to work on that.

> Cannot change shutdown deadline after issuing initial shutdown command
> --
>
> Key: IMPALA-7932
> URL: https://issues.apache.org/jira/browse/IMPALA-7932
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 3.1.0, Impala 3.2.0
>Reporter: Lars Volker
>Assignee: Tim Armstrong
>Priority: Major
>
> Starting Impala Shell without Kerberos authentication
> Opened TCP connection to localhost:21000
> Connected to localhost:21000
> Server version: impalad version 3.1.0-SNAPSHOT DEBUG (build 
> 3d38043e6b9da2bab38490a23dda2103368f4e0a)
> ***
> Welcome to the Impala shell.
> (Impala Shell v3.1.0-SNAPSHOT (3d38043) built on Mon Dec  3 15:50:55 PST 2018)
> After running a query, type SUMMARY to see a summary of where time was spent.
> ***
> [localhost:21000] default> :shutdown(100);
> Query: :shutdown(100)
> Query submitted at: 2018-12-05 19:22:07 (Coordinator: http://lv-desktop:25000)
> Query progress can be monitored at: 
> http://lv-desktop:25000/query_plan?query_id=2f41eaf1c21603b8:d6546bf1
> +---+
> | summary 
>   |
> +---+
> | startup grace period left: 2m, deadline left: 1m40s, fragment instances: 0, 
> queries registered: 1 |
> +---+
> Fetched 1 row(s) in 0.11s
> [localhost:21000] default> :shutdown(10);
> Query: :shutdown(10)
> Query submitted at: 2018-12-05 19:22:10 (Coordinator: http://lv-desktop:25000)
> ERROR: Server is being shut down: startup grace period left: 1m56s, deadline 
> left: 1m36s, fragment instances: 0, queries registered: 0.
> [localhost:21000] default> :shutdown(1000);
> Query: :shutdown(1000)
> Query submitted at: 2018-12-05 19:22:14 (Coordinator: http://lv-desktop:25000)
> ERROR: Server is being shut down: startup grace period left: 1m52s, deadline 
> left: 1m32s, fragment instances: 0, queries registered: 0.






[jira] [Updated] (IMPALA-7931) test_shutdown_executor fails with timeout waiting for query target state

2018-12-05 Thread Lars Volker (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7931?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Volker updated IMPALA-7931:

Description: 
On a recent S3 test run test_shutdown_executor hit a timeout waiting for a 
query to reach state FINISHED. Instead the query stays at state 5 (EXCEPTION).

{noformat}
12:51:11 __ TestShutdownCommand.test_shutdown_executor 
__
12:51:11 custom_cluster/test_restart_services.py:209: in test_shutdown_executor
12:51:11 assert self.__fetch_and_get_num_backends(QUERY, 
before_shutdown_handle) == 3
12:51:11 custom_cluster/test_restart_services.py:356: in 
__fetch_and_get_num_backends
12:51:11 self.client.QUERY_STATES['FINISHED'], timeout=20)
12:51:11 common/impala_service.py:267: in wait_for_query_state
12:51:11 target_state, query_state)
12:51:11 E   AssertionError: Did not reach query state in time target=4 actual=5
{noformat}

From the logs I can see that the query fails because one of the executors
becomes unreachable:

{noformat}
I1204 12:31:39.954125  5609 impala-server.cc:1792] Query 
a34c3a84775e5599:b2b25eb9: Failed due to unreachable impalad(s): 
jenkins-worker:22001
{noformat}

The query was {{select count\(*) from functional_parquet.alltypes where 
sleep(1) = bool_col}}. 

It seems that the query took longer than expected and was still running when 
the executor shut down.

  was:
On a recent S3 test run test_shutdown_executor hit a timeout waiting for a 
query to reach state FINISHED. Instead the query stays at state 5 (EXCEPTION).

{noformat}
12:51:11 __ TestShutdownCommand.test_shutdown_executor 
__
12:51:11 custom_cluster/test_restart_services.py:209: in test_shutdown_executor
12:51:11 assert self.__fetch_and_get_num_backends(QUERY, 
before_shutdown_handle) == 3
12:51:11 custom_cluster/test_restart_services.py:356: in 
__fetch_and_get_num_backends
12:51:11 self.client.QUERY_STATES['FINISHED'], timeout=20)
12:51:11 common/impala_service.py:267: in wait_for_query_state
12:51:11 target_state, query_state)
12:51:11 E   AssertionError: Did not reach query state in time target=4 actual=5
{noformat}

From the logs I can see that the query fails because one of the executors
becomes unreachable:

{noformat}
I1204 12:31:39.954125  5609 impala-server.cc:1792] Query 
a34c3a84775e5599:b2b25eb9: Failed due to unreachable impalad(s): 
jenkins-worker:22001
{noformat}

The query was {{select count(*) from functional_parquet.alltypes where sleep(1) 
= bool_col}}. 

It seems that the query took longer than expected and was still running when 
the executor shut down.


> test_shutdown_executor fails with timeout waiting for query target state
> 
>
> Key: IMPALA-7931
> URL: https://issues.apache.org/jira/browse/IMPALA-7931
> Project: IMPALA
>  Issue Type: Bug
>  Components: Infrastructure
>Affects Versions: Impala 3.2.0
>Reporter: Lars Volker
>Priority: Critical
>  Labels: broken-build
>
> On a recent S3 test run test_shutdown_executor hit a timeout waiting for a 
> query to reach state FINISHED. Instead the query stays at state 5 (EXCEPTION).
> {noformat}
> 12:51:11 __ TestShutdownCommand.test_shutdown_executor 
> __
> 12:51:11 custom_cluster/test_restart_services.py:209: in 
> test_shutdown_executor
> 12:51:11 assert self.__fetch_and_get_num_backends(QUERY, 
> before_shutdown_handle) == 3
> 12:51:11 custom_cluster/test_restart_services.py:356: in 
> __fetch_and_get_num_backends
> 12:51:11 self.client.QUERY_STATES['FINISHED'], timeout=20)
> 12:51:11 common/impala_service.py:267: in wait_for_query_state
> 12:51:11 target_state, query_state)
> 12:51:11 E   AssertionError: Did not reach query state in time target=4 
> actual=5
> {noformat}
> From the logs I can see that the query fails because one of the executors 
> becomes unreachable:
> {noformat}
> I1204 12:31:39.954125  5609 impala-server.cc:1792] Query 
> a34c3a84775e5599:b2b25eb9: Failed due to unreachable impalad(s): 
> jenkins-worker:22001
> {noformat}
> The query was {{select count\(*) from functional_parquet.alltypes where 
> sleep(1) = bool_col}}. 
> It seems that the query took longer than expected and was still running when 
> the executor shut down.






[jira] [Created] (IMPALA-7932) Cannot change shutdown deadline after issuing initial shutdown command

2018-12-05 Thread Lars Volker (JIRA)
Lars Volker created IMPALA-7932:
---

 Summary: Cannot change shutdown deadline after issuing initial 
shutdown command
 Key: IMPALA-7932
 URL: https://issues.apache.org/jira/browse/IMPALA-7932
 Project: IMPALA
  Issue Type: Bug
  Components: Backend
Affects Versions: Impala 3.1.0, Impala 3.2.0
Reporter: Lars Volker
Assignee: Tim Armstrong


Starting Impala Shell without Kerberos authentication
Opened TCP connection to localhost:21000
Connected to localhost:21000
Server version: impalad version 3.1.0-SNAPSHOT DEBUG (build 
3d38043e6b9da2bab38490a23dda2103368f4e0a)
***
Welcome to the Impala shell.
(Impala Shell v3.1.0-SNAPSHOT (3d38043) built on Mon Dec  3 15:50:55 PST 2018)

After running a query, type SUMMARY to see a summary of where time was spent.
***
[localhost:21000] default> :shutdown(100);
Query: :shutdown(100)
Query submitted at: 2018-12-05 19:22:07 (Coordinator: http://lv-desktop:25000)
Query progress can be monitored at: 
http://lv-desktop:25000/query_plan?query_id=2f41eaf1c21603b8:d6546bf1
+---+
| summary   
|
+---+
| startup grace period left: 2m, deadline left: 1m40s, fragment instances: 0, 
queries registered: 1 |
+---+
Fetched 1 row(s) in 0.11s
[localhost:21000] default> :shutdown(10);
Query: :shutdown(10)
Query submitted at: 2018-12-05 19:22:10 (Coordinator: http://lv-desktop:25000)
ERROR: Server is being shut down: startup grace period left: 1m56s, deadline 
left: 1m36s, fragment instances: 0, queries registered: 0.

[localhost:21000] default> :shutdown(1000);
Query: :shutdown(1000)
Query submitted at: 2018-12-05 19:22:14 (Coordinator: http://lv-desktop:25000)
ERROR: Server is being shut down: startup grace period left: 1m52s, deadline 
left: 1m32s, fragment instances: 0, queries registered: 0.







[jira] [Assigned] (IMPALA-7931) test_shutdown_executor fails with timeout waiting for query target state

2018-12-05 Thread Lars Volker (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7931?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Volker reassigned IMPALA-7931:
---

Assignee: Tim Armstrong

> test_shutdown_executor fails with timeout waiting for query target state
> 
>
> Key: IMPALA-7931
> URL: https://issues.apache.org/jira/browse/IMPALA-7931
> Project: IMPALA
>  Issue Type: Bug
>  Components: Infrastructure
>Affects Versions: Impala 3.2.0
>Reporter: Lars Volker
>Assignee: Tim Armstrong
>Priority: Critical
>  Labels: broken-build
>
> On a recent S3 test run test_shutdown_executor hit a timeout waiting for a 
> query to reach state FINISHED. Instead the query stays at state 5 (EXCEPTION).
> {noformat}
> 12:51:11 __ TestShutdownCommand.test_shutdown_executor 
> __
> 12:51:11 custom_cluster/test_restart_services.py:209: in 
> test_shutdown_executor
> 12:51:11 assert self.__fetch_and_get_num_backends(QUERY, 
> before_shutdown_handle) == 3
> 12:51:11 custom_cluster/test_restart_services.py:356: in 
> __fetch_and_get_num_backends
> 12:51:11 self.client.QUERY_STATES['FINISHED'], timeout=20)
> 12:51:11 common/impala_service.py:267: in wait_for_query_state
> 12:51:11 target_state, query_state)
> 12:51:11 E   AssertionError: Did not reach query state in time target=4 
> actual=5
> {noformat}
> From the logs I can see that the query fails because one of the executors 
> becomes unreachable:
> {noformat}
> I1204 12:31:39.954125  5609 impala-server.cc:1792] Query 
> a34c3a84775e5599:b2b25eb9: Failed due to unreachable impalad(s): 
> jenkins-worker:22001
> {noformat}
> The query was {{select count\(*) from functional_parquet.alltypes where 
> sleep(1) = bool_col}}. 
> It seems that the query took longer than expected and was still running when 
> the executor shut down.






[jira] [Commented] (IMPALA-7931) test_shutdown_executor fails with timeout waiting for query target state

2018-12-05 Thread Lars Volker (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16710932#comment-16710932
 ] 

Lars Volker commented on IMPALA-7931:
-

[~tarmstrong] - I’m assigning this to you thinking you might have an idea 
what’s going on here; feel free to find another person or assign back to me if 
you're swamped.

> test_shutdown_executor fails with timeout waiting for query target state
> 
>
> Key: IMPALA-7931
> URL: https://issues.apache.org/jira/browse/IMPALA-7931
> Project: IMPALA
>  Issue Type: Bug
>  Components: Infrastructure
>Affects Versions: Impala 3.2.0
>Reporter: Lars Volker
>Priority: Critical
>  Labels: broken-build
>
> On a recent S3 test run test_shutdown_executor hit a timeout waiting for a 
> query to reach state FINISHED. Instead the query stays at state 5 (EXCEPTION).
> {noformat}
> 12:51:11 __ TestShutdownCommand.test_shutdown_executor 
> __
> 12:51:11 custom_cluster/test_restart_services.py:209: in 
> test_shutdown_executor
> 12:51:11 assert self.__fetch_and_get_num_backends(QUERY, 
> before_shutdown_handle) == 3
> 12:51:11 custom_cluster/test_restart_services.py:356: in 
> __fetch_and_get_num_backends
> 12:51:11 self.client.QUERY_STATES['FINISHED'], timeout=20)
> 12:51:11 common/impala_service.py:267: in wait_for_query_state
> 12:51:11 target_state, query_state)
> 12:51:11 E   AssertionError: Did not reach query state in time target=4 
> actual=5
> {noformat}
> From the logs I can see that the query fails because one of the executors 
> becomes unreachable:
> {noformat}
> I1204 12:31:39.954125  5609 impala-server.cc:1792] Query 
> a34c3a84775e5599:b2b25eb9: Failed due to unreachable impalad(s): 
> jenkins-worker:22001
> {noformat}
> The query was {{select count\(*) from functional_parquet.alltypes where 
> sleep(1) = bool_col}}. 
> It seems that the query took longer than expected and was still running when 
> the executor shut down.






[jira] [Created] (IMPALA-7931) test_shutdown_executor fails with timeout waiting for query target state

2018-12-05 Thread Lars Volker (JIRA)
Lars Volker created IMPALA-7931:
---

 Summary: test_shutdown_executor fails with timeout waiting for 
query target state
 Key: IMPALA-7931
 URL: https://issues.apache.org/jira/browse/IMPALA-7931
 Project: IMPALA
  Issue Type: Bug
  Components: Infrastructure
Affects Versions: Impala 3.2.0
Reporter: Lars Volker


On a recent S3 test run test_shutdown_executor hit a timeout waiting for a 
query to reach state FINISHED. Instead the query stays at state 5 (EXCEPTION).

{noformat}
12:51:11 __ TestShutdownCommand.test_shutdown_executor 
__
12:51:11 custom_cluster/test_restart_services.py:209: in test_shutdown_executor
12:51:11 assert self.__fetch_and_get_num_backends(QUERY, 
before_shutdown_handle) == 3
12:51:11 custom_cluster/test_restart_services.py:356: in 
__fetch_and_get_num_backends
12:51:11 self.client.QUERY_STATES['FINISHED'], timeout=20)
12:51:11 common/impala_service.py:267: in wait_for_query_state
12:51:11 target_state, query_state)
12:51:11 E   AssertionError: Did not reach query state in time target=4 actual=5
{noformat}

From the logs I can see that the query fails because one of the executors
becomes unreachable:

{noformat}
I1204 12:31:39.954125  5609 impala-server.cc:1792] Query 
a34c3a84775e5599:b2b25eb9: Failed due to unreachable impalad(s): 
jenkins-worker:22001
{noformat}

The query was {{select count(*) from functional_parquet.alltypes where sleep(1) 
= bool_col}}. 

It seems that the query took longer than expected and was still running when 
the executor shut down.






[jira] [Updated] (IMPALA-7931) test_shutdown_executor fails with timeout waiting for query target state

2018-12-05 Thread Lars Volker (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7931?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Volker updated IMPALA-7931:

Labels: broken-build  (was: )

> test_shutdown_executor fails with timeout waiting for query target state
> 
>
> Key: IMPALA-7931
> URL: https://issues.apache.org/jira/browse/IMPALA-7931
> Project: IMPALA
>  Issue Type: Bug
>  Components: Infrastructure
>Affects Versions: Impala 3.2.0
>Reporter: Lars Volker
>Priority: Critical
>  Labels: broken-build
>
> On a recent S3 test run test_shutdown_executor hit a timeout waiting for a 
> query to reach state FINISHED. Instead the query stays at state 5 (EXCEPTION).
> {noformat}
> 12:51:11 __ TestShutdownCommand.test_shutdown_executor 
> __
> 12:51:11 custom_cluster/test_restart_services.py:209: in 
> test_shutdown_executor
> 12:51:11 assert self.__fetch_and_get_num_backends(QUERY, 
> before_shutdown_handle) == 3
> 12:51:11 custom_cluster/test_restart_services.py:356: in 
> __fetch_and_get_num_backends
> 12:51:11 self.client.QUERY_STATES['FINISHED'], timeout=20)
> 12:51:11 common/impala_service.py:267: in wait_for_query_state
> 12:51:11 target_state, query_state)
> 12:51:11 E   AssertionError: Did not reach query state in time target=4 
> actual=5
> {noformat}
> From the logs I can see that the query fails because one of the executors 
> becomes unreachable:
> {noformat}
> I1204 12:31:39.954125  5609 impala-server.cc:1792] Query 
> a34c3a84775e5599:b2b25eb9: Failed due to unreachable impalad(s): 
> jenkins-worker:22001
> {noformat}
> The query was {{select count(*) from functional_parquet.alltypes where 
> sleep(1) = bool_col}}. 
> It seems that the query took longer than expected and was still running when 
> the executor shut down.






[jira] [Commented] (IMPALA-7906) Crash in JVM PSPromotionManager::copy_to_survivor_space

2018-12-05 Thread Tim Armstrong (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16710881#comment-16710881
 ] 

Tim Armstrong commented on IMPALA-7906:
---

I tried looping some tests to reproduce with no luck.

> Crash in JVM PSPromotionManager::copy_to_survivor_space
> ---
>
> Key: IMPALA-7906
> URL: https://issues.apache.org/jira/browse/IMPALA-7906
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 3.2.0
>Reporter: Tim Armstrong
>Assignee: Tim Armstrong
>Priority: Critical
>  Labels: broken-build, crash
> Attachments: hs_err_pid6290.log
>
>
> {noformat}
> #0  0x7f44ca5261f7 in raise () from /lib64/libc.so.6
> #1  0x7f44ca5278e8 in abort () from /lib64/libc.so.6
> #2  0x7f44cd726185 in os::abort(bool) () from 
> /usr/java/jdk1.8.0_144/jre/lib/amd64/server/libjvm.so
> #3  0x7f44cd8c8593 in VMError::report_and_die() () from 
> /usr/java/jdk1.8.0_144/jre/lib/amd64/server/libjvm.so
> #4  0x7f44cd8c8a7e in crash_handler(int, siginfo*, void*) () from 
> /usr/java/jdk1.8.0_144/jre/lib/amd64/server/libjvm.so
> #5  0x7f44cd724f72 in os::Linux::chained_handler(int, siginfo*, void*) () 
> from /usr/java/jdk1.8.0_144/jre/lib/amd64/server/libjvm.so
> #6  0x7f44cd72b5f6 in JVM_handle_linux_signal () from 
> /usr/java/jdk1.8.0_144/jre/lib/amd64/server/libjvm.so
> #7  0x7f44cd721be3 in signalHandler(int, siginfo*, void*) () from 
> /usr/java/jdk1.8.0_144/jre/lib/amd64/server/libjvm.so
> #8  
> #9  0x7f44cd713e95 in oopDesc::print_on(outputStream*) const () from 
> /usr/java/jdk1.8.0_144/jre/lib/amd64/server/libjvm.so
> #10 0x7f44cd72afdb in os::print_register_info(outputStream*, void*) () 
> from /usr/java/jdk1.8.0_144/jre/lib/amd64/server/libjvm.so
> #11 0x7f44cd8c6c13 in VMError::report(outputStream*) () from 
> /usr/java/jdk1.8.0_144/jre/lib/amd64/server/libjvm.so
> #12 0x7f44cd8c818a in VMError::report_and_die() () from 
> /usr/java/jdk1.8.0_144/jre/lib/amd64/server/libjvm.so
> #13 0x7f44cd72b68f in JVM_handle_linux_signal () from 
> /usr/java/jdk1.8.0_144/jre/lib/amd64/server/libjvm.so
> #14 0x7f44cd721be3 in signalHandler(int, siginfo*, void*) () from 
> /usr/java/jdk1.8.0_144/jre/lib/amd64/server/libjvm.so
> #15 
> #16 0x7f44cd78f562 in oopDesc* 
> PSPromotionManager::copy_to_survivor_space(oopDesc*) () from 
> /usr/java/jdk1.8.0_144/jre/lib/amd64/server/libjvm.so
> #17 0x7f44cd7924a5 in PSRootsClosure::do_oop(oopDesc**) () from 
> /usr/java/jdk1.8.0_144/jre/lib/amd64/server/libjvm.so
> #18 0x7f44cd716a96 in InterpreterOopMap::iterate_oop(OffsetClosure*) 
> const () from /usr/java/jdk1.8.0_144/jre/lib/amd64/server/libjvm.so
> #19 0x7f44cd38f789 in frame::oops_interpreted_do(OopClosure*, 
> CLDClosure*, RegisterMap const*, bool) () from 
> /usr/java/jdk1.8.0_144/jre/lib/amd64/server/libjvm.so
> #20 0x7f44cd86eaa1 in JavaThread::oops_do(OopClosure*, CLDClosure*, 
> CodeBlobClosure*) () from 
> /usr/java/jdk1.8.0_144/jre/lib/amd64/server/libjvm.so
> #21 0x7f44cd79270f in ThreadRootsTask::do_it(GCTaskManager*, unsigned 
> int) () from /usr/java/jdk1.8.0_144/jre/lib/amd64/server/libjvm.so
> #22 0x7f44cd3d7ecf in GCTaskThread::run() () from 
> /usr/java/jdk1.8.0_144/jre/lib/amd64/server/libjvm.so
> #23 0x7f44cd727338 in java_start(Thread*) () from 
> /usr/java/jdk1.8.0_144/jre/lib/amd64/server/libjvm.so
> #24 0x7f44ca8bbe25 in start_thread () from /lib64/libpthread.so.0
> #25 0x7f44ca5e934d in clone () from /lib64/libc.so.6
> {noformat}
> These are the tests running at the time
> {noformat}
> 006:53:04 [gw1] PASSED 
> query_test/test_mem_usage_scaling.py::TestQueryMemLimitScaling::test_mem_usage_scaling[mem_limit:
>  -1 | protocol: beeswax | exec_option: {'batch_size': 0, 'num_nodes': 0, 
> 'disable_codegen_rows_threshold': 0, 'disable_codegen': False, 
> 'abort_on_error': 1, 'debug_action': None, 'exec_single_node_rows_threshold': 
> 0} | table_format: parquet/none] 
> 06:53:07 
> query_test/test_mem_usage_scaling.py::TestQueryMemLimitScaling::test_mem_usage_scaling[mem_limit:
>  400m | protocol: beeswax | exec_option: {'batch_size': 0, 'num_nodes': 0, 
> 'disable_codegen_rows_threshold': 0, 'disable_codegen': False, 
> 'abort_on_error': 1, 'debug_action': None, 'exec_single_node_rows_threshold': 
> 0} | table_format: parquet/none] 
> 06:53:07 [gw5] PASSED 
> query_test/test_analytic_tpcds.py::TestAnalyticTpcds::test_analytic_functions_tpcds[batch_size:
>  1 | protocol: beeswax | exec_option: {'batch_size': 0, 'num_nodes': 0, 
> 'disable_codegen_rows_threshold': 0, 'disable_codegen': False, 
> 'abort_on_error': 1, 'debug_action': None, 'exec_single_node_rows_threshold': 
> 0} | table_format: parquet/none] 
> 06:53:08 
> 

[jira] [Commented] (IMPALA-7930) Crash in thrift-server-test

2018-12-05 Thread Lars Volker (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16710856#comment-16710856
 ] 

Lars Volker commented on IMPALA-7930:
-

[~twmarshall] - I’m assigning this to you thinking you might have an idea 
what’s going on here; feel free to find another person or assign back to me if 
you're swamped.

> Crash in thrift-server-test
> ---
>
> Key: IMPALA-7930
> URL: https://issues.apache.org/jira/browse/IMPALA-7930
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 3.2.0
>Reporter: Lars Volker
>Priority: Critical
>  Labels: broken-build, flaky
>
> I've seen a crash in thrift-server-test during an exhaustive test run. 
> Unfortunately the core file indicated that it was written by a directory, 
> which caused the automatic core dump resolution to fail. Here's the resolved 
> minidump:
> {noformat}
> Crash reason:  SIGABRT
> Crash address: 0x7d11d19
> Process uptime: not available
> Thread 0 (crashed)
>  0  libc-2.17.so + 0x351f7
> rax = 0x   rdx = 0x0006
> rcx = 0x   rbx = 0x7f1e65876000
> rsi = 0x1d19   rdi = 0x1d19
> rbp = 0x7f1e61dbde68   rsp = 0x7fffc22796d8
>  r8 = 0x000a1a10r9 = 0xfefefefefeff092d
> r10 = 0x0008   r11 = 0x0202
> r12 = 0x029dca31   r13 = 0x033a5e00
> r14 = 0x   r15 = 0x
> rip = 0x7f1e61c721f7
> Found by: given as instruction pointer in context
>  1  libc-2.17.so + 0x368e8
> rsp = 0x7fffc22796e0   rip = 0x7f1e61c738e8
> Found by: stack scanning
>  2  libc-2.17.so + 0x17df70
> rsp = 0x7fffc2279770   rip = 0x7f1e61dbaf70
> Found by: stack scanning
>  3  thrift-server-test!_fini + 0xdf918
> rsp = 0x7fffc2279778   rip = 0x02ab0288
> Found by: stack scanning
>  4  libc-2.17.so + 0x2fbc3
> rsp = 0x7fffc2279790   rip = 0x7f1e61c6cbc3
> Found by: stack scanning
>  5  
> thrift-server-test!testing::internal::TestEventRepeater::OnTestProgramEnd(testing::UnitTest
>  const&) + 0x55
> rsp = 0x7fffc22797b0   rip = 0x028711f5
> Found by: stack scanning
>  6  libc-2.17.so + 0x17df70
> rbx = 0x   rbp = 0x
> rsp = 0x7fffc22797e0   r12 = 0x
> r13 = 0x0005   rip = 0x7f1e61dbaf70
> Found by: call frame info
>  7  thrift-server-test!_fini + 0xc0c1
> rsp = 0x7fffc22797f0   rip = 0x029dca31
> Found by: stack scanning
>  8  thrift-server-test!_fini + 0x9d5490
> rsp = 0x7fffc22797f8   rip = 0x033a5e00
> Found by: stack scanning
>  9  libc-2.17.so + 0x180e68
> rsp = 0x7fffc2279808   rip = 0x7f1e61dbde68
> Found by: stack scanning
> 10  libc-2.17.so + 0x2e266
> rsp = 0x7fffc2279810   rip = 0x7f1e61c6b266
> Found by: stack scanning
> 11  thrift-server-test!_fini + 0x9d5490
> rsp = 0x7fffc2279818   rip = 0x033a5e00
> Found by: stack scanning
> 12  libc-2.17.so + 0x17df70
> rsp = 0x7fffc2279820   rip = 0x7f1e61dbaf70
> Found by: stack scanning
> 13  thrift-server-test!_fini + 0xc0c1
> rsp = 0x7fffc2279828   rip = 0x029dca31
> Found by: stack scanning
> 14  thrift-server-test!_fini + 0xdf918
> rsp = 0x7fffc2279840   rip = 0x02ab0288
> Found by: stack scanning
> 15  thrift-server-test!_fini + 0x9d5490
> rsp = 0x7fffc2279850   rip = 0x033a5e00
> Found by: stack scanning
> 16  thrift-server-test!_fini + 0xc0c1
> rsp = 0x7fffc2279860   rip = 0x029dca31
> Found by: stack scanning
> 17  thrift-server-test!_fini + 0x9d5490
> rsp = 0x7fffc2279870   rip = 0x033a5e00
> Found by: stack scanning
> 18  thrift-server-test!_fini + 0xc0c1
> rsp = 0x7fffc2279878   rip = 0x029dca31
> Found by: stack scanning
> 19  thrift-server-test!_fini + 0xdf918
> rsp = 0x7fffc2279880   rip = 0x02ab0288
> Found by: stack scanning
> 20  libc-2.17.so + 0x2e312
> rsp = 0x7fffc2279890   rip = 0x7f1e61c6b312
> Found by: stack scanning
> 21  
> thrift-server-test!boost::shared_array::~shared_array()
>  + 0x70
> rsp = 0x7fffc22798b0   rip = 0x02719b40
> Found by: stack scanning
> 22  
> thrift-server-test!boost::detail::sp_counted_impl_p::dispose()
>  + 0x4f
> rsp = 0x7fffc22798c0   rip = 0x0271e1af
> Found by: stack scanning
> 23  
> thrift-server-test!boost::detail::sp_counted_impl_pd  boost::checked_array_deleter 
> >::dispose() + 0xaa
> rbx = 0x042f7128   rsp = 0x7fffc22798d0
> rip = 

[jira] [Commented] (IMPALA-7351) Add memory estimates for plan nodes and sinks with missing estimates

2018-12-05 Thread Bikramjeet Vig (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16710865#comment-16710865
 ] 

Bikramjeet Vig commented on IMPALA-7351:


Yes, I still have to look at sinks. Will address those soon.

> Add memory estimates for plan nodes and sinks with missing estimates
> 
>
> Key: IMPALA-7351
> URL: https://issues.apache.org/jira/browse/IMPALA-7351
> Project: IMPALA
>  Issue Type: Sub-task
>  Components: Frontend
>Reporter: Tim Armstrong
>Assignee: Bikramjeet Vig
>Priority: Major
>  Labels: admission-control, resource-management
>
> Many plan nodes and sinks, e.g. KuduScanNode, KuduTableSink, ExchangeNode, 
> etc., are missing memory estimates entirely.
> We should add a basic estimate for all these cases based on experiments and 
> data from real workloads. In some cases 0 may be the right estimate (e.g. for 
> streaming nodes like SelectNode that just pass through data) but we should 
> remove TODOs and document the reasoning in those cases.






[jira] [Assigned] (IMPALA-7930) Crash in thrift-server-test

2018-12-05 Thread Lars Volker (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7930?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Volker reassigned IMPALA-7930:
---

Assignee: Lars Volker

> Crash in thrift-server-test
> ---
>
> Key: IMPALA-7930
> URL: https://issues.apache.org/jira/browse/IMPALA-7930
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 3.2.0
>Reporter: Lars Volker
>Assignee: Lars Volker
>Priority: Critical
>  Labels: broken-build, flaky
>
> I've seen a crash in thrift-server-test during an exhaustive test run. 
> Unfortunately the core file indicated that it was written by a directory, 
> which caused the automatic core dump resolution to fail. Here's the resolved 
> minidump:
> {noformat}
> Crash reason:  SIGABRT
> Crash address: 0x7d11d19
> Process uptime: not available
> Thread 0 (crashed)
>  0  libc-2.17.so + 0x351f7
> rax = 0x   rdx = 0x0006
> rcx = 0x   rbx = 0x7f1e65876000
> rsi = 0x1d19   rdi = 0x1d19
> rbp = 0x7f1e61dbde68   rsp = 0x7fffc22796d8
>  r8 = 0x000a1a10r9 = 0xfefefefefeff092d
> r10 = 0x0008   r11 = 0x0202
> r12 = 0x029dca31   r13 = 0x033a5e00
> r14 = 0x   r15 = 0x
> rip = 0x7f1e61c721f7
> Found by: given as instruction pointer in context
>  1  libc-2.17.so + 0x368e8
> rsp = 0x7fffc22796e0   rip = 0x7f1e61c738e8
> Found by: stack scanning
>  2  libc-2.17.so + 0x17df70
> rsp = 0x7fffc2279770   rip = 0x7f1e61dbaf70
> Found by: stack scanning
>  3  thrift-server-test!_fini + 0xdf918
> rsp = 0x7fffc2279778   rip = 0x02ab0288
> Found by: stack scanning
>  4  libc-2.17.so + 0x2fbc3
> rsp = 0x7fffc2279790   rip = 0x7f1e61c6cbc3
> Found by: stack scanning
>  5  
> thrift-server-test!testing::internal::TestEventRepeater::OnTestProgramEnd(testing::UnitTest
>  const&) + 0x55
> rsp = 0x7fffc22797b0   rip = 0x028711f5
> Found by: stack scanning
>  6  libc-2.17.so + 0x17df70
> rbx = 0x   rbp = 0x
> rsp = 0x7fffc22797e0   r12 = 0x
> r13 = 0x0005   rip = 0x7f1e61dbaf70
> Found by: call frame info
>  7  thrift-server-test!_fini + 0xc0c1
> rsp = 0x7fffc22797f0   rip = 0x029dca31
> Found by: stack scanning
>  8  thrift-server-test!_fini + 0x9d5490
> rsp = 0x7fffc22797f8   rip = 0x033a5e00
> Found by: stack scanning
>  9  libc-2.17.so + 0x180e68
> rsp = 0x7fffc2279808   rip = 0x7f1e61dbde68
> Found by: stack scanning
> 10  libc-2.17.so + 0x2e266
> rsp = 0x7fffc2279810   rip = 0x7f1e61c6b266
> Found by: stack scanning
> 11  thrift-server-test!_fini + 0x9d5490
> rsp = 0x7fffc2279818   rip = 0x033a5e00
> Found by: stack scanning
> 12  libc-2.17.so + 0x17df70
> rsp = 0x7fffc2279820   rip = 0x7f1e61dbaf70
> Found by: stack scanning
> 13  thrift-server-test!_fini + 0xc0c1
> rsp = 0x7fffc2279828   rip = 0x029dca31
> Found by: stack scanning
> 14  thrift-server-test!_fini + 0xdf918
> rsp = 0x7fffc2279840   rip = 0x02ab0288
> Found by: stack scanning
> 15  thrift-server-test!_fini + 0x9d5490
> rsp = 0x7fffc2279850   rip = 0x033a5e00
> Found by: stack scanning
> 16  thrift-server-test!_fini + 0xc0c1
> rsp = 0x7fffc2279860   rip = 0x029dca31
> Found by: stack scanning
> 17  thrift-server-test!_fini + 0x9d5490
> rsp = 0x7fffc2279870   rip = 0x033a5e00
> Found by: stack scanning
> 18  thrift-server-test!_fini + 0xc0c1
> rsp = 0x7fffc2279878   rip = 0x029dca31
> Found by: stack scanning
> 19  thrift-server-test!_fini + 0xdf918
> rsp = 0x7fffc2279880   rip = 0x02ab0288
> Found by: stack scanning
> 20  libc-2.17.so + 0x2e312
> rsp = 0x7fffc2279890   rip = 0x7f1e61c6b312
> Found by: stack scanning
> 21  
> thrift-server-test!boost::shared_array::~shared_array()
>  + 0x70
> rsp = 0x7fffc22798b0   rip = 0x02719b40
> Found by: stack scanning
> 22  
> thrift-server-test!boost::detail::sp_counted_impl_p::dispose()
>  + 0x4f
> rsp = 0x7fffc22798c0   rip = 0x0271e1af
> Found by: stack scanning
> 23  
> thrift-server-test!boost::detail::sp_counted_impl_pd  boost::checked_array_deleter 
> >::dispose() + 0xaa
> rbx = 0x042f7128   rsp = 0x7fffc22798d0
> rip = 0x02719cfa
> Found by: call frame info
> 24  
> thrift-server-test!boost::shared_array::~shared_array()
>  + 0x39
> rbx = 0x0436f900   

[jira] [Assigned] (IMPALA-7930) Crash in thrift-server-test

2018-12-05 Thread Lars Volker (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7930?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Volker reassigned IMPALA-7930:
---

Assignee: Thomas Tauber-Marshall  (was: Lars Volker)

> Crash in thrift-server-test
> ---
>
> Key: IMPALA-7930
> URL: https://issues.apache.org/jira/browse/IMPALA-7930
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 3.2.0
>Reporter: Lars Volker
>Assignee: Thomas Tauber-Marshall
>Priority: Critical
>  Labels: broken-build, flaky
>
> I've seen a crash in thrift-server-test during an exhaustive test run. 
> Unfortunately the core file indicated that it was written by a directory, 
> which caused the automatic core dump resolution to fail. Here's the resolved 
> minidump:
> {noformat}
> Crash reason:  SIGABRT
> Crash address: 0x7d11d19
> Process uptime: not available
> Thread 0 (crashed)
>  0  libc-2.17.so + 0x351f7
> rax = 0x   rdx = 0x0006
> rcx = 0x   rbx = 0x7f1e65876000
> rsi = 0x1d19   rdi = 0x1d19
> rbp = 0x7f1e61dbde68   rsp = 0x7fffc22796d8
>  r8 = 0x000a1a10r9 = 0xfefefefefeff092d
> r10 = 0x0008   r11 = 0x0202
> r12 = 0x029dca31   r13 = 0x033a5e00
> r14 = 0x   r15 = 0x
> rip = 0x7f1e61c721f7
> Found by: given as instruction pointer in context
>  1  libc-2.17.so + 0x368e8
> rsp = 0x7fffc22796e0   rip = 0x7f1e61c738e8
> Found by: stack scanning
>  2  libc-2.17.so + 0x17df70
> rsp = 0x7fffc2279770   rip = 0x7f1e61dbaf70
> Found by: stack scanning
>  3  thrift-server-test!_fini + 0xdf918
> rsp = 0x7fffc2279778   rip = 0x02ab0288
> Found by: stack scanning
>  4  libc-2.17.so + 0x2fbc3
> rsp = 0x7fffc2279790   rip = 0x7f1e61c6cbc3
> Found by: stack scanning
>  5  
> thrift-server-test!testing::internal::TestEventRepeater::OnTestProgramEnd(testing::UnitTest
>  const&) + 0x55
> rsp = 0x7fffc22797b0   rip = 0x028711f5
> Found by: stack scanning
>  6  libc-2.17.so + 0x17df70
> rbx = 0x   rbp = 0x
> rsp = 0x7fffc22797e0   r12 = 0x
> r13 = 0x0005   rip = 0x7f1e61dbaf70
> Found by: call frame info
>  7  thrift-server-test!_fini + 0xc0c1
> rsp = 0x7fffc22797f0   rip = 0x029dca31
> Found by: stack scanning
>  8  thrift-server-test!_fini + 0x9d5490
> rsp = 0x7fffc22797f8   rip = 0x033a5e00
> Found by: stack scanning
>  9  libc-2.17.so + 0x180e68
> rsp = 0x7fffc2279808   rip = 0x7f1e61dbde68
> Found by: stack scanning
> 10  libc-2.17.so + 0x2e266
> rsp = 0x7fffc2279810   rip = 0x7f1e61c6b266
> Found by: stack scanning
> 11  thrift-server-test!_fini + 0x9d5490
> rsp = 0x7fffc2279818   rip = 0x033a5e00
> Found by: stack scanning
> 12  libc-2.17.so + 0x17df70
> rsp = 0x7fffc2279820   rip = 0x7f1e61dbaf70
> Found by: stack scanning
> 13  thrift-server-test!_fini + 0xc0c1
> rsp = 0x7fffc2279828   rip = 0x029dca31
> Found by: stack scanning
> 14  thrift-server-test!_fini + 0xdf918
> rsp = 0x7fffc2279840   rip = 0x02ab0288
> Found by: stack scanning
> 15  thrift-server-test!_fini + 0x9d5490
> rsp = 0x7fffc2279850   rip = 0x033a5e00
> Found by: stack scanning
> 16  thrift-server-test!_fini + 0xc0c1
> rsp = 0x7fffc2279860   rip = 0x029dca31
> Found by: stack scanning
> 17  thrift-server-test!_fini + 0x9d5490
> rsp = 0x7fffc2279870   rip = 0x033a5e00
> Found by: stack scanning
> 18  thrift-server-test!_fini + 0xc0c1
> rsp = 0x7fffc2279878   rip = 0x029dca31
> Found by: stack scanning
> 19  thrift-server-test!_fini + 0xdf918
> rsp = 0x7fffc2279880   rip = 0x02ab0288
> Found by: stack scanning
> 20  libc-2.17.so + 0x2e312
> rsp = 0x7fffc2279890   rip = 0x7f1e61c6b312
> Found by: stack scanning
> 21  
> thrift-server-test!boost::shared_array::~shared_array()
>  + 0x70
> rsp = 0x7fffc22798b0   rip = 0x02719b40
> Found by: stack scanning
> 22  
> thrift-server-test!boost::detail::sp_counted_impl_p::dispose()
>  + 0x4f
> rsp = 0x7fffc22798c0   rip = 0x0271e1af
> Found by: stack scanning
> 23  
> thrift-server-test!boost::detail::sp_counted_impl_pd  boost::checked_array_deleter 
> >::dispose() + 0xaa
> rbx = 0x042f7128   rsp = 0x7fffc22798d0
> rip = 0x02719cfa
> Found by: call frame info
> 24  
> thrift-server-test!boost::shared_array::~shared_array()
> 

[jira] [Created] (IMPALA-7930) Crash in thrift-server-test

2018-12-05 Thread Lars Volker (JIRA)
Lars Volker created IMPALA-7930:
---

 Summary: Crash in thrift-server-test
 Key: IMPALA-7930
 URL: https://issues.apache.org/jira/browse/IMPALA-7930
 Project: IMPALA
  Issue Type: Bug
  Components: Backend
Affects Versions: Impala 3.2.0
Reporter: Lars Volker


I've seen a crash in thrift-server-test during an exhaustive test run. 
Unfortunately the core file indicated that it was written by a directory, which 
caused the automatic core dump resolution to fail. Here's the resolved minidump:

{noformat}
Crash reason:  SIGABRT
Crash address: 0x7d11d19
Process uptime: not available

Thread 0 (crashed)
 0  libc-2.17.so + 0x351f7
rax = 0x   rdx = 0x0006
rcx = 0x   rbx = 0x7f1e65876000
rsi = 0x1d19   rdi = 0x1d19
rbp = 0x7f1e61dbde68   rsp = 0x7fffc22796d8
 r8 = 0x000a1a10r9 = 0xfefefefefeff092d
r10 = 0x0008   r11 = 0x0202
r12 = 0x029dca31   r13 = 0x033a5e00
r14 = 0x   r15 = 0x
rip = 0x7f1e61c721f7
Found by: given as instruction pointer in context
 1  libc-2.17.so + 0x368e8
rsp = 0x7fffc22796e0   rip = 0x7f1e61c738e8
Found by: stack scanning
 2  libc-2.17.so + 0x17df70
rsp = 0x7fffc2279770   rip = 0x7f1e61dbaf70
Found by: stack scanning
 3  thrift-server-test!_fini + 0xdf918
rsp = 0x7fffc2279778   rip = 0x02ab0288
Found by: stack scanning
 4  libc-2.17.so + 0x2fbc3
rsp = 0x7fffc2279790   rip = 0x7f1e61c6cbc3
Found by: stack scanning
 5  
thrift-server-test!testing::internal::TestEventRepeater::OnTestProgramEnd(testing::UnitTest
 const&) + 0x55
rsp = 0x7fffc22797b0   rip = 0x028711f5
Found by: stack scanning
 6  libc-2.17.so + 0x17df70
rbx = 0x   rbp = 0x
rsp = 0x7fffc22797e0   r12 = 0x
r13 = 0x0005   rip = 0x7f1e61dbaf70
Found by: call frame info
 7  thrift-server-test!_fini + 0xc0c1
rsp = 0x7fffc22797f0   rip = 0x029dca31
Found by: stack scanning
 8  thrift-server-test!_fini + 0x9d5490
rsp = 0x7fffc22797f8   rip = 0x033a5e00
Found by: stack scanning
 9  libc-2.17.so + 0x180e68
rsp = 0x7fffc2279808   rip = 0x7f1e61dbde68
Found by: stack scanning
10  libc-2.17.so + 0x2e266
rsp = 0x7fffc2279810   rip = 0x7f1e61c6b266
Found by: stack scanning
11  thrift-server-test!_fini + 0x9d5490
rsp = 0x7fffc2279818   rip = 0x033a5e00
Found by: stack scanning
12  libc-2.17.so + 0x17df70
rsp = 0x7fffc2279820   rip = 0x7f1e61dbaf70
Found by: stack scanning
13  thrift-server-test!_fini + 0xc0c1
rsp = 0x7fffc2279828   rip = 0x029dca31
Found by: stack scanning
14  thrift-server-test!_fini + 0xdf918
rsp = 0x7fffc2279840   rip = 0x02ab0288
Found by: stack scanning
15  thrift-server-test!_fini + 0x9d5490
rsp = 0x7fffc2279850   rip = 0x033a5e00
Found by: stack scanning
16  thrift-server-test!_fini + 0xc0c1
rsp = 0x7fffc2279860   rip = 0x029dca31
Found by: stack scanning
17  thrift-server-test!_fini + 0x9d5490
rsp = 0x7fffc2279870   rip = 0x033a5e00
Found by: stack scanning
18  thrift-server-test!_fini + 0xc0c1
rsp = 0x7fffc2279878   rip = 0x029dca31
Found by: stack scanning
19  thrift-server-test!_fini + 0xdf918
rsp = 0x7fffc2279880   rip = 0x02ab0288
Found by: stack scanning
20  libc-2.17.so + 0x2e312
rsp = 0x7fffc2279890   rip = 0x7f1e61c6b312
Found by: stack scanning
21  
thrift-server-test!boost::shared_array::~shared_array()
 + 0x70
rsp = 0x7fffc22798b0   rip = 0x02719b40
Found by: stack scanning
22  
thrift-server-test!boost::detail::sp_counted_impl_p::dispose()
 + 0x4f
rsp = 0x7fffc22798c0   rip = 0x0271e1af
Found by: stack scanning
23  
thrift-server-test!boost::detail::sp_counted_impl_pd >::dispose() 
+ 0xaa
rbx = 0x042f7128   rsp = 0x7fffc22798d0
rip = 0x02719cfa
Found by: call frame info
24  
thrift-server-test!boost::shared_array::~shared_array()
 + 0x39
rbx = 0x0436f900   rbp = 0x
rsp = 0x7fffc2279900   r12 = 0x0001
r13 = 0x04323680   r14 = 0x
rip = 0x02719b09
Found by: call frame info
25  libc-2.17.so + 0x38a69
rbx = 0x   rbp = 0x7f1e61ff96c8
rsp = 0x7fffc2279920   r12 = 0x0001
r13 = 0x04323680   r14 = 0x
rip = 0x7f1e61c75a69
Found by: call frame info
26  thrift-server-test!_GLOBAL__sub_I_json_escaping.cc + 0x2e
rsp = 

[jira] [Commented] (IMPALA-7802) Implement support for closing idle sessions

2018-12-05 Thread Tim Armstrong (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16710806#comment-16710806
 ] 

Tim Armstrong commented on IMPALA-7802:
---

[~zoram] The thing it does do is report a meaningful error message when the
user comes back and tries to use the session - i.e. "Client session expired due 
to more than...".

E.g. a quick google revealed this forum thread where the error message pointed 
the user in the right direction, 
https://community.cloudera.com/t5/Interactive-Short-cycle-SQL/Query-blah-expired-due-to-client-inactivity-timeout-is-10m/m-p/66842.

Maybe the error reporting isn't worth the other hassles though - or maybe we 
just need to set clearer expectations for client behaviour - that they need to 
handle sessions being terminated in this way.

> Implement support for closing idle sessions
> ---
>
> Key: IMPALA-7802
> URL: https://issues.apache.org/jira/browse/IMPALA-7802
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Clients
>Affects Versions: Impala 3.0, Impala 2.12.0
>Reporter: Michael Ho
>Assignee: Zoram Thanga
>Priority: Critical
>  Labels: supportability
>
> Currently, the query option {{idle_session_timeout}} specifies a timeout in 
> seconds after which all running queries of that idle session will be 
> cancelled and no new queries can be issued to it. However, the idle session 
> will remain open and it needs to be closed explicitly. Please see the 
> [documentation|https://www.cloudera.com/documentation/enterprise/latest/topics/impala_idle_session_timeout.html]
>  for details.
> This behavior may be undesirable as each session still consumes an Impala 
> frontend service thread. The number of frontend service threads is bound by 
> the flag {{fe_service_threads}}. So, in a multi-tenant environment, an Impala 
> server can have a lot of idle sessions but they still consume against the 
> quota of {{fe_service_threads}}. If the number of sessions established 
> reaches {{fe_service_threads}}, all new session creations will block until 
> some of the existing sessions exit. There may be no time bound on when these 
> zombie idle sessions will be closed and it's at the mercy of the client 
> implementation to close them. In some sense, leaving many idle sessions open 
> is a way to launch a denial of service attack on Impala.
> To fix this situation, we should have an option to forcefully close a session 
> when it's considered idle so it won't unnecessarily consume the limited 
> number of frontend service threads. cc'ing [~zoram]






[jira] [Comment Edited] (IMPALA-7802) Implement support for closing idle sessions

2018-12-05 Thread Tim Armstrong (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16710806#comment-16710806
 ] 

Tim Armstrong edited comment on IMPALA-7802 at 12/6/18 12:38 AM:
-

[~zoram] The thing it does do is report a meaningful error message when the
user comes back and tries to use the session - i.e. "Client session expired due 
to more than...".

E.g. a quick google revealed this forum thread where the error message pointed 
the user in the right direction, 
https://community.cloudera.com/t5/Interactive-Short-cycle-SQL/Query-blah-expired-due-to-client-inactivity-timeout-is-10m/m-p/66842.

Maybe the error reporting isn't worth the other hassles though - or maybe we 
just need to set clearer expectations for client behaviour - that they need to 
handle sessions being terminated in this way.

Edit: definitely glad that you're pushing on this though, the current state of 
things isn't right.


was (Author: tarmstrong):
[~zoram] The thing it does do is report a meaningful error message when the
user comes back and tries to use the session - i.e. "Client session expired due 
to more than...".

E.g. a quick google revealed this forum thread where the error message pointed 
the user in the right direction, 
https://community.cloudera.com/t5/Interactive-Short-cycle-SQL/Query-blah-expired-due-to-client-inactivity-timeout-is-10m/m-p/66842.

Maybe the error reporting isn't worth the other hassles though - or maybe we 
just need to set clearer expectations for client behaviour - that they need to 
handle sessions being terminated in this way.

> Implement support for closing idle sessions
> ---
>
> Key: IMPALA-7802
> URL: https://issues.apache.org/jira/browse/IMPALA-7802
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Clients
>Affects Versions: Impala 3.0, Impala 2.12.0
>Reporter: Michael Ho
>Assignee: Zoram Thanga
>Priority: Critical
>  Labels: supportability
>
> Currently, the query option {{idle_session_timeout}} specifies a timeout in 
> seconds after which all running queries of that idle session will be 
> cancelled and no new queries can be issued to it. However, the idle session 
> will remain open and it needs to be closed explicitly. Please see the 
> [documentation|https://www.cloudera.com/documentation/enterprise/latest/topics/impala_idle_session_timeout.html]
>  for details.
> This behavior may be undesirable as each session still consumes an Impala 
> frontend service thread. The number of frontend service threads is bound by 
> the flag {{fe_service_threads}}. So, in a multi-tenant environment, an Impala 
> server can have a lot of idle sessions but they still consume against the 
> quota of {{fe_service_threads}}. If the number of sessions established 
> reaches {{fe_service_threads}}, all new session creations will block until 
> some of the existing sessions exit. There may be no time bound on when these 
> zombie idle sessions will be closed and it's at the mercy of the client 
> implementation to close them. In some sense, leaving many idle sessions open 
> is a way to launch a denial of service attack on Impala.
> To fix this situation, we should have an option to forcefully close a session 
> when it's considered idle so it won't unnecessarily consume the limited 
> number of frontend service threads. cc'ing [~zoram]






[jira] [Updated] (IMPALA-5397) Set "End Time" earlier rather than on unregistration.

2018-12-05 Thread Tim Armstrong (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-5397?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Armstrong updated IMPALA-5397:
--
Labels: admission-control query-lifecycle  (was: query-lifecycle)

> Set "End Time" earlier rather than on unregistration.
> -
>
> Key: IMPALA-5397
> URL: https://issues.apache.org/jira/browse/IMPALA-5397
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 2.9.0
>Reporter: Mostafa Mokhtar
>Priority: Major
>  Labels: admission-control, query-lifecycle
>
> When queries are executed from Hue and hit the idle query timeout then the 
> query duration keeps going up even though the query was cancelled and it is 
> not actually doing any more work. The end time is only set when the query is 
> actually unregistered.
> Queries below finished in 1s640ms while the reported time is much longer. 
> |User||Default Db||Statement||Query Type||Start Time||Waiting 
> Time||Duration||Scan Progress||State||Last Event||# rows fetched||Resource 
> Pool||Details||Action|
> |hue/va1026.halxg.cloudera@halxg.cloudera.com|tpcds_1000_parquet|select 
> count(*) from tpcds_1000_parquet.inventory|QUERY|2017-05-31 
> 09:38:20.472804000|4m27s|4m32s|261 / 261 ( 100%)|FINISHED|First row 
> fetched|1|root.default|Details|Close|
> |hue/va1026.halxg.cloudera@halxg.cloudera.com|tpcds_1000_parquet|select 
> count(*) from tpcds_1000_parquet.inventory|QUERY|2017-05-31 
> 08:38:52.780237000|2017-05-31 09:38:20.289582000|59m27s|261 / 261 ( 
> 100%)|FINISHED|1|root.default|Details|






[jira] [Commented] (IMPALA-5397) Set "End Time" earlier rather than on unregistration.

2018-12-05 Thread Tim Armstrong (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-5397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16710781#comment-16710781
 ] 

Tim Armstrong commented on IMPALA-5397:
---

I think we should adopt a definition of "End Time" that more closely aligns with
intuition, i.e. when the "real work" of the operation has completed.
* For queries, where execution proceeds concurrently with results being 
fetched, End Time should be set when admission resources are released (when the 
query is cancelled, or all results are fetched). 
* For other operations, End Time should be set when the operation enters the 
FINISHED state.

> Set "End Time" earlier rather than on unregistration.
> -
>
> Key: IMPALA-5397
> URL: https://issues.apache.org/jira/browse/IMPALA-5397
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 2.9.0
>Reporter: Mostafa Mokhtar
>Priority: Major
>  Labels: query-lifecycle
>
> When queries are executed from Hue and hit the idle query timeout then the 
> query duration keeps going up even though the query was cancelled and it is 
> not actually doing any more work. The end time is only set when the query is 
> actually unregistered.
> Queries below finished in 1s640ms while the reported time is much longer. 
> |User||Default Db||Statement||Query Type||Start Time||Waiting 
> Time||Duration||Scan Progress||State||Last Event||# rows fetched||Resource 
> Pool||Details||Action|
> |hue/va1026.halxg.cloudera@halxg.cloudera.com|tpcds_1000_parquet|select 
> count(*) from tpcds_1000_parquet.inventory|QUERY|2017-05-31 
> 09:38:20.472804000|4m27s|4m32s|261 / 261 ( 100%)|FINISHED|First row 
> fetched|1|root.default|Details|Close|
> |hue/va1026.halxg.cloudera@halxg.cloudera.com|tpcds_1000_parquet|select 
> count(*) from tpcds_1000_parquet.inventory|QUERY|2017-05-31 
> 08:38:52.780237000|2017-05-31 09:38:20.289582000|59m27s|261 / 261 ( 
> 100%)|FINISHED|1|root.default|Details|






[jira] [Updated] (IMPALA-5397) Set "End Time" earlier rather than on unregistration.

2018-12-05 Thread Tim Armstrong (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-5397?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Armstrong updated IMPALA-5397:
--
Summary: Set "End Time" earlier rather than on unregistration.  (was: 
Queries/sessions that are left idle after executing a query report incorrect 
duration )

> Set "End Time" earlier rather than on unregistration.
> -
>
> Key: IMPALA-5397
> URL: https://issues.apache.org/jira/browse/IMPALA-5397
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 2.9.0
>Reporter: Mostafa Mokhtar
>Priority: Major
>  Labels: query-lifecycle
>
> When queries are executed from Hue and hit the idle query timeout then the 
> query duration keeps going up even though the query was cancelled and it is 
> not actually doing any more work. The end time is only set when the query is 
> actually unregistered.
> Queries below finished in 1s640ms while the reported time is much longer. 
> |User||Default Db||Statement||Query Type||Start Time||Waiting 
> Time||Duration||Scan Progress||State||Last Event||# rows fetched||Resource 
> Pool||Details||Action|
> |hue/va1026.halxg.cloudera@halxg.cloudera.com|tpcds_1000_parquet|select 
> count(*) from tpcds_1000_parquet.inventory|QUERY|2017-05-31 
> 09:38:20.472804000|4m27s|4m32s|261 / 261 ( 100%)|FINISHED|First row 
> fetched|1|root.default|Details|Close|
> |hue/va1026.halxg.cloudera@halxg.cloudera.com|tpcds_1000_parquet|select 
> count(*) from tpcds_1000_parquet.inventory|QUERY|2017-05-31 
> 08:38:52.780237000|2017-05-31 09:38:20.289582000|59m27s|261 / 261 ( 
> 100%)|FINISHED|1|root.default|Details|



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Updated] (IMPALA-5397) Queries/sessions that are left idle after executing a query report incorrect duration

2018-12-05 Thread Tim Armstrong (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-5397?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Armstrong updated IMPALA-5397:
--
Target Version: Impala 3.2.0
   Description: 
When queries are executed from Hue and hit the idle query timeout then the 
query duration keeps going up even though the query was cancelled and it is not 
actually doing any more work. The end time is only set when the query is 
actually unregistered.

Queries below finished in 1s640ms while the reported time is much longer. 

|User||Default Db||Statement||Query Type||Start Time||Waiting 
Time||Duration||Scan Progress||State||Last Event||# rows fetched||Resource 
Pool||Details||Action|
|hue/va1026.halxg.cloudera@halxg.cloudera.com|tpcds_1000_parquet|select 
count(*) from tpcds_1000_parquet.inventory|QUERY|2017-05-31 
09:38:20.472804000|4m27s|4m32s|261 / 261 ( 100%)|FINISHED|First row 
fetched|1|root.default|Details|Close|
|hue/va1026.halxg.cloudera@halxg.cloudera.com|tpcds_1000_parquet|select 
count(*) from tpcds_1000_parquet.inventory|QUERY|2017-05-31 
08:38:52.780237000|2017-05-31 09:38:20.289582000|59m27s|261 / 261 ( 
100%)|FINISHED|1|root.default|Details|

  was:
When queries are executed from Hue then the session is left idle and incorrect 
query duration is reported. 
As the session is left alive the query duration keeps going up even though the 
query stats is FINISHED.

Queries below finished in 1s640ms while the reported time is much longer. 

|User||Default Db||Statement||Query Type||Start Time||Waiting 
Time||Duration||Scan Progress||State||Last Event||# rows fetched||Resource 
Pool||Details||Action|
|hue/va1026.halxg.cloudera@halxg.cloudera.com|tpcds_1000_parquet|select 
count(*) from tpcds_1000_parquet.inventory|QUERY|2017-05-31 
09:38:20.472804000|4m27s|4m32s|261 / 261 ( 100%)|FINISHED|First row 
fetched|1|root.default|Details|Close|
|hue/va1026.halxg.cloudera@halxg.cloudera.com|tpcds_1000_parquet|select 
count(*) from tpcds_1000_parquet.inventory|QUERY|2017-05-31 
08:38:52.780237000|2017-05-31 09:38:20.289582000|59m27s|261 / 261 ( 
100%)|FINISHED|1|root.default|Details|


> Queries/sessions that are left idle after executing a query report incorrect 
> duration 
> --
>
> Key: IMPALA-5397
> URL: https://issues.apache.org/jira/browse/IMPALA-5397
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 2.9.0
>Reporter: Mostafa Mokhtar
>Priority: Major
>  Labels: query-lifecycle
>
> When queries are executed from Hue and hit the idle query timeout then the 
> query duration keeps going up even though the query was cancelled and it is 
> not actually doing any more work. The end time is only set when the query is 
> actually unregistered.
> Queries below finished in 1s640ms while the reported time is much longer. 
> |User||Default Db||Statement||Query Type||Start Time||Waiting 
> Time||Duration||Scan Progress||State||Last Event||# rows fetched||Resource 
> Pool||Details||Action|
> |hue/va1026.halxg.cloudera@halxg.cloudera.com|tpcds_1000_parquet|select 
> count(*) from tpcds_1000_parquet.inventory|QUERY|2017-05-31 
> 09:38:20.472804000|4m27s|4m32s|261 / 261 ( 100%)|FINISHED|First row 
> fetched|1|root.default|Details|Close|
> |hue/va1026.halxg.cloudera@halxg.cloudera.com|tpcds_1000_parquet|select 
> count(*) from tpcds_1000_parquet.inventory|QUERY|2017-05-31 
> 08:38:52.780237000|2017-05-31 09:38:20.289582000|59m27s|261 / 261 ( 
> 100%)|FINISHED|1|root.default|Details|



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Resolved] (IMPALA-5958) Remove duplication of 'yarn-extras' AllocationFileLoaderService

2018-12-05 Thread Tim Armstrong (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-5958?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Armstrong resolved IMPALA-5958.
---
Resolution: Later

I don't think the code cleanup is worth tracking as an open JIRA.

> Remove duplication of 'yarn-extras' AllocationFileLoaderService
> ---
>
> Key: IMPALA-5958
> URL: https://issues.apache.org/jira/browse/IMPALA-5958
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Frontend
>Affects Versions: Impala 2.11.0
>Reporter: Matthew Jacobs
>Priority: Trivial
>  Labels: admission-control, ramp-up
>
> In IMPALA-5920, some Yarn code that is used by Impala admission control is 
> brought into the Impala codebase.
> In the code review, [~zamsden] pointed out that the 
> AllocationFileLoaderService thread for monitoring the fair-scheduler.xml file 
> for changes could be removed if the RequestPoolService used the 
> impala.util.FileWatcherService. See 
> https://gerrit.cloudera.org/#/c/8035/4/common/yarn-extras/src/main/java/org/apache/impala/yarn/server/resourcemanager/scheduler/fair/AllocationFileLoaderService.java@103
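
For context, a minimal sketch of watching fair-scheduler.xml for changes without a dedicated loader thread, using the JDK's standard WatchService rather than Impala's actual impala.util.FileWatcherService (whose API may differ); the config path and the reload hook named in the comment are assumptions:

{code}
import java.nio.file.FileSystems;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardWatchEventKinds;
import java.nio.file.WatchEvent;
import java.nio.file.WatchKey;
import java.nio.file.WatchService;

// Minimal sketch: reload allocation config when fair-scheduler.xml changes,
// instead of running a separate AllocationFileLoaderService polling thread.
public class FairSchedulerFileWatcher {
  public static void main(String[] args) throws Exception {
    Path configFile = Paths.get("/etc/impala/conf/fair-scheduler.xml");  // assumed path
    Path dir = configFile.getParent();

    WatchService watcher = FileSystems.getDefault().newWatchService();
    dir.register(watcher, StandardWatchEventKinds.ENTRY_MODIFY);

    while (true) {
      WatchKey key = watcher.take();  // blocks until an event arrives
      for (WatchEvent<?> event : key.pollEvents()) {
        Path changed = (Path) event.context();
        if (changed.equals(configFile.getFileName())) {
          System.out.println("fair-scheduler.xml changed; reloading pool config");
          // Something like RequestPoolService.reload() would be invoked here.
        }
      }
      if (!key.reset()) break;  // watched directory no longer accessible
    }
  }
}
{code}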



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Updated] (IMPALA-7312) Non-blocking mode for Fetch() RPC

2018-12-05 Thread Tim Armstrong (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7312?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Armstrong updated IMPALA-7312:
--
Description: Currently Fetch() can block for an arbitrary amount of time 
until a batch of rows is produced. It might be helpful to have a mode where it 
returns quickly when there is no data available, so that threads and RPC slots 
are not tied up.  (was: Currently Fetch() can block for an arbitrary amount of 
time until a batch of rows is produced. It might be helpful to have a mode 
where it returns quickly when there is no data available, that that threads and 
RPC slots are not tied up.)
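
A hedged sketch of how a client might drive such a non-blocking mode (the fetch API shown is hypothetical and does not match the actual HiveServer2/Beeswax Thrift interfaces):

{code}
import java.util.List;

// Hypothetical non-blocking fetch loop: when no rows are ready the call returns
// an empty batch immediately, so the client can back off without tying up an
// RPC slot or a server thread for the whole wait.
public class NonBlockingFetchExample {
  interface QueryHandle {
    // Assumed API: returns up to maxRows rows, or an empty list if none are
    // ready yet; hasMore() reports whether the query can still produce rows.
    List<String[]> fetch(int maxRows);
    boolean hasMore();
  }

  static void drainResults(QueryHandle handle) throws InterruptedException {
    long backoffMs = 10;
    while (handle.hasMore()) {
      List<String[]> batch = handle.fetch(1024);
      if (batch.isEmpty()) {
        // Nothing ready: sleep with capped exponential backoff and retry.
        Thread.sleep(backoffMs);
        backoffMs = Math.min(backoffMs * 2, 1000);
      } else {
        backoffMs = 10;
        for (String[] row : batch) {
          System.out.println(String.join("\t", row));
        }
      }
    }
  }
}
{code}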

> Non-blocking mode for Fetch() RPC
> -
>
> Key: IMPALA-7312
> URL: https://issues.apache.org/jira/browse/IMPALA-7312
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Clients
>Reporter: Tim Armstrong
>Priority: Major
>  Labels: resource-management
>
> Currently Fetch() can block for an arbitrary amount of time until a batch of 
> rows is produced. It might be helpful to have a mode where it returns quickly 
> when there is no data available, so that threads and RPC slots are not tied 
> up.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Updated] (IMPALA-7312) Non-blocking mode for Fetch() RPC

2018-12-05 Thread Tim Armstrong (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7312?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Armstrong updated IMPALA-7312:
--
Target Version: Impala 3.2.0

> Non-blocking mode for Fetch() RPC
> -
>
> Key: IMPALA-7312
> URL: https://issues.apache.org/jira/browse/IMPALA-7312
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Clients
>Reporter: Tim Armstrong
>Priority: Major
>  Labels: resource-management
>
> Currently Fetch() can block for an arbitrary amount of time until a batch of 
> rows is produced. It might be helpful to have a mode where it returns quickly 
> when there is no data available, so that threads and RPC slots are not tied 
> up.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Updated] (IMPALA-7672) Play nice with load balancers when shutting down coordinator

2018-12-05 Thread Tim Armstrong (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7672?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Armstrong updated IMPALA-7672:
--
Target Version: Impala 3.2.0  (was: Product Backlog)

> Play nice with load balancers when shutting down coordinator
> 
>
> Key: IMPALA-7672
> URL: https://issues.apache.org/jira/browse/IMPALA-7672
> Project: IMPALA
>  Issue Type: Sub-task
>  Components: Distributed Exec
>Reporter: Tim Armstrong
>Priority: Major
>  Labels: resource-management
>
> This is a placeholder to figure out what we need to do to get load balancers 
> like HAProxy and F5 to cleanly switch to alternative coordinators when we do 
> a graceful shutdown. E.g. do we need to stop accepting new TCP connections?



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Updated] (IMPALA-7814) AggregationNode's memory estimate should be based on NDV only for non-grouping aggs

2018-12-05 Thread Tim Armstrong (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7814?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Armstrong updated IMPALA-7814:
--
Labels: resource-management  (was: )

> AggregationNode's memory estimate should be based on NDV only for 
> non-grouping aggs 
> 
>
> Key: IMPALA-7814
> URL: https://issues.apache.org/jira/browse/IMPALA-7814
> Project: IMPALA
>  Issue Type: Sub-task
>  Components: Frontend
>Reporter: Pooja Nilangekar
>Assignee: Pooja Nilangekar
>Priority: Major
>  Labels: resource-management
>
> Currently, the AggregationNode always computes the NDV to estimate the number 
> of rows. However, for grouping aggregates, the entire input has to be 
> consumed before the output can be produced, hence its memory estimate should 
> not consider the NDV.  This is acceptable for non-grouping aggregates because 
> it only needs to store the value expression during the build phase, instead of 
> the entire tuple. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Updated] (IMPALA-7814) AggregationNode's memory estimate should be based on NDV only for non-grouping aggs

2018-12-05 Thread Tim Armstrong (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7814?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Armstrong updated IMPALA-7814:
--
Component/s: Frontend

> AggregationNode's memory estimate should be based on NDV only for 
> non-grouping aggs 
> 
>
> Key: IMPALA-7814
> URL: https://issues.apache.org/jira/browse/IMPALA-7814
> Project: IMPALA
>  Issue Type: Sub-task
>  Components: Frontend
>Reporter: Pooja Nilangekar
>Assignee: Pooja Nilangekar
>Priority: Major
>  Labels: resource-management
>
> Currently, the AggregationNode always computes the NDV to estimate the number 
> of rows. However, for grouping aggregates, the entire input has to be 
> consumed before the output can be produced, hence its memory estimate should 
> not consider the NDV.  This is acceptable for non-grouping aggregates because 
> it only needs to store the value expression during the build phase, instead of 
> the entire tuple. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Commented] (IMPALA-7351) Add memory estimates for plan nodes and sinks with missing estimates

2018-12-05 Thread Tim Armstrong (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16710758#comment-16710758
 ] 

Tim Armstrong commented on IMPALA-7351:
---

[~bikramjeet.vig] is there anything left to do here? I guess some of the sinks 
still have TODOs.

> Add memory estimates for plan nodes and sinks with missing estimates
> 
>
> Key: IMPALA-7351
> URL: https://issues.apache.org/jira/browse/IMPALA-7351
> Project: IMPALA
>  Issue Type: Sub-task
>  Components: Frontend
>Reporter: Tim Armstrong
>Assignee: Bikramjeet Vig
>Priority: Major
>  Labels: admission-control, resource-management
>
> Many plan nodes and sinks, e.g. KuduScanNode, KuduTableSink, ExchangeNode, 
> etc are missing memory estimates entirely. 
> We should add a basic estimate for all these cases based on experiments and 
> data from real workloads. In some cases 0 may be the right estimate (e.g. for 
> streaming nodes like SelectNode that just pass through data) but we should 
> remove TODOs and document the reasoning in those cases.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Updated] (IMPALA-7389) Admission control should set aside less memory on dedicated coordinator if coordinator fragment is lightweight

2018-12-05 Thread Tim Armstrong (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7389?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Armstrong updated IMPALA-7389:
--
Target Version: Impala 3.2.0  (was: Product Backlog)

> Admission control should set aside less memory on dedicated coordinator if 
> coordinator fragment is lightweight
> --
>
> Key: IMPALA-7389
> URL: https://issues.apache.org/jira/browse/IMPALA-7389
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Distributed Exec
>Affects Versions: Impala 3.0, Impala 2.12.0
>Reporter: Tim Armstrong
>Priority: Major
>  Labels: admission-control, resource-management
>
> The current admission control treats all backends symmetrically and sets 
> aside the mem_limit. This makes sense for now given that we have the same 
> mem_limit setting for all backends. 
> One case where this could be somewhat problematic is if you have dedicated 
> coordinators with less memory than the executors, because the coordinator's 
> process memory limit will be fully admitted before the executors.
> If you have multiple coordinators and queries are distributed between them 
> this is relatively unlikely to become a problem. If you have a single 
> coordinator this is more of an issue.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Updated] (IMPALA-6032) Configuration knobs to automatically reject and fail queries

2018-12-05 Thread Tim Armstrong (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-6032?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Armstrong updated IMPALA-6032:
--
Priority: Minor  (was: Major)

> Configuration knobs to automatically reject and fail queries
> 
>
> Key: IMPALA-6032
> URL: https://issues.apache.org/jira/browse/IMPALA-6032
> Project: IMPALA
>  Issue Type: New Feature
>  Components: Distributed Exec
>Reporter: Mostafa Mokhtar
>Priority: Minor
>  Labels: admission-control, resource-management
>
> Umbrella JIRA for Admission control enhancements.
> Query options would be set on a resource pool basis. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Resolved] (IMPALA-5013) Re-evaluate our approach to per-operator memory estimates

2018-12-05 Thread Tim Armstrong (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-5013?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Armstrong resolved IMPALA-5013.
---
   Resolution: Done
Fix Version/s: Not Applicable

With IMPALA-7349 the estimate is a guess at the "ideal" memory required to 
execute the query with full performance.

> Re-evaluate our approach to per-operator memory estimates
> -
>
> Key: IMPALA-5013
> URL: https://issues.apache.org/jira/browse/IMPALA-5013
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Frontend
>Affects Versions: Impala 2.8.0
>Reporter: Tim Armstrong
>Priority: Major
>  Labels: resource-management
> Fix For: Not Applicable
>
>
> The way that memory estimates are computed for PlanNodes and Sinks are ad-hoc 
> and in some cases much less accurate than they could be. We should clarify 
> what the memory estimates mean, how they should be computed and then 
> systematically fix them.
> In general it's difficult to produce accurate memory estimates, because it 
> depends on having accurate estimates of cardinality and other runtime 
> parameters, so this JIRA isn't meant to guarantee any specific level of 
> accuracy of estimates, just to generally improve the estimates and clarify 
> what they mean and how they should be calculated
> We should also consider deprecating or removing these estimates, unless they 
> are useful for computing "ideal" memory in IMPALA-3706.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Updated] (IMPALA-5043) Flag when Impala daemon is disconnected from statestore

2018-12-05 Thread Tim Armstrong (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-5043?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Armstrong updated IMPALA-5043:
--
Target Version: Impala 3.2.0
   Summary: Flag when Impala daemon is disconnected from statestore  
(was: When daemons are disconnected from the Statestore they can show incorrect 
admission control limits)

> Flag when Impala daemon is disconnected from statestore
> ---
>
> Key: IMPALA-5043
> URL: https://issues.apache.org/jira/browse/IMPALA-5043
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Backend
>Affects Versions: Impala 2.6.0
>Reporter: Thomas Scott
>Priority: Major
>  Labels: admission-control, resource-management, supportability
>
> When (for whatever reason) one or more daemons are disconnected from the 
> statestore the admission control data held on the daemon goes stale. This can 
> lead to the daemon accepting queries when there is no capacity or rejecting 
> queries when there is capacity. 
> For example, a pool somepool has a limit of 10 concurrent queries and is at 
> that limit when a daemon is disconnected from the statestore. Even when other 
> queries in somepool finish and the pool is now empty the disconnected daemon 
> will report the following when new queries are executed:
> ERROR: Admission for query exceeded timeout 6ms. Queued reason: number of 
> running queries 10 is over limit 10
> Could we have some warning to say that the admission control data is stale 
> here?



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Updated] (IMPALA-5063) Enable monitoring of Admission Control queue information

2018-12-05 Thread Tim Armstrong (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-5063?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Armstrong updated IMPALA-5063:
--
Target Version: Impala 3.2.0

> Enable monitoring of Admission Control queue information
> 
>
> Key: IMPALA-5063
> URL: https://issues.apache.org/jira/browse/IMPALA-5063
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Frontend
>Affects Versions: Impala 2.9.0
>Reporter: Miklos Szurap
>Priority: Major
>  Labels: admission-control, resource-management, supportability
>
> It would be nice if we could track the Admission Control / queue information 
> from the StateStore WebUI. 
> The topics page just shows a summary about "impala-request-queue" but nothing 
> on the details of the queues / number of queries / mem usage.
> Besides showing this on the WebUI, it would be nice to have it logged, so 
> there would be some kind of historical view. 
> These would enable to track issues when a query is rejected due to admission 
> control.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Assigned] (IMPALA-7929) Impala query on HBASE table failing with InternalException: Required field*

2018-12-05 Thread Yongjun Zhang (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7929?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yongjun Zhang reassigned IMPALA-7929:
-

Assignee: Yongjun Zhang

> Impala query on HBASE table failing with InternalException: Required field*
> ---
>
> Key: IMPALA-7929
> URL: https://issues.apache.org/jira/browse/IMPALA-7929
> Project: IMPALA
>  Issue Type: Bug
>Reporter: Yongjun Zhang
>Assignee: Yongjun Zhang
>Priority: Major
>
> This looks like a corner-case bug at the Impala-HBase boundary.
> The way to reproduce:
> Create a table in hive shell,
> {code}
> create database abc;
> CREATE TABLE abc.test_hbase1 (k STRING, c STRING)
> STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
> WITH SERDEPROPERTIES ('hbase.columns.mapping'=':key,cf:c', 'serialization.format'='1')
> TBLPROPERTIES ('hbase.table.name'='test_hbase1', 'storage_handler'='org.apache.hadoop.hive.hbase.HBaseStorageHandler');
> {code}
> Then issue query at impala shell:
> {code}
> select * from abc.test_hbase1 where k != "row1"; 
> {code}
> Observe:
> {code}
> Query: select * from abc.test_hbase1 where k != "row1"
>  
> Query submitted at: 2018-12-04 17:02:42 (Coordinator: http://xyz:25000)
> ERROR: InternalException: Required field 'qualifier' was not present! Struct: 
> THBaseFilter(family::key, qualifier:null, op_ordinal:3, filter_constant:row1)
> {code}
> More observations:
> # Replacing {{k != "row1"}} with {{k <> "row1"}} fails the same way. However, 
> replacing it with other operators, such as ">", "<", "=", all work.
> # Replacing {{k != "row1"}} with {{c != "row1"}} succeeds without the error 
> reported above.
> The above example uses a two-column table; creating a similar table with three 
> columns fails the same way: an inequality predicate on the first column fails, 
> while an inequality predicate on another column does not.
> The code that issues the error message is in HBase; it seems Impala did not 
> pass the needed info to HBase in this special case. It may also matter that the 
> first column of the table is the HBase row key, which could be what exposes the 
> bug.
> {code}
> hbase-thrift/src/main/java/org/apache/hadoop/hbase/thrift2/generated/TColumnIncrement.java:
>   throw new org.apache.thrift.protocol.TProtocolException("Required field 
> 'qualifier' was not present! Struct: " + toString());
> hbase-thrift/src/main/java/org/apache/hadoop/hbase/thrift2/generated/TColumnValue.java:
>   throw new org.apache.thrift.protocol.TProtocolException("Required field 
> 'qualifier' was not present! Struct: " + toString());
> hbase-thrift/src/main/java/org/apache/hadoop/hbase/thrift2/generated/THBaseService.java:
> throw new org.apache.thrift.protocol.TProtocolException("Required 
> field 'qualifier' was not present! Struct: " + toString());
> hbase-thrift/src/main/java/org/apache/hadoop/hbase/thrift2/generated/THBaseService.java:
> throw new org.apache.thrift.protocol.TProtocolException("Required 
> field 'qualifier' was not present! Struct: " + toString());
> hbase-thrift/src/main/java/org/apache/hadoop/hbase/thrift2/generated/THBaseService.java:
> throw new org.apache.thrift.protocol.TProtocolException("Required 
> field 'qualifier' was not present! Struct: " + toString());
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Created] (IMPALA-7929) Impala query on HBASE table failing with InternalException: Required field*

2018-12-05 Thread Yongjun Zhang (JIRA)
Yongjun Zhang created IMPALA-7929:
-

 Summary: Impala query on HBASE table failing with 
InternalException: Required field*
 Key: IMPALA-7929
 URL: https://issues.apache.org/jira/browse/IMPALA-7929
 Project: IMPALA
  Issue Type: Bug
Reporter: Yongjun Zhang


This looks like a corner-case bug at the Impala-HBase boundary.

The way to reproduce:

Create a table in hive shell,

{code}
create database abc;

CREATE TABLE abc.test_hbase1 (k STRING, c STRING)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ('hbase.columns.mapping'=':key,cf:c', 'serialization.format'='1')
TBLPROPERTIES ('hbase.table.name'='test_hbase1', 'storage_handler'='org.apache.hadoop.hive.hbase.HBaseStorageHandler');

{code}

Then issue query at impala shell:
{code}
select * from abc.test_hbase1 where k != "row1"; 
{code}

Observe:
{code}
Query: select * from abc.test_hbase1 where k != "row1"  
   
Query submitted at: 2018-12-04 17:02:42 (Coordinator: http://xyz:25000)
ERROR: InternalException: Required field 'qualifier' was not present! Struct: 
THBaseFilter(family::key, qualifier:null, op_ordinal:3, filter_constant:row1)
{code}

More observations:

# Replacing {{k != "row1"}} with {{k <> "row1"}} fails the same way. However, 
replacing it with other operators, such as ">", "<", "=", all work.
# Replacing {{k != "row1"}} with {{c != "row1"}} succeeds without the error 
reported above.

The above example uses a two-column table; creating a similar table with three 
columns fails the same way: an inequality predicate on the first column fails, 
while an inequality predicate on another column does not.

The code that issues the error message is in HBase; it seems Impala did not 
pass the needed info to HBase in this special case. It may also matter that the 
first column of the table is the HBase row key, which could be what exposes the 
bug.

{code}
hbase-thrift/src/main/java/org/apache/hadoop/hbase/thrift2/generated/TColumnIncrement.java:
  throw new org.apache.thrift.protocol.TProtocolException("Required field 
'qualifier' was not present! Struct: " + toString());
hbase-thrift/src/main/java/org/apache/hadoop/hbase/thrift2/generated/TColumnValue.java:
  throw new org.apache.thrift.protocol.TProtocolException("Required field 
'qualifier' was not present! Struct: " + toString());
hbase-thrift/src/main/java/org/apache/hadoop/hbase/thrift2/generated/THBaseService.java:
throw new org.apache.thrift.protocol.TProtocolException("Required field 
'qualifier' was not present! Struct: " + toString());
hbase-thrift/src/main/java/org/apache/hadoop/hbase/thrift2/generated/THBaseService.java:
throw new org.apache.thrift.protocol.TProtocolException("Required field 
'qualifier' was not present! Struct: " + toString());
hbase-thrift/src/main/java/org/apache/hadoop/hbase/thrift2/generated/THBaseService.java:
throw new org.apache.thrift.protocol.TProtocolException("Required field 
'qualifier' was not present! Struct: " + toString());
{code}







--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Commented] (IMPALA-7802) Implement support for closing idle sessions

2018-12-05 Thread Zoram Thanga (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16710704#comment-16710704
 ] 

Zoram Thanga commented on IMPALA-7802:
--

The documentation states that:

{quote}
Once a session is expired, you cannot issue any new query requests to it. The 
session remains open, but the only operation you can perform is to close it.
{quote}

This basically says that an expired session serves no useful purpose to anyone: 
not to Impala, since it still consumes one of the fe_service_threads, and not to 
the client, because the only operation allowed on it is to close it.

I would like to change the session expiry code to always force-close expired 
sessions from the server side by calling ImpalaServer::CloseSessionInternal() 
or a modified version of it. 
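
A rough sketch of the proposed behaviour, written in Java purely for illustration; the real logic lives in the C++ ImpalaServer (which would reuse ImpalaServer::CloseSessionInternal()), and the class and field names below are hypothetical:

{code}
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical illustration of force-closing idle sessions instead of merely
// marking them expired, so they stop consuming frontend service threads.
public class IdleSessionReaper implements Runnable {
  static class Session {
    volatile long lastAccessedMs;
    final long idleTimeoutMs;
    Session(long idleTimeoutMs) {
      this.idleTimeoutMs = idleTimeoutMs;
      this.lastAccessedMs = System.currentTimeMillis();
    }
  }

  private final Map<String, Session> sessions = new ConcurrentHashMap<>();

  @Override
  public void run() {
    long now = System.currentTimeMillis();
    for (Map.Entry<String, Session> e : sessions.entrySet()) {
      Session s = e.getValue();
      if (s.idleTimeoutMs > 0 && now - s.lastAccessedMs > s.idleTimeoutMs) {
        // Previously the session would only be flagged as expired; here it is
        // closed outright (cancelling any in-flight queries in the real server).
        closeSessionInternal(e.getKey());
      }
    }
  }

  private void closeSessionInternal(String sessionId) {
    Session removed = sessions.remove(sessionId);
    if (removed != null) {
      System.out.println("Force-closed idle session " + sessionId);
    }
  }
}
{code}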

> Implement support for closing idle sessions
> ---
>
> Key: IMPALA-7802
> URL: https://issues.apache.org/jira/browse/IMPALA-7802
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Clients
>Affects Versions: Impala 3.0, Impala 2.12.0
>Reporter: Michael Ho
>Assignee: Zoram Thanga
>Priority: Critical
>  Labels: supportability
>
> Currently, the query option {{idle_session_timeout}} specifies a timeout in 
> seconds after which all running queries of that idle session will be 
> cancelled and no new queries can be issued to it. However, the idle session 
> will remain open and it needs to be closed explicitly. Please see the 
> [documentation|https://www.cloudera.com/documentation/enterprise/latest/topics/impala_idle_session_timeout.html]
>  for details.
> This behavior may be undesirable as each session still consumes an Impala 
> frontend service thread. The number of frontend service threads is bound by 
> the flag {{fe_service_threads}}. So, in a multi-tenant environment, an Impala 
> server can have a lot of idle sessions but they still consume against the 
> quota of {{fe_service_threads}}. If the number of sessions established 
> reaches {{fe_service_threads}}, all new session creations will block until 
> some of the existing sessions exit. There may be no time bound on when these 
> zombie idle sessions will be closed and it's at the mercy of the client 
> implementation to close them. In some sense, leaving many idle sessions open 
> is a way to launch a denial of service attack on Impala.
> To fix this situation, we should have an option to forcefully close a session 
> when it's considered idle so it won't unnecessarily consume the limited 
> number of frontend service threads. cc'ing [~zoram]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Work started] (IMPALA-7802) Implement support for closing idle sessions

2018-12-05 Thread Zoram Thanga (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7802?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on IMPALA-7802 started by Zoram Thanga.

> Implement support for closing idle sessions
> ---
>
> Key: IMPALA-7802
> URL: https://issues.apache.org/jira/browse/IMPALA-7802
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Clients
>Affects Versions: Impala 3.0, Impala 2.12.0
>Reporter: Michael Ho
>Assignee: Zoram Thanga
>Priority: Critical
>  Labels: supportability
>
> Currently, the query option {{idle_session_timeout}} specifies a timeout in 
> seconds after which all running queries of that idle session will be 
> cancelled and no new queries can be issued to it. However, the idle session 
> will remain open and it needs to be closed explicitly. Please see the 
> [documentation|https://www.cloudera.com/documentation/enterprise/latest/topics/impala_idle_session_timeout.html]
>  for details.
> This behavior may be undesirable as each session still consumes an Impala 
> frontend service thread. The number of frontend service threads is bound by 
> the flag {{fe_service_threads}}. So, in a multi-tenant environment, an Impala 
> server can have a lot of idle sessions but they still consume against the 
> quota of {{fe_service_threads}}. If the number of sessions established 
> reaches {{fe_service_threads}}, all new session creations will block until 
> some of the existing sessions exit. There may be no time bound on when these 
> zombie idle sessions will be closed and it's at the mercy of the client 
> implementation to close them. In some sense, leaving many idle sessions open 
> is a way to launch a denial of service attack on Impala.
> To fix this situation, we should have an option to forcefully close a session 
> when it's considered idle so it won't unnecessarily consume the limited 
> number of frontend service threads. cc'ing [~zoram]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Commented] (IMPALA-7928) Investigate consistent placement of remote scan ranges

2018-12-05 Thread Philip Zeyliger (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16710544#comment-16710544
 ] 

Philip Zeyliger commented on IMPALA-7928:
-

I'm interested in the results even in the currently common case of the number 
of nodes not changing, but I agree that we'll eventually want more stability 
than that. 

> Investigate consistent placement of remote scan ranges
> --
>
> Key: IMPALA-7928
> URL: https://issues.apache.org/jira/browse/IMPALA-7928
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 3.2.0
>Reporter: Joe McDonnell
>Priority: Major
>
> With the file handle cache, it is useful for repeated scans of the same file 
> to go to the same node, as that node will already have a file handle cached.
> When scheduling remote ranges, the scheduler introduces randomness that can 
> spread reads across all of the nodes. Repeated executions of queries on the 
> same set of files will not schedule the remote reads on the same nodes. This 
> causes a large amount of duplication across file handle caches on different 
> nodes. This reduces the efficiency of the cache significantly.
> It may be useful for the scheduler to introduce some determinism in 
> scheduling remote reads to take advantage of the file handle cache. This is a 
> variation on the well-known tradeoff between skew and locality.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Commented] (IMPALA-7928) Investigate consistent placement of remote scan ranges

2018-12-05 Thread Joe McDonnell (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16710507#comment-16710507
 ] 

Joe McDonnell commented on IMPALA-7928:
---

One problem I ran into when implementing this is that a simple hash will have 
bad behavior if the number of nodes changes. I'm taking a look at consistent 
hashes to see if that makes sense. Example: 
http://highscalability.com/blog/2018/6/18/how-ably-efficiently-implemented-consistent-hashing.html
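
For illustration, a minimal Java sketch of the consistent-hashing approach mentioned above (the Impala scheduler is C++; the class names and hash function here are assumptions, following the general technique described in the linked article):

{code}
import java.util.List;
import java.util.SortedMap;
import java.util.TreeMap;

// Hypothetical illustration of consistent hashing for remote scan range placement.
// Each executor gets several virtual nodes on a hash ring; a file is assigned to
// the first virtual node at or after the file's own hash, so most assignments
// survive when an executor is added or removed.
public class ConsistentHashRing {
  private final SortedMap<Long, String> ring = new TreeMap<>();
  private final int virtualNodesPerExecutor;

  public ConsistentHashRing(List<String> executors, int virtualNodesPerExecutor) {
    this.virtualNodesPerExecutor = virtualNodesPerExecutor;
    for (String executor : executors) addExecutor(executor);
  }

  public void addExecutor(String executor) {
    for (int i = 0; i < virtualNodesPerExecutor; ++i) {
      ring.put(hash(executor + "#" + i), executor);
    }
  }

  public void removeExecutor(String executor) {
    for (int i = 0; i < virtualNodesPerExecutor; ++i) {
      ring.remove(hash(executor + "#" + i));
    }
  }

  // Returns the executor that should scan the given file path.
  public String executorForFile(String filePath) {
    long h = hash(filePath);
    SortedMap<Long, String> tail = ring.tailMap(h);
    Long key = tail.isEmpty() ? ring.firstKey() : tail.firstKey();  // wrap around
    return ring.get(key);
  }

  // Simple 64-bit FNV-1a hash; a production scheduler would likely use a
  // stronger hash.
  private static long hash(String s) {
    long h = 0xcbf29ce484222325L;
    for (int i = 0; i < s.length(); ++i) {
      h ^= s.charAt(i);
      h *= 0x100000001b3L;
    }
    return h;
  }
}
{code}

Because only the virtual nodes belonging to the changed executor move, roughly 1/N of files are remapped when membership changes, which should keep the file handle caches on the remaining nodes warm.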

> Investigate consistent placement of remote scan ranges
> --
>
> Key: IMPALA-7928
> URL: https://issues.apache.org/jira/browse/IMPALA-7928
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 3.2.0
>Reporter: Joe McDonnell
>Priority: Major
>
> With the file handle cache, it is useful for repeated scans of the same file 
> to go to the same node, as that node will already have a file handle cached.
> When scheduling remote ranges, the scheduler introduces randomness that can 
> spread reads across all of the nodes. Repeated executions of queries on the 
> same set of files will not schedule the remote reads on the same nodes. This 
> causes a large amount of duplication across file handle caches on different 
> nodes. This reduces the efficiency of the cache significantly.
> It may be useful for the scheduler to introduce some determinism in 
> scheduling remote reads to take advantage of the file handle cache. This is a 
> variation on the well-known tradeoff between skew and locality.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Closed] (IMPALA-5605) document how to increase thread resource limits

2018-12-05 Thread Alex Rodoni (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-5605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Rodoni closed IMPALA-5605.
---
   Resolution: Fixed
Fix Version/s: (was: Impala 2.10.0)
   Impala 3.2.0

> document how to increase thread resource limits
> ---
>
> Key: IMPALA-5605
> URL: https://issues.apache.org/jira/browse/IMPALA-5605
> Project: IMPALA
>  Issue Type: Task
>  Components: Docs
>Affects Versions: Impala 2.9.0
>Reporter: Matthew Mulder
>Assignee: Alex Rodoni
>Priority: Major
> Fix For: Impala 3.2.0
>
>
> Depending on the workload, Impala may need to create a very large number of 
> threads. If so, it is necessary to configure the system correctly to prevent 
> Impala from crashing because of resource limitations. Such a crash would look 
> like this:{code}F0629 08:20:02.956413 29088 llvm-codegen.cc:111] LLVM hit 
> fatal error: Unable to allocate section memory!
> terminate called after throwing an instance of 
> 'boost::exception_detail::clone_impl
>  >'{code}To prevent this, each Impala host should be configured like 
> this:{code}echo 200 > /proc/sys/kernel/threads-max
> echo 200 > /proc/sys/kernel/pid_max
> echo 800 > /proc/sys/vm/max_map_count{code}In /etc/security/limits.conf 
> add{code}impala soft nproc 262144
> impala hard nproc 262144{code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Commented] (IMPALA-2424) Rack-aware scheduling

2018-12-05 Thread Peter Ebert (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-2424?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16710273#comment-16710273
 ] 

Peter Ebert commented on IMPALA-2424:
-

This is becoming increasingly important for scaling and for separation of storage 
and compute. If Impala is installed on a subset of nodes, or on distinct 
compute-only nodes, remote reads would be essentially random and cross-rack 
traffic may become saturated; this is especially a problem at large scale, where 
network over-subscription is common. With proper distribution of Impala and 
storage nodes per rack, rack-aware scheduling could keep traffic within the TOR 
switches and improve performance.

> Rack-aware scheduling
> -
>
> Key: IMPALA-2424
> URL: https://issues.apache.org/jira/browse/IMPALA-2424
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Distributed Exec
>Affects Versions: Impala 2.2.4
>Reporter: Marcel Kornacker
>Priority: Minor
>  Labels: scalability, scheduling
>
> Currently, Impala makes an effort to schedule plan fragments local to the 
> data that is being scanned; when no collocated impalad is available, the plan 
> fragment is placed randomly.
> In order to support configurations where Impala is run on a subset of the 
> nodes in a cluster, we should schedule fragments within the same rack that 
> holds the assigned scan ranges (if a collocated impalad isn't available).
> See https://issues.apache.org/jira/browse/HADOOP-692 for details of how rack 
> locality is recorded in hdfs.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org