[jira] [Assigned] (IMPALA-3926) Reconsider use of LD_LIBRARY_PATH for toolchain libraries

2019-09-19 Thread Tim Armstrong (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-3926?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Armstrong reassigned IMPALA-3926:
-

Assignee: Tim Armstrong

> Reconsider use of LD_LIBRARY_PATH for toolchain libraries
> -
>
> Key: IMPALA-3926
> URL: https://issues.apache.org/jira/browse/IMPALA-3926
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Infrastructure
>Affects Versions: Impala 2.6.0
>Reporter: Matthew Jacobs
>Assignee: Tim Armstrong
>Priority: Major
>  Labels: build, toolchain
>
> Right now, impala-config.sh puts a lot of libraries in LD_LIBRARY_PATH, but 
> this can be a problem for binaries that aren't from our builds or weren't 
> explicitly built against these specific libraries. One solution is to move any 
> tools we need into the toolchain and build them against these libraries. While 
> this may be a reasonable thing to do, we should consider whether setting 
> LD_LIBRARY_PATH for the whole Impala environment is really necessary and the 
> right thing to do (e.g. [some people say using LD_LIBRARY_PATH is 
> bad|http://xahlee.info/UnixResource_dir/_/ldpath.html]).
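
One alternative to exporting LD_LIBRARY_PATH for the whole environment is to set it only for the child processes that need the toolchain libraries. The following is a minimal Python sketch of that idea. The directory /opt/toolchain/lib and both function names are hypothetical and do not come from impala-config.sh.

```python
import os
import subprocess

# Hypothetical toolchain library directory, used only for illustration.
TOOLCHAIN_LIB_DIR = "/opt/toolchain/lib"


def env_with_toolchain_libs(base_env, lib_dir=TOOLCHAIN_LIB_DIR):
    """Return a copy of base_env with lib_dir prepended to LD_LIBRARY_PATH."""
    env = dict(base_env)
    existing = env.get("LD_LIBRARY_PATH", "")
    env["LD_LIBRARY_PATH"] = lib_dir + (os.pathsep + existing if existing else "")
    return env


def run_with_toolchain_libs(cmd):
    """Run cmd so that only this child process sees the toolchain libraries."""
    return subprocess.run(cmd, env=env_with_toolchain_libs(os.environ), check=True)
```

Because the modified environment is passed only to subprocess.run(), binaries started any other way keep resolving libraries through the system defaults.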



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Commented] (IMPALA-8957) TestFetchAndSpooling.test_rows_sent_counters is flaky

2019-09-19 Thread Tim Armstrong (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-8957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16934020#comment-16934020
 ] 

Tim Armstrong commented on IMPALA-8957:
---

Also seen here - 
https://jenkins.impala.io/job/ubuntu-16.04-dockerised-tests/1246/testReport/junit/query_test.test_fetch/TestFetch/test_rows_sent_counters_protocol__beeswax___exec_optionbatch_size___0___num_nodes___0___disable_codegen_rows_threshold___0___disable_codegen___False___abort_on_error___1___exec_single_node_rows_threshold___0table_format__parquet_none_/

> TestFetchAndSpooling.test_rows_sent_counters is flaky
> -
>
> Key: IMPALA-8957
> URL: https://issues.apache.org/jira/browse/IMPALA-8957
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
>Priority: Major
>
> Error Details
> {noformat}
> query_test/test_fetch.py:77: in test_rows_sent_counters assert 
> re.search("RowsSentRate: [1-9]", result.runtime_profile) E assert None E + 
> where None = ('RowsSentRate: [1-9]', 'Query 
> (id=3946b19649af9ce3:7f38be67):\n DEBUG MODE WARNING: Query profile 
> created while running a DEBUG buil... - OptimizationTime: 59.000ms\n - 
> PeakMemoryUsage: 213.50 KB (218624)\n - PrepareTime: 26.000ms\n') E + where 
>  = re.search E + and 'Query 
> (id=3946b19649af9ce3:7f38be67):\n DEBUG MODE WARNING: Query profile 
> created while running a DEBUG buil... - OptimizationTime: 59.000ms\n - 
> PeakMemoryUsage: 213.50 KB (218624)\n - PrepareTime: 26.000ms\n' = 
>  0xbbfa5d0>.runtime_profile{noformat}
> Stack Trace
> {noformat}
> query_test/test_fetch.py:77: in test_rows_sent_counters
> assert re.search("RowsSentRate: [1-9]", result.runtime_profile)
> E   assert None
> E+  where None = ('RowsSentRate: [1-9]', 
> 'Query (id=3946b19649af9ce3:7f38be67):\n  DEBUG MODE WARNING: Query 
> profile created while running a DEBUG buil...  - OptimizationTime: 59.000ms\n 
>   - PeakMemoryUsage: 213.50 KB (218624)\n   - PrepareTime: 
> 26.000ms\n')
> E+where  = re.search
> E+and   'Query (id=3946b19649af9ce3:7f38be67):\n  DEBUG MODE 
> WARNING: Query profile created while running a DEBUG buil...  - 
> OptimizationTime: 59.000ms\n   - PeakMemoryUsage: 213.50 KB 
> (218624)\n   - PrepareTime: 26.000ms\n' = 
>  0xbbfa5d0>.runtime_profile{noformat}
> Standard Error
> {noformat}
> SET 
> client_identifier=query_test/test_fetch.py::TestFetchAndSpooling::()::test_rows_sent_counters[protocol:beeswax|exec_option:{'batch_size':0;'num_nodes':0;'disable_codegen_rows_threshold':0;'disable_codegen':False;'abort_on_error':1;'exec_single_node_rows_threshold':0}|table;
> SET batch_size=0;
> SET num_nodes=0;
> SET disable_codegen_rows_threshold=0;
> SET disable_codegen=False;
> SET abort_on_error=1;
> SET exec_single_node_rows_threshold=0;
> -- executing against localhost:21000
> select id from functional.alltypes limit 10;
> -- 2019-09-18 18:51:20,759 INFO MainThread: Started query 
> 3946b19649af9ce3:7f38be67{noformat}






[jira] [Resolved] (IMPALA-8634) Catalog client should be resilient to temporary Catalog outage

2019-09-19 Thread Sahil Takiar (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8634?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sahil Takiar resolved IMPALA-8634.
--
Fix Version/s: Impala 3.4.0
   Resolution: Fixed

> Catalog client should be resilient to temporary Catalog outage
> --
>
> Key: IMPALA-8634
> URL: https://issues.apache.org/jira/browse/IMPALA-8634
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Catalog
>Affects Versions: Impala 3.2.0
>Reporter: Michael Ho
>Assignee: Sahil Takiar
>Priority: Critical
> Fix For: Impala 3.4.0
>
>
> Currently, when the catalog server is down, catalog clients fail all RPCs 
> sent to it. As a result, DDL queries fail and the Impala service becomes 
> much less functional. Catalog clients should consider retrying failed RPCs 
> with exponential backoff while the catalog server is being restarted after 
> a crash. We probably need to add [a test 
> |https://github.com/apache/impala/blob/master/tests/custom_cluster/test_restart_services.py]
>  to exercise the catalog restart paths and verify that coordinators are 
> resilient to it.
> cc'ing [~stakiar], [~joemcdonnell], [~twm378]
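
The retry-with-exponential-backoff idea in the description can be sketched in Python as follows. This is an illustration only, not Impala's C++ client code. The function name, the default delays, and the use of ConnectionError as the retryable failure are all assumptions.

```python
import random
import time


def retry_with_backoff(rpc, max_attempts=5, base_delay_s=0.1, max_delay_s=5.0,
                       sleep=time.sleep):
    """Call rpc() until it succeeds or max_attempts is reached.

    The delay doubles after each failure, is capped at max_delay_s, and gets
    random jitter so that many clients do not retry in lockstep.
    """
    for attempt in range(max_attempts):
        try:
            return rpc()
        except ConnectionError:
            if attempt == max_attempts - 1:
                raise  # Out of attempts: surface the last error to the caller.
            delay = min(max_delay_s, base_delay_s * (2 ** attempt))
            sleep(delay * random.uniform(0.5, 1.0))
```

Capping the delay keeps a long outage from producing very long waits, and the final re-raise preserves the fail-fast behaviour when the server never comes back.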



[jira] [Commented] (IMPALA-8634) Catalog client should be resilient to temporary Catalog outage

2019-09-19 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-8634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16933844#comment-16933844
 ] 

ASF subversion and git services commented on IMPALA-8634:
-

Commit b96b3b0b1ca97e5d756392a159e22dfcd8bcae71 in impala's branch 
refs/heads/master from Sahil Takiar
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=b96b3b0 ]

IMPALA-8634: Catalog client should retry RPCs

Add retries to catalogd RPCs. Previously, connection failures triggered
a retry, but failures on the actual RPC did not trigger a retry. This
change replaces all usages of ClientCache::DoRpc() in the
CatalogOpExecutor with ClientCache::DoRpcWithRetry(). This change moves
the connection retry loop to DoRpcWithRetry(), instead of relying on the
ClientCache to retry the connection.

This patch is based on IMPALA-8904, which adds similar functionality to
statestore RPCs.

Testing:
* Renamed test_statestore_rpc_errors.py to test_services_rpc_errors.py
and added new tests for catalogd RPC errors
* Added new tests to test_restart_services.py
* Ran core tests

Change-Id: I7f33ad2b36d301fb64e70a939e71decab0ca993c
Reviewed-on: http://gerrit.cloudera.org:8080/14246
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 
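
The difference the commit message draws between retrying only the connection and retrying the whole RPC can be illustrated with a small Python sketch. FlakyClient and do_rpc_with_retry are hypothetical stand-ins written for this example. They are not Impala's ClientCache or DoRpcWithRetry(), whose real signatures are not shown in this message.

```python
class FlakyClient:
    """Hypothetical stand-in for a cached RPC client; not Impala's ClientCache."""

    def __init__(self, connect_failures=0, rpc_failures=0):
        self.connect_failures = connect_failures
        self.rpc_failures = rpc_failures
        self.connected = False

    def connect(self):
        if self.connect_failures > 0:
            self.connect_failures -= 1
            raise ConnectionError("connect failed")
        self.connected = True

    def call(self, request):
        if self.rpc_failures > 0:
            self.rpc_failures -= 1
            self.connected = False  # A failed RPC invalidates the connection.
            raise ConnectionError("rpc failed")
        return "response to " + request


def do_rpc_with_retry(client, request, max_attempts=3):
    """Retry the whole connect-then-call sequence, not just the connect step."""
    last_error = None
    for _ in range(max_attempts):
        try:
            if not client.connected:
                client.connect()
            return client.call(request)
        except ConnectionError as e:
            last_error = e  # Either step failed: start over on the next attempt.
    raise last_error
```

With a single loop around both steps, a failure in the call itself is handled the same way as a failure to connect, which is the behaviour the commit message describes.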


> Catalog client should be resilient to temporary Catalog outage
> --
>
> Key: IMPALA-8634
> URL: https://issues.apache.org/jira/browse/IMPALA-8634
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Catalog
>Affects Versions: Impala 3.2.0
>Reporter: Michael Ho
>Assignee: Sahil Takiar
>Priority: Critical
>
> Currently, when the catalog server is down, catalog clients fail all RPCs 
> sent to it. As a result, DDL queries fail and the Impala service becomes 
> much less functional. Catalog clients should consider retrying failed RPCs 
> with exponential backoff while the catalog server is being restarted after 
> a crash. We probably need to add [a test 
> |https://github.com/apache/impala/blob/master/tests/custom_cluster/test_restart_services.py]
>  to exercise the catalog restart paths and verify that coordinators are 
> resilient to it.
> cc'ing [~stakiar], [~joemcdonnell], [~twm378]






[jira] [Commented] (IMPALA-8904) Daemons fails fast when statestore has not started up

2019-09-19 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-8904?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16933845#comment-16933845
 ] 

ASF subversion and git services commented on IMPALA-8904:
-

Commit b96b3b0b1ca97e5d756392a159e22dfcd8bcae71 in impala's branch 
refs/heads/master from Sahil Takiar
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=b96b3b0 ]

IMPALA-8634: Catalog client should retry RPCs

Add retries to catalogd RPCs. Previously, connection failures triggered
a retry, but failures on the actual RPC did not trigger a retry. This
change replaces all usages of ClientCache::DoRpc() in the
CatalogOpExecutor with ClientCache::DoRpcWithRetry(). This change moves
the connection retry loop to DoRpcWithRetry(), instead of relying on the
ClientCache to retry the connection.

This patch is based on IMPALA-8904, which adds similar functionality to
statestore RPCs.

Testing:
* Renamed test_statestore_rpc_errors.py to test_services_rpc_errors.py
and added new tests for catalogd RPC errors
* Added new tests to test_restart_services.py
* Ran core tests

Change-Id: I7f33ad2b36d301fb64e70a939e71decab0ca993c
Reviewed-on: http://gerrit.cloudera.org:8080/14246
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 


> Daemons fails fast when statestore has not started up
> -
>
> Key: IMPALA-8904
> URL: https://issues.apache.org/jira/browse/IMPALA-8904
> Project: IMPALA
>  Issue Type: Bug
>  Components: Distributed Exec
>Affects Versions: Impala 3.1.0, Impala 3.2.0, Impala 3.3.0
>Reporter: Tim Armstrong
>Assignee: Tim Armstrong
>Priority: Major
> Fix For: Impala 3.4.0
>
>
> If you start the statestored and the other services at the same time, there 
> is a race between the statestore starting and the other services trying to 
> register with it. If the other services "win" the race, they abort startup 
> because they can't register with the statestore.
> The log looks like.
> {noformat}
> I0828 00:19:10.46 1 statestore-subscriber.cc:219] Starting statestore subscriber
> I0828 00:19:10.461310 1 thrift-server.cc:451] ThriftServer 'StatestoreSubscriber' started on port: 23000
> I0828 00:19:10.461320 1 statestore-subscriber.cc:247] Registering with statestore
> I0828 00:19:10.461309   299 TAcceptQueueServer.cpp:314] connection_setup_thread_pool_size is set to 2
> I0828 00:19:10.462744 1 statestore-subscriber.cc:253] statestore registration unsuccessful: RPC Error: Client for statestored:24000 hit an unexpected exception: No more data to read., type: N6apache6thrift9transport19TTransportExceptionE, rpc: N6impala27TRegisterSubscriberResponseE, send: done
> E0828 00:19:10.462818 1 impalad-main.cc:90] Impalad services did not start correctly, exiting.  Error: RPC Error: Client for statestored:24000 hit an unexpected exception: No more data to read., type: N6apache6thrift9transport19TTransportExceptionE, rpc: N6impala27TRegisterSubscriberResponseE, send: done
> Statestore subscriber did not start up.
> {noformat}
> Most management systems will automatically restart failed processes, so 
> typically the impalads will come back up and find the statestore, but the 
> crash loop is unnecessary.
> I propose that the services should retry for a while before giving up (we 
> still want the services to fail when there genuinely isn't a statestore 
> 

[jira] [Created] (IMPALA-8957) TestFetchAndSpooling.test_rows_sent_counters is flaky

2019-09-19 Thread Sahil Takiar (Jira)
Sahil Takiar created IMPALA-8957:


 Summary: TestFetchAndSpooling.test_rows_sent_counters is flaky
 Key: IMPALA-8957
 URL: https://issues.apache.org/jira/browse/IMPALA-8957
 Project: IMPALA
  Issue Type: Bug
  Components: Backend
Reporter: Sahil Takiar
Assignee: Sahil Takiar


Error Details
{noformat}
query_test/test_fetch.py:77: in test_rows_sent_counters assert 
re.search("RowsSentRate: [1-9]", result.runtime_profile) E assert None E + 
where None = ('RowsSentRate: [1-9]', 'Query 
(id=3946b19649af9ce3:7f38be67):\n DEBUG MODE WARNING: Query profile 
created while running a DEBUG buil... - OptimizationTime: 59.000ms\n - 
PeakMemoryUsage: 213.50 KB (218624)\n - PrepareTime: 26.000ms\n') E + where 
 = re.search E + and 'Query 
(id=3946b19649af9ce3:7f38be67):\n DEBUG MODE WARNING: Query profile 
created while running a DEBUG buil... - OptimizationTime: 59.000ms\n - 
PeakMemoryUsage: 213.50 KB (218624)\n - PrepareTime: 26.000ms\n' = 
.runtime_profile{noformat}
Stack Trace
{noformat}
query_test/test_fetch.py:77: in test_rows_sent_counters
assert re.search("RowsSentRate: [1-9]", result.runtime_profile)
E   assert None
E+  where None = ('RowsSentRate: [1-9]', 
'Query (id=3946b19649af9ce3:7f38be67):\n  DEBUG MODE WARNING: Query 
profile created while running a DEBUG buil...  - OptimizationTime: 59.000ms\n   
- PeakMemoryUsage: 213.50 KB (218624)\n   - PrepareTime: 
26.000ms\n')
E+where  = re.search
E+and   'Query (id=3946b19649af9ce3:7f38be67):\n  DEBUG MODE 
WARNING: Query profile created while running a DEBUG buil...  - 
OptimizationTime: 59.000ms\n   - PeakMemoryUsage: 213.50 KB (218624)\n  
 - PrepareTime: 26.000ms\n' = 
.runtime_profile{noformat}
Standard Error
{noformat}
SET 
client_identifier=query_test/test_fetch.py::TestFetchAndSpooling::()::test_rows_sent_counters[protocol:beeswax|exec_option:{'batch_size':0;'num_nodes':0;'disable_codegen_rows_threshold':0;'disable_codegen':False;'abort_on_error':1;'exec_single_node_rows_threshold':0}|table;
SET batch_size=0;
SET num_nodes=0;
SET disable_codegen_rows_threshold=0;
SET disable_codegen=False;
SET abort_on_error=1;
SET exec_single_node_rows_threshold=0;
-- executing against localhost:21000

select id from functional.alltypes limit 10;

-- 2019-09-18 18:51:20,759 INFO MainThread: Started query 
3946b19649af9ce3:7f38be67{noformat}
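
The failing assertion passes only when the first character after "RowsSentRate: " is a digit from 1 to 9. The short Python sketch below shows how that pattern behaves. The three profile fragments are invented for illustration and are not taken from the failing run, so this does not establish why the counter was absent or zero in that run.

```python
import re

# The pattern from the failing assertion in test_fetch.py.
ROWS_SENT_RATE = re.compile(r"RowsSentRate: [1-9]")

# Hypothetical profile fragments, invented for illustration only.
profile_nonzero = "   - RowsSent: 10 (10)\n   - RowsSentRate: 416.00 /sec\n"
profile_zero = "   - RowsSent: 10 (10)\n   - RowsSentRate: 0.00 /sec\n"
profile_fraction = "   - RowsSentRate: 0.50 /sec\n"

for name, profile in [("nonzero", profile_nonzero),
                      ("zero", profile_zero),
                      ("fraction", profile_fraction)]:
    # search() returns a match object on success and None otherwise.
    print(name, bool(ROWS_SENT_RATE.search(profile)))
```

Under these assumed formats, a rate that rounds to zero, or any rate below 1, would make the assertion fail even though the counter is present in the profile.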



[jira] [Resolved] (IMPALA-8939) TestResultSpooling.test_full_queue_large_fetch is flaky

2019-09-19 Thread Sahil Takiar (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8939?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sahil Takiar resolved IMPALA-8939.
--
Resolution: Duplicate

> TestResultSpooling.test_full_queue_large_fetch is flaky
> ---
>
> Key: IMPALA-8939
> URL: https://issues.apache.org/jira/browse/IMPALA-8939
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 3.3.0
>Reporter: Csaba Ringhofer
>Priority: Critical
>
> The query profile contains RowBatchSendWaitTime: 0.000ns from time to time, 
> which causes this test to fail.
> This seems to be common when USE_CDP_HIVE=true, but I have also seen it in 
> non-CDP builds.
> I did not investigate the cause, so I don't know whether CDPness should have 
> any effect.



[jira] [Resolved] (IMPALA-8935) Add links to other daemons from webui

2019-09-19 Thread Thomas Tauber-Marshall (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8935?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Tauber-Marshall resolved IMPALA-8935.

Fix Version/s: Impala 3.4.0
   Resolution: Fixed

> Add links to other daemons from webui
> -
>
> Key: IMPALA-8935
> URL: https://issues.apache.org/jira/browse/IMPALA-8935
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Infrastructure
>Reporter: Thomas Tauber-Marshall
>Assignee: Thomas Tauber-Marshall
>Priority: Major
> Fix For: Impala 3.4.0
>
>
> It would be convenient for all of the debug webuis to have links to the other 
> debug webuis within a single cluster.
> For impalads, it would be easy to add links to each other impalad on the 
> /backends page (from IMPALA-210 it looks like this even used to be the case, 
> but everything has changed a ton since then, e.g. we weren't even using 
> templates at the time, so it got lost somewhere along the way). It's also 
> fairly straightforward to add a link to the statestored and catalogd, e.g. 
> on the index page or else on the nav bar.






[jira] [Commented] (IMPALA-8935) Add links to other daemons from webui

2019-09-19 Thread Thomas Tauber-Marshall (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-8935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16933738#comment-16933738
 ] 

Thomas Tauber-Marshall commented on IMPALA-8935:


So the use case you're talking about, accessing the minicluster webui in a dev 
environment from another machine, is I think difficult to make work in all 
cases.

Rather than using IP addresses, we could specify hostnames for these links, but 
that's not guaranteed to always work - I think it's common for people to develop 
on machines that don't have DNS-resolvable hostnames (e.g. see IMPALA-8917). And 
it's always going to be the case that the machine's hostname resolves to 
127.0.0.1 locally due to 
https://github.com/apache/impala/blob/master/bin/bootstrap_system.sh#L362

I don't think this is too big of a deal - it should work in any sort of real, 
non-development environment, and of course it's possible in a dev environment to 
access the other webuis by manually specifying the right host:port instead of 
clicking the link, as has always been the case.

One option if you really want this to work in a dev environment is to set 
the --webserver_interface flag to some public IP on minicluster startup. This 
flag is currently broken, but I have a patch out to get it working: 
https://gerrit.cloudera.org/#/c/14266/
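
To check whether a machine is in the situation described above, where its own hostname resolves to a loopback address, a small Python sketch along these lines may help. The function names are hypothetical, and the check covers only local name resolution. It says nothing about whether other machines can resolve the hostname.

```python
import ipaddress
import socket


def is_loopback(address):
    """Return True if address (an IP literal) is a loopback address."""
    return ipaddress.ip_address(address).is_loopback


def hostname_resolves_to_loopback(hostname=None):
    """Resolve hostname (default: this machine's) and check for loopback."""
    hostname = hostname or socket.gethostname()
    return is_loopback(socket.gethostbyname(hostname))
```

If hostname_resolves_to_loopback() returns True, links built from the local hostname will not be reachable from another machine, which matches the limitation discussed in this comment.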

> Add links to other daemons from webui
> -
>
> Key: IMPALA-8935
> URL: https://issues.apache.org/jira/browse/IMPALA-8935
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Infrastructure
>Reporter: Thomas Tauber-Marshall
>Assignee: Thomas Tauber-Marshall
>Priority: Major
>
> It would be convenient for all of the debug webuis to have links to the other 
> debug webuis within a single cluster.
> For impalads, it would be easy to add links to each other impalad on the 
> /backends page (from IMPALA-210 it looks like this even used to be the case, 
> but everything has changed a ton since then, e.g. we weren't even using 
> templates at the time, so it got lost somewhere along the way). It's also 
> fairly straightforward to add a link to the statestored and catalogd, e.g. 
> on the index page or else on the nav bar.



[jira] [Resolved] (IMPALA-8703) SQL:2016 datetime patterns - Milestone 1

2019-09-19 Thread Gabor Kaszab (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8703?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gabor Kaszab resolved IMPALA-8703.
--
Fix Version/s: Impala 3.4.0
   Resolution: Fixed

> SQL:2016 datetime patterns - Milestone 1
> 
>
> Key: IMPALA-8703
> URL: https://issues.apache.org/jira/browse/IMPALA-8703
> Project: IMPALA
>  Issue Type: Sub-task
>  Components: Backend, Frontend
>Affects Versions: Impala 2.2.4
>Reporter: Gabor Kaszab
>Assignee: Gabor Kaszab
>Priority: Critical
> Fix For: Impala 3.4.0
>
>
> Design doc for SQL:2016 datetime patterns:
> https://docs.google.com/document/d/1V7k6-lrPGW7_uhqM-FhKl3QsxwCRy69v2KIxPsGjc1k/
> Milestone 1 content:
> - Introduce FORMAT clause for CAST()
> - Implement basic SQL:2016 datetime patterns to comply with the standard. For 
> more details check the document above.
> - Use the new SQL:2016 pattern handling for CAST(.. FORMAT..)
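
As a rough illustration of what pattern-based datetime parsing involves, the Python sketch below maps a handful of pattern tokens (YYYY, MM, DD, HH24, MI, SS) onto strptime directives. The mapping and the function name are assumptions made for this example. The sketch is not Impala's implementation and covers only a small part of what the design doc describes.

```python
import re
from datetime import datetime

# Small subset of datetime pattern tokens mapped to strptime directives.
# This mapping is an illustration, not Impala's implementation.
_TOKEN_TO_DIRECTIVE = {
    "YYYY": "%Y", "MM": "%m", "DD": "%d",
    "HH24": "%H", "MI": "%M", "SS": "%S",
}
# Longer tokens come first so they are tried before any shorter alternative.
_TOKEN_RE = re.compile(
    "|".join(sorted(_TOKEN_TO_DIRECTIVE, key=len, reverse=True)))


def parse_with_pattern(text, pattern):
    """Parse text using a pattern such as 'YYYY-MM-DD HH24:MI:SS'."""
    strptime_format = _TOKEN_RE.sub(
        lambda m: _TOKEN_TO_DIRECTIVE[m.group(0)], pattern)
    return datetime.strptime(text, strptime_format)
```

For example, parse_with_pattern("2019-09-19 18:51:20", "YYYY-MM-DD HH24:MI:SS") yields the corresponding datetime value. Separator characters pass through unchanged.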



[jira] [Resolved] (IMPALA-8934) Add failpoint tests to result spooling code

2019-09-19 Thread Sahil Takiar (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8934?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sahil Takiar resolved IMPALA-8934.
--
Fix Version/s: Impala 3.4.0
   Resolution: Fixed

> Add failpoint tests to result spooling code
> ---
>
> Key: IMPALA-8934
> URL: https://issues.apache.org/jira/browse/IMPALA-8934
> Project: IMPALA
>  Issue Type: Sub-task
>Affects Versions: Impala 3.2.0
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
>Priority: Major
> Fix For: Impala 3.4.0
>
>
> IMPALA-8924 was discovered while running {{test_failpoints.py}} with results 
> spooling enabled. The goal of this JIRA is to add similar failpoint coverage 
> to {{test_result_spooling.py}} so that we have sufficient coverage for the 
> various failure paths when result spooling is enabled.
> The failure paths that should be covered include:
> * Failures while executing the exec tree should be handled correctly





[jira] [Resolved] (IMPALA-8924) DCHECK(!closed_) in SpillableRowBatchQueue::IsEmpty

2019-09-19 Thread Sahil Takiar (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8924?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sahil Takiar resolved IMPALA-8924.
--
Fix Version/s: Impala 3.4.0
   Resolution: Fixed

> DCHECK(!closed_) in SpillableRowBatchQueue::IsEmpty
> ---
>
> Key: IMPALA-8924
> URL: https://issues.apache.org/jira/browse/IMPALA-8924
> Project: IMPALA
>  Issue Type: Sub-task
>  Components: Backend
>Affects Versions: Impala 3.4.0
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
>Priority: Blocker
> Fix For: Impala 3.4.0
>
>
> When running exhaustive tests with result spooling enabled, there are several 
> impalad crashes with the following stack:
> {code:java}
> #0  0x7f5e797541f7 in raise () from /lib64/libc.so.6
> #1  0x7f5e797558e8 in abort () from /lib64/libc.so.6
> #2  0x04cc5834 in google::DumpStackTraceAndExit() ()
> #3  0x04cbc28d in google::LogMessage::Fail() ()
> #4  0x04cbdb32 in google::LogMessage::SendToLog() ()
> #5  0x04cbbc67 in google::LogMessage::Flush() ()
> #6  0x04cbf22e in google::LogMessageFatal::~LogMessageFatal() ()
> #7  0x029a16cd in impala::SpillableRowBatchQueue::IsEmpty 
> (this=0x13d504e0) at 
> /data/jenkins/workspace/impala-private-parameterized/repos/Impala/be/src/runtime/spillable-row-batch-queue.cc:128
> #8  0x025f5610 in impala::BufferedPlanRootSink::IsQueueEmpty 
> (this=0x13943000) at 
> /data/jenkins/workspace/impala-private-parameterized/repos/Impala/be/src/exec/buffered-plan-root-sink.h:147
> #9  0x025f4e81 in impala::BufferedPlanRootSink::GetNext 
> (this=0x13943000, state=0x13d2a1c0, results=0x173c8520, num_results=-1, 
> eos=0xd30cde1) at 
> /data/jenkins/workspace/impala-private-parameterized/repos/Impala/be/src/exec/buffered-plan-root-sink.cc:158
> #10 0x0294ef4d in impala::Coordinator::GetNext (this=0xe4ed180, 
> results=0x173c8520, max_rows=-1, eos=0xd30cde1) at 
> /data/jenkins/workspace/impala-private-parameterized/repos/Impala/be/src/runtime/coordinator.cc:683
> #11 0x02251043 in impala::ClientRequestState::FetchRowsInternal 
> (this=0xd30c800, max_rows=-1, fetched_rows=0x173c8520) at 
> /data/jenkins/workspace/impala-private-parameterized/repos/Impala/be/src/service/client-request-state.cc:959
> #12 0x022503e7 in impala::ClientRequestState::FetchRows 
> (this=0xd30c800, max_rows=-1, fetched_rows=0x173c8520) at 
> /data/jenkins/workspace/impala-private-parameterized/repos/Impala/be/src/service/client-request-state.cc:851
> #13 0x0226a36d in impala::ImpalaServer::FetchInternal 
> (this=0x12d14800, request_state=0xd30c800, start_over=false, fetch_size=-1, 
> query_results=0x7f5daf861138) at 
> /data/jenkins/workspace/impala-private-parameterized/repos/Impala/be/src/service/impala-beeswax-server.cc:582
> #14 0x02264970 in impala::ImpalaServer::fetch (this=0x12d14800, 
> query_results=..., query_handle=..., start_over=false, fetch_size=-1) at 
> /data/jenkins/workspace/impala-private-parameterized/repos/Impala/be/src/service/impala-beeswax-server.cc:188
> #15 0x027caf09 in beeswax::BeeswaxServiceProcessor::process_fetch 
> (this=0x12d6fc20, seqid=0, iprot=0x119f5780, oprot=0x119f56c0, 
> callContext=0xdf92060) at 
> /data/jenkins/workspace/impala-private-parameterized/repos/Impala/be/generated-sources/gen-cpp/BeeswaxService.cpp:3398
> #16 0x027c94e6 in beeswax::BeeswaxServiceProcessor::dispatchCall 
> (this=0x12d6fc20, iprot=0x119f5780, oprot=0x119f56c0, fname=..., seqid=0, 
> callContext=0xdf92060) at 
> /data/jenkins/workspace/impala-private-parameterized/repos/Impala/be/generated-sources/gen-cpp/BeeswaxService.cpp:3200
> #17 0x02796f13 in impala::ImpalaServiceProcessor::dispatchCall 
> (this=0x12d6fc20, iprot=0x119f5780, oprot=0x119f56c0, fname=..., seqid=0, 
> callContext=0xdf92060) at 
> /data/jenkins/workspace/impala-private-parameterized/repos/Impala/be/generated-sources/gen-cpp/ImpalaService.cpp:1824
> #18 0x01b3cee4 in apache::thrift::TDispatchProcessor::process 
> (this=0x12d6fc20, in=..., out=..., connectionContext=0xdf92060) at 
> /data/jenkins/workspace/impala-private-parameterized/Impala-Toolchain/thrift-0.9.3-p7/include/thrift/TDispatchProcessor.h:121
> #19 0x01f9bf28 in 
> apache::thrift::server::TAcceptQueueServer::Task::run (this=0xdf92000) at 
> /data/jenkins/workspace/impala-private-parameterized/repos/Impala/be/src/rpc/TAcceptQueueServer.cpp:84
> #20 0x01f9166d in impala::ThriftThread::RunRunnable (this=0x116ddfc0, 
> runnable=..., promise=0x7f5db0862e90) at 
> /data/jenkins/workspace/impala-private-parameterized/repos/Impala/be/src/rpc/thrift-thread.cc:74
> #21 0x01f92d93 in boost::_mfi::mf2 boost::shared_ptr, 
> impala::Promise*>::operator() 
> 

[jira] [Resolved] (IMPALA-8934) Add failpoint tests to result spooling code

2019-09-19 Thread Sahil Takiar (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8934?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sahil Takiar resolved IMPALA-8934.
--
Fix Version/s: Impala 3.4.0
   Resolution: Fixed

> Add failpoint tests to result spooling code
> ---
>
> Key: IMPALA-8934
> URL: https://issues.apache.org/jira/browse/IMPALA-8934
> Project: IMPALA
>  Issue Type: Sub-task
>Affects Versions: Impala 3.2.0
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
>Priority: Major
> Fix For: Impala 3.4.0
>
>
> IMPALA-8924 was discovered while running {{test_failpoints.py}} with results 
> spooling enabled. The goal of this JIRA is to add similar failpoint coverage 
> to {{test_result_spooling.py}} so that we have sufficient coverage for the 
> various failure paths when result spooling is enabled.
> The failure paths that should be covered include:
> * Failures while executing the exec tree should be handled correctly



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Commented] (IMPALA-8924) DCHECK(!closed_) in SpillableRowBatchQueue::IsEmpty

2019-09-19 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-8924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16933711#comment-16933711
 ] 

ASF subversion and git services commented on IMPALA-8924:
-

Commit 391942d79dec18b086a85e29a75a873e74955518 in impala's branch 
refs/heads/master from Sahil Takiar
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=391942d ]

IMPALA-8924, IMPALA-8934: Result spooling failpoint tests, fix DCHECKs

Adds several "failpoint" tests to test_result_spooling.py. These tests
use debug_actions spread throughout buffered-plan-root-sink.cc to
trigger failures while result spooling is running. The tests validate
that all queries gracefully fail and do not cause any impalad crashes.

Fixed a few bugs that came up when adding these tests, as well as the
crash reported in IMPALA-8924 (which is now covered by the failpoint
tests added in this patch).

The first bug fixed was a DCHECK in SpillableRowBatchQueue::IsEmpty()
where the method was being called after the queue had been closed. The
fix is to only call IsEmpty() if IsOpen() returns true.

The second bug was an issue in the cancellation path where
BufferedPlanRootSink::GetNext would enter an infinite loop if the query
was cancelled and then GetNext was called. The fix is to check the
cancellation state in the outer while loop.

Testing:
* Added new tests to test_result_spooling.py
* Ran core tests

Change-Id: Ib96f797bc8a5ba8baf9fb28abd1f645345bbe932
Reviewed-on: http://gerrit.cloudera.org:8080/14214
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 
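
The two fixes described in the commit message above can be illustrated with a small model. The sketch below is written in Python for brevity and is not Impala's C++ code: RowBatchQueue, queue_is_empty, get_next, QueryCancelled and the two callbacks are hypothetical names, and treating a closed queue as empty or end-of-stream is an assumption made only for this example.

```python
class QueryCancelled(Exception):
    """Raised by the toy get_next() when the query has been cancelled."""


class RowBatchQueue:
    """Toy stand-in for a row batch queue (hypothetical, not Impala's class)."""

    def __init__(self):
        self._batches = []
        self._closed = False

    def is_open(self):
        return not self._closed

    def is_empty(self):
        # Models the DCHECK(!closed_) precondition from the crash report.
        assert not self._closed, "is_empty() called on a closed queue"
        return not self._batches

    def add(self, batch):
        self._batches.append(batch)

    def pop(self):
        return self._batches.pop(0)

    def close(self):
        self._batches.clear()
        self._closed = True


def queue_is_empty(queue):
    # First fix: only call is_empty() while the queue is open.
    # Assumption: a closed queue is reported as empty.
    return not queue.is_open() or queue.is_empty()


def get_next(queue, is_cancelled, wait_for_rows):
    # Second fix: re-check cancellation on every pass of the outer loop so a
    # cancelled query cannot spin forever waiting for rows.
    while True:
        if is_cancelled():
            raise QueryCancelled()
        if not queue.is_open():
            return None  # assumption: a closed queue means end of stream
        if not queue.is_empty():
            return queue.pop()
        wait_for_rows()
```

In this model, calling queue_is_empty() after close() no longer trips the assertion, and get_next() raises QueryCancelled instead of looping when the cancellation callback returns true.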


> DCHECK(!closed_) in SpillableRowBatchQueue::IsEmpty
> ---
>
> Key: IMPALA-8924
> URL: https://issues.apache.org/jira/browse/IMPALA-8924
> Project: IMPALA
>  Issue Type: Sub-task
>  Components: Backend
>Affects Versions: Impala 3.4.0
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
>Priority: Blocker
>
> When running exhaustive tests with result spooling enabled, there are several 
> impalad crashes with the following stack:
> {code:java}
> #0  0x7f5e797541f7 in raise () from /lib64/libc.so.6
> #1  0x7f5e797558e8 in abort () from /lib64/libc.so.6
> #2  0x04cc5834 in google::DumpStackTraceAndExit() ()
> #3  0x04cbc28d in google::LogMessage::Fail() ()
> #4  0x04cbdb32 in google::LogMessage::SendToLog() ()
> #5  0x04cbbc67 in google::LogMessage::Flush() ()
> #6  0x04cbf22e in google::LogMessageFatal::~LogMessageFatal() ()
> #7  0x029a16cd in impala::SpillableRowBatchQueue::IsEmpty 
> (this=0x13d504e0) at 
> /data/jenkins/workspace/impala-private-parameterized/repos/Impala/be/src/runtime/spillable-row-batch-queue.cc:128
> #8  0x025f5610 in impala::BufferedPlanRootSink::IsQueueEmpty 
> (this=0x13943000) at 
> /data/jenkins/workspace/impala-private-parameterized/repos/Impala/be/src/exec/buffered-plan-root-sink.h:147
> #9  0x025f4e81 in impala::BufferedPlanRootSink::GetNext 
> (this=0x13943000, state=0x13d2a1c0, results=0x173c8520, num_results=-1, 
> eos=0xd30cde1) at 
> /data/jenkins/workspace/impala-private-parameterized/repos/Impala/be/src/exec/buffered-plan-root-sink.cc:158
> #10 0x0294ef4d in impala::Coordinator::GetNext (this=0xe4ed180, 
> results=0x173c8520, max_rows=-1, eos=0xd30cde1) at 
> /data/jenkins/workspace/impala-private-parameterized/repos/Impala/be/src/runtime/coordinator.cc:683
> #11 0x02251043 in impala::ClientRequestState::FetchRowsInternal 
> (this=0xd30c800, max_rows=-1, fetched_rows=0x173c8520) at 
> /data/jenkins/workspace/impala-private-parameterized/repos/Impala/be/src/service/client-request-state.cc:959
> #12 0x022503e7 in impala::ClientRequestState::FetchRows 
> (this=0xd30c800, max_rows=-1, fetched_rows=0x173c8520) at 
> /data/jenkins/workspace/impala-private-parameterized/repos/Impala/be/src/service/client-request-state.cc:851
> #13 0x0226a36d in impala::ImpalaServer::FetchInternal 
> (this=0x12d14800, request_state=0xd30c800, start_over=false, fetch_size=-1, 
> query_results=0x7f5daf861138) at 
> /data/jenkins/workspace/impala-private-parameterized/repos/Impala/be/src/service/impala-beeswax-server.cc:582
> #14 0x02264970 in impala::ImpalaServer::fetch (this=0x12d14800, 
> query_results=..., query_handle=..., start_over=false, fetch_size=-1) at 
> /data/jenkins/workspace/impala-private-parameterized/repos/Impala/be/src/service/impala-beeswax-server.cc:188
> #15 0x027caf09 in beeswax::BeeswaxServiceProcessor::process_fetch 
> (this=0x12d6fc20, seqid=0, iprot=0x119f5780, oprot=0x119f56c0, 
> callContext=0xdf92060) at 
> /data/jenkins/workspace/impala-private-parameterized/repos/Impala/be/generated-sources/gen-cpp/BeeswaxService.cpp:3398
> #16 0x027c94e6 in 

[jira] [Commented] (IMPALA-8934) Add failpoint tests to result spooling code

2019-09-19 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-8934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16933710#comment-16933710
 ] 

ASF subversion and git services commented on IMPALA-8934:
-

Commit 391942d79dec18b086a85e29a75a873e74955518 in impala's branch 
refs/heads/master from Sahil Takiar
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=391942d ]

IMPALA-8924, IMPALA-8934: Result spooling failpoint tests, fix DCHECKs

Adds several "failpoint" tests to test_result_spooling.py. These tests
use debug_actions spread throughout buffered-plan-root-sink.cc to
trigger failures while result spooling is running. The tests validate
that all queries gracefully fail and do not cause any impalad crashes.

Fixed a few bugs that came up when adding these tests, as well as the
crash reported in IMPALA-8924 (which is now covered by the failpoint
tests added in this patch).

The first bug fixed was a DCHECK in SpillableRowBatchQueue::IsEmpty()
where the method was being called after the queue had been closed. The
fix is to only call IsEmpty() if IsOpen() returns true.

The second bug was an issue in the cancellation path where
BufferedPlanRootSink::GetNext would enter an infinite loop if the query
was cancelled and then GetNext was called. The fix is to check the
cancellation state in the outer while loop.

Testing:
* Added new tests to test_result_spooling.py
* Ran core tests

Change-Id: Ib96f797bc8a5ba8baf9fb28abd1f645345bbe932
Reviewed-on: http://gerrit.cloudera.org:8080/14214
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 


> Add failpoint tests to result spooling code
> ---
>
> Key: IMPALA-8934
> URL: https://issues.apache.org/jira/browse/IMPALA-8934
> Project: IMPALA
>  Issue Type: Sub-task
>Affects Versions: Impala 3.2.0
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
>Priority: Major
>
> IMPALA-8924 was discovered while running {{test_failpoints.py}} with results 
> spooling enabled. The goal of this JIRA is to add similar failpoint coverage 
> to {{test_result_spooling.py}} so that we have sufficient coverage for the 
> various failure paths when result spooling is enabled.
> The failure paths that should be covered include:
> * Failures while executing the exec tree should be handled correctly






[jira] [Commented] (IMPALA-8703) SQL:2016 datetime patterns - Milestone 1

2019-09-19 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-8703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16933681#comment-16933681
 ] 

ASF subversion and git services commented on IMPALA-8703:
-

Commit bca1b43efb72d839962f9e6999ca747fca7f17d9 in impala's branch 
refs/heads/master from Gabor Kaszab
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=bca1b43 ]

IMPALA-8703: ISO:SQL:2016 datetime patterns - Milestone 1

This enhancement introduces a FORMAT clause for the CAST() operator that
applies to casts between string types and timestamp types. Instead
of accepting SimpleDateFormat patterns, the FORMAT clause supports
datetime patterns following the ISO:SQL:2016 standard.
Note that the CAST() operator without the FORMAT clause still uses
Impala's implementation of SimpleDateFormat handling. Similarly, the
existing conversion functions such as to_timestamp(), from_timestamp(),
etc. remain unchanged and use SimpleDateFormat. Unlike these
functions, the FORMAT clause must specify a string literal and
cannot be used with any other kind of string expression.

Milestone 1 contains all the format tokens covered by the SQL
standard. Further milestones will add more functionality on top of
this list to cover functionality provided by other RDBMS systems.

List of tokens implemented by this change:
- YYYY, YYY, YY, Y: Year tokens
- RRRR, RR: Round year tokens
- MM: Month (1-12)
- DD: Day (1-31)
- DDD: Day of year (1-366)
- HH, HH12: Hour of day (1-12)
- HH24: Hour of day (0-23)
- MI: Minute (0-59)
- SS: Second (0-59)
- SSSSS: Second of day (0-86399)
- FF, FF1, ..., FF9: Fractional second
- AM, PM, A.M., P.M.: Meridiem indicators
- TZH: Timezone hour (-99-+99)
- TZM: Timezone minute (0-99)
- Separators: - . / , ' ; : space
- ISO8601 date indicators (T, Z)

Some notes about the matching algorithm:
- The parsing algorithm treats these tokens in a case-insensitive
  manner.
- The separators are interchangeable with each other. For example, a
  '-' separator in the format will match a '.' character in the
  input.
- The length of separator sequences is handled flexibly, meaning
  that a single separator character in the format would, for instance,
  match a multi-separator sequence in the input.
- In a string-to-timestamp conversion, the timezone offset tokens
  are parsed and must match the input, but they do not adjust
  the result because the input is already expected to be in UTC.

Usage example:
SELECT CAST('01-02-2019' AS TIMESTAMP FORMAT 'MM-DD-YYYY');
SELECT CAST('2019.10.10 13:30:40.123456 +01:30' AS TIMESTAMP
FORMAT 'YYYY-MM-DD HH24:MI:SS.FF9 TZH:TZM');
SELECT CAST(timestamp_column as STRING
FORMAT " MM HH12 YY") from some_table;

Change-Id: I19d8d097a45ae6f103b6cd1b2d81aad38dfd9e23
Reviewed-on: http://gerrit.cloudera.org:8080/13722
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 
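
The matching rules listed in the commit message above (case-insensitive tokens, interchangeable separators, flexible separator length) can be modelled by translating a format string into a regular expression. The sketch below is a rough Python illustration under stated assumptions, not Impala's parser: it handles only the YYYY, MM and DD tokens, accepts fewer digits than the token width, and format_to_regex and parse are hypothetical names.

```python
import re

# Separator characters listed in the commit message: - . / , ' ; : space
SEPARATORS = "-./,';: "

# Hypothetical subset of tokens, mapped to (field name, maximum digits).
TOKENS = {"YYYY": ("year", 4), "MM": ("month", 2), "DD": ("day", 2)}


def format_to_regex(fmt):
    """Translate a format string into a compiled regex (illustrative only)."""
    sep_class = "[" + re.escape(SEPARATORS) + "]+"
    upper = fmt.upper()  # tokens are matched case-insensitively
    pattern = ""
    i = 0
    while i < len(upper):
        if upper[i] in SEPARATORS:
            # Any run of separators in the format matches any run of
            # separators in the input, whatever the characters or length.
            while i < len(upper) and upper[i] in SEPARATORS:
                i += 1
            pattern += sep_class
            continue
        # Try longer tokens first so that a prefix never wins.
        for tok in sorted(TOKENS, key=len, reverse=True):
            if upper.startswith(tok, i):
                name, width = TOKENS[tok]
                pattern += "(?P<%s>\\d{1,%d})" % (name, width)
                i += len(tok)
                break
        else:
            raise ValueError("unsupported token at position %d" % i)
    return re.compile(pattern)


def parse(text, fmt):
    """Return the parsed fields as a dict of integers."""
    match = format_to_regex(fmt).fullmatch(text)
    if match is None:
        raise ValueError("input does not match format")
    return {name: int(value) for name, value in match.groupdict().items()}
```

With this model, parse("2019.10.10", "yyyy-mm-dd") succeeds even though the format uses lowercase tokens and '-' while the input uses '.'.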


> SQL:2016 datetime patterns - Milestone 1
> 
>
> Key: IMPALA-8703
> URL: https://issues.apache.org/jira/browse/IMPALA-8703
> Project: IMPALA
>  Issue Type: Sub-task
>  Components: Backend, Frontend
>Affects Versions: Impala 2.2.4
>Reporter: Gabor Kaszab
>Assignee: Gabor Kaszab
>Priority: Critical
>
> Design doc for SQL:2016 datetime patterns:
> https://docs.google.com/document/d/1V7k6-lrPGW7_uhqM-FhKl3QsxwCRy69v2KIxPsGjc1k/
> Milestone 1 content:
> - Introduce FORMAT clause for CAST()
> - Implement basic SQL:2016 datetime patterns to comply with the standard. For 
> more details check the document above.
> - Use the new SQL:2016 pattern handling for CAST(.. FORMAT..)






[jira] [Resolved] (IMPALA-8944) Update and re-enable S3PlannerTest

2019-09-19 Thread Sahil Takiar (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8944?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sahil Takiar resolved IMPALA-8944.
--
Fix Version/s: Impala 3.4.0
   Resolution: Fixed

> Update and re-enable S3PlannerTest
> --
>
> Key: IMPALA-8944
> URL: https://issues.apache.org/jira/browse/IMPALA-8944
> Project: IMPALA
>  Issue Type: Test
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
>Priority: Major
> Fix For: Impala 3.4.0
>
>
> It looks like we don't run {{S3PlannerTest}} in our regular Jenkins jobs. 
> When run against an HDFS mini-cluster, the tests are skipped because the 
> {{TARGET_FILESYSTEM}} is not S3. On our S3 jobs, they don't run either 
> because we skip all fe/ tests (most of them don't work against S3 / assume 
> they are running on HDFS).
> A few things need to be fixed to get this working:
> * The test cases in {{S3PlannerTest}} need to be fixed
> * The Jenkins job that runs the S3 tests needs the ability to run specific 
> fe/ tests (e.g. just the {{S3PlannerTest}}) and to skip the rest





[jira] [Commented] (IMPALA-5931) Don't synthesize block metadata in the catalog for S3/ADLS

2019-09-19 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-5931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16933615#comment-16933615
 ] 

ASF subversion and git services commented on IMPALA-5931:
-

Commit feed25084a999fe0a4e7b58b5264fce5829c43e7 in impala's branch 
refs/heads/master from stakiar
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=feed250 ]

IMPALA-8944: Update and re-enable S3PlannerTest

Addresses several test infra issues that were preventing the
S3PlannerTest from running successfully. Disables a few tests that are
no longer working, and removes some planner checks that are no longer
applicable when running on S3. Specifically, this patch removes the
checks in PlannerTestBase#checkScanRangeLocations when running against
S3, because the planner no longer generates scan ranges; generation is
deferred to the scheduler (IMPALA-5931).

Replaces the old logic of specifying S3-specific fe/ tests with a
combination of JUnit Categories and Maven Profiles. The previous method
was broken and assumed that all S3-specific fe/ tests started with S3*.
The new approach removes that restriction and only requires S3-specific
JUnit tests to be tagged with the Java annotation
'@Category(S3Tests.class)' (entire classes or individual tests can be
tagged with the annotation).

Testing:
* Ran fe/ tests with TARGET_FILESYSTEM=s3

Change-Id: I1690b6c5346376cfd4845c72062cc237e0f9
Reviewed-on: http://gerrit.cloudera.org:8080/14248
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 


> Don't synthesize block metadata in the catalog for S3/ADLS
> --
>
> Key: IMPALA-5931
> URL: https://issues.apache.org/jira/browse/IMPALA-5931
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Catalog
>Reporter: Dan Hecht
>Assignee: Vuk Ercegovac
>Priority: Major
> Fix For: Impala 2.13.0, Impala 3.1.0
>
>
> Today, the catalog synthesizes block metadata for S3/ADLS by just breaking up 
> splittable files into "blocks" with the FileSystem's default block size. 
> Rather than carrying these blocks around in the catalog and distributing them 
> to all impalads, we might as well generate the scan ranges on-the-fly during 
> planning. That would save the memory and network bandwidth of blocks.
> That does mean that the planner will have to instantiate and call the 
> filesystem to get the default block size, but for these FileSystems, that's 
> just a matter of reading the config.
> Perhaps the same can be done for HDFS erasure coding, though that depends on 
> what a block location actually means in that context and whether they contain 
> useful info.
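
The splitting described above amounts to cutting a file into fixed-size (offset, length) pieces. Here is a minimal sketch in Python, assuming a hypothetical helper named synthesize_ranges and a caller-supplied block size; it is an illustration of the idea, not the planner's or scheduler's code.

```python
def synthesize_ranges(file_length, block_size):
    """Split a file of file_length bytes into (offset, length) ranges."""
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    return [
        (offset, min(block_size, file_length - offset))
        for offset in range(0, file_length, block_size)
    ]
```

For a 10-byte file and a 4-byte block size, synthesize_ranges(10, 4) returns [(0, 4), (4, 4), (8, 2)]. Because the ranges depend only on the file length and the configured block size, they can be recomputed on demand rather than stored.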






[jira] [Commented] (IMPALA-8761) Configuration validation introduced in IMPALA-8559 can be improved

2019-09-19 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-8761?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16933612#comment-16933612
 ] 

ASF subversion and git services commented on IMPALA-8761:
-

Commit b2e5e942867ee1eeca754fd383075e7db3c8b590 in impala's branch 
refs/heads/master from Anurag Mantripragada
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=b2e5e94 ]

IMPALA-8761: Improve events processor configuration validation.

This patch aims to improve the validation of configuration keys
from the HMS.

IMPALA-8559 introduced configuration validation for events processor
configurations. In the existing implementation, the events processor
errors out as soon as it sees a validation failure. If there is
more than one configuration error, users may have to restart
HMS each time they fix one. This is a poor user
experience. This change collects all the configuration issues and
logs the expected values before erroring out. Users can now fix all
issues in one go.

Testing:
Added testValidateConfigs() to assert that multiple incorrect values
are detected at once.

Change-Id: I73480872ef93215d05c1fd922e64eb68a8a63a42
Reviewed-on: http://gerrit.cloudera.org:8080/14240
Reviewed-by: Bharath Vissapragada 
Tested-by: Impala Public Jenkins 
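
The collect-then-fail approach described in the commit message above can be sketched as follows. This is a generic Python illustration, not the Java code in the patch: validate_configs and ConfigValidationError are hypothetical names, and the message format is an assumption.

```python
class ConfigValidationError(Exception):
    """Carries every validation problem found, not just the first one."""

    def __init__(self, problems):
        super().__init__("; ".join(problems))
        self.problems = problems


def validate_configs(actual, expected):
    """Compare actual config values with expected ones and fail once."""
    problems = []
    for key, want in sorted(expected.items()):
        got = actual.get(key)
        if got != want:
            # Record the expected value so the user can fix it directly.
            problems.append("%s: expected %r, found %r" % (key, want, got))
    if problems:
        raise ConfigValidationError(problems)
```

A caller that catches ConfigValidationError can log the whole problems list at once, so a single restart is enough after all reported values have been corrected.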


> Configuration validation introduced in IMPALA-8559 can be improved
> --
>
> Key: IMPALA-8761
> URL: https://issues.apache.org/jira/browse/IMPALA-8761
> Project: IMPALA
>  Issue Type: Sub-task
>Reporter: Vihang Karajgaonkar
>Assignee: Anurag Mantripragada
>Priority: Major
>
> The issue with configuration validation in IMPALA-8559 is that it validates 
> one configuration at a time and fails as soon as there is a validation error. 
> Since there is more than one configuration key to validate, the user may have 
> to restart HMS again and again if multiple configuration changes 
> are needed. This is not a great user experience. A simple improvement 
> is to run all the configuration validations together and then 
> present the results together in case of failures, so that the user can make all 
> the required changes in one go.






[jira] [Commented] (IMPALA-8948) [DOCS] Review "How Impala Works with Hadoop File Formats"

2019-09-19 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-8948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16933611#comment-16933611
 ] 

ASF subversion and git services commented on IMPALA-8948:
-

Commit c0646a6c2f7f177e3eb2994be9597a04e72ee198 in impala's branch 
refs/heads/master from Alex Rodoni
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=c0646a6 ]

IMPALA-8948: [DOCS] Impala cannot write to compressed text file

Change-Id: I7eac0431f3daeb1c3102c6a58670bce0e899d5f2
Reviewed-on: http://gerrit.cloudera.org:8080/14237
Tested-by: Impala Public Jenkins 
Reviewed-by: Vincent Tran 
Reviewed-by: Tim Armstrong 


> [DOCS]  Review "How Impala Works with Hadoop File Formats"
> --
>
> Key: IMPALA-8948
> URL: https://issues.apache.org/jira/browse/IMPALA-8948
> Project: IMPALA
>  Issue Type: Bug
>  Components: Docs
>Affects Versions: Impala 3.2.0
>Reporter: Vincent Tran
>Assignee: Alex Rodoni
>Priority: Minor
> Fix For: Impala 3.4.0
>
>
> Ref: 
> [https://impala.apache.org/docs/build/html/topics/impala_file_formats.html]
>  
> In the "Impala Can INSERT?" column of the file type support matrix for Text, 
> we claim that Impala can insert into a compressed-text table: "Yes: {{CREATE 
> TABLE}}, {{INSERT}}, {{LOAD DATA}}, and query."
>  
> This doesn't appear to be the case as Impala does not support the writing of 
> compressed text in any version at the time of this writing.






[jira] [Commented] (IMPALA-8944) Update and re-enable S3PlannerTest

2019-09-19 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-8944?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16933614#comment-16933614
 ] 

ASF subversion and git services commented on IMPALA-8944:
-

Commit feed25084a999fe0a4e7b58b5264fce5829c43e7 in impala's branch 
refs/heads/master from stakiar
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=feed250 ]

IMPALA-8944: Update and re-enable S3PlannerTest

Addresses several test infra issues that were preventing the
S3PlannerTest from running successfully. Disables a few tests that are
no longer working, and removes some planner checks that are no longer
applicable when running on S3. Specifically, this patch removes the
checks in PlannerTestBase#checkScanRangeLocations when running against
S3, because the planner no longer generates scan ranges; generation is
deferred to the scheduler (IMPALA-5931).

Replaces the old logic of specifying S3-specific fe/ tests with a
combination of JUnit Categories and Maven Profiles. The previous method
was broken and assumed that all S3-specific fe/ tests started with S3*.
The new approach removes that restriction and only requires S3-specific
JUnit tests to be tagged with the Java annotation
'@Category(S3Tests.class)' (entire classes or individual tests can be
tagged with the annotation).

Testing:
* Ran fe/ tests with TARGET_FILESYSTEM=s3

Change-Id: I1690b6c5346376cfd4845c72062cc237e0f9
Reviewed-on: http://gerrit.cloudera.org:8080/14248
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 


> Update and re-enable S3PlannerTest
> --
>
> Key: IMPALA-8944
> URL: https://issues.apache.org/jira/browse/IMPALA-8944
> Project: IMPALA
>  Issue Type: Test
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
>Priority: Major
>
> It looks like we don't run {{S3PlannerTest}} in our regular Jenkins jobs. 
> When run against an HDFS mini-cluster, the tests are skipped because the 
> {{TARGET_FILESYSTEM}} is not S3. On our S3 jobs, they don't run either 
> because we skip all fe/ tests (most of them don't work against S3 / assume 
> they are running on HDFS).
> A few things need to be fixed to get this working:
> * The test cases in {{S3PlannerTest}} need to be fixed
> * The Jenkins job that runs the S3 tests needs the ability to run specific 
> fe/ tests (e.g. just the {{S3PlannerTest}}) and to skip the rest






[jira] [Commented] (IMPALA-8559) Support config validation for event processor on HMS-3

2019-09-19 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-8559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16933613#comment-16933613
 ] 

ASF subversion and git services commented on IMPALA-8559:
-

Commit b2e5e942867ee1eeca754fd383075e7db3c8b590 in impala's branch 
refs/heads/master from Anurag Mantripragada
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=b2e5e94 ]

IMPALA-8761: Improve events processor configuration validation.

This patch aims to improve the validation of configuration keys
from the HMS.

IMPALA-8559 introduced configuration validation for events processor
configurations. In the existing implementation, the events processor
errors out as soon as it sees a validation failure. If there is
more than one configuration error, users may have to restart
HMS each time they fix one. This is a poor user
experience. This change collects all the configuration issues and
logs the expected values before erroring out. Users can now fix all
issues in one go.

Testing:
Added testValidateConfigs() to assert that multiple incorrect values
are detected at once.

Change-Id: I73480872ef93215d05c1fd922e64eb68a8a63a42
Reviewed-on: http://gerrit.cloudera.org:8080/14240
Reviewed-by: Bharath Vissapragada 
Tested-by: Impala Public Jenkins 
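
The collect-then-fail behaviour described in the commit message can be sketched roughly as below. This is an illustration only, not the code from the patch: ConfigCheck, validate_configs, ensure_valid and the example.key.* names are all hypothetical, and the real validation lives in Impala's Java frontend.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfigCheck:
    """One expected key/value pair (hypothetical helper, not an Impala class)."""
    key: str
    expected: str


def validate_configs(actual, checks):
    """Return one message per failed check instead of stopping at the first."""
    errors = []
    for check in checks:
        value = actual.get(check.key)
        if value != check.expected:
            errors.append(
                f"{check.key}: expected {check.expected!r}, found {value!r}")
    return errors


def ensure_valid(actual, checks):
    """Report every problem in a single error so all can be fixed in one go."""
    errors = validate_configs(actual, checks)
    if errors:
        raise ValueError(
            "Invalid events processor configuration:\n" + "\n".join(errors))


# Placeholder keys, assumed for illustration only.
checks = [ConfigCheck("example.key.a", "true"),
          ConfigCheck("example.key.b", "false")]
problems = validate_configs({"example.key.a": "false"}, checks)
```

Here both the wrong value and the missing key end up in the same report, which is the point of the change: one restart instead of one per error.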


> Support config validation for event processor on HMS-3
> --
>
> Key: IMPALA-8559
> URL: https://issues.apache.org/jira/browse/IMPALA-8559
> Project: IMPALA
>  Issue Type: Sub-task
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
>Priority: Major
> Fix For: Impala 3.3.0
>
>
> HMS-3 does not need certain configuration validations which are currently 
> added to {{EventProcessorConfigValidation}}. We should use the metastore shim 
> to select the ones which are interesting when running against HMS-3.
> Also, it looks like HMS-3 adds authorization against the notification API, so 
> we should set {{hive.metastore.event.db.notification.api.auth}} to 
> {{false}} in the mini-cluster hive-site.xml.






[jira] [Commented] (IMPALA-8953) Tables and Databases sharing same name can cause query failures if table is not readable by Impala

2019-09-19 Thread Tim Armstrong (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-8953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16933609#comment-16933609
 ] 

Tim Armstrong commented on IMPALA-8953:
---

[~vihangk1] fyi

> Tables and Databases sharing same name can cause query failures if table is 
> not readable by Impala
> --
>
> Key: IMPALA-8953
> URL: https://issues.apache.org/jira/browse/IMPALA-8953
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Catalog
>Reporter: Karan Datta
>Priority: Minor
>
> Please see the scenario below:
> 1) CREATE DATABASE a;
> 2) Create tables "a.b" and "default.a". However, create "default.a" in a 
> format which Impala cannot read. For example: Open CSV Serde from Hive.
> 3) Connect to Impala and execute the statements below:
> a) INVALIDATE METADATA;
> b) use default;
> c) select * from a.b;
> The above query leads to the following exception:
> CAUSED BY: TableLoadingException: Unrecognized table type for table: default.a
> If the current database for the query "select * from a.b;" is changed to any 
> other database, the query executes successfully. For example:
> use test1 or use test;
> select * from a.b;
> Also, if the table "default.a" is created with a format that Impala can 
> read/load, the query "select * from a.b" executes successfully.
> This occurs because "a.b" can refer either to the collection column b in table 
> a in database default, or to table b in database a. As Impala 
> cannot disambiguate between the two cases unless it knows whether there is a 
> collection column b in default.a, it loads the metadata of table "default.a" 
> while executing the query on table "a.b". If the table "default.a" exists and is 
> not readable by Impala, any query executed from the default database on tables 
> in database "a" will fail. For a better UX, we should 
> improve/change this behavior.
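
The ambiguity described above can be illustrated with a small sketch. It only enumerates the two readings of a dotted path; candidate_table_refs is a hypothetical helper, and this is not how Impala's analyzer is actually implemented.

```python
def candidate_table_refs(path, current_db):
    """List the (database, table, nested field path) readings of a dotted path.

    Hypothetical helper: the first component may be a table in the current
    database (with the rest a collection-column path), or the first two
    components may be a fully qualified database.table name.
    """
    parts = path.split(".")
    candidates = [(current_db, parts[0], parts[1:])]
    if len(parts) >= 2:
        candidates.append((parts[0], parts[1], parts[2:]))
    return candidates


from_default = candidate_table_refs("a.b", "default")
from_test1 = candidate_table_refs("a.b", "test1")
```

With the current database set to default, one reading is table default.a with collection column b, so that table's metadata has to be inspected. From test1 the corresponding reading is test1.a, which does not exist in the reported scenario, so only a.b needs to be loaded.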






[jira] [Updated] (IMPALA-8953) Tables and Databases sharing same name can cause query failures if table is not readable by Impala

2019-09-19 Thread Tim Armstrong (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8953?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Armstrong updated IMPALA-8953:
--
Component/s: (was: Frontend)
 Catalog

> Tables and Databases sharing same name can cause query failures if table is 
> not readable by Impala
> --
>
> Key: IMPALA-8953
> URL: https://issues.apache.org/jira/browse/IMPALA-8953
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Catalog
>Reporter: Karan Datta
>Priority: Minor
>
> Please see the scenario below:
> 1) CREATE DATABASE a;
> 2) Create tables "a.b" and "default.a". However, create "default.a" in a 
> format which Impala cannot read. For example: Open CSV Serde from Hive.
> 3) Connect to Impala and execute the statements below:
> a) INVALIDATE METADATA;
> b) use default;
> c) select * from a.b;
> The above query leads to the following exception:
> CAUSED BY: TableLoadingException: Unrecognized table type for table: default.a
> If the current database for the query "select * from a.b;" is changed to any 
> other database, the query executes successfully. For example:
> use test1 or use test;
> select * from a.b;
> Also, if the table "default.a" is created with a format that Impala can 
> read/load, the query "select * from a.b" executes successfully.
> This occurs because "a.b" can refer either to the collection column b in table 
> a in database default, or to table b in database a. As Impala 
> cannot disambiguate between the two cases unless it knows whether there is a 
> collection column b in default.a, it loads the metadata of table "default.a" 
> while executing the query on table "a.b". If the table "default.a" exists and is 
> not readable by Impala, any query executed from the default database on tables 
> in database "a" will fail. For a better UX, we should 
> improve/change this behavior.






[jira] [Updated] (IMPALA-8953) Tables and Databases sharing same name can cause query failures if table is not readable by Impala

2019-09-19 Thread Tim Armstrong (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8953?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Armstrong updated IMPALA-8953:
--
Component/s: Frontend

> Tables and Databases sharing same name can cause query failures if table is 
> not readable by Impala
> --
>
> Key: IMPALA-8953
> URL: https://issues.apache.org/jira/browse/IMPALA-8953
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Frontend
>Reporter: Karan Datta
>Priority: Minor
>
> Please see the scenario below:
> 1) CREATE DATABASE a;
> 2) Create tables "a.b" and "default.a". However, create "default.a" in a 
> format which Impala cannot read. For example: Open CSV Serde from Hive.
> 3) Connect to Impala and execute the statements below:
> a) INVALIDATE METADATA;
> b) use default;
> c) select * from a.b;
> The above query leads to the following exception:
> CAUSED BY: TableLoadingException: Unrecognized table type for table: default.a
> If the current database for the query "select * from a.b;" is changed to any 
> other database, the query executes successfully. For example:
> use test1 or use test;
> select * from a.b;
> Also, if the table "default.a" is created with a format that Impala can 
> read/load, the query "select * from a.b" executes successfully.
> This occurs because "a.b" can refer either to the collection column b in table 
> a in database default, or to table b in database a. As Impala 
> cannot disambiguate between the two cases unless it knows whether there is a 
> collection column b in default.a, it loads the metadata of table "default.a" 
> while executing the query on table "a.b". If the table "default.a" exists and is 
> not readable by Impala, any query executed from the default database on tables 
> in database "a" will fail. For a better UX, we should 
> improve/change this behavior.






[jira] [Commented] (IMPALA-8956) Row count incorrect in summary while query running

2019-09-19 Thread Tim Armstrong (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-8956?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16933515#comment-16933515
 ] 

Tim Armstrong commented on IMPALA-8956:
---

IMPALA-8026 should have fixed it

> Row count incorrect in summary while query running
> --
>
> Key: IMPALA-8956
> URL: https://issues.apache.org/jira/browse/IMPALA-8956
> Project: IMPALA
>  Issue Type: Bug
>Reporter: Peter Ebert
>Priority: Major
> Attachments: image-2019-09-18-20-48-39-069.png, 
> image-2019-09-18-21-28-33-720.png
>
>
> For the query below, the summary shows an incorrect row count for the nested 
> loop join while the query is running; once the query completes, it is corrected:
> select g.name, g.start, g.aa, sum(case when p.pn < 125.0 then 1 else 0 end) / 
> cast(max(lf_count) as decimal(10,6)) as lf, sum(case when p.pn >= 125.0 then 
> 1 else 0 end) / cast(max(hf_count) as decimal(10,6)) as hff from g join p on 
> (g.sampleid = p.sample_id) cross join (select count(*) as lf_count from p 
> where pn < 125.0) lff cross join (select count(*) as hf_count from p where pn 
> >= 125.0) hf group by g.name, g.start, g.aa order by name desc
>  
> !image-2019-09-18-20-48-39-069.png!
> compared to the finished query
> !image-2019-09-18-21-28-33-720.png!
>  
>  






[jira] [Commented] (IMPALA-2138) Get rid of unused columns by upstream operators at points of materialization

2019-09-19 Thread Tim Armstrong (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-2138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16933503#comment-16933503
 ] 

Tim Armstrong commented on IMPALA-2138:
---

[~stiga-huang] I used the bin/single_node_perf_run.py script in the Impala 
repository.

> Get rid of unused columns by upstream operators at points of materialization
> 
>
> Key: IMPALA-2138
> URL: https://issues.apache.org/jira/browse/IMPALA-2138
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Frontend
>Affects Versions: Impala 1.4, Impala 2.0, Impala 2.2
>Reporter: Ippokratis Pandis
>Assignee: Tim Armstrong
>Priority: Major
>  Labels: performance
> Attachments: 0001-Projection-prototype.patch, performance_result.txt
>
>
> It would be a very good performance improvement if we were able to get rid of 
> columns as soon as we know that they are not going to be used by any other 
> operators upstream. The amount of data we handle would shrink, making the 
> network and I/O (spilling) transfers more efficient. It would also improve 
> cache performance. 
> The current row-wise in-memory format does not make it very easy to get rid 
> of such unused columns. However, there are points of materialization where we 
> copy-out the tuples and we can actually perform these projections. There are 
> multiple points of materialization, notably:
> * The exchange operator
> * The build side of hash join
> * The probe side of hash join when we have spilling
> * The aggregation
> * Sorts and analytic function evaluation
> In order to do these projections we need to modify the FE so that it knows, at 
> each operator, the minimum set of columns that are referenced by this 
> operator and all the upstream ones. (That minimum set is easy to 
> calculate during an additional top-down traversal of the plan.) We also need 
> to modify the BE and make the copy-out operation aware of such projections.
> Assigning first to Alex, because of the needed FE changes. Happy to take care 
> of the needed BE changes. Perhaps we could split this issue into 2 sub-tasks, 
> the FE and the BE changes.
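
The top-down traversal mentioned in the description could look roughly like the sketch below. It is a simplified illustration, not the attached prototype patch: PlanNode and required_columns are hypothetical names, node names are assumed unique, and columns produced by intermediate operators are ignored.

```python
from dataclasses import dataclass, field


@dataclass
class PlanNode:
    """Minimal stand-in for a plan operator (hypothetical, not Impala's class)."""
    name: str
    referenced: frozenset                      # columns this operator reads
    children: list = field(default_factory=list)


def required_columns(root):
    """Map each node name to the columns used by it or by any operator above it."""
    result = {}

    def visit(node, from_above):
        needed = from_above | node.referenced
        result[node.name] = needed
        for child in node.children:
            visit(child, needed)

    visit(root, frozenset())
    return result


scan_left = PlanNode("scan_left", frozenset())
scan_right = PlanNode("scan_right", frozenset())
join = PlanNode("join", frozenset({"k1", "k2"}), [scan_left, scan_right])
agg = PlanNode("agg", frozenset({"c1"}), [join])
needed = required_columns(agg)
```

At a materialization point such as the join's build side, any column outside the computed set for that node could be dropped when tuples are copied out.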






[jira] [Commented] (IMPALA-8006) Consider replacing the modulus in KrpcDatastreamSender with fast mod

2019-09-19 Thread Norbert Luksa (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-8006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16933446#comment-16933446
 ] 

Norbert Luksa commented on IMPALA-8006:
---

Ran a few tests on a mini-benchmark, with dummy data on my PC (Intel(R) 
Core(TM) i7-8700 CPU @ 3.20GHz), just to see if there's a real difference 
between the ordinary modulo and Lemire's alternative, but couldn't find any 
significant difference (or at least it seems insignificant to me).

Called each version a billion times with values ranging between 2^31 and 2^32; 
fast mod ran for 22.730s and modulo for 23.732s.

(Also had a look at the generated assembly to make sure the compiler didn't 
optimize them out.)
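
For readers unfamiliar with the technique, here is a minimal sketch of one of Lemire's alternatives to modulo, the multiply-shift range reduction. It is an assumption that this is the variant benchmarked above; the comment does not say which one was used, and the actual test was presumably written in C++. The name fast_range32 is hypothetical.

```python
MASK32 = 0xFFFFFFFF


def fast_range32(x, n):
    """Map a 32-bit value x into [0, n) with a multiply and a shift.

    Assumes 0 < n < 2**32. The result is generally NOT equal to x % n; it is
    only a similarly uniform mapping into the same range, which is enough
    for picking a partition from a hash value.
    """
    return ((x & MASK32) * n) >> 32


# Every 32-bit input lands in [0, n), like x % n would.
samples = [0, 1, 2**31, 2**31 + 12345, 2**32 - 1]
buckets = [fast_range32(x, 10) for x in samples]
```

The mapping uses the high bits of the input rather than the low bits, so it only distributes well when x is a well-mixed hash.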

> Consider replacing the modulus in KrpcDatastreamSender with fast mod 
> -
>
> Key: IMPALA-8006
> URL: https://issues.apache.org/jira/browse/IMPALA-8006
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Distributed Exec
>Affects Versions: Impala 3.1.0
>Reporter: Michael Ho
>Priority: Major
>  Labels: ramp-up
>
> [~tlipcon] pointed out that there is a potential improvement which can be 
> implemented for the modulus used in our sender for the partitioning 
> exchanges: 
> http://www.idryman.org/blog/2017/05/03/writing-a-damn-fast-hash-table-with-tiny-memory-footprints/
>  (Optimizing Division for Hash Table Size).
> We should evaluate its effectiveness and implement it for 
> KrpcDataStreamSender if appropriate.






[jira] [Resolved] (IMPALA-8875) TestHmsIntegration.test_drop_column_maintains_stats seems flaky

2019-09-19 Thread Csaba Ringhofer (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8875?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Csaba Ringhofer resolved IMPALA-8875.
-
Resolution: Fixed

> TestHmsIntegration.test_drop_column_maintains_stats seems flaky
> ---
>
> Key: IMPALA-8875
> URL: https://issues.apache.org/jira/browse/IMPALA-8875
> Project: IMPALA
>  Issue Type: Bug
>Reporter: Fang-Yu Rao
>Assignee: Csaba Ringhofer
>Priority: Blocker
>  Labels: broken-build, impala-stats
>
> The test TestHmsIntegration.test_drop_column_maintains_stats seems flaky. 
> The related test file was updated recently due to 
> https://issues.apache.org/jira/browse/IMPALA-8823. Creating this JIRA to track 
> this failed test. Maybe [~gaborkaszab] you could take a brief look at this? 
> Thanks!
> The error messages are provided below.
> {code:java}
> Error Message 
> assert {'avg_col_len...ializer', ...} == {'COLUMN_STATS...me': 'x', ...} 
> Common items: {'avg_col_len': '', 'bitVector': '', 'col_name': 'x', 
> 'comment': 'from deserializer', 'data_type': 'int', 'distinct_count': '0', 
> 'max': '0', 'max_col_len': '', 'min': '0', 'num_falses': '', 'num_nulls': 
> '0', 'num_trues': ''} Right contains more items: {'COLUMN_STATS_ACCURATE': 
> '{}'} Full diff: + {'COLUMN_STATS_ACCURATE': '{}', - {'avg_col_len': '', ? ^ 
> + 'avg_col_len': '', ? ^ 'bitVector': '', 'col_name': 'x', 'comment': 'from 
> deserializer', 'data_type': 'int', 'distinct_count': '0', 'max': '0', 
> 'max_col_len': '', 'min': '0', 'num_falses': '', 'num_nulls': '0', 
> 'num_trues': ''}
> {code}
> The stack trace is given as follows.
> {code:java}
> Stacktrace
> metadata/test_hms_integration.py:390: in test_drop_column_maintains_stats
> assert hive_x_stats == self.hive_column_stats(table_name, 'x')
> E   assert {'avg_col_len...ializer', ...} == {'COLUMN_STATS...me': 'x', ...}
> E Common items:
> E {'avg_col_len': '',
> E  'bitVector': '',
> E  'col_name': 'x',
> E  'comment': 'from deserializer',
> E  'data_type': 'int',
> E  'distinct_count': '0',
> E  'max': '0',
> E  'max_col_len': '',
> E  'min': '0',
> E  'num_falses': '',
> E  'num_nulls': '0',
> E  'num_trues': ''}
> E Right contains more items:
> E {'COLUMN_STATS_ACCURATE': '{}'}
> E Full diff:
> E + {'COLUMN_STATS_ACCURATE': '{}',
> E - {'avg_col_len': '',
> E ? ^
> E +  'avg_col_len': '',
> E ? ^
> E 'bitVector': '',
> E 'col_name': 'x',
> E 'comment': 'from deserializer',
> E 'data_type': 'int',
> E 'distinct_count': '0',
> E 'max': '0',
> E 'max_col_len': '',
> E 'min': '0',
> E 'num_falses': '',
> E 'num_nulls': '0',
> E 'num_trues': ''}
> {code}






[jira] [Commented] (IMPALA-8875) TestHmsIntegration.test_drop_column_maintains_stats seems flaky

2019-09-19 Thread Csaba Ringhofer (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-8875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16933390#comment-16933390
 ] 

Csaba Ringhofer commented on IMPALA-8875:
-

[~boroknagyz] Yes, I just forgot to resolve it, thanks for reminding me.

> TestHmsIntegration.test_drop_column_maintains_stats seems flaky
> ---
>
> Key: IMPALA-8875
> URL: https://issues.apache.org/jira/browse/IMPALA-8875
> Project: IMPALA
>  Issue Type: Bug
>Reporter: Fang-Yu Rao
>Assignee: Csaba Ringhofer
>Priority: Blocker
>  Labels: broken-build, impala-stats
>
> The test TestHmsIntegration.test_drop_column_maintains_stats seems flaky. 
> The related test file was updated recently due to 
> https://issues.apache.org/jira/browse/IMPALA-8823. Creating this JIRA to track 
> this failed test. Maybe [~gaborkaszab] you could take a brief look at this? 
> Thanks!
> The error messages are provided below.
> {code:java}
> Error Message 
> assert {'avg_col_len...ializer', ...} == {'COLUMN_STATS...me': 'x', ...} 
> Common items: {'avg_col_len': '', 'bitVector': '', 'col_name': 'x', 
> 'comment': 'from deserializer', 'data_type': 'int', 'distinct_count': '0', 
> 'max': '0', 'max_col_len': '', 'min': '0', 'num_falses': '', 'num_nulls': 
> '0', 'num_trues': ''} Right contains more items: {'COLUMN_STATS_ACCURATE': 
> '{}'} Full diff: + {'COLUMN_STATS_ACCURATE': '{}', - {'avg_col_len': '', ? ^ 
> + 'avg_col_len': '', ? ^ 'bitVector': '', 'col_name': 'x', 'comment': 'from 
> deserializer', 'data_type': 'int', 'distinct_count': '0', 'max': '0', 
> 'max_col_len': '', 'min': '0', 'num_falses': '', 'num_nulls': '0', 
> 'num_trues': ''}
> {code}
> The stack trace is given as follows.
> {code:java}
> Stacktrace
> metadata/test_hms_integration.py:390: in test_drop_column_maintains_stats
> assert hive_x_stats == self.hive_column_stats(table_name, 'x')
> E   assert {'avg_col_len...ializer', ...} == {'COLUMN_STATS...me': 'x', ...}
> E Common items:
> E {'avg_col_len': '',
> E  'bitVector': '',
> E  'col_name': 'x',
> E  'comment': 'from deserializer',
> E  'data_type': 'int',
> E  'distinct_count': '0',
> E  'max': '0',
> E  'max_col_len': '',
> E  'min': '0',
> E  'num_falses': '',
> E  'num_nulls': '0',
> E  'num_trues': ''}
> E Right contains more items:
> E {'COLUMN_STATS_ACCURATE': '{}'}
> E Full diff:
> E + {'COLUMN_STATS_ACCURATE': '{}',
> E - {'avg_col_len': '',
> E ? ^
> E +  'avg_col_len': '',
> E ? ^
> E 'bitVector': '',
> E 'col_name': 'x',
> E 'comment': 'from deserializer',
> E 'data_type': 'int',
> E 'distinct_count': '0',
> E 'max': '0',
> E 'max_col_len': '',
> E 'min': '0',
> E 'num_falses': '',
> E 'num_nulls': '0',
> E 'num_trues': ''}
> {code}






[jira] [Commented] (IMPALA-2138) Get rid of unused columns by upstream operators at points of materialization

2019-09-19 Thread Quanlong Huang (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-2138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16933386#comment-16933386
 ] 

Quanlong Huang commented on IMPALA-2138:


[~tarmstrong]. That's awesome! Is the tool for perf tests open-sourced?

> Get rid of unused columns by upstream operators at points of materialization
> 
>
> Key: IMPALA-2138
> URL: https://issues.apache.org/jira/browse/IMPALA-2138
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Frontend
>Affects Versions: Impala 1.4, Impala 2.0, Impala 2.2
>Reporter: Ippokratis Pandis
>Assignee: Tim Armstrong
>Priority: Major
>  Labels: performance
> Attachments: 0001-Projection-prototype.patch, performance_result.txt
>
>
> It would be a very good performance improvement if we were able to get rid of 
> columns as soon as we know that they are not going to be used by any other 
> operators upstream. The amount of data we handle would shrink, making the 
> network and I/O (spilling) transfers more efficient. It would also improve 
> cache performance. 
> The current row-wise in-memory format does not make it very easy to get rid 
> of such unused columns. However, there are points of materialization where we 
> copy-out the tuples and we can actually perform these projections. There are 
> multiple points of materialization, notably:
> * The exchange operator
> * The build side of hash join
> * The probe side of hash join when we have spilling
> * The aggregation
> * Sorts and analytic function evaluation
> In order to do these projections we need to modify the FE so that it knows, at 
> each operator, the minimum set of columns that are referenced by this 
> operator and all the upstream ones. (That minimum set is easy to 
> calculate during an additional top-down traversal of the plan.) We also need 
> to modify the BE and make the copy-out operation aware of such projections.
> Assigning first to Alex, because of the needed FE changes. Happy to take care 
> of the needed BE changes. Perhaps we could split this issue into 2 sub-tasks, 
> the FE and the BE changes.






[jira] [Commented] (IMPALA-8875) TestHmsIntegration.test_drop_column_maintains_stats seems flaky

2019-09-19 Thread Zoltán Borók-Nagy (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-8875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16933242#comment-16933242
 ] 

Zoltán Borók-Nagy commented on IMPALA-8875:
---

[~csringhofer], can we close this?

> TestHmsIntegration.test_drop_column_maintains_stats seems flaky
> ---
>
> Key: IMPALA-8875
> URL: https://issues.apache.org/jira/browse/IMPALA-8875
> Project: IMPALA
>  Issue Type: Bug
>Reporter: Fang-Yu Rao
>Assignee: Csaba Ringhofer
>Priority: Blocker
>  Labels: broken-build, impala-stats
>
> The test TestHmsIntegration.test_drop_column_maintains_stats seems flaky. 
> The related test file was updated recently due to 
> https://issues.apache.org/jira/browse/IMPALA-8823. Creating this JIRA to track 
> this failed test. Maybe [~gaborkaszab] you could take a brief look at this? 
> Thanks!
> The error messages are provided below.
> {code:java}
> Error Message 
> assert {'avg_col_len...ializer', ...} == {'COLUMN_STATS...me': 'x', ...} 
> Common items: {'avg_col_len': '', 'bitVector': '', 'col_name': 'x', 
> 'comment': 'from deserializer', 'data_type': 'int', 'distinct_count': '0', 
> 'max': '0', 'max_col_len': '', 'min': '0', 'num_falses': '', 'num_nulls': 
> '0', 'num_trues': ''} Right contains more items: {'COLUMN_STATS_ACCURATE': 
> '{}'} Full diff: + {'COLUMN_STATS_ACCURATE': '{}', - {'avg_col_len': '', ? ^ 
> + 'avg_col_len': '', ? ^ 'bitVector': '', 'col_name': 'x', 'comment': 'from 
> deserializer', 'data_type': 'int', 'distinct_count': '0', 'max': '0', 
> 'max_col_len': '', 'min': '0', 'num_falses': '', 'num_nulls': '0', 
> 'num_trues': ''}
> {code}
> The stack trace is given as follows.
> {code:java}
> Stacktrace
> metadata/test_hms_integration.py:390: in test_drop_column_maintains_stats
> assert hive_x_stats == self.hive_column_stats(table_name, 'x')
> E   assert {'avg_col_len...ializer', ...} == {'COLUMN_STATS...me': 'x', ...}
> E Common items:
> E {'avg_col_len': '',
> E  'bitVector': '',
> E  'col_name': 'x',
> E  'comment': 'from deserializer',
> E  'data_type': 'int',
> E  'distinct_count': '0',
> E  'max': '0',
> E  'max_col_len': '',
> E  'min': '0',
> E  'num_falses': '',
> E  'num_nulls': '0',
> E  'num_trues': ''}
> E Right contains more items:
> E {'COLUMN_STATS_ACCURATE': '{}'}
> E Full diff:
> E + {'COLUMN_STATS_ACCURATE': '{}',
> E - {'avg_col_len': '',
> E ? ^
> E +  'avg_col_len': '',
> E ? ^
> E 'bitVector': '',
> E 'col_name': 'x',
> E 'comment': 'from deserializer',
> E 'data_type': 'int',
> E 'distinct_count': '0',
> E 'max': '0',
> E 'max_col_len': '',
> E 'min': '0',
> E 'num_falses': '',
> E 'num_nulls': '0',
> E 'num_trues': ''}
> {code}


