[jira] [Commented] (IMPALA-2563) Support LDAP search bind operations

2020-03-31 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-2563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17072339#comment-17072339
 ] 

ASF subversion and git services commented on IMPALA-2563:
-

Commit 4e6780ebf1dfa90aea01b3e35d3dc9ceb100eaee in impala's branch 
refs/heads/master from Thomas Tauber-Marshall
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=4e6780e ]

IMPALA-2563: Support LDAP search bind operations

This patch adds a number of new options for controlling LDAP
authentication by restricting it to particular users and/or members of
particular groups:
--ldap_group_filter: comma separated list of authorized groups
--ldap_user_filter: comma separated list of authorized users

There are also options to control how LDAP is searched when applying
these filters:
--ldap_group_dn_pattern
--ldap_group_membership_key
--ldap_group_membership_class

These options were modelled on equivalent options in Hive, see:
https://cwiki.apache.org/confluence/display/Hive/User+and+Group+Filter+Support+with+LDAP+Atn+Provider+in+HiveServer2
https://github.com/apache/hive/tree/master/service/src/java/org/apache/hive/service/auth/ldap

This patch also refactors LDAP related functionality into a utility
class, both to make authentication.cc more manageable and to
facilitate follow up work that will add LDAP authentication options
for the webserver.

Testing:
- Added a FE custom cluster test that sets --ldap_group_filter and
  --ldap_user_filter and verifies expected behavior.

Change-Id: I7502a96e9a3c16faa67c03ffac54df2bdebbca8c
Reviewed-on: http://gerrit.cloudera.org:8080/15570
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 
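
To illustrate how comma-separated user and group filters like the ones above could be applied, here is a minimal Python sketch. It is not the code from the patch: the function names are hypothetical, and the assumptions that an empty filter places no restriction and that a user must pass every filter that is set may not match Impala's actual behaviour.

```python
def parse_filter(flag_value):
    """Split a comma-separated flag value into a set of names, ignoring blanks."""
    return {item.strip() for item in flag_value.split(",") if item.strip()}


def is_authorized(user, user_groups, ldap_user_filter="", ldap_group_filter=""):
    """Return True if the user passes every filter that is set.

    Assumption (not confirmed by the commit message): an empty filter
    places no restriction, and when both filters are set the user must
    satisfy both of them.
    """
    allowed_users = parse_filter(ldap_user_filter)
    allowed_groups = parse_filter(ldap_group_filter)
    if allowed_users and user not in allowed_users:
        return False
    if allowed_groups and not (set(user_groups) & allowed_groups):
        return False
    return True


if __name__ == "__main__":
    # Hypothetical values, as they might be passed via --ldap_user_filter
    # and --ldap_group_filter.
    print(is_authorized("alice", ["eng"], "alice,bob", "eng"))
    print(is_authorized("carol", ["eng"], "alice,bob", "eng"))
```

In a real deployment the group membership would come from an LDAP search controlled by the --ldap_group_* options, which this sketch replaces with a plain list.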


> Support LDAP search bind operations
> ---
>
> Key: IMPALA-2563
> URL: https://issues.apache.org/jira/browse/IMPALA-2563
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Security
>Affects Versions: Impala 2.2.4
>Reporter: Mike Yoder
>Assignee: Thomas Tauber-Marshall
>Priority: Minor
>  Labels: security
>
> Today Impala supports a simple direct bind model. This improvement jira is to 
> bring Impala's LDAP model to be in line with Hive's. Please see in particular 
> https://issues.apache.org/jira/browse/HIVE-7193 and 
> https://cwiki.apache.org/confluence/display/Hive/User+and+Group+Filter+Support+with+LDAP+Atn+Provider+in+HiveServer2.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Commented] (IMPALA-9584) test_analytic_fns is flaky (small fractional differences in AVG)

2020-03-31 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-9584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17072338#comment-17072338
 ] 

ASF subversion and git services commented on IMPALA-9584:
-

Commit a08cd7f49bb9c69b05fabe9ccd18577cfd300b4e in impala's branch 
refs/heads/master from Csaba Ringhofer
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=a08cd7f ]

IMPALA-9584: remove flaky avg(TIMESTAMP) aggregates from test_analytic_fns

AVG(TIMESTAMP) is not deterministic, because it uses a double to sum
the timestamps, and adding doubles in a different order can lead to
different results. This does not cause problems for DOUBLE columns,
because the test framework does not require an exact match if the result
is a double. As AVG is the only function for TIMESTAMP with this problem,
reducing the precision of all timestamp checks seemed like
overkill.

As a short-term solution, I removed the problematic aggregates from the
tests.

Testing:
- ran only the related tests

Change-Id: I10e0027a64a4e430b7db3ed7c8d0cc8cdcb202e0
Reviewed-on: http://gerrit.cloudera.org:8080/15621
Reviewed-by: Tim Armstrong 
Tested-by: Impala Public Jenkins 
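
The order dependence described above can be reproduced with plain doubles. The following Python sketch is only an illustration of floating-point summation, not Impala code; the helper name `ordered_sum` is made up for this example.

```python
import math


def ordered_sum(values):
    """Accumulate into a double in exactly the order given."""
    total = 0.0
    for v in values:
        total += v
    return total


forward = ordered_sum([0.1, 0.2, 0.3])
backward = ordered_sum([0.3, 0.2, 0.1])

# The two sums differ in the last bits even though the inputs are the same.
print(forward == backward)
# A tolerance-based comparison, like the one the test framework applies to
# DOUBLE results, still treats them as equal.
print(math.isclose(forward, backward, rel_tol=1e-9))
```

An exact comparison of a value derived from such a sum, for example a timestamp printed to microsecond precision, can therefore flip depending on the order in which rows were aggregated.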


> test_analytic_fns is flaky (small fractional differences in AVG)
> 
>
> Key: IMPALA-9584
> URL: https://issues.apache.org/jira/browse/IMPALA-9584
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Reporter: Csaba Ringhofer
>Assignee: Csaba Ringhofer
>Priority: Blocker
>  Labels: broken-build, flaky-test
>
> The issue occurred here:
> https://jenkins.impala.io/job/ubuntu-16.04-dockerised-tests/2065/testReport/query_test.test_queries/TestQueries/test_analytic_fns_protocol__hs2___exec_optionbatch_size___0___num_nodes___0___disable_codegen_rows_threshold___0___disable_codegen___False___abort_on_error___1___exec_single_node_rows_threshold___0table_format__parquet_none_/
> The same test failed with all protocols.
> Failing query:
> https://github.com/apache/impala/blob/ebbe52b4bed944d3012e3679dc984827ce11d5a8/testdata/workloads/functional-query/queries/QueryTest/analytic-fns.test#L2
> Errors are like (hs2):
> 2009,3,6,3.667,2009-03-01 20:12:00.475000,'0','8' != 
> 2009,3,6,3.667,2009-03-01 20:12:00.474999,'0','8'
> I think that the problem is not the difference in the 4th double column, as 
> that depends on the client used and we do not require complete match during 
> comparison. So the problem is likely to be the timestamp in the 5th column.






[jira] [Updated] (IMPALA-9590) Resolve error when build tsan and ubsan on arm64

2020-03-31 Thread zhaorenhai (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-9590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhaorenhai updated IMPALA-9590:
---
Description: 
The TSAN build fails when compiling atomicops-internals-x86.cc, so on
arm64 that file should not be built.

The UBSAN build should also link against the aarch64 version of
libclang_rt.ubsan_standalone, not the x86 version.

  was:
Tsan build will fail on atomicops-internals-x86.cc build,

so if on arm64, just don't build it.


> Resolve error when build tsan and ubsan on arm64
> 
>
> Key: IMPALA-9590
> URL: https://issues.apache.org/jira/browse/IMPALA-9590
> Project: IMPALA
>  Issue Type: Sub-task
>Reporter: zhaorenhai
>Assignee: zhaorenhai
>Priority: Major
>
> The TSAN build fails when compiling atomicops-internals-x86.cc, so on
> arm64 that file should not be built.
> The UBSAN build should also link against the aarch64 version of
> libclang_rt.ubsan_standalone, not the x86 version.






[jira] [Updated] (IMPALA-9590) Resolve error when build tsan and ubsan on arm64

2020-03-31 Thread zhaorenhai (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-9590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhaorenhai updated IMPALA-9590:
---
Summary: Resolve error when build tsan and ubsan on arm64  (was: Resolve 
error when build tsan on arm64)

> Resolve error when build tsan and ubsan on arm64
> 
>
> Key: IMPALA-9590
> URL: https://issues.apache.org/jira/browse/IMPALA-9590
> Project: IMPALA
>  Issue Type: Sub-task
>Reporter: zhaorenhai
>Assignee: zhaorenhai
>Priority: Major
>
> Tsan build will fail on atomicops-internals-x86.cc build,
> so if on arm64, just don't build it.






[jira] [Created] (IMPALA-9590) Resolve error when build tsan on arm64

2020-03-31 Thread zhaorenhai (Jira)
zhaorenhai created IMPALA-9590:
--

 Summary: Resolve error when build tsan on arm64
 Key: IMPALA-9590
 URL: https://issues.apache.org/jira/browse/IMPALA-9590
 Project: IMPALA
  Issue Type: Sub-task
Reporter: zhaorenhai
Assignee: zhaorenhai


The TSAN build fails when compiling atomicops-internals-x86.cc, so on
arm64 that file should not be built.






[jira] [Created] (IMPALA-9589) Parser failures on valid SQL

2020-03-31 Thread David Rorke (Jira)
David Rorke created IMPALA-9589:
---

 Summary: Parser failures on valid SQL
 Key: IMPALA-9589
 URL: https://issues.apache.org/jira/browse/IMPALA-9589
 Project: IMPALA
  Issue Type: Bug
  Components: Frontend
Affects Versions: Impala 3.3.0
Reporter: David Rorke


The Impala parser fails to parse the following valid SQL:

 
{noformat}
SELECT 
  * 
FROM 
  (
(
  SELECT 
`sr_item_sk`, 
`sr_reason_sk`, 
`sr_ticket_number`, 
`sr_return_quantity` 
  FROM 
`tpcds_1_decimal_parquet`.`store_returns` 
  WHERE 
`sr_reason_sk` IS NOT NULL
) AS `t1` 
INNER JOIN (
  SELECT 
`r_reason_sk` 
  FROM 
`tpcds_1_decimal_parquet`.`reason` 
  WHERE 
`r_reason_desc` = 'reason 66'
) AS `t3` ON `t1`.`sr_reason_sk` = `t3`.`r_reason_sk`
  );

ERROR: ParseException: Syntax error in line 14:
) AS `t1`
  ^
Encountered: AS
Expected: LIMIT, ORDER, UNION
{noformat}
 

The query parses fine when you remove the outer set of parentheses:
{noformat}
SELECT 
  * 
FROM 
  (
SELECT 
  `sr_item_sk`, 
  `sr_reason_sk`, 
  `sr_ticket_number`, 
  `sr_return_quantity` 
FROM 
  `tpcds_1_decimal_parquet`.`store_returns` 
WHERE 
  `sr_reason_sk` IS NOT NULL
  ) AS `t1` 
  INNER JOIN (
SELECT 
  `r_reason_sk` 
FROM 
  `tpcds_1_decimal_parquet`.`reason` 
WHERE 
  `r_reason_desc` = 'reason 66'
  ) AS `t3` ON `t1`.`sr_reason_sk` = `t3`.`r_reason_sk`;
{noformat}
 

The failing query is a simplified subset of the following query (a rewrite of 
TPC-DS query 93) which also fails to parse:
{noformat}
Query: SELECT `t`.`ss_customer_sk`,
  SUM(
CASE WHEN `t1`.`sr_return_quantity` IS NOT NULL THEN CAST(
  `t`.`ss_quantity` - `t1`.`sr_return_quantity` AS DECIMAL(10, 0)
) * `t`.`ss_sales_price` ELSE CAST(
  `t`.`ss_quantity` AS DECIMAL(10, 0)
) * `t`.`ss_sales_price` END
  ) AS `$f1`
FROM
  (
SELECT
  `ss_item_sk`,
  `ss_customer_sk`,
  `ss_ticket_number`,
  `ss_quantity`,
  `ss_sales_price`
FROM
  `tpcds_1_decimal_parquet`.`store_sales`
  ) AS `t`
  INNER JOIN (
(
  SELECT
`sr_item_sk`,
`sr_reason_sk`,
`sr_ticket_number`,
`sr_return_quantity`
  FROM
`tpcds_1_decimal_parquet`.`store_returns`
  WHERE
`sr_reason_sk` IS NOT NULL
) AS `t1`
INNER JOIN (
  SELECT
`r_reason_sk`
  FROM
`tpcds_1_decimal_parquet`.`reason`
  WHERE
`r_reason_desc` = 'reason 66'
) AS `t3` ON `t1`.`sr_reason_sk` = `t3`.`r_reason_sk`
  ) ON `t`.`ss_item_sk` = `t1`.`sr_item_sk`
  AND `t`.`ss_ticket_number` = `t1`.`sr_ticket_number`
GROUP BY
  `t`.`ss_customer_sk`
ORDER BY
  SUM(
CASE WHEN `t1`.`sr_return_quantity` IS NOT NULL THEN CAST(
  `t`.`ss_quantity` - `t1`.`sr_return_quantity` AS DECIMAL(10, 0)
) * `t`.`ss_sales_price` ELSE CAST(
  `t`.`ss_quantity` AS DECIMAL(10, 0)
) * `t`.`ss_sales_price` END
  ),
  `t`.`ss_customer_sk`
LIMIT
  100
Query submitted at: 2020-03-31 18:04:21 (Coordinator: 
https://drorke-dm-perf-coordinator3.drorke-d.xcu2-8y8x.dev.cldr.work:25000)
ERROR: ParseException: Syntax error in line 31:
) AS `t1`
  ^
Encountered: AS
Expected: LIMIT, ORDER, UNION
{noformat}






[jira] [Updated] (IMPALA-9588) test_cancel_insert failed with ImpalaBeeswaxException

2020-03-31 Thread Yongzhi Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-9588?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yongzhi Chen updated IMPALA-9588:
-
Affects Version/s: Impala 4.0

> test_cancel_insert failed with ImpalaBeeswaxException
> -
>
> Key: IMPALA-9588
> URL: https://issues.apache.org/jira/browse/IMPALA-9588
> Project: IMPALA
>  Issue Type: Bug
>Affects Versions: Impala 4.0
>Reporter: Yongzhi Chen
>Priority: Major
>
> test_cancellation.TestCancellationSerial.test_cancel_insert failed in 
> impala-asf-master-core-s3 build:
> query_test.test_cancellation.TestCancellationSerial.test_cancel_insert[protocol:
>  beeswax | table_format: parquet/none | exec_option: {'batch_size': 0, 
> 'num_nodes': 0, 'disable_codegen_rows_threshold': 0, 'disable_codegen': 
> False, 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | 
> query_type: CTAS | mt_dop: 0 | wait_action: None | cancel_delay: 1 | 
> cpu_limit_s: 0 | query: select * from lineitem order by l_orderkey | 
> fail_rpc_action: None | join_before_close: False | buffer_pool_limit: 0]
> Error Message
> ImpalaBeeswaxException: ImpalaBeeswaxException:  INNER EXCEPTION: <class 'thrift.Thrift.TApplicationException'>  MESSAGE: TException - service has 
> thrown: BeeswaxException(message=Invalid query handle: 
> cc4c5258e88c790b:8db6f87d, log_context=, handle=QueryHandle(id=, 
> log_context=), errorCode=0, SQLState=HY000)
> Stacktrace
> query_test/test_cancellation.py:248: in test_cancel_insert
> self.execute_cancel_test(vector)
> query_test/test_cancellation.py:167: in execute_cancel_test
> vector.get_value('cancel_delay'), vector.get_value('join_before_close'))
> /data/jenkins/workspace/impala-asf-master-core-s3/repos/Impala/tests/util/cancel_util.py:41:
>  in cancel_query_and_validate_state
> assert client.get_state(handle) != client.QUERY_STATES['EXCEPTION']
> /data/jenkins/workspace/impala-asf-master-core-s3/repos/Impala/tests/common/impala_connection.py:219:
>  in get_state
> return self.__beeswax_client.get_state(operation_handle.get_handle())
> /data/jenkins/workspace/impala-asf-master-core-s3/repos/Impala/tests/beeswax/impala_beeswax.py:433:
>  in get_state
> return self.__do_rpc(lambda: self.imp_service.get_state(query_handle))
> /data/jenkins/workspace/impala-asf-master-core-s3/repos/Impala/tests/beeswax/impala_beeswax.py:525:
>  in __do_rpc
> raise ImpalaBeeswaxException(self.__build_error_message(t), t)
> E   ImpalaBeeswaxException: ImpalaBeeswaxException:
> E    INNER EXCEPTION: <class 'thrift.Thrift.TApplicationException'>
> E    MESSAGE: TException - service has thrown: 
> BeeswaxException(message=Invalid query handle: 
> cc4c5258e88c790b:8db6f87d, log_context=, handle=QueryHandle(id=, 
> log_context=), errorCode=0, SQLState=HY000)






[jira] [Created] (IMPALA-9588) test_cancel_insert failed with ImpalaBeeswaxException

2020-03-31 Thread Yongzhi Chen (Jira)
Yongzhi Chen created IMPALA-9588:


 Summary: test_cancel_insert failed with ImpalaBeeswaxException
 Key: IMPALA-9588
 URL: https://issues.apache.org/jira/browse/IMPALA-9588
 Project: IMPALA
  Issue Type: Bug
Reporter: Yongzhi Chen


test_cancellation.TestCancellationSerial.test_cancel_insert failed in 
impala-asf-master-core-s3 build:
query_test.test_cancellation.TestCancellationSerial.test_cancel_insert[protocol:
 beeswax | table_format: parquet/none | exec_option: {'batch_size': 0, 
'num_nodes': 0, 'disable_codegen_rows_threshold': 0, 'disable_codegen': False, 
'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | query_type: CTAS | 
mt_dop: 0 | wait_action: None | cancel_delay: 1 | cpu_limit_s: 0 | query: 
select * from lineitem order by l_orderkey | fail_rpc_action: None | 
join_before_close: False | buffer_pool_limit: 0]

Error Message
ImpalaBeeswaxException: ImpalaBeeswaxException:  INNER EXCEPTION: <class 'thrift.Thrift.TApplicationException'>  MESSAGE: TException - service has 
thrown: BeeswaxException(message=Invalid query handle: 
cc4c5258e88c790b:8db6f87d, log_context=, handle=QueryHandle(id=, 
log_context=), errorCode=0, SQLState=HY000)
Stacktrace
query_test/test_cancellation.py:248: in test_cancel_insert
self.execute_cancel_test(vector)
query_test/test_cancellation.py:167: in execute_cancel_test
vector.get_value('cancel_delay'), vector.get_value('join_before_close'))
/data/jenkins/workspace/impala-asf-master-core-s3/repos/Impala/tests/util/cancel_util.py:41:
 in cancel_query_and_validate_state
assert client.get_state(handle) != client.QUERY_STATES['EXCEPTION']
/data/jenkins/workspace/impala-asf-master-core-s3/repos/Impala/tests/common/impala_connection.py:219:
 in get_state
return self.__beeswax_client.get_state(operation_handle.get_handle())
/data/jenkins/workspace/impala-asf-master-core-s3/repos/Impala/tests/beeswax/impala_beeswax.py:433:
 in get_state
return self.__do_rpc(lambda: self.imp_service.get_state(query_handle))
/data/jenkins/workspace/impala-asf-master-core-s3/repos/Impala/tests/beeswax/impala_beeswax.py:525:
 in __do_rpc
raise ImpalaBeeswaxException(self.__build_error_message(t), t)
E   ImpalaBeeswaxException: ImpalaBeeswaxException:
E    INNER EXCEPTION: <class 'thrift.Thrift.TApplicationException'>
E    MESSAGE: TException - service has thrown: BeeswaxException(message=Invalid 
query handle: cc4c5258e88c790b:8db6f87d, log_context=, 
handle=QueryHandle(id=, log_context=), errorCode=0, SQLState=HY000)






[jira] [Updated] (IMPALA-9202) Fix flakiness in TestExecutorGroups

2020-03-31 Thread Bikramjeet Vig (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-9202?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bikramjeet Vig updated IMPALA-9202:
---
Fix Version/s: Impala 3.4.0

> Fix flakiness in TestExecutorGroups
> ---
>
> Key: IMPALA-9202
> URL: https://issues.apache.org/jira/browse/IMPALA-9202
> Project: IMPALA
>  Issue Type: Bug
>Affects Versions: Impala 3.4.0
>Reporter: Bikramjeet Vig
>Assignee: Bikramjeet Vig
>Priority: Minor
>  Labels: broken-build, flaky, flaky-test
> Fix For: Impala 3.4.0
>
>
> test_executor_groups.TestExecutorGroups.test_admission_slots failed on an 
> assertion recently with the following stacktrace
> {noformat}
> custom_cluster/test_executor_groups.py:215: in test_admission_slots
> assert ("Initial admission queue reason: Not enough admission control 
> slots "
> E   assert 'Initial admission queue reason: Not enough admission control 
> slots available on host' in 'Query (id=0445eb75d4842dce:219df01e):\n  
> DEBUG MODE WARNING: Query profile created while running a DEBUG buil...0)\n   
>   - NumRowsFetchedFromCache: 0 (0)\n - RowMaterializationRate: 0\n - 
> RowMaterializationTimer: 0.000ns\n
> {noformat}
> On investigating the logs, it seems like the query did in fact get queued 
> with the expected reason. The only reason I can think of that it failed to 
> appear in the profile is that the profile was fetched before the admission 
> reason could be added to it. This happened in an ASAN build, so I am 
> assuming the slowness in execution contributed to widening the window in 
> which this can happen.
> From the logs:
> {noformat}
> I1104 18:18:34.144309 113361 impala-server.cc:1046] 
> 0445eb75d4842dce:219df01e] Registered query 
> query_id=0445eb75d4842dce:219df01e 
> session_id=da467385483f4fb3:16683a81d25fe79e
> I1104 18:18:34.144951 113361 Frontend.java:1256] 
> 0445eb75d4842dce:219df01e] Analyzing query: select * from 
> functional_parquet.alltypestiny  where month < 3 and id + 
> random() < sleep(500); db: default
> I1104 18:18:34.149049 113361 BaseAuthorizationChecker.java:96] 
> 0445eb75d4842dce:219df01e] Authorization check took 4 ms
> I1104 18:18:34.149219 113361 Frontend.java:1297] 
> 0445eb75d4842dce:219df01e] Analysis and authorization finished.
> I1104 18:18:34.163229 113885 scheduler.cc:548] 
> 0445eb75d4842dce:219df01e] Exec at coord is false
> I1104 18:18:34.163945 113885 admission-controller.cc:1295] 
> 0445eb75d4842dce:219df01e] Trying to admit 
> id=0445eb75d4842dce:219df01e in pool_name=default-pool 
> executor_group_name=default-pool-group1 per_host_mem_estimate=176.02 MB 
> dedicated_coord_mem_estimate=100.02 MB max_requests=-1 (configured 
> statically) max_queued=200 (configured statically) max_mem=-1.00 B 
> (configured statically)
> I1104 18:18:34.164203 113885 admission-controller.cc:1307] 
> 0445eb75d4842dce:219df01e] Stats: agg_num_running=1, 
> agg_num_queued=0, agg_mem_reserved=8.00 KB,  
> local_host(local_mem_admitted=452.05 MB, num_admitted_running=1, 
> num_queued=0, backend_mem_reserved=8.00 KB)
> I1104 18:18:34.164383 113885 admission-controller.cc:902] 
> 0445eb75d4842dce:219df01e] Queuing, query 
> id=0445eb75d4842dce:219df01e reason: Not enough admission control 
> slots available on host 
> impala-ec2-centos74-r4-4xlarge-ondemand-1f88.vpc.cloudera.com:22002. Needed 1 
> slots but 1/1 are already in use.
> {noformat}
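
One common way to make a check like this robust against the race described above is to poll for the expected text instead of asserting on a single fetch. The Python sketch below shows that pattern under stated assumptions: `wait_for_text` and the `fetch_profile` callable are hypothetical names, and this is not necessarily how the test was actually fixed.

```python
import time


def wait_for_text(fetch_profile, expected, timeout_s=10.0, interval_s=0.1):
    """Poll fetch_profile() until `expected` appears or the timeout elapses.

    Returns the profile text that contained `expected`. Raises
    AssertionError with the last profile seen if the text never shows up.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        profile = fetch_profile()
        if expected in profile:
            return profile
        if time.monotonic() >= deadline:
            raise AssertionError(
                "%r not found in profile:\n%s" % (expected, profile))
        time.sleep(interval_s)


if __name__ == "__main__":
    # Stand-in for a real profile fetch: the reason only shows up on the
    # third call, mimicking a profile that is updated a little late.
    state = {"calls": 0}

    def fake_fetch():
        state["calls"] += 1
        if state["calls"] < 3:
            return "Query (id=...)"
        return "Initial admission queue reason: Not enough admission control slots"

    print(wait_for_text(fake_fetch, "Initial admission queue reason",
                        timeout_s=2.0, interval_s=0.01))
```

The timeout bounds how long a slow build such as ASAN is given before the assertion fails, so a genuinely missing queue reason is still reported.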






[jira] [Resolved] (IMPALA-9202) Fix flakiness in TestExecutorGroups

2020-03-31 Thread Bikramjeet Vig (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-9202?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bikramjeet Vig resolved IMPALA-9202.

Target Version: Impala 3.4.0
Resolution: Fixed

> Fix flakiness in TestExecutorGroups
> ---
>
> Key: IMPALA-9202
> URL: https://issues.apache.org/jira/browse/IMPALA-9202
> Project: IMPALA
>  Issue Type: Bug
>Affects Versions: Impala 3.4.0
>Reporter: Bikramjeet Vig
>Assignee: Bikramjeet Vig
>Priority: Minor
>  Labels: broken-build, flaky, flaky-test
>
> test_executor_groups.TestExecutorGroups.test_admission_slots failed on an 
> assertion recently with the following stacktrace
> {noformat}
> custom_cluster/test_executor_groups.py:215: in test_admission_slots
> assert ("Initial admission queue reason: Not enough admission control 
> slots "
> E   assert 'Initial admission queue reason: Not enough admission control 
> slots available on host' in 'Query (id=0445eb75d4842dce:219df01e):\n  
> DEBUG MODE WARNING: Query profile created while running a DEBUG buil...0)\n   
>   - NumRowsFetchedFromCache: 0 (0)\n - RowMaterializationRate: 0\n - 
> RowMaterializationTimer: 0.000ns\n
> {noformat}
> On investigating the logs, it seems like the query did in fact get queued 
> with the expected reason. The only reason I can think of that it failed to 
> appear in the profile is that the profile was fetched before the admission 
> reason could be added to it. This happened in an ASAN build, so I am 
> assuming the slowness in execution contributed to widening the window in 
> which this can happen.
> From the logs:
> {noformat}
> I1104 18:18:34.144309 113361 impala-server.cc:1046] 
> 0445eb75d4842dce:219df01e] Registered query 
> query_id=0445eb75d4842dce:219df01e 
> session_id=da467385483f4fb3:16683a81d25fe79e
> I1104 18:18:34.144951 113361 Frontend.java:1256] 
> 0445eb75d4842dce:219df01e] Analyzing query: select * from 
> functional_parquet.alltypestiny  where month < 3 and id + 
> random() < sleep(500); db: default
> I1104 18:18:34.149049 113361 BaseAuthorizationChecker.java:96] 
> 0445eb75d4842dce:219df01e] Authorization check took 4 ms
> I1104 18:18:34.149219 113361 Frontend.java:1297] 
> 0445eb75d4842dce:219df01e] Analysis and authorization finished.
> I1104 18:18:34.163229 113885 scheduler.cc:548] 
> 0445eb75d4842dce:219df01e] Exec at coord is false
> I1104 18:18:34.163945 113885 admission-controller.cc:1295] 
> 0445eb75d4842dce:219df01e] Trying to admit 
> id=0445eb75d4842dce:219df01e in pool_name=default-pool 
> executor_group_name=default-pool-group1 per_host_mem_estimate=176.02 MB 
> dedicated_coord_mem_estimate=100.02 MB max_requests=-1 (configured 
> statically) max_queued=200 (configured statically) max_mem=-1.00 B 
> (configured statically)
> I1104 18:18:34.164203 113885 admission-controller.cc:1307] 
> 0445eb75d4842dce:219df01e] Stats: agg_num_running=1, 
> agg_num_queued=0, agg_mem_reserved=8.00 KB,  
> local_host(local_mem_admitted=452.05 MB, num_admitted_running=1, 
> num_queued=0, backend_mem_reserved=8.00 KB)
> I1104 18:18:34.164383 113885 admission-controller.cc:902] 
> 0445eb75d4842dce:219df01e] Queuing, query 
> id=0445eb75d4842dce:219df01e reason: Not enough admission control 
> slots available on host 
> impala-ec2-centos74-r4-4xlarge-ondemand-1f88.vpc.cloudera.com:22002. Needed 1 
> slots but 1/1 are already in use.
> {noformat}






[jira] [Resolved] (IMPALA-9099) Allow setting mt_dop manually for queries with joins

2020-03-31 Thread Tim Armstrong (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-9099?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Armstrong resolved IMPALA-9099.
---
Fix Version/s: Impala 4.0
   Resolution: Fixed

> Allow setting mt_dop manually for queries with joins
> 
>
> Key: IMPALA-9099
> URL: https://issues.apache.org/jira/browse/IMPALA-9099
> Project: IMPALA
>  Issue Type: New Feature
>Reporter: Tim Armstrong
>Assignee: Tim Armstrong
>Priority: Major
>  Labels: multithreading
> Fix For: Impala 4.0
>
>







[jira] [Resolved] (IMPALA-9029) Impala Doc: 3.4 Release Notes

2020-03-31 Thread Joe McDonnell (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-9029?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joe McDonnell resolved IMPALA-9029.
---
 Fix Version/s: Impala 3.4.0
Target Version: Impala 3.4.0
Resolution: Fixed

> Impala Doc: 3.4 Release Notes
> -
>
> Key: IMPALA-9029
> URL: https://issues.apache.org/jira/browse/IMPALA-9029
> Project: IMPALA
>  Issue Type: Task
>  Components: Docs
>Reporter: Alexandra Rodoni
>Assignee: Joe McDonnell
>Priority: Major
>  Labels: future_release_doc, impala_user_docs_open, in_34
> Fix For: Impala 3.4.0
>
>







[jira] [Resolved] (IMPALA-9549) Impalad startup fails to wait for catalogd to startup when using local catalog

2020-03-31 Thread Joe McDonnell (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-9549?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joe McDonnell resolved IMPALA-9549.
---
 Fix Version/s: Impala 4.0
Target Version: Impala 4.0
Resolution: Fixed

> Impalad startup fails to wait for catalogd to startup when using local catalog
> --
>
> Key: IMPALA-9549
> URL: https://issues.apache.org/jira/browse/IMPALA-9549
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 4.0
>Reporter: Joe McDonnell
>Assignee: Joe McDonnell
>Priority: Critical
> Fix For: Impala 4.0
>
>
> Since Impala coordinators and executors may be starting up at the same time 
> as the catalogd, they should be tolerant of delays in the catalogd starting 
> up. When using local catalog (use_local_catalog=true), the Impalads fail with 
> the following error if the catalogd startup is delayed:
> {noformat}
> I0323 14:22:03.151849 29565 jni-util.cc:288] 
> org.apache.impala.catalog.local.LocalCatalogException: Unable to load 
> database names
> I0323 14:22:03.151849 29565 jni-util.cc:288] 
> org.apache.impala.catalog.local.LocalCatalogException: Unable to load 
> database names
>  at org.apache.impala.catalog.local.LocalCatalog.loadDbs(LocalCatalog.java:94)
>  at org.apache.impala.catalog.local.LocalCatalog.getDbs(LocalCatalog.java:83)
>  at org.apache.impala.service.Frontend.getCatalogMetrics(Frontend.java:753)
>  at 
> org.apache.impala.service.JniFrontend.getCatalogMetrics(JniFrontend.java:220)
> Caused by: org.apache.thrift.TException: 
> org.apache.impala.common.InternalException: Couldn't open transport for 
> localhost:26000 (connect() failed: Connection refused)
>  at 
> org.apache.impala.catalog.local.CatalogdMetaProvider.sendRequest(CatalogdMetaProvider.java:382)
>  at 
> org.apache.impala.catalog.local.CatalogdMetaProvider.access$100(CatalogdMetaProvider.java:174)
>  at 
> org.apache.impala.catalog.local.CatalogdMetaProvider$1.call(CatalogdMetaProvider.java:583)
>  at 
> org.apache.impala.catalog.local.CatalogdMetaProvider$1.call(CatalogdMetaProvider.java:578)
>  at 
> org.apache.impala.catalog.local.CatalogdMetaProvider.loadWithCaching(CatalogdMetaProvider.java:509)
>  at 
> org.apache.impala.catalog.local.CatalogdMetaProvider.loadDbList(CatalogdMetaProvider.java:577)
>  at org.apache.impala.catalog.local.LocalCatalog.loadDbs(LocalCatalog.java:92)
>  ... 3 more
> Caused by: org.apache.impala.common.InternalException: Couldn't open 
> transport for localhost:26000 (connect() failed: Connection refused)
>  at org.apache.impala.service.FeSupport.NativeGetPartialCatalogObject(Native 
> Method)
>  at 
> org.apache.impala.service.FeSupport.GetPartialCatalogObject(FeSupport.java:440)
>  at 
> org.apache.impala.catalog.local.CatalogdMetaProvider.sendRequest(CatalogdMetaProvider.java:380)
>  ... 9 more
> I0323 14:22:03.217051 29565 status.cc:126] LocalCatalogException: Unable to 
> load database names
> CAUSED BY: TException: org.apache.impala.common.InternalException: Couldn't 
> open transport for localhost:26000 (connect() failed: Connection 
> refused){noformat}
> What happens is that the ImpalaServer constructor calls 
> ImpalaServer::UpdateCatalogMetrics() 
> ([https://github.com/apache/impala/blob/3b833902519fb8f0ef9b5fd20919c5fd85d22fcf/be/src/service/impala-server.cc#L452]
>  ). UpdateCatalogMetrics() is maintaining two metrics that track the number 
> of databases and the number of tables. This ends up calling 
> org.apache.impala.catalog.local.LocalCatalog.getDbs(), which calls loadDbs() 
> ([https://github.com/apache/impala/blob/ca0785ec206f27f06d8d6fd1b710779e548bbd8e/fe/src/main/java/org/apache/impala/catalog/local/LocalCatalog.java#L83]
>  ). loadDbs() requires a connection to catalogd and will fail if it cannot 
> connect.
> Importantly, this all happens before waiting for the catalogd to start up in 
> the regular ImpalaServer::Start():
> {code:java}
> if (FLAGS_is_coordinator) exec_env_->frontend()->WaitForCatalog();
> {code}
>  
> In the old catalog implementation (use_local_catalog=false), the getDbs() 
> call on the catalog returns whatever values it has, and it does not try to 
> contact the catalogd. This is why the regular case does not see this problem.
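
The difference between the two behaviours can be sketched in a few lines of Python. This is only an illustration of a metrics update that tolerates an unreachable catalogd by keeping its previous values, similar in spirit to the old catalog returning whatever it has. The names `CatalogUnavailable`, `update_catalog_metrics` and `load_db_names` are hypothetical, and this is not the fix that was committed for this issue.

```python
class CatalogUnavailable(Exception):
    """Hypothetical error raised when catalogd cannot be reached."""


def update_catalog_metrics(metrics, load_db_names):
    """Refresh the database count, keeping the old value if catalogd is down.

    Returns True if the metric was refreshed and False if the catalog was
    unreachable, so the caller can retry after waiting for the catalog.
    """
    try:
        names = load_db_names()
    except CatalogUnavailable:
        return False
    metrics["num_databases"] = len(names)
    return True


if __name__ == "__main__":
    metrics = {"num_databases": 0}

    def catalog_down():
        raise CatalogUnavailable("connect() failed: Connection refused")

    # During early startup the update is skipped instead of failing.
    print(update_catalog_metrics(metrics, catalog_down), metrics)
    # Once the catalog is reachable the metric is refreshed.
    print(update_catalog_metrics(metrics, lambda: ["default", "functional"]), metrics)
```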






[jira] [Closed] (IMPALA-9194) Add support for Debian 9

2020-03-31 Thread Joe McDonnell (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-9194?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joe McDonnell closed IMPALA-9194.
-
Resolution: Won't Fix

> Add support for Debian 9
> 
>
> Key: IMPALA-9194
> URL: https://issues.apache.org/jira/browse/IMPALA-9194
> Project: IMPALA
>  Issue Type: Task
>  Components: Infrastructure
>Affects Versions: Impala 3.4.0
>Reporter: Joe McDonnell
>Priority: Major
>
> Debian 9 has been available for a couple of years and seems like a useful 
> addition to our Debian 8 support. 
>  
> This will require a corresponding change in 
> [https://github.com/cloudera/native-toolchain] to support Debian 9.






[jira] [Resolved] (IMPALA-9466) impala-shell client retry for hs2-http protocol.

2020-03-31 Thread Abhishek Rawat (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-9466?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Abhishek Rawat resolved IMPALA-9466.

Target Version: Impala 4.0
Resolution: Fixed

> impala-shell client retry for hs2-http protocol.
> 
>
> Key: IMPALA-9466
> URL: https://issues.apache.org/jira/browse/IMPALA-9466
> Project: IMPALA
>  Issue Type: Task
>Reporter: Abhishek Rawat
>Assignee: Abhishek Rawat
>Priority: Major
>  Labels: Client
>
> Add retries for idempotent RPCs. The HTTP transport can be unreliable, so the 
> goal is to make hs2-http RPCs more robust.
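
A minimal sketch of the retry idea described above, not impala-shell's actual implementation. The helper `call_with_retry`, the RPC names in `IDEMPOTENT_RPCS`, and the use of `ConnectionError` as the transport failure are all assumptions made for illustration.

```python
import time

# Hypothetical names of RPCs that are safe to repeat (illustrative only).
IDEMPOTENT_RPCS = {"get_status", "ping"}


def call_with_retry(rpc_name, rpc_fn, max_tries=3, base_delay_s=0.0, sleep=time.sleep):
    """Call rpc_fn(); retry on ConnectionError only if rpc_name is idempotent."""
    tries = max_tries if rpc_name in IDEMPOTENT_RPCS else 1
    for attempt in range(1, tries + 1):
        try:
            return rpc_fn()
        except ConnectionError:
            if attempt == tries:
                # Out of attempts (or not idempotent): surface the failure.
                raise
            # Exponential backoff between attempts.
            sleep(base_delay_s * (2 ** (attempt - 1)))
```

Non-idempotent calls get exactly one attempt in this sketch, because repeating them after an ambiguous transport failure could apply the same operation twice.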






[jira] [Commented] (IMPALA-9156) Share broadcast join builds between fragments

2020-03-31 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-9156?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17072147#comment-17072147
 ] 

ASF subversion and git services commented on IMPALA-9156:
-

Commit ab7e209d1bf8f8d61e092b6de9078c0d2e8c657d in impala's branch 
refs/heads/master from Tim Armstrong
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=ab7e209 ]

IMPALA-9099: allow mt_dop for joins without feature flag

This allows running *any* read-only query with mt_dop > 0.
Before this patch, no joins were allowed with mt_dop > 0.

Previous patches, particularly IMPALA-9156, added significantly
more code coverage for multithreading+joins. It should be safe to
allow enabling on a query-by-query basis. Many improvements are
still planned - see IMPALA-3902. So behaviour and performance
characteristics of mt_dop > 0 with more complex plans and joins
will continue to change.

Testing:
Updated the mt_dop validation tests and removed a redundant planner test
that doesn't provide much additional coverage of the validation
support.

Ran exhaustive tests.

Change-Id: I9c6566abb239db0e775f2beaa25a62c36313cd6f
Reviewed-on: http://gerrit.cloudera.org:8080/15545
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 


> Share broadcast join builds between fragments
> -
>
> Key: IMPALA-9156
> URL: https://issues.apache.org/jira/browse/IMPALA-9156
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Backend
>Reporter: Tim Armstrong
>Assignee: Tim Armstrong
>Priority: Major
>  Labels: multithreading
> Fix For: Impala 4.0
>
>
> Following on from IMPALA-4224, which should add the logic to share a single 
> builder between multiple probe sides.






[jira] [Commented] (IMPALA-3902) Multi-threaded query execution

2020-03-31 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-3902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17072148#comment-17072148
 ] 

ASF subversion and git services commented on IMPALA-3902:
-

Commit ab7e209d1bf8f8d61e092b6de9078c0d2e8c657d in impala's branch 
refs/heads/master from Tim Armstrong
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=ab7e209 ]

IMPALA-9099: allow mt_dop for joins without feature flag

This allows running *any* read-only query with mt_dop > 0.
Before this patch, no joins were allowed with mt_dop > 0.

Previous patches, particularly IMPALA-9156, added significantly
more code coverage for multithreading+joins. It should be safe to
allow enabling on a query-by-query basis. Many improvements are
still planned - see IMPALA-3902. So behaviour and performance
characteristics of mt_dop > 0 with more complex plans and joins
will continue to change.

Testing:
Updated the mt_dop validation tests and removed a redundant planner test
that doesn't provide much additional coverage of the validation
support.

Ran exhaustive tests.

Change-Id: I9c6566abb239db0e775f2beaa25a62c36313cd6f
Reviewed-on: http://gerrit.cloudera.org:8080/15545
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 


> Multi-threaded query execution
> --
>
> Key: IMPALA-3902
> URL: https://issues.apache.org/jira/browse/IMPALA-3902
> Project: IMPALA
>  Issue Type: Epic
>  Components: Backend
>Affects Versions: Impala 2.6.0
>Reporter: Marcel Kinard
>Assignee: Tim Armstrong
>Priority: Minor
>  Labels: multithreading
>
> Currently, a single query fragment is run in a quasi-single threaded manner 
> on a node: the scanners are run in multiple threads, but all other operators 
> (joins, aggregation) are run in the main thread.
> The goal is to add multi-threaded execution on a single node by running 
> multiple fragment instances (each of which runs in a single thread).






[jira] [Commented] (IMPALA-9099) Allow setting mt_dop manually for queries with joins

2020-03-31 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-9099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17072146#comment-17072146
 ] 

ASF subversion and git services commented on IMPALA-9099:
-

Commit ab7e209d1bf8f8d61e092b6de9078c0d2e8c657d in impala's branch 
refs/heads/master from Tim Armstrong
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=ab7e209 ]

IMPALA-9099: allow mt_dop for joins without feature flag

This allows running *any* read-only query with mt_dop > 0.
Before this patch, no joins were allowed with mt_dop > 0.

Previous patches, particularly IMPALA-9156, added significantly
more code coverage for multithreading+joins. It should be safe to
allow enabling on a query-by-query basis. Many improvements are
still planned - see IMPALA-3902. So behaviour and performance
characteristics of mt_dop > 0 with more complex plans and joins
will continue to change.

Testing:
Updated the mt_dop validation tests and removed a redundant planner test
that doesn't provide much additional coverage of the validation
support.

Ran exhaustive tests.

Change-Id: I9c6566abb239db0e775f2beaa25a62c36313cd6f
Reviewed-on: http://gerrit.cloudera.org:8080/15545
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 


> Allow setting mt_dop manually for queries with joins
> 
>
> Key: IMPALA-9099
> URL: https://issues.apache.org/jira/browse/IMPALA-9099
> Project: IMPALA
>  Issue Type: New Feature
>Reporter: Tim Armstrong
>Assignee: Tim Armstrong
>Priority: Major
>  Labels: multithreading
>







[jira] [Created] (IMPALA-9587) Handle special characters for LDAP

2020-03-31 Thread Thomas Tauber-Marshall (Jira)
Thomas Tauber-Marshall created IMPALA-9587:
--

 Summary: Handle special characters for LDAP
 Key: IMPALA-9587
 URL: https://issues.apache.org/jira/browse/IMPALA-9587
 Project: IMPALA
  Issue Type: Improvement
  Components: Security
Affects Versions: Impala 4.0
Reporter: Thomas Tauber-Marshall


Impala currently does not do any escaping of special characters when 
interacting with LDAP, meaning that it will not be able to handle usernames 
with characters such as ','. It would be useful to fix this.

We would also need to handle potentially escaped characters in parameters such 
as --ldap_group_filter.
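
As a rough illustration of the escaping this issue asks for, the sketch below applies the standard LDAP escaping rules (RFC 4514 for distinguished name values, RFC 4515 for search filter values). The function names are hypothetical and this is not Impala's code, only an example of what handling a username containing ',' would involve.

```python
def escape_filter_value(value: str) -> str:
    """Escape a value for use inside an LDAP search filter (RFC 4515)."""
    replacements = {
        "\\": r"\5c",
        "*": r"\2a",
        "(": r"\28",
        ")": r"\29",
        "\x00": r"\00",
    }
    return "".join(replacements.get(ch, ch) for ch in value)


def escape_dn_value(value: str) -> str:
    """Escape an attribute value for use inside a distinguished name (RFC 4514)."""
    out = []
    last = len(value) - 1
    for i, ch in enumerate(value):
        if ch in '"+,;<>\\':
            out.append("\\" + ch)   # characters that are always special in a DN
        elif ch == "\x00":
            out.append("\\00")      # NUL is written as a hex pair
        elif i == 0 and ch in " #":
            out.append("\\" + ch)   # leading space or '#'
        elif i == last and ch == " ":
            out.append("\\ ")       # trailing space
        else:
            out.append(ch)
    return "".join(out)
```

With these helpers, a username such as `smith, john` would be substituted into a DN pattern as `smith\, john` rather than being split at the comma.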






[jira] [Resolved] (IMPALA-9568) Template tuples are initialized multiple times

2020-03-31 Thread Sahil Takiar (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-9568?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sahil Takiar resolved IMPALA-9568.
--
Fix Version/s: Impala 4.0
   Resolution: Fixed

> Template tuples are initialized multiple times
> --
>
> Key: IMPALA-9568
> URL: https://issues.apache.org/jira/browse/IMPALA-9568
> Project: IMPALA
>  Issue Type: Sub-task
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
>Priority: Major
> Fix For: Impala 4.0
>
>
> TSAN is reporting the following data race with template tuple initialization:
> {code:java}
> WARNING: ThreadSanitizer: data race (pid=8124)
>   Write of size 4 at 0x7b30001692f4 by thread T355:
> #0 impala::ScalarExprEvaluator::GetValue(impala::ScalarExpr const&, 
> impala::TupleRow const*) 
> /data/jenkins/workspace/impala-private-parameterized/repos/Impala/be/src/exprs/scalar-expr-evaluator.cc:279:23
>  (impalad+0x2553330)
> #1 impala::ScalarExprEvaluator::GetValue(impala::TupleRow const*) 
> /data/jenkins/workspace/impala-private-parameterized/repos/Impala/be/src/exprs/scalar-expr-evaluator.cc:253:10
>  (impalad+0x255357d)
> #2 
> impala::HdfsScanNodeBase::InitTemplateTuple(std::vector  std::allocator > const&, impala::MemPool*, 
> impala::RuntimeState*) const 
> /data/jenkins/workspace/impala-private-parameterized/repos/Impala/be/src/exec/hdfs-scan-node-base.cc:900:27
>  (impalad+0x237c74a)
> #3 impala::HdfsScanner::Open(impala::ScannerContext*) 
> /data/jenkins/workspace/impala-private-parameterized/repos/Impala/be/src/exec/hdfs-scanner.cc:110:33
>  (impalad+0x23a6437)
> #4 impala::HdfsTextScanner::Open(impala::ScannerContext*) 
> /data/jenkins/workspace/impala-private-parameterized/repos/Impala/be/src/exec/hdfs-text-scanner.cc:780:3
>  (impalad+0x23e6121)
> #5 
> impala::HdfsScanNodeBase::CreateAndOpenScannerHelper(impala::HdfsPartitionDescriptor*,
>  impala::ScannerContext*, boost::scoped_ptr*) 
> /data/jenkins/workspace/impala-private-parameterized/repos/Impala/be/src/exec/hdfs-scan-node-base.cc:887:3
>  (impalad+0x237fc76)
> #6 impala::HdfsScanNode::ProcessSplit(std::vector std::allocator > const&, impala::MemPool*, 
> impala::io::ScanRange*, long*) 
> /data/jenkins/workspace/impala-private-parameterized/repos/Impala/be/src/exec/hdfs-scan-node.cc:479:19
>  (impalad+0x24daeba)
> #7 impala::HdfsScanNode::ScannerThread(bool, long) 
> /data/jenkins/workspace/impala-private-parameterized/repos/Impala/be/src/exec/hdfs-scan-node.cc:417:7
>  (impalad+0x24da84e)
> #8 
> impala::HdfsScanNode::ThreadTokenAvailableCb(impala::ThreadResourcePool*)::$_0::operator()()
>  const 
> /data/jenkins/workspace/impala-private-parameterized/repos/Impala/be/src/exec/hdfs-scan-node.cc:338:13
>  (impalad+0x24dbd06)
> #9 
> boost::detail::function::void_function_obj_invoker0  void>::invoke(boost::detail::function::function_buffer&) 
> /data/jenkins/workspace/impala-private-parameterized/Impala-Toolchain/boost-1.61.0-p2/include/boost/function/function_template.hpp:159:11
>  (impalad+0x24dbb19)
> #10 boost::function0::operator()() const 
> /data/jenkins/workspace/impala-private-parameterized/Impala-Toolchain/boost-1.61.0-p2/include/boost/function/function_template.hpp:770:14
>  (impalad+0x1d0af41)
> #11 impala::Thread::SuperviseThread(std::string const&, std::string 
> const&, boost::function, impala::ThreadDebugInfo const*, 
> impala::Promise*) 
> /data/jenkins/workspace/impala-private-parameterized/repos/Impala/be/src/util/thread.cc:360:3
>  (impalad+0x22c3c66)
> #12 void boost::_bi::list5, 
> boost::_bi::value, boost::_bi::value >, 
> boost::_bi::value, 
> boost::_bi::value*> 
> >::operator() boost::function, impala::ThreadDebugInfo const*, 
> impala::Promise*), 
> boost::_bi::list0>(boost::_bi::type, void (*&)(std::string const&, 
> std::string const&, boost::function, impala::ThreadDebugInfo const*, 
> impala::Promise*), boost::_bi::list0&, int) 
> /data/jenkins/workspace/impala-private-parameterized/Impala-Toolchain/boost-1.61.0-p2/include/boost/bind/bind.hpp:531:9
>  (impalad+0x22cbe2c)
> #13 boost::_bi::bind_t const&, boost::function, impala::ThreadDebugInfo const*, 
> impala::Promise*), 
> boost::_bi::list5, 
> boost::_bi::value, boost::_bi::value >, 
> boost::_bi::value, 
> boost::_bi::value*> > 
> >::operator()() 
> /data/jenkins/workspace/impala-private-parameterized/Impala-Toolchain/boost-1.61.0-p2/include/boost/bind/bind.hpp:1222:16
>  (impalad+0x22cbd43)
> #14 boost::detail::thread_data (*)(std::string const&, std::string const&, boost::function, 
> impala::ThreadDebugInfo const*, impala::Promise (impala::PromiseMode)0>*), boost::_bi::list5, 
> boost::_bi::value, boost::_bi::value >, 
> boost::_bi::value, 
> boost::_bi::value*> > > 
> >::run() 
> 

[jira] [Commented] (IMPALA-9555) TestDateQueries.test_queries failing because Hive3 switched back to the hybrid Julian Gregorian calendar

2020-03-31 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-9555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17072127#comment-17072127
 ] 

ASF subversion and git services commented on IMPALA-9555:
-

Commit 1cfc31c84fce543b6cf75341a699772ffde5e3bd in impala's branch 
refs/heads/master from Attila Jeges
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=1cfc31c ]

IMPALA-9555 part 2: [Hive3] Fix test failure introduced by HIVE-22589

This patch is a continuation of IMPALA-9555. It makes Avro DATE
tests more resilient by using regex for expected error messages
instead of using concrete error messages.

Change-Id: I36340be70a37b75997cf49625a173ec2690ed9b8
Reviewed-on: http://gerrit.cloudera.org:8080/15618
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 
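
The approach in the commit message (matching expected errors by regex instead of by exact text) can be sketched as follows. The error strings and the pattern are invented for illustration and are not the actual messages from the Avro DATE tests.

```python
import re


def error_matches(expected_pattern: str, actual_error: str) -> bool:
    """Return True if the actual error contains a match for the expected regex."""
    return re.search(expected_pattern, actual_error) is not None


# Hypothetical wording before and after an upstream change.
old_msg = "Parse error in file f.avro: invalid date value -719163"
new_msg = "Parse error in file f.avro: date value -719163 is out of range"

# One pattern that tolerates both wordings but still pins the key facts.
pattern = r"Parse error in file \S+: .*date value -?\d+"
```

A pattern like this keeps the test sensitive to the error category while tolerating changes in the exact wording.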


> TestDateQueries.test_queries failing because Hive3 switched back to the 
> hybrid Julian Gregorian calendar
> 
>
> Key: IMPALA-9555
> URL: https://issues.apache.org/jira/browse/IMPALA-9555
> Project: IMPALA
>  Issue Type: Bug
>Affects Versions: Impala 3.4.0
>Reporter: Attila Jeges
>Assignee: Attila Jeges
>Priority: Critical
> Fix For: Impala 3.4.0
>
>
> TestDateQueries.test_queries is failing after upgrading the CDP GBN with the 
> following error:
> {code}
> query_test.test_date_queries.TestDateQueries.test_queries[protocol: beeswax | 
> exec_option: {'disable_codegen_rows_threshold': 0, 'disable_codegen': 'true', 
> 'batch_size': 1} | table_format: avro/snap/block] (from pytest)
> Error Message
> query_test/test_date_queries.py:60: in test_queries 
> self.run_test_case('QueryTest/avro_date', vector) 
> common/impala_test_suite.py:690: in run_test_case 
> self.__verify_results_and_errors(vector, test_section, result, use_db) 
> common/impala_test_suite.py:523: in __verify_results_and_errors 
> replace_filenames_with_placeholder) common/test_result_verifier.py:456: in 
> verify_raw_results VERIFIER_MAP[verifier](expected, actual) 
> common/test_result_verifier.py:278: in verify_query_result_is_equal 
> assert expected_results == actual_results E   assert Comparing 
> QueryTestResults (expected vs actual): E 0,0001-01-01,0001-01-01 != 
> 10,1399-06-27,2017-11-28 E 1,0001-01-01,0001-12-31 != 11,1399-06-27,NULL 
> E 10,1399-06-27,2017-11-28 != 12,1399-06-27,2018-12-31 E 
> 11,1399-06-27,NULL != 20,2017-11-27,0001-06-19 E 12,1399-06-27,2018-12-31 
> != 21,2017-11-27,0001-06-20 E 2,0001-01-01,0002-01-01 != 
> 22,2017-11-27,0001-06-21 E 20,2017-11-27,0001-06-21 != 
> 23,2017-11-27,0001-06-22 E 21,2017-11-27,0001-06-22 != 
> 24,2017-11-27,0001-06-23 E 22,2017-11-27,0001-06-23 != 
> 25,2017-11-27,0001-06-24 E 23,2017-11-27,0001-06-24 != 
> 26,2017-11-27,0001-06-25 E 24,2017-11-27,0001-06-25 != 
> 27,2017-11-27,0001-06-26 E 25,2017-11-27,0001-06-26 != 
> 28,2017-11-27,0001-06-27 E 26,2017-11-27,0001-06-27 != 
> 29,2017-11-27,2017-11-28 E 27,2017-11-27,0001-06-28 != 
> 30,-12-31,-12-01 E 28,2017-11-27,0001-06-29 != 
> 31,-12-31,-12-31 E 29,2017-11-27,2017-11-28 != None E 
> 3,0001-01-01,1399-12-31 != None E 30,-12-31,-12-01 != None E 
> 31,-12-31,-12-31 != None E 4,0001-01-01,2017-11-28 != None E 
> 5,0001-01-01,-12-31 != None E 6,0001-01-01,NULL != None E Number 
> of rows returned (expected vs actual): 22 != 15
> Stacktrace
> query_test/test_date_queries.py:60: in test_queries
> self.run_test_case('QueryTest/avro_date', vector)
> common/impala_test_suite.py:690: in run_test_case
> self.__verify_results_and_errors(vector, test_section, result, use_db)
> common/impala_test_suite.py:523: in __verify_results_and_errors
> replace_filenames_with_placeholder)
> common/test_result_verifier.py:456: in verify_raw_results
> VERIFIER_MAP[verifier](expected, actual)
> common/test_result_verifier.py:278: in verify_query_result_is_equal
> assert expected_results == actual_results
> E   assert Comparing QueryTestResults (expected vs actual):
> E 0,0001-01-01,0001-01-01 != 10,1399-06-27,2017-11-28
> E 1,0001-01-01,0001-12-31 != 11,1399-06-27,NULL
> E 10,1399-06-27,2017-11-28 != 12,1399-06-27,2018-12-31
> E 11,1399-06-27,NULL != 20,2017-11-27,0001-06-19
> E 12,1399-06-27,2018-12-31 != 21,2017-11-27,0001-06-20
> E 2,0001-01-01,0002-01-01 != 22,2017-11-27,0001-06-21
> E 20,2017-11-27,0001-06-21 != 23,2017-11-27,0001-06-22
> E 21,2017-11-27,0001-06-22 != 24,2017-11-27,0001-06-23
> E 22,2017-11-27,0001-06-23 != 25,2017-11-27,0001-06-24
> E 23,2017-11-27,0001-06-24 != 26,2017-11-27,0001-06-25
> E 24,2017-11-27,0001-06-25 != 27,2017-11-27,0001-06-26
> E 

[jira] [Commented] (IMPALA-9568) Template tuples are initialized multiple times

2020-03-31 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-9568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17072130#comment-17072130
 ] 

ASF subversion and git services commented on IMPALA-9568:
-

Commit 0038487267ccba77e6cb17a582039d6b35ff947c in impala's branch 
refs/heads/master from Sahil Takiar
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=0038487 ]

IMPALA-9568: Template tuples are initialized multiple times

Template tuples are initialized multiple times, concurrently, which
leads to a data race reported by TSAN. The template tuples should just
be initialized once. This patch removes the template tuple
initialization done in the HDFS scanners and just copies them from the
HdfsScanNodeBase.

After this patch, data-load is now TSAN clean. I removed the TSAN flag
'halt_on_error=0' so that TSAN now crashes the process if a TSAN bug is
discovered. This will allow us to set up a Jenkins job that runs dataload
+ be/ tests, and will crash if any new TSAN bugs are discovered.

Testing:
* Ran exhaustive tests

Change-Id: I3bd3554b2b919b117a0c9ae86dfc0a75ae4129d2
Reviewed-on: http://gerrit.cloudera.org:8080/15604
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 
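
The pattern described in the commit message (initialize once in the scan node, copy in each scanner) can be sketched in a language-neutral way as below. The class and method names are hypothetical stand-ins, not Impala's C++ classes, and a plain dict stands in for the template tuple.

```python
import copy
import threading


class ScanNode:
    """Stand-in for the scan node: builds the template exactly once."""

    def __init__(self, partition_values):
        self.init_count = 0
        # Built before any scanner thread starts, so no concurrent writes.
        self._template = self._init_template(partition_values)

    def _init_template(self, partition_values):
        self.init_count += 1
        return dict(partition_values)

    def template_copy(self):
        # Scanners only read the shared template and receive a private copy.
        return copy.deepcopy(self._template)


class Scanner:
    """Stand-in for a scanner: copies the template instead of rebuilding it."""

    def __init__(self, node):
        self.template = node.template_copy()


def run_scanners(node, n):
    """Open n scanners concurrently, one thread each, and return them."""
    scanners = [None] * n

    def open_scanner(i):
        scanners[i] = Scanner(node)

    threads = [threading.Thread(target=open_scanner, args=(i,)) for i in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return scanners
```

In this sketch the shared template is written once, before any scanner thread exists, and scanner threads only read it, so concurrent threads never write to shared state.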


> Template tuples are initialized multiple times
> --
>
> Key: IMPALA-9568
> URL: https://issues.apache.org/jira/browse/IMPALA-9568
> Project: IMPALA
>  Issue Type: Sub-task
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
>Priority: Major
>
> TSAN is reporting the following data race with template tuple initialization:
> {code:java}
> WARNING: ThreadSanitizer: data race (pid=8124)
>   Write of size 4 at 0x7b30001692f4 by thread T355:
> #0 impala::ScalarExprEvaluator::GetValue(impala::ScalarExpr const&, 
> impala::TupleRow const*) 
> /data/jenkins/workspace/impala-private-parameterized/repos/Impala/be/src/exprs/scalar-expr-evaluator.cc:279:23
>  (impalad+0x2553330)
> #1 impala::ScalarExprEvaluator::GetValue(impala::TupleRow const*) 
> /data/jenkins/workspace/impala-private-parameterized/repos/Impala/be/src/exprs/scalar-expr-evaluator.cc:253:10
>  (impalad+0x255357d)
> #2 
> impala::HdfsScanNodeBase::InitTemplateTuple(std::vector  std::allocator > const&, impala::MemPool*, 
> impala::RuntimeState*) const 
> /data/jenkins/workspace/impala-private-parameterized/repos/Impala/be/src/exec/hdfs-scan-node-base.cc:900:27
>  (impalad+0x237c74a)
> #3 impala::HdfsScanner::Open(impala::ScannerContext*) 
> /data/jenkins/workspace/impala-private-parameterized/repos/Impala/be/src/exec/hdfs-scanner.cc:110:33
>  (impalad+0x23a6437)
> #4 impala::HdfsTextScanner::Open(impala::ScannerContext*) 
> /data/jenkins/workspace/impala-private-parameterized/repos/Impala/be/src/exec/hdfs-text-scanner.cc:780:3
>  (impalad+0x23e6121)
> #5 
> impala::HdfsScanNodeBase::CreateAndOpenScannerHelper(impala::HdfsPartitionDescriptor*,
>  impala::ScannerContext*, boost::scoped_ptr*) 
> /data/jenkins/workspace/impala-private-parameterized/repos/Impala/be/src/exec/hdfs-scan-node-base.cc:887:3
>  (impalad+0x237fc76)
> #6 impala::HdfsScanNode::ProcessSplit(std::vector std::allocator > const&, impala::MemPool*, 
> impala::io::ScanRange*, long*) 
> /data/jenkins/workspace/impala-private-parameterized/repos/Impala/be/src/exec/hdfs-scan-node.cc:479:19
>  (impalad+0x24daeba)
> #7 impala::HdfsScanNode::ScannerThread(bool, long) 
> /data/jenkins/workspace/impala-private-parameterized/repos/Impala/be/src/exec/hdfs-scan-node.cc:417:7
>  (impalad+0x24da84e)
> #8 
> impala::HdfsScanNode::ThreadTokenAvailableCb(impala::ThreadResourcePool*)::$_0::operator()()
>  const 
> /data/jenkins/workspace/impala-private-parameterized/repos/Impala/be/src/exec/hdfs-scan-node.cc:338:13
>  (impalad+0x24dbd06)
> #9 
> boost::detail::function::void_function_obj_invoker0  void>::invoke(boost::detail::function::function_buffer&) 
> /data/jenkins/workspace/impala-private-parameterized/Impala-Toolchain/boost-1.61.0-p2/include/boost/function/function_template.hpp:159:11
>  (impalad+0x24dbb19)
> #10 boost::function0::operator()() const 
> /data/jenkins/workspace/impala-private-parameterized/Impala-Toolchain/boost-1.61.0-p2/include/boost/function/function_template.hpp:770:14
>  (impalad+0x1d0af41)
> #11 impala::Thread::SuperviseThread(std::string const&, std::string 
> const&, boost::function, impala::ThreadDebugInfo const*, 
> impala::Promise*) 
> /data/jenkins/workspace/impala-private-parameterized/repos/Impala/be/src/util/thread.cc:360:3
>  (impalad+0x22c3c66)
> #12 void boost::_bi::list5, 
> boost::_bi::value, boost::_bi::value >, 
> boost::_bi::value, 
> boost::_bi::value*> 
> >::operator() boost::function, impala::ThreadDebugInfo const*, 
> impala::Promise*), 
> boost::_bi::list0>(boost::_bi::type, void (*&)(std::string const&, 
> 

[jira] [Commented] (IMPALA-9577) Use `system_unsync` time for Kudu test clusters

2020-03-31 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-9577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17072129#comment-17072129
 ] 

ASF subversion and git services commented on IMPALA-9577:
-

Commit 208d9d6896f39a25be00ae4a4ce4679aa5ecd636 in impala's branch 
refs/heads/master from Grant Henke
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=208d9d6 ]

IMPALA-9577: [test] Use `system_unsync` time for Kudu test clusters

Recently Kudu made enhancements to time source configuration and
adjusted the time source for local clusters/tests to `system_unsync`.

This patch mirrors that behavior in Impala test clusters given there is no
need to require an NTP-synchronized clock for a test where all the
participating Kudu masters and tablet servers run on the same node
using the same local wallclock.

See the Kudu commit here for details:
https://github.com/apache/kudu/commit/eb2b70d4b96be2fc2fdd6b3625acc284ac5774be

While making this change, I removed all NTP-related packages and special
handling as they should not be needed in a development environment
any more. I also added curl and gawk, which were missing in my
Docker Ubuntu environment and broke my testing.

Testing:
I tested with the steps below using Docker for Mac:

  docker rm impala-dev
  docker volume rm impala
  docker run --privileged --interactive --tty --name impala-dev -v impala:/home \
    -p 25000:25000 -p 25010:25010 -p 25020:25020 ubuntu:16.04 /bin/bash

  apt-get update
  apt-get install sudo
  adduser --disabled-password --gecos '' impdev
  echo 'impdev ALL=(ALL) NOPASSWD:ALL' >> /etc/sudoers
  su - impdev
  cd ~

  sudo apt-get --yes install git
  git clone https://git-wip-us.apache.org/repos/asf/impala.git ~/Impala
  cd ~/Impala
  export IMPALA_HOME=`pwd`
  git remote add fork https://github.com/granthenke/impala.git
  git fetch fork
  git checkout kudu-system-time

  $IMPALA_HOME/bin/bootstrap_development.sh

  source $IMPALA_HOME/bin/impala-config.sh
  (pushd fe && mvn -fae test -Dtest=AnalyzeDDLTest)
  (pushd fe && mvn -fae test -Dtest=AnalyzeKuduDDLTest)

  $IMPALA_HOME/bin/start-impala-cluster.py
  ./tests/run-tests.py query_test/test_kudu.py

Change-Id: Id99e5cb58ab988c3ad4f98484be8db193d5eaf99
Reviewed-on: http://gerrit.cloudera.org:8080/15568
Reviewed-by: Impala Public Jenkins 
Reviewed-by: Alexey Serbin 
Tested-by: Impala Public Jenkins 


> Use `system_unsync` time for Kudu test clusters
> ---
>
> Key: IMPALA-9577
> URL: https://issues.apache.org/jira/browse/IMPALA-9577
> Project: IMPALA
>  Issue Type: Improvement
>Reporter: Grant Henke
>Assignee: Grant Henke
>Priority: Major
>
> Recently Kudu made enhancements to time source configuration and adjusted the 
> time source for local clusters/tests to `system_unsync`. Impala should mirror 
> that behavior in Impala test clusters given there is no need to require an 
> NTP-synchronized clock for a test where all the participating Kudu masters 
> and tablet servers run on the same node using the same local wallclock.
>  
> See the Kudu commit here for details: 
> [https://github.com/apache/kudu/commit/eb2b70d4b96be2fc2fdd6b3625acc284ac5774be]






[jira] [Created] (IMPALA-9586) Update query option docs to account for interactions with mt_dop

2020-03-31 Thread Tim Armstrong (Jira)
Tim Armstrong created IMPALA-9586:
-

 Summary: Update query option docs to account for interactions with 
mt_dop
 Key: IMPALA-9586
 URL: https://issues.apache.org/jira/browse/IMPALA-9586
 Project: IMPALA
  Issue Type: Improvement
  Components: Docs
Reporter: Tim Armstrong


In some cases mt_dop changes the behaviour of other options or makes them a 
no-op, e.g. num_scanner_threads has no effect. We need to update the docs to 
reflect this.






[jira] [Created] (IMPALA-9585) Update docs about mt_dop for IMPALA-9099

2020-03-31 Thread Tim Armstrong (Jira)
Tim Armstrong created IMPALA-9585:
-

 Summary: Update docs about mt_dop for IMPALA-9099
 Key: IMPALA-9585
 URL: https://issues.apache.org/jira/browse/IMPALA-9585
 Project: IMPALA
  Issue Type: Sub-task
  Components: Docs
Reporter: Tim Armstrong
Assignee: Tim Armstrong









[jira] [Resolved] (IMPALA-3766) Optionally compress spilled data before writing it to disk

2020-03-31 Thread Tim Armstrong (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-3766?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Armstrong resolved IMPALA-3766.
---
Fix Version/s: Impala 4.0
   Resolution: Fixed

> Optionally compress spilled data before writing it to disk
> --
>
> Key: IMPALA-3766
> URL: https://issues.apache.org/jira/browse/IMPALA-3766
> Project: IMPALA
>  Issue Type: New Feature
>  Components: Backend
>Affects Versions: Impala 2.7.0
>Reporter: Mostafa Mokhtar
>Assignee: Tim Armstrong
>Priority: Minor
>  Labels: performance
> Fix For: Impala 4.0
>
>
> Evaluate compressing the buffers before writing them to disk for spilling 
> operators. 
> Applying LZ4 on row batches before sending them over the network as part of 
> exchange provides around 2x compression. 
> {code}
>  - BytesSent: 612.87 MB (642635712)
>  - NetworkThroughput(*): 1.88 GB/sec
>  - OverallThroughput: 1.21 GB/sec
>  - PeakMemoryUsage: 51.00 KB (52224)
>  - RowsReturned: 360.00K (36)
>  - SerializeBatchTime: 176.002ms
>  - TransmitDataRPCTime: 319.005ms
>  - UncompressedRowBatchSize: 1.47 GB (1573356320)
> {code}
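
The "around 2x" figure can be checked against the counters quoted above. This is only a rough sketch: it assumes BytesSent is the post-compression size of the same row batches that UncompressedRowBatchSize measures, which the profile excerpt does not state explicitly.

```python
# Counter values copied from the profile excerpt above.
bytes_sent = 642_635_712            # BytesSent
uncompressed_bytes = 1_573_356_320  # UncompressedRowBatchSize

# Assumption: BytesSent is the compressed size of the same row batches.
compression_ratio = uncompressed_bytes / bytes_sent
print(f"compression ratio: {compression_ratio:.2f}x")
```

Under that assumption the ratio comes out a little under 2.5x, consistent with the "around 2x" estimate.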






[jira] [Commented] (IMPALA-9572) Impalad crash when process decimal value

2020-03-31 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-9572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17072091#comment-17072091
 ] 

ASF subversion and git services commented on IMPALA-9572:
-

Commit e8f604a2139be2ee3f011e6f2ce71fa0dde26492 in impala's branch 
refs/heads/master from Csaba Ringhofer
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=e8f604a ]

IMPALA-9572: Fix DCHECK in nested Parquet scanning

The issue occurred when there were skipped pages and a column
inside a collection was scanned, but its position was not needed.
The repetition level still needs to be read in this case, as the
skipped ranges are set in top level rows, so collection items
need to know which top level row they belong to.

A DCHECK in StrideWriter's constructor was hit; otherwise the
code ran correctly in release mode. The DCHECK is moved to
functions where the condition would actually cause problems.

Testing:
- added and ran a regression test

Change-Id: I5e8ef514ead71f732c73f910af7fd1aecd37bb81
Reviewed-on: http://gerrit.cloudera.org:8080/15598
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 


> Impalad crash when process decimal value
> 
>
> Key: IMPALA-9572
> URL: https://issues.apache.org/jira/browse/IMPALA-9572
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 4.0
>Reporter: Yongzhi Chen
>Assignee: Csaba Ringhofer
>Priority: Blocker
>  Labels: broken-build, crash
> Attachments: 
> impalad.impala-ec2-centos74-m5-4xlarge-ondemand-1a73.vpc.cloudera.com.jenkins.log.ERROR.20200328-031659.18222.gz
>
>
> In impala-asf-master-exhaustive build, minidump shows impalad crashed:
> Crash reason:  SIGABRT
> Crash address: 0x7d1472e
> Process uptime: not available
> Thread 396 (crashed)
>  0  libc-2.17.so + 0x351f7
> rax = 0x   rdx = 0x0006
> rcx = 0x   rbx = 0x07246600
> rsi = 0x4273   rdi = 0x472e
> rbp = 0x7f6e43ee7050   rsp = 0x7f6e43ee6ce8
>  r8 = 0xr9 = 0x7f6e43ee6b60
> r10 = 0x0008   r11 = 0x0206
> r12 = 0x07246680   r13 = 0x0044
> r14 = 0x0724dfc4   r15 = 0x07246600
> rip = 0x7f6f2b1da1f7
> Found by: given as instruction pointer in context
>  1  libc-2.17.so + 0x368e8
> rbp = 0x7f6e43ee7050   rsp = 0x7f6e43ee6cf0
> rip = 0x7f6f2b1db8e8
> Found by: stack scanning
>  2  impalad!google_breakpad::ExceptionHandler::HandleSignal(int, siginfo_t*, 
> void*) + 0x1e0
> rbp = 0x7f6e43ee7050   rsp = 0x7f6e43ee6d78
> rip = 0x04ed0840
> Found by: stack scanning
>  3  impalad!google::DumpStackTraceAndExit() + 0x24
> rbp = 0x7f6e43ee7050   rsp = 0x7f6e43ee6e20
> rip = 0x04ea2554
> Found by: stack scanning
>  4  impalad!google::LogMessage::Fail() + 0xd
> rbx = 0x07246600   rbp = 0x7f6e43ee7050
> rsp = 0x7f6e43ee6ed0   rip = 0x04e98fad
> Found by: call frame info
>  5  impalad!google::LogMessage::SendToLog() + 0x2b2
> rbx = 0x07246600   rbp = 0x7f6e43ee7050
> rsp = 0x7f6e43ee6ee0   rip = 0x04e9a852
> Found by: call frame info
>  6  impalad!google::LogMessage::Flush() + 0x157
> rbx = 0x7f6e43ee7090   rbp = 0x7f6f2bd8a6a0
> rsp = 0x7f6e43ee7060   r12 = 0x7f6e43ee707f
> r13 = 0x072554f8   r14 = 0x7f6e43ee7120
> r15 = 0x18954c50   rip = 0x04e98987
> Found by: call frame info
>  7  impalad!google::LogMessageFatal::~LogMessageFatal() + 0xe
> rbx = 0x7f6e43ee7120   rbp = 0x7f6e43ee7160
> rsp = 0x7f6e43ee70e0   r12 = 0x0001
> r13 = 0x072554f8   r14 = 0x000118ba
> r15 = 0x18954c50   rip = 0x04e9bf4e
> Found by: call frame info
>  8  impalad!impala::StrideWriter::StrideWriter(long*, long) [mem-util.h 
> : 40 + 0xd]
> rbx = 0x0001   rbp = 0x7f6e43ee7160
> rsp = 0x7f6e43ee7100   r12 = 0x0001
> r13 = 0x072554f8   r14 = 0x000118ba
> r15 = 0x18954c50   rip = 0x02c07535
> Found by: call frame info
>  9  
> impalad!impala::BaseScalarColumnReader::FillPositionsInCandidateRange(int, 
> int, unsigned char*, int) [parquet-column-readers.cc : 1319 + 0x1f]
> rbx = 0x17d4   rbp = 0x7f6e43ee7270
> rsp = 0x7f6e43ee7170   r12 = 0x
> r13 = 0x0039   r14 = 0x000118ba
> r15 = 0x18954c50   rip = 0x02c008d3
> Found by: call frame info
> 10  impalad!impala::ScalarColumnReader, 
> 

[jira] [Commented] (IMPALA-9584) test_analytic_fns is flaky (small fractional differences in AVG)

2020-03-31 Thread Csaba Ringhofer (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-9584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17072008#comment-17072008
 ] 

Csaba Ringhofer commented on IMPALA-9584:
-

https://gerrit.cloudera.org/#/c/15621/

> test_analytic_fns is flaky (small fractional differences in AVG)
> 
>
> Key: IMPALA-9584
> URL: https://issues.apache.org/jira/browse/IMPALA-9584
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Reporter: Csaba Ringhofer
>Assignee: Csaba Ringhofer
>Priority: Blocker
>  Labels: broken-build, flaky-test
>
> The issue occurred here:
> https://jenkins.impala.io/job/ubuntu-16.04-dockerised-tests/2065/testReport/query_test.test_queries/TestQueries/test_analytic_fns_protocol__hs2___exec_optionbatch_size___0___num_nodes___0___disable_codegen_rows_threshold___0___disable_codegen___False___abort_on_error___1___exec_single_node_rows_threshold___0table_format__parquet_none_/
> The same test failed with all protocols.
> Failing query:
> https://github.com/apache/impala/blob/ebbe52b4bed944d3012e3679dc984827ce11d5a8/testdata/workloads/functional-query/queries/QueryTest/analytic-fns.test#L2
> Errors are like (hs2):
> 2009,3,6,3.667,2009-03-01 20:12:00.475000,'0','8' != 
> 2009,3,6,3.667,2009-03-01 20:12:00.474999,'0','8'
> I think that the problem is not the difference in the 4th double column, as 
> that depends on the client used and we do not require complete match during 
> comparison. So the problem is likely to be the timestamp in the 5th column.






[jira] [Commented] (IMPALA-8690) Better eviction algorithm for data cache

2020-03-31 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-8690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17071979#comment-17071979
 ] 

ASF subversion and git services commented on IMPALA-8690:
-

Commit 01691b998a96fd29575a3c3912377bd18fdd712f in impala's branch 
refs/heads/master from Joe McDonnell
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=01691b9 ]

IMPALA-8690: Add LIRS cache eviction algorithm

One concern for the data cache is that the LRU eviction
algorithm is susceptible to being flushed by large
scans of low priority data. This implements the LIRS algorithm
described in "LIRS: An Efficient Low Inter-reference Recency
Set Replacement Policy to Improve Buffer Cache Performance"
by Song Jiang / Xiaodong Zhang 2002. LIRS is a scan-resistant
eviction algorithm with a low performance penalty relative to LRU.
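
The scan problem described above can be reproduced with a toy LRU cache. This is a minimal sketch for illustration only: the LruCache class and hot_hits_after_scan helper are hypothetical names, not Impala's data cache code, and the sketch does not implement LIRS. It only shows why plain LRU loses its hot set once a scan longer than the free capacity passes through.

```python
from collections import OrderedDict


class LruCache:
    """Toy LRU cache: least recently used key sits at the front."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()

    def lookup(self, key):
        """Return True on a hit; on a miss, insert the key and evict if full."""
        if key in self.entries:
            self.entries.move_to_end(key)  # mark as most recently used
            return True
        self.entries[key] = None
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)  # evict least recently used
        return False


def hot_hits_after_scan(capacity, hot_keys, scan_len):
    """Warm the cache with hot_keys, run a one-off scan, count hot hits."""
    cache = LruCache(capacity)
    for _ in range(2):
        for key in hot_keys:
            cache.lookup(key)
    for i in range(scan_len):
        cache.lookup(("scan", i))  # each scan key is touched exactly once
    return sum(cache.lookup(key) for key in hot_keys)


hot = ["h0", "h1", "h2"]
print(hot_hits_after_scan(capacity=4, hot_keys=hot, scan_len=1))
print(hot_hits_after_scan(capacity=4, hot_keys=hot, scan_len=4))
```

With a scan that fits in the spare slot, all three hot keys still hit afterwards. With a scan as long as the cache, none do, even though the scanned keys are never touched again.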

This introduces the startup flag data_cache_eviction_policy to
control which eviction policy to use. The only two options are
LRU and LIRS, with the default continuing to be LRU.

To accommodate the new algorithm and associated tests, some
code moved around:
1. The RLCacheShard implementation moved from util/cache/cache.cc
   to util/cache/rl-cache.cc.
2. The backend cache tests were split into multiple files.
   util/cache/cache-test.h contains shared cache testing code.
   util/cache/cache-test.cc contains generic tests that should
   work for any algorithm.
   util/cache/rl-cache-test.cc are RLCacheShard specific tests
   util/cache/lirs-cache-test.cc are LIRS specific tests
3. To make it easy for clients of the cache code to customize
   the cache eviction algorithm, the public interface changed
   from using a template to taking the policy as an argument.
4. Cache::MemoryType is removed.
5. Cache adds an Init() method to verify the validity of
   startup flags

Testing:
 - Added LIRS specific backend cache tests (lirs-cache-test)
 - Ran TPC-DS with a very small cache and concurrency to test
   corner cases with the LIRS eviction policy
 - Parameterized data-cache-test to run for both LRU and LIRS
 - Added LIRS equivalents for tests in custom_cluster/test_data_cache.py
 - Ran cache-bench with LRU and LIRS. The results are:
   Test case           | Algorithm | Lookups / sec | Hit rate
   ZIPFIAN ratio=1.00x | LRU       | 11.31M        | 99.9%
   ZIPFIAN ratio=1.00x | LIRS      | 10.09M        | 99.8%
   ZIPFIAN ratio=3.00x | LRU       | 11.36M        | 95.9%
   ZIPFIAN ratio=3.00x | LIRS      |  9.27M        | 96.4%
   UNIFORM ratio=1.00x | LRU       |  7.46M        | 99.8%
   UNIFORM ratio=1.00x | LIRS      |  6.93M        | 99.8%
   UNIFORM ratio=3.00x | LRU       |  5.63M        | 33.3%
   UNIFORM ratio=3.00x | LIRS      |  3.24M        | 33.3%
   The takeaway is that LIRS is a bit slower on lookups and
   quite a bit slower on inserts. However, they both are still
   doing millions of operations per second, so it should not
   be a bottleneck for the data cache.
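
To put numbers on "a bit slower", the cache-bench figures above can be turned into LIRS throughput relative to LRU. This is just arithmetic on the reported results; the variable names are illustrative and nothing here is measured independently.

```python
# (workload, LRU lookups/sec in millions, LIRS lookups/sec in millions),
# copied from the cache-bench results above.
bench = [
    ("ZIPFIAN ratio=1.00x", 11.31, 10.09),
    ("ZIPFIAN ratio=3.00x", 11.36, 9.27),
    ("UNIFORM ratio=1.00x", 7.46, 6.93),
    ("UNIFORM ratio=3.00x", 5.63, 3.24),
]

# LIRS throughput as a fraction of LRU throughput for each workload.
relative = {name: lirs / lru for name, lru, lirs in bench}

for name, fraction in relative.items():
    print(f"{name}: LIRS at {fraction:.0%} of LRU throughput")
```

The largest gap is in the UNIFORM ratio=3.00x case, which also has the lowest hit rate of the four workloads.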

Change-Id: I670fa4b2b7c93998130dc4e8b2546bb93e9a84f8
Reviewed-on: http://gerrit.cloudera.org:8080/15306
Reviewed-by: Joe McDonnell 
Tested-by: Impala Public Jenkins 


> Better eviction algorithm for data cache
> 
>
> Key: IMPALA-8690
> URL: https://issues.apache.org/jira/browse/IMPALA-8690
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Backend
>Affects Versions: Impala 3.3.0
>Reporter: Michael Ho
>Assignee: Joe McDonnell
>Priority: Major
>
> With the current implementation of data cache, all data access will be cached 
> regardless of the access pattern. The current LRU eviction algorithm is not 
> resistant to scan traffic so in case some users scan a big fact table, a lot 
> of the heavily accessed items will inevitably be evicted. We should adopt a 
> better eviction algorithm (e.g. LRFU or another well-known one from the 
> literature). Would be nice to evaluate it against some users' traces now that 
> IMPALA-8542 is fixed.
> In the short run, we probably need some workaround (e.g. query hints to 
> disable caching for certain tables). Will file a separate jira for it.






[jira] [Assigned] (IMPALA-9584) test_analytic_fns is flaky (small fractional differences in AVG)

2020-03-31 Thread Csaba Ringhofer (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-9584?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Csaba Ringhofer reassigned IMPALA-9584:
---

Assignee: Csaba Ringhofer

> test_analytic_fns is flaky (small fractional differences in AVG)
> 
>
> Key: IMPALA-9584
> URL: https://issues.apache.org/jira/browse/IMPALA-9584
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Reporter: Csaba Ringhofer
>Assignee: Csaba Ringhofer
>Priority: Blocker
>  Labels: broken-build, flaky-test
>
> The issue occurred here:
> https://jenkins.impala.io/job/ubuntu-16.04-dockerised-tests/2065/testReport/query_test.test_queries/TestQueries/test_analytic_fns_protocol__hs2___exec_optionbatch_size___0___num_nodes___0___disable_codegen_rows_threshold___0___disable_codegen___False___abort_on_error___1___exec_single_node_rows_threshold___0table_format__parquet_none_/
> The same test failed with all protocols.
> Failing query:
> https://github.com/apache/impala/blob/ebbe52b4bed944d3012e3679dc984827ce11d5a8/testdata/workloads/functional-query/queries/QueryTest/analytic-fns.test#L2
> Errors are like (hs2):
> 2009,3,6,3.667,2009-03-01 20:12:00.475000,'0','8' != 
> 2009,3,6,3.667,2009-03-01 20:12:00.474999,'0','8'
> I think that the problem is not the difference in the 4th double column, as 
> that depends on the client used and we do not require complete match during 
> comparison. So the problem is likely to be the timestamp in the 5th column.






[jira] [Commented] (IMPALA-9584) test_analytic_fns is flaky (small fractional differences in AVG)

2020-03-31 Thread Tim Armstrong (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-9584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17071958#comment-17071958
 ] 

Tim Armstrong commented on IMPALA-9584:
---

[~csringhofer] yeah let's just remove this from the test for now to unblock, 
this is causing a lot of failures

> test_analytic_fns is flaky (small fractional differences in AVG)
> 
>
> Key: IMPALA-9584
> URL: https://issues.apache.org/jira/browse/IMPALA-9584
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Reporter: Csaba Ringhofer
>Priority: Major
>  Labels: flaky-test
>
> The issue occurred here:
> https://jenkins.impala.io/job/ubuntu-16.04-dockerised-tests/2065/testReport/query_test.test_queries/TestQueries/test_analytic_fns_protocol__hs2___exec_optionbatch_size___0___num_nodes___0___disable_codegen_rows_threshold___0___disable_codegen___False___abort_on_error___1___exec_single_node_rows_threshold___0table_format__parquet_none_/
> The same test failed with all protocols.
> Failing query:
> https://github.com/apache/impala/blob/ebbe52b4bed944d3012e3679dc984827ce11d5a8/testdata/workloads/functional-query/queries/QueryTest/analytic-fns.test#L2
> Errors are like (hs2):
> 2009,3,6,3.667,2009-03-01 20:12:00.475000,'0','8' != 
> 2009,3,6,3.667,2009-03-01 20:12:00.474999,'0','8'
> I think that the problem is not the difference in the 4th double column, as 
> that depends on the client used and we do not require complete match during 
> comparison. So the problem is likely to be the timestamp in the 5th column.






[jira] [Updated] (IMPALA-9584) test_analytic_fns is flaky (small fractional differences in AVG)

2020-03-31 Thread Tim Armstrong (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-9584?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Armstrong updated IMPALA-9584:
--
Labels: broken-build flaky-test  (was: flaky-test)

> test_analytic_fns is flaky (small fractional differences in AVG)
> 
>
> Key: IMPALA-9584
> URL: https://issues.apache.org/jira/browse/IMPALA-9584
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Reporter: Csaba Ringhofer
>Priority: Blocker
>  Labels: broken-build, flaky-test
>
> The issue occurred here:
> https://jenkins.impala.io/job/ubuntu-16.04-dockerised-tests/2065/testReport/query_test.test_queries/TestQueries/test_analytic_fns_protocol__hs2___exec_optionbatch_size___0___num_nodes___0___disable_codegen_rows_threshold___0___disable_codegen___False___abort_on_error___1___exec_single_node_rows_threshold___0table_format__parquet_none_/
> The same test failed with all protocols.
> Failing query:
> https://github.com/apache/impala/blob/ebbe52b4bed944d3012e3679dc984827ce11d5a8/testdata/workloads/functional-query/queries/QueryTest/analytic-fns.test#L2
> Errors are like (hs2):
> 2009,3,6,3.667,2009-03-01 20:12:00.475000,'0','8' != 
> 2009,3,6,3.667,2009-03-01 20:12:00.474999,'0','8'
> I think that the problem is not the difference in the 4th double column, as 
> that depends on the client used and we do not require complete match during 
> comparison. So the problem is likely to be the timestamp in the 5th column.






[jira] [Updated] (IMPALA-9584) test_analytic_fns is flaky (small fractional differences in AVG)

2020-03-31 Thread Tim Armstrong (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-9584?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Armstrong updated IMPALA-9584:
--
Target Version: Impala 4.0

> test_analytic_fns is flaky (small fractional differences in AVG)
> 
>
> Key: IMPALA-9584
> URL: https://issues.apache.org/jira/browse/IMPALA-9584
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Reporter: Csaba Ringhofer
>Priority: Blocker
>  Labels: broken-build, flaky-test
>
> The issue occurred here:
> https://jenkins.impala.io/job/ubuntu-16.04-dockerised-tests/2065/testReport/query_test.test_queries/TestQueries/test_analytic_fns_protocol__hs2___exec_optionbatch_size___0___num_nodes___0___disable_codegen_rows_threshold___0___disable_codegen___False___abort_on_error___1___exec_single_node_rows_threshold___0table_format__parquet_none_/
> The same test failed with all protocols.
> Failing query:
> https://github.com/apache/impala/blob/ebbe52b4bed944d3012e3679dc984827ce11d5a8/testdata/workloads/functional-query/queries/QueryTest/analytic-fns.test#L2
> Errors are like (hs2):
> 2009,3,6,3.667,2009-03-01 20:12:00.475000,'0','8' != 
> 2009,3,6,3.667,2009-03-01 20:12:00.474999,'0','8'
> I think that the problem is not the difference in the 4th double column, as 
> that depends on the client used and we do not require complete match during 
> comparison. So the problem is likely to be the timestamp in the 5th column.






[jira] [Updated] (IMPALA-9584) test_analytic_fns is flaky (small fractional differences in AVG)

2020-03-31 Thread Tim Armstrong (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-9584?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Armstrong updated IMPALA-9584:
--
Priority: Blocker  (was: Major)

> test_analytic_fns is flaky (small fractional differences in AVG)
> 
>
> Key: IMPALA-9584
> URL: https://issues.apache.org/jira/browse/IMPALA-9584
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Reporter: Csaba Ringhofer
>Priority: Blocker
>  Labels: flaky-test
>
> The issue occurred here:
> https://jenkins.impala.io/job/ubuntu-16.04-dockerised-tests/2065/testReport/query_test.test_queries/TestQueries/test_analytic_fns_protocol__hs2___exec_optionbatch_size___0___num_nodes___0___disable_codegen_rows_threshold___0___disable_codegen___False___abort_on_error___1___exec_single_node_rows_threshold___0table_format__parquet_none_/
> The same test failed with all protocols.
> Failing query:
> https://github.com/apache/impala/blob/ebbe52b4bed944d3012e3679dc984827ce11d5a8/testdata/workloads/functional-query/queries/QueryTest/analytic-fns.test#L2
> Errors are like (hs2):
> 2009,3,6,3.667,2009-03-01 20:12:00.475000,'0','8' != 
> 2009,3,6,3.667,2009-03-01 20:12:00.474999,'0','8'
> I think that the problem is not the difference in the 4th double column, as 
> that depends on the client used and we do not require complete match during 
> comparison. So the problem is likely to be the timestamp in the 5th column.






[jira] [Commented] (IMPALA-9584) test_analytic_fns is flaky (small fractional differences in AVG)

2020-03-31 Thread Csaba Ringhofer (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-9584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17071956#comment-17071956
 ] 

Csaba Ringhofer commented on IMPALA-9584:
-

>I think this is probably somehow indirectly caused by IMPALA-8005 randomising 
>the exchanges.
Ah, that explains why it came up often recently.

>We're checking too many significant figures, really
In the AVG() case this is true, as double is involved, but in all other cases 
timestamp functions should be deterministic with nanosec precision. I wouldn't 
make the tests "weaker" for the sake of AVG, which is probably not used too 
often with timestamps anyway.

I see two solutions:
- short term: Remove avg(timestamp_col) from the test. I don't think that it is 
crucial here, e.g. max could also be used. I grepped for avg(timestamp_col) and 
it is only used in this file in 2 tests.
- long term: Change the accumulator in the intermediate format of 
AVG(TIMESTAMP), see IMPALA-7472. (I wrote decimal there, but two integers for 
day_sum and nano_sum would probably be faster.)
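
The long-term idea can be sketched as an intermediate state built from integers only. This is a hypothetical illustration, not Impala's aggregate code: TsAvgState and its methods are made-up names, and the sample (day, nanos-of-day) values are arbitrary. Python integers never overflow, so a real implementation with fixed-width integers would also need to consider overflow of the sums.

```python
from dataclasses import dataclass

NANOS_PER_DAY = 86_400 * 1_000_000_000


@dataclass(frozen=True)
class TsAvgState:
    """Hypothetical intermediate state for AVG(TIMESTAMP) using integers."""

    day_sum: int = 0
    nano_sum: int = 0
    count: int = 0

    def add(self, day, nanos_of_day):
        """Fold one timestamp, given as (day number, nanoseconds of day)."""
        return TsAvgState(self.day_sum + day,
                          self.nano_sum + nanos_of_day,
                          self.count + 1)

    def merge(self, other):
        """Combine two partial states; integer addition is associative."""
        return TsAvgState(self.day_sum + other.day_sum,
                          self.nano_sum + other.nano_sum,
                          self.count + other.count)

    def finalize(self):
        """Return the average as (day, nanos_of_day), or None if empty."""
        if self.count == 0:
            return None
        total_nanos = self.day_sum * NANOS_PER_DAY + self.nano_sum
        return divmod(total_nanos // self.count, NANOS_PER_DAY)


# Three partial states, standing in for three scan splits.
a = TsAvgState().add(14610, 1)
b = TsAvgState().add(14611, 2)
c = TsAvgState().add(14612, 3)

# Any merge order yields the same state, hence the same average.
print(a.merge(b).merge(c).finalize())
print(c.merge(a.merge(b)).finalize())
```

Because every step is exact integer arithmetic, the result does not depend on the order in which the partial states are merged, which is the property the double-based accumulator lacks.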

> test_analytic_fns is flaky (small fractional differences in AVG)
> 
>
> Key: IMPALA-9584
> URL: https://issues.apache.org/jira/browse/IMPALA-9584
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Reporter: Csaba Ringhofer
>Priority: Major
>  Labels: flaky-test
>
> The issue occurred here:
> https://jenkins.impala.io/job/ubuntu-16.04-dockerised-tests/2065/testReport/query_test.test_queries/TestQueries/test_analytic_fns_protocol__hs2___exec_optionbatch_size___0___num_nodes___0___disable_codegen_rows_threshold___0___disable_codegen___False___abort_on_error___1___exec_single_node_rows_threshold___0table_format__parquet_none_/
> The same test failed with all protocols.
> Failing query:
> https://github.com/apache/impala/blob/ebbe52b4bed944d3012e3679dc984827ce11d5a8/testdata/workloads/functional-query/queries/QueryTest/analytic-fns.test#L2
> Errors are like (hs2):
> 2009,3,6,3.667,2009-03-01 20:12:00.475000,'0','8' != 
> 2009,3,6,3.667,2009-03-01 20:12:00.474999,'0','8'
> I think that the problem is not the difference in the 4th double column, as 
> that depends on the client used and we do not require complete match during 
> comparison. So the problem is likely to be the timestamp in the 5th column.






[jira] [Commented] (IMPALA-1995) Flaky test: PlannerTest.testHbase: splits for HBASE KEYRANGE not set up correctly.

2020-03-31 Thread Tim Armstrong (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-1995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17071939#comment-17071939
 ] 

Tim Armstrong commented on IMPALA-1995:
---

[https://jenkins.impala.io/job/ubuntu-16.04-from-scratch/10021/testReport/junit/org.apache.impala.planner/PlannerTest/testHbaseNoKeyEstimate/]

> Flaky test: PlannerTest.testHbase: splits for HBASE KEYRANGE not set up 
> correctly.
> --
>
> Key: IMPALA-1995
> URL: https://issues.apache.org/jira/browse/IMPALA-1995
> Project: IMPALA
>  Issue Type: Task
>  Components: Frontend
>Affects Versions: Impala 2.3.0, Impala 3.2.0
>Reporter: Alexander Behm
>Assignee: Joe McDonnell
>Priority: Critical
>  Labels: broken-build, flaky, test-infra
> Fix For: Impala 2.13.0, Impala 3.1.0
>
>
> Looks like testdata/bin/split-hbase.sh does not always set up the ranges in 
> the way we want. See failure below.
> Example run:
> http://sandbox.jenkins.cloudera.com/job/impala-master-cdh5-trunk/1090/
> {code}
> Stack Trace:
> java.lang.AssertionError: section SCANRANGELOCATIONS of query:
> select * from functional_hbase.alltypessmall
> where id < 5
> actual result doesn't match expected result:
>   HBASE KEYRANGE port=16202 3:7
> ^^^
>   HBASE KEYRANGE port=16203 7:
>   HBASE KEYRANGE port=16203 :3
> NODE 0:
> expected:
>   HBASE KEYRANGE port=16201 :3
>   HBASE KEYRANGE port=16202 3:7
>   HBASE KEYRANGE port=16203 7:
> NODE 0:
> section SCANRANGELOCATIONS of query:
> select * from functional_hbase.stringids
> where id < '5'
> and tinyint_col = 5
> actual result doesn't match expected result:
>   HBASE KEYRANGE port=16201 1:3
> ^^^
>   HBASE KEYRANGE port=16202 3:5
>   HBASE KEYRANGE port=16203 :1
> NODE 0:
> expected:
>   HBASE KEYRANGE port=16201 :3
>   HBASE KEYRANGE port=16202 3:5
> NODE 0:
> section SCANRANGELOCATIONS of query:
> select * from functional_hbase.alltypesagg
> where bigint_col is not null and bool_col = true
> actual result doesn't match expected result:
>   HBASE KEYRANGE port=16201 1:3
> ^^^
>   HBASE KEYRANGE port=16202 3:7
>   HBASE KEYRANGE port=16203 7:
>   HBASE KEYRANGE port=16203 :1
> NODE 0:
> expected:
>   HBASE KEYRANGE port=16201 :3
>   HBASE KEYRANGE port=16202 3:7
>   HBASE KEYRANGE port=16203 7:
> NODE 0:
> {code}






[jira] [Comment Edited] (IMPALA-9584) test_analytic_fns is flaky (small fractional differences in AVG)

2020-03-31 Thread Tim Armstrong (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-9584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17071937#comment-17071937
 ] 

Tim Armstrong edited comment on IMPALA-9584 at 3/31/20, 4:24 PM:
-

I also saw this here: 
[https://jenkins.impala.io/job/ubuntu-16.04-dockerised-tests/2048/#showFailuresLink]

[~csringhofer]'s theory seems correct. We're checking too many significant 
figures, really. I think this is probably somehow indirectly caused by 
IMPALA-8005 randomising the exchanges.


was (Author: tarmstrong):
I also saw this here: 
https://jenkins.impala.io/job/ubuntu-16.04-dockerised-tests/2048/#showFailuresLink

[~csringhofer]'s theory seems correct. -It's unclear why this would just start 
happening now when it had previously been stable, but probably - we're checking 
too many significant figures. I think this is probably somehow indirectly 
caused by IMPALA-8005 randomising the exchanges.

> test_analytic_fns is flaky (small fractional differences in AVG)
> 
>
> Key: IMPALA-9584
> URL: https://issues.apache.org/jira/browse/IMPALA-9584
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Reporter: Csaba Ringhofer
>Priority: Major
>  Labels: flaky-test
>
> The issue occurred here:
> https://jenkins.impala.io/job/ubuntu-16.04-dockerised-tests/2065/testReport/query_test.test_queries/TestQueries/test_analytic_fns_protocol__hs2___exec_optionbatch_size___0___num_nodes___0___disable_codegen_rows_threshold___0___disable_codegen___False___abort_on_error___1___exec_single_node_rows_threshold___0table_format__parquet_none_/
> The same test failed with all protocols.
> Failing query:
> https://github.com/apache/impala/blob/ebbe52b4bed944d3012e3679dc984827ce11d5a8/testdata/workloads/functional-query/queries/QueryTest/analytic-fns.test#L2
> Errors are like (hs2):
> 2009,3,6,3.667,2009-03-01 20:12:00.475000,'0','8' != 
> 2009,3,6,3.667,2009-03-01 20:12:00.474999,'0','8'
> I think that the problem is not the difference in the 4th double column, as 
> that depends on the client used and we do not require complete match during 
> comparison. So the problem is likely to be the timestamp in the 5th column.






[jira] [Commented] (IMPALA-9584) test_analytic_fns is flaky (small fractional differences in AVG)

2020-03-31 Thread Tim Armstrong (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-9584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17071937#comment-17071937
 ] 

Tim Armstrong commented on IMPALA-9584:
---

I also saw this here: 
https://jenkins.impala.io/job/ubuntu-16.04-dockerised-tests/2048/#showFailuresLink

[~csringhofer]'s theory seems correct. -It's unclear why this would just start 
happening now when it had previously been stable, but probably - we're checking 
too many significant figures. I think this is probably somehow indirectly 
caused by IMPALA-8005 randomising the exchanges.

> test_analytic_fns is flaky (small fractional differences in AVG)
> 
>
> Key: IMPALA-9584
> URL: https://issues.apache.org/jira/browse/IMPALA-9584
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Reporter: Csaba Ringhofer
>Priority: Major
>  Labels: flaky-test
>
> The issue occurred here:
> https://jenkins.impala.io/job/ubuntu-16.04-dockerised-tests/2065/testReport/query_test.test_queries/TestQueries/test_analytic_fns_protocol__hs2___exec_optionbatch_size___0___num_nodes___0___disable_codegen_rows_threshold___0___disable_codegen___False___abort_on_error___1___exec_single_node_rows_threshold___0table_format__parquet_none_/
> The same test failed with all protocols.
> Failing query:
> https://github.com/apache/impala/blob/ebbe52b4bed944d3012e3679dc984827ce11d5a8/testdata/workloads/functional-query/queries/QueryTest/analytic-fns.test#L2
> Errors are like (hs2):
> 2009,3,6,3.667,2009-03-01 20:12:00.475000,'0','8' != 
> 2009,3,6,3.667,2009-03-01 20:12:00.474999,'0','8'
> I think that the problem is not the difference in the 4th double column, as 
> that depends on the client used and we do not require complete match during 
> comparison. So the problem is likely to be the timestamp in the 5th column.






[jira] [Comment Edited] (IMPALA-9584) test_analytic_fns is flaky (small fractional differences in AVG)

2020-03-31 Thread Csaba Ringhofer (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-9584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17071931#comment-17071931
 ] 

Csaba Ringhofer edited comment on IMPALA-9584 at 3/31/20, 4:19 PM:
---

I do have a theory here, simply avg(timestamp_col) is not deterministic:
select avg(timestamp_col) from functional_parquet.alltypes;
on my machine it returns 
2009-12-31 14:30:20.341989994 or 
2009-12-31 14:30:20.341990232

I think that the underlying cause is "double trouble": 
- AVG() on timestamps converts timestamps to double and stores their sum in 
double (this is a bad idea in my opinion, another reason to change to a 
different mechanism is mentioned in IMPALA-7472)
- addition for doubles is not associative due to precision loss

So merging aggregates for more than 2 subsets (e.g splits) can lead to 
different results depending on the order of the merges. 
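
The order dependence is easy to show with plain doubles. This is a minimal sketch with made-up partial sums, not values taken from the failing test: it merges three partial sums in every possible order and collects the distinct results.

```python
from functools import reduce
from itertools import permutations


def merge(partials):
    """Merge partial sums left to right as a chain of double additions."""
    return reduce(lambda acc, x: acc + x, partials)


# Three partial sums, standing in for per-split intermediate aggregates.
# At 1e16 the spacing between adjacent doubles is 2, so adding 1.0 to it
# is lost to rounding, while adding 2.0 is exact.
partial_sums = [1e16, 1.0, 1.0]

distinct_results = {merge(order) for order in permutations(partial_sums)}
print(sorted(distinct_results))
```

The set holds two different totals for the same three inputs, so any average computed from such a sum can differ in its last digits from run to run when the merge order is not fixed.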


was (Author: csringhofer):
I do have a theory here, simply avg(timestamp_col) is not deterministic:
select avg(timestamp_col) from functional_parquet.alltypes;
on my machine it returns 
2009-12-31 14:30:20.341989994 or 
2009-12-31 14:30:20.341990232

I think that the underlying cause is "double trouble": 
- AVG() on timestamps converts timestamps to double and stores their sum in 
double
- addition for doubles is not associative due to precision loss
So merging aggregates for more than 2 subsets (e.g splits) can lead to 
different results depending on the order of the merges. 

> test_analytic_fns is flaky (small fractional differences in AVG)
> 
>
> Key: IMPALA-9584
> URL: https://issues.apache.org/jira/browse/IMPALA-9584
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Reporter: Csaba Ringhofer
>Priority: Major
>  Labels: flaky-test
>
> The issue occurred here:
> https://jenkins.impala.io/job/ubuntu-16.04-dockerised-tests/2065/testReport/query_test.test_queries/TestQueries/test_analytic_fns_protocol__hs2___exec_optionbatch_size___0___num_nodes___0___disable_codegen_rows_threshold___0___disable_codegen___False___abort_on_error___1___exec_single_node_rows_threshold___0table_format__parquet_none_/
> The same test failed with all protocols.
> Failing query:
> https://github.com/apache/impala/blob/ebbe52b4bed944d3012e3679dc984827ce11d5a8/testdata/workloads/functional-query/queries/QueryTest/analytic-fns.test#L2
> Errors are like (hs2):
> 2009,3,6,3.667,2009-03-01 20:12:00.475000,'0','8' != 
> 2009,3,6,3.667,2009-03-01 20:12:00.474999,'0','8'
> I think that the problem is not the difference in the 4th double column, as 
> that depends on the client used and we do not require complete match during 
> comparison. So the problem is likely to be the timestamp in the 5th column.






[jira] [Commented] (IMPALA-9584) test_analytic_fns is flaky (small fractional differences in AVG)

2020-03-31 Thread Csaba Ringhofer (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-9584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17071931#comment-17071931
 ] 

Csaba Ringhofer commented on IMPALA-9584:
-

I do have a theory here: avg(timestamp_col) is simply not deterministic.
select avg(timestamp_col) from functional_parquet.alltypes;
on my machine returns either
2009-12-31 14:30:20.341989994 or 
2009-12-31 14:30:20.341990232

I think that the underlying cause is "double trouble": 
- AVG() on timestamps converts the timestamps to double and stores their sum 
in a double
- addition of doubles is not associative due to precision loss

So merging aggregates for more than 2 subsets (e.g. splits) can lead to 
different results depending on the order of the merges. 

> test_analytic_fns is flaky (small fractional differences in AVG)
> 
>
> Key: IMPALA-9584
> URL: https://issues.apache.org/jira/browse/IMPALA-9584
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Reporter: Csaba Ringhofer
>Priority: Major
>  Labels: flaky-test
>
> The issue occurred here:
> https://jenkins.impala.io/job/ubuntu-16.04-dockerised-tests/2065/testReport/query_test.test_queries/TestQueries/test_analytic_fns_protocol__hs2___exec_optionbatch_size___0___num_nodes___0___disable_codegen_rows_threshold___0___disable_codegen___False___abort_on_error___1___exec_single_node_rows_threshold___0table_format__parquet_none_/
> The same test failed with all protocols.
> Failing query:
> https://github.com/apache/impala/blob/ebbe52b4bed944d3012e3679dc984827ce11d5a8/testdata/workloads/functional-query/queries/QueryTest/analytic-fns.test#L2
> Errors are like (hs2):
> 2009,3,6,3.667,2009-03-01 20:12:00.475000,'0','8' != 
> 2009,3,6,3.667,2009-03-01 20:12:00.474999,'0','8'
> I think that the problem is not the difference in the 4th double column, as 
> that depends on the client used and we do not require complete match during 
> comparison. So the problem is likely to be the timestamp in the 5th column.






[jira] [Updated] (IMPALA-9582) Update thrift_sasl 0.4.1 --> 0.4.2 for impala-shell

2020-03-31 Thread David Knupp (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-9582?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Knupp updated IMPALA-9582:

Priority: Blocker  (was: Critical)

> Update thrift_sasl 0.4.1 --> 0.4.2 for impala-shell
> --
>
> Key: IMPALA-9582
> URL: https://issues.apache.org/jira/browse/IMPALA-9582
> Project: IMPALA
>  Issue Type: Bug
>  Components: Clients
>Affects Versions: Impala 3.4.0
>Reporter: David Knupp
>Assignee: David Knupp
>Priority: Blocker
> Fix For: Impala 3.4.0
>
>
> thrift_sasl 0.4.1 introduced a regression whereby the Thrift transport was 
> not reading all data, causing clients to hang.






[jira] [Updated] (IMPALA-9582) Update thrift_sasl 0.4.1 --> 0.4.2 for impala-shell

2020-03-31 Thread David Knupp (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-9582?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Knupp updated IMPALA-9582:

Issue Type: Bug  (was: Improvement)

> Update thrift_sasl 0.4.1 --> 0.4.2 for impala-shell
> --
>
> Key: IMPALA-9582
> URL: https://issues.apache.org/jira/browse/IMPALA-9582
> Project: IMPALA
>  Issue Type: Bug
>  Components: Clients
>Affects Versions: Impala 3.4.0
>Reporter: David Knupp
>Assignee: David Knupp
>Priority: Critical
> Fix For: Impala 3.4.0
>
>
> thrift_sasl 0.4.1 introduced a regression whereby the Thrift transport was 
> not reading all data, causing clients to hang.






[jira] [Updated] (IMPALA-9584) test_analytic_fns is flaky (small fractional differences in AVG)

2020-03-31 Thread Csaba Ringhofer (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-9584?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Csaba Ringhofer updated IMPALA-9584:

Description: 
The issue occurred here:
https://jenkins.impala.io/job/ubuntu-16.04-dockerised-tests/2065/testReport/query_test.test_queries/TestQueries/test_analytic_fns_protocol__hs2___exec_optionbatch_size___0___num_nodes___0___disable_codegen_rows_threshold___0___disable_codegen___False___abort_on_error___1___exec_single_node_rows_threshold___0table_format__parquet_none_/

The same test failed with all protocols.

Failing query:
https://github.com/apache/impala/blob/ebbe52b4bed944d3012e3679dc984827ce11d5a8/testdata/workloads/functional-query/queries/QueryTest/analytic-fns.test#L2

Errors are like (hs2):
2009,3,6,3.667,2009-03-01 20:12:00.475000,'0','8' != 
2009,3,6,3.667,2009-03-01 20:12:00.474999,'0','8'

I think that the problem is not the difference in the 4th (double) column, as 
that depends on the client used and we do not require a complete match during 
comparison. So the problem is likely to be the timestamp in the 5th column.

  was:
The issue occurred here:
https://jenkins.impala.io/job/ubuntu-16.04-dockerised-tests/2065/testReport/query_test.test_queries/TestQueries/test_analytic_fns_protocol__hs2___exec_optionbatch_size___0___num_nodes___0___disable_codegen_rows_threshold___0___disable_codegen___False___abort_on_error___1___exec_single_node_rows_threshold___0table_format__parquet_none_/

Failing query:
https://github.com/apache/impala/blob/ebbe52b4bed944d3012e3679dc984827ce11d5a8/testdata/workloads/functional-query/queries/QueryTest/analytic-fns.test#L2

Errors are like:
2009,3,6,3.667,2009-03-01 20:12:00.475000,'0','8' != 
2009,3,6,3.667,2009-03-01 20:12:00.474999,'0','8'


> test_analytic_fns is flaky (small fractional differences in AVG)
> 
>
> Key: IMPALA-9584
> URL: https://issues.apache.org/jira/browse/IMPALA-9584
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Reporter: Csaba Ringhofer
>Priority: Major
>  Labels: flaky-test
>
> The issue occurred here:
> https://jenkins.impala.io/job/ubuntu-16.04-dockerised-tests/2065/testReport/query_test.test_queries/TestQueries/test_analytic_fns_protocol__hs2___exec_optionbatch_size___0___num_nodes___0___disable_codegen_rows_threshold___0___disable_codegen___False___abort_on_error___1___exec_single_node_rows_threshold___0table_format__parquet_none_/
> The same test failed with all protocols.
> Failing query:
> https://github.com/apache/impala/blob/ebbe52b4bed944d3012e3679dc984827ce11d5a8/testdata/workloads/functional-query/queries/QueryTest/analytic-fns.test#L2
> Errors are like (hs2):
> 2009,3,6,3.667,2009-03-01 20:12:00.475000,'0','8' != 
> 2009,3,6,3.667,2009-03-01 20:12:00.474999,'0','8'
> I think that the problem is not the difference in the 4th double column, as 
> that depends on the client used and we do not require complete match during 
> comparison. So the problem is likely to be the timestamp in the 5th column.






[jira] [Commented] (IMPALA-9584) test_analytic_fns is flaky (small fractional differences in AVG)

2020-03-31 Thread Sahil Takiar (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-9584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17071911#comment-17071911
 ] 

Sahil Takiar commented on IMPALA-9584:
--

Just saw this here as well: 
https://jenkins.impala.io/job/ubuntu-16.04-from-scratch/10016/

> test_analytic_fns is flaky (small fractional differences in AVG)
> 
>
> Key: IMPALA-9584
> URL: https://issues.apache.org/jira/browse/IMPALA-9584
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Reporter: Csaba Ringhofer
>Priority: Major
>  Labels: flaky-test
>
> The issue occurred here:
> https://jenkins.impala.io/job/ubuntu-16.04-dockerised-tests/2065/testReport/query_test.test_queries/TestQueries/test_analytic_fns_protocol__hs2___exec_optionbatch_size___0___num_nodes___0___disable_codegen_rows_threshold___0___disable_codegen___False___abort_on_error___1___exec_single_node_rows_threshold___0table_format__parquet_none_/
> Failing query:
> https://github.com/apache/impala/blob/ebbe52b4bed944d3012e3679dc984827ce11d5a8/testdata/workloads/functional-query/queries/QueryTest/analytic-fns.test#L2
> Errors are like:
> 2009,3,6,3.667,2009-03-01 20:12:00.475000,'0','8' != 
> 2009,3,6,3.667,2009-03-01 20:12:00.474999,'0','8'






[jira] [Resolved] (IMPALA-9561) Change hadoop-ozone-filesystem dependency to hadoop-ozone-filesystem-lib-current

2020-03-31 Thread Sahil Takiar (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-9561?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sahil Takiar resolved IMPALA-9561.
--
Fix Version/s: Impala 4.0
   Resolution: Fixed

> Change hadoop-ozone-filesystem dependency to 
> hadoop-ozone-filesystem-lib-current
> 
>
> Key: IMPALA-9561
> URL: https://issues.apache.org/jira/browse/IMPALA-9561
> Project: IMPALA
>  Issue Type: Sub-task
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
>Priority: Major
> Fix For: Impala 4.0
>
>
> Ozone publishes a client fat-jar that shades all client jar dependencies. 
> This is nice because it doesn't pollute the client classpath with unnecessary 
> jars / jar conflicts. Impala should use the fat-jar rather than the regular 
> jar. It's also the recommended way for clients to interact with Ozone 
> clusters.






[jira] [Created] (IMPALA-9584) test_analytic_fns is flaky (small fractional differences in AVG)

2020-03-31 Thread Csaba Ringhofer (Jira)
Csaba Ringhofer created IMPALA-9584:
---

 Summary: test_analytic_fns is flaky (small fractional differences 
in AVG)
 Key: IMPALA-9584
 URL: https://issues.apache.org/jira/browse/IMPALA-9584
 Project: IMPALA
  Issue Type: Bug
  Components: Backend
Reporter: Csaba Ringhofer


The issue occurred here:
https://jenkins.impala.io/job/ubuntu-16.04-dockerised-tests/2065/testReport/query_test.test_queries/TestQueries/test_analytic_fns_protocol__hs2___exec_optionbatch_size___0___num_nodes___0___disable_codegen_rows_threshold___0___disable_codegen___False___abort_on_error___1___exec_single_node_rows_threshold___0table_format__parquet_none_/

Failing query:
https://github.com/apache/impala/blob/ebbe52b4bed944d3012e3679dc984827ce11d5a8/testdata/workloads/functional-query/queries/QueryTest/analytic-fns.test#L2

Errors are like:
2009,3,6,3.667,2009-03-01 20:12:00.475000,'0','8' != 
2009,3,6,3.667,2009-03-01 20:12:00.474999,'0','8'






[jira] [Created] (IMPALA-9583) Add automated tests for Kudu VARCHAR multibyte truncation

2020-03-31 Thread Grant Henke (Jira)
Grant Henke created IMPALA-9583:
---

 Summary: Add automated tests for Kudu VARCHAR multibyte truncation
 Key: IMPALA-9583
 URL: https://issues.apache.org/jira/browse/IMPALA-9583
 Project: IMPALA
  Issue Type: Improvement
Reporter: Grant Henke


Kudu VARCHAR support was added in IMPALA-5092; however, no automated test was 
added to validate that multibyte characters are truncated when too wide. 
Instead, a manual test was performed. 

Something like the following should be added to _test_kudu.py_, along with 
updates to the test framework to support non-ASCII characters:
{code:java}
  @SkipIfKudu.no_hybrid_clock
  def test_kudu_multibyte_vc(self, vector, cursor, kudu_client, unique_database):
    """Test multibyte Kudu VARCHAR values that are wider than Impala's Varchar
    length."""
    cursor.execute("""CREATE TABLE %s.multibyte (a INT PRIMARY KEY, vc VARCHAR(8))
        PARTITION BY HASH(a) PARTITIONS 3 STORED AS KUDU""" % unique_database)
    assert kudu_client.table_exists(
        KuduTestSuite.to_kudu_table_name(unique_database, "multibyte"))
    table = kudu_client.table(
        KuduTestSuite.to_kudu_table_name(unique_database, "multibyte"))
    session = kudu_client.new_session()
    # Not truncated: 1 character in Kudu, 4 bytes in Impala.
    session.apply(table.new_insert((0, "测")))
    # Truncated: 2 characters in Kudu, 8 bytes in Impala.
    session.apply(table.new_insert((1, "测试")))
    session.flush()
    self.run_test_case('QueryTest/kudu_multibyte_vc', vector, use_db=unique_database)
{code}
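
As a rough illustration of why the multibyte case deserves its own test: a 
length limit counted in bytes and one counted in characters disagree for 
non-ASCII text. The sketch below is not Impala or Kudu code. truncate_utf8 is 
a hypothetical helper, Python 3 and UTF-8 are assumed, and cutting on a 
character boundary is an assumption about the desired behaviour rather than a 
description of what either system does.

```python
def truncate_utf8(text, max_bytes):
    """Return the longest prefix of text whose UTF-8 encoding fits in max_bytes."""
    encoded = text.encode("utf-8")
    if len(encoded) <= max_bytes:
        return text
    # Cut at the byte limit, then drop any incomplete trailing character.
    return encoded[:max_bytes].decode("utf-8", errors="ignore")

# One ASCII letter followed by four two-byte characters: 5 characters, 9 bytes.
value = "a" + "\u00e9" * 4
char_count = len(value)
byte_count = len(value.encode("utf-8"))

# With an 8-byte limit the last character no longer fits and is dropped whole.
truncated = truncate_utf8(value, 8)
print(char_count, byte_count, len(truncated))
```

A value can therefore satisfy a character-based limit while exceeding a 
byte-based one, which is the situation the proposed test is meant to cover.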






[jira] [Reopened] (IMPALA-9555) TestDateQueries.test_queries failing because Hive3 switched back to the hybrid Julian Gregorian calendar

2020-03-31 Thread Attila Jeges (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-9555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Attila Jeges reopened IMPALA-9555:
--

> TestDateQueries.test_queries failing because Hive3 switched back to the 
> hybrid Julian Gregorian calendar
> 
>
> Key: IMPALA-9555
> URL: https://issues.apache.org/jira/browse/IMPALA-9555
> Project: IMPALA
>  Issue Type: Bug
>Affects Versions: Impala 3.4.0
>Reporter: Attila Jeges
>Assignee: Attila Jeges
>Priority: Critical
> Fix For: Impala 3.4.0
>
>
> TestDateQueries.test_queries is failing after upgrading the CDP GBN with the 
> following error:
> {code}
> query_test.test_date_queries.TestDateQueries.test_queries[protocol: beeswax | 
> exec_option: {'disable_codegen_rows_threshold': 0, 'disable_codegen': 'true', 
> 'batch_size': 1} | table_format: avro/snap/block] (from pytest)
> Error Message
> query_test/test_date_queries.py:60: in test_queries 
> self.run_test_case('QueryTest/avro_date', vector) 
> common/impala_test_suite.py:690: in run_test_case 
> self.__verify_results_and_errors(vector, test_section, result, use_db) 
> common/impala_test_suite.py:523: in __verify_results_and_errors 
> replace_filenames_with_placeholder) common/test_result_verifier.py:456: in 
> verify_raw_results VERIFIER_MAP[verifier](expected, actual) 
> common/test_result_verifier.py:278: in verify_query_result_is_equal 
> assert expected_results == actual_results E   assert Comparing 
> QueryTestResults (expected vs actual): E 0,0001-01-01,0001-01-01 != 
> 10,1399-06-27,2017-11-28 E 1,0001-01-01,0001-12-31 != 11,1399-06-27,NULL 
> E 10,1399-06-27,2017-11-28 != 12,1399-06-27,2018-12-31 E 
> 11,1399-06-27,NULL != 20,2017-11-27,0001-06-19 E 12,1399-06-27,2018-12-31 
> != 21,2017-11-27,0001-06-20 E 2,0001-01-01,0002-01-01 != 
> 22,2017-11-27,0001-06-21 E 20,2017-11-27,0001-06-21 != 
> 23,2017-11-27,0001-06-22 E 21,2017-11-27,0001-06-22 != 
> 24,2017-11-27,0001-06-23 E 22,2017-11-27,0001-06-23 != 
> 25,2017-11-27,0001-06-24 E 23,2017-11-27,0001-06-24 != 
> 26,2017-11-27,0001-06-25 E 24,2017-11-27,0001-06-25 != 
> 27,2017-11-27,0001-06-26 E 25,2017-11-27,0001-06-26 != 
> 28,2017-11-27,0001-06-27 E 26,2017-11-27,0001-06-27 != 
> 29,2017-11-27,2017-11-28 E 27,2017-11-27,0001-06-28 != 
> 30,-12-31,-12-01 E 28,2017-11-27,0001-06-29 != 
> 31,-12-31,-12-31 E 29,2017-11-27,2017-11-28 != None E 
> 3,0001-01-01,1399-12-31 != None E 30,-12-31,-12-01 != None E 
> 31,-12-31,-12-31 != None E 4,0001-01-01,2017-11-28 != None E 
> 5,0001-01-01,-12-31 != None E 6,0001-01-01,NULL != None E Number 
> of rows returned (expected vs actual): 22 != 15
> Stacktrace
> query_test/test_date_queries.py:60: in test_queries
> self.run_test_case('QueryTest/avro_date', vector)
> common/impala_test_suite.py:690: in run_test_case
> self.__verify_results_and_errors(vector, test_section, result, use_db)
> common/impala_test_suite.py:523: in __verify_results_and_errors
> replace_filenames_with_placeholder)
> common/test_result_verifier.py:456: in verify_raw_results
> VERIFIER_MAP[verifier](expected, actual)
> common/test_result_verifier.py:278: in verify_query_result_is_equal
> assert expected_results == actual_results
> E   assert Comparing QueryTestResults (expected vs actual):
> E 0,0001-01-01,0001-01-01 != 10,1399-06-27,2017-11-28
> E 1,0001-01-01,0001-12-31 != 11,1399-06-27,NULL
> E 10,1399-06-27,2017-11-28 != 12,1399-06-27,2018-12-31
> E 11,1399-06-27,NULL != 20,2017-11-27,0001-06-19
> E 12,1399-06-27,2018-12-31 != 21,2017-11-27,0001-06-20
> E 2,0001-01-01,0002-01-01 != 22,2017-11-27,0001-06-21
> E 20,2017-11-27,0001-06-21 != 23,2017-11-27,0001-06-22
> E 21,2017-11-27,0001-06-22 != 24,2017-11-27,0001-06-23
> E 22,2017-11-27,0001-06-23 != 25,2017-11-27,0001-06-24
> E 23,2017-11-27,0001-06-24 != 26,2017-11-27,0001-06-25
> E 24,2017-11-27,0001-06-25 != 27,2017-11-27,0001-06-26
> E 25,2017-11-27,0001-06-26 != 28,2017-11-27,0001-06-27
> E 26,2017-11-27,0001-06-27 != 29,2017-11-27,2017-11-28
> E 27,2017-11-27,0001-06-28 != 30,-12-31,-12-01
> E 28,2017-11-27,0001-06-29 != 31,-12-31,-12-31
> E 29,2017-11-27,2017-11-28 != None
> E 3,0001-01-01,1399-12-31 != None
> E 30,-12-31,-12-01 != None
> E 31,-12-31,-12-31 != None
> E 4,0001-01-01,2017-11-28 != None
> E 5,0001-01-01,-12-31 != None
> E 6,0001-01-01,NULL != None
> E Number of rows returned (expected vs actual): 22 != 15
> Standard Error
> ERROR:test_configuration:Comparing QueryTestResults (expected vs actual):
> 

[jira] [Work started] (IMPALA-9555) TestDateQueries.test_queries failing because Hive3 switched back to the hybrid Julian Gregorian calendar

2020-03-31 Thread Attila Jeges (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-9555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on IMPALA-9555 started by Attila Jeges.

> TestDateQueries.test_queries failing because Hive3 switched back to the 
> hybrid Julian Gregorian calendar
> 
>
> Key: IMPALA-9555
> URL: https://issues.apache.org/jira/browse/IMPALA-9555
> Project: IMPALA
>  Issue Type: Bug
>Affects Versions: Impala 3.4.0
>Reporter: Attila Jeges
>Assignee: Attila Jeges
>Priority: Critical
> Fix For: Impala 3.4.0
>
>
> TestDateQueries.test_queries is failing after upgrading the CDP GBN with the 
> following error:
> {code}
> query_test.test_date_queries.TestDateQueries.test_queries[protocol: beeswax | 
> exec_option: {'disable_codegen_rows_threshold': 0, 'disable_codegen': 'true', 
> 'batch_size': 1} | table_format: avro/snap/block] (from pytest)
> Error Message
> query_test/test_date_queries.py:60: in test_queries 
> self.run_test_case('QueryTest/avro_date', vector) 
> common/impala_test_suite.py:690: in run_test_case 
> self.__verify_results_and_errors(vector, test_section, result, use_db) 
> common/impala_test_suite.py:523: in __verify_results_and_errors 
> replace_filenames_with_placeholder) common/test_result_verifier.py:456: in 
> verify_raw_results VERIFIER_MAP[verifier](expected, actual) 
> common/test_result_verifier.py:278: in verify_query_result_is_equal 
> assert expected_results == actual_results E   assert Comparing 
> QueryTestResults (expected vs actual): E 0,0001-01-01,0001-01-01 != 
> 10,1399-06-27,2017-11-28 E 1,0001-01-01,0001-12-31 != 11,1399-06-27,NULL 
> E 10,1399-06-27,2017-11-28 != 12,1399-06-27,2018-12-31 E 
> 11,1399-06-27,NULL != 20,2017-11-27,0001-06-19 E 12,1399-06-27,2018-12-31 
> != 21,2017-11-27,0001-06-20 E 2,0001-01-01,0002-01-01 != 
> 22,2017-11-27,0001-06-21 E 20,2017-11-27,0001-06-21 != 
> 23,2017-11-27,0001-06-22 E 21,2017-11-27,0001-06-22 != 
> 24,2017-11-27,0001-06-23 E 22,2017-11-27,0001-06-23 != 
> 25,2017-11-27,0001-06-24 E 23,2017-11-27,0001-06-24 != 
> 26,2017-11-27,0001-06-25 E 24,2017-11-27,0001-06-25 != 
> 27,2017-11-27,0001-06-26 E 25,2017-11-27,0001-06-26 != 
> 28,2017-11-27,0001-06-27 E 26,2017-11-27,0001-06-27 != 
> 29,2017-11-27,2017-11-28 E 27,2017-11-27,0001-06-28 != 
> 30,-12-31,-12-01 E 28,2017-11-27,0001-06-29 != 
> 31,-12-31,-12-31 E 29,2017-11-27,2017-11-28 != None E 
> 3,0001-01-01,1399-12-31 != None E 30,-12-31,-12-01 != None E 
> 31,-12-31,-12-31 != None E 4,0001-01-01,2017-11-28 != None E 
> 5,0001-01-01,-12-31 != None E 6,0001-01-01,NULL != None E Number 
> of rows returned (expected vs actual): 22 != 15
> Stacktrace
> query_test/test_date_queries.py:60: in test_queries
> self.run_test_case('QueryTest/avro_date', vector)
> common/impala_test_suite.py:690: in run_test_case
> self.__verify_results_and_errors(vector, test_section, result, use_db)
> common/impala_test_suite.py:523: in __verify_results_and_errors
> replace_filenames_with_placeholder)
> common/test_result_verifier.py:456: in verify_raw_results
> VERIFIER_MAP[verifier](expected, actual)
> common/test_result_verifier.py:278: in verify_query_result_is_equal
> assert expected_results == actual_results
> E   assert Comparing QueryTestResults (expected vs actual):
> E 0,0001-01-01,0001-01-01 != 10,1399-06-27,2017-11-28
> E 1,0001-01-01,0001-12-31 != 11,1399-06-27,NULL
> E 10,1399-06-27,2017-11-28 != 12,1399-06-27,2018-12-31
> E 11,1399-06-27,NULL != 20,2017-11-27,0001-06-19
> E 12,1399-06-27,2018-12-31 != 21,2017-11-27,0001-06-20
> E 2,0001-01-01,0002-01-01 != 22,2017-11-27,0001-06-21
> E 20,2017-11-27,0001-06-21 != 23,2017-11-27,0001-06-22
> E 21,2017-11-27,0001-06-22 != 24,2017-11-27,0001-06-23
> E 22,2017-11-27,0001-06-23 != 25,2017-11-27,0001-06-24
> E 23,2017-11-27,0001-06-24 != 26,2017-11-27,0001-06-25
> E 24,2017-11-27,0001-06-25 != 27,2017-11-27,0001-06-26
> E 25,2017-11-27,0001-06-26 != 28,2017-11-27,0001-06-27
> E 26,2017-11-27,0001-06-27 != 29,2017-11-27,2017-11-28
> E 27,2017-11-27,0001-06-28 != 30,-12-31,-12-01
> E 28,2017-11-27,0001-06-29 != 31,-12-31,-12-31
> E 29,2017-11-27,2017-11-28 != None
> E 3,0001-01-01,1399-12-31 != None
> E 30,-12-31,-12-01 != None
> E 31,-12-31,-12-31 != None
> E 4,0001-01-01,2017-11-28 != None
> E 5,0001-01-01,-12-31 != None
> E 6,0001-01-01,NULL != None
> E Number of rows returned (expected vs actual): 22 != 15
> Standard Error
> ERROR:test_configuration:Comparing QueryTestResults (expected vs 

[jira] [Issue Comment Deleted] (IMPALA-9555) TestDateQueries.test_queries failing because Hive3 switched back to the hybrid Julian Gregorian calendar

2020-03-31 Thread Attila Jeges (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-9555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Attila Jeges updated IMPALA-9555:
-
Comment: was deleted

(was: [~stakiar] You need to do a fresh data load with Hive3, otherwise the 
test won't work.)

> TestDateQueries.test_queries failing because Hive3 switched back to the 
> hybrid Julian Gregorian calendar
> 
>
> Key: IMPALA-9555
> URL: https://issues.apache.org/jira/browse/IMPALA-9555
> Project: IMPALA
>  Issue Type: Bug
>Affects Versions: Impala 3.4.0
>Reporter: Attila Jeges
>Assignee: Attila Jeges
>Priority: Critical
> Fix For: Impala 3.4.0
>
>
> TestDateQueries.test_queries is failing after upgrading the CDP GBN with the 
> following error:
> {code}
> query_test.test_date_queries.TestDateQueries.test_queries[protocol: beeswax | 
> exec_option: {'disable_codegen_rows_threshold': 0, 'disable_codegen': 'true', 
> 'batch_size': 1} | table_format: avro/snap/block] (from pytest)
> Error Message
> query_test/test_date_queries.py:60: in test_queries 
> self.run_test_case('QueryTest/avro_date', vector) 
> common/impala_test_suite.py:690: in run_test_case 
> self.__verify_results_and_errors(vector, test_section, result, use_db) 
> common/impala_test_suite.py:523: in __verify_results_and_errors 
> replace_filenames_with_placeholder) common/test_result_verifier.py:456: in 
> verify_raw_results VERIFIER_MAP[verifier](expected, actual) 
> common/test_result_verifier.py:278: in verify_query_result_is_equal 
> assert expected_results == actual_results E   assert Comparing 
> QueryTestResults (expected vs actual): E 0,0001-01-01,0001-01-01 != 
> 10,1399-06-27,2017-11-28 E 1,0001-01-01,0001-12-31 != 11,1399-06-27,NULL 
> E 10,1399-06-27,2017-11-28 != 12,1399-06-27,2018-12-31 E 
> 11,1399-06-27,NULL != 20,2017-11-27,0001-06-19 E 12,1399-06-27,2018-12-31 
> != 21,2017-11-27,0001-06-20 E 2,0001-01-01,0002-01-01 != 
> 22,2017-11-27,0001-06-21 E 20,2017-11-27,0001-06-21 != 
> 23,2017-11-27,0001-06-22 E 21,2017-11-27,0001-06-22 != 
> 24,2017-11-27,0001-06-23 E 22,2017-11-27,0001-06-23 != 
> 25,2017-11-27,0001-06-24 E 23,2017-11-27,0001-06-24 != 
> 26,2017-11-27,0001-06-25 E 24,2017-11-27,0001-06-25 != 
> 27,2017-11-27,0001-06-26 E 25,2017-11-27,0001-06-26 != 
> 28,2017-11-27,0001-06-27 E 26,2017-11-27,0001-06-27 != 
> 29,2017-11-27,2017-11-28 E 27,2017-11-27,0001-06-28 != 
> 30,-12-31,-12-01 E 28,2017-11-27,0001-06-29 != 
> 31,-12-31,-12-31 E 29,2017-11-27,2017-11-28 != None E 
> 3,0001-01-01,1399-12-31 != None E 30,-12-31,-12-01 != None E 
> 31,-12-31,-12-31 != None E 4,0001-01-01,2017-11-28 != None E 
> 5,0001-01-01,-12-31 != None E 6,0001-01-01,NULL != None E Number 
> of rows returned (expected vs actual): 22 != 15
> Stacktrace
> query_test/test_date_queries.py:60: in test_queries
> self.run_test_case('QueryTest/avro_date', vector)
> common/impala_test_suite.py:690: in run_test_case
> self.__verify_results_and_errors(vector, test_section, result, use_db)
> common/impala_test_suite.py:523: in __verify_results_and_errors
> replace_filenames_with_placeholder)
> common/test_result_verifier.py:456: in verify_raw_results
> VERIFIER_MAP[verifier](expected, actual)
> common/test_result_verifier.py:278: in verify_query_result_is_equal
> assert expected_results == actual_results
> E   assert Comparing QueryTestResults (expected vs actual):
> E 0,0001-01-01,0001-01-01 != 10,1399-06-27,2017-11-28
> E 1,0001-01-01,0001-12-31 != 11,1399-06-27,NULL
> E 10,1399-06-27,2017-11-28 != 12,1399-06-27,2018-12-31
> E 11,1399-06-27,NULL != 20,2017-11-27,0001-06-19
> E 12,1399-06-27,2018-12-31 != 21,2017-11-27,0001-06-20
> E 2,0001-01-01,0002-01-01 != 22,2017-11-27,0001-06-21
> E 20,2017-11-27,0001-06-21 != 23,2017-11-27,0001-06-22
> E 21,2017-11-27,0001-06-22 != 24,2017-11-27,0001-06-23
> E 22,2017-11-27,0001-06-23 != 25,2017-11-27,0001-06-24
> E 23,2017-11-27,0001-06-24 != 26,2017-11-27,0001-06-25
> E 24,2017-11-27,0001-06-25 != 27,2017-11-27,0001-06-26
> E 25,2017-11-27,0001-06-26 != 28,2017-11-27,0001-06-27
> E 26,2017-11-27,0001-06-27 != 29,2017-11-27,2017-11-28
> E 27,2017-11-27,0001-06-28 != 30,-12-31,-12-01
> E 28,2017-11-27,0001-06-29 != 31,-12-31,-12-31
> E 29,2017-11-27,2017-11-28 != None
> E 3,0001-01-01,1399-12-31 != None
> E 30,-12-31,-12-01 != None
> E 31,-12-31,-12-31 != None
> E 4,0001-01-01,2017-11-28 != None
> E 5,0001-01-01,-12-31 != None
> E 6,0001-01-01,NULL != None
> E Number of rows returned (expected vs 

[jira] [Commented] (IMPALA-9555) TestDateQueries.test_queries failing because Hive3 switched back to the hybrid Julian Gregorian calendar

2020-03-31 Thread Attila Jeges (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-9555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17071602#comment-17071602
 ] 

Attila Jeges commented on IMPALA-9555:
--

[~stakiar] You need to do a fresh data load with Hive3, otherwise the test 
won't work.

> TestDateQueries.test_queries failing because Hive3 switched back to the 
> hybrid Julian Gregorian calendar
> 
>
> Key: IMPALA-9555
> URL: https://issues.apache.org/jira/browse/IMPALA-9555
> Project: IMPALA
>  Issue Type: Bug
>Affects Versions: Impala 3.4.0
>Reporter: Attila Jeges
>Assignee: Attila Jeges
>Priority: Critical
> Fix For: Impala 3.4.0
>
>
> TestDateQueries.test_queries is failing after upgrading the CDP GBN with the 
> following error:
> {code}
> query_test.test_date_queries.TestDateQueries.test_queries[protocol: beeswax | 
> exec_option: {'disable_codegen_rows_threshold': 0, 'disable_codegen': 'true', 
> 'batch_size': 1} | table_format: avro/snap/block] (from pytest)
> Error Message
> query_test/test_date_queries.py:60: in test_queries 
> self.run_test_case('QueryTest/avro_date', vector) 
> common/impala_test_suite.py:690: in run_test_case 
> self.__verify_results_and_errors(vector, test_section, result, use_db) 
> common/impala_test_suite.py:523: in __verify_results_and_errors 
> replace_filenames_with_placeholder) common/test_result_verifier.py:456: in 
> verify_raw_results VERIFIER_MAP[verifier](expected, actual) 
> common/test_result_verifier.py:278: in verify_query_result_is_equal 
> assert expected_results == actual_results E   assert Comparing 
> QueryTestResults (expected vs actual): E 0,0001-01-01,0001-01-01 != 
> 10,1399-06-27,2017-11-28 E 1,0001-01-01,0001-12-31 != 11,1399-06-27,NULL 
> E 10,1399-06-27,2017-11-28 != 12,1399-06-27,2018-12-31 E 
> 11,1399-06-27,NULL != 20,2017-11-27,0001-06-19 E 12,1399-06-27,2018-12-31 
> != 21,2017-11-27,0001-06-20 E 2,0001-01-01,0002-01-01 != 
> 22,2017-11-27,0001-06-21 E 20,2017-11-27,0001-06-21 != 
> 23,2017-11-27,0001-06-22 E 21,2017-11-27,0001-06-22 != 
> 24,2017-11-27,0001-06-23 E 22,2017-11-27,0001-06-23 != 
> 25,2017-11-27,0001-06-24 E 23,2017-11-27,0001-06-24 != 
> 26,2017-11-27,0001-06-25 E 24,2017-11-27,0001-06-25 != 
> 27,2017-11-27,0001-06-26 E 25,2017-11-27,0001-06-26 != 
> 28,2017-11-27,0001-06-27 E 26,2017-11-27,0001-06-27 != 
> 29,2017-11-27,2017-11-28 E 27,2017-11-27,0001-06-28 != 
> 30,-12-31,-12-01 E 28,2017-11-27,0001-06-29 != 
> 31,-12-31,-12-31 E 29,2017-11-27,2017-11-28 != None E 
> 3,0001-01-01,1399-12-31 != None E 30,-12-31,-12-01 != None E 
> 31,-12-31,-12-31 != None E 4,0001-01-01,2017-11-28 != None E 
> 5,0001-01-01,-12-31 != None E 6,0001-01-01,NULL != None E Number 
> of rows returned (expected vs actual): 22 != 15
> Stacktrace
> query_test/test_date_queries.py:60: in test_queries
> self.run_test_case('QueryTest/avro_date', vector)
> common/impala_test_suite.py:690: in run_test_case
> self.__verify_results_and_errors(vector, test_section, result, use_db)
> common/impala_test_suite.py:523: in __verify_results_and_errors
> replace_filenames_with_placeholder)
> common/test_result_verifier.py:456: in verify_raw_results
> VERIFIER_MAP[verifier](expected, actual)
> common/test_result_verifier.py:278: in verify_query_result_is_equal
> assert expected_results == actual_results
> E   assert Comparing QueryTestResults (expected vs actual):
> E 0,0001-01-01,0001-01-01 != 10,1399-06-27,2017-11-28
> E 1,0001-01-01,0001-12-31 != 11,1399-06-27,NULL
> E 10,1399-06-27,2017-11-28 != 12,1399-06-27,2018-12-31
> E 11,1399-06-27,NULL != 20,2017-11-27,0001-06-19
> E 12,1399-06-27,2018-12-31 != 21,2017-11-27,0001-06-20
> E 2,0001-01-01,0002-01-01 != 22,2017-11-27,0001-06-21
> E 20,2017-11-27,0001-06-21 != 23,2017-11-27,0001-06-22
> E 21,2017-11-27,0001-06-22 != 24,2017-11-27,0001-06-23
> E 22,2017-11-27,0001-06-23 != 25,2017-11-27,0001-06-24
> E 23,2017-11-27,0001-06-24 != 26,2017-11-27,0001-06-25
> E 24,2017-11-27,0001-06-25 != 27,2017-11-27,0001-06-26
> E 25,2017-11-27,0001-06-26 != 28,2017-11-27,0001-06-27
> E 26,2017-11-27,0001-06-27 != 29,2017-11-27,2017-11-28
> E 27,2017-11-27,0001-06-28 != 30,-12-31,-12-01
> E 28,2017-11-27,0001-06-29 != 31,-12-31,-12-31
> E 29,2017-11-27,2017-11-28 != None
> E 3,0001-01-01,1399-12-31 != None
> E 30,-12-31,-12-01 != None
> E 31,-12-31,-12-31 != None
> E 4,0001-01-01,2017-11-28 != None
> E 5,0001-01-01,-12-31 != None
> E 6,0001-01-01,NULL != None
> E Number of rows returned (expected vs actual): 22 != 15

[jira] [Updated] (IMPALA-9558) Add m2 mountpoint to the Docker-based test infra

2020-03-31 Thread Laszlo Gaal (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-9558?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Laszlo Gaal updated IMPALA-9558:

Summary: Add m2 mountpoint to the Docker-based test infra (was: Consider 
adding an m2 mountpoint to the Docker-based test infra)

> Add m2 mountpoint to the Docker-based test infra
> 
>
> Key: IMPALA-9558
> URL: https://issues.apache.org/jira/browse/IMPALA-9558
> Project: IMPALA
> Issue Type: Improvement
> Reporter: Laszlo Gaal
> Priority: Major
>
> This is similar to the ccache-directory mount point: the goal would be to 
> accelerate subsequent Docker-based builds on the same host by keeping 
> downloaded Java artifacts in a cache that can be shared between Docker runs.
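
To illustrate the idea in the description, here is a minimal sketch of how a test driver might assemble a `docker run` command line that bind-mounts a host-side Maven cache next to an existing ccache mount. This is not the code from the Impala test infrastructure: the function name `docker_run_args`, its parameters, and the container-side paths `/ccache` and `/home/impdev/.m2` are assumptions chosen for illustration. Only the standard `docker run -v host:container` bind-mount syntax is relied on.

```python
import os


def docker_run_args(image, host_m2_dir=None, host_ccache_dir=None):
    """Build a `docker run` argument list with optional cache bind mounts.

    All names and container-side paths here are hypothetical; adjust them
    to whatever the real build container expects.
    """
    args = ["docker", "run", "--rm"]
    # (host directory, assumed mount point inside the container)
    mounts = [
        (host_ccache_dir, "/ccache"),
        (host_m2_dir, "/home/impdev/.m2"),
    ]
    for host_dir, container_dir in mounts:
        if host_dir:
            # -v needs an absolute host path for a bind mount.
            host_path = os.path.abspath(host_dir)
            args += ["-v", "{}:{}".format(host_path, container_dir)]
    args.append(image)
    return args


if __name__ == "__main__":
    # Example: reuse the same host directories across Docker runs.
    print(" ".join(docker_run_args(
        "example/impala-build:latest",  # hypothetical image name
        host_m2_dir="/tmp/m2-cache",
        host_ccache_dir="/tmp/ccache",
    )))
```

Because the host directory outlives any single container, Java artifacts downloaded during one run would already be present in the next run on the same host, which is the speed-up the issue is after.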


