[jira] [Commented] (IMPALA-7537) REVOKE GRANT OPTION regression

2018-09-25 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16628282#comment-16628282
 ] 

ASF subversion and git services commented on IMPALA-7537:
-

Commit c5dc6ded68c62f9f2138ab3376531c6292d1df78 in impala's branch 
refs/heads/master from [~aholley]
[ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=c5dc6de ]

IMPALA-7537: REVOKE GRANT OPTION regression

This patch fixes several issues around granting and revoking of
privileges.  This includes:
- REVOKE ALL ON SERVER where the privilege has the grant option was
  removing the privilege from the cache but not from Sentry.
- With the addition of the grantoption to the name in the catalog
  object, refactoring was required to make grants and revokes work
  correctly.

Assertions with regard to granting and revoking:
- If there is a privilege that has the grant option, that privilege
  can be revoked simply with "REVOKE privilege..." or the grant option
  can be removed with "REVOKE GRANT OPTION ON..."
- We should not limit the privilege being revoked simply because it
  has the grant option.
- If a privilege already exists without the grant option, granting the
  privilege with the grant option should add the grant option to it.
- If a privilege already exists with the grant option, granting the
  privilege without the grant option will not change anything as the
  expectation is if you want to remove the grant option, you should
  explicitly use the "REVOKE GRANT OPTION ON...".
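
The assertions above can be read as a small state model. The following
is a minimal sketch in Python, not the Impala or Sentry code; the class
and method names (RolePrivileges, grant, revoke, revoke_grant_option)
are hypothetical and exist only to make the four rules concrete.

```python
class RolePrivileges:
    """Toy model of one role: maps privilege name -> has_grant_option."""

    def __init__(self):
        self._privs = {}

    def grant(self, priv, with_grant_option=False):
        # Granting can add the grant option but never removes an existing one.
        self._privs[priv] = self._privs.get(priv, False) or with_grant_option

    def revoke(self, priv):
        # The privilege is removed whether or not it carries the grant option.
        self._privs.pop(priv, None)

    def revoke_grant_option(self, priv):
        # Only the grant option is dropped; the privilege itself stays.
        if priv in self._privs:
            self._privs[priv] = False

    def has(self, priv):
        return priv in self._privs

    def has_grant_option(self, priv):
        return self._privs.get(priv, False)
```

In this model, granting "ALL ON SERVER" without the grant option after
it was granted with the grant option leaves the grant option in place,
which matches the last assertion.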

Testing:
- Added new grant/revoke tests that validate cache and Sentry refresh
- Ran all FE, E2E, and custom-cluster tests.

Change-Id: I3be5c8f15e9bc53e9661347578832bf446abaedc
Reviewed-on: http://gerrit.cloudera.org:8080/11483
Reviewed-by: Fredy Wijaya 
Tested-by: Impala Public Jenkins 


> REVOKE GRANT OPTION regression
> --
>
> Key: IMPALA-7537
> URL: https://issues.apache.org/jira/browse/IMPALA-7537
> Project: IMPALA
>  Issue Type: Bug
>  Components: Frontend
>Affects Versions: Impala 3.1.0
>Reporter: Adam Holley
>Assignee: Adam Holley
>Priority: Major
> Fix For: Impala 3.1.0
>
>
> Recent commit ec88aa2 added 'grantoption' to the privilege name.  This name 
> is used by the catalog cache which broke "revoke grant option" since the 
> privilege names do not match.
> [localhost:21000] default> create role foo_role;
> [localhost:21000] default> grant all on server to foo_role with grant option;
> [localhost:21000] default> revoke grant option for all on server from 
> foo_role;
> ERROR: IllegalStateException: null



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Commented] (IMPALA-1760) Add decommissioning support / graceful shutdown / quiesce

2018-09-25 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-1760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16628284#comment-16628284
 ] 

ASF subversion and git services commented on IMPALA-1760:
-

Commit f46de21140f3bb483884fc49f5ded7afc466faac in impala's branch 
refs/heads/master from [~tarmstr...@cloudera.com]
[ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=f46de21 ]

IMPALA-1760: Implement shutdown command

This is the same patch except with fixes for the test failures
on EC and S3 noted in the JIRA.

This allows graceful shutdown of executors and partially graceful
shutdown of coordinators (new operations fail, old operations can
continue).

Details:
* In order to allow future admin commands, this is implemented with
  function-like syntax and does not add any reserved words.
* ALL privilege is required on the server
* The coordinator impalad that the client is connected to can be shut
  down directly with ":shutdown()".
* Remote shutdown of another impalad is supported, e.g. with
  ":shutdown('hostname')", so that non-coordinators can be shut down
  and for the convenience of the client, which does not have to
  connect to the specific impalad. There is no assumption that the
  other impalad is registered in the statestore; just that the
  coordinator can connect to the other daemon's thrift endpoint.
  This simplifies things and allows shutdown in various important
  cases, e.g. statestore down.
* The shutdown time limit can be overridden to force a quicker or
  slower shutdown by specifying a deadline in seconds after the
  statement is executed.
* If shutting down, a banner is shown on the root debug page.

Workflow:
1. (if a coordinator) clients are prevented from submitting
  queries to this coordinator via some out-of-band mechanism,
  e.g. load balancer
2. the shutdown process is started via ":shutdown()"
3. a bit is set in the statestore and propagated to coordinators,
  which stop scheduling fragment instances on this daemon
  (if an executor).
4. the query startup grace period (which is ideally set to the AC
  queueing delay plus some additional leeway) expires
5. once the daemon is quiesced (i.e. no fragments, no registered
  queries), it shuts itself down.
6. If the daemon does not successfully quiesce (e.g. rogue clients,
  long-running queries), after a longer timeout (counted from the start
  of the shutdown process) it will shut down anyway.
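
Steps 4 to 6 of the workflow amount to a simple exit decision. The
sketch below is an illustration in Python under stated assumptions, not
the impalad implementation: the function and parameter names are
hypothetical, and all times are measured in seconds from the start of
the shutdown process.

```python
def should_exit(elapsed_s, grace_period_s, deadline_s,
                running_fragments, registered_queries):
    """Return True if a shutting-down daemon should exit now."""
    if elapsed_s >= deadline_s:
        # Step 6: the longer timeout has expired, shut down anyway.
        return True
    if elapsed_s < grace_period_s:
        # Step 4: still inside the query startup grace period.
        return False
    # Step 5: exit only once the daemon is quiesced.
    return running_fragments == 0 and registered_queries == 0
```

An idle daemon still waits for the grace period in this sketch, which
is the minimum shutdown latency mentioned under Limitations.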

What this does:
* Executors can be shut down without causing a service-wide outage
* Shutting down an executor will not disrupt any short-running queries
  and will wait for long-running queries up to a threshold.
* Coordinators can be shut down without query failures only if
  there is an out-of-band mechanism to prevent submission of more
  queries to the shut down coordinator. If queries are submitted to
  a coordinator after shutdown has started, they will fail.
* Long running queries or other issues (e.g. stuck fragments) will
  slow down but not prevent eventual shutdown.

Limitations:
* The startup grace period needs to be configured to be greater than
  the latency of statestore updates + scheduling + admission +
  coordinator startup. Otherwise a coordinator may send a
  fragment instance to the shutting down impalad. (We could
  automate this configuration as a follow-on)
* The startup grace period means a minimum latency for shutdown,
  even if the cluster is idle.
* We depend on the statestore detecting the process going down
  if queries are still running on that backend when the timeout
  expires. This may still be subject to existing problems,
  e.g. IMPALA-2990.
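
The first limitation is a plain inequality on the configured grace
period. A minimal sketch of that check follows; the function name and
the idea of passing the four latencies as separate arguments are
assumptions for illustration, and no real flag names are implied.

```python
def grace_period_is_safe(grace_period_s, statestore_update_s,
                         scheduling_s, admission_s, coordinator_startup_s):
    """True if the grace period exceeds the summed latencies listed above."""
    required_s = (statestore_update_s + scheduling_s +
                  admission_s + coordinator_startup_s)
    # The grace period must be strictly greater than the summed latency.
    return grace_period_s > required_s
```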

Tests:
* Added parser, analysis and authorization tests.
* End-to-end test of shutting down impalads.
* End-to-end test of shutting down then restarting an executor while
  queries are running.
* End-to-end test of shutting down a coordinator
  - New queries cannot be started on coord, existing queries continue to run
  - Exercises various Beeswax and HS2 operations.

Change-Id: I8f3679ef442745a60a0ab97c4e9eac437aef9463
Reviewed-on: http://gerrit.cloudera.org:8080/11484
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 


> Add decommissioning support / graceful shutdown / quiesce
> -
>
> Key: IMPALA-1760
> URL: https://issues.apache.org/jira/browse/IMPALA-1760
> Project: IMPALA
>  Issue Type: New Feature
>  Components: Distributed Exec
>Affects Versions: Impala 2.1.1
>Reporter: Henry Robinson
>Assignee: Tim Armstrong
>Priority: Critical
>  Labels: resource-management, scalability, scheduler, usability
>
> In larger clusters, node maintenance is a frequent occurrence. There's no way 
> currently to stop an Impala node without failing running queries, without 
> draining queries across

[jira] [Commented] (IMPALA-110) Add support for multiple distinct operators in the same query block

2018-09-25 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16628287#comment-16628287
 ] 

ASF subversion and git services commented on IMPALA-110:


Commit df53ec2385190bba2b3cefb43b094cde6d33642f in impala's branch 
refs/heads/master from [~twmarshall]
[ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=df53ec2 ]

IMPALA-110: Support for multiple DISTINCT

This patch adds support for having multiple aggregate functions in a
single SELECT block that use DISTINCT over different sets of columns.

Planner design:
- The existing tree-based plan shape with a two-phased
  aggregation is maintained.
- Existing plans are not changed.
- Aggregates are grouped into 'aggregation classes' based on their
  expressions in the distinct portion which may be empty for
  non-distinct aggregates.
- The aggregation framework is generalized to simultaneously process
  multiple aggregation classes within the tree-based plan. This
  process splits the results of different aggregation classes into
  separate rows, so a final aggregation is needed to transpose the
  results into the desired form.
- Main challenge: Each aggregation class consumes and produces
  different tuples, so conceptually a union-type of tuples flows
  through the runtime. The tuple union is represented by a TupleRow
  with one tuple per aggregation class. Only one tuple in such a
  TupleRow is non-NULL.
- Backend exec nodes in the aggregation plan will be aware of this
  tuple-union either explicitly in their implementation or by relying
  on expressions that distinguish the aggregation classes.
- To distinguish the aggregation classes, e.g. in hash exchanges,
  CASE expressions are crafted to hash/group on the appropriate slots.
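
To make the 'aggregation class' and tuple-union ideas concrete, here is
a small Python model, assuming only COUNT(DISTINCT col) aggregates over
in-memory rows. It is a sketch of the concept, not the planner: the
function names are hypothetical, and a Python tuple with exactly one
non-None slot stands in for the TupleRow described above.

```python
from collections import defaultdict


def group_into_classes(aggregates):
    """Group aggregates by their distinct columns.

    aggregates: list of (name, func, distinct_cols); distinct_cols is an
    empty tuple for non-distinct aggregates.
    """
    classes = defaultdict(list)
    for name, _func, distinct_cols in aggregates:
        classes[tuple(distinct_cols)].append(name)
    return dict(classes)


def multi_distinct_counts(rows, distinct_col_per_class):
    """Evaluate one COUNT(DISTINCT col) per class in a single pass."""
    n = len(distinct_col_per_class)
    seen = [set() for _ in range(n)]
    for row in rows:
        for i, col in enumerate(distinct_col_per_class):
            if row[col] is not None:  # NULLs are not counted
                seen[i].add(row[col])
    # Each class produces its own row; only slot i is non-None.
    union_rows = [tuple(len(seen[i]) if j == i else None for j in range(n))
                  for i in range(n)]
    # Final aggregation: transpose the per-class rows into one result row.
    final = tuple(next(v for v in r if v is not None) for r in union_rows)
    return union_rows, final
```

The intermediate union_rows show why a final aggregation is needed:
each class reports its result in a separate row.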

Deferred FE work:
- Beautify/condense the long CASE exprs
- Push applicable conjuncts into individual aggregators before
  the transposition step
- Added a few testing TODOs to reduce the size of this patch
- Decide whether we want to change existing plans to the new model

Execution design:
- Previous patches separated out aggregation logic from the exec node
  into Aggregators. This is extended to support multiple Aggregators
  per node, with different grouping and aggregating functions.
- There is a fast path for aggregations with only one aggregator,
  which leaves the execution essentially unchanged from before.
- When there are multiple aggregators, the first aggregation node in
  the plan replicates its input to each aggregator. The output of this
  step is rows where only a single tuple is non-null, corresponding to
  the aggregator that produced the row.
- A new expr is introduced, ValidTupleId, which takes one of these
  rows and returns which tuple is non-null.
- For additional aggregation nodes, the input is split apart into
  'mini-batches' according to which aggregator the row corresponds to.
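
The last two points can be sketched in a few lines of Python. This is
an illustration only: the real ValidTupleId is a backend expr, whereas
here a row is modelled as a Python tuple with one slot per aggregator
and the function returns the slot index rather than a tuple id. Both
function names are hypothetical.

```python
def valid_tuple_id(row):
    """Return the index of the single non-None tuple in a row."""
    ids = [i for i, t in enumerate(row) if t is not None]
    if len(ids) != 1:
        raise ValueError("expected exactly one non-None tuple, got %d" % len(ids))
    return ids[0]


def split_into_mini_batches(batch, num_aggregators):
    """Route each row to the mini-batch of the aggregator it belongs to."""
    mini_batches = [[] for _ in range(num_aggregators)]
    for row in batch:
        mini_batches[valid_tuple_id(row)].append(row)
    return mini_batches
```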

Testing:
- Added analyzer and planner tests
- Added end-to-end queries tests
- Ran hdfs/core tests
- Added support in the query generator and ran in a loop.

Change-Id: I055402eaef6d81e5f70e850d9f8a621e766830a4
Reviewed-on: http://gerrit.cloudera.org:8080/10771
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 


> Add support for multiple distinct operators in the same query block
> ---
>
> Key: IMPALA-110
> URL: https://issues.apache.org/jira/browse/IMPALA-110
> Project: IMPALA
>  Issue Type: New Feature
>  Components: Backend, Frontend
>Affects Versions: Impala 0.5, Impala 1.4, Impala 2.0, Impala 2.2, Impala 
> 2.3.0
>Reporter: Greg Rahn
>Assignee: Thomas Tauber-Marshall
>Priority: Major
>  Labels: sql-language
>
> Impala only allows a single (DISTINCT columns) expression in each query.
> {color:red}Note:
> If you do not need precise accuracy, you can produce an estimate of the 
> distinct values for a column by specifying NDV(column); a query can contain 
> multiple instances of NDV(column). To make Impala automatically rewrite 
> COUNT(DISTINCT) expressions to NDV(), enable the APPX_COUNT_DISTINCT query 
> option.
> {color}
> {code}
> [impala:21000] > select count(distinct i_class_id) from item;
> Query: select count(distinct i_class_id) from item
> Query finished, fetching results ...
> 16
> Returned 1 row(s) in 1.51s
> {code}
> {code}
> [impala:21000] > select count(distinct i_class_id), count(distinct 
> i_brand_id) from item;
> Query: select count(distinct i_class_id), count(distinct i_brand_id) from item
> ERROR: com.cloudera.impala.common.AnalysisException: Analysis exception (in 
> select count(distinct i_class_id), count(distinct i_brand_id) from item)
>   at 
> com.cloudera.impala.analysis.AnalysisContext.analyze(AnalysisContext.java:133)
>   at 
> com.cloudera.impala.ser

[jira] [Commented] (IMPALA-2990) Coordinator should timeout a connection for an unresponsive backend

2018-09-25 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-2990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16628285#comment-16628285
 ] 

ASF subversion and git services commented on IMPALA-2990:
-

Commit f46de21140f3bb483884fc49f5ded7afc466faac in impala's branch 
refs/heads/master from [~tarmstr...@cloudera.com]
[ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=f46de21 ]

IMPALA-1760: Implement shutdown command

This is the same patch except with fixes for the test failures
on EC and S3 noted in the JIRA.

This allows graceful shutdown of executors and partially graceful
shutdown of coordinators (new operations fail, old operations can
continue).

Details:
* In order to allow future admin commands, this is implemented with
  function-like syntax and does not add any reserved words.
* ALL privilege is required on the server
* The coordinator impalad that the client is connected to can be shut
  down directly with ":shutdown()".
* Remote shutdown of another impalad is supported, e.g. with
  ":shutdown('hostname')", so that non-coordinators can be shut down
  and for the convenience of the client, which does not have to
  connect to the specific impalad. There is no assumption that the
  other impalad is registered in the statestore; just that the
  coordinator can connect to the other daemon's thrift endpoint.
  This simplifies things and allows shutdown in various important
  cases, e.g. statestore down.
* The shutdown time limit can be overridden to force a quicker or
  slower shutdown by specifying a deadline in seconds after the
  statement is executed.
* If shutting down, a banner is shown on the root debug page.

Workflow:
1. (if a coordinator) clients are prevented from submitting
  queries to this coordinator via some out-of-band mechanism,
  e.g. load balancer
2. the shutdown process is started via ":shutdown()"
3. a bit is set in the statestore and propagated to coordinators,
  which stop scheduling fragment instances on this daemon
  (if an executor).
4. the query startup grace period (which is ideally set to the AC
  queueing delay plus some additional leeway) expires
5. once the daemon is quiesced (i.e. no fragments, no registered
  queries), it shuts itself down.
6. If the daemon does not successfully quiesce (e.g. rogue clients,
  long-running queries), after a longer timeout (counted from the start
  of the shutdown process) it will shut down anyway.

What this does:
* Executors can be shut down without causing a service-wide outage
* Shutting down an executor will not disrupt any short-running queries
  and will wait for long-running queries up to a threshold.
* Coordinators can be shut down without query failures only if
  there is an out-of-band mechanism to prevent submission of more
  queries to the shut down coordinator. If queries are submitted to
  a coordinator after shutdown has started, they will fail.
* Long running queries or other issues (e.g. stuck fragments) will
  slow down but not prevent eventual shutdown.

Limitations:
* The startup grace period needs to be configured to be greater than
  the latency of statestore updates + scheduling + admission +
  coordinator startup. Otherwise a coordinator may send a
  fragment instance to the shutting down impalad. (We could
  automate this configuration as a follow-on)
* The startup grace period means a minimum latency for shutdown,
  even if the cluster is idle.
* We depend on the statestore detecting the process going down
  if queries are still running on that backend when the timeout
  expires. This may still be subject to existing problems,
  e.g. IMPALA-2990.

Tests:
* Added parser, analysis and authorization tests.
* End-to-end test of shutting down impalads.
* End-to-end test of shutting down then restarting an executor while
  queries are running.
* End-to-end test of shutting down a coordinator
  - New queries cannot be started on coord, existing queries continue to run
  - Exercises various Beeswax and HS2 operations.

Change-Id: I8f3679ef442745a60a0ab97c4e9eac437aef9463
Reviewed-on: http://gerrit.cloudera.org:8080/11484
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 


> Coordinator should timeout a connection for an unresponsive backend
> ---
>
> Key: IMPALA-2990
> URL: https://issues.apache.org/jira/browse/IMPALA-2990
> Project: IMPALA
>  Issue Type: Bug
>  Components: Distributed Exec
>Affects Versions: Impala 2.3.0
>Reporter: Sailesh Mukil
>Assignee: Michael Ho
>Priority: Critical
>  Labels: hang, observability, supportability
>
> The coordinator currently waits indefinitely if it does not hear back from a 
> backend. This could cause a query to hang indefinitely in case of a network 
> error, etc.
> We should add logic

[jira] [Commented] (IMPALA-7546) Impala 3.1 Doc: Doc the new query option TIMEZONE

2018-09-25 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16628281#comment-16628281
 ] 

ASF subversion and git services commented on IMPALA-7546:
-

Commit 17bc980d9540b29a1667841b7bffc2084204ac35 in impala's branch 
refs/heads/master from [~arodoni_cloudera]
[ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=17bc980 ]

IMPALA-7546: [DOCS] A new TIMEZONE query option

Documented the new TIMEZONE query option, which sets the time zone
to be used in timestamp conversions.

Change-Id: I734b8b37ae2360422fce269ed87507a04e8c05ac
Reviewed-on: http://gerrit.cloudera.org:8080/11505
Tested-by: Impala Public Jenkins 
Reviewed-by: Csaba Ringhofer 


> Impala 3.1 Doc: Doc the new query option TIMEZONE
> -
>
> Key: IMPALA-7546
> URL: https://issues.apache.org/jira/browse/IMPALA-7546
> Project: IMPALA
>  Issue Type: Sub-task
>  Components: Docs
>Reporter: Alex Rodoni
>Assignee: Alex Rodoni
>Priority: Major
>  Labels: future_release_doc
> Fix For: Impala 3.1.0
>
>
> https://gerrit.cloudera.org/#/c/11505/






[jira] [Commented] (IMPALA-7624) test-with-docker sometimes hangs creating docker containers

2018-09-25 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7624?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16628286#comment-16628286
 ] 

ASF subversion and git services commented on IMPALA-7624:
-

Commit 91673fee607b552f142c6ab2aad0e96efa9e0f80 in impala's branch 
refs/heads/master from [~philip]
[ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=91673fe ]

IMPALA-7624: Workaround docker/kernel bug causing test-with-docker to sometimes 
hang.

I've observed that builds of test-with-docker that have "suite
parallelism" sometimes hang when the Docker containers are
being created. (The implementation had multiple threads calling
"docker create" simultaneously.) Trolling the mailing lists,
it's maybe a bug in Docker or the kernel. I've never caught
it live enough to strace it.

A hopeful workaround is to serialize the docker create calls, which is
easy and harmless, given that "docker create" is usually pretty quick
(subsecond) and the overall run time here is hours+.
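
Serializing the calls can be as small as a module-level lock. The
following is a sketch of the idea, not the test-with-docker code: the
helper name create_container and the injectable run parameter are
assumptions made so the example can be exercised without Docker.

```python
import subprocess
import threading

# One lock shared by all suite threads, so that only a single
# "docker create" is in flight at any time.
_create_lock = threading.Lock()


def create_container(image, run=subprocess.check_output):
    """Run 'docker create IMAGE' under the lock and return the container id."""
    with _create_lock:
        out = run(["docker", "create", image])
    return out.decode().strip()
```

Only the create step is serialized here; everything the threads do
after the lock is released still runs in parallel.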

With this change, I was able to run test-with-docker with
--suite-concurrency=6 on a c5.9xlarge in AWS, with a total runtime of
1h35m.

The hangs are intermittent and cause, in the typical case, inconsistency
in runtimes because less parallelism happens when one of the "docker
create" calls hangs. (I've seen them resume after one of the other
containers finishes.) We'll find out with time whether this stabilizes
it or has no effect.

Change-Id: I3e44db7a6ce08a42d6fe574d7348332578cd9e51
Reviewed-on: http://gerrit.cloudera.org:8080/11481
Reviewed-by: Philip Zeyliger 
Tested-by: Impala Public Jenkins 


> test-with-docker sometimes hangs creating docker containers
> ---
>
> Key: IMPALA-7624
> URL: https://issues.apache.org/jira/browse/IMPALA-7624
> Project: IMPALA
>  Issue Type: Task
>Reporter: Philip Zeyliger
>Priority: Major
>
> I've seen the test-with-docker executions hang, or sort of hang, in threads 
> doing {{docker create}}. I think this is ultimately a Docker or kernel bug, 
> but we can work around it by serializing our "docker create" invocations.






[jira] [Commented] (IMPALA-7456) Deprecated file-based authorization

2018-09-25 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7456?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16628283#comment-16628283
 ] 

ASF subversion and git services commented on IMPALA-7456:
-

Commit 48640b5dfa131ca0c7ae9e541e376d11ac6e6d33 in impala's branch 
refs/heads/master from [~aholley]
[ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=48640b5 ]

IMPALA-7456: Deprecate file-based authorization

This patch simply adds a warning message to the log when the
authorization_policy_file run-time flag is used.  Sentry has
deprecated the use of policy files and they do not support
user level privileges which are required for object ownership.
Their removal is tracked in SENTRY-1922.
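
The behaviour described is a warn-on-use check at startup. A minimal
Python sketch follows; it is not the Impala code, the logger name and
warning text are invented for illustration, and only the
authorization_policy_file flag name comes from the patch description.

```python
import logging

log = logging.getLogger("authorization")


def warn_if_policy_file_used(flags):
    """Log a deprecation warning when the policy file flag is set.

    flags: dict of run-time flag names to values (a stand-in for the
    real flag machinery). Returns True if a warning was logged.
    """
    path = flags.get("authorization_policy_file")
    if path:
        # Illustrative wording only; the real message may differ.
        log.warning("authorization_policy_file=%s: file-based "
                    "authorization is deprecated", path)
        return True
    return False
```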

Test:
- Added custom cluster test to validate logs
- Ran all custom cluster tests

Change-Id: Ibbb13f3ef1c3a00812c180ecef022ea638c2ebc7
Reviewed-on: http://gerrit.cloudera.org:8080/11502
Reviewed-by: Fredy Wijaya 
Tested-by: Impala Public Jenkins 


> Deprecated file-based authorization
> ---
>
> Key: IMPALA-7456
> URL: https://issues.apache.org/jira/browse/IMPALA-7456
> Project: IMPALA
>  Issue Type: Dependency upgrade
>  Components: Frontend
>Affects Versions: Impala 3.0
>Reporter: Adam Holley
>Assignee: Adam Holley
>Priority: Major
>  Labels: security
> Fix For: Impala 3.1.0
>
>
> Sentry has deprecated their support of file-based authorizations.  Some newer 
> security features such as object ownership require user level authorizations 
> which the file-based security does not support.






[jira] [Commented] (IMPALA-7600) Mem limit exceeded in test_kudu_scan_mem_usage

2018-09-26 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16628945#comment-16628945
 ] 

ASF subversion and git services commented on IMPALA-7600:
-

Commit ce145ffee6ee68a60c4ef663cb9f47f22d9eb19f in impala's branch 
refs/heads/master from [~tarmstr...@cloudera.com]
[ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=ce145ff ]

IMPALA-7600: bump mem_limit for test_kudu_scan_mem_usage

The estimate for memory consumption for this scan is 9 columns * 384 KB
per column = 3.375 MB. So if we set the mem_limit to 6.5 MB, we should
still not get more than one scanner thread, but we can avoid hitting
out-of-memory.

The issue in the JIRA was queued row batches. With this change, and
num_scanner_threads=2, there should be at most 12 row batches
(10 in the queue, 2 in the scanner threads about to be enqueued)
and based on the column stats I'd estimate that each row batch is
around 200 KB, so this change should provide significantly more headroom.
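
The arithmetic above can be checked with a few lines of Python. The
figures (9 columns, 384 KB per column, 6.5 MB limit, 12 batches of
about 200 KB) come from this message; the function names are
hypothetical, and treating the thread count as "limit divided by the
per-scan estimate" is an assumption about how the estimate is used, not
a statement of the scheduler's logic.

```python
KB = 1024
MB = 1024 * KB


def scan_estimate_bytes(num_columns=9, per_column_bytes=384 * KB):
    """Estimated memory for one scanner thread."""
    return num_columns * per_column_bytes


def max_scanner_threads(mem_limit_bytes, per_thread_bytes):
    # Assumption: each thread needs one full estimate's worth of memory.
    return int(mem_limit_bytes // per_thread_bytes)


def queued_batches_bytes(queue_depth=10, scanner_threads=2,
                         batch_bytes=200 * KB):
    """Upper bound on memory held by row batches in flight."""
    return (queue_depth + scanner_threads) * batch_bytes


print(scan_estimate_bytes() / MB)  # 3.375, as stated above
print(max_scanner_threads(6.5 * MB, scan_estimate_bytes()))
print(queued_batches_bytes() / MB)
```

Under that assumption two threads would need 6.75 MB, which is why a
6.5 MB limit still allows only one.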

Change-Id: I6d992cc076bc8678089f765bdffe92e877e9d229
Reviewed-on: http://gerrit.cloudera.org:8080/11513
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 


> Mem limit exceeded in test_kudu_scan_mem_usage
> --
>
> Key: IMPALA-7600
> URL: https://issues.apache.org/jira/browse/IMPALA-7600
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 3.1.0
>Reporter: Thomas Tauber-Marshall
>Assignee: Tim Armstrong
>Priority: Blocker
>  Labels: broken-build, flaky
>
> Seen in an exhaustive release build:
> {noformat}
> 00:05:35  TestScanMemLimit.test_kudu_scan_mem_usage[exec_option: 
> {'batch_size': 0, 'num_nodes': 0, 'disable_codegen_rows_threshold': 5000, 
> 'disable_codegen': False, 'abort_on_error': 1, 'debug_action': None, 
> 'exec_single_node_rows_threshold': 0} | table_format: avro/snap/block] 
> 00:05:35 [gw6] linux2 -- Python 2.7.5 
> /data/jenkins/workspace/impala-asf-master-exhaustive-release/repos/Impala/bin/../infra/python/env/bin/python
> 00:05:35 query_test/test_mem_usage_scaling.py:358: in test_kudu_scan_mem_usage
> 00:05:35 self.run_test_case('QueryTest/kudu-scan-mem-usage', vector)
> 00:05:35 common/impala_test_suite.py:408: in run_test_case
> 00:05:35 result = self.__execute_query(target_impalad_client, query, 
> user=user)
> 00:05:35 common/impala_test_suite.py:623: in __execute_query
> 00:05:35 return impalad_client.execute(query, user=user)
> 00:05:35 common/impala_connection.py:160: in execute
> 00:05:35 return self.__beeswax_client.execute(sql_stmt, user=user)
> 00:05:35 beeswax/impala_beeswax.py:176: in execute
> 00:05:35 handle = self.__execute_query(query_string.strip(), user=user)
> 00:05:35 beeswax/impala_beeswax.py:350: in __execute_query
> 00:05:35 self.wait_for_finished(handle)
> 00:05:35 beeswax/impala_beeswax.py:371: in wait_for_finished
> 00:05:35 raise ImpalaBeeswaxException("Query aborted:" + error_log, None)
> 00:05:35 E   ImpalaBeeswaxException: ImpalaBeeswaxException:
> 00:05:35 EQuery aborted:Memory limit exceeded: Error occurred on backend 
> impala-ec2-centos74-m5-4xlarge-ondemand-0e2c.vpc.cloudera.com:22000 by 
> fragment b34270820f59a0c9:a507139e0001
> 00:05:35 E   Memory left in process limit: 10.12 GB
> 00:05:35 E   Memory left in query limit: -16.92 KB
> 00:05:35 E   Query(b34270820f59a0c9:a507139e): memory limit exceeded. 
> Limit=4.00 MB Reservation=0 ReservationLimit=0 OtherMemory=4.02 MB Total=4.02 
> MB Peak=4.02 MB
> 00:05:35 E Fragment b34270820f59a0c9:a507139e: Reservation=0 
> OtherMemory=40.10 KB Total=40.10 KB Peak=340.00 KB
> 00:05:35 E   EXCHANGE_NODE (id=2): Reservation=32.00 KB OtherMemory=0 
> Total=32.00 KB Peak=32.00 KB
> 00:05:35 E KrpcDeferredRpcs: Total=0 Peak=0
> 00:05:35 E   PLAN_ROOT_SINK: Total=0 Peak=0
> 00:05:35 E   CodeGen: Total=103.00 B Peak=332.00 KB
> 00:05:35 E Fragment b34270820f59a0c9:a507139e0001: Reservation=0 
> OtherMemory=3.98 MB Total=3.98 MB Peak=3.98 MB
> 00:05:35 E   SORT_NODE (id=1): Total=342.00 KB Peak=342.00 KB
> 00:05:35 E   KUDU_SCAN_NODE (id=0): Total=3.63 MB Peak=3.63 MB
> 00:05:35 E Queued Batches: Total=3.30 MB Peak=3.63 MB
> 00:05:35 E   KrpcDataStreamSender (dst_id=2): Total=1.16 KB Peak=1.16 KB
> 00:05:35 E   CodeGen: Total=3.66 KB Peak=1.14 MB
> 00:05:35 E   
> 00:05:35 E   Memory limit exceeded: Error occurred on backend 
> impala-ec2-centos74-m5-4xlarge-ondemand-0e2c.vpc.cloudera.com:22000 by 
> fragment b34270820f59a0c9:a507139e0001
> 00:05:35 E   Memory left in process limit: 10.12 GB
> 00:05:35 E   Memory left in query limit: -16.92 KB
> 00:05:35 E   Query(b34270820f59a0c9:a507139e): memory limit exceeded. 
> Limi

[jira] [Commented] (IMPALA-7628) test_tls_ecdh failing on CentOS 6/Python 2.6

2018-09-26 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16629342#comment-16629342
 ] 

ASF subversion and git services commented on IMPALA-7628:
-

Commit 09150f04cac84965e3b390404c57a51261aecf56 in impala's branch 
refs/heads/master from [~tarmstr...@cloudera.com]
[ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=09150f0 ]

IMPALA-7628: skip test_tls_ecdh on Python 2.6

This is a temporary workaround. On the CentOS 6 build that failed,
test_tls_v12, test_wildcard_san_ssl and test_wildcard_ssl were
all skipped, so I figured this will unblock the tests without
losing coverage on most platforms that have a recent Python.
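
A version-conditional skip of this kind looks roughly as follows. This
sketch uses the standard library's unittest.skipIf rather than the
skip helpers the Impala test suite actually uses, and the class name
and test body are placeholders.

```python
import sys
import unittest

# True only on interpreters older than Python 2.7 (e.g. CentOS 6's 2.6).
OLD_PYTHON = sys.version_info < (2, 7)


class TestClientSsl(unittest.TestCase):
    @unittest.skipIf(OLD_PYTHON, "Python 2.6 cannot complete the handshake")
    def test_tls_ecdh(self):
        # Placeholder body; the real test connects with impala-shell over SSL.
        self.assertTrue(True)
```

On a recent Python the test runs as usual; on 2.6 it is reported as
skipped rather than failed.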

Change-Id: I94ae9d254d5fd337774a24106eb9b08585ac0b01
Reviewed-on: http://gerrit.cloudera.org:8080/11519
Reviewed-by: Thomas Marshall 
Tested-by: Impala Public Jenkins 


> test_tls_ecdh failing on CentOS 6/Python 2.6
> 
>
> Key: IMPALA-7628
> URL: https://issues.apache.org/jira/browse/IMPALA-7628
> Project: IMPALA
>  Issue Type: Bug
>  Components: Infrastructure
>Affects Versions: Impala 3.1.0
> Environment: CentOS 6.4, Python 2.6
>Reporter: Tim Armstrong
>Assignee: Thomas Tauber-Marshall
>Priority: Blocker
>
> {noformat}
> custom_cluster/test_client_ssl.py:125: in test_tls_ecdh
> self._validate_positive_cases("%s/server-cert.pem" % self.CERT_DIR)
> custom_cluster/test_client_ssl.py:198: in _validate_positive_cases
> result = run_impala_shell_cmd(shell_options)
> shell/util.py:97: in run_impala_shell_cmd
> result.stderr)
> E   AssertionError: Cmd --ssl -q 'select 1 + 2' was expected to succeed: 
> Starting Impala Shell without Kerberos authentication
> E   SSL is enabled. Impala server certificates will NOT be verified (set 
> --ca_cert to change)
> E   
> /data/jenkins/workspace/impala-asf-master-exhaustive-centos6/Impala-Toolchain/thrift-0.9.3-p4/python/lib64/python2.6/site-packages/thrift/transport/TSSLSocket.py:80:
>  DeprecationWarning: 3th positional argument is deprecated. Use keyward 
> argument insteand.
> E DeprecationWarning)
> E   
> /data/jenkins/workspace/impala-asf-master-exhaustive-centos6/Impala-Toolchain/thrift-0.9.3-p4/python/lib64/python2.6/site-packages/thrift/transport/TSSLSocket.py:80:
>  DeprecationWarning: 4th positional argument is deprecated. Use keyward 
> argument insteand.
> E DeprecationWarning)
> E   
> /data/jenkins/workspace/impala-asf-master-exhaustive-centos6/Impala-Toolchain/thrift-0.9.3-p4/python/lib64/python2.6/site-packages/thrift/transport/TSSLSocket.py:80:
>  DeprecationWarning: 5th positional argument is deprecated. Use keyward 
> argument insteand.
> E DeprecationWarning)
> E   
> /data/jenkins/workspace/impala-asf-master-exhaustive-centos6/Impala-Toolchain/thrift-0.9.3-p4/python/lib64/python2.6/site-packages/thrift/transport/TSSLSocket.py:216:
>  DeprecationWarning: validate is deprecated. Use cert_reqs=ssl.CERT_NONE 
> instead
> E DeprecationWarning)
> E   No handlers could be found for logger "thrift.transport.TSSLSocket"
> E   Error connecting: TTransportException, Could not connect to 
> localhost:21000: [Errno 1] _ssl.c:490: error:14094410:SSL 
> routines:SSL3_READ_BYTES:sslv3 alert handshake failure
> E   Not connected to Impala, could not execute queries.
> {noformat}
> Git hash is e38715e25297cc3643482be04e3b1b273e339b54
> I'm going to push out a temporary fix to unblock tests (since there are other 
> related tests skipped on this platform) but I'll let Thomas validate the 
> correctness of it.






[jira] [Commented] (IMPALA-7616) Refactor PrincipalPrivilege.buildPrivilegeName

2018-09-27 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16631015#comment-16631015
 ] 

ASF subversion and git services commented on IMPALA-7616:
-

Commit 8f766f02c0116812c25bfa88c7fcf15b8f058b63 in impala's branch 
refs/heads/master from [~fredyw]
[ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=8f766f0 ]

IMPALA-7616: Remove the use of privilege name in TPrivilege

Prior to this patch, the privilege name was a field in the TPrivilege
Thrift message. The privilege name was constructed from the other fields
in the TPrivilege. This is very error-prone, since setting the privilege
name before setting the other fields can cause a consistency issue.
Another issue is that updating a particular field in TPrivilege requires
rebuilding the privilege name and updating its field. This patch
refactors the code by removing the privilege name field from TPrivilege
and generating the privilege name as needed.
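
A minimal sketch of the idea, in Python rather than the Java/Thrift code
this patch touches: the name is computed from the current field values on
every read, so it cannot go stale. The class, its field names, and the name
format are illustrative assumptions, not the actual TPrivilege schema.

```python
from dataclasses import dataclass


@dataclass
class Privilege:
    """Hypothetical stand-in for a privilege object; not the real TPrivilege."""
    scope: str
    level: str
    has_grant_option: bool = False

    @property
    def name(self) -> str:
        # Built on demand from the current fields, so there is no stored
        # name that can fall out of sync when a field changes later.
        grant = str(self.has_grant_option).lower()
        return f"{self.scope}+{self.level}+grantoption={grant}"


priv = Privilege(scope="server", level="all")
before = priv.name
priv.has_grant_option = True  # changing a field needs no explicit rebuild step
after = priv.name
```

Because the name is a read-only property, callers have no setter to call in
the wrong order.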

Testing:
- Ran all FE tests
- Added a new authorization E2E test
- Ran authorization E2E tests

Change-Id: Ia813dcc7d3872f126865c1f8f37175201a0b10ab
Reviewed-on: http://gerrit.cloudera.org:8080/11509
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 


> Refactor PrincipalPrivilege.buildPrivilegeName
> --
>
> Key: IMPALA-7616
> URL: https://issues.apache.org/jira/browse/IMPALA-7616
> Project: IMPALA
>  Issue Type: Bug
>  Components: Frontend
>Affects Versions: Impala 3.1.0
>Reporter: Adam Holley
>Assignee: Fredy Wijaya
>Priority: Minor
> Fix For: Impala 3.1.0
>
>
> The buildPrivilegeName pattern across the frontend code is odd in that 
> setting the name is an explicit function and not built during the get from 
> the constituent parts.  e.g. If you create a privilege that doesn't have the 
> grant option set, and then set the grant option after, the getPrivilegeName() 
> will return a name that does not have the grant option.  This should be 
> refactored to build the name on the getPrivilegeName call based on the 
> current values in the Privilege object.






[jira] [Commented] (IMPALA-6857) Add JVM Pause Monitor to Impala Processes

2018-09-27 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-6857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16631014#comment-16631014
 ] 

ASF subversion and git services commented on IMPALA-6857:
-

Commit abd230647fa92db29ac3719096eb4ebc7c151069 in impala's branch 
refs/heads/master from [~philip]
[ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=abd2306 ]

IMPALA-7596. Adding JvmPauseMonitor (and other GC) metrics to Impala metrics.

Following up to IMPALA-6857, it's useful for monitoring tools to see if
the pause monitor is getting triggered, and to see other GC metrics.

The Java side here, and the Thrift side, were easy enough.

However, the Impala metric implementation here caused us to call into
the frontend to read through the JMX memory beans 72 times, because each
call to GetValue() was getting all the data for the pool. This structure
made it hard to add additional, non-pool, metrics, and it felt wasteful.
To combat this, I added a cache of 10 seconds for getting the metrics
from the Frontend. The counters will typically re-use the same data.

There are five metrics here, and to avoid yet another enum class, I used
C++ lambdas to capture which field of the Thrift object I care about. If
folks like the approach, I think it can simplify away the enums for the
pool metrics as well.
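
A rough sketch of the two ideas above, written in Python instead of the C++
used by the patch: one cached snapshot shared by all metrics, and a small
lambda per metric that selects its field. The class names, field names, and
the injectable clock are assumptions made for illustration only.

```python
import time
from typing import Callable, Dict, Optional


class CachedSnapshot:
    """Caches the result of an expensive fetch for ttl_seconds."""

    def __init__(self, fetch: Callable[[], Dict[str, int]],
                 ttl_seconds: float = 10.0,
                 clock: Callable[[], float] = time.monotonic):
        self._fetch = fetch
        self._ttl = ttl_seconds
        self._clock = clock
        self._value: Optional[Dict[str, int]] = None
        self._fetched_at = 0.0

    def get(self) -> Dict[str, int]:
        now = self._clock()
        if self._value is None or now - self._fetched_at >= self._ttl:
            self._value = self._fetch()
            self._fetched_at = now
        return self._value


class Metric:
    """A metric that reads one field out of the shared snapshot."""

    def __init__(self, snapshot: CachedSnapshot,
                 selector: Callable[[Dict[str, int]], int]):
        self._snapshot = snapshot
        self._selector = selector

    def get_value(self) -> int:
        return self._selector(self._snapshot.get())


fetch_calls = []


def fetch_jvm_metrics() -> Dict[str, int]:
    # Hypothetical stand-in for the expensive call into the frontend.
    fetch_calls.append(1)
    return {"gc_count": 7, "gc_time_millis": 120, "pause_count": 2}


fake_now = [0.0]  # fake clock so the example does not need to sleep
snapshot = CachedSnapshot(fetch_jvm_metrics, 10.0, clock=lambda: fake_now[0])
gc_count = Metric(snapshot, lambda m: m["gc_count"])
pause_count = Metric(snapshot, lambda m: m["pause_count"])

values = (gc_count.get_value(), pause_count.get_value())  # one fetch serves both
fake_now[0] = 11.0
gc_count.get_value()  # the cache has expired, so this triggers a second fetch
```

Adding a new metric in this shape means adding one more selector, with no
new enum value and no additional fetch.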

I measured the cost of calling into the metrics code by
looping the metrics-gathering 100 times and looking at CPU
time for the process using this script:

  START_CPU=$(cat /proc/$(fuser 25000/tcp 2> /dev/null | tr -d ' ')/stat | awk '{ print $14 + $15 }')
  for i in $(seq 100); do
    curl http://localhost:25000/jsonmetrics?json > /dev/null 2> /dev/null
  done
  END_CPU=$(  cat /proc/$(fuser 25000/tcp 2> /dev/null | tr -d ' ')/stat | awk '{ print $14 + $15 }')
  echo $START_CPU $END_CPU $(($END_CPU - $START_CPU))

On a release build on my development machine, gathering metrics 100
times took 0.16 CPU seconds without this change and 0.07 CPU seconds
with this change. The measurement accuracy here is 0.01 (I spot-checked
this using the cpuacct cgroup infrastructure, which gives you nanos,
but it was more painful to script), but this convinces me that this is a
net improvement.

Change-Id: Ia707393962ad94ef715ec015b3fe3bb1769104a2
Reviewed-on: http://gerrit.cloudera.org:8080/11468
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 


> Add JVM Pause Monitor to Impala Processes
> -
>
> Key: IMPALA-6857
> URL: https://issues.apache.org/jira/browse/IMPALA-6857
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Catalog, Frontend
>Reporter: Philip Zeyliger
>Priority: Major
>  Labels: ramp-up, supportability
> Fix For: Impala 3.1.0
>
>
> In IMPALA-3114, we added a pause monitor for Impala. In addition to that, we 
> should port/borrow Hadoop's JvmPauseMonitor 
> [https://github.com/apache/hadoop/blob/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/JvmPauseMonitor.java].
>  I believe that when the JVM is aggressively GCing, the C++ threads will 
> continue to get scheduled (and won't log), but the Java ones will log. (I've 
> definitely seen JvmPauseMonitor be accurate many times.)
> [~bharathv], when you were testing this, were you able to reproduce it 
> triggering when the JVM half was in "GC hell"?






[jira] [Commented] (IMPALA-7353) Fix bogus too-high memory estimates

2018-09-27 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16631016#comment-16631016
 ] 

ASF subversion and git services commented on IMPALA-7353:
-

Commit 1c94450ca92606fb6b708de2ea07445cc6610dbf in impala's branch 
refs/heads/master from [~bikram.sngh91]
[ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=1c94450 ]

IMPALA-7353: Improve memory estimates for Hbase Scan Nodes

Currently for HBase scan nodes we use a constant estimate of 1GB, which
is generally a gross over-estimation. This patch improves upon those
estimates by using heuristics based on how HBase rows are stored and
fetched and how the scanners interact with the internal memory pool.

Testing:
Added/Modified resource requirements planner test.
Added a junit test for the estimation logic.

Change-Id: I583545c3f5e454854f111871c5fbc4f108ae4bff
Reviewed-on: http://gerrit.cloudera.org:8080/11306
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 


> Fix bogus too-high memory estimates
> ---
>
> Key: IMPALA-7353
> URL: https://issues.apache.org/jira/browse/IMPALA-7353
> Project: IMPALA
>  Issue Type: Sub-task
>  Components: Frontend
>Reporter: Tim Armstrong
>Assignee: Bikramjeet Vig
>Priority: Major
>  Labels: resource-management
>
> Some operators have bogus memory estimates that are probably way too high. We 
> should adjust them to be a little more realistic.
> E.g. in HBaseScanNode:
> {code}
>   @Override
>   public void computeNodeResourceProfile(TQueryOptions queryOptions) {
>     // TODO: What's a good estimate of memory consumption?
>     nodeResourceProfile_ = ResourceProfile.noReservation(1024L * 1024L * 1024L);
>   }
> {code}






[jira] [Commented] (IMPALA-7596) Expose JvmPauseMonitor and GC Metrics to Impala's metrics infrastructure

2018-09-27 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7596?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16631013#comment-16631013
 ] 

ASF subversion and git services commented on IMPALA-7596:
-

Commit abd230647fa92db29ac3719096eb4ebc7c151069 in impala's branch 
refs/heads/master from [~philip]
[ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=abd2306 ]

IMPALA-7596. Adding JvmPauseMonitor (and other GC) metrics to Impala metrics.

Following up to IMPALA-6857, it's useful for monitoring tools to see if
the pause monitor is getting triggered, and to see other GC metrics.

The Java side here, and the Thrift side, were easy enough.

However, the Impala metric implementation here caused us to call into
the frontend to read through the JMX memory beans 72 times, because each
call to GetValue() was getting all the data for the pool. This structure
made it hard to add additional, non-pool, metrics, and it felt wasteful.
To combat this, I added a cache of 10 seconds for getting the metrics
from the Frontend. The counters will typically re-use the same data.

There are five metrics here, and to avoid yet another enum class, I used
C++ lambdas to capture which field of the Thrift object I care about. If
folks like the approach, I think it can simplify away the enums for the
pool metrics as well.

I measured the cost of calling into the metrics code by
looping the metrics-gathering 100 times and looking at CPU
time for the process using this script:

  START_CPU=$(cat /proc/$(fuser 25000/tcp 2> /dev/null | tr -d ' ')/stat | awk '{ print $14 + $15 }')
  for i in $(seq 100); do
    curl http://localhost:25000/jsonmetrics?json > /dev/null 2> /dev/null
  done
  END_CPU=$(  cat /proc/$(fuser 25000/tcp 2> /dev/null | tr -d ' ')/stat | awk '{ print $14 + $15 }')
  echo $START_CPU $END_CPU $(($END_CPU - $START_CPU))

On a release build on my development machine, gathering metrics 100
times took 0.16 CPU seconds without this change and 0.07 CPU seconds
with this change. The measurement accuracy here is 0.01 (I spot-checked
this using the cpuacct cgroup infrastructure, which gives you nanos,
but it was more painful to script), but this convinces me that this is a
net improvement.
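
For readers wondering where the 0.01 resolution comes from: fields 14 and 15
of /proc/<pid>/stat are the process's user and system CPU time in clock
ticks, and the script above sums them. The sketch below does the same
arithmetic in Python. The tick rate of 100 per second is an assumption
(typical on Linux, but it should be checked on the machine in question), and
the two sample lines are made up.

```python
def cpu_seconds(stat_line: str, ticks_per_second: int = 100) -> float:
    """Return user + system CPU seconds from one /proc/<pid>/stat line."""
    # Field 2 (the command name) is parenthesised and may contain spaces,
    # so split after the last ')' instead of splitting the whole line.
    rest = stat_line.rsplit(")", 1)[1].split()
    # rest[0] is field 3, so fields 14 (utime) and 15 (stime) are rest[11:13].
    utime_ticks, stime_ticks = int(rest[11]), int(rest[12])
    return (utime_ticks + stime_ticks) / ticks_per_second


# Made-up sample lines standing in for reads before and after the curl loop.
start = "4242 (impalad) S 1 4242 4242 0 -1 4194304 100 0 0 0 50 20 0 0 20 0 1 0 100 0 0"
end = "4242 (impalad) S 1 4242 4242 0 -1 4194304 100 0 0 0 55 22 0 0 20 0 1 0 100 0 0"
elapsed = cpu_seconds(end) - cpu_seconds(start)
```

On a real system the tick rate can be read with os.sysconf("SC_CLK_TCK")
instead of being assumed.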

Change-Id: Ia707393962ad94ef715ec015b3fe3bb1769104a2
Reviewed-on: http://gerrit.cloudera.org:8080/11468
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 


> Expose JvmPauseMonitor and GC Metrics to Impala's metrics infrastructure
> 
>
> Key: IMPALA-7596
> URL: https://issues.apache.org/jira/browse/IMPALA-7596
> Project: IMPALA
>  Issue Type: Task
>  Components: Infrastructure
>Reporter: Philip Zeyliger
>Assignee: Philip Zeyliger
>Priority: Major
>
> In IMPALA-6857 we added a thread that checks for GC pauses a bit. To allow 
> monitoring tools to pick up on the fact that pauses are happening, it's 
> useful to promote those as full-fledged metrics.
> It turns out we were also collecting those metrics by doing a lot of round 
> trips to the Java side of the house. This JIRA may choose to address that as 
> well.






[jira] [Commented] (IMPALA-6202) mod and % should be equivalent

2018-09-27 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-6202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16631019#comment-16631019
 ] 

ASF subversion and git services commented on IMPALA-6202:
-

Commit ba27b038148f0694662c14710f18ee6e94cf82b7 in impala's branch 
refs/heads/master from [~yzhangal]
[ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=ba27b03 ]

IMPALA-6202. mod and % should be equivalent.

Currently in DECIMAL V2 mode, typeof(9.9 % 3) is DECIMAL(2,1) and
typeof(mod(9.9, 3)) is DECIMAL(4,1), while both are expected to be
DECIMAL(2,1). This jira fixes V2 mode by replacing "mod" with "%" at the
parser stage, so that they share the same code path afterwards.
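
The two result types can be reproduced with a little arithmetic. The sketch
below is in Python and rests on assumptions that are not stated in this
commit message: that "%" in DECIMAL V2 yields scale max(s1, s2) and precision
min(p1 - s1, p2 - s2) + max(s1, s2), that mod() was previously resolved by
casting both arguments to a common decimal type, and that the literal 3 is
treated as DECIMAL(3,0).

```python
from typing import Tuple

DecimalType = Tuple[int, int]  # (precision, scale)


def mod_operator_type(a: DecimalType, b: DecimalType) -> DecimalType:
    """Assumed DECIMAL V2 result type for a % b."""
    (p1, s1), (p2, s2) = a, b
    scale = max(s1, s2)
    return (min(p1 - s1, p2 - s2) + scale, scale)


def common_type(a: DecimalType, b: DecimalType) -> DecimalType:
    """Smallest decimal type able to hold values of both a and b."""
    (p1, s1), (p2, s2) = a, b
    scale = max(s1, s2)
    return (max(p1 - s1, p2 - s2) + scale, scale)


nine_point_nine = (2, 1)  # literal 9.9
three = (3, 0)            # literal 3, assumed to widen to DECIMAL(3,0)

operator_result = mod_operator_type(nine_point_nine, three)
function_arg_type = common_type(nine_point_nine, three)
```

Under these assumptions the operator gives DECIMAL(2,1) and the common
argument type is DECIMAL(4,1), which matches the mismatch described above.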

Testing:
Added unit tests and ran tests on a real cluster.

Change-Id: Ib0067da04083859ffbf662a29007154461bea2ba
Reviewed-on: http://gerrit.cloudera.org:8080/11443
Tested-by: Impala Public Jenkins 
Reviewed-by: Tim Armstrong 


> mod and % should be equivalent
> --
>
> Key: IMPALA-6202
> URL: https://issues.apache.org/jira/browse/IMPALA-6202
> Project: IMPALA
>  Issue Type: Bug
>Affects Versions: Impala 2.11.0
>Reporter: Jim Apple
>Assignee: Yongjun Zhang
>Priority: Major
>
> The docs say:
> "mod(numeric_type a, same_type b) Purpose: Returns the modulus of a number. 
> Equivalent to the % arithmetic operator."
> and
> "fmod(double a, double b), fmod(float a, float b) Purpose: Returns the 
> modulus of a floating-point number. Equivalent to the % arithmetic operator."
> But these can't both be true:
> {noformat}
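> [localhost:21000] > select typeof(9.9 % 3), typeof(mod(9.9, 3)), typeof(fmod(9.9, 3));
> +-----------------+---------------------+----------------------+
> | typeof(9.9 % 3) | typeof(mod(9.9, 3)) | typeof(fmod(9.9, 3)) |
> +-----------------+---------------------+----------------------+
> | DECIMAL(2,1)    | DECIMAL(4,1)        | FLOAT                |
> +-----------------+---------------------+----------------------+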
> {noformat}






[jira] [Commented] (IMPALA-7631) Add Sentry configuration to allow specific privileges to be granted explicitly

2018-09-27 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16631018#comment-16631018
 ] 

ASF subversion and git services commented on IMPALA-7631:
-

Commit 3f78b74c921508cb30099dc9f809d898ba4aced7 in impala's branch 
refs/heads/master from [~fredyw]
[ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=3f78b74 ]

IMPALA-7631: Add sentry.db.explicit.grants.permitted in sentry-site*.xml

SENTRY-2413 requires a new configuration property,
sentry.db.explicit.grants.permitted, to be added to sentry-site*.xml to
specify which privileges are permitted to be granted explicitly.

Testing:
- Ran all FE tests
- Ran authorization E2E tests

Change-Id: I4adac50fe194cb341d49a40915763f70cd81c348
Reviewed-on: http://gerrit.cloudera.org:8080/11527
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 


> Add Sentry configuration to allow specific privileges to be granted explicitly
> --
>
> Key: IMPALA-7631
> URL: https://issues.apache.org/jira/browse/IMPALA-7631
> Project: IMPALA
>  Issue Type: Sub-task
>  Components: Infrastructure
>Affects Versions: Impala 3.0
>Reporter: Fredy Wijaya
>Assignee: Fredy Wijaya
>Priority: Major
> Fix For: Impala 3.1.0
>
>
> Sentry requires a new configuration (sentry.db.explicit.grants.permitted) to 
> specify which privileges are permitted to be granted explicitly: 
> https://issues.apache.org/jira/browse/SENTRY-2413. We need to update 
> sentry-site*template files with a new configuration.






[jira] [Commented] (IMPALA-7531) Add daemon-level metrics about fetch-from-catalog cache

2018-09-27 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7531?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16631309#comment-16631309
 ] 

ASF subversion and git services commented on IMPALA-7531:
-

Commit ac33c0c42e1e7cc898893b1ae1f69c13287d20a8 in impala's branch 
refs/heads/master from Bharath Vissapragada
[ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=ac33c0c ]

IMPALA-7531: Daemon level catalog cache metrics

This patch adds the aggregated CatalogdMetaProvider cache stats to
the catalog metrics on the coordinators. They can be accessed under
:/metrics#catalog.

These metrics are refreshed at the end of planning, for each query run.

Testing:
---

Visual inspection by running a few queries locally and making sure
stats are updated. Also modified existing tests to account for this
behavior.

Change-Id: I23c131b77ca84aa4df8919213bbd83944fa112a5
Reviewed-on: http://gerrit.cloudera.org:8080/11511
Reviewed-by: Bharath Vissapragada 
Tested-by: Impala Public Jenkins 


> Add daemon-level metrics about fetch-from-catalog cache
> ---
>
> Key: IMPALA-7531
> URL: https://issues.apache.org/jira/browse/IMPALA-7531
> Project: IMPALA
>  Issue Type: Sub-task
>Reporter: Todd Lipcon
>Assignee: bharath v
>Priority: Major
>
> It would be good to expose daemon-level metrics about cache hit rate, 
> occupancy, etc.






[jira] [Commented] (IMPALA-110) Add support for multiple distinct operators in the same query block

2018-09-27 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16631306#comment-16631306
 ] 

ASF subversion and git services commented on IMPALA-110:


Commit 90ff232b606f68bd01c2da56fbd47a677a94d5cf in impala's branch 
refs/heads/master from [~twmarshall]
[ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=90ff232 ]

IMPALA-110 (part 3): Add multiple DISTINCT support to query generator

Previously, Impala was only able to support DISTINCT in aggregate
functions over a single expr per SELECT list. IMPALA-110 removes this
restriction.

This patch eliminates code in query_generator.py that grouped exprs
for aggregate functions in order to pick a single group to make DISTINCT,
and instead simply iterates over all agg functions and makes each one
DISTINCT with a configurable probability.
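
The new behavior can be sketched as follows. This is not the code from
query_generator.py; the AggFunc class, the randomize_distinct helper, and
the i_item_sk column are hypothetical and only illustrate the independent
per-function choice.

```python
import random
from dataclasses import dataclass
from typing import List


@dataclass
class AggFunc:
    """Hypothetical stand-in for an aggregate function in a generated query."""
    name: str
    arg: str
    distinct: bool = False

    def to_sql(self) -> str:
        prefix = "DISTINCT " if self.distinct else ""
        return f"{self.name}({prefix}{self.arg})"


def randomize_distinct(funcs: List[AggFunc], probability: float,
                       rng: random.Random) -> None:
    # Each aggregate is decided independently; no grouping by argument is
    # needed once several different DISTINCT exprs may share a SELECT list.
    for func in funcs:
        func.distinct = rng.random() < probability


aggs = [AggFunc("COUNT", "i_class_id"),
        AggFunc("COUNT", "i_brand_id"),
        AggFunc("SUM", "i_item_sk")]
randomize_distinct(aggs, probability=1.0, rng=random.Random(0))
all_distinct = [f.to_sql() for f in aggs]
randomize_distinct(aggs, probability=0.0, rng=random.Random(0))
none_distinct = [f.to_sql() for f in aggs]
```

Passing the random generator in explicitly keeps a generated query
reproducible from its seed.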

Testing:
- Ran the query generator overnight with no problems (except the usual
  false positives).

Change-Id: I4a3f14655719ade7b2f6471c561dba4007fd46fa
Reviewed-on: http://gerrit.cloudera.org:8080/11073
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 


> Add support for multiple distinct operators in the same query block
> ---
>
> Key: IMPALA-110
> URL: https://issues.apache.org/jira/browse/IMPALA-110
> Project: IMPALA
>  Issue Type: New Feature
>  Components: Backend, Frontend
>Affects Versions: Impala 0.5, Impala 1.4, Impala 2.0, Impala 2.2, Impala 
> 2.3.0
>Reporter: Greg Rahn
>Assignee: Thomas Tauber-Marshall
>Priority: Major
>  Labels: sql-language
> Fix For: Impala 3.1.0
>
>
> Impala only allows a single (DISTINCT columns) expression in each query.
> {color:red}Note:
> If you do not need precise accuracy, you can produce an estimate of the 
> distinct values for a column by specifying NDV(column); a query can contain 
> multiple instances of NDV(column). To make Impala automatically rewrite 
> COUNT(DISTINCT) expressions to NDV(), enable the APPX_COUNT_DISTINCT query 
> option.
> {color}
> {code}
> [impala:21000] > select count(distinct i_class_id) from item;
> Query: select count(distinct i_class_id) from item
> Query finished, fetching results ...
> 16
> Returned 1 row(s) in 1.51s
> {code}
> {code}
> [impala:21000] > select count(distinct i_class_id), count(distinct 
> i_brand_id) from item;
> Query: select count(distinct i_class_id), count(distinct i_brand_id) from item
> ERROR: com.cloudera.impala.common.AnalysisException: Analysis exception (in 
> select count(distinct i_class_id), count(distinct i_brand_id) from item)
>   at 
> com.cloudera.impala.analysis.AnalysisContext.analyze(AnalysisContext.java:133)
>   at 
> com.cloudera.impala.service.Frontend.createExecRequest(Frontend.java:221)
>   at 
> com.cloudera.impala.service.JniFrontend.createExecRequest(JniFrontend.java:89)
> Caused by: com.cloudera.impala.common.AnalysisException: all DISTINCT 
> aggregate functions need to have the same set of parameters as COUNT(DISTINCT 
> i_class_id); deviating function: COUNT(DISTINCT i_brand_id)
>   at 
> com.cloudera.impala.analysis.AggregateInfo.createDistinctAggInfo(AggregateInfo.java:196)
>   at 
> com.cloudera.impala.analysis.AggregateInfo.create(AggregateInfo.java:143)
>   at 
> com.cloudera.impala.analysis.SelectStmt.createAggInfo(SelectStmt.java:466)
>   at 
> com.cloudera.impala.analysis.SelectStmt.analyzeAggregation(SelectStmt.java:347)
>   at com.cloudera.impala.analysis.SelectStmt.analyze(SelectStmt.java:155)
>   at 
> com.cloudera.impala.analysis.AnalysisContext.analyze(AnalysisContext.java:130)
>   ... 2 more
> {code}
> Hive supports this:
> {code}
> $ hive -e "select count(distinct i_class_id), count(distinct i_brand_id) from 
> item;"
> Logging initialized using configuration in 
> file:/etc/hive/conf.dist/hive-log4j.properties
> Hive history file=/tmp/grahn/hive_job_log_grahn_201303052234_1625576708.txt
> Total MapReduce jobs = 1
> Launching Job 1 out of 1
> Number of reduce tasks determined at compile time: 1
> In order to change the average load for a reducer (in bytes):
>   set hive.exec.reducers.bytes.per.reducer=
> In order to limit the maximum number of reducers:
>   set hive.exec.reducers.max=
> In order to set a constant number of reducers:
>   set mapred.reduce.tasks=
> Starting Job = job_201302081514_0073, Tracking URL = 
> http://impala:50030/jobdetails.jsp?jobid=job_201302081514_0073
> Kill Command = /usr/lib/hadoop/bin/hadoop job  
> -Dmapred.job.tracker=m0525.mtv.cloudera.com:8021 -kill job_201302081514_0073
> Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 
> 1
> 2013-03-05 22:34:43,255 Stage-1 map = 0%,  reduce = 0%
> 2013-03-05 22:34:49,323 Stage-1 map = 100%,  reduce = 0%, Cum

[jira] [Commented] (IMPALA-7632) Erasure coding builds still failing because of default query options

2018-09-27 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16631308#comment-16631308
 ] 

ASF subversion and git services commented on IMPALA-7632:
-

Commit e83fe23a5fd61a279e51d6dd9a979f9db2574dcd in impala's branch 
refs/heads/master from [~tarmstr...@cloudera.com]
[ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=e83fe23 ]

IMPALA-7632: fix erasure coding build for custom cluster tests

Fix tests to always pass query options via the query_options
parameter.

Modified the infrastructure to fail on non-erasure-coding builds if
tests pass in default query options in the wrong way.
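
One way to picture the change, as a Python sketch under stated assumptions:
the test infrastructure owns the --default_query_options startup flag, merges
its own defaults with the options each test supplies, and rejects tests that
set the flag themselves. The build_impalad_args helper and its parameters are
hypothetical and are not the actual custom cluster test API.

```python
from typing import Dict, List


def build_impalad_args(impalad_args: List[str], query_options: Dict[str, str],
                       infra_defaults: Dict[str, str]) -> List[str]:
    """Combine infra and per-test query options into one startup flag."""
    for arg in impalad_args:
        # Fail fast if a test sets the flag directly: a second copy of the
        # flag would compete with the infra-provided one instead of merging.
        if arg.startswith("--default_query_options"):
            raise ValueError("pass default query options via query_options")
    merged = dict(infra_defaults)
    merged.update(query_options)  # per-test values win over infra defaults
    flag = "--default_query_options=" + ",".join(
        f"{key}={value}" for key, value in merged.items())
    return list(impalad_args) + [flag]


args = build_impalad_args(
    [],
    query_options={"mem_limit": "2", "request_pool": "root.queueB"},
    infra_defaults={"allow_erasure_coded_files": "1"})
```

With this shape, the erasure-coding default and the test's own options end
up in the same flag value rather than one replacing the other.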

Skip a restart test that makes assumptions about scheduling that EC
seems to break.

Testing:
Ran core tests with erasure coding enabled.

Change-Id: I4d809faedc0c45417519f13c73559efb6c54154e
Reviewed-on: http://gerrit.cloudera.org:8080/11536
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 


> Erasure coding builds still failing because of default query options
> 
>
> Key: IMPALA-7632
> URL: https://issues.apache.org/jira/browse/IMPALA-7632
> Project: IMPALA
>  Issue Type: Bug
>  Components: Infrastructure
>Affects Versions: Impala 3.1.0
>Reporter: Tim Armstrong
>Assignee: Tim Armstrong
>Priority: Blocker
>  Labels: broken-build
>
> Two tests fail because the default query options they set were clobbered by 
> the custom cluster test infra:
> * TestSetAndUnset.test_set_and_unset
> * TestAdmissionController.test_set_request_pool
> {noformat}
> hs2/hs2_test_suite.py:48: in add_session
> fn(self)
> custom_cluster/test_set_and_unset.py:44: in test_set_and_unset
> assert "DEBUG_ACTION\tcustom\tDEVELOPMENT" in result.data, "baseline"
> E   AssertionError: baseline
> E   assert 'DEBUG_ACTION\tcustom\tDEVELOPMENT' in 
> ['ABORT_ON_ERROR\t0\tREGULAR', 'ALLOW_ERASURE_CODED_FILES\t1\tDEVELOPMENT', 
> 'APPX_COUNT_DISTINCT\t0\tADVANCED', 'BATCH_SIZE\t0\tDEVELOPMENT', 
> 'BUFFER_POOL_LIMIT\t\tADVANCED', 'COMPRESSION_CODEC\t\tREGULAR', ...]
> E+  where ['ABORT_ON_ERROR\t0\tREGULAR', 
> 'ALLOW_ERASURE_CODED_FILES\t1\tDEVELOPMENT', 
> 'APPX_COUNT_DISTINCT\t0\tADVANCED', 'BATCH_SIZE\t0\tDEVELOPMENT', 
> 'BUFFER_POOL_LIMIT\t\tADVANCED', 'COMPRESSION_CODEC\t\tREGULAR', ...] = 
> .data
> hs2/hs2_test_suite.py:48: in add_session
> fn(self)
> custom_cluster/test_admission_controller.py:317: in test_set_request_pool
> ['MEM_LIMIT=2', 'REQUEST_POOL=root.queueB'])
> custom_cluster/test_admission_controller.py:224: in __check_query_options
> assert False, "Expected query options %s, got %s." % (expected, actual)
> E   AssertionError: Expected query options 
> MEM_LIMIT=2,REQUEST_POOL=root.queueB, got 
> allow_erasure_coded_files=1,request_pool=root.queueb.
> E   assert False{noformat}






[jira] [Commented] (IMPALA-7492) Add support for DATE text parser/formatter

2018-09-30 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16633291#comment-16633291
 ] 

ASF subversion and git services commented on IMPALA-7492:
-

Commit cb49371613909e56debee6275fd54759eb36ad33 in impala's branch 
refs/heads/master from [~attilaj]
[ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=cb49371 ]

IMPALA-7492: Add support for DATE text parser/formatter

This change is the first step in implementing support for DATE type
(IMPALA-6169).

The DATE parser/formatter is implemented by the new DateParser class.
- The parser supports parsing both default and custom formatted DATE
values. CCTZ is used to validate the parsed dates.
- The formatter supports default and custom formatting of DATE values.
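
As a rough illustration of what such a parser/formatter pair does, here is a
Python sketch. It is not the DateParser implementation: it uses Python's
datetime module where the patch uses CCTZ, and both the "%Y-%m-%d" default
pattern and the days-since-epoch representation are assumptions made for the
example.

```python
from datetime import date, datetime, timedelta
from typing import Optional

EPOCH = date(1970, 1, 1)
DEFAULT_FORMAT = "%Y-%m-%d"


def parse_date(text: str, fmt: str = DEFAULT_FORMAT) -> Optional[int]:
    """Parse text into days since 1970-01-01, or None if it is not valid."""
    try:
        parsed = datetime.strptime(text, fmt).date()
    except ValueError:
        # Raised for malformed input and for impossible dates such as Feb 30.
        return None
    return (parsed - EPOCH).days


def format_date(days_since_epoch: int, fmt: str = DEFAULT_FORMAT) -> str:
    """Format a days-since-epoch value with the default or a custom pattern."""
    return (EPOCH + timedelta(days=days_since_epoch)).strftime(fmt)


days = parse_date("2018-09-30")
round_trip = format_date(days)
custom = parse_date("30/09/2018", "%d/%m/%Y")
invalid = parse_date("2018-02-30")
```

Validation falls out of constructing a real calendar date, so an impossible
date is rejected at parse time rather than stored.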

In the future, DateParser will be used in the text scanner/writer and
in the DATE <-> STRING cast functions.

The DateParser class reuses some of the functionality already
implemented in the TimestampParser class to minimize redundancy. To
make code reuse easier, a new namespace (datetime_parse_util) was
created and the common functionality was moved there.

This change also adds a new class (DateValue) to represent a DATE
value in-memory. The DateParser and DateValue classes are used only in
tests at the moment, therefore this patch doesn't change user facing
behavior.

Testing:
- Added BE-tests for DateParser and DateValue classes.
- Re-ran parse-timestamp-benchmark to make sure that parser
  performance hasn't degraded.

Change-Id: I1eec00f22502c4c67c6807c4b51384419ea8b831
Reviewed-on: http://gerrit.cloudera.org:8080/11450
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 


> Add support for DATE text parser/formatter
> --
>
> Key: IMPALA-7492
> URL: https://issues.apache.org/jira/browse/IMPALA-7492
> Project: IMPALA
>  Issue Type: Sub-task
>Reporter: Attila Jeges
>Assignee: Attila Jeges
>Priority: Major
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Commented] (IMPALA-6169) Implement DATE type

2018-09-30 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-6169?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16633292#comment-16633292
 ] 

ASF subversion and git services commented on IMPALA-6169:
-

Commit cb49371613909e56debee6275fd54759eb36ad33 in impala's branch 
refs/heads/master from [~attilaj]
[ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=cb49371 ]

IMPALA-7492: Add support for DATE text parser/formatter

This change is the first step in implementing support for DATE type
(IMPALA-6169).

The DATE parser/formatter is implemented by the new DateParser class.
- The parser supports parsing both default and custom formatted DATE
values. CCTZ is used to validate the parsed dates.
- The formatter supports default and custom formatting of DATE values.

In the future, DateParser will be used in the text scanner/writer and
in the DATE <-> STRING cast functions.

The DateParser class reuses some of the functionality already
implemented in the TimestampParser class to minimize redundancy. To
make code reuse easier, a new namespace (datetime_parse_util) was
created and the common functionality was moved there.

This change also adds a new class (DateValue) to represent a DATE
value in-memory. The DateParser and DateValue classes are used only in
tests at the moment, so this patch doesn't change user-facing
behavior.

Testing:
- Added BE-tests for DateParser and DateValue classes.
- Re-ran parse-timestamp-benchmark to make sure that parser
  performance hasn't degraded.

Change-Id: I1eec00f22502c4c67c6807c4b51384419ea8b831
Reviewed-on: http://gerrit.cloudera.org:8080/11450
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 


> Implement DATE type
> ---
>
> Key: IMPALA-6169
> URL: https://issues.apache.org/jira/browse/IMPALA-6169
> Project: IMPALA
>  Issue Type: New Feature
>  Components: Backend
>Reporter: Quanlong Huang
>Assignee: Attila Jeges
>Priority: Major
>
> In Hive, the Date type describes a particular year/month/day, in the form 
> YYYY-MM-DD.
> Hive has supported the Date type in Parquet since Hive-1.2.0, released two 
> years ago. (See 
> https://issues.apache.org/jira/browse/HIVE-8119 and 
> https://cwiki.apache.org/confluence/display/Hive/Parquet#Parquet-VersionsandLimitations.)
> We should add support for Date type too.






[jira] [Commented] (IMPALA-7597) "show partitions" does not retry on InconsistentMetadataFetchException

2018-09-30 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16633295#comment-16633295
 ] 

ASF subversion and git services commented on IMPALA-7597:
-

Commit 6f7b162154daae4614a6f1da0be920394478b123 in impala's branch 
refs/heads/master from [~vercego]
[ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=6f7b162 ]

IMPALA-7599: make the number of local cache retries configurable

Under heavy read/write load, the number of retries needed for queries
in order to skip over inconsistent metadata exceptions needs to be set
higher. This change makes the number of retries configurable. It can be
set with the newly added flag --local_catalog_max_fetch_retries.
In addition, this change increases the default from 10 to 40, which was
sufficient when handling several workloads with high read/write load.
Follow-up change for IMPALA-7597 will make use of this configuration
when retrying for cases other than analyzing queries.
Made several fixes to exception messages.
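
A minimal sketch of a retry loop bounded by a configurable limit, in Python.
The exception class is a local stand-in for the one named above,
run_with_retries is a hypothetical helper, and treating the limit as the
number of retries after the first attempt is an assumption of this sketch.

```python
class InconsistentMetadataFetchException(Exception):
    """Local stand-in for the exception named in the commit message."""


def run_with_retries(operation, max_fetch_retries=40):
    """Call operation(), retrying while it reports inconsistent metadata.

    max_fetch_retries plays the role of --local_catalog_max_fetch_retries;
    here it counts retries after the initial attempt (an assumption).
    """
    retries = 0
    while True:
        try:
            return operation()
        except InconsistentMetadataFetchException:
            retries += 1
            if retries > max_fetch_retries:
                # Retry budget exhausted: let the caller see the failure.
                raise
```

Any exception other than the inconsistent-metadata one propagates
immediately, so only the transient condition is retried.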

Testing:
- manual tests
- added an e2e test that sets the flag and checks for inconsistent metadata

Change-Id: I4f14d5a8728f3cb07c7710589c44c2cd52478ba8
Reviewed-on: http://gerrit.cloudera.org:8080/11539
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 


> "show partitions" does not retry on InconsistentMetadataFetchException
> --
>
> Key: IMPALA-7597
> URL: https://issues.apache.org/jira/browse/IMPALA-7597
> Project: IMPALA
>  Issue Type: Bug
>  Components: Frontend
>Affects Versions: Impala 3.1.0
>Reporter: bharath v
>Assignee: Vuk Ercegovac
>Priority: Critical
>
> IMPALA-7530 added retries in case LocalCatalog throws 
> InconsistentMetadataFetchException. These retries apply to all code paths 
> taking {{Frontend#createExecRequest()}}. 
> "show partitions" additionally takes {{Frontend#getTableStats()}} and aborts 
> the first time it sees InconsistentMetadataFetchException. 
> We need to make sure all the queries (especially DDLs) retry if they hit this 
> exception.






[jira] [Commented] (IMPALA-7622) Add query profile metrics for RPC's used when pulling incremental stats.

2018-09-30 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16633289#comment-16633289
 ] 

ASF subversion and git services commented on IMPALA-7622:
-

Commit 235748316c5cada5c58b3e84a4e20ee57f1c4a49 in impala's branch 
refs/heads/master from [~vercego]
[ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=2357483 ]

IMPALA-7622: adds profile metrics when fetching incremental stats

When computing incremental statistics by fetching the stats directly
from catalogd, a potentially expensive RPC is made from the impalad
coordinator to catalogd. This change adds metrics to the frontend
section of the profile to track how long the request takes, the size
of the compressed bytes received, and the number of partitions received.

The profile for a 'compute incremental ...' command on a table with
no statistics looks like this:

Frontend:
 - StatsFetch.CompressedBytes: 0
 - StatsFetch.TotalPartitions: 24
 - StatsFetch.NumPartitionsWithStats: 0
 - StatsFetch.Time: 26ms

And the profile looks as follows when the table has stats, so the stats
are fetched:

Frontend:
 - StatsFetch.CompressedBytes: 24622
 - StatsFetch.TotalPartitions: 23
 - StatsFetch.NumPartitionsWithStats: 23
 - StatsFetch.Time: 14ms
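
For checking such counters in a test, the section can be pulled out of the
profile text with a small parser. This Python sketch assumes the layout shown
in the samples above (a "Frontend:" header followed by " - Name: value"
lines); parse_frontend_counters is a hypothetical helper, not an Impala API.

```python
import re

# Matches lines such as " - StatsFetch.Time: 14ms".
COUNTER_RE = re.compile(r"^\s*-\s*(?P<name>[\w.]+):\s*(?P<value>\S+)\s*$")


def parse_frontend_counters(profile_text):
    """Return the counters of the Frontend section as a name -> value dict."""
    counters = {}
    in_frontend = False
    for line in profile_text.splitlines():
        if line.strip() == "Frontend:":
            in_frontend = True
            continue
        if not in_frontend:
            continue
        match = COUNTER_RE.match(line)
        if match is not None:
            # Values are kept as strings because units vary (bytes, ms).
            counters[match.group("name")] = match.group("value")
        elif line.strip():
            # The first non-blank, non-counter line ends the section.
            break
    return counters
```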

Testing:
- manual inspection
- e2e test to check the profile

Change-Id: Ic9b268548c7a98c751eb99855ee08313d1d5a903
Reviewed-on: http://gerrit.cloudera.org:8080/11534
Reviewed-by: Vuk Ercegovac 
Tested-by: Impala Public Jenkins 


> Add query profile metrics for RPC's used when pulling incremental stats.
> 
>
> Key: IMPALA-7622
> URL: https://issues.apache.org/jira/browse/IMPALA-7622
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Frontend
>Affects Versions: Impala 3.1.0
>Reporter: Vuk Ercegovac
>Assignee: Vuk Ercegovac
>Priority: Major
> Fix For: Impala 3.1.0
>
>
> When --pull_incremental_statistics is enabled, the frontend will fetch these 
> stats from catalogd. We should record metrics for this, such as the number 
> of partitions fetched, size of received bytes, and elapsed time.






[jira] [Commented] (IMPALA-7599) Make num retries for InconsistentMetadataFetchException configurable

2018-09-30 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7599?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16633294#comment-16633294
 ] 

ASF subversion and git services commented on IMPALA-7599:
-

Commit 6f7b162154daae4614a6f1da0be920394478b123 in impala's branch 
refs/heads/master from [~vercego]
[ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=6f7b162 ]

IMPALA-7599: make the number of local cache retries configurable

Under heavy read/write load, the number of retries needed for queries
in order to skip over inconsistent metadata exceptions needs to be set
higher. This change makes the number of retries configurable. It can be
set with the newly added flag --local_catalog_max_fetch_retries.
In addition, this change increases the default from 10 to 40, which was
sufficient when handling several workloads with high read/write load.
Follow-up change for IMPALA-7597 will make use of this configuration
when retrying for cases other than analyzing queries.
Made several fixes to exception messages.

Testing:
- manual tests
- added an e2e test that sets the flag and checks for inconsistent metadata

Change-Id: I4f14d5a8728f3cb07c7710589c44c2cd52478ba8
Reviewed-on: http://gerrit.cloudera.org:8080/11539
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 


> Make num retries for InconsistentMetadataFetchException configurable
> 
>
> Key: IMPALA-7599
> URL: https://issues.apache.org/jira/browse/IMPALA-7599
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Frontend
>Affects Versions: Impala 3.1.0
>Reporter: bharath v
>Assignee: Vuk Ercegovac
>Priority: Major
>
> Currently hardcoded to 10 (INCONSISTENT_METADATA_NUM_RETRIES)






[jira] [Commented] (IMPALA-7503) SHOW GRANT USER not showing all privileges

2018-09-30 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16633290#comment-16633290
 ] 

ASF subversion and git services commented on IMPALA-7503:
-

Commit 34e666f520a5a7ef87f055263f5e4d9df5d888c0 in impala's branch 
refs/heads/master from [~aholley]
[ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=34e666f ]

IMPALA-7503: SHOW GRANT USER not showing all privileges.

This patch fixes the SHOW GRANT USER statement to show all privileges
granted to a user, either directly via object ownership, or granted
through a role via a group the user belongs to. The output for SHOW
GRANT USER will have two additional columns for principal type and
principal name so the user can know where the privilege comes from.

Truncated sample showing two columns that are different from role:
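+----------------+----------------+--------+----------+-...
| principal_type | principal_name | scope  | database | ...
+----------------+----------------+--------+----------+-...
| USER           | foo            | table  | foo_db   | ...
| ROLE           | foo_role       | server |          | ...
+----------------+----------------+--------+----------+-...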
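
How the rows for a user might be assembled from direct grants plus roles
reached through group membership, as a Python sketch. The mapping arguments
and the show_grant_user function are hypothetical stand-ins for the catalog
and group-mapping lookups; only the idea of tagging each row with its
principal comes from the description above.

```python
def show_grant_user(user, user_privileges, user_groups, group_roles,
                    role_privileges):
    """Return (principal_type, principal_name, *privilege) rows for a user.

    All four mappings are plain dicts standing in for real lookups.
    """
    rows = []
    # Privileges held by the user directly.
    for privilege in user_privileges.get(user, []):
        rows.append(("USER", user) + tuple(privilege))
    # Privileges inherited from roles granted to the user's groups.
    seen_roles = set()
    for group in user_groups.get(user, []):
        for role in group_roles.get(group, []):
            if role in seen_roles:
                continue  # a role reachable via two groups is listed once
            seen_roles.add(role)
            for privilege in role_privileges.get(role, []):
                rows.append(("ROLE", role) + tuple(privilege))
    return rows
```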

Testing:
- Created new custom cluster test with custom group mapping.
- Ran FE and custom cluster tests.

Change-Id: Ie9f6c88f5569e1c414ceb8a86e7b013eaa3ecde1
Reviewed-on: http://gerrit.cloudera.org:8080/11531
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 


> SHOW GRANT USER not showing all privileges
> --
>
> Key: IMPALA-7503
> URL: https://issues.apache.org/jira/browse/IMPALA-7503
> Project: IMPALA
>  Issue Type: Sub-task
>  Components: Frontend
>Affects Versions: Impala 3.1.0
>Reporter: Adam Holley
>Assignee: Adam Holley
>Priority: Major
> Fix For: Impala 3.1.0
>
>
> SHOW GRANT USER will only show the privileges explicitly granted to the user. 
>  It should show all the privileges that are granted to any groups/roles the 
> user belongs to.






[jira] [Commented] (IMPALA-6271) Impala daemon should log a message when it's being shut down

2018-09-30 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-6271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16633293#comment-16633293
 ] 

ASF subversion and git services commented on IMPALA-6271:
-

Commit f699a6ce83f2c7b24e6b7b31c5a7647cc132eef6 in impala's branch 
refs/heads/master from Pranay
[ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=f699a6c ]

IMPALA-6271: Impala daemon should log a message when it's being shut down

Currently Impalad does not log any message when SIGTERM is sent to
impalad to terminate or to do a graceful shut down. This change logs
a message when SIGTERM is received by impalad/catalogd/statestored.
This logging will assist in debugging the issues seen in the field
where impalad was not gracefully shut down (some other signal
was generated that led to impalad/catalogd/statestored crash).
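
The general pattern of logging on SIGTERM before exiting, sketched in Python
for illustration only. The daemons themselves are C++; the logger name, the
message text and both function names below are assumptions of this sketch.

```python
import logging
import signal

log = logging.getLogger("daemon")


def handle_sigterm(signum, frame):
    """Record that a termination signal arrived, then exit cleanly."""
    log.info("Caught signal: %s. Daemon will exit.",
             signal.Signals(signum).name)
    raise SystemExit(0)


def install_sigterm_logging():
    """Register the handler; must be called from the main thread."""
    signal.signal(signal.SIGTERM, handle_sigterm)
```

With the handler installed, an exit that leaves no such line in the log
points to a different signal or a crash rather than a requested shutdown.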

Testing:
---
a) Used kill to send signals to impalad/catalogd/statestored
   `kill -s SIGTERM ` and see the
   log message is being logged in impalad/catalogd/statestored.INFO.
b) Ran test_breakpad.py to check that existing breakpad functionalities
   are not affected.
c) Ran exhaustive tests without failure.
d) Added new test in test_breakpad.py to handle SIGTERM for
   impalad/statestored/catalogd.

Change-Id: Id20da9e30440b7348557beccb8a0da14775fcc29
Reviewed-on: http://gerrit.cloudera.org:8080/10847
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 


> Impala daemon should log a message when it's being shut down
> 
>
> Key: IMPALA-6271
> URL: https://issues.apache.org/jira/browse/IMPALA-6271
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Backend
>Affects Versions: Impala 2.10.0
>Reporter: Zoram Thanga
>Assignee: Pranay Singh
>Priority: Major
>  Labels: observability, supportability
>
> At present the Impala daemon does not log any message when it is being shut 
> down, usually via SIGTERM from management software or OS shutdown. It would 
> be good to at the very least catch this signal to log a message that we are 
> going down. This will aid in serviceability.






[jira] [Commented] (IMPALA-376) Built-in functions for parsing JSON

2018-09-30 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16633296#comment-16633296
 ] 

ASF subversion and git services commented on IMPALA-376:


Commit ddef2cb9b14e7f8cf9a68a2a382e10a8e0f91c3d in impala's branch 
refs/heads/master from stiga-huang
[ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=ddef2cb ]

IMPALA-376: add built-in functions for parsing JSON

This patch implements the same function as Hive UDF get_json_object.
We reuse RapidJson to parse the json string. In order to track the
memory used in RapidJson, we wrap FunctionContext into an allocator.

get_json_object accepts two parameters: a json string and a selector
(json path). We parse the json string into a Document tree and then
perform BFS according to the selector. For example, to process
get_json_object('[{\"a\":1}, {\"a\":2}, {\"a\":3}]', '$[*].a'),
we first perform '$[*]' to extract all the items in the root array.
Then we get a queue consisting of {a:1},{a:2},{a:3} and perform the '.a'
selector on all values in the queue. The final result in the queue is
1,2,3. As there are multiple results, they are encapsulated into
an array. The output is the string '[1,2,3]'.
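
The queue-based evaluation can be sketched in Python. This is a simplified
illustration, not the patch's implementation: it uses the standard json
module instead of RapidJson, supports only '.key', '[n]' and '[*]' steps, and
its handling of single results is an assumption.

```python
import json
import re
from collections import deque

# One selector step: ".key", "[3]" or "[*]".
STEP_RE = re.compile(r"\.(\w+)|\[(\*|\d+)\]")


def get_json_object(json_text, selector):
    """Evaluate a small subset of JSON path; return a string or None."""
    if not selector.startswith("$"):
        return None
    try:
        root = json.loads(json_text)
    except ValueError:
        return None
    # Split the selector into steps, rejecting anything unsupported.
    steps, pos = [], 1
    while pos < len(selector):
        match = STEP_RE.match(selector, pos)
        if match is None:
            return None
        steps.append(match.groups())
        pos = match.end()
    # Apply each step to every value currently in the queue.
    queue = deque([root])
    for key, index in steps:
        next_queue = deque()
        for value in queue:
            if key is not None:
                if isinstance(value, dict) and key in value:
                    next_queue.append(value[key])
            elif isinstance(value, list):
                if index == "*":
                    next_queue.extend(value)
                elif int(index) < len(value):
                    next_queue.append(value[int(index)])
        queue = next_queue
    if not queue:
        return None
    results = list(queue)
    # Multiple results are wrapped in an array; a single one is unwrapped.
    value = results if len(results) > 1 else results[0]
    if isinstance(value, str):
        return value
    return json.dumps(value, separators=(",", ":"))
```

For the example above, get_json_object('[{"a":1}, {"a":2}, {"a":3}]',
'$[*].a') returns the string '[1,2,3]'.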

More examples can be found in expr-test.cc.

Test:
* Add unit tests in expr-test
* Add e2e tests in exprs.test
* Add tests in test_alloc_fail.py to check handling of out of memory

Change-Id: I6a9d3598cb3beca0865a7edb094f3a5b602dbd2f
Reviewed-on: http://gerrit.cloudera.org:8080/10950
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 


> Built-in functions for parsing JSON
> ---
>
> Key: IMPALA-376
> URL: https://issues.apache.org/jira/browse/IMPALA-376
> Project: IMPALA
>  Issue Type: New Feature
>  Components: Backend
>Affects Versions: Product Backlog
> Environment: All supported environments
>Reporter: Zoltan Toth-Czifra
>Assignee: Quanlong Huang
>Priority: Minor
>  Labels: built-in-function
>
> Hi,
> Hive comes with some useful built-in UDFs to process JSON objects.
> https://cwiki.apache.org/confluence/display/Hive/LanguageManual+UDF
> Namely:
> - get_json_object
> - json_tuple
> To make Impala and Hive tables and queries more interchangeable, I am 
> proposing porting these UDFs to be part of Impala's built-in functions:
> http://www.cloudera.com/content/cloudera-content/cloudera-docs/Impala/latest/Installing-and-Using-Impala/ciiu_functions.html
> h4. Example
> Consider the following table *raw_log*
> ||action||parameters||
> |search|{"keyword":"hotel"}|
> |visit|{"url":"http://example.com"}|
> ...and the following query:
> {code}
> SELECT get_json_object(parameters, "$.keyword") AS keyword FROM raw_log 
> WHERE action='search';
> {code}
> The query should return the following results:
> ||keyword||
> |hotel|






[jira] [Commented] (IMPALA-7570) Impala Doc: Add a table of built-in functions

2018-10-02 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16635855#comment-16635855
 ] 

ASF subversion and git services commented on IMPALA-7570:
-

Commit 10bffe2f934720b848acd6b2fe8ce10b282d891d in impala's branch 
refs/heads/master from [~arodoni_cloudera]
[ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=10bffe2 ]

IMPALA-7570: [DOCS] Added a table of all built-in impala_functions

- Removed text that added no value.
- Added a table of built-in functions that users can use to get a link
  to functions.

Because the functions are listed as , the above list
of functions has to be manually maintained. When there is a new function
or a removed function, update the above list. The link format is:
FUNCTION NAME
For example:
WEEKOFYEAR

Change-Id: I2f6b024bc218a9158249f161fd16be10f16d19db
Reviewed-on: http://gerrit.cloudera.org:8080/11441
Reviewed-by: Jim Apple 
Tested-by: Impala Public Jenkins 


> Impala Doc: Add a table of built-in functions
> -
>
> Key: IMPALA-7570
> URL: https://issues.apache.org/jira/browse/IMPALA-7570
> Project: IMPALA
>  Issue Type: Bug
>  Components: Docs
>Reporter: Alex Rodoni
>Assignee: Alex Rodoni
>Priority: Major
>
> https://gerrit.cloudera.org/#/c/11438/






[jira] [Commented] (IMPALA-6990) TestClientSsl.test_tls_v12 failing due to Python SSL error

2018-10-02 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-6990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16635861#comment-16635861
 ] 

ASF subversion and git services commented on IMPALA-6990:
-

Commit d3cf6d325779fff4ab0ba9db53411c510aa0f717 in impala's branch 
refs/heads/master from [~philip]
[ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=d3cf6d3 ]

IMPALA-7629: Re-enable erroneously disabled TestClientSsl tests.

The fix for IMPALA-6990 had a bug, disabling some tests erroneously.
With this change, the tests run on Ubuntu 16.04 like so:

  tests/custom_cluster/test_client_ssl.py::TestClientSsl::test_ssl[] PASSED
  tests/custom_cluster/test_client_ssl.py::TestClientSsl::test_tls_ecdh[] PASSED
  tests/custom_cluster/test_client_ssl.py::TestClientSsl::test_tls_v12[] PASSED
  tests/custom_cluster/test_client_ssl.py::TestClientSsl::test_wildcard_ssl[] xfail
  tests/custom_cluster/test_client_ssl.py::TestClientSsl::test_wildcard_san_ssl[] xfail

The xfails are all "Inconsistent wildcard support on target platforms".

On centos7:

  custom_cluster/test_client_ssl.py::TestClientSsl::test_ssl[] PASSED
  custom_cluster/test_client_ssl.py::TestClientSsl::test_tls_ecdh[] SKIPPED
  custom_cluster/test_client_ssl.py::TestClientSsl::test_tls_v12[] SKIPPED
  custom_cluster/test_client_ssl.py::TestClientSsl::test_wildcard_ssl[] xfail
  custom_cluster/test_client_ssl.py::TestClientSsl::test_wildcard_san_ssl[] xfail

On centos6:

  custom_cluster/test_client_ssl.py::TestClientSsl::test_ssl[] PASSED
  custom_cluster/test_client_ssl.py::TestClientSsl::test_tls_ecdh[] SKIPPED
  custom_cluster/test_client_ssl.py::TestClientSsl::test_tls_v12[] SKIPPED
  custom_cluster/test_client_ssl.py::TestClientSsl::test_wildcard_ssl[] SKIPPED
  custom_cluster/test_client_ssl.py::TestClientSsl::test_wildcard_san_ssl[] SKIPPED

I used "curl --silent https://.../consoleText | grep test_client_ssl | sed -e 's/\[.*\]/[]/'"
to extract these from Jenkins output.
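
The grep and sed stages of that pipeline have a straightforward Python
equivalent, sketched here for cases where the console text is already in
memory. Both function names are hypothetical.

```python
import re


def normalize_test_line(line):
    """Collapse the bracketed test parameters to '[]', like the sed step."""
    # Greedy match from the first '[' to the last ']' on the line.
    return re.sub(r"\[.*\]", "[]", line, count=1)


def extract_ssl_results(console_text):
    """Keep only test_client_ssl lines, with parameters collapsed."""
    return [
        normalize_test_line(line)
        for line in console_text.splitlines()
        if "test_client_ssl" in line
    ]
```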

Change-Id: I64879b8af39f967b0059797e7b36421ce0e58bed
Reviewed-on: http://gerrit.cloudera.org:8080/11530
Reviewed-by: Philip Zeyliger 
Tested-by: Impala Public Jenkins 


> TestClientSsl.test_tls_v12 failing due to Python SSL error
> --
>
> Key: IMPALA-6990
> URL: https://issues.apache.org/jira/browse/IMPALA-6990
> Project: IMPALA
>  Issue Type: Bug
>Affects Versions: Impala 3.0
>Reporter: Sailesh Mukil
>Assignee: Sailesh Mukil
>Priority: Blocker
>  Labels: broken-build, flaky
> Fix For: Impala 2.13.0, Impala 3.1.0
>
>
> We've seen quite a few jobs fail with the following error:
> *_ssl.c:504: EOF occurred in violation of protocol*
> {code:java}
> custom_cluster/test_client_ssl.py:128: in test_tls_v12
> self._validate_positive_cases("%s/server-cert.pem" % self.CERT_DIR)
> custom_cluster/test_client_ssl.py:181: in _validate_positive_cases
> result = run_impala_shell_cmd(shell_options)
> shell/util.py:97: in run_impala_shell_cmd
> result.stderr)
> E   AssertionError: Cmd --ssl -q 'select 1 + 2' was expected to succeed: 
> Starting Impala Shell without Kerberos authentication
> E   SSL is enabled. Impala server certificates will NOT be verified (set 
> --ca_cert to change)
> E   
> /data/jenkins/workspace/impala-cdh6.x-exhaustive-rhel7/Impala-Toolchain/thrift-0.9.3-p4/python/lib64/python2.7/site-packages/thrift/transport/TSSLSocket.py:80:
>  DeprecationWarning: 3th positional argument is deprecated. Use keyward 
> argument insteand.
> E DeprecationWarning)
> E   
> /data/jenkins/workspace/impala-cdh6.x-exhaustive-rhel7/Impala-Toolchain/thrift-0.9.3-p4/python/lib64/python2.7/site-packages/thrift/transport/TSSLSocket.py:80:
>  DeprecationWarning: 4th positional argument is deprecated. Use keyward 
> argument insteand.
> E DeprecationWarning)
> E   
> /data/jenkins/workspace/impala-cdh6.x-exhaustive-rhel7/Impala-Toolchain/thrift-0.9.3-p4/python/lib64/python2.7/site-packages/thrift/transport/TSSLSocket.py:80:
>  DeprecationWarning: 5th positional argument is deprecated. Use keyward 
> argument insteand.
> E DeprecationWarning)
> E   
> /data/jenkins/workspace/impala-cdh6.x-exhaustive-rhel7/Impala-Toolchain/thrift-0.9.3-p4/python/lib64/python2.7/site-packages/thrift/transport/TSSLSocket.py:216:
>  DeprecationWarning: validate is deprecated. Use cert_reqs=ssl.CERT_NONE 
> instead
> E DeprecationWarning)
> E   No handlers could be found for logger "thrift.transport.TSSLSocket"
> E   Error connecting: TTransportException, Could not connect to 
> localhost:21000: [Errno 8] _ssl.c:504: EOF occurred in violation of protocol
> E   Not connected to Impala, could not execute queries.
> {code}
> We need to investigate why this is happening and fix it.




[jira] [Commented] (IMPALA-7622) Add query profile metrics for RPC's used when pulling incremental stats.

2018-10-02 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16635856#comment-16635856
 ] 

ASF subversion and git services commented on IMPALA-7622:
-

Commit d918b2aeb582ca465dc3e5066a77a7b4dab39641 in impala's branch 
refs/heads/master from [~vercego]
[ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=d918b2a ]

Revert "IMPALA-7622: adds profile metrics when fetching incremental stats"

Breaks downstream dependence on profile (1/2 of changes).

This reverts commit 235748316c5cada5c58b3e84a4e20ee57f1c4a49.

Change-Id: I80b4c0e4b8487572285ac788ab0195896f221842
Reviewed-on: http://gerrit.cloudera.org:8080/11551
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 


> Add query profile metrics for RPC's used when pulling incremental stats.
> 
>
> Key: IMPALA-7622
> URL: https://issues.apache.org/jira/browse/IMPALA-7622
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Frontend
>Affects Versions: Impala 3.1.0
>Reporter: Vuk Ercegovac
>Assignee: Vuk Ercegovac
>Priority: Major
> Fix For: Impala 3.1.0
>
>
> When --pull_incremental_statistics is enabled, the frontend will fetch these 
> stats from catalogd. We should record metrics for this, such as the number 
> of partitions fetched, size of received bytes, and elapsed time.






[jira] [Commented] (IMPALA-7595) Check failed: IsValidTime(time_) at timestamp-value.h:322

2018-10-02 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16635854#comment-16635854
 ] 

ASF subversion and git services commented on IMPALA-7595:
-

Commit 810841115a4f62dffd219cca8a9fbd34ea73e37c in impala's branch 
refs/heads/master from [~csringhofer]
[ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=8108411 ]

IMPALA-7595: Check the validity of the time part of Parquet timestamps

Before this fix Impala did not check whether a timestamp's time part
is out of the valid [0, 24 hour) range when reading Parquet files,
so these timestamps were memcopied as they were to slots, leading to
results like:
1970-01-01 -00:00:00.1
1970-01-01 24:00:00

Different parts of Impala treat these timestamps differently:
- string conversion leads to invalid representation that cannot be
  converted back to timestamp
- timezone conversions handle the overflowing time part and give
  a valid timestamp result (at least since CCTZ, I did not check
  older versions of Impala)
- Parquet writing inserts these timestamps as they are, so the
  resulting Parquet file will also contain corrupt timestamps

The fix adds a check that converts these corrupt timestamps to NULL,
similarly to the handling of timestamps outside the [1400..10000)
range. A new error code is added for this case. If both the date
and the time parts are corrupt, then the error about the corrupt time is
returned.
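
The check can be illustrated with a Python sketch that decodes a 12-byte
Parquet INT96 timestamp (8 bytes of nanoseconds within the day followed by a
4-byte Julian day number, both little-endian). The function name is
hypothetical, None stands in for NULL, and error reporting is left out.

```python
import struct
from datetime import date, datetime, time, timedelta
from typing import Optional

NANOS_PER_DAY = 24 * 60 * 60 * 1_000_000_000
JULIAN_DAY_OF_EPOCH = 2440588  # Julian day number of 1970-01-01
EPOCH_ORDINAL = date(1970, 1, 1).toordinal()
MIN_ORDINAL = date(1400, 1, 1).toordinal()
MAX_ORDINAL = date(9999, 12, 31).toordinal()


def decode_int96_timestamp(raw: bytes) -> Optional[datetime]:
    """Decode an INT96 timestamp; return None if either part is invalid."""
    nanos_of_day, julian_day = struct.unpack("<qi", raw)
    # The time part is checked first, so a value with both parts corrupt
    # is rejected because of its time.
    if not 0 <= nanos_of_day < NANOS_PER_DAY:
        return None
    ordinal = julian_day - JULIAN_DAY_OF_EPOCH + EPOCH_ORDINAL
    if not MIN_ORDINAL <= ordinal <= MAX_ORDINAL:
        return None
    midnight = datetime.combine(date.fromordinal(ordinal), time.min)
    # Python datetimes hold microseconds, so sub-microsecond digits are
    # dropped in this sketch.
    return midnight + timedelta(microseconds=nanos_of_day // 1000)
```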

Testing:
- added a new scanner test that reads a corrupted Parquet file
  with edge values

Change-Id: Ibc0ae651b6a0a028c61a15fd069ef9e904231058
Reviewed-on: http://gerrit.cloudera.org:8080/11521
Reviewed-by: Csaba Ringhofer 
Tested-by: Impala Public Jenkins 


> Check failed: IsValidTime(time_) at timestamp-value.h:322 
> --
>
> Key: IMPALA-7595
> URL: https://issues.apache.org/jira/browse/IMPALA-7595
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 3.1.0
>Reporter: Tim Armstrong
>Assignee: Csaba Ringhofer
>Priority: Blocker
>  Labels: broken-build, crash
>
> See https://jenkins.impala.io/job/ubuntu-16.04-from-scratch/3197/. hash is 
> 23c7d7e57b7868eedbf5a9a4bc4aafd6066a04fb
> Some of the fuzz tests stand out amongst the tests that were running at the 
> same time as the crash, particularly:
>  19:12:17 [gw4] PASSED 
> query_test/test_scanners_fuzz.py::TestScannersFuzzing::test_fuzz_alltypes[exec_option:
>  {'debug_action': '-1:OPEN:SET_DENY_RESERVATION_PROBABILITY@1.0', 
> 'abort_on_error': False, 'mem_limit': '512m', 'num_nodes': 0} | table_format: 
> parquet/none] 






[jira] [Commented] (IMPALA-7527) Expose fetch-from-catalogd cache and latency metrics in profiles

2018-10-02 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16635862#comment-16635862
 ] 

ASF subversion and git services commented on IMPALA-7527:
-

Commit ea2809f5ddcf48f3f41dcd12743e8a17b4ea8cd7 in impala's branch 
refs/heads/master from [~vercego]
[ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=ea2809f ]

Revert "IMPALA-7527: add fetch-from-catalogd cache info to profile"

Update to profile conflicts with downstream dependency (change 2/2).

This reverts commit 8c330adf409aa74857b23ba345f0c710a1f25a32.

Change-Id: Ide56f2cd3ee6a34f716b6b465f6fb5fb944e7db8
Reviewed-on: http://gerrit.cloudera.org:8080/11560
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 


> Expose fetch-from-catalogd cache and latency metrics in profiles
> 
>
> Key: IMPALA-7527
> URL: https://issues.apache.org/jira/browse/IMPALA-7527
> Project: IMPALA
>  Issue Type: Sub-task
>Affects Versions: Impala 3.1.0
>Reporter: Todd Lipcon
>Assignee: Vuk Ercegovac
>Priority: Major
> Fix For: Impala 3.1.0
>
>
> Since we now have some caching and potential remote calls on the planning 
> path, it's important to be able to understand how that contributes to the 
> performance of planning. This JIRA tracks adding such information to the 
> profile.






[jira] [Commented] (IMPALA-7629) TestClientSsl tests seem to be disabled on non-legacy platforms

2018-10-02 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16635860#comment-16635860
 ] 

ASF subversion and git services commented on IMPALA-7629:
-

Commit d3cf6d325779fff4ab0ba9db53411c510aa0f717 in impala's branch 
refs/heads/master from [~philip]
[ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=d3cf6d3 ]

IMPALA-7629: Re-enable erroneously disabled TestClientSsl tests.

The fix for IMPALA-6990 had a bug, disabling some tests erroneously.
With this change, the tests run on Ubuntu16:04 like so:

  tests/custom_cluster/test_client_ssl.py::TestClientSsl::test_ssl[] PASSED
  tests/custom_cluster/test_client_ssl.py::TestClientSsl::test_tls_ecdh[] PASSED
  tests/custom_cluster/test_client_ssl.py::TestClientSsl::test_tls_v12[] PASSED
  tests/custom_cluster/test_client_ssl.py::TestClientSsl::test_wildcard_ssl[] 
xfail
  
tests/custom_cluster/test_client_ssl.py::TestClientSsl::test_wildcard_san_ssl[] 
xfail

The xfails are all "Inconsistent wildcard support on target platforms".

On centos7:

  custom_cluster/test_client_ssl.py::TestClientSsl::test_ssl[] PASSED
  custom_cluster/test_client_ssl.py::TestClientSsl::test_tls_ecdh[] SKIPPED
  custom_cluster/test_client_ssl.py::TestClientSsl::test_tls_v12[] SKIPPED
  custom_cluster/test_client_ssl.py::TestClientSsl::test_wildcard_ssl[] xfail
  custom_cluster/test_client_ssl.py::TestClientSsl::test_wildcard_san_ssl[] xfail

On centos6:

  custom_cluster/test_client_ssl.py::TestClientSsl::test_ssl[] PASSED
  custom_cluster/test_client_ssl.py::TestClientSsl::test_tls_ecdh[] SKIPPED
  custom_cluster/test_client_ssl.py::TestClientSsl::test_tls_v12[] SKIPPED
  custom_cluster/test_client_ssl.py::TestClientSsl::test_wildcard_ssl[] SKIPPED
  custom_cluster/test_client_ssl.py::TestClientSsl::test_wildcard_san_ssl[] SKIPPED

I used the following command to extract these from Jenkins output:

  curl --silent https://.../consoleText | grep test_client_ssl | sed -e 's/\[.*\]/[]/'
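
The grep and sed steps of that command can be approximated in Python. This is a minimal sketch and not part of the patch: the function names are hypothetical, and it operates on console text that has already been downloaded rather than fetching the elided Jenkins URL.

```python
import re


def normalize_test_id(line):
    # Same effect as sed -e 's/\[.*\]/[]/': replace the first (greedy)
    # bracketed span on the line with empty brackets.
    return re.sub(r"\[.*\]", "[]", line, count=1)


def extract_ssl_results(console_text):
    # Same effect as: grep test_client_ssl | sed ...
    return [
        normalize_test_id(line)
        for line in console_text.splitlines()
        if "test_client_ssl" in line
    ]
```

The greedy match is what collapses the long exec_option dictionary inside the brackets into "[]", which makes the per-platform results easy to compare.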

Change-Id: I64879b8af39f967b0059797e7b36421ce0e58bed
Reviewed-on: http://gerrit.cloudera.org:8080/11530
Reviewed-by: Philip Zeyliger 
Tested-by: Impala Public Jenkins 


> TestClientSsl tests seem to be disabled on non-legacy platforms
> ---
>
> Key: IMPALA-7629
> URL: https://issues.apache.org/jira/browse/IMPALA-7629
> Project: IMPALA
>  Issue Type: Bug
>  Components: Infrastructure
>Affects Versions: Impala 3.1.0
> Environment: Ubuntu 16.04, Python 2.7.14
>Reporter: Tim Armstrong
>Assignee: Philip Zeyliger
>Priority: Blocker
>
> I noticed that when I ran some of these tests on Ubuntu 16.04 they were 
> skipped:
> {noformat}
> $ impala-py.test tests/custom_cluster/test_client_ssl.py -k ecdh
> ...
> tests/custom_cluster/test_client_ssl.py::TestClientSsl::test_tls_ecdh[exec_option:
>  {'batch_size': 0, 'num_nodes': 0, 'disable_codegen_rows_threshold': 0, 
> 'disable_codegen': False, 'abort_on_error': 1, 'debug_action': None, 
> 'exec_single_node_rows_threshold': 0} | table_format: text/none] SKIPPED
> {noformat}
> I don't think this is intended. The logic in IMPALA-6990 looks backwards in 
> that HAS_LEGACY_OPENSSL is a non-None integer (i.e. truthy) when the version 
> field exists.
> Assigning to Phil since he reviewed the patch and probably has some context.
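
The truthiness problem described in the issue can be illustrated with a small sketch. It is not the code from IMPALA-6990: the threshold constant and the function names are assumptions made for illustration, and only ssl.OPENSSL_VERSION_NUMBER is a real attribute of Python's ssl module.

```python
import ssl

# Hypothetical threshold, roughly OpenSSL 1.0.1 in the encoding used by
# ssl.OPENSSL_VERSION_NUMBER. The real cutoff is an assumption here.
MIN_MODERN_OPENSSL = 0x10001000


def legacy_flag_buggy(version_number):
    # Mirrors the reported problem: the flag is simply the version number,
    # so it is truthy whenever the version field exists at all.
    return version_number


def is_legacy_openssl(version_number):
    # Treat OpenSSL as legacy only if the version is unknown or too old.
    return version_number is None or version_number < MIN_MODERN_OPENSSL


# Example call against the running interpreter:
CURRENT_IS_LEGACY = is_legacy_openssl(
    getattr(ssl, "OPENSSL_VERSION_NUMBER", None))
```

With the buggy variant, a modern version number such as 0x1000207f is truthy and would be classified as legacy, which matches the unexpected SKIPPED result shown in the issue.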






[jira] [Commented] (IMPALA-7646) SHOW GRANT USER not working on kerberized clusters

2018-10-02 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16635857#comment-16635857
 ] 

ASF subversion and git services commented on IMPALA-7646:
-

Commit a381483e658f824d1639096e03f1a23b2a216c41 in impala's branch 
refs/heads/master from [~aholley]
[ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=a381483 ]

IMPALA-7646: SHOW GRANT USER does not work for kerberos cluster

This patch fixes the SHOW GRANT USER statement to properly check
that the requesting user short name matches the name in the
SHOW GRANT USER statement to determine whether or not an admin
check is required for showing the privileges. Prior to this
patch, the full Kerberos user name, e.g. foo_user@REALM, was
compared against the name in "SHOW GRANT USER foo_user" and did
not match, so admin privileges were required.
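
The comparison can be sketched as follows. This is an illustration only: the helper names are hypothetical, and the short name is derived by a plain split on "/" and "@", whereas a real deployment may map principals to short names with configurable rules.

```python
def short_name(principal):
    # "foo_user/host@REALM" -> "foo_user"; a bare name is returned unchanged.
    return principal.split("@", 1)[0].split("/", 1)[0]


def needs_admin_check(requesting_user, target_user):
    # Users may list their own privileges; anyone else needs an admin check.
    return short_name(requesting_user) != target_user
```

Comparing the full principal "foo_user@REALM" with "foo_user" directly would never match, which is the behavior the patch corrects.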

Testing:
- Ran all fe and custom cluster tests.
- Validated against kerberized cluster.

Change-Id: Iba4c627b72c8cbc323be25917698a75d153afd31
Reviewed-on: http://gerrit.cloudera.org:8080/11553
Reviewed-by: Fredy Wijaya 
Tested-by: Impala Public Jenkins 


> SHOW GRANT USER not working on kerberized clusters
> --
>
> Key: IMPALA-7646
> URL: https://issues.apache.org/jira/browse/IMPALA-7646
> Project: IMPALA
>  Issue Type: Bug
>  Components: Frontend
>Affects Versions: Impala 3.1.0
>Reporter: Adam Holley
>Assignee: Adam Holley
>Priority: Major
> Fix For: Impala 3.1.0
>
>
> SHOW GRANT USER foo_user;
> does not work on kerberized clusters because the requester name does not 
> match the user's name.






[jira] [Commented] (IMPALA-7532) Add retry/back-off to fetch-from-catalog RPCs

2018-10-02 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16635859#comment-16635859
 ] 

ASF subversion and git services commented on IMPALA-7532:
-

Commit ade399c08f5a74f87d192b47d2b65c3b56d05f7c in impala's branch 
refs/heads/master from [~tianyiwang]
[ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=ade399c ]

IMPALA-7532: Add catalogd client backoff time into impalad CLI options

Impala may fail queries or fail to start if the connection to catalogd
cannot be established. Impala already has a retry mechanism but the
backoff time is currently 0. This patch adds an option
"catalog_client_rpc_retry_interval_ms" for it, defaulting to 10 seconds.

Change-Id: I924c1f2fd37021f4c8fb6b46aa278ac4b1aee131
Reviewed-on: http://gerrit.cloudera.org:8080/11543
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 


> Add retry/back-off to fetch-from-catalog RPCs
> -
>
> Key: IMPALA-7532
> URL: https://issues.apache.org/jira/browse/IMPALA-7532
> Project: IMPALA
>  Issue Type: Sub-task
>Reporter: Todd Lipcon
>Assignee: Tianyi Wang
>Priority: Major
>
> Currently if there is an error connecting to the catalog server, the 'fetch 
> from catalog' implementation will retry with no apparent backoff. We should 
> retry for some period of time with backoff in between the attempts, so that 
> impala can ride over short interruptions of the catalog service.
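
A retry loop with a fixed pause between attempts, as requested in this issue, might look like the sketch below. It is written in Python for brevity and is not the impalad implementation: the function name, the max_attempts parameter, and the use of ConnectionError are assumptions. Only the 10 second default interval comes from the commit message.

```python
import time


def call_with_retry(rpc, max_attempts=3, retry_interval_ms=10000,
                    sleep=time.sleep):
    # Call rpc() until it succeeds or max_attempts is reached, pausing
    # retry_interval_ms between failed attempts.
    last_error = None
    for attempt in range(1, max_attempts + 1):
        try:
            return rpc()
        except ConnectionError as error:
            last_error = error
            if attempt < max_attempts:
                sleep(retry_interval_ms / 1000.0)
    raise last_error
```

The sleep function is injectable so that the pause can be observed in a test without actually waiting. With an interval of 0 the loop degenerates into the back-to-back retries described in the issue.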






[jira] [Commented] (IMPALA-7520) NPE in SentryProxy

2018-10-02 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16635858#comment-16635858
 ] 

ASF subversion and git services commented on IMPALA-7520:
-

Commit de39b0331e1e000162a93bfb90888e2dfbadcd13 in impala's branch 
refs/heads/master from [~fredyw]
[ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=de39b03 ]

IMPALA-7520: Fix NullPointerException in SentryProxy

Prior to this patch, the code in SentryProxy could throw a
NullPointerException when trying to retrieve a set of privileges for a
given role name. I was able to manually reproduce the issue by doing
the following steps:

1. Get all Sentry role privileges: [a, b] --> in SentryProxy
2. Add a sleep statement before getting all Sentry roles --> in SentryProxy
3. Add a new Sentry role: [c] --> Externally via Sentry CLI
4. Get all Sentry roles: [a, b, c] --> in SentryProxy
   Role c was added in step 3.
5. Get Sentry role privileges for role c: NPE --> in SentryProxy

The fix is to add a null guard when retrieving Sentry privileges for a
given role name and let the new role get updated in the next Sentry
refresh.
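
The race in the steps above, and the guard that avoids it, can be sketched in Python. This is not the SentryProxy code (which is Java): the function and variable names are hypothetical, and the role list and privilege map stand in for two Sentry snapshots taken at different times.

```python
def refresh_role_privileges(role_names, privileges_by_role):
    # role_names: roles fetched in step 4 (may include newly added roles).
    # privileges_by_role: privileges fetched earlier, in step 1.
    refreshed = {}
    skipped = []
    for role in role_names:
        privileges = privileges_by_role.get(role)
        if privileges is None:
            # The role was created after the privilege snapshot was taken.
            # Skip it and let the next refresh pick it up.
            skipped.append(role)
            continue
        refreshed[role] = set(privileges)
    return refreshed, skipped
```

Without the None check, iterating over the privileges of role c would fail in the same way as the NullPointerException in the report.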

Testing:
- Ran all FE tests
- Ran all authorization E2E tests
- Manually tested it by temporarily modifying the SentryProxy code and
  did not see the NullPointerException

Change-Id: I36af840056a4d037fb5c7b1d9a167c0eb8526a11
Reviewed-on: http://gerrit.cloudera.org:8080/11552
Reviewed-by: Fredy Wijaya 
Tested-by: Impala Public Jenkins 


> NPE in SentryProxy
> --
>
> Key: IMPALA-7520
> URL: https://issues.apache.org/jira/browse/IMPALA-7520
> Project: IMPALA
>  Issue Type: Bug
>  Components: Frontend
>Affects Versions: Impala 3.1.0
>Reporter: Adam Holley
>Assignee: Fredy Wijaya
>Priority: Major
> Fix For: Impala 3.1.0
>
>
> In SentryProxy.refreshPrivilegesInCatalog(), the call to 
> allPrincipalPrivileges.get(principal.getName()) sometimes returns null.
> {noformat}
> java.lang.NullPointerException
>     at org.apache.impala.util.SentryProxy$PolicyReader.refreshPrivilegesInCatalog(SentryProxy.java:245)
>     at org.apache.impala.util.SentryProxy$PolicyReader.refreshRolePrivileges(SentryProxy.java:197)
>     at org.apache.impala.util.SentryProxy$PolicyReader.run(SentryProxy.java:139)
>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>     at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
>     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
>     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>     at java.lang.Thread.run(Thread.java:748)
> {noformat}
>  






[jira] [Commented] (IMPALA-5031) UBSAN clean and method for testing UBSAN cleanliness

2018-10-02 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-5031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16636191#comment-16636191
 ] 

ASF subversion and git services commented on IMPALA-5031:
-

Commit 20bde289ebb6d1097b1afab7ad171498f4d164a7 in impala's branch 
refs/heads/master from [~jbapple]
[ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=20bde28 ]

IMPALA-5031: null ptr errors in C calls in BE tests

This patch fixes all remaining UBSAN "null pointer passed as argument"
errors in the backend tests. These are undefined behavior according to
"7.1.4 Use of library functions" in the C99 standard (which is
included in C++14 in section [intro.refs]):

If an argument to a function has an invalid value (such as a value
outside the domain of the function, or a pointer outside the
address space of the program, or a null pointer, or a pointer to
non-modifiable storage when the corresponding parameter is not
const-qualified) or a type (after promotion) not expected by a
function with variable number of arguments, the behavior is
undefined.

The interesting parts of the backtraces for the errors fixed in this
patch are below:

exprs/string-functions-ir.cc:311:17: runtime error: null pointer passed as argument 2, which is declared to never be null
/usr/include/string.h:43:45: note: nonnull attribute specified here
  #0 StringFunctions::Replace(impala_udf::FunctionContext*, impala_udf::StringVal const&, impala_udf::StringVal const&, impala_udf::StringVal const&) exprs/string-functions-ir.cc:311:5
  #1 impala_udf::StringVal ScalarFnCall::InterpretEval(ScalarExprEvaluator*, TupleRow const*) const exprs/scalar-fn-call.cc:485:580
  #2 ScalarFnCall::GetStringVal(ScalarExprEvaluator*, TupleRow const*) const exprs/scalar-fn-call.cc:599:44
  #3 ScalarExprEvaluator::GetValue(ScalarExpr const&, TupleRow const*) exprs/scalar-expr-evaluator.cc:299:38
  #4 ScalarExprEvaluator::GetValue(TupleRow const*) exprs/scalar-expr-evaluator.cc:250:10
  #5 void Tuple::MaterializeExprs(TupleRow*, TupleDescriptor const&, ScalarExprEvaluator* const*, MemPool*, StringValue**, int*, int*) runtime/tuple.cc:222:27
  #6 void Tuple::MaterializeExprs(TupleRow*, TupleDescriptor const&, vector const&, MemPool*, vector*, int*) runtime/tuple.h:174:5
  #7 UnionNode::MaterializeExprs(vector const&, TupleRow*, unsigned char*, RowBatch*) exec/union-node-ir.cc:29:14
  #8 UnionNode::GetNextConst(RuntimeState*, RowBatch*) exec/union-node.cc:263:5
  #9 UnionNode::GetNext(RuntimeState*, RowBatch*, bool*) exec/union-node.cc:296:45
  #10 FragmentInstanceState::ExecInternal() runtime/fragment-instance-state.cc:310:59
  #11 FragmentInstanceState::Exec() runtime/fragment-instance-state.cc:95:14
  #12 QueryState::ExecFInstance(FragmentInstanceState*) runtime/query-state.cc:488:24
  #13 QueryState::StartFInstances()::$_0::operator()() const runtime/query-state.cc:416:35
  #20 thread_proxy (exprs/expr-test+0x55ca939)

exprs/string-functions-ir.cc:868:15: runtime error: null pointer passed as argument 2, which is declared to never be null
/usr/include/string.h:43:45: note: nonnull attribute specified here
  #0 StringFunctions::ConcatWs(impala_udf::FunctionContext*, impala_udf::StringVal const&, int, impala_udf::StringVal const*) exprs/string-functions-ir.cc:868:3
  #1 impala_udf::StringVal ScalarFnCall::InterpretEval(ScalarExprEvaluator*, TupleRow const*) const exprs/scalar-fn-call.cc:510:270
  #2 ScalarFnCall::GetStringVal(ScalarExprEvaluator*, TupleRow const*) const exprs/scalar-fn-call.cc:599:44
  #3 ScalarExprEvaluator::GetValue(ScalarExpr const&, TupleRow const*) exprs/scalar-expr-evaluator.cc:299:38
  #4 ScalarExprEvaluator::GetValue(TupleRow const*) exprs/scalar-expr-evaluator.cc:250:10
  #5 void Tuple::MaterializeExprs(TupleRow*, TupleDescriptor const&, ScalarExprEvaluator* const*, MemPool*, StringValue**, int*, int*) runtime/tuple.cc:222:27
  #6 void Tuple::MaterializeExprs(TupleRow*, TupleDescriptor const&, vector const&, MemPool*, vector*, int*) runtime/tuple.h:174:5
  #7 UnionNode::MaterializeExprs(vector const&, TupleRow*, unsigned char*, RowBatch*) exec/union-node-ir.cc:29:14
  #8 UnionNode::GetNextConst(RuntimeState*, RowBatch*) exec/union-node.cc:263:5
  #9 UnionNode::GetNext(RuntimeState*, RowBatch*, bool*) exec/union-node.cc:296:45
  #10 FragmentInstanceState::ExecInternal() runtime/fragment-instance-state.cc:310:59
  #11 FragmentInstanceState::Exec() runtime/fragment-instance-state.cc:95:14
  #12 QueryState::ExecFInstance(FragmentInstanceState*) runtime/query-state.cc:488:24
  #13 QueryState::StartFInstances()::$_0::operator()() const runtime/query-state.cc:416:35
  #20 thread_proxy (exprs/expr-test+0x55ca939)

exprs/string-functions-ir.cc:871:17: runtime error: null pointer passed as 
argument 2, which is declared to never b

[jira] [Commented] (IMPALA-7585) Always set user credentials after creating a KRPC proxy

2018-10-02 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16636192#comment-16636192
 ] 

ASF subversion and git services commented on IMPALA-7585:
-

Commit e1d1b4f14f1a2d8cab378b419eed3c4e4590a311 in impala's branch 
refs/heads/master from Michael Ho
[ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=e1d1b4f ]

IMPALA-7585: Always set user credentials after creating RPC proxy

kudu::rpc::Proxy() ctor may fail in GetLoggedInUser() for various reasons
(e.g. missing certain libraries). This resulted in an empty username being
used in kudu::rpc::ConnectionId. With plaintext SASL (e.g. in an insecure
Impala cluster), this may result in the following error during connection
negotiation:

Not authorized: Client connection negotiation failed: client connection to 
127.0.0.1:27000: SASL(-1): generic failure: All-whitespace username.

In Impala, we don't consider failing GetLoggedInUser() a fatal error.
This change fixes the issue above by always explicitly setting the
username after creating the proxy. The username is "impala". Please
note that this username is not really used anywhere for authorization
for RPC services. Authorization is only done when authentication is
enabled with Kerberos. With Kerberos enabled, the username is derived
from the Kerberos principal instead of the user credentials set in
the ConnectionId. It's there mostly to satisfy the SASL plaintext case.

rpc-mgr-test has been updated to test for this string when Kerberos is
disabled.

Testing done: core test; rpc-mgr-test; rpc-mgr-kerberized-test

Change-Id: I75059f55bcdb8f95916610100cad4d8280daf3f6
Reviewed-on: http://gerrit.cloudera.org:8080/11477
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 


> Always set user credentials after creating a KRPC proxy
> ---
>
> Key: IMPALA-7585
> URL: https://issues.apache.org/jira/browse/IMPALA-7585
> Project: IMPALA
>  Issue Type: Bug
>  Components: Distributed Exec
>Affects Versions: Impala 3.0, Impala 2.12.0
>Reporter: Michael Ho
>Assignee: Michael Ho
>Priority: Major
>
> {{kudu::rpc::Proxy}} ctor may fail in {{GetLoggedInUser()}} for various 
> reasons:
> {noformat}
> Error calling getpwuid_r(): No such file or directory (error 2). 
> {noformat}
> This resulted in an empty user name being used in 
> {{kudu::rpc::ConnectionId}}. With plaintext SASL (e.g. in an insecure Impala 
> cluster), this may result in the following error:
> {noformat}
> Not authorized: Client connection negotiation failed: client connection to 
> 127.0.0.1:27000: SASL(-1): generic failure: All-whitespace username.
> {noformat}
> While one can argue that Kudu should fall back to some default username (e.g. 
> "cpp-client") when {{GetLoggedInUserName()}} fails, it may have non-trivial 
> consequences (e.g. generating an authn token with some random username on one 
> machine while using the real user name on another machine). Therefore, it's 
> best for Impala to explicitly set the user credentials 
> (impala/) after creating the proxy.






[jira] [Commented] (IMPALA-7521) CLONE - Speed up sub-second unix time->TimestampValue conversions

2018-10-04 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16638930#comment-16638930
 ] 

ASF subversion and git services commented on IMPALA-7521:
-

Commit d301600a85f399863d89c281a9a1dc3091d52fcc in impala's branch 
refs/heads/master from [~csringhofer]
[ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=d301600 ]

Revert "IMPALA-7595: Revert "IMPALA-7521: Speed up sub-second unix 
time->TimestampValue conversions""

IMPALA-7595 added proper handling for invalid time-of-day values
in Parquet, so the DCHECK mentioned in IMPALA-7595 will no longer
be hit. This means that IMPALA-7521 can be committed again without
causing problems.

This reverts commit f8b472ee6442e31a867a6dd6aaac22cc44291d41.

Change-Id: Ibab04bc6ad09db331220312ed21d90622fdfc41b
Reviewed-on: http://gerrit.cloudera.org:8080/11573
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 


> CLONE - Speed up sub-second unix time->TimestampValue conversions
> -
>
> Key: IMPALA-7521
> URL: https://issues.apache.org/jira/browse/IMPALA-7521
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Backend
>Reporter: Csaba Ringhofer
>Assignee: Csaba Ringhofer
>Priority: Major
>  Labels: performance, timestamp
>
> Currently Impala converts from sub-second unix time to TimestampValue (which 
> is split into date_ and time_ similarly to boost::posix_time::ptime) by first 
> splitting the input into seconds and sub-seconds parts, converting the seconds 
> part with boost::posix_time::from_time_t(), and then adding the sub-seconds 
> part to this timestamp. This can be done much faster by splitting the 
> sub-second input into date_ and time_ directly.
> Avoiding boost::posix_time::from_time_t() would also be nice because it can 
> only deal with timestamps from 1677 to 2262, which adds extra complexity to 
> the related code.
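
The direct split described above amounts to one floor division. The sketch below shows the arithmetic in Python for microsecond input; it is an illustration under that assumption and not the C++ TimestampValue code, and the function and constant names are hypothetical.

```python
from datetime import date, timedelta

MICROS_PER_DAY = 24 * 60 * 60 * 1000000
EPOCH = date(1970, 1, 1)


def split_unix_micros(unix_micros):
    # Floor division keeps the time-of-day part non-negative, so instants
    # before 1970 land on the previous day with a positive time of day.
    days, micros_of_day = divmod(unix_micros, MICROS_PER_DAY)
    return EPOCH + timedelta(days=days), micros_of_day
```

For example, an input of -1 microsecond yields 1969-12-31 with a time of day one microsecond before midnight. No intermediate seconds-based timestamp is built and no sub-second part has to be added afterwards.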






[jira] [Commented] (IMPALA-7595) Check failed: IsValidTime(time_) at timestamp-value.h:322

2018-10-04 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16638928#comment-16638928
 ] 

ASF subversion and git services commented on IMPALA-7595:
-

Commit d301600a85f399863d89c281a9a1dc3091d52fcc in impala's branch 
refs/heads/master from [~csringhofer]
[ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=d301600 ]

Revert "IMPALA-7595: Revert "IMPALA-7521: Speed up sub-second unix 
time->TimestampValue conversions""

IMPALA-7595 added proper handling for invalid time-of-day values
in Parquet, so the DCHECK mentioned in IMPALA-7595 will no longer
be hit. This means that IMPALA-7521 can be committed again without
causing problems.

This reverts commit f8b472ee6442e31a867a6dd6aaac22cc44291d41.

Change-Id: Ibab04bc6ad09db331220312ed21d90622fdfc41b
Reviewed-on: http://gerrit.cloudera.org:8080/11573
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 


> Check failed: IsValidTime(time_) at timestamp-value.h:322 
> --
>
> Key: IMPALA-7595
> URL: https://issues.apache.org/jira/browse/IMPALA-7595
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 3.1.0
>Reporter: Tim Armstrong
>Assignee: Csaba Ringhofer
>Priority: Blocker
>  Labels: broken-build, crash
> Fix For: Impala 3.1.0
>
>
> See https://jenkins.impala.io/job/ubuntu-16.04-from-scratch/3197/. The commit 
> hash is 23c7d7e57b7868eedbf5a9a4bc4aafd6066a04fb.
> Some of the fuzz tests stand out amongst the tests that were running at the 
> same time as the crash, particularly:
>  19:12:17 [gw4] PASSED 
> query_test/test_scanners_fuzz.py::TestScannersFuzzing::test_fuzz_alltypes[exec_option:
>  {'debug_action': '-1:OPEN:SET_DENY_RESERVATION_PROBABILITY@1.0', 
> 'abort_on_error': False, 'mem_limit': '512m', 'num_nodes': 0} | table_format: 
> parquet/none] 






[jira] [Commented] (IMPALA-6634) Impala 2.13 Doc: Document leading and trailing whitespace behaviour with string->timestamp conversion

2018-10-04 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-6634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16638923#comment-16638923
 ] 

ASF subversion and git services commented on IMPALA-6634:
-

Commit ccec241637e78a1432653330a45540127eb3e7df in impala's branch 
refs/heads/master from [~arodoni_cloudera]
[ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=ccec241 ]

IMPALA-6634: [DOCS] Whitespace behavior in string to timestamp conversion

Noted that leading and trailing whitespace is ignored when a string is
implicitly or explicitly converted to a timestamp.

Change-Id: Id5e485f0dccd2e6e1351d6d8194ec94c753f60b3
Reviewed-on: http://gerrit.cloudera.org:8080/11567
Tested-by: Impala Public Jenkins 
Reviewed-by: Tim Armstrong 


> Impala 2.13 Doc: Document leading and trailing whitespace behaviour with 
> string->timestamp conversion
> -
>
> Key: IMPALA-6634
> URL: https://issues.apache.org/jira/browse/IMPALA-6634
> Project: IMPALA
>  Issue Type: Sub-task
>  Components: Docs
>Affects Versions: Impala 3.1.0
>Reporter: Tim Armstrong
>Assignee: Alex Rodoni
>Priority: Major
>  Labels: docs
>
> Just putting this on your radar. The docs don't mention this behaviour. It 
> seems fairly minor but may be useful to clarify. 
> Ref: 
> https://impala.apache.org/docs/build/html/topics/impala_literals.html#timestamp_literals






[jira] [Commented] (IMPALA-7581) Hang in buffer-pool-test

2018-10-04 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16638931#comment-16638931
 ] 

ASF subversion and git services commented on IMPALA-7581:
-

Commit 93606e6046054526c093975894efc1eef8a53bc1 in impala's branch 
refs/heads/master from [~tarmstr...@cloudera.com]
[ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=93606e6 ]

IMPALA-7581: timeout backend tests after 2 hours

This will abort the process if a backend test is taking too long, which
we assume is because of a hang. This makes job failures easier to triage
and may make it easier to debug failures if we collect coredumps or
minidumps.

Also disable the death tests for ASAN under the theory that the
probability of the hang is higher than in a regular DEBUG build.

Testing:
Reduced the timeout to 5s and confirmed that it worked.
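
The idea of aborting a hung test process can be sketched with a watchdog timer. This is a Python illustration only, not the backend test harness change: the function name and the injectable on_timeout callback are assumptions. os.abort() and threading.Timer are standard library calls.

```python
import os
import threading


def start_watchdog(timeout_s, on_timeout=os.abort):
    # Fire on_timeout if the process is still running after timeout_s
    # seconds. os.abort() raises SIGABRT, which can produce a core dump.
    timer = threading.Timer(timeout_s, on_timeout)
    timer.daemon = True  # do not keep the process alive just for the timer
    timer.start()
    return timer
```

A test runner would call start_watchdog(2 * 60 * 60) at startup and cancel() the returned timer on normal completion. Passing a short timeout and a harmless callback is one way to check the mechanism, similar to the 5s check described above.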

Change-Id: I2e4ef9cb0549ead0bae57b11489f6a4d9b44ef95
Reviewed-on: http://gerrit.cloudera.org:8080/11533
Tested-by: Impala Public Jenkins 
Reviewed-by: Tim Armstrong 


> Hang in buffer-pool-test
> 
>
> Key: IMPALA-7581
> URL: https://issues.apache.org/jira/browse/IMPALA-7581
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 3.1.0
>Reporter: Thomas Tauber-Marshall
>Assignee: Tim Armstrong
>Priority: Critical
>  Labels: broken-build, flaky
> Fix For: Impala 3.1.0
>
> Attachments: gdb.txt
>
>
> We have observed a hang in buffer-pool-test in an ASAN build. Unfortunately, no 
> logs were generated with any info about what might have happened.






[jira] [Commented] (IMPALA-7527) Expose fetch-from-catalogd cache and latency metrics in profiles

2018-10-04 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16638925#comment-16638925
 ] 

ASF subversion and git services commented on IMPALA-7527:
-

Commit e6bbe4eaf5ba606ea3f4f1ed3360ecf9172a9ec3 in impala's branch 
refs/heads/master from [~tlipcon]
[ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=e6bbe4e ]

IMPALA-7527: add fetch-from-catalogd cache info to profile

Reapplies reverted IMPALA-7527. Adding a top-level entry to the profile
broke downstream consumers. The change here is to add the additional stats
to the summary profile.

This patch adds a Java wrapper for a RuntimeProfile object. The wrapper
supports some basic operations like non-hierarchical counters and
informational strings.

During planning, a profile is created, and passed back to the backend as
part of the ExecRequest. The backend then updates the query profile
based on the info emitted from the frontend.

This patch also adds the first use case for this profile information:
the CatalogdMetaProvider emits counters for cache hits, misses, and
fetch times, broken down by metadata category.

The emitted profile is a bit of a superset of the existing 'timeline'
functionality. However, it seems that some tools may parse the timeline
in its current location in the profile, so moving it might be
incompatible. I elected to leave that alone for now and just emit
counters in the new profile.
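
As a rough picture of what a flat profile with per-category cache counters could look like, here is a sketch in Python. It is not the Java wrapper added by this patch: the class, the function, and the counter names are all hypothetical.

```python
from collections import defaultdict


class PlanningProfile:
    """Hypothetical flat profile: named counters plus informational strings."""

    def __init__(self):
        self.counters = defaultdict(int)
        self.info_strings = {}

    def add_to_counter(self, name, delta):
        self.counters[name] += delta

    def set_info_string(self, key, value):
        self.info_strings[key] = value


def record_fetch(profile, category, hit, fetch_time_ms=0):
    # Count a cache hit or miss for one metadata category, and accumulate
    # the fetch time only when the value had to be fetched.
    prefix = "CatalogFetch." + category
    if hit:
        profile.add_to_counter(prefix + ".Hits", 1)
    else:
        profile.add_to_counter(prefix + ".Misses", 1)
        profile.add_to_counter(prefix + ".TimeMs", fetch_time_ms)
```

Keeping the counters non-hierarchical means the planner only has to emit name and value pairs, which the backend can then merge into the query profile.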

Change-Id: I419be157168cddb7521ea61e8f86733306b9315e
Reviewed-on: http://gerrit.cloudera.org:8080/11569
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 


> Expose fetch-from-catalogd cache and latency metrics in profiles
> 
>
> Key: IMPALA-7527
> URL: https://issues.apache.org/jira/browse/IMPALA-7527
> Project: IMPALA
>  Issue Type: Sub-task
>Affects Versions: Impala 3.1.0
>Reporter: Todd Lipcon
>Assignee: Vuk Ercegovac
>Priority: Major
> Fix For: Impala 3.1.0
>
>
> Since we now have some caching and potential remote calls on the planning 
> path, it's important to be able to understand how that contributes to the 
> performance of planning. This JIRA tracks adding such information to the 
> profile.






[jira] [Commented] (IMPALA-7595) Check failed: IsValidTime(time_) at timestamp-value.h:322

2018-10-04 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16638929#comment-16638929
 ] 

ASF subversion and git services commented on IMPALA-7595:
-

Commit d301600a85f399863d89c281a9a1dc3091d52fcc in impala's branch 
refs/heads/master from [~csringhofer]
[ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=d301600 ]

Revert "IMPALA-7595: Revert "IMPALA-7521: Speed up sub-second unix 
time->TimestampValue conversions""

IMPALA-7595 added proper handling for invalid time-of-day values
in Parquet, so the DCHECK mentioned in IMPALA-7595 will no longer
be hit. This means that IMPALA-7521 can be committed again without
causing problems.

This reverts commit f8b472ee6442e31a867a6dd6aaac22cc44291d41.

Change-Id: Ibab04bc6ad09db331220312ed21d90622fdfc41b
Reviewed-on: http://gerrit.cloudera.org:8080/11573
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 


> Check failed: IsValidTime(time_) at timestamp-value.h:322 
> --
>
> Key: IMPALA-7595
> URL: https://issues.apache.org/jira/browse/IMPALA-7595
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 3.1.0
>Reporter: Tim Armstrong
>Assignee: Csaba Ringhofer
>Priority: Blocker
>  Labels: broken-build, crash
> Fix For: Impala 3.1.0
>
>
> See https://jenkins.impala.io/job/ubuntu-16.04-from-scratch/3197/. hash is 
> 23c7d7e57b7868eedbf5a9a4bc4aafd6066a04fb
> Some of the fuzz tests stand out amongst the tests that were running at the 
> same time as the crash, particularly:
>  19:12:17 [gw4] PASSED 
> query_test/test_scanners_fuzz.py::TestScannersFuzzing::test_fuzz_alltypes[exec_option:
>  {'debug_action': '-1:OPEN:SET_DENY_RESERVATION_PROBABILITY@1.0', 
> 'abort_on_error': False, 'mem_limit': '512m', 'num_nodes': 0} | table_format: 
> parquet/none] 






[jira] [Commented] (IMPALA-7527) Expose fetch-from-catalogd cache and latency metrics in profiles

2018-10-04 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16638924#comment-16638924
 ] 

ASF subversion and git services commented on IMPALA-7527:
-

Commit e6bbe4eaf5ba606ea3f4f1ed3360ecf9172a9ec3 in impala's branch 
refs/heads/master from [~tlipcon]
[ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=e6bbe4e ]

IMPALA-7527: add fetch-from-catalogd cache info to profile

Reapplies reverted IMPALA-7527. Adding a top-level entry to the profile
broke downstream consumers. The change here is to add the additional stats
to the summary profile.

This patch adds a Java wrapper for a RuntimeProfile object. The wrapper
supports some basic operations like non-hierarchical counters and
informational strings.

During planning, a profile is created, and passed back to the backend as
part of the ExecRequest. The backend then updates the query profile
based on the info emitted from the frontend.

This patch also adds the first use case for this profile information:
the CatalogdMetaProvider emits counters for cache hits, misses, and
fetch times, broken down by metadata category.

The emitted profile is a bit of a superset of the existing 'timeline'
functionality. However, it seems that some tools may parse the timeline
in its current location in the profile, so moving it might be
incompatible. I elected to leave that alone for now and just emit
counters in the new profile.

Change-Id: I419be157168cddb7521ea61e8f86733306b9315e
Reviewed-on: http://gerrit.cloudera.org:8080/11569
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 


> Expose fetch-from-catalogd cache and latency metrics in profiles
> 
>
> Key: IMPALA-7527
> URL: https://issues.apache.org/jira/browse/IMPALA-7527
> Project: IMPALA
>  Issue Type: Sub-task
>Affects Versions: Impala 3.1.0
>Reporter: Todd Lipcon
>Assignee: Vuk Ercegovac
>Priority: Major
> Fix For: Impala 3.1.0
>
>
> Since we now have some caching and potential remote calls on the planning 
> path, it's important to be able to understand how that contributes to the 
> performance of planning. This JIRA tracks adding such information to the 
> profile.






[jira] [Commented] (IMPALA-7575) Fix doc for fmod, mod and %

2018-10-04 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16639069#comment-16639069
 ] 

ASF subversion and git services commented on IMPALA-7575:
-

Commit 80edf37010884d6aa7edae818d8219918ea0809e in impala's branch 
refs/heads/master from [~arodoni_cloudera]
[ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=80edf37 ]

IMPALA-7575: [DOCS] FMOD() is not the same as the % operator

- Removed the text about FMOD being equivalent to %.
- Added a note that MOD will show as % in the query plan.

Change-Id: I3b02d3e3f556d93e1d651eaee12217d6b0e3f9e0
Reviewed-on: http://gerrit.cloudera.org:8080/11586
Reviewed-by: Tim Armstrong 
Tested-by: Impala Public Jenkins 


> Fix doc for fmod, mod and % 
> 
>
> Key: IMPALA-7575
> URL: https://issues.apache.org/jira/browse/IMPALA-7575
> Project: IMPALA
>  Issue Type: Bug
>Affects Versions: Impala 3.1.0
>Reporter: Yongjun Zhang
>Assignee: Alex Rodoni
>Priority: Major
> Fix For: Impala 3.1.0
>
>
> This Jira is intended to fix doc issues related to problem described in 
> IMPALA-6202.
>  






[jira] [Commented] (IMPALA-7351) Add memory estimates for plan nodes and sinks with missing estimates

2018-10-04 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16639067#comment-16639067
 ] 

ASF subversion and git services commented on IMPALA-7351:
-

Commit 3fabc2de4771349079bcd9dc8bdcb267f43b2a6b in impala's branch 
refs/heads/master from [~bikram.sngh91]
[ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=3fabc2d ]

IMPALA-7351: Improve memory estimates for Kudu Scan Nodes

This patch adds memory estimates for kudu scan nodes based on
empirically derived estimates for the scan's memory consumption
that were added in IMPALA-7096.

Testing:
Modified resource requirements planner test.

Change-Id: If9bb52530271b0bff91311a67d222a2e9fac1229
Reviewed-on: http://gerrit.cloudera.org:8080/11440
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 


> Add memory estimates for plan nodes and sinks with missing estimates
> 
>
> Key: IMPALA-7351
> URL: https://issues.apache.org/jira/browse/IMPALA-7351
> Project: IMPALA
>  Issue Type: Sub-task
>  Components: Frontend
>Reporter: Tim Armstrong
>Assignee: Bikramjeet Vig
>Priority: Major
>  Labels: resource-management
>
> Many plan nodes and sinks, e.g. KuduScanNode, KuduTableSink, ExchangeNode, 
> etc., are missing memory estimates entirely. 
> We should add a basic estimate for all these cases based on experiments and 
> data from real workloads. In some cases 0 may be the right estimate (e.g. for 
> streaming nodes like SelectNode that just pass through data) but we should 
> remove TODOs and document the reasoning in those cases.






[jira] [Commented] (IMPALA-7096) Ensure no memory limit exceeded regressions from IMPALA-4835 because of non-reserved memory

2018-10-04 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16639068#comment-16639068
 ] 

ASF subversion and git services commented on IMPALA-7096:
-

Commit 3fabc2de4771349079bcd9dc8bdcb267f43b2a6b in impala's branch 
refs/heads/master from [~bikram.sngh91]
[ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=3fabc2d ]

IMPALA-7351: Improve memory estimates for Kudu Scan Nodes

This patch adds memory estimates for kudu scan nodes based on
empirically derived estimates for the scan's memory consumption
that were added in IMPALA-7096.

Testing:
Modified resource requirements planner test.

Change-Id: If9bb52530271b0bff91311a67d222a2e9fac1229
Reviewed-on: http://gerrit.cloudera.org:8080/11440
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 


> Ensure no memory limit exceeded regressions from IMPALA-4835 because of 
> non-reserved memory
> ---
>
> Key: IMPALA-7096
> URL: https://issues.apache.org/jira/browse/IMPALA-7096
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 2.13.0, Impala 3.1.0
>Reporter: Tim Armstrong
>Assignee: Tim Armstrong
>Priority: Blocker
>  Labels: resource-management
> Fix For: Impala 3.1.0
>
> Attachments: ScanConsumingMostMemory.txt
>
>
> IMPALA-7078 showed some cases where non-buffer memory could accumulate in the 
> row batch queue and cause memory consumption problems.
> The decision for whether to spin up a scanner thread in IMPALA-4835 
> implicitly assumes that buffer memory is the bulk of memory consumed by a 
> scan, but there may be cases where that is not true and the previous 
> heuristic would be more conservative about starting a scanner thread.
> We should investigate this further and figure out how to avoid it if there's 
> an issue.






[jira] [Commented] (IMPALA-7554) Update custom cluster tests to have sentry create new log on each start

2018-10-12 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16648716#comment-16648716
 ] 

ASF subversion and git services commented on IMPALA-7554:
-

Commit 21f521a7c280031e33cde7c61a979683c5abed00 in impala's branch 
refs/heads/master from [~aholley]
[ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=21f521a ]

IMPALA-7554: Update custom cluster tests to have new logs for sentry

This patch adds the ability to create a new log for each spawn of the
sentry service. This will enable better troubleshooting for the
custom cluster tests that restart the sentry service.

Testing:
- Ran all custom cluster tests.

Change-Id: I6e538af7fd6e6ea21dc3f4442bdebf3b31558516
Reviewed-on: http://gerrit.cloudera.org:8080/11624
Reviewed-by: Fredy Wijaya 
Tested-by: Impala Public Jenkins 


> Update custom cluster tests to have sentry create new log on each start
> ---
>
> Key: IMPALA-7554
> URL: https://issues.apache.org/jira/browse/IMPALA-7554
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Frontend
>Affects Versions: Impala 3.1.0
>Reporter: Adam Holley
>Assignee: Adam Holley
>Priority: Trivial
> Fix For: Impala 3.1.0
>
>
> Currently when the sentry service is restarted in various custom cluster 
> tests, the previous log file is overwritten.  It should use a unique log file 
> name to preserve any errors.
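
One way to obtain a unique log file name per start is to embed a timestamp in the name. The Python sketch below is a hypothetical illustration and is not taken from the patch: the helper `unique_log_path`, the `sentry` prefix, and the naming scheme are all assumptions.

```python
import datetime
import os


def unique_log_path(log_dir, service="sentry", now=None):
    """Return a log path whose name embeds the start time of the service."""
    if now is None:
        now = datetime.datetime.now()
    # Microsecond resolution keeps names distinct across quick restarts.
    stamp = now.strftime("%Y%m%d-%H%M%S-%f")
    return os.path.join(log_dir, "{0}-{1}.log".format(service, stamp))
```

Each restart then writes to its own file, so errors from an earlier run are not overwritten.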






[jira] [Commented] (IMPALA-7622) Add query profile metrics for RPC's used when pulling incremental stats.

2018-10-12 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16648719#comment-16648719
 ] 

ASF subversion and git services commented on IMPALA-7622:
-

Commit 97f028299c9d9d7493bdbeaacbf0a288678f9371 in impala's branch 
refs/heads/master from [~vercego]
[ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=97f0282 ]

IMPALA-7622: adds profile metrics for incremental stats

Reapplies change after fixing where frontend profile is placed in runtime
profile.

When computing incremental statistics by fetching the stats directly
from catalogd, a potentially expensive RPC is made from the impalad
coordinator to catalogd. This change adds metrics to the frontend
section of the profile to track how long the request takes, the size
of the compressed bytes received, and the number of partitions received.

The profile for a 'compute incremental ...' command on a table with
no statistics looks like this:

Frontend:
 - StatsFetch.CompressedBytes: 0
 - StatsFetch.TotalPartitions: 24
 - StatsFetch.NumPartitionsWithStats: 0
 - StatsFetch.Time: 26ms

And the profile looks as follows when the table has stats, so the stats
are fetched:

Frontend:
 - StatsFetch.CompressedBytes: 24622
 - StatsFetch.TotalPartitions: 23
 - StatsFetch.NumPartitionsWithStats: 23
 - StatsFetch.Time: 14ms

Testing:
- manual inspection
- e2e test to check the profile

Change-Id: I94559a749500d44aa6aad564134d55c39e1d5273
Reviewed-on: http://gerrit.cloudera.org:8080/11670
Reviewed-by: Tianyi Wang 
Tested-by: Impala Public Jenkins 


> Add query profile metrics for RPC's used when pulling incremental stats.
> 
>
> Key: IMPALA-7622
> URL: https://issues.apache.org/jira/browse/IMPALA-7622
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Frontend
>Affects Versions: Impala 3.1.0
>Reporter: Vuk Ercegovac
>Assignee: Vuk Ercegovac
>Priority: Major
> Fix For: Impala 3.1.0
>
>
> When --pull_incremental_statistics is enabled, the frontend will fetch these 
> stats from catalogd. We should record metrics for this, such as the number 
> of partitions fetched, the size of received bytes, and elapsed time.
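
The `Frontend:` counters shown in the commit message can be pulled out of a text profile with a few lines of Python. This is a rough sketch that assumes the section looks exactly like the example above (a `Frontend:` line followed by ` - Name: value` lines); the helper name `parse_frontend_counters` is hypothetical and real profiles may be laid out differently.

```python
import re

# Matches lines such as " - StatsFetch.Time: 14ms".
COUNTER_RE = re.compile(r"^\s*-\s*(?P<name>[\w.]+):\s*(?P<value>\S+)\s*$")

SAMPLE_PROFILE = """\
Frontend:
 - StatsFetch.CompressedBytes: 24622
 - StatsFetch.TotalPartitions: 23
 - StatsFetch.NumPartitionsWithStats: 23
 - StatsFetch.Time: 14ms
"""


def parse_frontend_counters(profile_text):
    """Return the counters listed directly under 'Frontend:' as a dict of strings."""
    counters = {}
    in_frontend = False
    for line in profile_text.splitlines():
        if not in_frontend:
            in_frontend = line.strip() == "Frontend:"
            continue
        match = COUNTER_RE.match(line)
        if match is None:
            break  # first non-counter line ends the section
        counters[match.group("name")] = match.group("value")
    return counters
```

Values are kept as strings because some counters carry units (for example `14ms`).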






[jira] [Commented] (IMPALA-7644) Hide Parquet page index writing with feature flag

2018-10-12 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16648722#comment-16648722
 ] 

ASF subversion and git services commented on IMPALA-7644:
-

Commit af76186e013607cb64baf151c039e4f6aaab4350 in impala's branch 
refs/heads/master from [~joemcdonnell]
[ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=af76186 ]

IMPALA-7704: Revert "IMPALA-7644: Hide Parquet page index writing with feature 
flag"

The fix for IMPALA-7644 introduced ASAN issues detailed in
IMPALA-7704. Reverting for now.

This reverts commit 843683ed6c2ef41c7c25e9fa4af68801dbdd1a78.

Change-Id: Icf0a64d6ec747275e3ecd6e801e054f81095591a
Reviewed-on: http://gerrit.cloudera.org:8080/11671
Tested-by: Impala Public Jenkins 
Reviewed-by: Michael Ho 


> Hide Parquet page index writing with feature flag
> -
>
> Key: IMPALA-7644
> URL: https://issues.apache.org/jira/browse/IMPALA-7644
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Backend
>Affects Versions: Impala 3.1.0
>Reporter: Zoltán Borók-Nagy
>Assignee: Zoltán Borók-Nagy
>Priority: Major
>  Labels: parquet, performance
>
> Currently there is no released Impala version that can write the Parquet page 
> index:
> [https://github.com/apache/parquet-format/blob/master/PageIndex.md]
> However, the current Impala master writes the page index since IMPALA-5842, 
> but cannot read it.
> I think we should hide the write path with a feature flag until Impala is 
> able to read it back and has better test coverage on it.






[jira] [Commented] (IMPALA-7704) ASAN tests failing in HdfsParquetTableWriter

2018-10-12 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16648723#comment-16648723
 ] 

ASF subversion and git services commented on IMPALA-7704:
-

Commit af76186e013607cb64baf151c039e4f6aaab4350 in impala's branch 
refs/heads/master from [~joemcdonnell]
[ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=af76186 ]

IMPALA-7704: Revert "IMPALA-7644: Hide Parquet page index writing with feature 
flag"

The fix for IMPALA-7644 introduced ASAN issues detailed in
IMPALA-7704. Reverting for now.

This reverts commit 843683ed6c2ef41c7c25e9fa4af68801dbdd1a78.

Change-Id: Icf0a64d6ec747275e3ecd6e801e054f81095591a
Reviewed-on: http://gerrit.cloudera.org:8080/11671
Tested-by: Impala Public Jenkins 
Reviewed-by: Michael Ho 


> ASAN tests failing in HdfsParquetTableWriter
> 
>
> Key: IMPALA-7704
> URL: https://issues.apache.org/jira/browse/IMPALA-7704
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 3.1.0
>Reporter: Joe McDonnell
>Assignee: Zoltán Borók-Nagy
>Priority: Blocker
>  Labels: broken-build
>
> ASAN tests have been failing for the last few runs. Here is the output:
> {noformat}
> ==117268==ERROR: AddressSanitizer: use-after-poison on address 0x7ef9312e0ec0 
> at pc 0x016a0c98 bp 0x7ef91e502540 sp 0x7ef91e501cf0
> READ of size 32681 at 0x7ef9312e0ec0 thread T82364
> #0 0x16a0c97 in __interceptor_memcpy 
> /data/jenkins/workspace/impala-toolchain-package-build/label/impala-toolchnbld-cent70-ec2-c3-4xl-ondem/toolchain/source/llvm/llvm-5.0.1.src-p1/projects/compiler-rt/lib/asan/../sanitizer_common/sanitizer_common_interceptors.inc:738
> #1 0x7f02e43c2dca in jni_SetByteArrayRegion 
> (/usr/java/jdk1.8.0_144/jre/lib/amd64/server/libjvm.so+0x6d3dca)
> #2 0x4ae29b7 in hdfsWrite 
> /container.redhat6/build/cdh/hadoop/3.0.0-cdh6.x-SNAPSHOT/rpm/BUILD/hadoop-3.0.0-cdh6.x-SNAPSHOT/hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs/hdfs.c:1626
> #3 0x332ca67 in impala::HdfsTableWriter::Write(unsigned char const*, int) 
> /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/be/src/exec/hdfs-table-writer.cc:46:13
> #4 0x25ac037 in 
> impala::HdfsParquetTableWriter::BaseColumnWriter::Flush(long*, long*, long*) 
> /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/be/src/exec/hdfs-parquet-table-writer.cc:736:5
> #5 0x25b0e57 in impala::HdfsParquetTableWriter::FlushCurrentRowGroup() 
> /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/be/src/exec/hdfs-parquet-table-writer.cc:1195:5
> #6 0x25b4147 in impala::HdfsParquetTableWriter::Finalize() 
> /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/be/src/exec/hdfs-parquet-table-writer.cc:1161:3
> #7 0x251e424 in 
> impala::HdfsTableSink::FinalizePartitionFile(impala::RuntimeState*, 
> impala::OutputPartition*) 
> /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/be/src/exec/hdfs-table-sink.cc:620:5
> #8 0x2523d1b in impala::HdfsTableSink::FlushFinal(impala::RuntimeState*) 
> /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/be/src/exec/hdfs-table-sink.cc:660:5
> #9 0x1fc09f0 in impala::FragmentInstanceState::ExecInternal() 
> /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/be/src/runtime/fragment-instance-state.cc:324:3
> #10 0x1fbbd5c in impala::FragmentInstanceState::Exec() 
> /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/be/src/runtime/fragment-instance-state.cc:95:14
> #11 0x1fd5f94 in 
> impala::QueryState::ExecFInstance(impala::FragmentInstanceState*) 
> /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/be/src/runtime/query-state.cc:478:24
> #12 0x1cdef96 in boost::function0::operator()() const 
> /data/jenkins/workspace/impala-asf-master-core-asan/Impala-Toolchain/boost-1.57.0-p3/include/boost/function/function_template.hpp:766:14
> #13 0x23ac27e in impala::Thread::SuperviseThread(std::string const&, 
> std::string const&, boost::function, impala::ThreadDebugInfo const*, 
> impala::Promise*) 
> /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/be/src/util/thread.cc:359:3
> #14 0x23b7708 in void boost::_bi::list5, 
> boost::_bi::value, boost::_bi::value >, 
> boost::_bi::value, 
> boost::_bi::value*> 
> >::operator() boost::function, impala::ThreadDebugInfo const*, 
> impala::Promise*), 
> boost::_bi::list0>(boost::_bi::type, void (*&)(std::string const&, 
> std::string const&, boost::function, impala::ThreadDebugInfo const*, 
> impala::Promise*), boost::_bi::list0&, int) 
> /data/jenkins/workspace/impala-asf-master-core-asan/Impala-Toolchain/boost-1.57.0-p3/include/boost/bind/bind.hpp:525:9
> #15 0x23b755b in boost::_bi::bind_t std::string const&, boost::function,

[jira] [Commented] (IMPALA-7701) Grant option always shows as NULL in SHOW GRANT ROLE/USER for any HS2 clients

2018-10-12 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7701?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16648718#comment-16648718
 ] 

ASF subversion and git services commented on IMPALA-7701:
-

Commit a80ec4a6d987464352e8f7da89110151569a5a64 in impala's branch 
refs/heads/master from [~fredyw]
[ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=a80ec4a ]

IMPALA-7701: grant_option in SHOW GRANT always returns NULL from HS2 clients

Prior to this patch, SHOW GRANT ROLE/USER always showed NULL in
grant_option column because the grant_option column header was set
to use BOOLEAN type but the column value was set to use STRING.
This mismatch causes HS2 clients to interpret the column value as
not set (NULL). The patch fixes the issue by setting the grant_option
column value to use BOOLEAN value. The patch also renames
test_show_grant_user.py to test_show_grant.py for all tests related to
SHOW GRANT statements.

Testing:
- Ran all FE tests
- Added new E2E test running SHOW GRANT statements from HS2 client
- Ran all E2E authorization tests

Change-Id: I1e175544172b63d36dceedc61e1f47e0f910d7cf
Reviewed-on: http://gerrit.cloudera.org:8080/11663
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 


> Grant option always shows as NULL in SHOW GRANT ROLE/USER for any HS2 clients
> -
>
> Key: IMPALA-7701
> URL: https://issues.apache.org/jira/browse/IMPALA-7701
> Project: IMPALA
>  Issue Type: Bug
>  Components: Frontend
>Affects Versions: Impala 3.0, Impala 2.12.0
>Reporter: Fredy Wijaya
>Assignee: Fredy Wijaya
>Priority: Major
> Fix For: Impala 3.1.0
>
>
> In Impala shell:
> {noformat}
> [localhost:21000] default> show grant user test_user;
> Query: show grant user test_user
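> +----------------+----------------+----------+----------+-------+--------+-----+-----------+--------------+-------------------------------+
> | principal_type | principal_name | scope    | database | table | column | uri | privilege | grant_option | create_time                   |
> +----------------+----------------+----------+----------+-------+--------+-----+-----------+--------------+-------------------------------+
> | USER           | test_user      | database | foo      |       |        |     | owner     | true         | Thu, Oct 11 2018 11:48:57.186 |
> | ROLE           | admin          | server   |          |       |        |     | all       | false        | Thu, Oct 11 2018 11:48:48.203 |
> +----------------+----------------+----------+----------+-------+--------+-----+-----------+--------------+-------------------------------+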
> Fetched 2 row(s) in 0.13s
> [localhost:21000] default> show grant role admin;
> Query: show grant role admin
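> +--------+----------+-------+--------+-----+-----------+--------------+-------------------------------+
> | scope  | database | table | column | uri | privilege | grant_option | create_time                   |
> +--------+----------+-------+--------+-----+-----------+--------------+-------------------------------+
> | server |          |       |        |     | all       | false        | Thu, Oct 11 2018 11:48:48.203 |
> +--------+----------+-------+--------+-----+-----------+--------------+-------------------------------+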
> Fetched 1 row(s) in 0.01s
> {noformat}
> Using Impyla (HS2 client):
> {noformat}
> >>> from impala.dbapi import connect
> >>> conn = connect(host='localhost', port=21050)
> >>> cursor = conn.cursor()
> >>> cursor.execute('show grant user test_user')
> >>> print(cursor.fetchall())
> [('USER', 'test_user', 'database', 'foo', '', '', '', 'owner', None, 'Thu, 
> Oct 11 2018 11:48:57.186'), ('ROLE', 'admin', 'server', '', '', '', '', 
> 'all', None, 'Thu, Oct 11 2018 11:48:48.203')]
> >>> cursor.execute('show grant role admin')
> >>> print(cursor.fetchall())
> [('server', '', '', '', '', 'all', None, 'Thu, Oct 11 2018 11:48:48.203')]
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Commented] (IMPALA-7690) TestAdmissionController.test_pool_config_change_while_queued fails on centos6

2018-10-12 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16648717#comment-16648717
 ] 

ASF subversion and git services commented on IMPALA-7690:
-

Commit e65ac1a4341acfce7c1afc08c0c0566ee0ca50ab in impala's branch 
refs/heads/master from [~bikram.sngh91]
[ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=e65ac1a ]

IMPALA-7690: Make test_pool_config_change_while_queued compatible with
python 2.6

The ElementTree XML API used in test_pool_config_change_while_queued
used iter() which was added in python 2.7. Switching it to
getiterator() made it compatible with python 2.6.
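
As a rough illustration of the compatibility issue (not the code from the patch), a helper can prefer iter() where it exists and fall back to getiterator() on older interpreters. The helper name iter_elements and the sample XML are hypothetical; getiterator() was later removed in Python 3.9, which is why the sketch checks for iter() first.

```python
import xml.etree.ElementTree as ET


def iter_elements(root, tag):
    """Iterate over all descendants of `root` with the given tag.

    Hypothetical helper: uses Element.iter() when available (Python 2.7+)
    and falls back to getiterator() on Python 2.6.
    """
    if hasattr(root, "iter"):
        return root.iter(tag)
    return root.getiterator(tag)


# Sample document loosely shaped like a pool configuration file (assumed layout).
SAMPLE_XML = (
    "<configuration>"
    "<property><name>pool.a</name><value>1</value></property>"
    "<property><name>pool.b</name><value>2</value></property>"
    "</configuration>"
)

root = ET.fromstring(SAMPLE_XML)
property_names = [prop.find("name").text for prop in iter_elements(root, "property")]
```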

Change-Id: Id2593609e5be288054d1361f0fe57580e17ea042
Reviewed-on: http://gerrit.cloudera.org:8080/11660
Reviewed-by: Pooja Nilangekar 
Reviewed-by: Michael Brown 
Tested-by: Impala Public Jenkins 


> TestAdmissionController.test_pool_config_change_while_queued fails on centos6
> -
>
> Key: IMPALA-7690
> URL: https://issues.apache.org/jira/browse/IMPALA-7690
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 3.1.0
>Reporter: Pooja Nilangekar
>Assignee: Bikramjeet Vig
>Priority: Blocker
>  Labels: broken-build
> Fix For: Impala 3.1.0
>
>
> TestAdmissionController.test_pool_config_change_while_queued fails on Centos6 
> because python 2.6 does not support iter() on {{xml.etree.ElementTree}}.
>  
> Here are the logs from the test failure:
>  
> {code:java}
> custom_cluster/test_admission_controller.py:767: in 
> test_pool_config_change_while_queued
> config.set_config_value(pool_name, config_str, 1)
> common/resource_pool_config.py:43: in set_config_value
> node = self.__find_xml_node(self.root, pool_name, config_str)
> common/resource_pool_config.py:86: in __find_xml_node
> for property in xml_root.iter('property'):
> E   AttributeError: _ElementInterface instance has no attribute 'iter'
> {code}
>  
>  






[jira] [Commented] (IMPALA-5031) UBSAN clean and method for testing UBSAN cleanliness

2018-10-15 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-5031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16650905#comment-16650905
 ] 

ASF subversion and git services commented on IMPALA-5031:
-

Commit 8fc702be0c5017775e99fba0e88772ce3c14eb68 in impala's branch 
refs/heads/master from [~jbapple]
[ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=8fc702b ]

IMPALA-5031: fix signed overflows in decimal

The standard says that overflow for signed arithmetic operations is
undefined behavior; see [expr]:

If during the evaluation of an expression, the result is not
mathematically defined or not in the range of representable values
for its type, the behavior is undefined.

and [basic.fundamental]:

Unsigned integers shall obey the laws of arithmetic modulo 2^n
where n is the number of bits in the value representation of that
particular size of integer. This implies that unsigned arithmetic
does not overflow because a result that cannot be represented by
the resulting unsigned integer type is reduced modulo the number
that is one greater than the largest value that can be represented
by the resulting unsigned integer type.

All of the overflows fixed in this patch were tested with expr-test's
DecimalArithmeticTest.
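
The distinction the two quoted passages draw can be sketched in Python, whose integers do not overflow, by modelling 64-bit ranges explicitly. This is only an illustration of the rule, not the technique used in the patch; the function names and the choice of 64 bits are assumptions.

```python
INT64_MIN = -(2 ** 63)
INT64_MAX = 2 ** 63 - 1
UINT64_MASK = 2 ** 64 - 1


def wrapping_add_u64(a, b):
    """Model unsigned 64-bit addition: the result is reduced modulo 2**64."""
    return (a + b) & UINT64_MASK


def signed_add_overflows(a, b):
    """Return True if a + b falls outside the int64 range.

    In C++ this is the case where evaluating a + b on int64_t operands
    would be undefined behavior, so it has to be detected beforehand.
    """
    return not (INT64_MIN <= a + b <= INT64_MAX)
```

Unsigned addition always has a defined result here, while the signed case is something a caller checks for before performing the operation.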

Change-Id: Ibf882428931e4f4264be2fc8cd9d6b1fc89b8ace
Reviewed-on: http://gerrit.cloudera.org:8080/11604
Reviewed-by: Jim Apple 
Tested-by: Impala Public Jenkins 


> UBSAN clean and method for testing UBSAN cleanliness
> 
>
> Key: IMPALA-5031
> URL: https://issues.apache.org/jira/browse/IMPALA-5031
> Project: IMPALA
>  Issue Type: Task
>  Components: Backend, Infrastructure
>Affects Versions: Impala 2.9.0
>Reporter: Jim Apple
>Assignee: Jim Apple
>Priority: Minor
>
> http://releases.llvm.org/3.8.0/tools/clang/docs/UndefinedBehaviorSanitizer.html
>  builds are supported after https://gerrit.cloudera.org/#/c/6186/, but 
> Impala's test suite triggers many errors under UBSAN. Those errors should be 
> fixed and then there should be a way to run the test suite under UBSAN and 
> fail if there were any errors detected.






[jira] [Commented] (IMPALA-7703) Upgrade to Sentry 2.1.0

2018-10-15 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16650906#comment-16650906
 ] 

ASF subversion and git services commented on IMPALA-7703:
-

Commit 9571b18ca4ad197eee1bb43092f05722de4c8ee5 in impala's branch 
refs/heads/master from [~fredyw]
[ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=9571b18 ]

IMPALA-7703: Update to Sentry 2.1.0

This patch bumps the CDH_BUILD_NUMBER to 632827 in order to use Sentry
2.1.0.

Testing:
- Ran all core tests

Change-Id: I001d17313663171bc6ff23d62026c258486726a1
Reviewed-on: http://gerrit.cloudera.org:8080/11678
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 


> Upgrade to Sentry 2.1.0
> ---
>
> Key: IMPALA-7703
> URL: https://issues.apache.org/jira/browse/IMPALA-7703
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Infrastructure
>Affects Versions: Impala 3.1.0
>Reporter: Fredy Wijaya
>Assignee: Fredy Wijaya
>Priority: Major
> Fix For: Impala 3.1.0
>
>
> Upgrade to Sentry 2.1.0 for various bug fixes related to fine-grained 
> privileges and object ownership.






[jira] [Commented] (IMPALA-7705) Impala 2.13 & 3.1 Docs: ALTER DATABASE SET OWNER

2018-10-15 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16650907#comment-16650907
 ] 

ASF subversion and git services commented on IMPALA-7705:
-

Commit 1bfd7ee1c6315cc38126ed81dd757f56e01c5b16 in impala's branch 
refs/heads/master from [~arodoni_cloudera]
[ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=1bfd7ee ]

IMPALA-7705: [DOCS] Documented the ALTER DATABASE SET OWNER statement

Change-Id: Ifac0b689d55f525145b37846967a7a22f0e9245b
Reviewed-on: http://gerrit.cloudera.org:8080/11674
Tested-by: Impala Public Jenkins 
Reviewed-by: Adam Holley 
Reviewed-by: Fredy Wijaya 


> Impala 2.13 & 3.1 Docs: ALTER DATABASE SET OWNER
> 
>
> Key: IMPALA-7705
> URL: https://issues.apache.org/jira/browse/IMPALA-7705
> Project: IMPALA
>  Issue Type: Sub-task
>  Components: Docs
>Reporter: Alex Rodoni
>Assignee: Alex Rodoni
>Priority: Major
>  Labels: future_release_doc
> Fix For: Impala 3.1.0
>
>
>  
> ALTER DATABASE SET OWNER https://issues.apache.org/jira/browse/IMPALA-7016
>  






[jira] [Commented] (IMPALA-7706) Impala Doc: ALTER TABLE SET OWNER not on Sentry page for Impala

2018-10-15 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7706?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16650908#comment-16650908
 ] 

ASF subversion and git services commented on IMPALA-7706:
-

Commit fe6dabda6a55dcd6f047829012a5f9ebf0755df3 in impala's branch 
refs/heads/master from [~arodoni_cloudera]
[ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=fe6dabd ]

IMPALA-7706: [DOCS] Added the privileges required for ALTER TABLE SET OWNER

Also, added the privileges required for ALTER VIEW SET OWNER

Change-Id: I671dbe3e6fb3118a67c59fb1fcaf1ec53139b587
Reviewed-on: http://gerrit.cloudera.org:8080/11675
Reviewed-by: Fredy Wijaya 
Tested-by: Impala Public Jenkins 


> Impala Doc: ALTER TABLE SET OWNER not on Sentry page for Impala
> ---
>
> Key: IMPALA-7706
> URL: https://issues.apache.org/jira/browse/IMPALA-7706
> Project: IMPALA
>  Issue Type: Task
>  Components: Docs
>Reporter: Alex Rodoni
>Assignee: Alex Rodoni
>Priority: Major
>  Labels: future_release_doc
> Fix For: Impala 3.1.0
>
>
> ALTER TABLE SET OWNER, which requires either ALL WITH GRANT, or OWNER WITH 
> GRANT for the TABLE, is not mentioned on the page.  It's a special case so 
> may not fit in the table.






[jira] [Commented] (IMPALA-7673) Parse --var variable values to replace variables within the value

2018-10-17 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16653849#comment-16653849
 ] 

ASF subversion and git services commented on IMPALA-7673:
-

Commit 31dfa3e28c2dadb6567c1bc812011286024efa4c in impala's branch 
refs/heads/master from [~fredyw]
[ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=31dfa3e ]

IMPALA-7673: Support values from other variables in Impala shell --var

Prior to this patch, values passed to Impala shell --var could not
reference other variables, unlike variables set with the SET command in
the interactive shell.  This patch refactors the variable substitution
logic so that the interactive and command line shells share it.

Example:
$ impala-shell.sh \
--var="msg1=1" \
--var="msg2=\${var:msg1}2" \
--var="msg3=\${var:msg1}\${var:msg2}"

[localhost:21000] default> select ${var:msg3};
Query: select 112
+-+
| 112 |
+-+
| 112 |
+-+
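
The substitution behaviour in the example can be approximated with a short sketch. This is not impala-shell's implementation: resolve_vars and VAR_REF are hypothetical names, and the sketch assumes variables are resolved in the order given and that unknown references are left untouched.

```python
import re

# Matches references of the form ${var:name}.
VAR_REF = re.compile(r"\$\{var:([^}]+)\}")


def resolve_vars(pairs):
    """Resolve ${var:name} references in (name, value) pairs, in order.

    Each value may refer to variables defined earlier in the list.
    References to unknown names are kept as-is (an assumption of this sketch).
    """
    resolved = {}
    for name, value in pairs:
        def substitute(match):
            return resolved.get(match.group(1), match.group(0))
        resolved[name] = VAR_REF.sub(substitute, value)
    return resolved


shell_vars = resolve_vars([
    ("msg1", "1"),
    ("msg2", "${var:msg1}2"),
    ("msg3", "${var:msg1}${var:msg2}"),
])
```

With the three variables from the example above, msg3 resolves to 112, which matches the query shown.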

Testing:
- Added a new shell test
- Ran all shell tests

Change-Id: Ib5b9fda329c45f2e5682f3cbc76d29ceca2e226a
Reviewed-on: http://gerrit.cloudera.org:8080/11623
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 


> Parse --var variable values to replace variables within the value
> -
>
> Key: IMPALA-7673
> URL: https://issues.apache.org/jira/browse/IMPALA-7673
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Clients
>Affects Versions: Impala 2.11.0, Impala 3.0
> Environment: CentOS Linux release 7.4.1708
> CDH 5.14.4
>Reporter: Aaron Baff
>Assignee: Fredy Wijaya
>Priority: Minor
> Fix For: Impala 3.1.0
>
>
> Related to IMPALA-2180
> In working on a query using SET variables, and trying to move them to 
> impala-shell --var options to set the variables, the later variable which 
> depends on the 1st one doesn't have the 1st one be replaced properly like it 
> does with a SET.
> For example:
> --var="DATA_DATE_START='2018-09-28'
> --var="START_ACTION_CLICK_RANGE=from_timestamp(date_sub(to_timestamp(\${var:DATA_DATE_START},'-MM-dd'),
>  93), '-MM-dd')"
> In the query that gets run, the ${var:START_ACTION_CLICK_RANGE} gets replaced 
> with
> from_timestamp(date_sub(to_timestamp(${var:DATA_DATE_START},'-MM-dd'), 
> 93), '-MM-dd')
> not with
> from_timestamp(date_sub(to_timestamp('2018-09-28','-MM-dd'), 93), 
> '-MM-dd')
> as I would expect it to.






[jira] [Commented] (IMPALA-7678) Revert IMPALA-7660

2018-10-17 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16653857#comment-16653857
 ] 

ASF subversion and git services commented on IMPALA-7678:
-

Commit 5e92d139b951e77d3f9f355c1e46736454a654b0 in impala's branch 
refs/heads/master from [~twmarshall]
[ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=5e92d13 ]

IMPALA-7678: Reapply "IMPALA-7660: Support ECDH ciphers for debug webserver"

This patch reverses the revert of IMPALA-7660.

The problem with IMPALA-7660 was that urllib.urlopen added the
'context' parameter in Python 2.7.9, so it isn't present on rhel7, which
uses Python 2.7.5.

The fix is to switch to using the 'requests' library, which supports
ssl connections on all the platforms Impala is supported on.

This patch also adds more info to the error message printed by
start-impala-cluster.py when the debug webserver cannot be reached yet
to help with debugging these issues in the future.
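
The version dependency described above can be made concrete with a small check. This sketch only encodes the 2.7.9 threshold stated in this message and applies to Python 2.7.x version tuples; the function name is hypothetical and it is not part of the patch, which sidesteps the problem by using the 'requests' library.

```python
def urlopen_accepts_context(python_version):
    """Return True if a Python 2.7.x version is new enough for the 'context' parameter.

    Per the commit message, urllib.urlopen gained 'context' in 2.7.9.
    `python_version` is a tuple such as (2, 7, 5) or sys.version_info.
    """
    return tuple(python_version[:3]) >= (2, 7, 9)


RHEL7_PYTHON = (2, 7, 5)  # version cited in the commit message
```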

Testing:
- Ran full builds on rhel7, rhel6, and ubuntu16.

Change-Id: I679469ed7f27944f75004ec4b16d513e6ea6b544
Reviewed-on: http://gerrit.cloudera.org:8080/11625
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 


> Revert IMPALA-7660
> --
>
> Key: IMPALA-7678
> URL: https://issues.apache.org/jira/browse/IMPALA-7678
> Project: IMPALA
>  Issue Type: Bug
>  Components: Infrastructure
>Affects Versions: Impala 3.1.0
>Reporter: Pooja Nilangekar
>Assignee: Thomas Tauber-Marshall
>Priority: Blocker
>
> After merging IMPALA-7660, impala server starts up but 
> start-impala-cluster.py can't contact the debug webpage on RHEL builds.






[jira] [Commented] (IMPALA-7076) Impala 2.13 & 3.1 Docs: ALTER TABLE / VIEW SET OWNER

2018-10-17 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7076?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16653854#comment-16653854
 ] 

ASF subversion and git services commented on IMPALA-7076:
-

Commit 96debe0a5c701781029908848437b66fcc5fff25 in impala's branch 
refs/heads/master from [~arodoni_cloudera]
[ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=96debe0 ]

IMPALA-7076: [DOCS] Document ALTER TABLE / VIEW SET OWNER statement

Change-Id: I203a800855a413069a40c728dfa157939ea15caf
Reviewed-on: http://gerrit.cloudera.org:8080/11673
Tested-by: Impala Public Jenkins 
Reviewed-by: Fredy Wijaya 


> Impala 2.13 & 3.1 Docs: ALTER TABLE / VIEW SET OWNER
> 
>
> Key: IMPALA-7076
> URL: https://issues.apache.org/jira/browse/IMPALA-7076
> Project: IMPALA
>  Issue Type: Sub-task
>  Components: Docs
>Reporter: Fredy Wijaya
>Assignee: Alex Rodoni
>Priority: Major
>  Labels: future_release_doc
>







[jira] [Commented] (IMPALA-7702) Enable pull incremental stats a default

2018-10-17 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16653853#comment-16653853
 ] 

ASF subversion and git services commented on IMPALA-7702:
-

Commit ad265842b691927e1c204203390172a92dc38a68 in impala's branch 
refs/heads/master from [~vercego]
[ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=ad26584 ]

IMPALA-7702: enable fetch incremental stats by default

Flips the default from always off to always on for
--pull_incremental_statistics. With this setting, the default
is for coordinators to fetch incremental stats from catalogd
directly (only when computing incremental stats) instead of
receiving it from the statestore broadcast.

Fetching incremental stats is not applicable when using a
CatalogMetaProvider. Making fetch the default would otherwise
require setting --pull_incremental_statistics to false whenever
CatalogMetaProvider is enabled. To avoid that, this change makes
--use_local_catalog take priority over --pull_incremental_statistics
so that when both are turned on, only the local catalog setting
takes effect.
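
The flag precedence described above can be summarised as a small decision function. This is a sketch of the stated behaviour only; the function name and the returned labels are hypothetical and do not correspond to identifiers in Impala.

```python
def incremental_stats_source(use_local_catalog, pull_incremental_statistics=True):
    """Describe where a coordinator gets incremental stats from.

    Mirrors the precedence in the commit message: --use_local_catalog wins
    over --pull_incremental_statistics, which now defaults to on.
    """
    if use_local_catalog:
        return "local catalog"
    if pull_incremental_statistics:
        return "fetch from catalogd"
    return "statestore broadcast"
```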

Testing:
- manual testing
- moved the testing for pull incremental stats out of custom cluster
  tests since the default flipped
- added tests that run with local catalog and pulling incremental stats.

Change-Id: I5601a24f81bb3466cff5308c7093d2765bb1c481
Reviewed-on: http://gerrit.cloudera.org:8080/11677
Reviewed-by: Vuk Ercegovac 
Tested-by: Impala Public Jenkins 


> Enable pull incremental stats a default
> ---
>
> Key: IMPALA-7702
> URL: https://issues.apache.org/jira/browse/IMPALA-7702
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Frontend
>Affects Versions: Impala 3.1.0
>Reporter: Vuk Ercegovac
>Assignee: Vuk Ercegovac
>Priority: Major
>







[jira] [Commented] (IMPALA-7669) Concurrent invalidate with compute (or drop) stats throws NPE.

2018-10-17 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7669?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16653855#comment-16653855
 ] 

ASF subversion and git services commented on IMPALA-7669:
-

Commit 2b2cf8d96617320d184d070f9319c2463aa0d84f in impala's branch 
refs/heads/master from Bharath Vissapragada
[ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=2b2cf8d ]

IMPALA-7669: Gracefully handle concurrent invalidate/partial fetch RPCs

The bug here was that any partial RPC on an IncompleteTable was throwing
an NPE.

Ideally, we attempt to load the table (if we find that it is not loaded)
before making the partial info request, but a concurrent invalidate could
reset the table state and move it back to an uninitialized state.

This patch handles this case better by propagating a meaningful error to
the caller.
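
The shape of the fix can be illustrated with a Python analogue of the guard. The actual change is in Impala's Java catalog code and is not reproduced here; the exception class, function name, and error text below are all hypothetical.

```python
class TableLoadingError(Exception):
    """Stand-in for a table loading failure reported to the caller."""


def get_partial_info(cause):
    """Always fail for an incomplete table, but with a meaningful message.

    `cause` is the recorded loading failure, or None when the table was
    reset to an uninitialized state (for example by a concurrent invalidate).
    """
    if cause is None:
        # Guard the case that previously dereferenced a missing cause.
        raise TableLoadingError(
            "Table metadata is not loaded; it may have been invalidated concurrently"
        )
    raise TableLoadingError(str(cause))
```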

Testing:
---
- Added a test that fails consistently with an NPE without this patch.

Change-Id: I8533f73f25ca42a20f146ddfd95d4213add9b705
Reviewed-on: http://gerrit.cloudera.org:8080/11638
Reviewed-by: Bharath Vissapragada 
Tested-by: Impala Public Jenkins 


> Concurrent invalidate with compute (or drop) stats throws NPE.
> --
>
> Key: IMPALA-7669
> URL: https://issues.apache.org/jira/browse/IMPALA-7669
> Project: IMPALA
>  Issue Type: Bug
>  Components: Catalog
>Affects Versions: Impala 3.1.0
>Reporter: bharath v
>Assignee: bharath v
>Priority: Critical
>
> *This is a Catalog V2 only bug*
> NPE is thrown when trying to getPartialInfo() from an IncompleteTable (result 
> of invalidate) and cause_ is null.
> {noformat}
> @Override
>   public TGetPartialCatalogObjectResponse getPartialInfo(
>   TGetPartialCatalogObjectRequest req) throws TableLoadingException {
> Throwables.propagateIfPossible(cause_, TableLoadingException.class);
> throw new TableLoadingException(cause_.getMessage());  <-
>   }
> {noformat}
> {noformat}
> I1004 16:51:28.845305 85380 jni-util.cc:308] java.lang.NullPointerException
> at 
> org.apache.impala.catalog.IncompleteTable.getPartialInfo(IncompleteTable.java:140)
> at 
> org.apache.impala.catalog.CatalogServiceCatalog.getPartialCatalogObject(CatalogServiceCatalog.java:2171)
> at 
> org.apache.impala.service.JniCatalog.getPartialCatalogObject(JniCatalog.java:236)
> {noformat}
> Actual caller stack trace is this.
> {noformat}
> I1004 16:51:21.666422 67179 Frontend.java:1086] Analyzing query: compute 
> stats ads
> I1004 16:51:28.850023 67179 jni-util.cc:308] 
> org.apache.impala.catalog.local.LocalCatalogException: Could not load table 
> parnal.ads from metastore
> at 
> org.apache.impala.catalog.local.LocalTable.loadTableMetadata(LocalTable.java:128)
> at org.apache.impala.catalog.local.LocalTable.load(LocalTable.java:89)
> at org.apache.impala.catalog.local.LocalDb.getTable(LocalDb.java:119)
> at 
> org.apache.impala.analysis.StmtMetadataLoader.getMissingTables(StmtMetadataLoader.java:251)
> at 
> org.apache.impala.analysis.StmtMetadataLoader.loadTables(StmtMetadataLoader.java:140)
> at 
> org.apache.impala.analysis.StmtMetadataLoader.loadTables(StmtMetadataLoader.java:116)
> at 
> org.apache.impala.service.Frontend.doCreateExecRequest(Frontend.java:1118)
> at 
> org.apache.impala.service.Frontend.getTExecRequest(Frontend.java:1092)
> at 
> org.apache.impala.service.Frontend.createExecRequest(Frontend.java:1064)
> at 
> org.apache.impala.service.JniFrontend.createExecRequest(JniFrontend.java:158)
> Caused by: org.apache.thrift.TException: 
> TGetPartialCatalogObjectResponse(status:TStatus(status_code:GENERAL, 
> error_msgs:[NullPointerException: null]), lookup_status:OK)
> at 
> org.apache.impala.catalog.local.CatalogdMetaProvider.sendRequest(CatalogdMetaProvider.java:354)
> at 
> org.apache.impala.catalog.local.CatalogdMetaProvider.access$100(CatalogdMetaProvider.java:163)
> at 
> org.apache.impala.catalog.local.CatalogdMetaProvider$5.call(CatalogdMetaProvider.java:565)
> at 
> org.apache.impala.catalog.local.CatalogdMetaProvider$5.call(CatalogdMetaProvider.java:560)
> at 
> org.apache.impala.catalog.local.CatalogdMetaProvider$1.call(CatalogdMetaProvider.java:411)
> at 
> com.google.common.cache.LocalCache$LocalManualCache$1.load(LocalCache.java:4767)
> at 
> com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3568)
> at 
> com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2350)
> at 
> com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2313)
> at 
> com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2228)
> at com.google.

[jira] [Commented] (IMPALA-7660) Support ECDH ciphers for debug webserver

2018-10-17 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7660?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16653858#comment-16653858
 ] 

ASF subversion and git services commented on IMPALA-7660:
-

Commit 5e92d139b951e77d3f9f355c1e46736454a654b0 in impala's branch 
refs/heads/master from [~twmarshall]
[ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=5e92d13 ]

IMPALA-7678: Reapply "IMPALA-7660: Support ECDH ciphers for debug webserver"

This patch reverses the revert of IMPALA-7660.

The problem with IMPALA-7660 was that urllib.urlopen added the
'context' parameter in Python 2.7.9, so it isn't present on rhel7, which
uses Python 2.7.5.

The fix is to switch to using the 'requests' library, which supports
ssl connections on all the platforms Impala is supported on.

This patch also adds more info to the error message printed by
start-impala-cluster.py when the debug webserver cannot be reached yet
to help with debugging these issues in the future.

Testing:
- Ran full builds on rhel7, rhel6, and ubuntu16.

Change-Id: I679469ed7f27944f75004ec4b16d513e6ea6b544
Reviewed-on: http://gerrit.cloudera.org:8080/11625
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 


> Support ECDH ciphers for debug webserver
> 
>
> Key: IMPALA-7660
> URL: https://issues.apache.org/jira/browse/IMPALA-7660
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Infrastructure
>Affects Versions: Impala 3.1.0
>Reporter: Thomas Tauber-Marshall
>Assignee: Thomas Tauber-Marshall
>Priority: Major
>  Labels: security
>
> A recent change (IMPALA-7519) added support for ecdh ciphers for our 
> beeswax/hs2 server. It would be useful to support this for our debug webpage 
> server, which is based on squeasel
> A recent commit on squeasel 
> (https://github.com/cloudera/squeasel/commit/8aa6177ba08e69cd4498c4c7a453340d86c3ad0f)
>  added support for this, so this is basically just pulling that commit in and 
> adding tests.






[jira] [Commented] (IMPALA-7709) Add options to restart catalogd and statestored in start-impala-cluster.py

2018-10-17 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16653851#comment-16653851
 ] 

ASF subversion and git services commented on IMPALA-7709:
-

Commit 97731e5b85c75ae6fa06af74456428890069ca12 in impala's branch 
refs/heads/master from [~fredyw]
[ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=97731e5 ]

IMPALA-7709: Add options to restart catalogd and statestored in 
start-impala-cluster.py

This patch adds two options to start-impala-cluster.py:
--restart_catalogd_only to restart catalogd process
--restart_statestored_only to restart statestored process

Testing:
- Manually tested the two new options

Change-Id: Ide26902f6bce11718708d5ab0174282dd94400a3
Reviewed-on: http://gerrit.cloudera.org:8080/11687
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 


> Add options to restart catalogd and statestored in start-impala-cluster.py
> --
>
> Key: IMPALA-7709
> URL: https://issues.apache.org/jira/browse/IMPALA-7709
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Infrastructure
>Affects Versions: Impala 3.1.0
>Reporter: Fredy Wijaya
>Assignee: Fredy Wijaya
>Priority: Major
> Fix For: Impala 3.1.0
>
>
> Currently there exists an option --restart_impalad_only to restart impalad 
> processes. We need similar options to restart catalogd and statestored in 
> start-impala-cluster.py, which can be useful for testing against catalogd and 
> statestored scenarios.






[jira] [Commented] (IMPALA-7713) Add test coverage for catalogd restart when authorization is enabled

2018-10-17 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16653856#comment-16653856
 ] 

ASF subversion and git services commented on IMPALA-7713:
-

Commit 0cd9151801cf446330b129b0609d6cedd4b98f06 in impala's branch 
refs/heads/master from [~fredyw]
[ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=0cd9151 ]

IMPALA-7713: Add test coverage for catalogd restart when authorization is 
enabled

This patch adds test coverage for catalogd restart when authorization
is enabled, to ensure that all privileges in the impalad's catalogs are
reset after the catalogd restart. Stale privileges left in the impalad's
catalogs could pose a security issue.

Testing:
- Ran all E2E authorization tests
- Added a new test

Change-Id: Ib9a168697401cf0b83c7a193fa477888b48cb369
Reviewed-on: http://gerrit.cloudera.org:8080/11696
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 


> Add test coverage for catalogd restart when authorization is enabled
> 
>
> Key: IMPALA-7713
> URL: https://issues.apache.org/jira/browse/IMPALA-7713
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Infrastructure
>Affects Versions: Impala 3.1.0
>Reporter: Fredy Wijaya
>Assignee: Fredy Wijaya
>Priority: Major
> Fix For: Impala 3.1.0
>
>
> There's currently no test for catalogd restart when authorization is enabled 
> especially when the authorization data is stored in both impalad/catalogd's 
> catalog.






[jira] [Commented] (IMPALA-7545) Add admission control status to query log

2018-10-17 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16653852#comment-16653852
 ] 

ASF subversion and git services commented on IMPALA-7545:
-

Commit ec2dabafb989e2aca0fddb8f6c467e6f551d0424 in impala's branch 
refs/heads/master from poojanilangekar
[ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=ec2daba ]

IMPALA-7545: Add queuing reason to query log

After this change, the HS2 GetLog() function returns the queuing
reason for a query when it is queued by the AdmissionController.

Testing: Added an end-to-end test to test_admission_controller.py
to verify the query logs returned.

Change-Id: I2e5d8de4f6691a9ba2594ca68c54ea4dca760545
Reviewed-on: http://gerrit.cloudera.org:8080/11669
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 


> Add admission control status to query log
> -
>
> Key: IMPALA-7545
> URL: https://issues.apache.org/jira/browse/IMPALA-7545
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Reporter: Tim Armstrong
>Assignee: Pooja Nilangekar
>Priority: Critical
>  Labels: admission-control, observability
>
> We already include the query progress in the HS2 GetLog() response (although 
> for some reason we don't do the same for beeswax) so we should include 
> admission control progress. We should definitely include it if the query is 
> currently queued; it's probably too noisy to include once the query has been 
> admitted.
> We should also do the same for beeswax/impala-shell so that 
> live_progress/live_summary is useful if the query is queued. We should look 
> at the live_progress/live_summary mechanisms and extend those to include the 
> required information to report admission control state.






[jira] [Commented] (IMPALA-7708) Switch to faster compression strategy for incremental stats

2018-10-17 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7708?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16653850#comment-16653850
 ] 

ASF subversion and git services commented on IMPALA-7708:
-

Commit 3b6c0f6296e807b25f3e40bd614b2571f4f01d48 in impala's branch 
refs/heads/master from Bharath Vissapragada
[ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=3b6c0f6 ]

IMPALA-7708: Switch to faster deflater compression level for incr stats

On a table with 3000 partitions and ~150 columns, we noticed that the
BEST_SPEED deflater strategy is ~8x faster with ~4% compression ratio
penalty. Given these results, this patch switches the default to
BEST_SPEED from BEST_COMPRESSION.
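
The trade-off described above can be tried out with a minimal sketch in
Python, whose zlib module exposes the same deflate levels as
java.util.zip.Deflater. The sample payload is a hypothetical stand-in for
serialized incremental stats, not real Impala data, and the sizes it
prints say nothing about the numbers reported in this commit.

```python
import zlib


def deflate(data: bytes, level: int) -> bytes:
    """Compress data with the given deflate level (1 = fastest, 9 = smallest)."""
    return zlib.compress(data, level)


# Hypothetical payload: repetitive text, loosely resembling per-partition stats.
sample = b"".join(
    b"partition=%d,ndv=%d,nulls=0;" % (i, i % 97) for i in range(20000)
)

fast = deflate(sample, zlib.Z_BEST_SPEED)         # level 1
small = deflate(sample, zlib.Z_BEST_COMPRESSION)  # level 9

# Both levels are lossless; they differ only in CPU time and output size.
assert zlib.decompress(fast) == sample
assert zlib.decompress(small) == sample
print(len(sample), len(fast), len(small))
```

Timing the two calls (for example with time.perf_counter) on a
representative payload is the simplest way to check whether the speed gain
justifies the larger output for a given workload.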

Change-Id: Ife688aca3aed0e1e8af26c8348b850175d84b4ad
Reviewed-on: http://gerrit.cloudera.org:8080/11685
Reviewed-by: Philip Zeyliger 
Reviewed-by: Vuk Ercegovac 
Tested-by: Impala Public Jenkins 


> Switch to faster compression strategy for incremental stats
> ---
>
> Key: IMPALA-7708
> URL: https://issues.apache.org/jira/browse/IMPALA-7708
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Catalog
>Affects Versions: Impala 3.1.0
>Reporter: bharath v
>Assignee: bharath v
>Priority: Major
>
> Currently we set the Deflater mode to BEST_COMPRESSION by default.
> {noformat}
> public static byte[] deflateCompress(byte[] input) {
>   if (input == null) return null;
>   ByteArrayOutputStream bos = new ByteArrayOutputStream(input.length);
>   // TODO: Benchmark other compression levels.
>   DeflaterOutputStream stream =
>       new DeflaterOutputStream(bos, new Deflater(Deflater.BEST_COMPRESSION));
> {noformat}
> In some experiments, we noticed that the fastest compression mode 
> (BEST_SPEED) performs ~8x faster with only ~4% compression ratio penalty. 
> Here are some results on a real world table with 3000 partitions with 
> incremental stats.
>  
> | |Time taken for serialization (seconds)|OutputBytes size (MB)|
> |Gzip best compression|92|194|
> |Gzip fastest compression|11|212|
> |Gzip default compression|57|195|
> |No compression|5|452|
>  
>  
>  






[jira] [Commented] (IMPALA-7660) Support ECDH ciphers for debug webserver

2018-10-17 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7660?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16653859#comment-16653859
 ] 

ASF subversion and git services commented on IMPALA-7660:
-

Commit 5e92d139b951e77d3f9f355c1e46736454a654b0 in impala's branch 
refs/heads/master from [~twmarshall]
[ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=5e92d13 ]

IMPALA-7678: Reapply "IMPALA-7660: Support ECDH ciphers for debug webserver"

This patch reverses the revert of IMPALA-7660.

The problem with IMPALA-7660 was that urllib.urlopen added the
'context' parameter in Python 2.7.9, so it isn't present on rhel7, which
uses Python 2.7.5.

The fix is to switch to using the 'requests' library, which supports
ssl connections on all the platforms Impala is supported on.

This patch also adds more info to the error message printed by
start-impala-cluster.py when the debug webserver cannot be reached yet
to help with debugging these issues in the future.

Testing:
- Ran full builds on rhel7, rhel6, and ubuntu16.

Change-Id: I679469ed7f27944f75004ec4b16d513e6ea6b544
Reviewed-on: http://gerrit.cloudera.org:8080/11625
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 


> Support ECDH ciphers for debug webserver
> 
>
> Key: IMPALA-7660
> URL: https://issues.apache.org/jira/browse/IMPALA-7660
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Infrastructure
>Affects Versions: Impala 3.1.0
>Reporter: Thomas Tauber-Marshall
>Assignee: Thomas Tauber-Marshall
>Priority: Major
>  Labels: security
>
> A recent change (IMPALA-7519) added support for ecdh ciphers for our 
> beeswax/hs2 server. It would be useful to support this for our debug webpage 
> server, which is based on squeasel.
> A recent commit on squeasel 
> (https://github.com/cloudera/squeasel/commit/8aa6177ba08e69cd4498c4c7a453340d86c3ad0f)
>  added support for this, so this is basically just pulling that commit in and 
> adding tests.






[jira] [Commented] (IMPALA-7715) "Impala Conditional Functions" documentation errata

2018-10-17 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7715?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16654228#comment-16654228
 ] 

ASF subversion and git services commented on IMPALA-7715:
-

Commit a0f351a647b7f48468e6dd3877a16947a746ab9c in impala's branch 
refs/heads/master from [~arodoni_cloudera]
[ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=a0f351a ]

IMPALA-7715: [DOCS] Better descriptions for conditional functions

- Updated the descriptions for ISTRUE, ISFALSE, NONNULLVALUE, NULLVALUE.
- Updated several function names to use caps.

Change-Id: I5cc90d62645730d2674bcb3af614863aa92b92f6
Reviewed-on: http://gerrit.cloudera.org:8080/11704
Tested-by: Impala Public Jenkins 
Reviewed-by: Paul Rogers 
Reviewed-by: Alex Rodoni 


> "Impala Conditional Functions" documentation errata
> ---
>
> Key: IMPALA-7715
> URL: https://issues.apache.org/jira/browse/IMPALA-7715
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Docs
>Affects Versions: Impala 3.0
>Reporter: Paul Rogers
>Assignee: Alex Rodoni
>Priority: Minor
> Fix For: Impala 3.1.0
>
>
> Consider the documentation page [Impala Conditional 
> Functions|https://impala.apache.org/docs/build3x/html/topics/impala_conditional_functions.html].
> Multiple functions have ambiguous descriptions. For example:
> {quote}isfalse(boolean)
> Purpose: Tests if a Boolean expression is false or not. Returns true if so.
> {quote}
> The above is confusing. Read literally, it means: "Returns true if a Boolean 
> expression is false or not," which would make the function always return 
> true. That is not accurate.
> Reword to say: "Returns true if the expression is false. Returns false if the 
> expression is true or NULL."
> Other ambiguous descriptions:
> {quote}
> istrue(boolean)
> Purpose: Tests if a Boolean expression is true or not. Returns true if so.
> {quote}
> Better: "Returns true if an expression is true. Returns false if the 
> expression is false or NULL."
> Others:
> {quote}nonnullvalue(expression)
> Purpose: Tests if an expression (of any type) is NULL or not. Returns false 
> if so.
> {quote}
> Better: "Returns true if the expression is non-null, false if the expression 
> is null. Same as {{expression IS NOT NULL}}."
> {quote}
> nullvalue(expression)
> Purpose: Tests if an expression (of any type) is NULL or not. Returns true if 
> so.
> {quote}
> Better: "Returns true if the expression is NULL, false otherwise. Same as 
> {{expression IS NULL}}"
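
The proposed wording can be checked against a small truth-table model. The
sketch below is Python, not Impala code: it uses None to stand in for SQL
NULL, and the function names simply mirror the SQL functions discussed
above.

```python
from typing import Any, Optional


def istrue(value: Optional[bool]) -> bool:
    """True only for a true input; false for false or NULL (None)."""
    return value is True


def isfalse(value: Optional[bool]) -> bool:
    """True only for a false input; false for true or NULL (None)."""
    return value is False


def nullvalue(value: Any) -> bool:
    """Same as `expression IS NULL`."""
    return value is None


def nonnullvalue(value: Any) -> bool:
    """Same as `expression IS NOT NULL`."""
    return value is not None


for v in (True, False, None):
    print(v, istrue(v), isfalse(v), nullvalue(v), nonnullvalue(v))
```

The point of the model is that none of the four functions ever returns
NULL: a NULL input always maps to a definite true or false.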






[jira] [Commented] (IMPALA-7639) impala-shell stuck at "Starting Impala Shell without Kerberos authentication" in test_multiline_queries_in_history

2018-10-17 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7639?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16654231#comment-16654231
 ] 

ASF subversion and git services commented on IMPALA-7639:
-

Commit 6399a65a00cfb6b48da29acbb0921a360bf3a019 in impala's branch 
refs/heads/master from [~joemcdonnell]
[ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=6399a65 ]

IMPALA-7639: Move concurrent UDF tests to a custom cluster test

Two test_udfs.py tests (test_native_functions_race and
test_concurrent_jar_drop_use) spawn dozens of connections to
test Impala behavior under concurrency. These connections
use up frontend service threads and can cause shell tests
to timeout when trying to connect.

This moves both tests to a new TestUdfConcurrency custom
cluster test. The new custom cluster test uses a larger
fe_service_threads value to allow full concurrency. The
tests run serially and cannot impact other tests.

This also reduces the test dimensions for test_native_functions_race
so that it runs one configuration rather than eight.

Change-Id: I3f255823167a4dd807a07276f630ef02435900a3
Reviewed-on: http://gerrit.cloudera.org:8080/11701
Reviewed-by: Joe McDonnell 
Tested-by: Impala Public Jenkins 


> impala-shell stuck at "Starting Impala Shell without Kerberos authentication" 
> in test_multiline_queries_in_history
> --
>
> Key: IMPALA-7639
> URL: https://issues.apache.org/jira/browse/IMPALA-7639
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Infrastructure
>Affects Versions: Impala 3.1.0
>Reporter: Tianyi Wang
>Assignee: Joe McDonnell
>Priority: Critical
>  Labels: broken-build, flaky
>
> Impala-shell wrote nothing other than "Starting Impala Shell without Kerberos 
> authentication" to stdout until timeout.
> {noformat}
> 03:11:31 [gw3] linux2 -- Python 2.7.12 /home/ubuntu/Impala/bin/../infra/python/env/bin/python
> 03:11:31 shell/test_shell_interactive.py:303: in test_multiline_queries_in_history
> 03:11:31 child_proc.expect(PROMPT_REGEX)
> 03:11:31 ../infra/python/env/local/lib/python2.7/site-packages/pexpect/__init__.py:1451: in expect
> 03:11:31 timeout, searchwindowsize)
> 03:11:31 ../infra/python/env/local/lib/python2.7/site-packages/pexpect/__init__.py:1466: in expect_list
> 03:11:31 timeout, searchwindowsize)
> 03:11:31 ../infra/python/env/local/lib/python2.7/site-packages/pexpect/__init__.py:1568: in expect_loop
> 03:11:31 raise TIMEOUT(str(err) + '\n' + str(self))
> 03:11:31 E   TIMEOUT: Timeout exceeded.
> 03:11:31 E   
> 03:11:31 E   version: 3.3
> 03:11:31 E   command: /home/ubuntu/Impala/bin/impala-shell.sh
> 03:11:31 E   args: ['/home/ubuntu/Impala/bin/impala-shell.sh']
> 03:11:31 E   searcher: 
> 03:11:31 E   buffer (last 100 chars): 'Starting Impala Shell without Kerberos authentication\r\n'
> 03:11:31 E   before (last 100 chars): 'Starting Impala Shell without Kerberos authentication\r\n'
> 03:11:31 E   after: 
> 03:11:31 E   match: None
> 03:11:31 E   match_index: None
> 03:11:31 E   exitstatus: None
> 03:11:31 E   flag_eof: False
> 03:11:31 E   pid: 118020
> 03:11:31 E   child_fd: 18
> 03:11:31 E   closed: False
> 03:11:31 E   timeout: 30
> 03:11:31 E   delimiter: 
> 03:11:31 E   logfile: None
> 03:11:31 E   logfile_read: None
> 03:11:31 E   logfile_send: None
> 03:11:31 E   maxread: 2000
> 03:11:31 E   ignorecase: False
> 03:11:31 E   searchwindowsize: None
> 03:11:31 E   delaybeforesend: 0.05
> 03:11:31 E   delayafterclose: 0.1
> 03:11:31 E   delayafterterminate: 0.1
> {noformat}
> https://jenkins.impala.io/job/ubuntu-16.04-from-scratch/3258/console






[jira] [Commented] (IMPALA-7644) Hide Parquet page index writing with feature flag

2018-10-17 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16654229#comment-16654229
 ] 

ASF subversion and git services commented on IMPALA-7644:
-

Commit de7f09d726240a32739d59fe16faec5792e7c7a3 in impala's branch 
refs/heads/master from Zoltan Borok-Nagy
[ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=de7f09d ]

IMPALA-7644: Hide Parquet page index writing with feature flag

This commit adds the command line flag enable_parquet_page_index_writing
to the Impala daemon, which controls whether Impala writes the
Parquet page index. By default the flag is false, i.e. Impala doesn't
write the page index.

This flag is only temporary; we plan to remove it once Impala is able to
read the page index and has better testing around it.

Because of this change I had to move test_parquet_page_index.py to the
custom_cluster test suite since I need to set this command line flag
in order to test the functionality. I also merged most of the test cases
because we don't want to restart the cluster too many times.

I removed 'num_data_pages_' from BaseColumnWriter since it was rather
confusing and didn't provide any measurable performance improvement.

This commit fixes the ASAN error produced by the first IMPALA-7644
commit which was reverted later.

Change-Id: Ib4a9098a2085a385351477c715ae245d83bf1c72
Reviewed-on: http://gerrit.cloudera.org:8080/11694
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 


> Hide Parquet page index writing with feature flag
> -
>
> Key: IMPALA-7644
> URL: https://issues.apache.org/jira/browse/IMPALA-7644
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Backend
>Affects Versions: Impala 3.1.0
>Reporter: Zoltán Borók-Nagy
>Assignee: Zoltán Borók-Nagy
>Priority: Major
>  Labels: parquet, performance
>
> Currently there is no released Impala version that can write the Parquet page 
> index:
> [https://github.com/apache/parquet-format/blob/master/PageIndex.md]
> However, the current Impala master writes the page index since IMPALA-5842, 
> but cannot read it.
> I think we should hide the write path with a feature flag until Impala is 
> able to read it back and has better test coverage on it.






[jira] [Commented] (IMPALA-7555) impala-shell can hang in connect in certain cases

2018-10-18 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16655905#comment-16655905
 ] 

ASF subversion and git services commented on IMPALA-7555:
-

Commit 2fb8ebaef2beecd511e963fadbb41cbb11add138 in impala's branch 
refs/heads/master from aphadke
[ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=2fb8eba ]

IMPALA-7555: Set socket timeout in impala-shell

impala-shell does not set any socket timeout while connecting to the
impala server. This change sets a timeout on the socket before
connecting and clears it after connecting successfully. The default
timeout on this socket is 5 sec.
Usage: impala-shell --client_connect_timeout=
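
The approach can be sketched with plain Python sockets. This is an
illustration only, not the impala-shell code: connect_with_timeout and its
handshake parameter are hypothetical names, and the handshake callable
stands in for whatever post-connect call (such as a ping RPC) might block.

```python
import socket
from typing import Callable, Optional


def connect_with_timeout(
    host: str,
    port: int,
    timeout_s: float = 5.0,
    handshake: Optional[Callable[[socket.socket], None]] = None,
) -> socket.socket:
    """Connect with a timeout, then restore blocking mode once connected."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.settimeout(timeout_s)  # bounds both connect() and the handshake
    try:
        sock.connect((host, port))
        if handshake is not None:
            handshake(sock)  # e.g. a ping that would otherwise hang forever
    except Exception:
        sock.close()
        raise
    sock.settimeout(None)  # clear the timeout for the rest of the session
    return sock
```

Keeping the timeout active until the handshake finishes matters because a
peer that accepts the TCP connection but never responds (for example a
bare netcat listener) would not be caught by a timeout on connect() alone.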

Testing:
1. Added a test where I create a random listening socket.
impala-shell (with ssl enabled) connects to this socket and
times out after 2 sec.

2. Created a kerberized impala cluster with ssl enabled and
connected to the impalad using an openssl client (to block the
beeswax server thread from accepting new connections).

E.g. - openssl s_client -connect :21000
Used impala-shell to connect to the same impalad later.
impala-shell timed out after the default of 5 sec. I verified
this manually.

Change-Id: I130fc47f7a83f591918d6842634b4e5787d00813
Reviewed-on: http://gerrit.cloudera.org:8080/11540
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 


> impala-shell can hang in connect in certain cases
> -
>
> Key: IMPALA-7555
> URL: https://issues.apache.org/jira/browse/IMPALA-7555
> Project: IMPALA
>  Issue Type: Bug
>  Components: Clients
>Affects Versions: Impala 3.1.0
>Reporter: Anuj Phadke
>Assignee: Anuj Phadke
>Priority: Major
>
> # Open a random listening port using netcat (nc -l 1234).
> # When I connect through impala-shell to this port, it hangs in the connect 
> call. I think this is similar to impala-shell connecting to a load balancer 
> port and the connection between the LB and impala going down.
> # Connect in impala-shell also calls PingImpalaService, which does more than 
> a normal TCP connect and can hang. Since there is no timeout set on this 
> socket, connect can hang in this call.
> # We should set a timeout on the client socket to prevent hangs during the 
> connect phase of session creation.






[jira] [Commented] (IMPALA-7272) impalad crash when Fatigue test

2018-10-18 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16655908#comment-16655908
 ] 

ASF subversion and git services commented on IMPALA-7272:
-

Commit 9f5c5e6df03824cba292fe5a619153462c11669c in impala's branch 
refs/heads/master from [~twmarshall]
[ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=9f5c5e6 ]

IMPALA-7272: Fix crash in StringMinMaxFilter

StringMinMaxFilter uses a MemPool to allocate space for StringBuffers.
Previously, the MemPool was created by RuntimeFilterBank and passed to
each StringMinMaxFilter. In queries with multiple StringMinMaxFilters
being generated by the same fragment instance, this leads to
overlapping use of the MemPool by different threads, which is
incorrect as MemPools are not thread-safe.

The solution is to have each StringMinMaxFilter create its own
MemPool.

This patch also documents MemPool as not thread-safe and introduces a
DFAKE_MUTEX to help enforce correct usage. Doing this requires
modifying our CMakeLists.txt to pass '-DNDEBUG' to clang only in
RELEASE builds, so that the DFAKE_MUTEX will be present in the
compiled IR for DEBUG builds.
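
The idea behind a "fake mutex" can be shown with a short Python sketch. It
is written in the spirit of DFAKE_MUTEX but is not the macro's actual
implementation: FakeMutex and Pool are hypothetical names, and the check
shown here also rejects re-entry by the same thread.

```python
import threading


class FakeMutex:
    """Debug-only guard: never blocks, only detects overlapping entry."""

    def __init__(self) -> None:
        self._guard = threading.Lock()  # protects _owner itself
        self._owner = None

    def __enter__(self) -> "FakeMutex":
        with self._guard:
            if self._owner is not None:
                raise AssertionError("overlapping use of a non-thread-safe object")
            self._owner = threading.get_ident()
        return self

    def __exit__(self, exc_type, exc, tb) -> bool:
        with self._guard:
            self._owner = None
        return False  # never swallow exceptions


class Pool:
    """Toy allocator that is documented as not thread-safe."""

    def __init__(self) -> None:
        self._check = FakeMutex()
        self.allocated = 0

    def allocate(self, size: int) -> int:
        with self._check:  # fails loudly if two callers overlap
            self.allocated += size
            return self.allocated
```

A guard like this does not make the pool safe to share; it only turns a
silent data race into an immediate failure, which is why giving each
filter its own pool is still the actual fix.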

Testing:
- I have been unable to repro the actual crash despite trying a large
  variety of different things. However, with additional logging added
  it's clear that the MemPool is being used concurrently, which is
  incorrect.
- Added an e2e test that covers the potential issue. It hits the
  DFAKE_MUTEX with a sleep added to MemPool::Allocate.
- Ran a full exhaustive build in both DEBUG and RELEASE.

Change-Id: I751cad7e6b75c9d95d7ad029bbd1e52fe09e8a29
Reviewed-on: http://gerrit.cloudera.org:8080/11650
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 


> impalad   crash when Fatigue test
> -
>
> Key: IMPALA-7272
> URL: https://issues.apache.org/jira/browse/IMPALA-7272
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 2.11.0, Impala 2.12.0
> Environment: apache  branch  
> [329979d6fb0caa0dc449d7e0aa75460c30e868f0]
> centos 6.5
>  ./buildall.sh -skiptests -noclean -asan
>Reporter: yyzzjj
>Assignee: Thomas Tauber-Marshall
>Priority: Blocker
>  Labels: crash
> Attachments: e4386102-833c-40bb-4eec10b2-827c76be.dmp, 
> impalad_node0.ERROR, impalad_node0.WARNING, testing_impala.sh
>
>
> (gdb) bt
> #0 0x003269832635 in raise () from /lib64/libc.so.6
> #1 0x003269833e15 in abort () from /lib64/libc.so.6
> #2 0x04010f64 in google::DumpStackTraceAndExit() ()
> #3 0x040079dd in google::LogMessage::Fail() ()
> #4 0x04009282 in google::LogMessage::SendToLog() ()
> #5 0x040073b7 in google::LogMessage::Flush() ()
> #6 0x0400a97e in google::LogMessageFatal::~LogMessageFatal() ()
> #7 0x01a2dfab in impala::MemPool::CheckIntegrity (this=0x5916e1f8, 
> check_current_chunk_empty=true)
>  at /export/ldb/online/kudu_rpc_branch/be/src/runtime/mem-pool.cc:258
> #8 0x01a2cf56 in impala::MemPool::FindChunk (this=0x5916e1f8, 
> min_size=10, check_limits=true) at 
> /export/ldb/online/kudu_rpc_branch/be/src/runtime/mem-pool.cc:158
> #9 0x01a3dd1b in impala::MemPool::Allocate (alignment=8, 
> size=10, this=0x5916e1f8) at 
> /export/ldb/online/kudu_rpc_branch/be/src/runtime/mem-pool.h:273
> #10 impala::MemPool::TryAllocate (this=0x5916e1f8, size=10) at 
> /export/ldb/online/kudu_rpc_branch/be/src/runtime/mem-pool.h:109
> #11 0x01caefb8 in impala::StringBuffer::GrowBuffer 
> (this=0x7f90d9489c28, new_size=10) at 
> /export/ldb/online/kudu_rpc_branch/be/src/runtime/string-buffer.h:85
> #12 0x01caee18 in impala::StringBuffer::Append (this=0x7f90d9489c28, 
> str=0x7f92cda6e039 "1104700843don...@jd.com业务运营部\230\340\246͒\177", 
> str_len=10)
>  at /export/ldb/online/kudu_rpc_branch/be/src/runtime/string-buffer.h:53
> #13 0x01cac864 in impala::StringMinMaxFilter::CopyToBuffer 
> (this=0x7f90d9489c00, buffer=0x7f90d9489c28, value=0x7f90d9489c08, len=10)
>  at /export/ldb/online/kudu_rpc_branch/be/src/util/min-max-filter.cc:304
> #14 0x01cac2a9 in impala::StringMinMaxFilter::MaterializeValues 
> (this=0x7f90d9489c00) at 
> /export/ldb/online/kudu_rpc_branch/be/src/util/min-max-filter.cc:229
> #15 0x02b9641a in impala::FilterContext::MaterializeValues 
> (this=0x61cc0b70) at 
> /export/ldb/online/kudu_rpc_branch/be/src/exec/filter-context.cc:97
> #16 0x7f93fdb9440e in ?? ()
> #17 0x7f90a97f5400 in ?? ()
> #18 0x2acd2bba01a2e0f7 in ?? ()
> #19 0x5916e140 in ?? ()
> #20 0x7f930c34d740 in ?? ()
> #21 0x7f90a97f5220 in ?? ()
> #22 0x66aa77bb66aa77bb in ?? ()
> #23 0x61cc0b70 in ?? ()
> #24 0x61cc0b70 in ?? ()
> #25 0x61cc0b98 in ?? ()
> #26 0x00

[jira] [Commented] (IMPALA-5843) Use page index in Parquet files to skip pages

2018-10-18 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-5843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16655907#comment-16655907
 ] 

ASF subversion and git services commented on IMPALA-5843:
-

Commit 48fb4902d4f28c9e4d327b80b00b33962c118c22 in impala's branch 
refs/heads/master from Zoltan Borok-Nagy
[ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=48fb490 ]

IMPALA-7543: Enhance scan ranges to support sub-ranges

This commit enhances the ScanRange class to make it possible to only
read some smaller parts of the whole ScanRange. This functionality
is needed by IMPALA-5843.

A sub-range is an offset and length which is located within the scan
range. Sub-ranges can be added to a scan range when calling
ScanRange::Reset(). If done so, the ScanRange class will only read the
parts defined by the sub-ranges.

If we have sub-ranges for a cache read then the ScanRange won't
enqueue the whole cache buffer (which contains the whole ScanRange),
but memcpy() the sub-ranges to IO/client buffers.

Smaller refactorings that were needed:
 * remove scan_range_offset_ from BufferDescriptor
 * the number of bytes read is tracked by ScanRange again

Testing:
 * introduced CacheReaderTestStub to fake cache reads during testing
 * extended disk-io-mgr-test.cc with sub-ranges

Change-Id: Iea26ba386713990f7671aab5a372cf449b8d51e4
Reviewed-on: http://gerrit.cloudera.org:8080/11520
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 


> Use page index in Parquet files to skip pages
> -
>
> Key: IMPALA-5843
> URL: https://issues.apache.org/jira/browse/IMPALA-5843
> Project: IMPALA
>  Issue Type: New Feature
>  Components: Backend
>Affects Versions: Impala 2.10.0
>Reporter: Lars Volker
>Assignee: Zoltán Borók-Nagy
>Priority: Critical
>  Labels: parquet, performance
>
> Once IMPALA-5842 has been resolved, we should skip pages based on the page 
> index in Parquet files.






[jira] [Commented] (IMPALA-7543) Enhance scan ranges to support sub-ranges

2018-10-18 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7543?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16655906#comment-16655906
 ] 

ASF subversion and git services commented on IMPALA-7543:
-

Commit 48fb4902d4f28c9e4d327b80b00b33962c118c22 in impala's branch 
refs/heads/master from Zoltan Borok-Nagy
[ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=48fb490 ]

IMPALA-7543: Enhance scan ranges to support sub-ranges

This commit enhances the ScanRange class to make it possible to only
read some smaller parts of the whole ScanRange. This functionality
is needed by IMPALA-5843.

A sub-range is an offset and length which is located within the scan
range. Sub-ranges can be added to a scan range when calling
ScanRange::Reset(). If done so, the ScanRange class will only read the
parts defined by the sub-ranges.

If we have sub-ranges for a cache read then the ScanRange won't
enqueue the whole cache buffer (which contains the whole ScanRange),
but memcpy() the sub-ranges to IO/client buffers.
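
The cached-read case can be sketched in a few lines of Python. This is a
simplified model, not the ScanRange implementation: SubRange and
read_sub_ranges are hypothetical names, and the sketch assumes that
sub-range offsets are relative to the start of the scan range.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class SubRange:
    offset: int  # assumed relative to the start of the scan range
    length: int


def read_sub_ranges(cached: bytes, sub_ranges: List[SubRange]) -> bytes:
    """Copy only the requested sub-ranges out of a buffer holding the whole scan range."""
    out = bytearray()
    for sr in sub_ranges:
        end = sr.offset + sr.length
        if sr.offset < 0 or sr.length <= 0 or end > len(cached):
            raise ValueError(f"sub-range {sr} lies outside the scan range")
        out += cached[sr.offset:end]  # stands in for the memcpy() into a client buffer
    return bytes(out)


scan_range = bytes(range(20))
print(read_sub_ranges(scan_range, [SubRange(2, 3), SubRange(10, 2)]))
```

Validating every sub-range against the scan range bounds up front mirrors
the requirement that a sub-range must be located within the scan range.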

Smaller refactorings that were needed:
 * remove scan_range_offset_ from BufferDescriptor
 * the number of bytes read is tracked by ScanRange again

Testing:
 * introduced CacheReaderTestStub to fake cache reads during testing
 * extended disk-io-mgr-test.cc with sub-ranges

Change-Id: Iea26ba386713990f7671aab5a372cf449b8d51e4
Reviewed-on: http://gerrit.cloudera.org:8080/11520
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 


> Enhance scan ranges to support sub-ranges
> -
>
> Key: IMPALA-7543
> URL: https://issues.apache.org/jira/browse/IMPALA-7543
> Project: IMPALA
>  Issue Type: Sub-task
>Reporter: Zoltán Borók-Nagy
>Assignee: Zoltán Borók-Nagy
>Priority: Major
>
> For IMPALA-5843 we need to have smarter scan ranges that only read a list of 
> inner ranges.
> It'll be useful for Parquet files that have page index, so Impala will only 
> read the relevant pages.
> More information can be found in 
> [https://docs.google.com/document/d/1D-el8njq_I-JKd3NDcW1mRXID_n0dBDKIkjWxwULVus]
>  






[jira] [Commented] (IMPALA-7424) Improve in-memory representation of incremental stats

2018-10-18 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7424?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16655904#comment-16655904
 ] 

ASF subversion and git services commented on IMPALA-7424:
-

Commit 5af5456a2d95a43ce63f4e364ff0b9631729bb1a in impala's branch 
refs/heads/master from Bharath Vissapragada
[ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=5af5456 ]

IMPALA-7689: Reduce per column per partition stats estimate size

With the improvements in the incremental stats memory representation
(IMPALA-7424), the per column per partition stats estimate should be
reduced to account for the compressed memory footprint. Doing some
experiments on various test tables, I see the size is down by 50-70%.

This patch reduces the size estimate by 50% (conservative). Ideally we
don't need to estimate on the Catalog server during serialization since
we can compute the byte sizes by looping through all the partitions.
However this patch retains the current logic to keep it consistent with
"compute incremental stats" analysis.

Change-Id: I347b41d9b298d7cd73ec812692172e0511415eee
Reviewed-on: http://gerrit.cloudera.org:8080/11706
Reviewed-by: Bharath Vissapragada 
Tested-by: Impala Public Jenkins 


> Improve in-memory representation of incremental stats
> -
>
> Key: IMPALA-7424
> URL: https://issues.apache.org/jira/browse/IMPALA-7424
> Project: IMPALA
>  Issue Type: Sub-task
>  Components: Catalog
>Affects Versions: Impala 2.13.0, Impala 3.1.0
>Reporter: bharath v
>Assignee: bharath v
>Priority: Major
> Fix For: Impala 3.1.0
>
>
> Incremental stats are stored in the HMS' parameters map as plain Java 
> Strings. This is suboptimal since Java String class internally uses UTF-16 
> encoding for the underlying bytes. The idea here is to switch to a byte array 
> representation so that we can reduce the memory usage by half (8 bits per 
> character instead of 16). We can also compress the byte array using gzip 
> compression and lazily decompress it when needed (typically during the 
> incremental stats computation's finalization phase).
> A prototype of this patch on a real-world Catalog dump showed ~54% JVM heap 
> usage reduction (end-to-end) and ~79% reduction in the heap footprint for the 
> incremental stats.
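
The two ideas in the description above (a one-byte-per-character array instead of UTF-16, plus gzip with lazy decompression) can be sketched as follows. This is a minimal illustration in Python, not Impala's Java implementation; the class name LazyStats and the sample payload are hypothetical.

```python
import gzip


def utf16_size(text: str) -> int:
    """Bytes used when each character is stored as a 2-byte UTF-16 code unit."""
    return len(text.encode("utf-16-le"))


def byte_array_size(text: str) -> int:
    """Bytes used when the same ASCII text is stored one byte per character."""
    return len(text.encode("ascii"))


class LazyStats:
    """Hypothetical holder: keeps stats gzip-compressed, decompresses on first use."""

    def __init__(self, raw: bytes):
        self._compressed = gzip.compress(raw)
        self._cached = None  # filled in only when get() is first called

    @property
    def compressed_size(self) -> int:
        return len(self._compressed)

    def get(self) -> bytes:
        if self._cached is None:
            self._cached = gzip.decompress(self._compressed)
        return self._cached


# Hypothetical, highly repetitive payload standing in for serialized stats.
sample = "0123456789abcdef" * 64
stats = LazyStats(sample.encode("ascii"))
```

How much compression saves depends entirely on the payload; the percentages quoted in the issue come from the reporter's prototype, not from this sketch.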



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Commented] (IMPALA-7689) Improve size estimate for incremental stats

2018-10-18 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16655903#comment-16655903
 ] 

ASF subversion and git services commented on IMPALA-7689:
-

Commit 5af5456a2d95a43ce63f4e364ff0b9631729bb1a in impala's branch 
refs/heads/master from Bharath Vissapragada
[ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=5af5456 ]

IMPALA-7689: Reduce per column per partition stats estimate size

With the improvements in the incremental stats memory representation
(IMPALA-7424), the per column per partition stats estimate should be
reduced to account for the compressed memory footprint. Doing some
experiments on various test tables, I see the size is down by 50-70%.

This patch reduces the size estimate by 50% (conservative). Ideally we
don't need to estimate on the Catalog server during serialization since
we can compute the byte sizes by looping through all the partitions.
However this patch retains the current logic to keep it consistent with
"compute incremental stats" analysis.

Change-Id: I347b41d9b298d7cd73ec812692172e0511415eee
Reviewed-on: http://gerrit.cloudera.org:8080/11706
Reviewed-by: Bharath Vissapragada 
Tested-by: Impala Public Jenkins 


> Improve size estimate for incremental stats
> ---
>
> Key: IMPALA-7689
> URL: https://issues.apache.org/jira/browse/IMPALA-7689
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Catalog
>Affects Versions: Impala 3.1.0
>Reporter: Vuk Ercegovac
>Assignee: bharath v
>Priority: Major
>
> After compressing incremental stats, their size estimate is now too 
> conservative. We use that size estimate to block the functionality (see the 
> corresponding expr in analysis and serialization in catalogd), so without 
> adjusting the estimate, the functionality will be blocked unnecessarily.
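
To see why an overly conservative estimate blocks the feature, here is a small arithmetic sketch. The per-entry byte counts, the limit, and the function names are placeholder assumptions chosen for illustration; only the 50% reduction comes from the commit message.

```python
# Placeholder values: the real constants are not given in this thread.
OLD_BYTES_PER_COLUMN_PER_PARTITION = 400
NEW_BYTES_PER_COLUMN_PER_PARTITION = OLD_BYTES_PER_COLUMN_PER_PARTITION // 2  # 50% cut
SIZE_LIMIT_BYTES = 300_000_000


def estimate_inc_stats_bytes(num_partitions, num_columns, bytes_per_entry):
    """Estimated incremental stats size: one entry per column per partition."""
    return num_partitions * num_columns * bytes_per_entry


def is_blocked(estimated_bytes, limit_bytes=SIZE_LIMIT_BYTES):
    """The operation is refused when the estimate exceeds the limit."""
    return estimated_bytes > limit_bytes


old_estimate = estimate_inc_stats_bytes(10_000, 100, OLD_BYTES_PER_COLUMN_PER_PARTITION)
new_estimate = estimate_inc_stats_bytes(10_000, 100, NEW_BYTES_PER_COLUMN_PER_PARTITION)
```

With these assumed numbers, the same table is blocked under the old estimate but allowed under the halved one.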






[jira] [Commented] (IMPALA-7597) "show partitions" does not retry on InconsistentMetadataFetchException

2018-10-18 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16656111#comment-16656111
 ] 

ASF subversion and git services commented on IMPALA-7597:
-

Commit 5cc49c343f8558602af2663e3bb519da6d9852cc in impala's branch 
refs/heads/master from [~vercego]
[ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=5cc49c3 ]

IMPALA-7597: wraps retries around Frontend metadata operations.

When configured to use the local catalog, concurrent metadata
reads and writes can cause the CatalogMetaProvider to throw
an InconsistentMetadataFetchException. Queries have been wrapped
with a retry loop, but other frontend methods, such as listing
table or partition information, can also fail with the same error.
These errors were seen under a workload consisting of concurrent
adding and showing partitions.

This change wraps call sites (primarily in Frontend.java) that acquire
a Catalog and so can throw an InconsistentMetadataFetchException.

Testing:
- added a test that checks whether inconsistent metadata exceptions
  are seen in a concurrent workload.

Change-Id: I43a21571d54a7966c5c68bea1fe69dbc62be2a0b
Reviewed-on: http://gerrit.cloudera.org:8080/11608
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 


> "show partitions" does not retry on InconsistentMetadataFetchException
> --
>
> Key: IMPALA-7597
> URL: https://issues.apache.org/jira/browse/IMPALA-7597
> Project: IMPALA
>  Issue Type: Bug
>  Components: Frontend
>Affects Versions: Impala 3.1.0
>Reporter: bharath v
>Assignee: Vuk Ercegovac
>Priority: Critical
>
> IMPALA-7530 added retries in case LocalCatalog throws 
> InconsistentMetadataFetchException. These retries apply to all code paths 
> taking {{Frontend#createExecRequest()}}. 
> "show partitions" additionally takes {{Frontend#getTableStats()}} and aborts 
> the first time it sees InconsistentMetadataFetchException. 
> We need to make sure all the queries (especially DDLs) retry if they hit this 
> exception.
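
The retry wrapping described here can be sketched as a generic helper. This is an illustrative Python analogue, not the code in Frontend.java; the exception class, with_retries, and the fake show_partitions call are all hypothetical names.

```python
class InconsistentMetadataFetchError(Exception):
    """Stand-in for the Java InconsistentMetadataFetchException."""


def with_retries(operation, max_attempts=3):
    """Run operation(), retrying when it reports inconsistent metadata."""
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except InconsistentMetadataFetchError:
            if attempt == max_attempts:
                raise  # give up and surface the last failure


def make_flaky_show_partitions(failures):
    """Build a fake 'show partitions' call that fails `failures` times first."""
    state = {"calls": 0}

    def show_partitions():
        state["calls"] += 1
        if state["calls"] <= failures:
            raise InconsistentMetadataFetchError("catalog changed mid-read")
        return ["p=1", "p=2"]

    return show_partitions, state
```

Wrapping every call site that acquires a catalog in one such helper keeps the retry policy in a single place instead of scattering try/except blocks.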






[jira] [Commented] (IMPALA-7717) Partition id does not exist exception - Catalog V2

2018-10-19 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7717?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16657111#comment-16657111
 ] 

ASF subversion and git services commented on IMPALA-7717:
-

Commit 8c93a456891587c1add30a08fa8ab395208e0cf1 in impala's branch 
refs/heads/master from Bharath Vissapragada
[ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=8c93a45 ]

IMPALA-7717: Handle concurrent partition changes in local catalog mode

Current code throws a RuntimeException (RTE) when a partial fetch RPC
looks up partition metadata and the corresponding partition ID is
missing on the Catalog server. There are a couple of cases here.

1. The partition could be genuinely missing as it was dropped by a
   concurrent operation.
2. Partial fetch RPCs look up partitions by IDs instead of names. This is
   problematic since the IDs can change over the lifetime of a table.

In both cases, throwing an RTE is not the right approach, and for (2)
we need to transparently retry the fetch with the new partition ID.

We eventually need to fix (2) as looking up by partition ID is not the
right approach.
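
A toy model of the two cases above, written in Python for illustration only. ToyCatalogTable, fetch_partition, and the exception class are hypothetical; the real logic lives in Impala's Java catalog code and is not shown in this message.

```python
class InconsistentMetadataFetchError(Exception):
    """Stand-in for the retryable inconsistent-metadata exception."""


class ToyCatalogTable:
    """Toy server-side table whose partition IDs change on every reload."""

    def __init__(self, partition_names):
        self._next_id = 0
        self._by_id = {}
        self.reload(partition_names)

    def reload(self, partition_names):
        # Simulates an invalidate: every partition gets a fresh ID.
        self._by_id = {}
        for name in partition_names:
            self._by_id[self._next_id] = name
            self._next_id += 1

    def ids_by_name(self):
        return {name: part_id for part_id, name in self._by_id.items()}

    def get_partition(self, part_id):
        if part_id not in self._by_id:
            # Retryable error instead of a generic runtime exception.
            raise InconsistentMetadataFetchError(
                f"Partition id {part_id} does not exist")
        return self._by_id[part_id]


def fetch_partition(table, cached_ids, name):
    """Fetch by cached ID; on a stale ID, refresh the name-to-ID map and retry once."""
    try:
        return table.get_partition(cached_ids[name])
    except InconsistentMetadataFetchError:
        cached_ids.clear()
        cached_ids.update(table.ids_by_name())
        if name not in cached_ids:
            raise  # case 1: the partition was genuinely dropped
        return table.get_partition(cached_ids[name])
```

The sketch re-resolves IDs through partition names, which is also why the commit message calls name-based lookup the better long-term approach.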

Testing: Updated an e-e test which fails without the patch.

Change-Id: I2aa103ee159ce9478af9b5b27b36bc0cc286f442
Reviewed-on: http://gerrit.cloudera.org:8080/11732
Reviewed-by: Bharath Vissapragada 
Tested-by: Impala Public Jenkins 


> Partition id does not exist exception - Catalog V2
> --
>
> Key: IMPALA-7717
> URL: https://issues.apache.org/jira/browse/IMPALA-7717
> Project: IMPALA
>  Issue Type: Bug
>  Components: Catalog
>Reporter: bharath v
>Assignee: bharath v
>Priority: Critical
> Attachments: IMPALA-7717-repro.patch
>
>
> Concurrent invalidates with partial RPC on partitioned tables can throw this 
> exception.
> {noformat}
> I1016 15:49:03.438048 30197 jni-util.cc:256] 
> java.lang.IllegalArgumentException: Partition id 162 does not exist
>   at 
> com.google.common.base.Preconditions.checkArgument(Preconditions.java:119)
>   at org.apache.impala.catalog.HdfsTable.getPartialInfo(HdfsTable.java:1711)
>   at 
> org.apache.impala.catalog.CatalogServiceCatalog.doGetPartialCatalogObject(CatalogServiceCatalog.java:2202)
>   at 
> org.apache.impala.catalog.CatalogServiceCatalog.getPartialCatalogObject(CatalogServiceCatalog.java:2141)
>   at 
> org.apache.impala.service.JniCatalog.getPartialCatalogObject(JniCatalog.java:237)
> I1016 15:49:03.440939 30197 status.cc:129] IllegalArgumentException: 
> Partition id 162 does not exist
> {noformat}
> {noformat}
>   @Override
>   public TGetPartialCatalogObjectResponse getPartialInfo(
>       TGetPartialCatalogObjectRequest req) throws TableLoadingException {
> 
>     if (partIds != null) {
>       resp.table_info.partitions = Lists.newArrayListWithCapacity(partIds.size());
>       for (long partId : partIds) {
>         HdfsPartition part = partitionMap_.get(partId);
>         Preconditions.checkArgument(part != null, "Partition id %s does not exist",
>             partId); <
> {noformat}
> The issue is that the invalidate command can reset the partition IDs and the 
> RPCs could look up with older IDs. 
> We should wrap this into an inconsistent metadata fetch exception and retry 
> rather than throwing an RTE.






[jira] [Commented] (IMPALA-7729) Invalidate metadata hangs when there is an upper case role name

2018-10-19 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16657761#comment-16657761
 ] 

ASF subversion and git services commented on IMPALA-7729:
-

Commit 072f3ee9045d62cceb23f1f416f3052e0024cdcd in impala's branch 
refs/heads/master from [~fredyw]
[ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=072f3ee ]

IMPALA-7729: Fix invalidate metadata hang when there is an upper case role name

Sentry stores the role names in lower case and Impala stores the role
names based on the original input role names. IMPALA-7343 introduced
a new bulk API (listAllRolesPrivileges) from Sentry that returns a map
of role name to a set of privileges. Since Impala preserves the case
of the original input role names, looking up a set of privileges by a
role name stored in Impala fails when that name differs in case from
the one returned by listAllRolesPrivileges. Privileges with mismatched
role names are then never refreshed in Catalogd, which causes Impalad
to wait indefinitely for Catalogd to update them. The fix is to get
the set of privileges using the role names returned by Sentry's
listAllRoles instead of the role names stored in Impala.

Testing:
- Added a new E2E test
- Ran all E2E authorization tests

Change-Id: I5aa6f626ad3df4e9321ed18273d045517bc099c2
Reviewed-on: http://gerrit.cloudera.org:8080/11734
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 


> Invalidate metadata hangs when there is an upper case role name
> ---
>
> Key: IMPALA-7729
> URL: https://issues.apache.org/jira/browse/IMPALA-7729
> Project: IMPALA
>  Issue Type: Bug
>  Components: Catalog
>Affects Versions: Impala 3.1.0
>Reporter: Fredy Wijaya
>Assignee: Fredy Wijaya
>Priority: Blocker
>
> {noformat}
> [localhost:21000] default> create role FOO;
> [localhost:21000] default> grant all on server to role FOO;
> [localhost:21000] default> grant role FOO to group test_group;
> [localhost:21000] default> invalidate metadata; -- this will hang
> {noformat}
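
The case mismatch behind this hang can be reproduced with two plain dictionaries. This Python sketch only models the lookup; the function names and the privilege string are hypothetical, and the dictionaries stand in for Sentry's bulk response and Impala's role list.

```python
# Sentry-style bulk response: role names are stored in lower case.
sentry_privileges = {"foo": {"ALL ON SERVER"}}

# Impala-style role list: names keep the case the user typed.
impala_role_names = ["FOO"]


def refresh_by_impala_names(impala_roles, sentry_privs):
    """Buggy lookup: keys by Impala's names, so 'FOO' never matches 'foo'."""
    return {role: sentry_privs[role] for role in impala_roles if role in sentry_privs}


def refresh_by_sentry_names(sentry_role_names, sentry_privs):
    """Fixed lookup: keys by the names Sentry itself returned."""
    return {role: sentry_privs.get(role, set()) for role in sentry_role_names}


buggy = refresh_by_impala_names(impala_role_names, sentry_privileges)
fixed = refresh_by_sentry_names(list(sentry_privileges), sentry_privileges)
```

In the buggy variant the privilege set for the upper-case role is silently skipped, which matches the symptom of a refresh that never completes.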






[jira] [Commented] (IMPALA-7697) Query gets erased before the client gets a chance to call get_log to fetch the error string

2018-10-19 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16657766#comment-16657766
 ] 

ASF subversion and git services commented on IMPALA-7697:
-

Commit 0340a153ceed2ac6897569faf158e357f8f628df in impala's branch 
refs/heads/master from [~bikram.sngh91]
[ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=0340a15 ]

IMPALA-7697: Fix flakiness in test_resource_limits

This patch fixes one of the tests in test_resource_limits that expects a
query to run for more than 2 seconds but currently fails because it
sometimes completes earlier than that.

Change-Id: I2ba7080f62f0af3e16ef6c304463ebf78dec1b0c
Reviewed-on: http://gerrit.cloudera.org:8080/11741
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 


> Query gets erased before the client gets a chance to call get_log to fetch 
> the error string
> ---
>
> Key: IMPALA-7697
> URL: https://issues.apache.org/jira/browse/IMPALA-7697
> Project: IMPALA
>  Issue Type: Bug
>Reporter: Pooja Nilangekar
>Assignee: Bikramjeet Vig
>Priority: Blocker
>  Labels: broken-build, flaky-test
>
> In a recent build, certain queries went missing from the 
> ImpalaServer::query_log_index_ before the client gets a chance to call 
> get_log to fetch the error string. Hence the test case failed. An easy 
> (intermediate) fix would be to increase FLAGS_query_log_size.
> Here are the test logs:
> {code:java}
> Error Message
> query_test/test_resource_limits.py:45: in test_resource_limits 
> self.run_test_case('QueryTest/query-resource-limits', vector) 
> common/impala_test_suite.py:478: in run_test_case assert False, "Expected 
> exception: %s" % expected_str E   AssertionError: Expected exception: 
> row_regex:.*expired due to execution time limit of 2s000ms.*
> Stacktrace
> query_test/test_resource_limits.py:45: in test_resource_limits
> self.run_test_case('QueryTest/query-resource-limits', vector)
> common/impala_test_suite.py:478: in run_test_case
> assert False, "Expected exception: %s" % expected_str
> E   AssertionError: Expected exception: row_regex:.*expired due to execution 
> time limit of 2s000ms.*
> Standard Error
> -- executing against localhost:21000
> SET SCAN_BYTES_LIMIT="0";
> -- 2018-10-10 22:38:29,826 INFO MainThread: Started query 
> 8e45a13bc999749e:58175e16
> {code}
> Here are the impalad logs:
> {code:java}
> impalad.INFO.20181010-191824.5460:I1010 22:38:29.825745 31580 
> impala-server.cc:1060] Registered query 
> query_id=8e45a13bc999749e:58175e16 
> session_id=43434de5f83010f9:7e0750ad7ad86b80
> impalad.INFO.20181010-191824.5460:I1010 22:38:29.826026 31580 
> impala-server.cc:1115] Query 8e45a13bc999749e:58175e16 has scan bytes 
> limit of 100.00 GB
> impalad.INFO.20181010-191824.5460:I1010 22:38:29.826328 31580 
> impala-beeswax-server.cc:197] get_results_metadata(): 
> query_id=8e45a13bc999749e:58175e16
> impalad.INFO.20181010-191824.5460:I1010 22:38:29.826584 31580 
> impala-server.cc:776] Query id 8e45a13bc999749e:58175e16 not found.
> impalad.INFO.20181010-191824.5460:I1010 22:38:29.826858 31580 
> impala-beeswax-server.cc:239] close(): 
> query_id=8e45a13bc999749e:58175e16
> impalad.INFO.20181010-191824.5460:I1010 22:38:29.826861 31580 
> impala-server.cc:1127] UnregisterQuery(): 
> query_id=8e45a13bc999749e:58175e16
> impalad.INFO.20181010-191824.5460:I1010 22:38:29.826864 31580 
> impala-server.cc:1238] Cancel(): query_id=8e45a13bc999749e:58175e16
> {code}
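
Why a larger query log size helps can be shown with a toy bounded log. This is a hedged Python sketch: BoundedQueryLog is a hypothetical stand-in, and it does not reproduce the actual data structure behind ImpalaServer::query_log_index_.

```python
from collections import OrderedDict


class BoundedQueryLog:
    """Hypothetical fixed-capacity log of finished queries, oldest evicted first."""

    def __init__(self, max_size):
        self._max_size = max_size
        self._entries = OrderedDict()

    def archive(self, query_id, error_log):
        self._entries[query_id] = error_log
        while len(self._entries) > self._max_size:
            self._entries.popitem(last=False)  # evict the oldest entry

    def get_log(self, query_id):
        """Return the stored error string, or None if the entry was evicted."""
        return self._entries.get(query_id)


small = BoundedQueryLog(max_size=2)
large = BoundedQueryLog(max_size=25)
for qid in ["q1", "q2", "q3"]:
    small.archive(qid, "expired due to execution time limit")
    large.archive(qid, "expired due to execution time limit")
```

With the small capacity, the first query is gone before a slow client asks for its log; the larger capacity keeps it available.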






[jira] [Commented] (IMPALA-7616) Refactor PrincipalPrivilege.buildPrivilegeName

2018-10-19 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16657764#comment-16657764
 ] 

ASF subversion and git services commented on IMPALA-7616:
-

Commit bad49e73632f64a386ad1154201f99137af864d8 in impala's branch 
refs/heads/master from [~fredyw]
[ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=bad49e7 ]

IMPALA-7721: Fix broken /catalog_object web API when getting a privilege

Before this patch, /catalog_object web API was broken when getting a
privilege due to an incorrect way of getting a role ID. IMPALA-7616
broke this even more due to a lack of test coverage in /catalog_object
when authorization is enabled. This patch fixes the issue and makes the
/catalog_object web API usable again for getting a privilege.

Testing:
- Added a new BE test
- Added a new E2E test
- Ran all E2E authorization tests

Change-Id: I525149d113a1437c1e1493ad3c25a755e370b0c7
Reviewed-on: http://gerrit.cloudera.org:8080/11721
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 


> Refactor PrincipalPrivilege.buildPrivilegeName
> --
>
> Key: IMPALA-7616
> URL: https://issues.apache.org/jira/browse/IMPALA-7616
> Project: IMPALA
>  Issue Type: Bug
>  Components: Frontend
>Affects Versions: Impala 3.1.0
>Reporter: Adam Holley
>Assignee: Fredy Wijaya
>Priority: Minor
> Fix For: Impala 3.1.0
>
>
> The buildPrivilegeName pattern across the frontend code is odd in that 
> setting the name is an explicit function and not built during the get from 
> the constituent parts.  e.g. If you create a privilege that doesn't have the 
> grant option set, and then set the grant option after, the getPrivilegeName() 
> will return a name that does not have the grant option.  This should be 
> refactored to build the name on the getPrivilegeName call based on the 
> current values in the Privilege object.
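
The refactoring proposed above amounts to deriving the name from current field values on every read. A minimal Python sketch, assuming a made-up name format; the Privilege class and its fields here are illustrative and do not mirror Impala's Java class.

```python
class Privilege:
    """Hypothetical privilege whose name is computed on access, never cached."""

    def __init__(self, scope, action, has_grant_option=False):
        self.scope = scope
        self.action = action
        self.has_grant_option = has_grant_option

    @property
    def name(self):
        # Built from the current field values, so it cannot go stale.
        grant = "true" if self.has_grant_option else "false"
        return f"{self.scope}->action={self.action}->grantoption={grant}"


priv = Privilege("server=server1", "all")
name_before = priv.name
priv.has_grant_option = True
name_after = priv.name
```

Because there is no separate "build name" step, setting the grant option after construction is reflected in the name immediately.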






[jira] [Commented] (IMPALA-7681) Support new URI scheme for ADLS Gen2

2018-10-19 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7681?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16657767#comment-16657767
 ] 

ASF subversion and git services commented on IMPALA-7681:
-

Commit 7a022cf36a2c678dcff02d48db0641e6f74f068f in impala's branch 
refs/heads/master from [~mackrorysd]
[ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=7a022cf ]

IMPALA-7681. Add Azure Blob File System (ADLS Gen2) support.

HADOOP-15407 adds a new FileSystem implementation called "ABFS" for the
ADLS Gen2 service. It's in the hadoop-azure module as a replacement for
WASB. Filesystem semantics should be the same, so skipped tests and
other behavior changes have simply mirrored what is done for ADLS Gen1
by default. Tests skipped on ADLS Gen1 due to eventual consistency of
the Python client can be run against ADLS Gen2.

Change-Id: I5120b071760e7655e78902dce8483f8f54de445d
Reviewed-on: http://gerrit.cloudera.org:8080/11630
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 


> Support new URI scheme for ADLS Gen2
> 
>
> Key: IMPALA-7681
> URL: https://issues.apache.org/jira/browse/IMPALA-7681
> Project: IMPALA
>  Issue Type: Bug
>Reporter: Sean Mackrory
>Assignee: Sean Mackrory
>Priority: Major
> Fix For: Impala 3.1.0
>
>
> HADOOP-15407 recently added a new FileSystem implementation called "ABFS" for 
> the ADLS Gen2 service. Instead of being in the hadoop-azure-datalake module, 
> it's in the hadoop-azure module as a replacement for WASB.
> It should have pretty much the same filesystem semantics as ADLS, but URIs 
> are configured separately, so we'll need a new function to pick it up, even 
> if we treat it the same.
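
A scheme check of the kind the description asks for might look like the sketch below. It is written in Python for illustration; the helper names are hypothetical, and the sample account and container names are made up. The scheme strings "adl", "abfs" and "abfss" are the ones used by the Hadoop ADLS Gen1 and ABFS connectors.

```python
from urllib.parse import urlparse

ADLS_GEN1_SCHEMES = {"adl"}
ABFS_SCHEMES = {"abfs", "abfss"}  # plain and TLS variants


def is_adls_gen1_path(uri):
    """True if the URI uses the ADLS Gen1 scheme."""
    return urlparse(uri).scheme in ADLS_GEN1_SCHEMES


def is_abfs_path(uri):
    """True if the URI uses one of the ABFS (ADLS Gen2) schemes."""
    return urlparse(uri).scheme in ABFS_SCHEMES


def is_azure_data_lake_path(uri):
    """Treat both generations alike where filesystem semantics match."""
    return is_adls_gen1_path(uri) or is_abfs_path(uri)
```

Keeping a separate predicate per scheme lets callers share behavior by default while still special-casing one generation, such as the tests that are skipped only on Gen1.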






[jira] [Commented] (IMPALA-7699) TestSpillingNoDebugActionDimensions fails earlier than expected

2018-10-19 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16657765#comment-16657765
 ] 

ASF subversion and git services commented on IMPALA-7699:
-

Commit 77c56a805abf23db27db493ed12af965e515428d in impala's branch 
refs/heads/master from [~bikram.sngh91]
[ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=77c56a8 ]

IMPALA-7699: Fix spilling test run with hdfs erasure coding turned on

A spilling test, when run on a test build with HDFS erasure coding turned
on, hits an out-of-memory error in the HDFS scan node. This happens
because the test is tuned for a regular 3-node minicluster without
HDFS erasure coding. The fix is simply to increase the memory limit on
the test to accommodate this difference, yet keep it small enough to
achieve the desired spilling in the hash join node.

Testing:
Ran it on an EC enabled minicluster to make sure it works

Change-Id: I207569822ba7388e78936d25e2311fa09c7a1b9a
Reviewed-on: http://gerrit.cloudera.org:8080/11740
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 


> TestSpillingNoDebugActionDimensions fails earlier than expected 
> 
>
> Key: IMPALA-7699
> URL: https://issues.apache.org/jira/browse/IMPALA-7699
> Project: IMPALA
>  Issue Type: Bug
>Reporter: Pooja Nilangekar
>Assignee: Bikramjeet Vig
>Priority: Critical
>  Labels: broken-build
>
> In some recent runs, the query fails for lack of memory as expected, 
> but the failure occurs in the HDFS scan node rather than the hash join node. 
> Here are the corresponding logs: 
> Stacktrace:
> {code:java}
> query_test/test_spilling.py:113: in test_spilling_no_debug_action 
> self.run_test_case('QueryTest/spilling-no-debug-action', vector) 
> common/impala_test_suite.py:466: in run_test_case 
> self.__verify_exceptions(test_section['CATCH'], str(e), use_db) 
> common/impala_test_suite.py:319: in __verify_exceptions (expected_str, 
> actual_str) E AssertionError: Unexpected exception string. Expected: 
> row_regex:.*Cannot perform hash join at node with id .*. Repartitioning did 
> not reduce the size of a spilled partition.* E Not found in actual: 
> ImpalaBeeswaxException: Query aborted:Memory limit exceeded: Failed to 
> allocate tuple bufferHDFS_SCAN_NODE (id=1) could not allocate 190.00 KB 
> without exceeding limit.Error occurred on backend localhost:22001 by fragment 
> 2e4f0f944d373848:9ae1d7e20002
> {code}
>  
> Here are the impalad logs: 
> {code:java}
> I1010 18:31:30.721693  7270 coordinator.cc:498] ExecState: query 
> id=2e4f0f944d373848:9ae1d7e2 
> finstance=2e4f0f944d373848:9ae1d7e20002 on host=localhost:22001 
> (EXECUTING -> ERROR) status=Memory limit exceeded: Failed to allocate tuple 
> buffer
> HDFS_SCAN_NODE (id=1) could not allocate 190.00 KB without exceeding limit.
> Error occurred on backend localhost:22001 by fragment 
> 2e4f0f944d373848:9ae1d7e20002
> Memory left in process limit: 9.19 GB
> Memory left in query limit: 157.62 KB
> Query(2e4f0f944d373848:9ae1d7e2): Limit=150.00 MB Reservation=117.25 
> MB ReservationLimit=118.00 MB OtherMemory=32.60 MB Total=149.85 MB 
> Peak=149.85 MB
>   Unclaimed reservations: Reservation=5.75 MB OtherMemory=0 Total=5.75 MB 
> Peak=55.75 MB
>   Fragment 2e4f0f944d373848:9ae1d7e20003: Reservation=2.00 MB 
> OtherMemory=22.20 MB Total=24.20 MB Peak=24.20 MB
> Runtime Filter Bank: Reservation=2.00 MB ReservationLimit=2.00 MB 
> OtherMemory=0 Total=2.00 MB Peak=2.00 MB
> SORT_NODE (id=3): Total=0 Peak=0
> HASH_JOIN_NODE (id=2): Total=42.25 KB Peak=42.25 KB
>   Exprs: Total=13.12 KB Peak=13.12 KB
>   Hash Join Builder (join_node_id=2): Total=13.12 KB Peak=13.12 KB
> Hash Join Builder (join_node_id=2) Exprs: Total=13.12 KB Peak=13.12 KB
> HDFS_SCAN_NODE (id=0): Total=0 Peak=0
> EXCHANGE_NODE (id=4): Reservation=18.79 MB OtherMemory=235.89 KB 
> Total=19.02 MB Peak=19.02 MB
>   KrpcDeferredRpcs: Total=235.89 KB Peak=235.89 KB
> KrpcDataStreamSender (dst_id=5): Total=480.00 B Peak=480.00 B
> CodeGen: Total=3.13 MB Peak=3.13 MB
>   Fragment 2e4f0f944d373848:9ae1d7e20002: Reservation=109.50 MB 
> OtherMemory=10.39 MB Total=119.89 MB Peak=119.89 MB
> HDFS_SCAN_NODE (id=1): Reservation=109.50 MB OtherMemory=10.20 MB 
> Total=119.70 MB Peak=119.70 MB
>   Queued Batches: Total=6.12 MB Peak=6.12 MB
> KrpcDataStreamSender (dst_id=4): Total=688.00 B Peak=688.00 B
> CodeGen: Total=488.00 B Peak=51.00 KB
> {code}




[jira] [Commented] (IMPALA-7343) Change Sentry proxy to use the bulk API

2018-10-19 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16657762#comment-16657762
 ] 

ASF subversion and git services commented on IMPALA-7343:
-

Commit 072f3ee9045d62cceb23f1f416f3052e0024cdcd in impala's branch 
refs/heads/master from [~fredyw]
[ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=072f3ee ]

IMPALA-7729: Fix invalidate metadata hang when there is an upper case role name

Sentry stores the role names in lower case and Impala stores the role
names based on the original input role names. IMPALA-7343 introduced
a new bulk API (listAllRolesPrivileges) from Sentry that returns a map
of role name to a set of privileges. Since Impala preserves the case
of the original input role names, looking up a set of privileges by a
role name stored in Impala fails when that name differs in case from
the one returned by listAllRolesPrivileges. Privileges with mismatched
role names are then never refreshed in Catalogd, which causes Impalad
to wait indefinitely for Catalogd to update them. The fix is to get
the set of privileges using the role names returned by Sentry's
listAllRoles instead of the role names stored in Impala.

Testing:
- Added a new E2E test
- Ran all E2E authorization tests

Change-Id: I5aa6f626ad3df4e9321ed18273d045517bc099c2
Reviewed-on: http://gerrit.cloudera.org:8080/11734
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 


> Change Sentry proxy to use the bulk API
> ---
>
> Key: IMPALA-7343
> URL: https://issues.apache.org/jira/browse/IMPALA-7343
> Project: IMPALA
>  Issue Type: Sub-task
>  Components: Frontend
>Affects Versions: Impala 3.0
>Reporter: Adam Holley
>Assignee: Fredy Wijaya
>Priority: Major
>  Labels: security
> Fix For: Impala 3.1.0
>
>
> Currently, Impala makes a thrift call for each role to get privileges.  This 
> will change to get all privileges in one thrift call.






[jira] [Commented] (IMPALA-7721) /catalog_object web API is broken for getting a privilege

2018-10-19 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16657763#comment-16657763
 ] 

ASF subversion and git services commented on IMPALA-7721:
-

Commit bad49e73632f64a386ad1154201f99137af864d8 in impala's branch 
refs/heads/master from [~fredyw]
[ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=bad49e7 ]

IMPALA-7721: Fix broken /catalog_object web API when getting a privilege

Before this patch, /catalog_object web API was broken when getting a
privilege due to an incorrect way of getting a role ID. IMPALA-7616
broke this even more due to a lack of test coverage in /catalog_object
when authorization is enabled. This patch fixes the issue and makes the
/catalog_object web API usable again for getting a privilege.

Testing:
- Added a new BE test
- Added a new E2E test
- Ran all E2E authorization tests

Change-Id: I525149d113a1437c1e1493ad3c25a755e370b0c7
Reviewed-on: http://gerrit.cloudera.org:8080/11721
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 


> /catalog_object web API is broken for getting a privilege
> -
>
> Key: IMPALA-7721
> URL: https://issues.apache.org/jira/browse/IMPALA-7721
> Project: IMPALA
>  Issue Type: Bug
>  Components: Catalog
>Affects Versions: Impala 2.9.0, Impala 3.0
>Reporter: Fredy Wijaya
>Assignee: Fredy Wijaya
>Priority: Major
>
> {noformat}
> [localhost:21000] > show grant role foo_role;
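> +----------+----------+-------+--------+-----+-----------+--------------+-------------------------------+
> | scope    | database | table | column | uri | privilege | grant_option | create_time                   |
> +----------+----------+-------+--------+-----+-----------+--------------+-------------------------------+
> | database | foo      |       |        |     | all       | false        | Wed, Oct 10 2018 10:59:29.495 |
> +----------+----------+-------+--------+-----+-----------+--------------+-------------------------------+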
> {noformat}
> http://localhost:25020/catalog_object?object_type=ROLE&object_name=foo_role
> {noformat}
> TCatalogObject {
>   01: type (i32) = 7,
>   02: catalog_version (i64) = 3,
>   08: role (struct) = TRole {
>     01: role_name (string) = "foo_role",
>     02: role_id (i32) = 2,
>     03: grant_groups (list) = list[0] {
>     },
>   },
> }
> {noformat}
> http://localhost:25020/catalog_object?object_type=PRIVILEGE&object_name=server%3Dserver1-%3Edatabase%3Dfoo_role.2
> {noformat}
> Error: CatalogException: No role associated with ID: 0
> {noformat}
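
The object_name in the URLs above is percent-encoded. A short Python sketch of how such a URL can be assembled; catalog_object_url is a hypothetical helper, and the host, port, and object name are taken from the example in the description.

```python
from urllib.parse import quote


def catalog_object_url(host_port, object_type, object_name):
    """Build a /catalog_object debug URL with percent-encoded query values."""
    return (
        f"http://{host_port}/catalog_object"
        f"?object_type={quote(object_type, safe='')}"
        f"&object_name={quote(object_name, safe='')}"
    )


url = catalog_object_url(
    "localhost:25020", "PRIVILEGE", "server=server1->database=foo_role.2")
```

Encoding with an empty safe set turns "=" into %3D and ">" into %3E, so the privilege name cannot be mistaken for extra query parameters.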






[jira] [Commented] (IMPALA-5031) UBSAN clean and method for testing UBSAN cleanliness

2018-10-23 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-5031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16661150#comment-16661150
 ] 

ASF subversion and git services commented on IMPALA-5031:
-

Commit 1104f6785b44535d1bbd38b338c18fa6febbf2c3 in impala's branch 
refs/heads/master from [~jbapple]
[ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=1104f67 ]

IMPALA-5031: make codegen ubsan available by environment variable

bin/jenkins/all-tests.sh does not support any flags when calling
bootstrap_development.sh, which eventually calls buildall.sh. Since
Jenkins scripts are called non-interactively, the type of build is
usually controlled by an environment variable, but that was not
supported for codegen ubsan. This patch makes that possible under the
name "UBSAN_FULL".

Change-Id: Ifd108f8a56158566d95f4769048bc9ab45bd3514
Reviewed-on: http://gerrit.cloudera.org:8080/11742
Reviewed-by: Jim Apple 
Tested-by: Impala Public Jenkins 


> UBSAN clean and method for testing UBSAN cleanliness
> 
>
> Key: IMPALA-5031
> URL: https://issues.apache.org/jira/browse/IMPALA-5031
> Project: IMPALA
>  Issue Type: Task
>  Components: Backend, Infrastructure
>Affects Versions: Impala 2.9.0
>Reporter: Jim Apple
>Assignee: Jim Apple
>Priority: Minor
>
> http://releases.llvm.org/3.8.0/tools/clang/docs/UndefinedBehaviorSanitizer.html
>  builds are supported after https://gerrit.cloudera.org/#/c/6186/, but 
> Impala's test suite triggers many errors under UBSAN. Those errors should be 
> fixed and then there should be a way to run the test suite under UBSAN and 
> fail if there were any errors detected.






[jira] [Commented] (IMPALA-7432) Impala 3.1 Doc: Add logged_in_user alias for effective_user

2018-10-23 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7432?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16661149#comment-16661149
 ] 

ASF subversion and git services commented on IMPALA-7432:
-

Commit 52c3a89a2272886d4944b8b553a0c095d5a273e8 in impala's branch 
refs/heads/master from [~arodoni_cloudera]
[ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=52c3a89 ]

IMPALA-7432: [DOCS] Document the new LOGGED_IN_USER function

Change-Id: I175e866ac45ad6e760a454fb8994f7cf39f51d2c
Reviewed-on: http://gerrit.cloudera.org:8080/11755
Tested-by: Impala Public Jenkins 
Reviewed-by: Tim Armstrong 


> Impala 3.1 Doc: Add logged_in_user alias for effective_user
> ---
>
> Key: IMPALA-7432
> URL: https://issues.apache.org/jira/browse/IMPALA-7432
> Project: IMPALA
> Issue Type: Sub-task
> Components: Docs
> Reporter: Alex Rodoni
> Assignee: Alex Rodoni
> Priority: Major
> Labels: future_release_doc
> Fix For: Impala 3.1.0
>
>
> https://gerrit.cloudera.org/#/c/11755/






[jira] [Commented] (IMPALA-7668) close() URLClassLoaders after usage.

2018-10-23 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16661427#comment-16661427
 ] 

ASF subversion and git services commented on IMPALA-7668:
-

Commit e0c54b7c8ee05a88cec6f4f7a479971c45271bbb in impala's branch 
refs/heads/master from Bharath Vissapragada
[ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=e0c54b7 ]

IMPALA-7668: Proper clean up of URLClassLoader

Starting with Java 7, URLClassLoader implements a close() method that
closes any files the loader opened, which helps avoid bugs such as fd
leaks inside the class loader. It also makes it possible to load updated
versions of classes that were already loaded by prior instances.

This commit makes sure that the URLClassLoader is close()'d in a few
places in the code.
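
As a minimal sketch of the pattern (not the actual Impala change): since
URLClassLoader implements Closeable from Java 7 on, try-with-resources can
close it once the class has been inspected. The names
UrlClassLoaderCloseSketch and loadAndDescribe, and the parameters, are
hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.file.Path;

// Hypothetical helper, not Impala code: closes a URLClassLoader after use.
final class UrlClassLoaderCloseSketch {
  // Loads className via a loader rooted at jarOrDir and returns the class name.
  // The loader, and any jar files it opened, are closed when the try block exits.
  static String loadAndDescribe(Path jarOrDir, String className) {
    try (URLClassLoader loader =
        new URLClassLoader(new URL[] {jarOrDir.toUri().toURL()})) {
      return loader.loadClass(className).getName();
    } catch (IOException e) {
      // Covers a malformed URL as well as a failure in close().
      throw new UncheckedIOException(e);
    } catch (ClassNotFoundException e) {
      throw new IllegalArgumentException(e);
    }
  }
}
```

Anything that needs the Class object must finish its work inside the try
block, because a closed loader can no longer load new classes or resources.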

Testing: Tricky to automate the tests for this behavior, so no new
tests were added.

Change-Id: I5c5100ef5c5a97d92d94fb68daab622f0aa30158
Reviewed-on: http://gerrit.cloudera.org:8080/11594
Reviewed-by: Bharath Vissapragada 
Tested-by: Impala Public Jenkins 


> close() URLClassLoaders after usage.
> 
>
> Key: IMPALA-7668
> URL: https://issues.apache.org/jira/browse/IMPALA-7668
> Project: IMPALA
> Issue Type: Bug
> Components: Frontend
> Reporter: bharath v
> Assignee: bharath v
> Priority: Major
> Fix For: Impala 3.1.0
>
>
> There are a few places in the code that use URLClassLoaders to load 
> Java classes at runtime. One example is when loading Java UDFs at startup.
> {code:java}
> public static List extractFunctions(String db,
>   ...
>   URL[] classLoaderUrls = new URL[] {new URL(localJarPath.toString())};
>   URLClassLoader urlClassLoader = new URLClassLoader(classLoaderUrls);
> {code}
> Starting with JDK 7, URLClassLoader lets the caller close all the closeables 
> it opened, avoiding bugs such as FD leaks.
> https://docs.oracle.com/javase/7/docs/api/java/net/URLClassLoader.html#close()
> With certain JDK versions we have seen lingering FDs from this code: the FDs 
> of temporary jars (copied to /tmp by this code) are not closed, so their disk 
> space is not reclaimed, causing disk space issues.






[jira] [Commented] (IMPALA-7740) Incorrect doc description for nvl2()

2018-10-24 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16662596#comment-16662596
 ] 

ASF subversion and git services commented on IMPALA-7740:
-

Commit e00c0822abaa12cb7a99b6b78ce3fc25d5cd2e11 in impala's branch 
refs/heads/master from [~arodoni_cloudera]
[ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=e00c082 ]

IMPALA-7739 IMPALA-7740: [DOCS] Correct descriptions of NVL2 and DECODE

- Corrected the return values of the NVL2 function.
- Updated the DECODE section.
- Simplified the examples.

Change-Id: I7f6b9d56e85f7dffeb29218b244af1cc535dc03e
Reviewed-on: http://gerrit.cloudera.org:8080/11758
Reviewed-by: Paul Rogers 
Reviewed-by: Alex Rodoni 
Tested-by: Alex Rodoni 


> Incorrect doc description for nvl2()
> 
>
> Key: IMPALA-7740
> URL: https://issues.apache.org/jira/browse/IMPALA-7740
> Project: IMPALA
> Issue Type: Bug
> Components: Docs
> Affects Versions: Impala 3.0
> Reporter: Paul Rogers
> Assignee: Alex Rodoni
> Priority: Major
> Fix For: Impala 3.1.0
>
>
> Impala offers the NVL2() function from 
> [Oracle|https://docs.oracle.com/cd/B28359_01/olap.111/b28126/dml_functions_2049.htm#OLADM625].
>  A clearer definition is 
> [here|https://www.techonthenet.com/oracle/functions/nvl2.php]:
> {quote}
> The syntax for the NVL2 function in Oracle/PLSQL is:
> {{NVL2( string1, value_if_not_null, value_if_null )}}
> {quote}
> Contrast that with the [Impala 
> description|https://impala.apache.org/docs/build3x/html/topics/impala_conditional_functions.html]:
> {quote}
> Enhanced variant of the nvl() function. Tests an expression and returns 
> different result values depending on whether it is NULL or not. _If the first 
> argument is NULL, returns the second argument. If the first argument is not 
> NULL, returns the third argument._ Equivalent to the nvl2() function from 
> Oracle Database.
> {quote}
> (Emphasis added.) The description is exactly backward. To see this:
> {noformat}
> select n, nvl2(n, 10, 20) from ints;
> +------+---------------------------+
> | n    | if(n is not null, 10, 20) |
> +------+---------------------------+
> | NULL | 20                        |
> | 0    | 10                        |
> +------+---------------------------+
> {noformat}
> Hence, the implementation follows Oracle, the documentation is wrong.
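
To make the argument order concrete, here is a small Java model of the
semantics that the query output shows. It is only an illustration: Nvl2Sketch
and its nvl2 method are hypothetical names, not Impala code.

```java
// Hypothetical model of NVL2 semantics; not Impala's implementation.
final class Nvl2Sketch {
  // Returns ifNotNull when expr is non-null, otherwise ifNull,
  // which is the argument order in the Oracle syntax quoted above.
  static <T, R> R nvl2(T expr, R ifNotNull, R ifNull) {
    return expr != null ? ifNotNull : ifNull;
  }
}
```

With this model, nvl2(null, 10, 20) yields 20 and nvl2(0, 10, 20) yields 10,
the same as the two rows in the query output.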






[jira] [Commented] (IMPALA-7739) Errata in documentation of decode() method

2018-10-24 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16662595#comment-16662595
 ] 

ASF subversion and git services commented on IMPALA-7739:
-

Commit e00c0822abaa12cb7a99b6b78ce3fc25d5cd2e11 in impala's branch 
refs/heads/master from [~arodoni_cloudera]
[ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=e00c082 ]

IMPALA-7739 IMPALA-7740: [DOCS] Correct descriptions of NVL2 and DECODE

- Corrected the return values of the NVL2 function.
- Updated the DECODE section.
- Simplified the examples.

Change-Id: I7f6b9d56e85f7dffeb29218b244af1cc535dc03e
Reviewed-on: http://gerrit.cloudera.org:8080/11758
Reviewed-by: Paul Rogers 
Reviewed-by: Alex Rodoni 
Tested-by: Alex Rodoni 


> Errata in documentation of decode() method
> --
>
> Key: IMPALA-7739
> URL: https://issues.apache.org/jira/browse/IMPALA-7739
> Project: IMPALA
> Issue Type: Improvement
> Components: Docs
> Affects Versions: Impala 3.0
> Reporter: Paul Rogers
> Assignee: Alex Rodoni
> Priority: Minor
> Fix For: Impala 3.1.0
>
>
> Consider the description of {{decode()}} in [the 
> docs|https://impala.apache.org/docs/build3x/html/topics/impala_conditional_functions.html].
>  Original text:
> {quote}
> decode(type expression, type search1, type result1 [, type search2, type 
> result2 ...] [, type default] )
> Purpose: Compares an expression to one or more possible values, and returns a 
> corresponding result when a match is found.
> Return type: same as the initial argument value, except that integer values 
> are promoted to BIGINT and floating-point values are promoted to DOUBLE; use 
> CAST() when inserting into a smaller numeric column
> Usage notes:
> * Can be used as shorthand for a CASE expression.
> * The original expression and the search expressions must of the same type or 
> convertible types. * The result expression can be a different type, but all 
> result expressions must be of the same type.
> * Returns a successful match If the original expression is NULL and a search 
> expression is also NULL. the
> * Returns NULL if the final default value is omitted and none of the search 
> expressions match the original expression.
> {quote}
> Revise:
> * Remove the “type” prefix for arguments (here and in all functions). It adds 
> no value and is actually confusing because the argument is not a type, it is 
> a value.
> * Usage notes: Comparison is done using the IS NOT DISTINCT FROM (<=>) 
> operator; thus NULL can be used as a search condition.
> * Since this is a list, add a break between the two items in the second item 
> in usage.
> * The third item has a dangling “the” at the end.
> * Reword the fourth item: “If none of the search expressions match the 
> expression, then returns the default (if given) or NULL (if no default is 
> given).”
> Suggested revised text:
> Purpose: Compares an expression to one or more possible values using the IS 
> NOT DISTINCT FROM operator, and returns a corresponding result when a match 
> is found.
> Usage notes:
> * Can be used as shorthand for a CASE expression.
> * The expression and the search expressions must be of the same type or 
> convertible types.
> * The result expression can be a different type, but all result expressions 
> must be of the same type.
> * Returns a successful match if the original expression is NULL and a search 
> expression is also NULL. (Uses IS NOT DISTINCT FROM to do the comparison.)
> * NULL can be used as a search expression.
> * Returns NULL if the final default value is omitted and none of the search 
> expressions match the original expression.
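
The matching rules in the usage notes can be modelled in a few lines of Java.
This is a rough sketch under stated assumptions, not Impala's implementation:
DecodeSketch and decode are hypothetical names, type conversion between the
expression and the search values is ignored, and Objects.equals stands in for
the null-safe comparison.

```java
import java.util.Objects;

// Hypothetical model of DECODE semantics; not Impala's implementation.
final class DecodeSketch {
  // pairs holds search1, result1, search2, result2, ...; an odd trailing
  // element is the default. Objects.equals treats two nulls as equal,
  // so a null search value matches a null expression.
  static Object decode(Object expr, Object... pairs) {
    int i = 0;
    for (; i + 1 < pairs.length; i += 2) {
      if (Objects.equals(expr, pairs[i])) {
        return pairs[i + 1];
      }
    }
    // i indexes the default if one was supplied; otherwise the result is null.
    return i < pairs.length ? pairs[i] : null;
  }
}
```

For example, decode(null, 1, "one", null, "was null") returns "was null",
while decode(3, 1, "one", 2, "two") returns null because no default is given.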






[jira] [Commented] (IMPALA-7677) multiple count(distinct): Check failed: !hash_partitions_.empty()

2018-10-24 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16662597#comment-16662597
 ] 

ASF subversion and git services commented on IMPALA-7677:
-

Commit 15e8ce4f273945ce548fe677ee0140dea8068e6d in impala's branch 
refs/heads/master from [~twmarshall]
[ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=15e8ce4 ]

IMPALA-7677: Fix DCHECK failure in GroupingAggregator

After inserting all of its input into its Aggregators,
StreamingAggregationNode performs some cleanup, such as calling
InputDone() on each Aggregator.

Previously, StreamingAggregationNode only checked that all of the
child's batches had been fetched before doing this cleanup, which
causes problems if the final child batch isn't processed fully in a
single GetNext() call. In this case, multiple calls to InputDone()
lead to a DCHECK failure.

The solution is to only perform the cleanup once the final child batch
has been fully processed.
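
The condition can be illustrated with a toy Java model. It is a heavily
simplified sketch and not the C++ code in the patch: StreamingAggSketch, its
fields, and the per-call row limit are all hypothetical, and the counter
merely stands in for calling InputDone() on each Aggregator.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical, simplified model of the control flow; not Impala's C++ code.
final class StreamingAggSketch {
  private final Deque<Integer> childBatches = new ArrayDeque<>(); // rows per batch
  private final int maxRowsPerCall;
  private int rowsLeftInBatch = 0;
  private boolean childEos = false;
  int inputDoneCalls = 0; // stands in for InputDone() on each Aggregator

  StreamingAggSketch(int maxRowsPerCall, int... batchSizes) {
    this.maxRowsPerCall = maxRowsPerCall;
    for (int size : batchSizes) childBatches.add(size);
  }

  // Processes at most maxRowsPerCall rows and returns true once all input
  // has been consumed. Callers are expected to stop after it returns true.
  boolean getNext() {
    if (rowsLeftInBatch == 0 && !childEos) {
      Integer next = childBatches.poll();
      rowsLeftInBatch = next == null ? 0 : next;
      childEos = childBatches.isEmpty();
    }
    rowsLeftInBatch -= Math.min(rowsLeftInBatch, maxRowsPerCall);
    // Checking childEos alone would run the cleanup on every call that
    // resumes a partially processed final batch.
    boolean done = childEos && rowsLeftInBatch == 0;
    if (done) inputDoneCalls++;
    return done;
  }
}
```

With a limit of 2 rows per call and batches of 3 and 5 rows, the final batch
spans three calls, yet the cleanup counter is incremented only once.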

Testing:
- Added an e2e test with a query that hits this condition.

Change-Id: I851007a60472d0e53081c076c863c866c516677c
Reviewed-on: http://gerrit.cloudera.org:8080/11626
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 


> multiple count(distinct): Check failed: !hash_partitions_.empty()
> -
>
> Key: IMPALA-7677
> URL: https://issues.apache.org/jira/browse/IMPALA-7677
> Project: IMPALA
> Issue Type: Bug
> Components: Backend
> Affects Versions: Impala 3.1.0
> Reporter: Michael Brown
> Assignee: Thomas Tauber-Marshall
> Priority: Blocker
> Labels: crash, query_generator
>
> The random query generator found a query that crashes an executor node. 
> Multiple COUNT(DISTINCT) looks involved, so assigning to [~twmarshall].
> Query that reproduces the crash:
> {noformat}
> USE tpch;
> SELECT
> COUNT(DISTINCT -475.116242696) AS int_col,
> COALESCE('67', CAST(COUNT(DISTINCT a1.o_totalprice) AS STRING)) AS char_col,
> CAST(COUNT(DISTINCT a1.o_orderkey) AS STRING) AS char_col_1,
> IF(True, '15', CAST(COUNT(DISTINCT False) AS STRING)) AS char_col_2,
> (COUNT(DISTINCT a1.o_orderdate)) / (-899.6051032421) AS decimal_col
> FROM orders a1
> WHERE
> (CAST('1992-11-22 00:00:00' AS TIMESTAMP)) NOT IN (SELECT
> CAST('1992-09-02 00:00:00' AS TIMESTAMP) + INTERVAL IF(False, 744, 
> COUNT(DISTINCT a2.o_custkey)) MINUTE AS timestamp_col
> FROM orders a2
> INNER JOIN orders a3 ON (a2.o_shippriority) = (a3.o_totalprice));
> {noformat}
> {noformat}
> I1008 09:31:15.222935 110602 impala-internal-service.cc:49] 
> ExecQueryFInstances(): query_id=5d4496651cffa066:fc72c69a 
> coord=mikeb-ub162:22000 #instances=6
> I1008 09:31:15.223341 110608 query-state.cc:472] Executing instance. 
> instance_id=5d4496651cffa066:fc72c69a000d fragment_idx=1 
> per_fragment_instance_idx=1 coord_state_idx=1 #in-flight=1
> I1008 09:31:15.224743 110610 query-state.cc:472] Executing instance. 
> instance_id=5d4496651cffa066:fc72c69a000b fragment_idx=2 
> per_fragment_instance_idx=1 coord_state_idx=1 #in-flight=2
> I1008 09:31:15.225250 110616 query-state.cc:472] Executing instance. 
> instance_id=5d4496651cffa066:fc72c69a0008 fragment_idx=4 
> per_fragment_instance_idx=1 coord_state_idx=1 #in-flight=3
> I1008 09:31:15.225797 110619 query-state.cc:472] Executing instance. 
> instance_id=5d4496651cffa066:fc72c69a0006 fragment_idx=5 
> per_fragment_instance_idx=1 coord_state_idx=1 #in-flight=4
> I1008 09:31:15.225960 110620 query-state.cc:472] Executing instance. 
> instance_id=5d4496651cffa066:fc72c69a0002 fragment_idx=6 
> per_fragment_instance_idx=1 coord_state_idx=1 #in-flight=5
> I1008 09:31:15.226035 110622 query-state.cc:472] Executing instance. 
> instance_id=5d4496651cffa066:fc72c69a0004 fragment_idx=7 
> per_fragment_instance_idx=1 coord_state_idx=1 #in-flight=6
> I1008 09:31:16.570009 110622 query-state.cc:480] Instance completed. 
> instance_id=5d4496651cffa066:fc72c69a0004 #in-flight=5 status=OK
> I1008 09:31:16.571622 110602 query-exec-mgr.cc:97] QueryState: 
> query_id=5d4496651cffa066:fc72c69a refcnt=7
> I1008 09:31:16.646488 110654 krpc-data-stream-mgr.cc:294] DeregisterRecvr(): 
> fragment_instance_id=5d4496651cffa066:fc72c69a0006, node=11
> I1008 09:31:17.235110 110619 krpc-data-stream-mgr.cc:294] DeregisterRecvr(): 
> fragment_instance_id=5d4496651cffa066:fc72c69a0006, node=10
> I1008 09:31:17.236071 110620 query-state.cc:480] Instance completed. 
> instance_id=5d4496651cffa066:fc72c69a0002 #in-flight=4 status=OK
> I1008 09:31:17.237010 110616 krpc-data-stream-mgr.cc:294] DeregisterRecvr(): 
> fragment_instance_id=5d4496651cffa066:fc72c69a0008, node=12
> I1008 09:31:17.238026 110619 query-state.cc:480] Instance completed. 
> instance_id=5d449

[jira] [Commented] (IMPALA-5031) UBSAN clean and method for testing UBSAN cleanliness

2018-10-25 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-5031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16664125#comment-16664125
 ] 

ASF subversion and git services commented on IMPALA-5031:
-

Commit 1d72c7584fd33746b17826c1dc54a1819fce9ec5 in impala's branch 
refs/heads/master from [~jbapple]
[ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=1d72c75 ]

IMPALA-5031: Increase backend test timeout under UBSAN_FULL

When codegen is run with UBSAN, backend tests slow down
dramatically. This patch increases the timeout to four hours when
UBSAN_FULL is the build type. This limit is approached by expr-test,
which takes almost three hours under UBSAN_FULL.

Change-Id: I3eee4c2b3affc9d65d86c043fcc382d7248adf3e
Reviewed-on: http://gerrit.cloudera.org:8080/11764
Reviewed-by: Jim Apple 
Tested-by: Impala Public Jenkins 


> UBSAN clean and method for testing UBSAN cleanliness
> 
>
> Key: IMPALA-5031
> URL: https://issues.apache.org/jira/browse/IMPALA-5031
> Project: IMPALA
> Issue Type: Task
> Components: Backend, Infrastructure
> Affects Versions: Impala 2.9.0
> Reporter: Jim Apple
> Assignee: Jim Apple
> Priority: Minor
>
> http://releases.llvm.org/3.8.0/tools/clang/docs/UndefinedBehaviorSanitizer.html
>  builds are supported after https://gerrit.cloudera.org/#/c/6186/, but 
> Impala's test suite triggers many errors under UBSAN. Those errors should be 
> fixed and then there should be a way to run the test suite under UBSAN and 
> fail if there were any errors detected.





