[jira] [Commented] (IMPALA-7758) chars_formats dependent tables are created using the wrong LOCATION

2018-10-26 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16665919#comment-16665919
 ] 

ASF subversion and git services commented on IMPALA-7758:
-

Commit 2e5d65819aaa52e1a89bc5cc212bba3b1b404339 in impala's branch 
refs/heads/master from [~dknupp]
[ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=2e5d658 ]

IMPALA-7758: Fix LOCATION clause when creating chars_formats_*

The current location resolves to /user/hive/warehouse/chars_formats_*.

Impala's test data actually lives at /test-warehouse/chars_formats_*.

Tested this by reloading data from scratch and running the core tests.

Change-Id: I781b484e7a15ccaa5de590563d68b3dca6a658e5
Reviewed-on: http://gerrit.cloudera.org:8080/11789
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 


> chars_formats dependent tables are created using the wrong LOCATION
> ---
>
> Key: IMPALA-7758
> URL: https://issues.apache.org/jira/browse/IMPALA-7758
> Project: IMPALA
>  Issue Type: Bug
>  Components: Infrastructure
>Affects Versions: Impala 3.1.0
>Reporter: David Knupp
>Assignee: David Knupp
>Priority: Major
>
> In testdata/bin/load-dependent-tables.sql, the LOCATION clause when creating 
> the various chars_formats tables (e.g. text) uses:
> {noformat}
> LOCATION '${hiveconf:hive.metastore.warehouse.dir}/chars_formats_text'
> {noformat}
> ...which resolves to {{/user/hive/warehouse/chars_formats_text}}
> However, the actual test warehouse root dir is {{/test-warehouse}}, not 
> {{/user/hive/warehouse}}.
> {noformat}
> $ hdfs dfs -cat /test-warehouse/chars_formats_text/chars-formats.txt
> abcde,88db79c70974e02deb3f01cfdcc5daae2078f21517d1021994f12685c0144addae3ce0dbd6a540b55b88af68486251fa6f0c8f9f94b3b1b4bc64c69714e281f388db79c70974,variable
>  length
> abc 
> ,8d3fffddf79e9a232ffd19f9ccaa4d6b37a6a243dbe0f23137b108a043d9da13121a9b505c804956b22e93c7f93969f4a7ba8ddea45bf4aab0bebc8f814e09918d3fffddf79e,abc
> abcdef,68f8c4575da360c32abb46689e58193a0eeaa905ae6f4a5e6c702a6ae1db35a6f86f8222b7a5489d96eb0466c755b677a64160d074617096a8c6279038bc720468f8c4575da3,b2fe9d4638503a57f93396098f24103a20588631727d0f0b5016715a3f6f2616628f09b1f63b23e484396edf949d9a1c307dbe11f23b971afd75b0f639d8a3f1
> {noformat}
> versus...
> {noformat}
> $ hdfs dfs -cat /user/hive/warehouse/chars_formats_text/chars-formats.txt
> cat: `/user/hive/warehouse/chars_formats_text/chars-formats.txt': No such 
> file or directory
> {noformat}
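
The path mismatch described above can be illustrated with a small sketch. This is not Hive's substitution code: `resolve_location` is a hypothetical helper that expands `${hiveconf:...}` references from a plain dict, and the two warehouse roots are simply the ones quoted in the report.

```python
import re

# Hypothetical stand-in for Hive's variable substitution: expands
# ${hiveconf:NAME} references using values from a plain dict.
_HIVECONF_REF = re.compile(r"\$\{hiveconf:([^}]+)\}")


def resolve_location(template, hiveconf):
    """Return the template with each ${hiveconf:NAME} replaced by hiveconf[NAME]."""
    return _HIVECONF_REF.sub(lambda m: hiveconf[m.group(1)], template)


LOCATION = "${hiveconf:hive.metastore.warehouse.dir}/chars_formats_text"

# Warehouse root the clause resolved to, per the report.
resolved = resolve_location(
    LOCATION, {"hive.metastore.warehouse.dir": "/user/hive/warehouse"})

# Directory where the test data actually lives, per the report.
expected = "/test-warehouse/chars_formats_text"

print(resolved)              # /user/hive/warehouse/chars_formats_text
print(resolved == expected)  # False
```

The table is created successfully either way; the mismatch only shows up when the table is read and no data files are found under the resolved directory.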



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Commented] (IMPALA-7662) test_parquet reads bad_magic_number.parquet without an error

2018-10-26 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16665920#comment-16665920
 ] 

ASF subversion and git services commented on IMPALA-7662:
-

Commit 449fe73d2145bd22f0f857623c3652a097f06d73 in impala's branch 
refs/heads/master from [~tarmstr...@cloudera.com]
[ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=449fe73 ]

IMPALA-7662: fix error race when scanner open fails

This is very similar to IMPALA-7335, except that it happens
when 'progress_' is incremented in the call chain
HdfsScanNode::ProcessSplit
-> HdfsScanNodeBase::CreateAndOpenScanner()
-> HdfsScanner::Close()

The fix required restructuring the code so that
SetDoneInternal() is called with the error *before*
HdfsScanner::Close(). This required a refactoring because
HdfsScanNodeBase doesn't actually know about SetDoneInternal().

My fix is to put the common logic between HdfsScanNode and
HdfsScanNodeMt into a helper in HdfsScanNodeBase, then in
HdfsScanNode, make sure to call SetDoneInternal() before
closing the scanner.

I also reworked HdfsScanNode::ProcessSplit() to handle error propagation
internally. I think the joint responsibility between ProcessSplit() and
its caller for handling errors made things harder than necessary.

Testing:
Added a debug action and test that reproduced the race before the fix.

Change-Id: I45a61210ca7d057b048c77d9f2f2695ec450f19b
Reviewed-on: http://gerrit.cloudera.org:8080/11596
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 


> test_parquet reads bad_magic_number.parquet without an error
> 
>
> Key: IMPALA-7662
> URL: https://issues.apache.org/jira/browse/IMPALA-7662
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 3.1.0
> Environment: Impala ddef2cb9b14e7f8cf9a68a2a382e10a8e0f91c3d 
> exhaustive debug build
>Reporter: Tianyi Wang
>Assignee: Tim Armstrong
>Priority: Critical
>  Labels: correctness
>
> {noformat}
> 09:51:41 === FAILURES 
> ===
> 09:51:41  TestParquet.test_parquet[exec_option: {'batch_size': 0, 
> 'num_nodes': 0, 'disable_codegen_rows_threshold': 0, 'disable_codegen': 
> False, 'abort_on_error': 1, 'debug_action': 
> 'HDFS_SCANNER_THREAD_CHECK_SOFT_MEM_LIMIT:FAIL@0.5', 
> 'exec_single_node_rows_threshold': 0} | table_format: parquet/none] 
> 09:51:41 [gw5] linux2 -- Python 2.7.5 
> /data/jenkins/workspace/impala-asf-master-exhaustive/repos/Impala/bin/../infra/python/env/bin/python
> 09:51:41 query_test/test_scanners.py:300: in test_parquet
> 09:51:41 self.run_test_case('QueryTest/parquet', vector)
> 09:51:41 common/impala_test_suite.py:423: in run_test_case
> 09:51:41 assert False, "Expected exception: %s" % expected_str
> 09:51:41 E   AssertionError: Expected exception: File 
> 'hdfs://localhost:20500/test-warehouse/bad_magic_number_parquet/bad_magic_number.parquet'
>  has an invalid version number: 
> {noformat}
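
The ordering problem described in the commit message can be sketched with a toy model. This is a simplified assumption, not Impala's C++ code: `ToyScanNode` and its methods are hypothetical, the "first status wins" rule and the "completed progress marks the node done with OK" rule are assumed for illustration, and the other thread that observes completed progress is simulated inline so the example is deterministic.

```python
class ToyScanNode:
    """Toy model; names echo the commit message but the logic is assumed."""

    def __init__(self, total_splits):
        self.total = total_splits
        self.progress = 0
        self.done = False
        self.status = None  # None means OK; the first recorded status wins

    def set_done(self, status):
        # Only the first caller decides the final status.
        if not self.done:
            self.done = True
            self.status = status

    def close_scanner(self):
        # Closing a scanner counts its split as finished. In this model,
        # whoever observes completed progress marks the node done with OK.
        self.progress += 1
        if self.progress == self.total:
            self.set_done(None)

    def open_failed_buggy(self, error):
        self.close_scanner()   # progress completes, OK status is recorded
        self.set_done(error)   # too late: the error is dropped

    def open_failed_fixed(self, error):
        self.set_done(error)   # record the error first
        self.close_scanner()   # progress completes, status is unchanged


buggy = ToyScanNode(total_splits=1)
buggy.open_failed_buggy("invalid version number")

fixed = ToyScanNode(total_splits=1)
fixed.open_failed_fixed("invalid version number")

print(buggy.status)  # None
print(fixed.status)  # invalid version number
```

In the buggy ordering the query finishes with an OK status, which matches the test symptom above: an expected exception that never arrives.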






[jira] [Commented] (IMPALA-7710) test_owner_privileges_with_grant failed with AuthorizationException

2018-10-26 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16665917#comment-16665917
 ] 

ASF subversion and git services commented on IMPALA-7710:
-

Commit 8d628d7b62c4903bfd0236d2de150dae0752f0fe in impala's branch 
refs/heads/master from [~aholley]
[ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=8d628d7 ]

IMPALA-7710: test_owner_privileges_with_grant failed with AuthorizationException

The problem was a cache consistency issue between impalad and catalogd.
Because a Sentry refresh was occurring during an update to privileges
from the alter table set owner, impalad had the correct privileges,
which allowed the "show grant role" to succeed but the privileges in
catalogd were being overwritten from the sentry refresh. Added a delay
in the drop call to ensure privileges are updated. This is a
workaround to get the tests to pass with the existing behaviour and
should be reassessed if IMPALA-7763 is implemented. This would add a
lock to possibly prevent this, but will need its own assessment.

Testing:
- Ran custom cluster tests 50 times

Change-Id: I5a1babd3dcbb94ffaa1f3e6ef2cebf1a1d391219
Reviewed-on: http://gerrit.cloudera.org:8080/11786
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 


> test_owner_privileges_with_grant failed with AuthorizationException 
> 
>
> Key: IMPALA-7710
> URL: https://issues.apache.org/jira/browse/IMPALA-7710
> Project: IMPALA
>  Issue Type: Bug
>  Components: Frontend
>Affects Versions: Impala 3.1.0
>Reporter: Michael Ho
>Assignee: Adam Holley
>Priority: Blocker
>  Labels: broken-build
>
> A build with the fix of IMPALA-7633 failed like the following. 
> {noformat}
> authorization.test_owner_privileges.TestOwnerPrivileges.test_owner_privileges_with_grant[exec_option:
>  {'batch_size': 0, 'num_nodes': 0, 'disable_codegen_rows_threshold': 0, 
> 'disable_codegen': False, 'abort_on_error': 1, 
> 'exec_single_node_rows_threshold': 0} | table_format: text/none] (from pytest)
> Failing for the past 1 build (Since Failed#35 )
> Took 1 min 39 sec.
> Error Message
> ImpalaBeeswaxException: ImpalaBeeswaxException:  INNER EXCEPTION: <class 'beeswaxd.ttypes.BeeswaxException'>  MESSAGE: AuthorizationException: User 
> 'oo_user1' does not have privileges to execute 'DROP' on: 
> test_owner_privileges_with_grant_77e49af8.owner_priv_view
> Stacktrace
> authorization/test_owner_privileges.py:165: in 
> test_owner_privileges_with_grant
> sentry_refresh_timeout_s=SENTRY_REFRESH_TIMEOUT_S)
> authorization/test_owner_privileges.py:225: in __execute_owner_privilege_tests
> test_obj.obj_name), user="oo_user1")
> common/sentry_cache_test_suite.py:106: in user_query
> return self.execute_query_expect_success(client, query, user=user)
> common/impala_test_suite.py:523: in wrapper
> return function(*args, **kwargs)
> common/impala_test_suite.py:531: in execute_query_expect_success
> result = cls.__execute_query(impalad_client, query, query_options, user)
> common/impala_test_suite.py:621: in __execute_query
> return impalad_client.execute(query, user=user)
> common/impala_connection.py:160: in execute
> return self.__beeswax_client.execute(sql_stmt, user=user)
> beeswax/impala_beeswax.py:173: in execute
> handle = self.__execute_query(query_string.strip(), user=user)
> beeswax/impala_beeswax.py:339: in __execute_query
> handle = self.execute_query_async(query_string, user=user)
> beeswax/impala_beeswax.py:335: in execute_query_async
> return self.__do_rpc(lambda: self.imp_service.query(query,))
> beeswax/impala_beeswax.py:460: in __do_rpc
> raise ImpalaBeeswaxException(self.__build_error_message(b), b)
> E   ImpalaBeeswaxException: ImpalaBeeswaxException:
> EINNER EXCEPTION: 
> EMESSAGE: AuthorizationException: User 'oo_user1' does not have 
> privileges to execute 'DROP' on: 
> test_owner_privileges_with_grant_77e49af8.owner_priv_view
> {noformat}






[jira] [Commented] (IMPALA-7760) Privilege version inconsistency causes a hang when running invalidate metadata

2018-10-26 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16665916#comment-16665916
 ] 

ASF subversion and git services commented on IMPALA-7760:
-

Commit f8b7f257fb676d96fbde0ce9b078bbe4af3d4a4f in impala's branch 
refs/heads/master from [~fredyw]
[ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=f8b7f25 ]

IMPALA-7760: Privilege version inconsistency causes a hang when running 
invalidate metadata

Before this patch, a bug in SentryProxy caused a hang when running
invalidate metadata due to privilege version inconsistency. I was able
to manually reproduce the issue by doing the following steps:

1. Get all Sentry role privileges for role a: [x, y] --> in SentryProxy
2. Add a sleep statement before getting all Sentry roles to simulate the
   timing issue--> in SentryProxy
3. Remove role a --> Externally via Sentry CLI
4. Privileges x and y in step 1 do not get removed in the catalog even
   though they were removed in step 3, which causes the catalog version
   inconsistency
5. Run invalidate metadata; this will cause it to hang due to catalog
   version inconsistency

The fix is to remove all privileges in the catalog if there are no
privileges (null or empty) returned by Sentry.

Testing:
- Manually tested the patch by following the above steps and did not
  encounter the hang when issuing invalidate metadata.

Change-Id: Ib1e0db2b1f727476f489c732c4f4e5bc1582429f
Reviewed-on: http://gerrit.cloudera.org:8080/11794
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 


> Privilege version inconsistency causes a hang when running invalidate metadata
> --
>
> Key: IMPALA-7760
> URL: https://issues.apache.org/jira/browse/IMPALA-7760
> Project: IMPALA
>  Issue Type: Bug
>  Components: Catalog
>Affects Versions: Impala 3.1.0
>Reporter: Fredy Wijaya
>Assignee: Fredy Wijaya
>Priority: Critical
>
> 1. Add a sleep statement before getting all Sentry roles in SentryProxy.
> 2. Remove any existing role via Sentry CLI.
> 3. Run invalidate metadata, this will cause Impala to hang.
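
The fix described in the commit message (remove all catalog privileges for a role when Sentry returns null or empty) can be sketched as follows. SentryProxy is Java; this is a Python illustration only. Both function names are hypothetical, and the "buggy" variant is an assumption inferred from the fix description, not a copy of the pre-patch code.

```python
def stale_privileges_buggy(catalog_privs, sentry_privs):
    """Assumed pre-fix behavior: skip removal when Sentry returns nothing."""
    if not sentry_privs:          # None or empty
        return set()
    return set(catalog_privs) - set(sentry_privs)


def stale_privileges_fixed(catalog_privs, sentry_privs):
    """Treat a null or empty Sentry result as 'the role has no privileges'."""
    return set(catalog_privs) - set(sentry_privs or ())


catalog = {"x", "y"}  # privileges cached for role a in step 1

print(sorted(stale_privileges_buggy(catalog, None)))  # []
print(sorted(stale_privileges_fixed(catalog, None)))  # ['x', 'y']
```

With the fixed variant, privileges x and y are scheduled for removal once the role disappears from Sentry, so the catalog no longer holds entries that keep the versions inconsistent.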






[jira] [Commented] (IMPALA-7335) Assertion Failure - test_corrupt_files

2018-10-26 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16665921#comment-16665921
 ] 

ASF subversion and git services commented on IMPALA-7335:
-

Commit 449fe73d2145bd22f0f857623c3652a097f06d73 in impala's branch 
refs/heads/master from [~tarmstr...@cloudera.com]
[ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=449fe73 ]

IMPALA-7662: fix error race when scanner open fails

This is very similar to IMPALA-7335, except that it happens
when 'progress_' is incremented in the call chain
HdfsScanNode::ProcessSplit
-> HdfsScanNodeBase::CreateAndOpenScanner()
-> HdfsScanner::Close()

The fix required restructuring the code so that
SetDoneInternal() is called with the error *before*
HdfsScanner::Close(). This required a refactoring because
HdfsScanNodeBase doesn't actually know about SetDoneInternal().

My fix is to put the common logic between HdfsScanNode and
HdfsScanNodeMt into a helper in HdfsScanNodeBase, then in
HdfsScanNode, make sure to call SetDoneInternal() before
closing the scanner.

I also reworked HdfsScanNode::ProcessSplit() to handle error propagation
internally. I think the joint responsibility between ProcessSplit() and
its caller for handling errors made things harder than necessary.

Testing:
Added a debug action and test that reproduced the race before the fix.

Change-Id: I45a61210ca7d057b048c77d9f2f2695ec450f19b
Reviewed-on: http://gerrit.cloudera.org:8080/11596
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 


> Assertion Failure - test_corrupt_files
> --
>
> Key: IMPALA-7335
> URL: https://issues.apache.org/jira/browse/IMPALA-7335
> Project: IMPALA
>  Issue Type: Bug
>Affects Versions: Impala 3.1.0
>Reporter: nithya
>Assignee: Pooja Nilangekar
>Priority: Blocker
>  Labels: broken-build
> Fix For: Impala 3.1.0
>
>
> test_corrupt_files fails 
>  
> query_test.test_scanners.TestParquet.test_corrupt_files[exec_option: 
> {'batch_size': 0, 'num_nodes': 0, 'disable_codegen_rows_threshold': 0, 
> 'disable_codegen': False, 'abort_on_error': 1, 'debug_action': None, 
> 'exec_single_node_rows_threshold': 0} | table_format: parquet/none] (from 
> pytest)
>  
> {code:java}
> Error Message
> query_test/test_scanners.py:300: in test_corrupt_files     
> self.run_test_case('QueryTest/parquet-abort-on-error', vector) 
> common/impala_test_suite.py:420: in run_test_case     assert False, "Expected 
> exception: %s" % expected_str E   AssertionError: Expected exception: Column 
> metadata states there are 11 values, but read 10 values from column id.
> STACKTRACE
> query_test/test_scanners.py:300: in test_corrupt_files
>     self.run_test_case('QueryTest/parquet-abort-on-error', vector)
> common/impala_test_suite.py:420: in run_test_case
>     assert False, "Expected exception: %s" % expected_str
> E   AssertionError: Expected exception: Column metadata states there are 11 
> values, but read 10 values from column id.
> Standard Error
> -- executing against localhost:21000
> use functional_parquet;
> SET batch_size=0;
> SET num_nodes=0;
> SET disable_codegen_rows_threshold=0;
> SET disable_codegen=False;
> SET abort_on_error=0;
> SET exec_single_node_rows_threshold=0;
> -- executing against localhost:21000
> set num_nodes=1;
> -- executing against localhost:21000
> set num_scanner_threads=1;
> -- executing against localhost:21000
> select id, cnt from bad_column_metadata t, (select count(*) cnt from 
> t.int_array) v;
> -- executing against localhost:21000
> SET NUM_NODES="0";
> -- executing against localhost:21000
> SET NUM_SCANNER_THREADS="0";
> -- executing against localhost:21000
> set num_nodes=1;
> -- executing against localhost:21000
> set num_scanner_threads=1;
> -- executing against localhost:21000
> select id from bad_column_metadata;
> -- executing against localhost:21000
> SET NUM_NODES="0";
> -- executing against localhost:21000
> SET NUM_SCANNER_THREADS="0";
> -- executing against localhost:21000
> SELECT * from bad_parquet_strings_negative_len;
> -- executing against localhost:21000
> SELECT * from bad_parquet_strings_out_of_bounds;
> -- executing against localhost:21000
> use functional_parquet;
> SET batch_size=0;
> SET num_nodes=0;
> SET disable_codegen_rows_threshold=0;
> SET disable_codegen=False;
> SET abort_on_error=1;
> SET exec_single_node_rows_threshold=0;
> -- executing against localhost:21000
> set num_nodes=1;
> -- executing against localhost:21000
> set num_scanner_threads=1;
> -- executing against localhost:21000
> select id, cnt from bad_column_metadata t, (select count(*) cnt from 
> t.int_array) v;
> -- executing against localhost:21000
> SET NUM_NODES="0";
> -- executing against localhost:21000
> SET NUM_SCANNER_THREADS="0";
> -- executing against localhost:21000
> set 

[jira] [Commented] (IMPALA-7763) Consider locking principal/privilege update and Sentry refresh operations

2018-10-26 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16665918#comment-16665918
 ] 

ASF subversion and git services commented on IMPALA-7763:
-

Commit 8d628d7b62c4903bfd0236d2de150dae0752f0fe in impala's branch 
refs/heads/master from [~aholley]
[ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=8d628d7 ]

IMPALA-7710: test_owner_privileges_with_grant failed with AuthorizationException

The problem was a cache consistency issue between impalad and catalogd.
Because a Sentry refresh was occurring during an update to privileges
from the alter table set owner, impalad had the correct privileges,
which allowed the "show grant role" to succeed but the privileges in
catalogd were being overwritten from the sentry refresh. Added a delay
in the drop call to ensure privileges are updated. This is a
workaround to get the tests to pass with the existing behaviour and
should be reassessed if IMPALA-7763 is implemented. This would add a
lock to possibly prevent this, but will need its own assessment.

Testing:
- Ran custom cluster tests 50 times

Change-Id: I5a1babd3dcbb94ffaa1f3e6ef2cebf1a1d391219
Reviewed-on: http://gerrit.cloudera.org:8080/11786
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 


> Consider locking principal/privilege update and Sentry refresh operations 
> --
>
> Key: IMPALA-7763
> URL: https://issues.apache.org/jira/browse/IMPALA-7763
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Catalog
>Affects Versions: Impala 3.1.0
>Reporter: Fredy Wijaya
>Priority: Major
>
> There's currently no lock between a Sentry refresh and any operations outside 
> Sentry refresh for updating principal and privileges, which can cause the 
> catalog to be temporarily inconsistent.
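
The kind of locking this issue proposes can be sketched with a toy model. Nothing here reflects Impala's catalog code: `ToyCatalog`, its methods, and the dict standing in for Sentry are all hypothetical, and a single coarse lock is only one possible design.

```python
import threading


class ToyCatalog:
    """Toy model of a catalog whose privileges mirror an external store."""

    def __init__(self, store):
        self._store = store            # stands in for Sentry
        self._lock = threading.Lock()  # the lock the issue proposes
        self.privileges = dict(store)

    def update_privilege(self, principal, privilege):
        # Update the store and the catalog copy as one atomic step.
        with self._lock:
            self._store[principal] = privilege
            self.privileges[principal] = privilege

    def refresh(self):
        # Read the snapshot and install it without an update slipping in
        # between the read and the write.
        with self._lock:
            snapshot = dict(self._store)
            self.privileges = snapshot
```

Without the lock, a refresh could read its snapshot before an update and install it afterwards, overwriting the newer privilege in the catalog copy. That is the temporary inconsistency the description refers to.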






[jira] [Assigned] (IMPALA-2515) Impala is unable to read a Parquet decimal column if size is larger than needed

2018-10-26 Thread Michael Ho (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-2515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Ho reassigned IMPALA-2515:
--

Assignee: Michal Ostrowski

> Impala is unable to read a Parquet decimal column if size is larger than 
> needed
> ---
>
> Key: IMPALA-2515
> URL: https://issues.apache.org/jira/browse/IMPALA-2515
> Project: IMPALA
>  Issue Type: Sub-task
>  Components: Backend
>Affects Versions: Impala 2.3.0
>Reporter: Taras Bobrovytsky
>Assignee: Michal Ostrowski
>Priority: Minor
>  Labels: ramp-up
>
> Impala cannot read this:
> {code}
> {"name": "tmp_1",
>  "type": "fixed",
>  "size": 8,
>  "logicalType": "decimal",
>  "precision": 10,
>  "scale": 5}
> {code}
> However, this can be read:
> {code}
> {"name": "tmp_1",
>  "type": "fixed",
>  "size": 5,
>  "logicalType": "decimal",
>  "precision": 10,
>  "scale": 5}
> {code}
> Size must be precisely set to this, or Impala is unable to read the decimal 
> column:
> {code}
> size = int(math.ceil((math.log(2, 10) + precision) / math.log(256, 10)))
> {code}
> There is nothing in the Parquet spec that says that Decimal columns must be 
> sized precisely. Arguably it's a bug in the writer if it's doing it, because 
> it's just wasting space.
> https://github.com/apache/parquet-format/blob/master/LogicalTypes.md#decimal
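
The size formula quoted in the report can be checked directly. The sketch below wraps it in a function (the name `min_decimal_bytes` is an assumption for illustration) and evaluates it for the precision used in the two schemas above.

```python
import math


def min_decimal_bytes(precision):
    """Byte width given by the formula quoted in the report for a decimal
    of the given precision stored as a fixed-length value."""
    return int(math.ceil((math.log(2, 10) + precision) / math.log(256, 10)))


print(min_decimal_bytes(10))  # 5
```

For precision 10 the formula gives 5, which is the size in the schema that can be read. The schema with size 8 declares more bytes than the formula requires, and that is the case the report says Impala rejects.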






[jira] [Commented] (IMPALA-7241) progress-updater.cc:43] Check failed: delta >= 0 (-3 vs. 0)

2018-10-26 Thread Michael Ho (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16665800#comment-16665800
 ] 

Michael Ho commented on IMPALA-7241:


IMPALA-7213 added a sequence number to ExecStatus updates sent to the coordinator, 
thus preventing duplicated / out-of-order updates from being applied at the coordinator.
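
How a sequence number guards against duplicated or out-of-order reports can be sketched with a toy model. This is not the coordinator's C++ code: `ToyBackendState`, its fields, and the assumption that each report carries a cumulative completed count are all illustrative.

```python
class ToyBackendState:
    """Toy coordinator-side state for one backend; names are illustrative."""

    def __init__(self):
        self.last_seq = 0   # highest report sequence number applied so far
        self.completed = 0  # cumulative completed count from that report

    def apply_report(self, seq, completed):
        """Apply a status report; return the progress delta it contributed."""
        if seq <= self.last_seq:
            return 0        # duplicate or out-of-order report: ignore it
        delta = completed - self.completed
        self.last_seq = seq
        self.completed = completed
        return delta


state = ToyBackendState()
# Report 2 arrives after report 3, and report 3 is then delivered twice.
reports = [(1, 2), (3, 8), (2, 5), (3, 8)]
deltas = [state.apply_report(seq, done) for seq, done in reports]
print(deltas)  # [2, 6, 0, 0]
```

Without the sequence check, applying report 2 after report 3 in this example would yield a delta of 5 - 8 = -3, the kind of negative delta that a `delta >= 0` check rejects.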

> progress-updater.cc:43] Check failed: delta >= 0 (-3 vs. 0)
> ---
>
> Key: IMPALA-7241
> URL: https://issues.apache.org/jira/browse/IMPALA-7241
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 3.1.0
>Reporter: Michael Brown
>Assignee: Michael Ho
>Priority: Blocker
>  Labels: crash, stress
>
> During a stress test with 8 nodes running an insecure debug build based off 
> master, an impalad hit a DCHECK. The concurrency level at the time was 
> between 150-180 queries.
> The DCHECK was {{progress-updater.cc:43] Check failed: delta >= 0 (-3 vs. 
> 0)}}.
> The stack was:
> {noformat}
> #0  0x7f6a73e811f7 in raise () from /lib64/libc.so.6
> #1  0x7f6a73e828e8 in abort () from /lib64/libc.so.6
> #2  0x04300e34 in google::DumpStackTraceAndExit() ()
> #3  0x042f78ad in google::LogMessage::Fail() ()
> #4  0x042f9152 in google::LogMessage::SendToLog() ()
> #5  0x042f7287 in google::LogMessage::Flush() ()
> #6  0x042fa84e in google::LogMessageFatal::~LogMessageFatal() ()
> #7  0x01f7fedd in impala::ProgressUpdater::Update(long) ()
> #8  0x0313912b in 
> impala::Coordinator::BackendState::InstanceStats::Update(impala::TFragmentInstanceExecStatus
>  const&, impala::Coordinator::ExecSummary*, impala::ProgressUpdater*) ()
> #9  0x031369e6 in 
> impala::Coordinator::BackendState::ApplyExecStatusReport(impala::TReportExecStatusParams
>  const&, impala::Coordinator::ExecSummary*, impala::ProgressUpdater*) ()
> #10 0x031250b4 in 
> impala::Coordinator::UpdateBackendExecStatus(impala::TReportExecStatusParams 
> const&) ()
> #11 0x01e86395 in 
> impala::ClientRequestState::UpdateBackendExecStatus(impala::TReportExecStatusParams
>  const&) ()
> #12 0x01e27594 in 
> impala::ImpalaServer::ReportExecStatus(impala::TReportExecStatusResult&, 
> impala::TReportExecStatusParams const&) ()
> #13 0x01ebb8a0 in 
> impala::ImpalaInternalService::ReportExecStatus(impala::TReportExecStatusResult&,
>  impala::TReportExecStatusParams const&) ()
> #14 0x02fa6f62 in 
> impala::ImpalaInternalServiceProcessor::process_ReportExecStatus(int, 
> apache::thrift::protocol::TProtocol*, apache::thrift::protocol::TProtocol*, 
> void*) ()
> #15 0x02fa6540 in 
> impala::ImpalaInternalServiceProcessor::dispatchCall(apache::thrift::protocol::TProtocol*,
>  apache::thrift::protocol::TProtocol*, std::string const&, int, void*) ()
> #16 0x018892b0 in 
> apache::thrift::TDispatchProcessor::process(boost::shared_ptr,
>  boost::shared_ptr, void*) ()
> #17 0x01c80d1b in 
> apache::thrift::server::TAcceptQueueServer::Task::run() ()
> #18 0x01c78ffb in 
> impala::ThriftThread::RunRunnable(boost::shared_ptr,
>  impala::Promise*) ()
> #19 0x01c7a721 in boost::_mfi::mf2 boost::shared_ptr, 
> impala::Promise (impala::PromiseMode)0>*>::operator()(impala::ThriftThread*, 
> boost::shared_ptr, 
> impala::Promise*) const ()
> #20 0x01c7a5b7 in void 
> boost::_bi::list3, 
> boost::_bi::value >, 
> boost::_bi::value*> 
> >::operator() boost::shared_ptr, 
> impala::Promise*>, 
> boost::_bi::list0>(boost::_bi::type, boost::_mfi::mf2 impala::ThriftThread, 
> boost::shared_ptr, 
> impala::Promise*>&, 
> boost::_bi::list0&, int) ()
> #21 0x01c7a303 in boost::_bi::bind_t impala::ThriftThread, 
> boost::shared_ptr, 
> impala::Promise*>, 
> boost::_bi::list3, 
> boost::_bi::value >, 
> boost::_bi::value*> > 
> >::operator()() ()
> #22 0x01c7a216 in 
> boost::detail::function::void_function_obj_invoker0 boost::_mfi::mf2 boost::shared_ptr, 
> impala::Promise*>, 
> boost::_bi::list3, 
> boost::_bi::value >, 
> boost::_bi::value*> > 
> >, void>::invoke(boost::detail::function::function_buffer&) ()
> #23 0x01bbb81c in boost::function0::operator()() const ()
> #24 0x01fb6eaf in impala::Thread::SuperviseThread(std::string const&, 
> std::string const&, boost::function, impala::ThreadDebugInfo const*, 
> impala::Promise*) ()
> #25 0x01fbef87 in void 
> boost::_bi::list5, 
> boost::_bi::value, boost::_bi::value >, 
> boost::_bi::value, 
> boost::_bi::value*> 
> >::operator() boost::function, impala::ThreadDebugInfo const*, 
> impala::Promise*), 
> boost::_bi::list0>(boost::_bi::type, void (*&)(std::string const&, 
> std::string const&, boost::function, impala::ThreadDebugInfo const*, 
> impala::Promise*), 

[jira] [Issue Comment Deleted] (IMPALA-7241) progress-updater.cc:43] Check failed: delta >= 0 (-3 vs. 0)

2018-10-26 Thread Michael Ho (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7241?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Ho updated IMPALA-7241:
---
Comment: was deleted

(was: IMPALA-7123 added sequence number of ExecStatus update to the 
coordinator, thus preventing duplicated / out-of-order updates being applied at 
the coordinator.)

> progress-updater.cc:43] Check failed: delta >= 0 (-3 vs. 0)
> ---
>
> Key: IMPALA-7241
> URL: https://issues.apache.org/jira/browse/IMPALA-7241
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 3.1.0
>Reporter: Michael Brown
>Assignee: Michael Ho
>Priority: Blocker
>  Labels: crash, stress
>

[jira] [Commented] (IMPALA-7241) progress-updater.cc:43] Check failed: delta >= 0 (-3 vs. 0)

2018-10-26 Thread Michael Ho (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16665799#comment-16665799
 ] 

Michael Ho commented on IMPALA-7241:


IMPALA-7123 added a sequence number to ExecStatus updates sent to the coordinator, 
thus preventing duplicated / out-of-order updates from being applied at the coordinator.

> progress-updater.cc:43] Check failed: delta >= 0 (-3 vs. 0)
> ---
>
> Key: IMPALA-7241
> URL: https://issues.apache.org/jira/browse/IMPALA-7241
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 3.1.0
>Reporter: Michael Brown
>Assignee: Michael Ho
>Priority: Blocker
>  Labels: crash, stress
>
> apache::thrift::server::TAcceptQueueServer::Task::run() ()
> #18 0x01c78ffb in 
> impala::ThriftThread::RunRunnable(boost::shared_ptr,
>  impala::Promise*) ()
> #19 0x01c7a721 in boost::_mfi::mf2 boost::shared_ptr, 
> impala::Promise (impala::PromiseMode)0>*>::operator()(impala::ThriftThread*, 
> boost::shared_ptr, 
> impala::Promise*) const ()
> #20 0x01c7a5b7 in void 
> boost::_bi::list3, 
> boost::_bi::value >, 
> boost::_bi::value*> 
> >::operator() boost::shared_ptr, 
> impala::Promise*>, 
> boost::_bi::list0>(boost::_bi::type, boost::_mfi::mf2 impala::ThriftThread, 
> boost::shared_ptr, 
> impala::Promise*>&, 
> boost::_bi::list0&, int) ()
> #21 0x01c7a303 in boost::_bi::bind_t impala::ThriftThread, 
> boost::shared_ptr, 
> impala::Promise*>, 
> boost::_bi::list3, 
> boost::_bi::value >, 
> boost::_bi::value*> > 
> >::operator()() ()
> #22 0x01c7a216 in 
> boost::detail::function::void_function_obj_invoker0 boost::_mfi::mf2 boost::shared_ptr, 
> impala::Promise*>, 
> boost::_bi::list3, 
> boost::_bi::value >, 
> boost::_bi::value*> > 
> >, void>::invoke(boost::detail::function::function_buffer&) ()
> #23 0x01bbb81c in boost::function0::operator()() const ()
> #24 0x01fb6eaf in impala::Thread::SuperviseThread(std::string const&, 
> std::string const&, boost::function, impala::ThreadDebugInfo const*, 
> impala::Promise*) ()
> #25 0x01fbef87 in void 
> boost::_bi::list5, 
> boost::_bi::value, boost::_bi::value >, 
> boost::_bi::value, 
> boost::_bi::value*> 
> >::operator() boost::function, impala::ThreadDebugInfo const*, 
> impala::Promise*), 
> boost::_bi::list0>(boost::_bi::type, void (*&)(std::string const&, 
> std::string const&, boost::function, impala::ThreadDebugInfo const*, 
> impala::Promise*), 

[jira] [Updated] (IMPALA-7241) progress-updater.cc:43] Check failed: delta >= 0 (-3 vs. 0)

2018-10-26 Thread Michael Ho (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7241?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Ho updated IMPALA-7241:
---
Target Version: Impala 3.2.0  (was: Impala 3.1.0)

> progress-updater.cc:43] Check failed: delta >= 0 (-3 vs. 0)
> ---
>
> Key: IMPALA-7241
> URL: https://issues.apache.org/jira/browse/IMPALA-7241
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 3.1.0
>Reporter: Michael Brown
>Assignee: Michael Ho
>Priority: Blocker
>  Labels: crash, stress
>
> During a stress test with 8 nodes running an insecure debug build based off 
> master, an impalad hit a DCHECK. The concurrency level at the time was 
> between 150-180 queries.
> The DCHECK was {{progress-updater.cc:43] Check failed: delta >= 0 (-3 vs. 
> 0)}}.
> The stack was:
> {noformat}
> #0  0x7f6a73e811f7 in raise () from /lib64/libc.so.6
> #1  0x7f6a73e828e8 in abort () from /lib64/libc.so.6
> #2  0x04300e34 in google::DumpStackTraceAndExit() ()
> #3  0x042f78ad in google::LogMessage::Fail() ()
> #4  0x042f9152 in google::LogMessage::SendToLog() ()
> #5  0x042f7287 in google::LogMessage::Flush() ()
> #6  0x042fa84e in google::LogMessageFatal::~LogMessageFatal() ()
> #7  0x01f7fedd in impala::ProgressUpdater::Update(long) ()
> #8  0x0313912b in 
> impala::Coordinator::BackendState::InstanceStats::Update(impala::TFragmentInstanceExecStatus
>  const&, impala::Coordinator::ExecSummary*, impala::ProgressUpdater*) ()
> #9  0x031369e6 in 
> impala::Coordinator::BackendState::ApplyExecStatusReport(impala::TReportExecStatusParams
>  const&, impala::Coordinator::ExecSummary*, impala::ProgressUpdater*) ()
> #10 0x031250b4 in 
> impala::Coordinator::UpdateBackendExecStatus(impala::TReportExecStatusParams 
> const&) ()
> #11 0x01e86395 in 
> impala::ClientRequestState::UpdateBackendExecStatus(impala::TReportExecStatusParams
>  const&) ()
> #12 0x01e27594 in 
> impala::ImpalaServer::ReportExecStatus(impala::TReportExecStatusResult&, 
> impala::TReportExecStatusParams const&) ()
> #13 0x01ebb8a0 in 
> impala::ImpalaInternalService::ReportExecStatus(impala::TReportExecStatusResult&,
>  impala::TReportExecStatusParams const&) ()
> #14 0x02fa6f62 in 
> impala::ImpalaInternalServiceProcessor::process_ReportExecStatus(int, 
> apache::thrift::protocol::TProtocol*, apache::thrift::protocol::TProtocol*, 
> void*) ()
> #15 0x02fa6540 in 
> impala::ImpalaInternalServiceProcessor::dispatchCall(apache::thrift::protocol::TProtocol*,
>  apache::thrift::protocol::TProtocol*, std::string const&, int, void*) ()
> #16 0x018892b0 in 
> apache::thrift::TDispatchProcessor::process(boost::shared_ptr,
>  boost::shared_ptr, void*) ()
> #17 0x01c80d1b in 
> apache::thrift::server::TAcceptQueueServer::Task::run() ()
> #18 0x01c78ffb in 
> impala::ThriftThread::RunRunnable(boost::shared_ptr,
>  impala::Promise*) ()
> #19 0x01c7a721 in boost::_mfi::mf2 boost::shared_ptr, 
> impala::Promise (impala::PromiseMode)0>*>::operator()(impala::ThriftThread*, 
> boost::shared_ptr, 
> impala::Promise*) const ()
> #20 0x01c7a5b7 in void 
> boost::_bi::list3, 
> boost::_bi::value >, 
> boost::_bi::value*> 
> >::operator() boost::shared_ptr, 
> impala::Promise*>, 
> boost::_bi::list0>(boost::_bi::type, boost::_mfi::mf2 impala::ThriftThread, 
> boost::shared_ptr, 
> impala::Promise*>&, 
> boost::_bi::list0&, int) ()
> #21 0x01c7a303 in boost::_bi::bind_t impala::ThriftThread, 
> boost::shared_ptr, 
> impala::Promise*>, 
> boost::_bi::list3, 
> boost::_bi::value >, 
> boost::_bi::value*> > 
> >::operator()() ()
> #22 0x01c7a216 in 
> boost::detail::function::void_function_obj_invoker0 boost::_mfi::mf2 boost::shared_ptr, 
> impala::Promise*>, 
> boost::_bi::list3, 
> boost::_bi::value >, 
> boost::_bi::value*> > 
> >, void>::invoke(boost::detail::function::function_buffer&) ()
> #23 0x01bbb81c in boost::function0::operator()() const ()
> #24 0x01fb6eaf in impala::Thread::SuperviseThread(std::string const&, 
> std::string const&, boost::function, impala::ThreadDebugInfo const*, 
> impala::Promise*) ()
> #25 0x01fbef87 in void 
> boost::_bi::list5, 
> boost::_bi::value, boost::_bi::value >, 
> boost::_bi::value, 
> boost::_bi::value*> 
> >::operator() boost::function, impala::ThreadDebugInfo const*, 
> impala::Promise*), 
> boost::_bi::list0>(boost::_bi::type, void (*&)(std::string const&, 
> std::string const&, boost::function, impala::ThreadDebugInfo const*, 
> impala::Promise*), boost::_bi::list0&, int) ()
> #26 0x01fbeeab in boost::_bi::bind_t const&, std::string const&, boost::function, impala::ThreadDebugInfo 
> 

[jira] [Created] (IMPALA-7775) StatestoreSslTest crashed with mutex lock failed in pthread_mutex_lock: Invalid argument

2018-10-26 Thread Tim Armstrong (JIRA)
Tim Armstrong created IMPALA-7775:
-

 Summary: StatestoreSslTest crashed with mutex lock failed in 
pthread_mutex_lock: Invalid argument
 Key: IMPALA-7775
 URL: https://issues.apache.org/jira/browse/IMPALA-7775
 Project: IMPALA
  Issue Type: Bug
  Components: Backend
Affects Versions: Impala 3.1.0
Reporter: Tim Armstrong


{noformat}

20:17:28 [==] Running 2 tests from 2 test cases.
20:17:28 [--] Global test environment set-up.
20:17:28 [--] 1 test from StatestoreTest
20:17:28 [ RUN  ] StatestoreTest.SmokeTest
20:17:28 [   OK ] StatestoreTest.SmokeTest (24 ms)
20:17:28 [--] 1 test from StatestoreTest (24 ms total)
20:17:28 
20:17:28 [--] 1 test from StatestoreSslTest
20:17:28 [ RUN  ] StatestoreSslTest.SmokeTest
20:17:28 terminate called after throwing an instance of 
'boost::exception_detail::clone_impl
 >'
20:17:28   what():  boost: mutex lock failed in pthread_mutex_lock: Invalid 
argument
20:17:28 Wrote minidump to 
/home/ubuntu/Impala/logs/be_tests/minidumps/statestore-test/63ff46ee-a127-4ef6-5bccb5ba-dc73c28a.dmp
20:17:28 Wrote minidump to 
/home/ubuntu/Impala/logs/be_tests/minidumps/statestore-test/63ff46ee-a127-4ef6-5bccb5ba-dc73c28a.dmp
20:17:28 
{noformat}
https://jenkins.impala.io/job/ubuntu-16.04-from-scratch/3441

This smells like a lifecycle bug in the backend test.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Commented] (IMPALA-7586) Incorrect results when querying primary = "\"" in Kudu

2018-10-26 Thread Tim Armstrong (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16665765#comment-16665765
 ] 

Tim Armstrong commented on IMPALA-7586:
---

I can pick this up for the release since I know what's going on.

> Incorrect results when querying primary = "\"" in Kudu
> --
>
> Key: IMPALA-7586
> URL: https://issues.apache.org/jira/browse/IMPALA-7586
> Project: IMPALA
>  Issue Type: Bug
>  Components: Frontend
>Affects Versions: Impala 3.1.0
>Reporter: Will Berkeley
>Assignee: Tim Armstrong
>Priority: Blocker
>  Labels: correctness, kudu
> Attachments: impalakudu_pred_bug.profile
>
>
> Version string from catalogd web ui:
> {noformat}
> catalogd version 3.1.0-cdh6.x-SNAPSHOT RELEASE (build 
> 8baac7f5849b6bacb02fedeb9b3fe2b2ee9450ee)
> {noformat}
> A reproduction script for the impala-shell:
> {noformat}
> create table test(name string, primary key(name) ) stored as kudu;
> insert into test values ("\"");
> -- Modified 1 row(s), 0 row error(s) in 4.01s
> -- row found in full table scan
> select * from test;
> -- Fetched 1 row(s) in 0.15s
> -- row not found on = predicate (pushed to kudu)
> select * from test where name="\"";
> -- Fetched 0 row(s) in 0.13s
> -- row found when predicate cannot be pushed to kudu
> select * from test where name like "\"";
> -- Fetched 1 row(s) in 0.13s
> {noformat}
> This was originally reported as KUDU-2575. I tried to reproduce directly 
> against Kudu using the python client but got the expected result.
> From the plan and profile, Impala is pushing down the predicate, but Kudu is 
> not being scanned, possibly because the Kudu client short-circuits the scan 
> as having no results based on the predicate Impala pushes down.
> {noformat}
> 00:SCAN KUDU [default.test]
>kudu predicates: name = '"'
>mem-estimate=0B mem-reservation=0B thread-reservation=1
>tuple-ids=0 row-size=15B cardinality=unavailable
>in pipelines: 00(GETNEXT)
> {noformat}
> {noformat}
> KUDU_SCAN_NODE (id=0)
>   - AverageScannerThreadConcurrency: 0.00 (0.0)
>   - InactiveTotalTime: 0ns (0)
>   - KuduRemoteScanTokens: 0 (0)
>   - MaterializeTupleTime(*): 0ns (0)
>   - NumScannerThreadMemUnavailable: 0 (0)
>   - NumScannerThreadsStarted: 1 (1)
>   - PeakMemoryUsage: 24.0 KiB (24576)
>   - PeakScannerThreadConcurrency: 1 (1)
>   - RowBatchBytesEnqueued: 16.0 KiB (16384)
>   - RowBatchQueueGetWaitTime: 0ns (0)
>   - RowBatchQueuePeakMemoryUsage: 0 B (0)
>   - RowBatchQueuePutWaitTime: 0ns (0)
>   - RowBatchesEnqueued: 1 (1)
>   - RowsRead: 0 (0)
> ===>  - RowsReturned: 0 (0)
>   - RowsReturnedRate: 0 per second (0)
>   - ScanRangesComplete: 1 (1)
>   - ScannerThreadsInvoluntaryContextSwitches: 0 (0)
>   - ScannerThreadsTotalWallClockTime: 0ns (0)
> - ScannerThreadsSysTime: 158.00us (158000)
> - ScannerThreadsUserTime: 0ns (0)
>   - ScannerThreadsVoluntaryContextSwitches: 2 (2)
> ===>  - TotalKuduScanRoundTrips: 0 (0)
>   - TotalTime: 1ms (172)
> {noformat}
> I also confirmed Kudu sees no scan from Impala for this query using the 
> /scans page of the tablet servers.
> Full profile attached.






[jira] [Assigned] (IMPALA-7586) Incorrect results when querying primary = "\"" in Kudu

2018-10-26 Thread Tim Armstrong (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7586?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Armstrong reassigned IMPALA-7586:
-

Assignee: Tim Armstrong  (was: Adrian Ng)







[jira] [Work started] (IMPALA-7614) Impala 3.1 Doc: Document the New Invalidate Options

2018-10-26 Thread Alex Rodoni (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7614?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on IMPALA-7614 started by Alex Rodoni.
---
> Impala 3.1 Doc: Document the New Invalidate Options
> ---
>
> Key: IMPALA-7614
> URL: https://issues.apache.org/jira/browse/IMPALA-7614
> Project: IMPALA
>  Issue Type: Task
>  Components: Docs
>Reporter: Alex Rodoni
>Assignee: Alex Rodoni
>Priority: Major
>  Labels: future_release_doc
>
> Document the new options:
> - invalidate_tables_timeout_s
> - invalidate_tables_on_memory_pressure






[jira] [Resolved] (IMPALA-6859) De-templatize RpcMgrTestBase

2018-10-26 Thread Tim Armstrong (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-6859?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Armstrong resolved IMPALA-6859.
---
   Resolution: Fixed
Fix Version/s: Impala 3.1.0

> De-templatize RpcMgrTestBase
> 
>
> Key: IMPALA-6859
> URL: https://issues.apache.org/jira/browse/IMPALA-6859
> Project: IMPALA
>  Issue Type: Task
>  Components: Backend
>Affects Versions: Impala 3.0
>Reporter: Sailesh Mukil
>Assignee: Michael Ho
>Priority: Major
>  Labels: security, test
> Fix For: Impala 3.1.0
>
>
> Now that we've gotten rid of the old way of Kinit-ing (IMPALA-5893), we can 
> detemplatize RpcMgrTestBase, since there's only one option to run the 
> kerberos tests with.






[jira] [Commented] (IMPALA-6859) De-templatize RpcMgrTestBase

2018-10-26 Thread Tim Armstrong (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-6859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16665757#comment-16665757
 ] 

Tim Armstrong commented on IMPALA-6859:
---

Fixed by



 commit 5c541b960491ba91533712144599fb3b6d99521d
Author: Michael Ho 
Date:   Thu Aug 23 00:33:16 2018 -0700

Add missing authorization in KRPC

In 2.12.0, Impala adopted the Kudu RPC library for certain backend services
(TransmitData(), EndDataStream()). While the implementation uses Kerberos
for authenticating users connecting to the backend services, there is no
authorization implemented. This is a regression from the Thrift-based
implementation, which registered a SASL callback (SaslAuthorizeInternal)
to be invoked during the connection negotiation. With this regression,
an unauthorized but authenticated user may invoke RPC calls to Impala
backend services.

This change fixes the issue above by overriding the default authorization
method for the DataStreamService. The authorization method will only let
an authenticated principal that matches FLAGS_principal / FLAGS_be_principal
access the service. Also added a new startup flag --krb5_ccname to allow
users to customize the location of the Kerberos credentials cache.

Testing done:
1. Added a new test case in rpc-mgr-kerberized-test.cc to confirm an
unauthorized user is not allowed to access the service.
2. Ran some queries in a Kerberos enabled cluster to make sure there is no
error.
3. Exhaustive builds.

Thanks to Todd Lipcon for pointing out the problem and his guidance on the
fix.

Change-Id: I2f82dee5e721f2ed23e75fd91abbc6ab7addd4c5
Reviewed-on: http://gerrit.cloudera.org:8080/11331
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 


> De-templatize RpcMgrTestBase
> 
>
> Key: IMPALA-6859
> URL: https://issues.apache.org/jira/browse/IMPALA-6859
> Project: IMPALA
>  Issue Type: Task
>  Components: Backend
>Affects Versions: Impala 3.0
>Reporter: Sailesh Mukil
>Assignee: Michael Ho
>Priority: Major
>  Labels: security, test
> Fix For: Impala 3.1.0
>
>
> Now that we've gotten rid of the old way of Kinit-ing (IMPALA-5893), we can 
> detemplatize RpcMgrTestBase, since there's only one option to run the 
> kerberos tests with.






[jira] [Updated] (IMPALA-7446) Queries can spill earlier than necessary because of accumulation of free buffers and clean pages

2018-10-26 Thread Tim Armstrong (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7446?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Armstrong updated IMPALA-7446:
--
Target Version: Impala 3.2.0  (was: Impala 3.1.0)

> Queries can spill earlier than necessary because of accumulation of free 
> buffers and clean pages
> 
>
> Key: IMPALA-7446
> URL: https://issues.apache.org/jira/browse/IMPALA-7446
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 2.10.0, Impala 2.11.0, Impala 3.0, Impala 2.12.0
>Reporter: Tim Armstrong
>Assignee: Tim Armstrong
>Priority: Critical
>  Labels: resource-management
>
> See IMPALA-7442 for an example where the query started to spill even when 
> memory could have been made available by freeing buffers or evicting clean 
> pages. Usually this would just result in spilling earlier than necessary, but 
> in the case of IMPALA-7442 it led to a query failure.
> My original intent was that BufferPool::ReleaseMemory() should be called in 
> situations like this, but that was not done.






[jira] [Resolved] (IMPALA-7774) Test JIRA, please ignore

2018-10-26 Thread Tim Armstrong (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7774?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Armstrong resolved IMPALA-7774.
---
Resolution: Incomplete

> Test JIRA, please ignore
> 
>
> Key: IMPALA-7774
> URL: https://issues.apache.org/jira/browse/IMPALA-7774
> Project: IMPALA
>  Issue Type: Improvement
>Reporter: Tim Armstrong
>Assignee: Tim Armstrong
>Priority: Major
>







[jira] [Updated] (IMPALA-7774) Test JIRA, please ignore

2018-10-26 Thread Tim Armstrong (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7774?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Armstrong updated IMPALA-7774:
--
Priority: Major  (was: Critical)

> Test JIRA, please ignore
> 
>
> Key: IMPALA-7774
> URL: https://issues.apache.org/jira/browse/IMPALA-7774
> Project: IMPALA
>  Issue Type: Improvement
>Reporter: Tim Armstrong
>Assignee: Tim Armstrong
>Priority: Major
>







[jira] [Created] (IMPALA-7774) Test JIRA, please ignore

2018-10-26 Thread Tim Armstrong (JIRA)
Tim Armstrong created IMPALA-7774:
-

 Summary: Test JIRA, please ignore
 Key: IMPALA-7774
 URL: https://issues.apache.org/jira/browse/IMPALA-7774
 Project: IMPALA
  Issue Type: Improvement
Reporter: Tim Armstrong
Assignee: Tim Armstrong









[jira] [Updated] (IMPALA-7765) Impala 3.1 Doc: Document MAX_MEM_ESTIMATE_FOR_ADMISSION

2018-10-26 Thread Alex Rodoni (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7765?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Rodoni updated IMPALA-7765:

Description: https://gerrit.cloudera.org/#/c/11804/

> Impala 3.1 Doc: Document MAX_MEM_ESTIMATE_FOR_ADMISSION
> 
>
> Key: IMPALA-7765
> URL: https://issues.apache.org/jira/browse/IMPALA-7765
> Project: IMPALA
>  Issue Type: Sub-task
>  Components: Docs
>Reporter: Alex Rodoni
>Assignee: Alex Rodoni
>Priority: Major
>  Labels: future_release_doc
>
> https://gerrit.cloudera.org/#/c/11804/






[jira] [Work started] (IMPALA-7761) Add multiple count distinct to targeted stress and targeted perf

2018-10-26 Thread Thomas Tauber-Marshall (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7761?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on IMPALA-7761 started by Thomas Tauber-Marshall.
--
> Add multiple count distinct to targeted stress and targeted perf
> 
>
> Key: IMPALA-7761
> URL: https://issues.apache.org/jira/browse/IMPALA-7761
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Infrastructure
>Affects Versions: Impala 3.1.0
>Reporter: Thomas Tauber-Marshall
>Assignee: Thomas Tauber-Marshall
>Priority: Blocker
>
> With IMPALA-110 in, we should add queries with multiple count distinct to 
> targeted stress and targeted perf






[jira] [Work started] (IMPALA-3652) Fix resource transfer in subplans with limits

2018-10-26 Thread Thomas Tauber-Marshall (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-3652?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on IMPALA-3652 started by Thomas Tauber-Marshall.
--
> Fix resource transfer in subplans with limits
> -
>
> Key: IMPALA-3652
> URL: https://issues.apache.org/jira/browse/IMPALA-3652
> Project: IMPALA
>  Issue Type: Task
>  Components: Backend
>Affects Versions: Impala 2.6.0
>Reporter: Tim Armstrong
>Assignee: Thomas Tauber-Marshall
>Priority: Major
>  Labels: resource-management
>
> There is a tricky corner case in our resource transfer model with subplans 
> and limits. The problem is that the limit in the subplan may mean that the 
> exec node is reset before it has returned its full output. The resource 
> transfer logic generally attaches resources to batches at specific points in 
> the output, e.g. end of partition, end of block, so it's possible that 
> batches returned before the Reset() may reference resources that have not yet 
> been transferred. It's unclear if we test this scenario consistently or if 
> it's always handled correctly.
> One example is this query, reported in IMPALA-5456:
> {code}
> select c_custkey, c_mktsegment, o_orderkey, o_orderdate
> from customer c,
>   (select o1.o_orderkey, o2.o_orderdate
>from c.c_orders o1, c.c_orders o2
>where o1.o_orderkey = o2.o_orderkey limit 10) v limit 500;
> {code}






[jira] [Assigned] (IMPALA-7758) chars_formats dependent tables are created using the wrong LOCATION

2018-10-26 Thread Tim Armstrong (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7758?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Armstrong reassigned IMPALA-7758:
-

Assignee: David Knupp

> chars_formats dependent tables are created using the wrong LOCATION
> ---
>
> Key: IMPALA-7758
> URL: https://issues.apache.org/jira/browse/IMPALA-7758
> Project: IMPALA
>  Issue Type: Bug
>  Components: Infrastructure
>Affects Versions: Impala 3.1.0
>Reporter: David Knupp
>Assignee: David Knupp
>Priority: Major
>
> In testdata/bin/load-dependent-tables.sql, the LOCATION clause when creating 
> the various chars_formats tables (e.g. text) uses:
> {noformat}
> LOCATION '${hiveconf:hive.metastore.warehouse.dir}/chars_formats_text'
> {noformat}
> ...which resolves to {{/user/hive/warehouse/chars_formats_text}}
> However, the actual test warehouse root dir is {{/test-warehouse}}, not 
> {{/user/hive/warehouse}}.
> {noformat}
> $ hdfs dfs -cat /test-warehouse/chars_formats_text/chars-formats.txt
> abcde,88db79c70974e02deb3f01cfdcc5daae2078f21517d1021994f12685c0144addae3ce0dbd6a540b55b88af68486251fa6f0c8f9f94b3b1b4bc64c69714e281f388db79c70974,variable
>  length
> abc 
> ,8d3fffddf79e9a232ffd19f9ccaa4d6b37a6a243dbe0f23137b108a043d9da13121a9b505c804956b22e93c7f93969f4a7ba8ddea45bf4aab0bebc8f814e09918d3fffddf79e,abc
> abcdef,68f8c4575da360c32abb46689e58193a0eeaa905ae6f4a5e6c702a6ae1db35a6f86f8222b7a5489d96eb0466c755b677a64160d074617096a8c6279038bc720468f8c4575da3,b2fe9d4638503a57f93396098f24103a20588631727d0f0b5016715a3f6f2616628f09b1f63b23e484396edf949d9a1c307dbe11f23b971afd75b0f639d8a3f1
> {noformat}
> versus...
> {noformat}
> $ hdfs dfs -cat /user/hive/warehouse/chars_formats_text/chars-formats.txt
> cat: `/user/hive/warehouse/chars_formats_text/chars-formats.txt': No such 
> file or directory
> {noformat}
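
To make the mismatch concrete, the sketch below mimics how a ${hiveconf:...} reference in the LOCATION clause expands under two different values of hive.metastore.warehouse.dir. It is an illustration only: resolve_hiveconf is a hypothetical helper, not Hive's substitution code, and the two configuration values are the paths quoted in the issue.

```python
import re


def resolve_hiveconf(sql: str, conf: dict) -> str:
    """Expand ${hiveconf:key} references in `sql` with values from `conf`."""
    return re.sub(r"\$\{hiveconf:([^}]+)\}", lambda m: conf[m.group(1)], sql)


clause = "LOCATION '${hiveconf:hive.metastore.warehouse.dir}/chars_formats_text'"

# What the clause resolves to according to the issue.
resolved = resolve_hiveconf(
    clause, {"hive.metastore.warehouse.dir": "/user/hive/warehouse"})

# What it would have to resolve to in order to match the test data on HDFS.
expected = resolve_hiveconf(
    clause, {"hive.metastore.warehouse.dir": "/test-warehouse"})

print(resolved)
print(expected)
```

The first result points at a directory that, per the hdfs output above, does not exist, while the test data lives under the second.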






[jira] [Assigned] (IMPALA-7363) Spurious error generated by sequence file scanner with weird scan range length

2018-10-26 Thread Tim Armstrong (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7363?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Armstrong reassigned IMPALA-7363:
-

Assignee: Pooja Nilangekar

> Spurious error generated by sequence file scanner with weird scan range length
> --
>
> Key: IMPALA-7363
> URL: https://issues.apache.org/jira/browse/IMPALA-7363
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 3.1.0
>Reporter: Tim Armstrong
>Assignee: Pooja Nilangekar
>Priority: Major
>  Labels: avro
>
> Repro on master
> {noformat}
> tarmstrong@tarmstrong-box:~/Impala/incubator-impala$ impala-shell.sh
> Starting Impala Shell without Kerberos authentication
> Connected to localhost:21000
> Server version: impalad version 3.1.0-SNAPSHOT DEBUG (build 
> cec33fa0ae75392668273d40b5a1bc4bbd7e9e2e)
> ***
> Welcome to the Impala shell.
> (Impala Shell v3.1.0-SNAPSHOT (cec33fa) built on Thu Jul 26 09:50:10 PDT 2018)
> To see a summary of a query's progress that updates in real-time, run 'set
> LIVE_PROGRESS=1;'.
> ***
> [localhost:21000] default> use tpch_seq_snap;
> Query: use tpch_seq_snap
> [localhost:21000] tpch_seq_snap> SET max_scan_range_length=5377;
> MAX_SCAN_RANGE_LENGTH set to 5377
> [localhost:21000] tpch_seq_snap> select count(*)
>> from lineitem;
> Query: select count(*)
> from lineitem
> Query submitted at: 2018-07-26 14:10:18 (Coordinator: 
> http://tarmstrong-box:25000)
> Query progress can be monitored at: 
> http://tarmstrong-box:25000/query_plan?query_id=e9428efe173ad2f4:84b66bdb
> +--+
> | count(*) |
> +--+
> | 5993651  |
> +--+
> WARNINGS: SkipText: length is negative
> Problem parsing file 
> hdfs://localhost:20500/test-warehouse/tpch.lineitem_seq_snap/00_0 at 
> 36472193
> {noformat}
> Found while adding a test for IMPALA-7360






[jira] [Assigned] (IMPALA-7738) Implement timeouts for HDFS calls

2018-10-26 Thread Joe McDonnell (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7738?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joe McDonnell reassigned IMPALA-7738:
-

Assignee: Joe McDonnell

> Implement timeouts for HDFS calls
> -
>
> Key: IMPALA-7738
> URL: https://issues.apache.org/jira/browse/IMPALA-7738
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Backend
>Affects Versions: Impala 2.7.0, Impala 2.8.0, Impala 2.9.0, Impala 2.10.0, 
> Impala 2.11.0, Impala 3.0, Impala 2.12.0
>Reporter: Michael Ho
>Assignee: Joe McDonnell
>Priority: Critical
>
> Currently, there is no timeout on the various HDFS calls (e.g. hdfsOpen(), 
> hdfsRead()) we make in libhdfs.so in either the disk-io-mgr thread or scanner 
> thread context. Various users of Impala have complained in the past about hung 
> queries which eventually boiled down to stuck HDFS calls. HDFS maintainers 
> have been slow to find the root cause of those hangs. To make this kind of 
> stuck-query problem easier to identify in the future, we should just 
> enforce a timeout in various HDFS calls so the queries will fail when certain 
> HDFS calls take longer than a designated timeout period.
> There may be multiple layers at which this timeout can be enforced:
>  * at the Impala level, we can have a fixed-size thread pool which handles all 
> HDFS calls. The existing HDFS calls will be wrapped with a timeout.
>  * at libhdfs.so, enforce a timeout at places in the HDFS client code which 
> may block forever.
> The second option is probably beyond the charter of the Apache Impala project.
> cc'ing [~tarmstr...@cloudera.com], [~joemcdonnell]
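
The first option in the description (a fixed-size thread pool whose callers wait with a timeout) can be sketched as follows. This is a minimal Python illustration, not Impala code: `call_with_timeout`, `slow_open`, `HdfsCallTimeout`, and the pool size are hypothetical names and values chosen for the example, and a real implementation would wrap libhdfs calls in the C++ backend.

```python
import concurrent.futures
import time

# Hypothetical fixed-size pool standing in for "a thread pool which handles all HDFS calls".
_POOL = concurrent.futures.ThreadPoolExecutor(max_workers=4)


class HdfsCallTimeout(Exception):
    """Raised when a wrapped call does not finish within the timeout."""


def call_with_timeout(fn, timeout_s, *args):
    """Run fn(*args) on the pool and stop waiting after timeout_s seconds.

    The worker thread is not killed on timeout; it keeps running until fn returns.
    """
    future = _POOL.submit(fn, *args)
    try:
        return future.result(timeout=timeout_s)
    except concurrent.futures.TimeoutError:
        raise HdfsCallTimeout(f"{fn.__name__} exceeded {timeout_s}s") from None


def slow_open(path, delay_s):
    """Stand-in for a blocking call such as hdfsOpen(): sleeps, then returns a fake handle."""
    time.sleep(delay_s)
    return f"handle:{path}"
```

Note that a call which never returns keeps occupying one pool thread, so with a fixed-size pool enough stuck calls would eventually exhaust it; the sketch does not address that.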






[jira] [Assigned] (IMPALA-4268) buffer more than a batch of rows at coordinator

2018-10-26 Thread Tim Armstrong (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-4268?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Armstrong reassigned IMPALA-4268:
-

Assignee: Bikramjeet Vig

> buffer more than a batch of rows at coordinator
> ---
>
> Key: IMPALA-4268
> URL: https://issues.apache.org/jira/browse/IMPALA-4268
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Backend
>Affects Versions: Impala 2.8.0
>Reporter: Henry Robinson
>Assignee: Bikramjeet Vig
>Priority: Major
>  Labels: query-lifecycle, resource-management
> Attachments: rows-produced-histogram.png
>
>
> In IMPALA-2905, we are introducing a {{PlanRootSink}} that handles the 
> production of output rows at the root of a plan.
> The implementation in IMPALA-2905 has the plan execute in a separate thread 
> to the consumer, which calls {{GetNext()}} to retrieve the rows. However, the 
> sender thread will block until {{GetNext()}} is called, so that there are no 
> complications about memory usage and ownership due to having several batches 
> in flight at one time.
> However, this also leads to many context switches, as each {{GetNext()}} call 
> yields to the sender to produce the rows. If the sender were to fill a buffer 
> asynchronously, the consumer could pull out of that buffer without taking a 
> context switch in many cases (and the extra buffering might smooth out any 
> performance spikes due to client delays, which currently directly affect plan 
> execution).
> The tricky part is managing the mismatch between the size of the row batches 
> processed in {{Send()}} and the size of the fetch result asked for by the 
> client. The sender materializes output rows in a {{QueryResultSet}} that is 
> owned by the coordinator. That is not, currently, a splittable object - 
> instead it contains the actual RPC response struct that will hit the wire 
> when the RPC completes. An asynchronous sender cannot know the batch size, 
> which may change on every fetch call. So the {{GetNext()}} implementation 
> would need to be able to split out the {{QueryResultSet}} to match the 
> correct fetch size, and handle stitching together other {{QueryResultSets}} - 
> without doing extra copies.
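
The buffering idea in the description can be illustrated with a toy bounded buffer between a sender and a consumer whose fetch size varies per call. This is a hedged Python sketch, not the `PlanRootSink` design: `RowBuffer` and its methods are hypothetical names, and it buffers individual rows (copying them), which deliberately sidesteps the hard part the description names, namely splitting and stitching `QueryResultSet` objects without extra copies.

```python
import collections
import threading


class RowBuffer:
    """Bounded buffer of rows between a producer (sender) and a consumer (fetch calls)."""

    def __init__(self, max_rows):
        self._rows = collections.deque()
        self._max_rows = max_rows
        self._cond = threading.Condition()
        self._closed = False

    def send(self, batch):
        """Append a batch row by row, blocking while the buffer is full."""
        with self._cond:
            for row in batch:
                while len(self._rows) >= self._max_rows:
                    self._cond.wait()
                self._rows.append(row)
                self._cond.notify_all()

    def close(self):
        """Signal that no more rows will be sent."""
        with self._cond:
            self._closed = True
            self._cond.notify_all()

    def get_next(self, fetch_size):
        """Return up to fetch_size buffered rows; an empty list means end of stream."""
        with self._cond:
            while not self._rows and not self._closed:
                self._cond.wait()
            out = []
            while self._rows and len(out) < fetch_size:
                out.append(self._rows.popleft())
            self._cond.notify_all()
            return out
```

Because the sender fills the buffer independently, a `get_next()` call returns immediately whenever rows are already buffered, and the fetch size can differ from the sender's batch size on every call.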






[jira] [Created] (IMPALA-7773) ScanRange::ReadFromCache() holds RequestContext lock while opening file

2018-10-26 Thread Joe McDonnell (JIRA)
Joe McDonnell created IMPALA-7773:
-

 Summary: ScanRange::ReadFromCache() holds RequestContext lock 
while opening file
 Key: IMPALA-7773
 URL: https://issues.apache.org/jira/browse/IMPALA-7773
 Project: IMPALA
  Issue Type: Bug
  Components: Backend
Affects Versions: Impala 3.1.0
Reporter: Joe McDonnell


When a scanner thread is reading a file via HDFS caching, it executes 
ScanRange::ReadFromCache() while holding the RequestContext::lock_. 
ReadFromCache() calls Open(), which can require an RPC to the NameNode for 
HDFS. If the NameNode is slow (or hangs), the scanner thread holds the lock for 
the duration. Other scanner threads for this scan node use this lock, and Disk 
Io Mgr threads sometimes need it too (e.g. RequestContext::ReadDone(), called 
by DiskThreadLoop()), so a slow Open() can block them and severely impact the 
system.

We should look into what it would take to drop the lock during the Open() call 
(and during the subsequent hadoopReadZero() call). 
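
The general pattern of dropping a lock around a slow call can be sketched as below. This is a hypothetical Python illustration only: the `RequestContext` class here is a toy with invented fields, `open_fn` stands in for `Open()`, and the real change would need to work out which state must be re-validated after the lock is reacquired.

```python
import threading


class RequestContext:
    """Toy stand-in for per-request state guarded by a single lock."""

    def __init__(self):
        self.lock = threading.Lock()
        self.cancelled = False
        self.handle = None


def read_from_cache(ctx, path, open_fn):
    """Check state under the lock, call open_fn without it, re-check, then publish."""
    with ctx.lock:
        if ctx.cancelled:
            return False
    handle = open_fn(path)  # potentially slow; the lock is NOT held here
    with ctx.lock:
        if ctx.cancelled:  # state may have changed while the lock was dropped
            return False
        ctx.handle = handle
        return True
```

The second check matters because any condition verified before releasing the lock can be invalidated by another thread while the slow call is in flight.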






[jira] [Resolved] (IMPALA-7532) Add retry/back-off to fetch-from-catalog RPCs

2018-10-26 Thread Tianyi Wang (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7532?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tianyi Wang resolved IMPALA-7532.
-
   Resolution: Fixed
Fix Version/s: Impala 3.2.0

> Add retry/back-off to fetch-from-catalog RPCs
> -
>
> Key: IMPALA-7532
> URL: https://issues.apache.org/jira/browse/IMPALA-7532
> Project: IMPALA
>  Issue Type: Sub-task
>Reporter: Todd Lipcon
>Assignee: Tianyi Wang
>Priority: Major
> Fix For: Impala 3.2.0
>
>
> Currently if there is an error connecting to the catalog server, the 'fetch 
> from catalog' implementation will retry with no apparent backoff. We should 
> retry for some period of time with backoff in between the attempts, so that 
> impala can ride over short interruptions of the catalog service.
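
Retrying for a bounded period with back-off between attempts can be sketched as follows. This is a generic Python illustration, not the fix that was committed: `retry_with_backoff`, its parameters, and the use of `ConnectionError` as the retryable error are all assumptions made for the example.

```python
import random
import time


def retry_with_backoff(fn, max_wait_s=5.0, initial_delay_s=0.1, max_delay_s=1.0,
                       sleep=time.sleep, clock=time.monotonic):
    """Call fn() until it succeeds or max_wait_s would be exceeded.

    The delay doubles after each failure, capped at max_delay_s.
    """
    deadline = clock() + max_wait_s
    delay = initial_delay_s
    while True:
        try:
            return fn()
        except ConnectionError:
            if clock() + delay > deadline:
                raise  # out of time: surface the last error to the caller
            # Jitter keeps many clients from retrying in lockstep.
            sleep(delay * random.uniform(0.5, 1.0))
            delay = min(delay * 2, max_delay_s)
```

Passing `sleep` and `clock` as parameters is a design choice for testability; callers would normally leave the defaults.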






[jira] [Commented] (IMPALA-3446) Materialized views

2018-10-26 Thread Ruslan Dautkhanov (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-3446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16665602#comment-16665602
 ] 

Ruslan Dautkhanov commented on IMPALA-3446:
---

Any updates on this case?

Both HIVE-14484 and HIVE-14249 are resolved in Hive so technically materialized 
views are supported in upstream Hive now. 

It would be great to have the same in Impala, since Impala's use cases center on 
faster interactive response times than Hive's.

Thank you.

 

> Materialized views
> --
>
> Key: IMPALA-3446
> URL: https://issues.apache.org/jira/browse/IMPALA-3446
> Project: IMPALA
>  Issue Type: New Feature
>  Components: Frontend
>Affects Versions: Impala 2.5.0
>Reporter: Marcell Szabo
>Priority: Minor
>
> This JIRA is a placeholder for this big topic.
> Some user stories I can imagine under the epic:
> # materialized view is a CTAS where the SELECT is saved in the HMS and can be 
> rerun by a simple command
> # materialized view detects that the source data has changed and falls back 
> to be a view instead of SELECT *
> # materialized view detects that the source data has changed and reruns the 
> CTAS automatically
> # ... reruns only the necessary changes (e.g. if only some of the partitions 
> change)
> # Impala and Hive to have common semantics of materialized views 
> (https://issues.apache.org/jira/browse/HIVE-10459)
> # materialized view stores extra statistics that help the optimizer, storing 
> the full resultset is optional
> # query optimizer checks for every query whether part of the query plan can 
> be covered by an existing materialized view






[jira] [Commented] (IMPALA-7698) Add centos/redhat 6/7 support to bootstrap_system.sh

2018-10-26 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16665583#comment-16665583
 ] 

ASF subversion and git services commented on IMPALA-7698:
-

Commit c1701074d6e94d98a43ab049ef807ac1b368180f in impala's branch 
refs/heads/master from [~philip]
[ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=c170107 ]

IMPALA-7698: Add centos support to bootstrap_system.

Largely, the changes involve conditionalizing some invocations to
account for differences between RH and Ubuntu. The trickiest bits were
timezone-related test errors (see below), postgresql permissions (need
to accept md5 passwords from localhost) and default ulimits (1024 user
processes/threads is not enough).

To test this, I built using test-with-docker. In addition to the
ulimit issue, I ran into the fact that /tmp needed 1777 permissions for
the postgresql socket, and entrypoint.sh had a few places that needed
special cases. At the moment, the data load ran fine, as did most of the
tests. I observed a test that relied on a python2.7-ism fail, which is
part of the point of this.

In the course of development, I encountered a handful of tests failing with
"Encounter parse error: failed to open /usr/share/zoneinfo/GMT-08:00 -
No such file or directory.", which was reproduced as follows:

[localhost:21000] default> use functional_orc_def; select * from alltypes;
...
WARNINGS: Encounter parse error: failed to open 
/usr/share/zoneinfo/GMT-08:00 - No such file or directory.

With Quanlong's help, I learned what was happening. test-with-docker was
translating my time zone (America/Los_Angeles) to US/Pacific-New,
because realpath(/etc/localtime) = US/Pacific-New. This timezone exists
in centos:6, so that wasn't a problem. However, this timezone does not
exist in the package "tzdata-java", which is the copy of the timezone
information used by Java. (There are bugs here that may have been fixed
in centos:7.) As a result, when ORC asks (by using
TimeZone.getDefault().getID()) the JDK
(src/solaris/native/java/util/TimeZone_md.c) for the default timezone,
it can't find the same name as /etc/localtime points to in its
repository and defaults to "GMT-08:00". This string then gets written
into the ORC files generated by Hive as part of data load, and then the
C++ library can't read them. This is fixed by changing "realpath"
to "readlink" in test-with-docker.py.
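
The difference between the two calls can be reproduced with a chain of symlinks. The following is a small Python sketch under an assumed layout (a `localtime` link pointing at `America/Los_Angeles`, which is itself a link to `US/Pacific-New`); `build_demo_tree` and `zone_from_link` are hypothetical helpers for the demonstration and are not the code in test-with-docker.py.

```python
import os
import tempfile


def build_demo_tree():
    """Create localtime -> America/Los_Angeles -> US/Pacific-New in a temp dir."""
    root = os.path.realpath(tempfile.mkdtemp())
    zoneinfo = os.path.join(root, "zoneinfo")
    os.makedirs(os.path.join(zoneinfo, "US"))
    os.makedirs(os.path.join(zoneinfo, "America"))
    real_file = os.path.join(zoneinfo, "US", "Pacific-New")
    open(real_file, "w").close()
    alias = os.path.join(zoneinfo, "America", "Los_Angeles")
    os.symlink(real_file, alias)
    localtime = os.path.join(root, "localtime")
    os.symlink(alias, localtime)
    return localtime, zoneinfo


def zone_from_link(localtime_path, zoneinfo_root, resolve_fully):
    """Derive a zone name from a localtime symlink.

    resolve_fully=True follows every symlink in the chain (like realpath);
    resolve_fully=False reads only the first link target (like readlink).
    """
    if resolve_fully:
        target = os.path.realpath(localtime_path)
    else:
        target = os.readlink(localtime_path)
    return os.path.relpath(target, zoneinfo_root)
```

With this layout, full resolution lands on the final file name while a single `readlink` step keeps the alias the user actually configured.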

centos:7 is not addressed by this change. The move to systemd makes
"service sshd start" (and the same for postgresql) not work, and
additional care is needed to work around that.

This change is a joint effort with Laszlo Gaal.

Change-Id: Id54294d7607f51de87a9de373dcfc4a33f4bedf5
Reviewed-on: http://gerrit.cloudera.org:8080/11731
Reviewed-by: Philip Zeyliger 
Tested-by: Impala Public Jenkins 


> Add centos/redhat 6/7 support to bootstrap_system.sh
> 
>
> Key: IMPALA-7698
> URL: https://issues.apache.org/jira/browse/IMPALA-7698
> Project: IMPALA
>  Issue Type: Task
>  Components: Infrastructure
>Reporter: Philip Zeyliger
>Assignee: Philip Zeyliger
>Priority: Major
>
> {{bootstrap_system.sh}} currently only works on Ubuntu. Making it work on 
> CentOS/Redhat would open the door to running automated tests on those 
> platforms more readily, including using {{test-with-docker}}.






[jira] [Work started] (IMPALA-7765) Impala 3.1 Doc: Document MAX_MEM_ESTIMATE_FOR_ADMISSION

2018-10-26 Thread Alex Rodoni (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7765?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on IMPALA-7765 started by Alex Rodoni.
---
> Impala 3.1 Doc: Document MAX_MEM_ESTIMATE_FOR_ADMISSION
> 
>
> Key: IMPALA-7765
> URL: https://issues.apache.org/jira/browse/IMPALA-7765
> Project: IMPALA
>  Issue Type: Sub-task
>  Components: Docs
>Reporter: Alex Rodoni
>Assignee: Alex Rodoni
>Priority: Major
>  Labels: future_release_doc
>







[jira] [Reopened] (IMPALA-1780) Catch exceptions thrown by UDFs

2018-10-26 Thread David Knupp (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-1780?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Knupp reopened IMPALA-1780:
-

Reopening to change Resolution field.

> Catch exceptions thrown by UDFs
> ---
>
> Key: IMPALA-1780
> URL: https://issues.apache.org/jira/browse/IMPALA-1780
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 2.1.1, Impala 2.3.0
>Reporter: Henry Robinson
>Priority: Major
>  Labels: crash, downgraded
>
> Catch exceptions thrown by UDFs so Impala doesn't crash.






[jira] [Resolved] (IMPALA-1780) Catch exceptions thrown by UDFs

2018-10-26 Thread David Knupp (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-1780?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Knupp resolved IMPALA-1780.
-
Resolution: Won't Fix







[jira] [Resolved] (IMPALA-3959) data loading jenkins jobs don't save test logs

2018-10-26 Thread David Knupp (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-3959?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Knupp resolved IMPALA-3959.
-
Resolution: Invalid

> data loading jenkins jobs don't save test logs
> --
>
> Key: IMPALA-3959
> URL: https://issues.apache.org/jira/browse/IMPALA-3959
> Project: IMPALA
>  Issue Type: Bug
>  Components: Infrastructure
>Affects Versions: Impala 2.7.0
>Reporter: Michael Brown
>Priority: Critical
>
> Even though data loading jobs run BE, FE, and core EE tests, the logs for 
> these tests are not saved. (The only artifacts saved are those used by 
> snapshot consumers: the snapshot, metastore snapshot, and git hash). This is 
> a problem when there are flaky tests that fail there that we haven't seen 
> fail elsewhere: we have no forensic evidence to search through for clues.
> Example:
> http://sandbox.jenkins.cloudera.com/job/impala-asf-master-core-data-load/29/
> {noformat}
> 22:55:09 99% tests passed, 1 tests failed out of 78
> 22:55:09 
> 22:55:09 The following tests FAILED:
> 22:55:09   13 - kudu-scan-node-test (OTHER_FAULT)
> 22:55:09 Errors while running CTest
> 22:55:09 make: *** [test] Error 8
> {noformat}
> This kudu scan node test failed, but we have no other info on it, because we 
> have no artifacts.
> Part of the problem is that the data load job has a separate entry point, so 
> everything built up in {{Impala-aux/jenkins/build.sh}} to handle archiving 
> doesn't exist for {{Impala-aux/jenkins/build-data-load.sh}}.






[jira] [Reopened] (IMPALA-3959) data loading jenkins jobs don't save test logs

2018-10-26 Thread David Knupp (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-3959?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Knupp reopened IMPALA-3959:
-

Reopening to change Resolution field.







[jira] [Commented] (IMPALA-7214) Lots of misleading/incorrect use of DataNode in Impala docs

2018-10-26 Thread Alex Rodoni (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16665557#comment-16665557
 ] 

Alex Rodoni commented on IMPALA-7214:
-

Hi [~tarmstrong], I am assigning this to you for now. When you have some time to 
look at it and have feedback, please reassign it to me. Thanks!

> Lots of misleading/incorrect use of DataNode in Impala docs
> ---
>
> Key: IMPALA-7214
> URL: https://issues.apache.org/jira/browse/IMPALA-7214
> Project: IMPALA
>  Issue Type: Bug
>  Components: Docs
>Affects Versions: Impala 2.12.0
>Reporter: Tim Armstrong
>Assignee: Tim Armstrong
>Priority: Major
>
> The docs tend to conflate DataNodes (an HDFS service) and Impala daemons. I 
> think this stems from the original deployment practice of always colocating 
> Impala daemons with HDFS datanodes so that HDFS data could always be read 
> from a local DataNode. 
> I'm a bit pedantic so the conflation feels wrong to me regardless, but I 
> think this will become increasingly confusing as alternative deployments 
> without colocated HDFS DataNodes become more common (e.g. running against S3, 
> running with a separate HDFS service).
> E.g. picking an example at random:
> {noformat}
> In Impala 1.4.0 and higher, the LIMIT clause is now 
> optional (rather than required) for
> queries that use the ORDER BY clause. Impala 
> automatically uses a temporary disk work area
> to perform the sort if the sort operation would otherwise exceed the 
> Impala memory limit for a particular
> DataNode.
> {noformat}
> This is wrong because the memory limit is for an Impala daemon, which is the 
> process that does the actual sorting. So here I think it should be "Impala 
> daemon" instead of "DataNode".






[jira] [Assigned] (IMPALA-7214) Lots of misleading/incorrect use of DataNode in Impala docs

2018-10-26 Thread Alex Rodoni (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7214?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Rodoni reassigned IMPALA-7214:
---

Assignee: Tim Armstrong  (was: Alex Rodoni)







[jira] [Work stopped] (IMPALA-7214) Lots of misleading/incorrect use of DataNode in Impala docs

2018-10-26 Thread Alex Rodoni (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7214?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on IMPALA-7214 stopped by Alex Rodoni.
---
> Lots of misleading/incorrect use of DataNode in Impala docs
> ---
>
> Key: IMPALA-7214
> URL: https://issues.apache.org/jira/browse/IMPALA-7214
> Project: IMPALA
>  Issue Type: Bug
>  Components: Docs
>Affects Versions: Impala 2.12.0
>Reporter: Tim Armstrong
>Assignee: Alex Rodoni
>Priority: Major
>






[jira] [Commented] (IMPALA-7738) Implement timeouts for HDFS calls

2018-10-26 Thread Tim Armstrong (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16665536#comment-16665536
 ] 

Tim Armstrong commented on IMPALA-7738:
---

[~joemcdonnell] more a curiosity than anything at the moment (since it only 
works for HDFS, not other filesystems), but there was some work on an 
alternative HDFS client that supports some things like that: HDFS-8707. I came 
across that a while ago.

> Implement timeouts for HDFS calls
> -
>
> Key: IMPALA-7738
> URL: https://issues.apache.org/jira/browse/IMPALA-7738
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Backend
>Affects Versions: Impala 2.7.0, Impala 2.8.0, Impala 2.9.0, Impala 2.10.0, 
> Impala 2.11.0, Impala 3.0, Impala 2.12.0
>Reporter: Michael Ho
>Priority: Critical
>






[jira] [Commented] (IMPALA-7738) Implement timeouts for HDFS calls

2018-10-26 Thread Joe McDonnell (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16665504#comment-16665504
 ] 

Joe McDonnell commented on IMPALA-7738:
---

For Open(), we only care about the file handle, and if the call blocks, it 
shouldn't hold up any memory (or it doesn't have to). For Read(), the buffer 
that we are reading into becomes toxic, because if the read ever succeeds, it 
will write to that memory. (Killing a hung thread sounds hard / error prone 
(based on 5 mins looking around), so I'm assuming we're just letting the thread 
run.)

I looked at hdfs.h, and there are no async calls or calls with timeouts.
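
One way to keep a late-completing read from writing into memory the caller has already moved on from is to let the worker own the buffer and hand it over only on success. The sketch below is a hypothetical Python illustration of that ownership idea (`fake_read` and `read_with_timeout` are invented names); it is not how Impala or libhdfs handle it, and in C++ the buffer lifetime would have to be managed explicitly.

```python
import concurrent.futures
import time


def fake_read(length, delay_s):
    """Stand-in for a blocking read: allocates its own buffer and fills it after a delay."""
    buf = bytearray(length)  # owned by the worker, not the caller
    time.sleep(delay_s)
    buf[:] = b"x" * length   # a late write only touches the worker's buffer
    return buf


def read_with_timeout(pool, length, delay_s, timeout_s):
    """Return the filled buffer, or None if the read did not finish in time."""
    future = pool.submit(fake_read, length, delay_s)
    try:
        return future.result(timeout=timeout_s)
    except concurrent.futures.TimeoutError:
        # The caller never held a reference to the buffer, so abandoning the
        # future is safe; the worker thread keeps running until the read returns.
        return None
```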







[jira] [Commented] (IMPALA-7751) Kudu insert statement should push down range partition predicates

2018-10-26 Thread Thomas Tauber-Marshall (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7751?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16665494#comment-16665494
 ] 

Thomas Tauber-Marshall commented on IMPALA-7751:


Yeah, this would be a significant change, I think.

One issue is that Impala currently treats Kudu table partitioning as a black 
box for a few reasons - Impala doesn't have a builtin concept of multi-level 
partitioning schemes (see IMPALA-5255), we don't want to have to guarantee that 
Impala's hash partitioning operates the same as Kudu's - so we currently just use 
a Kudu API call to determine the partition for each row. So, we would probably 
need a Kudu API for this that doesn't currently exist (though I suppose we 
could sort of hack it by setting up a scan over the table we're inserting into 
and then seeing which partitions the scan tokens correspond to), or we would 
need to fix the mentioned issues.

Unfortunately, while bulk inserts into Kudu are definitely a pain point for 
Impala, it's not really an area that's on the current roadmap. Of course, we 
always welcome contributions and I'm happy to work with anyone who wants to 
take some of this on.

> Kudu insert statement should push down range partition predicates
> -
>
> Key: IMPALA-7751
> URL: https://issues.apache.org/jira/browse/IMPALA-7751
> Project: IMPALA
>  Issue Type: Improvement
>Reporter: Quanlong Huang
>Priority: Major
> Attachments: metrics1.tsv, metrics2.tsv, metrics3.tsv, profile.txt
>
>
> We have a job that dumps newly added data from HDFS into a Kudu table to get 
> good point-query performance. Each day we create a new range partition in 
> Kudu for that day's data. As we added more and more Kudu range 
> partitions, the performance of this job degraded.
> The root cause is that the insert statement for Kudu does not leverage the 
> partition predicates on the Kudu range partition keys, which causes skew on 
> the insert nodes.
> How to reproduce this:
> Step 1: Launch an Impala cluster with 3 nodes.
> Step 2: Create an HDFS table with more than 3 underlying files, so that it 
> has more than 3 scan ranges
> {code:sql}
> create table default.metrics_tbl (
>   source_id string,
>   event_timestamp bigint,
>   value double
> ) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' STORED AS TEXTFILE;
> {code}
> Upload the three attached tsv files into its directory and refresh this table 
> in Impala.
> Step 3: Create a Kudu table with mixed partitioning: 3 hash partitions and 
> 3 range partitions.
> {code:sql}
> create table default.metrics_kudu_tbl (
>   source_id string,
>   event_timestamp bigint,
>   value double,
>   primary key(source_id, event_timestamp)
> ) partition by
>   hash (source_id) PARTITIONS 3,
>   range (event_timestamp) (
>     partition 0 <= values < 1,
>     partition 1 <= values < 2,
>     partition 2 <= values < 3
>   ) stored as kudu;
> {code}
> Step 4: Insert rows from the HDFS table into Kudu, supplying partition predicates.
> {code:sql}
> insert into table metrics_kudu_tbl
>   select source_id, event_timestamp, value from metrics_tbl
>   where event_timestamp >= 1 and event_timestamp < 2;
> {code}
> Step 5: In the profile, there are three fragment instances containing 
> KuduTableSink, but only one of them received and generated data.
> {code:java}
> Averaged Fragment F01:
>   KuduTableSink:
>     - TotalNumRows: 1.00K (1000)
> Fragment F01:
>   Instance 6347506799a2966d:6e82f4920004
>     KuduTableSink:
>       - TotalNumRows: 3.00K (3000)
>   Instance 6347506799a2966d:6e82f4920005
>     KuduTableSink:
>       - TotalNumRows: 0 (0)
>   Instance 6347506799a2966d:6e82f4920003
>     KuduTableSink:
>       - TotalNumRows: 0 (0)
> {code}
> Thus, only one fragment instance of F01 is sorting and ingesting data into 
> Kudu.
> Generally, if there are N range partitions and all the inserted rows belong 
> to one range (as selected by the partition predicates in the WHERE clause), 
> only 1/N of the insert fragments produce data.
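
The skew in the profile above can be reproduced with a toy model. This is only a sketch: nothing in this thread states how Impala numbers Kudu partitions or routes rows to sink instances, so both the hash-bucket-major partition numbering and the modulo routing below are assumptions made for illustration, and the function names are hypothetical.

```python
def sink_instance(hash_bucket, range_idx, num_ranges, num_instances):
    # Assumption: partitions are numbered hash-bucket-major.
    partition_index = hash_bucket * num_ranges + range_idx
    # Assumption: a row goes to instance (partition index mod instance count).
    return partition_index % num_instances


def rows_per_instance(rows, num_ranges, num_instances):
    """Count how many (hash_bucket, range_idx) rows land on each sink instance."""
    counts = {i: 0 for i in range(num_instances)}
    for hash_bucket, range_idx in rows:
        counts[sink_instance(hash_bucket, range_idx, num_ranges, num_instances)] += 1
    return counts


# 3000 rows, all in range partition 1, spread evenly over 3 hash buckets.
rows = [(i % 3, 1) for i in range(3000)]
skew = rows_per_instance(rows, num_ranges=3, num_instances=3)
print(skew)
```

Under these assumptions every row selected by the WHERE clause maps to the same instance, which matches the 3000 / 0 / 0 split in the profile. With a different routing scheme the numbers would differ.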



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Closed] (IMPALA-6814) query_test.test_queries.TestQueriesTextTables.test_strict_mode failing on remote clusters

2018-10-26 Thread David Knupp (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-6814?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Knupp closed IMPALA-6814.
---
Resolution: Cannot Reproduce

Apparently not. The actual line was {{row_regex: .*Error parsing 
row: file: $NAMENODE/.* before offset: \d+}}, and $NAMENODE was resolving to 
"localhost". Some other change must have fixed that, because it's now 
resolving properly.
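
To illustrate why the resolved value of $NAMENODE decides whether this check passes, here is a minimal sketch. The helper `matches_expected` and the host name `namenode.example.com` are hypothetical and are not taken from Impala's test framework; only the expected line and the shape of the actual error come from this issue.

```python
import re

ROW_REGEX_PREFIX = "row_regex:"


def matches_expected(expected_line, actual_line, namenode):
    """Hypothetical helper: substitute $NAMENODE, then compare or regex-match."""
    if expected_line.startswith(ROW_REGEX_PREFIX):
        pattern = expected_line[len(ROW_REGEX_PREFIX):].strip()
        # Escape the substituted value so dots in a host name stay literal.
        pattern = pattern.replace("$NAMENODE", re.escape(namenode))
        return re.match(pattern, actual_line) is not None
    return expected_line.replace("$NAMENODE", namenode) == actual_line


expected = r"row_regex: .*Error parsing row: file: $NAMENODE/.* before offset: \d+"
actual = ("Error parsing row: file: "
          "hdfs://namenode.example.com:8020/test-warehouse/overflow/overflow.txt, "
          "before offset: 454")

# Resolving to localhost fails against a remote cluster; the real address passes.
print(matches_expected(expected, actual, "hdfs://localhost:20500"))
print(matches_expected(expected, actual, "hdfs://namenode.example.com:8020"))
```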

> query_test.test_queries.TestQueriesTextTables.test_strict_mode failing on 
> remote clusters
> -
>
> Key: IMPALA-6814
> URL: https://issues.apache.org/jira/browse/IMPALA-6814
> Project: IMPALA
>  Issue Type: Bug
>  Components: Infrastructure
>Affects Versions: Impala 2.12.0
>Reporter: David Knupp
>Assignee: David Knupp
>Priority: Critical
>
> It looks like {{localhost}} is hardcoded in the test verification.
>  
> *Stacktrace*
> query_test/test_queries.py:161: in test_strict_mode
>  self.run_test_case('QueryTest/strict-mode', vector)
>  common/impala_test_suite.py:427: in run_test_case
>  self.__verify_results_and_errors(vector, test_section, result, use_db)
>  common/impala_test_suite.py:300: in __verify_results_and_errors
>  replace_filenames_with_placeholder)
>  common/test_result_verifier.py:317: in verify_raw_results
>  verify_errors(expected_errors, actual_errors)
>  common/test_result_verifier.py:274: in verify_errors
>  VERIFIER_MAP['VERIFY_IS_EQUAL'](expected, actual)
>  common/test_result_verifier.py:231: in verify_query_result_is_equal
>  assert expected_results == actual_results
>  E assert Comparing QueryTestResults (expected vs actual):
>  [...]
>  E row_regex: .*Error parsing row: file: 
> hdfs://{color:#ff}*localhost*{color}:20500/.* before offset: \d+ != 
> 'Error parsing row: file: 
> hdfs://**:8020/test-warehouse/overflow/overflow.txt, before offset: 
> 454'
>  E row_regex: .*Error parsing row: file: 
> hdfs://{color:#ff}*localhost*{color}:20500/.* before offset: \d+ != 
> 'Error parsing row: file: 
> hdfs://**:8020/test-warehouse/overflow/overflow.txt, before offset: 
> 454'






[jira] [Assigned] (IMPALA-7772) Print message when open file limit restricts size of file handle cache

2018-10-26 Thread Joe McDonnell (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7772?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joe McDonnell reassigned IMPALA-7772:
-

Assignee: Joe McDonnell

> Print message when open file limit restricts size of file handle cache
> --
>
> Key: IMPALA-7772
> URL: https://issues.apache.org/jira/browse/IMPALA-7772
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Backend
>Affects Versions: Impala 3.1.0
>Reporter: Joe McDonnell
>Assignee: Joe McDonnell
>Priority: Major
>
> The size of the file handle cache is determined by the 
> min(max_cached_file_handles, OS limit on number of open files). Right now, 
> there is no message printed if the OS limit on the number of open files is 
> less than max_cached_file_handles. This is confusing, because the file handle 
> cache will be smaller than expected. We should print a message on startup in 
> this case.






[jira] [Created] (IMPALA-7772) Print message when open file limit restricts size of file handle cache

2018-10-26 Thread Joe McDonnell (JIRA)
Joe McDonnell created IMPALA-7772:
-

 Summary: Print message when open file limit restricts size of file 
handle cache
 Key: IMPALA-7772
 URL: https://issues.apache.org/jira/browse/IMPALA-7772
 Project: IMPALA
  Issue Type: Improvement
  Components: Backend
Affects Versions: Impala 3.1.0
Reporter: Joe McDonnell


The size of the file handle cache is determined by the 
min(max_cached_file_handles, OS limit on number of open files). Right now, 
there is no message printed if the OS limit on the number of open files is less 
than max_cached_file_handles. This is confusing, because the file handle cache 
will be smaller than expected. We should print a message on startup in this 
case.
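
The requested behaviour could look roughly like the sketch below. It is not Impala's backend code: the function name and the message wording are made up for illustration, and the two limits are passed in explicitly rather than read from the process (on Linux the soft limit could be read with `resource.getrlimit(resource.RLIMIT_NOFILE)`).

```python
def effective_cache_size(max_cached_file_handles, open_file_limit):
    """Return (cache size, warning or None) for the two limits."""
    size = min(max_cached_file_handles, open_file_limit)
    warning = None
    if open_file_limit < max_cached_file_handles:
        # Hypothetical wording; the issue only asks that some message be printed.
        warning = (
            "File handle cache size limited to %d by the OS open file limit "
            "(max_cached_file_handles=%d)" % (size, max_cached_file_handles)
        )
    return size, warning


size, warning = effective_cache_size(20000, 1024)
if warning is not None:
    print(warning)
```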






[jira] [Assigned] (IMPALA-7087) Impala is unable to read Parquet decimal columns with lower precision/scale than table metadata

2018-10-26 Thread Lars Volker (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7087?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Volker reassigned IMPALA-7087:
---

Assignee: Sahil Takiar

> Impala is unable to read Parquet decimal columns with lower precision/scale 
> than table metadata
> ---
>
> Key: IMPALA-7087
> URL: https://issues.apache.org/jira/browse/IMPALA-7087
> Project: IMPALA
>  Issue Type: Sub-task
>  Components: Backend
>Reporter: Tim Armstrong
>Assignee: Sahil Takiar
>Priority: Major
>  Labels: decimal, parquet
>
> This is similar to IMPALA-2515, except that it relates to a different 
> precision/scale 
> in the file metadata rather than just a mismatch in the bytes used to store 
> the data. In a lot of cases we should be able to convert the decimal type on 
> the fly to the higher-precision type.
> {noformat}
> ERROR: File '/hdfs/path/00_0_x_2' column 'alterd_decimal' has an invalid 
> type length. Expecting: 11 len in file: 8
> {noformat}
> It would be convenient to allow reading parquet files where the 
> precision/scale in the file can be converted to the precision/scale in the 
> table metadata without loss of precision.
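
A conversion between decimal types is lossless when the target keeps at least as many fractional digits and at least as many integer digits as the source. The sketch below spells that check out; the function names are hypothetical and this is not the Parquet reader's actual logic.

```python
def can_widen_decimal(file_precision, file_scale, table_precision, table_scale):
    """True if every decimal(file_precision, file_scale) value fits the table type."""
    file_int_digits = file_precision - file_scale
    table_int_digits = table_precision - table_scale
    return table_scale >= file_scale and table_int_digits >= file_int_digits


def rescale_unscaled(unscaled, file_scale, table_scale):
    """Rescale an unscaled integer value to the (larger or equal) table scale."""
    return unscaled * 10 ** (table_scale - file_scale)


# decimal(9,2) in the file, decimal(18,4) in the table: 123.45 -> 123.4500
print(can_widen_decimal(9, 2, 18, 4), rescale_unscaled(12345, 2, 4))
```

Note that raising only the scale is not enough: decimal(10,2) has 8 integer digits, while decimal(10,4) has 6, so that conversion can overflow.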






[jira] [Assigned] (IMPALA-7640) ALTER TABLE RENAME on managed Kudu table should rename underlying Kudu table

2018-10-26 Thread Lars Volker (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7640?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Volker reassigned IMPALA-7640:
---

Assignee: Sahil Takiar

> ALTER TABLE RENAME on managed Kudu table should rename underlying Kudu table
> 
>
> Key: IMPALA-7640
> URL: https://issues.apache.org/jira/browse/IMPALA-7640
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Catalog
>Affects Versions: Impala 2.12.0
>Reporter: Mike Percy
>Assignee: Sahil Takiar
>Priority: Major
>
> Currently, when I execute ALTER TABLE RENAME on a managed Kudu table it will 
> not rename the underlying Kudu table. Because of IMPALA-5654 it becomes 
> nearly impossible to rename the underlying Kudu table, which is confusing and 
> makes the Kudu tables harder to identify and manage.






[jira] [Updated] (IMPALA-7507) Clean up user-facing error messages in LocalCatalog mode

2018-10-26 Thread bharath v (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7507?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

bharath v updated IMPALA-7507:
--
Affects Version/s: Impala 3.1.0

> Clean up user-facing error messages in LocalCatalog mode
> 
>
> Key: IMPALA-7507
> URL: https://issues.apache.org/jira/browse/IMPALA-7507
> Project: IMPALA
>  Issue Type: Sub-task
>  Components: Catalog
>Affects Versions: Impala 3.1.0
>Reporter: Todd Lipcon
>Priority: Major
>
> Currently even normal error messages for things like missing databases are 
> quite ugly when running with LocalCatalog:
> {code}
> ERROR: LocalCatalogException: Could not load table names for database 
> 'test_minimal_topic_updates_b246004e' from HMS
> CAUSED BY: TException: 
> TGetPartialCatalogObjectResponse(status:TStatus(status_code:GENERAL, 
> error_msgs:[CatalogException: Database not found: 
> test_minimal_topic_updates_b246004e]))
> {code}






[jira] [Updated] (IMPALA-7507) Clean up user-facing error messages in LocalCatalog mode

2018-10-26 Thread bharath v (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7507?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

bharath v updated IMPALA-7507:
--
Component/s: Catalog

> Clean up user-facing error messages in LocalCatalog mode
> 
>
> Key: IMPALA-7507
> URL: https://issues.apache.org/jira/browse/IMPALA-7507
> Project: IMPALA
>  Issue Type: Sub-task
>  Components: Catalog
>Affects Versions: Impala 3.1.0
>Reporter: Todd Lipcon
>Priority: Major
>
> Currently even normal error messages for things like missing databases are 
> quite ugly when running with LocalCatalog:
> {code}
> ERROR: LocalCatalogException: Could not load table names for database 
> 'test_minimal_topic_updates_b246004e' from HMS
> CAUSED BY: TException: 
> TGetPartialCatalogObjectResponse(status:TStatus(status_code:GENERAL, 
> error_msgs:[CatalogException: Database not found: 
> test_minimal_topic_updates_b246004e]))
> {code}






[jira] [Assigned] (IMPALA-7507) Clean up user-facing error messages in LocalCatalog mode

2018-10-26 Thread bharath v (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7507?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

bharath v reassigned IMPALA-7507:
-

Assignee: Anurag Mantripragada

> Clean up user-facing error messages in LocalCatalog mode
> 
>
> Key: IMPALA-7507
> URL: https://issues.apache.org/jira/browse/IMPALA-7507
> Project: IMPALA
>  Issue Type: Sub-task
>  Components: Catalog
>Affects Versions: Impala 3.1.0
>Reporter: Todd Lipcon
>Assignee: Anurag Mantripragada
>Priority: Major
>
> Currently even normal error messages for things like missing databases are 
> quite ugly when running with LocalCatalog:
> {code}
> ERROR: LocalCatalogException: Could not load table names for database 
> 'test_minimal_topic_updates_b246004e' from HMS
> CAUSED BY: TException: 
> TGetPartialCatalogObjectResponse(status:TStatus(status_code:GENERAL, 
> error_msgs:[CatalogException: Database not found: 
> test_minimal_topic_updates_b246004e]))
> {code}






[jira] [Commented] (IMPALA-7532) Add retry/back-off to fetch-from-catalog RPCs

2018-10-26 Thread bharath v (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16665310#comment-16665310
 ] 

bharath v commented on IMPALA-7532:
---

[~tianyiwang] can this be closed?

> Add retry/back-off to fetch-from-catalog RPCs
> -
>
> Key: IMPALA-7532
> URL: https://issues.apache.org/jira/browse/IMPALA-7532
> Project: IMPALA
>  Issue Type: Sub-task
>Reporter: Todd Lipcon
>Assignee: Tianyi Wang
>Priority: Major
>
> Currently if there is an error connecting to the catalog server, the 'fetch 
> from catalog' implementation will retry with no apparent backoff. We should 
> retry for some period of time with backoff in between the attempts, so that 
> impala can ride over short interruptions of the catalog service.
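
A retry loop with backoff of the kind described could be sketched as follows. This is a generic illustration, not the fetch-from-catalog implementation: the function name, the parameters, and the choice of exponential backoff bounded by an attempt count (rather than by elapsed time) are all assumptions.

```python
import time


def call_with_backoff(rpc, max_attempts=5, initial_delay_s=0.1,
                      max_delay_s=5.0, sleep=time.sleep):
    """Call rpc(), retrying on ConnectionError with exponentially growing delays."""
    delay = initial_delay_s
    for attempt in range(1, max_attempts + 1):
        try:
            return rpc()
        except ConnectionError:
            if attempt == max_attempts:
                raise  # Out of attempts: surface the last error to the caller.
            sleep(delay)
            delay = min(delay * 2, max_delay_s)


# Usage: a stand-in RPC that fails twice before succeeding.
failures = [ConnectionError("catalog down"), ConnectionError("catalog down")]
delays = []


def flaky_rpc():
    if failures:
        raise failures.pop(0)
    return "catalog object"


result = call_with_backoff(flaky_rpc, sleep=delays.append)
print(result, delays)
```

Passing `sleep` as a parameter keeps the sketch testable without real waiting.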






[jira] [Commented] (IMPALA-7771) Download page should not link to unreleased code

2018-10-26 Thread Jim Apple (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16665184#comment-16665184
 ] 

Jim Apple commented on IMPALA-7771:
---

And which clause of the policy do you believe that violates?

> Download page should not link to unreleased code
> 
>
> Key: IMPALA-7771
> URL: https://issues.apache.org/jira/browse/IMPALA-7771
> Project: IMPALA
>  Issue Type: Bug
>Reporter: Sebb
>Priority: Major
>
> The download page must not link to unreleased code such as repos:
> http://www.apache.org/dev/release-download-pages.html#links
> Such links are only to be published on pages for developers.






[jira] [Commented] (IMPALA-7771) Download page should not link to unreleased code

2018-10-26 Thread Sebb (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16665179#comment-16665179
 ] 

Sebb commented on IMPALA-7771:
--

The download page:

https://impala.apache.org/downloads.html

references

https://git-wip-us.apache.org/repos/asf/impala.git


> Download page should not link to unreleased code
> 
>
> Key: IMPALA-7771
> URL: https://issues.apache.org/jira/browse/IMPALA-7771
> Project: IMPALA
>  Issue Type: Bug
>Reporter: Sebb
>Priority: Major
>
> The download page must not link to unreleased code such as repos:
> http://www.apache.org/dev/release-download-pages.html#links
> Such links are only to be published on pages for developers.






[jira] [Commented] (IMPALA-7771) Download page should not link to unreleased code

2018-10-26 Thread Jim Apple (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16665170#comment-16665170
 ] 

Jim Apple commented on IMPALA-7771:
---

Thanks for taking an interest, [~s...@apache.org]. I'm not sure which links on 
that page violate which clause of the policy that you linked to.

> Download page should not link to unreleased code
> 
>
> Key: IMPALA-7771
> URL: https://issues.apache.org/jira/browse/IMPALA-7771
> Project: IMPALA
>  Issue Type: Bug
>Reporter: Sebb
>Priority: Major
>
> The download page must not link to unreleased code such as repos:
> http://www.apache.org/dev/release-download-pages.html#links
> Such links are only to be published on pages for developers.






[jira] [Created] (IMPALA-7771) Download page should not link to unreleased code

2018-10-26 Thread Sebb (JIRA)
Sebb created IMPALA-7771:


 Summary: Download page should not link to unreleased code
 Key: IMPALA-7771
 URL: https://issues.apache.org/jira/browse/IMPALA-7771
 Project: IMPALA
  Issue Type: Bug
Reporter: Sebb


The download page must not link to unreleased code such as repos:

http://www.apache.org/dev/release-download-pages.html#links

Such links are only to be published on pages for developers.






[jira] [Created] (IMPALA-7770) SPLIT_PART to support negative indexes

2018-10-26 Thread Tristan Stevens (JIRA)
Tristan Stevens created IMPALA-7770:
---

 Summary: SPLIT_PART to support negative indexes
 Key: IMPALA-7770
 URL: https://issues.apache.org/jira/browse/IMPALA-7770
 Project: IMPALA
  Issue Type: Improvement
  Components: Backend
Reporter: Tristan Stevens


The request is for SPLIT_PART to support negative indexes, i.e. right-to-left 
searching.

See the Snowflake documentation for details: 
https://docs.snowflake.net/manuals/sql-reference/functions/split_part.html

partNr: Requested part of the split (1-based). 0 is treated as 1. If the value 
is negative, the parts are counted from the right side of the string.
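
The quoted semantics can be sketched in a few lines. This is an illustration of the requested behaviour, not Impala's implementation; returning an empty string for an out-of-range index and requiring a non-empty delimiter are assumptions made here.

```python
def split_part(source, delimiter, part_nr):
    """Return the part_nr-th field of source; negative part_nr counts from the right."""
    parts = source.split(delimiter)  # Assumption: delimiter is non-empty.
    if part_nr == 0:
        part_nr = 1  # 0 is treated as 1.
    index = part_nr - 1 if part_nr > 0 else len(parts) + part_nr
    if 0 <= index < len(parts):
        return parts[index]
    return ""  # Assumption: out-of-range indexes yield an empty string.


print(split_part("a,b,c", ",", 1))
print(split_part("a,b,c", ",", -1))
```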


