[jira] [Commented] (IMPALA-8781) Add additional tests in test_result_spooling.py and validate cancellation logic
[ https://issues.apache.org/jira/browse/IMPALA-8781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16901695#comment-16901695 ]

ASF subversion and git services commented on IMPALA-8781:
---------------------------------------------------------

Commit e849855b232660d418dd5e226221bfc853b1ad9c in impala's branch refs/heads/master from Sahil Takiar
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=e849855 ]

IMPALA-8781: Fix TestResultSpooling::test_multi_batches

Prefix the query in TestResultSpooling::test_multi_batches with the database name. The unqualified table name was causing the Dockerized tests to fail. I double-checked what other tests do, and all the ones I saw either switch to the appropriate database or prefix the table name with the database name. The latter seemed more straightforward.

I was not able to reproduce this locally, and it's odd that this only affected the Dockerized tests (odder still, it seems to be either intermittent or limited to Dockerized tests triggered by gerrit-verify-dryrun-external). Regardless, it is a straightforward fix that makes TestResultSpooling::test_multi_batches consistent with the rest of the tests.

Testing:
* Ran test_result_spooling.py locally using both bin/impala-py.test and tests/run-tests.py.

Change-Id: I939eedba37003f5c720cea96e5c3532e2cc6312c
Reviewed-on: http://gerrit.cloudera.org:8080/14022
Reviewed-by: Impala Public Jenkins
Tested-by: Impala Public Jenkins

> Add additional tests in test_result_spooling.py and validate cancellation logic
> -------------------------------------------------------------------------------
>
> Key: IMPALA-8781
> URL: https://issues.apache.org/jira/browse/IMPALA-8781
> Project: IMPALA
> Issue Type: Sub-task
> Components: Backend
> Reporter: Sahil Takiar
> Assignee: Sahil Takiar
> Priority: Major
> Fix For: Impala 3.3.0
>
> {{test_result_spooling.py}} currently runs a few basic tests with result spooling enabled. We should add some more to cover all necessary edge cases (ensure all Impala types are returned correctly, UDFs are evaluated correctly, etc.) and add tests to validate the cancellation logic in {{PlanRootSink}}.

--
This message was sent by Atlassian JIRA (v7.6.14#76016)

To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Commented] (IMPALA-8833) Check failed: bit_width <= sizeof(T) * 8 (40 vs. 32) in BatchedBitReader::UnpackBatch()
[ https://issues.apache.org/jira/browse/IMPALA-8833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16901669#comment-16901669 ] ASF subversion and git services commented on IMPALA-8833: - Commit 33f1e86ce3dc5e278b804004671062f46d42d90e in impala's branch refs/heads/master from Daniel Becker [ https://gitbox.apache.org/repos/asf?p=impala.git;h=33f1e86 ] IMPALA-8833: Check failed in BatchedBitReader::UnpackBatch() After raising the maximum bit width for bit packing to 64 bits, DictDecoder accepted bit widths between 32 and 64, but internally it uses 32 bit integers and unpacking ran into a DCHECK. Adding a check to DictDecoder to catch if the bit width is higher than 32. Testing: Added a test that asserts that DictDecoder accepts bit widths 0-32 and rejects higher bit widths which could still be unpacked otherwise. Change-Id: I4cba3338a93f8287c24abbe3ad9bfcbfa756bca4 Reviewed-on: http://gerrit.cloudera.org:8080/14019 Tested-by: Impala Public Jenkins Reviewed-by: Tim Armstrong > Check failed: bit_width <= sizeof(T) * 8 (40 vs. 32) in > BatchedBitReader::UnpackBatch() > > > Key: IMPALA-8833 > URL: https://issues.apache.org/jira/browse/IMPALA-8833 > Project: IMPALA > Issue Type: Bug > Components: Backend >Affects Versions: Impala 3.3.0 >Reporter: Tim Armstrong >Assignee: Daniel Becker >Priority: Blocker > Labels: broken-build, crash, flaky > > {noformat} > F0801 21:24:10.571285 15993 bit-stream-utils.inline.h:126] > d04ba69d5da8ffd1:a9045b820001] Check failed: bit_width <= sizeof(T) * 8 > (40 vs. 
32) > *** Check failure stack trace: *** > @ 0x52f63ac google::LogMessage::Fail() > @ 0x52f7c51 google::LogMessage::SendToLog() > @ 0x52f5d86 google::LogMessage::Flush() > @ 0x52f934d google::LogMessageFatal::~LogMessageFatal() > @ 0x2b265b5 impala::BatchedBitReader::UnpackBatch<>() > @ 0x2ae8623 impala::RleBatchDecoder<>::FillLiteralBuffer() > @ 0x2b2cadb impala::RleBatchDecoder<>::DecodeLiteralValues<>() > @ 0x2b27bfb impala::DictDecoder<>::DecodeNextValue() > @ 0x2b16fed > impala::ScalarColumnReader<>::ReadSlotsNoConversion() > @ 0x2ac7252 impala::ScalarColumnReader<>::ReadSlots() > @ 0x2a76cef > impala::ScalarColumnReader<>::MaterializeValueBatchRepeatedDefLevel() > @ 0x2a58faa impala::ScalarColumnReader<>::ReadValueBatch<>() > @ 0x2a20e8e > impala::ScalarColumnReader<>::ReadNonRepeatedValueBatch() > @ 0x29b189c impala::HdfsParquetScanner::AssembleRows() > @ 0x29ac6de impala::HdfsParquetScanner::GetNextInternal() > @ 0x29aa656 impala::HdfsParquetScanner::ProcessSplit() > @ 0x249172d impala::HdfsScanNode::ProcessSplit() > @ 0x2490902 impala::HdfsScanNode::ScannerThread() > @ 0x248fc8b > _ZZN6impala12HdfsScanNode22ThreadTokenAvailableCbEPNS_18ThreadResourcePoolEENKUlvE_clEv > @ 0x2492253 > {noformat} > https://jenkins.impala.io/job/ubuntu-16.04-from-scratch/6915 > Log lines around the failure: > {noformat} > [gw5] PASSED > query_test/test_scanners.py::TestParquet::test_bad_compression_codec[protocol: > beeswax | exec_option: {'batch_size': 0, 'num_nodes': 0, > 'disable_codegen_rows_threshold': 0, 'disable_codegen': False, > 'abort_on_error': 1, 'debug_action': None, 'exec_single_node_rows_threshold': > 0} | table_format: parquet/none] > query_test/test_nested_types.py::TestMaxNestingDepth::test_load_hive_table[protocol: > beeswax | exec_option: {'batch_size': 0, 'num_nodes': 0, > 'disable_codegen_rows_threshold': 0, 'disable_codegen': False, > 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | table_format: > parquet/none] > 
query_test/test_scanners.py::TestParquet::test_bad_compression_codec[protocol: > beeswax | exec_option: {'batch_size': 0, 'num_nodes': 0, > 'disable_codegen_rows_threshold': 0, 'disable_codegen': True, > 'abort_on_error': 1, 'debug_action': > '-1:OPEN:SET_DENY_RESERVATION_PROBABILITY@0.5', > 'exec_single_node_rows_threshold': 0} | table_format: parquet/none] > [gw1] PASSED > query_test/test_tpcds_queries.py::TestTpcdsQuery::test_tpcds_q7[protocol: > beeswax | exec_option: {'decimal_v2': 0, 'batch_size': 0, 'num_nodes': 0, > 'disable_codegen_rows_threshold': 0, 'disable_codegen': False, > 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | table_format: > parquet/none] > query_test/test_tpcds_queries.py::TestTpcdsQuery::test_tpcds_q8[protocol: > beeswax | exec_option: {'decimal_v2': 0, 'batch_size': 0, 'num_nodes': 0, > 'disable_codegen_rows_threshol
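
The fix described in the IMPALA-8833 commit message (reject dictionary bit widths above 32 before unpacking) can be illustrated with a small Python sketch. This is not Impala's C++ code: the function names are hypothetical, and the little-endian, least-significant-bit-first packing is an assumption chosen only to make the example concrete.

```python
from typing import List

# Per the commit message, DictDecoder uses 32-bit integers internally,
# so it should only accept bit widths 0-32.
MAX_DICT_BIT_WIDTH = 32


def is_valid_dict_bit_width(bit_width: int) -> bool:
    """Hypothetical check mirroring the one added to DictDecoder."""
    return 0 <= bit_width <= MAX_DICT_BIT_WIDTH


def unpack_values(data: bytes, bit_width: int, num_values: int) -> List[int]:
    """Unpack num_values integers of bit_width bits each from data.

    Assumes values are packed least-significant bit first in a
    little-endian byte stream (an assumption of this sketch).
    """
    if not is_valid_dict_bit_width(bit_width):
        # Fail cleanly instead of hitting an assertion deeper in the reader.
        raise ValueError("unsupported dictionary bit width: %d" % bit_width)
    if bit_width == 0:
        return [0] * num_values
    if len(data) * 8 < bit_width * num_values:
        raise ValueError("not enough input bytes")
    bits = int.from_bytes(data, "little")
    mask = (1 << bit_width) - 1
    return [(bits >> (i * bit_width)) & mask for i in range(num_values)]
```

Validating the width at the decoder boundary means corrupt or unexpected input produces an error the caller can handle, rather than a failed check inside the unpacking routine.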
[jira] [Commented] (IMPALA-8376) Add per-directory limits for scratch disk usage
[ https://issues.apache.org/jira/browse/IMPALA-8376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16901664#comment-16901664 ]

ASF subversion and git services commented on IMPALA-8376:
---------------------------------------------------------

Commit 411189a8d733a66c363c72f8c404123d68640a3e in impala's branch refs/heads/master from Tim Armstrong
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=411189a ]

IMPALA-8376: directory limits for scratch usage

This extends the --scratch_dirs syntax to support specifying a max capacity per directory, similar to the --data_cache configuration. The capacity is separated from the directory name by ":" and uses the usual syntax for specifying memory. The following are valid arguments:
* --scratch_dirs=/dir1,/dir2 (no limits)
* --scratch_dirs=/dir1,/dir2:25G (only a limit on /dir2)
* --scratch_dirs=/dir1:5MB,/dir2 (only a limit on /dir1)
* --scratch_dirs=/dir1:-1,/dir2:0 (alternative ways of expressing no limit)

The usage is tracked with a metric per directory. Allocations from a directory start to fail when its limit is exceeded. These metrics are exposed as tmp-file-mgr.scratch-space-bytes-used.dir-0, tmp-file-mgr.scratch-space-bytes-used.dir-1, etc.

Also add support for parsing terabyte specifiers to a utility function that is used for parsing many configurations.

Testing:
Added a unit test to exercise TmpFileMgr.

Manually ran a spilling query on an impalad with multiple scratch dirs configured with different limits. Confirmed via metrics that the capacities were enforced.

Change-Id: I696146a65dbb97f1ba200ae472358ae2db6eb441
Reviewed-on: http://gerrit.cloudera.org:8080/13986
Reviewed-by: Impala Public Jenkins
Tested-by: Impala Public Jenkins

> Add per-directory limits for scratch disk usage
> -----------------------------------------------
>
> Key: IMPALA-8376
> URL: https://issues.apache.org/jira/browse/IMPALA-8376
> Project: IMPALA
> Issue Type: Sub-task
> Components: Backend
> Reporter: Tim Armstrong
> Assignee: Tim Armstrong
> Priority: Major
> Labels: resource-management
> Fix For: Impala 3.3.0
>
> The current syntax is:
> {noformat}
> --scratch_dirs=/data/1/impala/impalad,/data/10/impala/impalad,/data/11/impala/impalad,/data/2/impala/impalad,/data/3/impala/impalad,/data/4/impala/impalad,/data/5/impala/impalad,/data/6/impala/impalad,/data/7/impala/impalad,/data/8/impala/impalad,/data/9/impala/impalad,/data/12/impala/impalad
> {noformat}
> The current syntax for the data cache is
> {noformat}
> --data_cache_dir=/tmp --data_cache_size=500MB
> {noformat}
> One idea is to allow optionally specifying the limit after each directory:
> {noformat}
> --scratch_dirs=/data/1/impala/impalad:500MB,/data/10/impala/impalad:2GB,/data/11/impala/impalad
> {noformat}
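
To make the --scratch_dirs syntax from the commit message concrete, here is a minimal Python sketch of how such a flag value could be parsed. It is not Impala's parser: the function names are hypothetical, the set of unit suffixes is an assumption (the commit only refers to "the usual syntax for specifying memory" plus terabyte specifiers), and paths are assumed not to contain ":".

```python
from typing import List, Optional, Tuple

# Assumed unit suffixes and multipliers; the real parser may accept more or fewer.
_UNITS = {
    "": 1, "B": 1,
    "K": 1 << 10, "KB": 1 << 10,
    "M": 1 << 20, "MB": 1 << 20,
    "G": 1 << 30, "GB": 1 << 30,
    "T": 1 << 40, "TB": 1 << 40,
}


def parse_capacity(spec: str) -> Optional[int]:
    """Return the limit in bytes, or None when spec means "no limit" (0 or -1)."""
    spec = spec.strip().upper()
    digits = spec.rstrip("BKMGT")
    unit = spec[len(digits):]
    if unit not in _UNITS:
        raise ValueError("unknown unit in capacity: %r" % spec)
    value = int(digits)  # raises ValueError for an empty or malformed number
    if value in (0, -1):
        return None
    if value < 0:
        raise ValueError("negative capacity: %r" % spec)
    return value * _UNITS[unit]


def parse_scratch_dirs(flag_value: str) -> List[Tuple[str, Optional[int]]]:
    """Split a --scratch_dirs value into (directory, limit_in_bytes) pairs."""
    result = []
    for entry in flag_value.split(","):
        entry = entry.strip()
        if not entry:
            continue
        # Assumes the directory path itself contains no ":".
        path, sep, capacity = entry.partition(":")
        result.append((path, parse_capacity(capacity) if sep else None))
    return result
```

For instance, `parse_scratch_dirs("/dir1:5MB,/dir2")` yields a 5 MiB limit for `/dir1` and no limit for `/dir2`, matching the third example in the commit message.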
[jira] [Resolved] (IMPALA-8376) Add per-directory limits for scratch disk usage
[ https://issues.apache.org/jira/browse/IMPALA-8376?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tim Armstrong resolved IMPALA-8376.
-----------------------------------
Resolution: Fixed
Fix Version/s: Impala 3.3.0

> Add per-directory limits for scratch disk usage
> -----------------------------------------------
>
> Key: IMPALA-8376
> URL: https://issues.apache.org/jira/browse/IMPALA-8376
> Project: IMPALA
> Issue Type: Sub-task
> Components: Backend
> Reporter: Tim Armstrong
> Assignee: Tim Armstrong
> Priority: Major
> Labels: resource-management
> Fix For: Impala 3.3.0
>
> The current syntax is:
> {noformat}
> --scratch_dirs=/data/1/impala/impalad,/data/10/impala/impalad,/data/11/impala/impalad,/data/2/impala/impalad,/data/3/impala/impalad,/data/4/impala/impalad,/data/5/impala/impalad,/data/6/impala/impalad,/data/7/impala/impalad,/data/8/impala/impalad,/data/9/impala/impalad,/data/12/impala/impalad
> {noformat}
> The current syntax for the data cache is
> {noformat}
> --data_cache_dir=/tmp --data_cache_size=500MB
> {noformat}
> One idea is to allow optionally specifying the limit after each directory:
> {noformat}
> --scratch_dirs=/data/1/impala/impalad:500MB,/data/10/impala/impalad:2GB,/data/11/impala/impalad
> {noformat}
[jira] [Commented] (IMPALA-8627) Re-enable catalog v2 in containers
[ https://issues.apache.org/jira/browse/IMPALA-8627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16901663#comment-16901663 ]

ASF subversion and git services commented on IMPALA-8627:
---------------------------------------------------------

Commit 39613c8226aeb48f639bccc361f002c7085cf75a in impala's branch refs/heads/master from Vihang Karajgaonkar
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=39613c8 ]

IMPALA-8627: Enable catalog-v2 in tests

This patch enables catalog-v2 by default in all the tests.

Test fixes:
1. Modified test_observability, which fails on catalog-v2 because the profile emits different metadata load events. The test now looks for the right events in the profile depending on whether catalog-v2 is enabled.
2. The TableName.java constructor allows non-lowercased table and database names. This causes problems in the local catalog cache, which expects table names to always be lowercase. More details on this failure are available in IMPALA-8627. The patch makes sure that loadTable requests in the local catalog explicitly convert the table name to lowercase to get around the issue.
3. Fixes the JdbcTest that checks for the existence of a table comment in the getTables metadata JDBC call. In catalog-v2, since the columns are not requested, LocalTable is not loaded, and hence the test needs to check whether catalog-v2 is enabled.
4. Skips test_sanity, which creates a Hive db and issues an invalidate metadata to make it visible in the catalog. Unfortunately, in catalog-v2 there is currently no way to see a newly created database when event polling is disabled.
5. Similar to (4) above, test_metadata_query_statements.py creates a Hive db and issues an invalidate metadata. The test runs QueryTest/describe-db, which is now split into two parts: one that checks the Hive db and another that contains the rest of the queries from the original describe-db. The split makes it possible to execute only part of the test when catalog-v2 is enabled.

Change-Id: Iddbde666de2b780c0e40df716a9dfe54524e092d
Reviewed-on: http://gerrit.cloudera.org:8080/13933
Reviewed-by: Impala Public Jenkins
Tested-by: Impala Public Jenkins

> Re-enable catalog v2 in containers
> ----------------------------------
>
> Key: IMPALA-8627
> URL: https://issues.apache.org/jira/browse/IMPALA-8627
> Project: IMPALA
> Issue Type: Sub-task
> Components: Infrastructure
> Affects Versions: Impala 3.3.0
> Reporter: Tim Armstrong
> Assignee: Vihang Karajgaonkar
> Priority: Major
> Labels: catalog-v2
> Fix For: Impala 3.3.0
>
> We also need to set --invalidate_tables_on_memory_pressure on the impalads for that to be fully effective - the impalads send table usage info to the catalogd
[jira] [Commented] (IMPALA-8813) Impala Doc: Support Hive ACID Insert-only Tables
[ https://issues.apache.org/jira/browse/IMPALA-8813?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16901644#comment-16901644 ]

Alex Rodoni commented on IMPALA-8813:
-------------------------------------

https://gerrit.cloudera.org/#/c/14021/

> Impala Doc: Support Hive ACID Insert-only Tables
> ------------------------------------------------
>
> Key: IMPALA-8813
> URL: https://issues.apache.org/jira/browse/IMPALA-8813
> Project: IMPALA
> Issue Type: Sub-task
> Components: Docs
> Reporter: Alex Rodoni
> Assignee: Alex Rodoni
> Priority: Major
> Labels: future_release_doc, in_33
>
> Create, Insert, and read Insert-only ACID tables.
[jira] [Updated] (IMPALA-8811) Impala Doc: query option to change default ACID type of new tables
[ https://issues.apache.org/jira/browse/IMPALA-8811?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alex Rodoni updated IMPALA-8811:
--------------------------------
Description: https://gerrit.cloudera.org/#/c/14021/

> Impala Doc: query option to change default ACID type of new tables
> ------------------------------------------------------------------
>
> Key: IMPALA-8811
> URL: https://issues.apache.org/jira/browse/IMPALA-8811
> Project: IMPALA
> Issue Type: Sub-task
> Components: Docs
> Reporter: Alex Rodoni
> Assignee: Alex Rodoni
> Priority: Major
> Labels: future_release_doc, in_33
>
> https://gerrit.cloudera.org/#/c/14021/
[jira] [Work started] (IMPALA-8811) Impala Doc: query option to change default ACID type of new tables
[ https://issues.apache.org/jira/browse/IMPALA-8811?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Work on IMPALA-8811 started by Alex Rodoni.
-------------------------------------------

> Impala Doc: query option to change default ACID type of new tables
> ------------------------------------------------------------------
>
> Key: IMPALA-8811
> URL: https://issues.apache.org/jira/browse/IMPALA-8811
> Project: IMPALA
> Issue Type: Sub-task
> Components: Docs
> Reporter: Alex Rodoni
> Assignee: Alex Rodoni
> Priority: Major
> Labels: future_release_doc, in_33
[jira] [Commented] (IMPALA-8786) BufferedPlanRootSink should directly write to a QueryResultSet if one is available
[ https://issues.apache.org/jira/browse/IMPALA-8786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16901557#comment-16901557 ]

Sahil Takiar commented on IMPALA-8786:
--------------------------------------

Agree. If the batch boundaries don't match up properly, this optimization won't work. Another issue is that the client waits until rows are available before calling {{GetNext}}, so it's not always likely that a {{QueryResultSet}} will be available when the first {{RowBatch}} is produced.

> BufferedPlanRootSink should directly write to a QueryResultSet if one is available
> ----------------------------------------------------------------------------------
>
> Key: IMPALA-8786
> URL: https://issues.apache.org/jira/browse/IMPALA-8786
> Project: IMPALA
> Issue Type: Sub-task
> Components: Backend
> Reporter: Sahil Takiar
> Assignee: Sahil Takiar
> Priority: Major
>
> {{BufferedPlanRootSink}} uses a {{RowBatchQueue}} to buffer {{RowBatch}}-es and then the consumer thread reads them and writes them to a given {{QueryResultSet}}. Implementations of {{RowBatchQueue}} might end up copying the buffered {{RowBatch}}-es (e.g. if the queue is backed by a {{BufferedTupleStream}}). An optimization would be for the producer thread to directly write to the consumer {{QueryResultSet}}. This optimization would only be triggered if (1) the queue is empty, and (2) the consumer thread has a {{QueryResultSet}} available for writing.
> This "fast path" is useful in a few different scenarios:
> * If the consumer is faster at reading rows than the producer is at sending them; in this case, the overhead of buffering rows in a {{RowBatchQueue}} can be completely avoided
> * For queries that return under 1024 rows, it's likely that the consumer will produce a {{QueryResultSet}} before the first {{RowBatch}} is returned (except perhaps for very trivial queries)
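
The "fast path" proposed in IMPALA-8786 can be sketched in a few lines of Python. This is only an illustration of the two conditions named in the issue (empty queue, result set available) plus the batch-boundary concern raised in the comment; the classes below are hypothetical stand-ins, not Impala's `BufferedPlanRootSink` or `QueryResultSet`, and all thread synchronization is deliberately omitted.

```python
from collections import deque
from typing import Deque, List, Optional


class ResultSet:
    """Hypothetical stand-in for QueryResultSet: a bounded list of rows."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self.rows: List[int] = []

    def remaining(self) -> int:
        return self.capacity - len(self.rows)


class BufferedSink:
    """Hypothetical stand-in for a buffering sink with a direct-write fast path."""

    def __init__(self) -> None:
        self._queue: Deque[List[int]] = deque()
        self._waiting: Optional[ResultSet] = None
        self.direct_writes = 0

    def register_consumer(self, result_set: ResultSet) -> None:
        """Consumer side: announce a result set that is ready to be filled."""
        self._waiting = result_set

    def send(self, batch: List[int]) -> None:
        """Producer side: write directly when possible, otherwise buffer."""
        rs = self._waiting
        # Fast path: (1) nothing is buffered, (2) a result set is available,
        # and the whole batch fits, so batch boundaries line up.
        if not self._queue and rs is not None and len(batch) <= rs.remaining():
            rs.rows.extend(batch)
            self.direct_writes += 1
            return
        # Slow path: buffering keeps rows in order behind earlier batches.
        self._queue.append(batch)

    def drain(self, result_set: ResultSet) -> None:
        """Consumer side: copy buffered batches that fit entirely."""
        while self._queue and len(self._queue[0]) <= result_set.remaining():
            result_set.rows.extend(self._queue.popleft())
```

Requiring the queue to be empty before taking the fast path is what preserves row order: once any batch has been buffered, later batches must queue behind it even if a result set is waiting.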
[jira] [Created] (IMPALA-8840) Check failed: num_bytes <= sizeof(T) (5 vs. 4)
Xiaomeng Zhang created IMPALA-8840: -- Summary: Check failed: num_bytes <= sizeof(T) (5 vs. 4) Key: IMPALA-8840 URL: https://issues.apache.org/jira/browse/IMPALA-8840 Project: IMPALA Issue Type: Bug Components: Backend Affects Versions: Impala 3.3.0 Reporter: Xiaomeng Zhang Assignee: Daniel Becker Not sure if this is due to same issue as https://issues.apache.org/jira/browse/IMPALA-8833#, the error message is a little different. {code:java} F0805 18:48:08.737411 5488 bit-stream-utils.inline.h:173] 284731e5d1aad693:05c883020001] Check failed: num_bytes <= sizeof(T) (8 vs. 4) *** Check failure stack trace: *** @ 0x52fb9bc google::LogMessage::Fail() @ 0x52fd261 google::LogMessage::SendToLog() @ 0x52fb396 google::LogMessage::Flush() @ 0x52fe95d google::LogMessageFatal::~LogMessageFatal() @ 0x2b2b867 impala::BatchedBitReader::GetBytes<>() @ 0x2aeda65 impala::RleBatchDecoder<>::NextCounts() @ 0x2a82896 impala::RleBatchDecoder<>::NextNumRepeats() @ 0x2b1927f impala::ScalarColumnReader<>::ReadSlotsNoConversion() @ 0x2ac7c2c impala::ScalarColumnReader<>::ReadSlots() @ 0x2a7b861 impala::ScalarColumnReader<>::MaterializeValueBatchRepeatedDefLevel() @ 0x2a5b3b0 impala::ScalarColumnReader<>::ReadValueBatch<>() @ 0x2a256a4 impala::ScalarColumnReader<>::ReadNonRepeatedValueBatch() @ 0x29b6eb6 impala::HdfsParquetScanner::AssembleRows() @ 0x29b1cf8 impala::HdfsParquetScanner::GetNextInternal() @ 0x29afc70 impala::HdfsParquetScanner::ProcessSplit() @ 0x2494bc3 impala::HdfsScanNode::ProcessSplit() @ 0x2493d98 impala::HdfsScanNode::ScannerThread() @ 0x2493121 _ZZN6impala12HdfsScanNode22ThreadTokenAvailableCbEPNS_18ThreadResourcePoolEENKUlvE_clEv @ 0x24956e9 _ZN5boost6detail8function26void_function_obj_invoker0IZN6impala12HdfsScanNode22ThreadTokenAvailableCbEPNS3_18ThreadResourcePoolEEUlvE_vE6invokeERNS1_15function_bufferE @ 0x1ea0241 boost::function0<>::operator()() @ 0x23de77a impala::Thread::SuperviseThread() @ 0x23e6afe boost::_bi::list5<>::operator()<>() @ 0x23e6a22 
boost::_bi::bind_t<>::operator()() @ 0x23e69e5 boost::detail::thread_data<>::run() @ 0x4224819 thread_proxy @ 0x7fc1818c5e24 start_thread @ 0x7fc17e01f34c __clone {code} -- This message was sent by Atlassian JIRA (v7.6.14#76016) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Commented] (IMPALA-8816) custom cluster tests in precommit are taking close to 2 hours
[ https://issues.apache.org/jira/browse/IMPALA-8816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16901540#comment-16901540 ]

Tim Armstrong commented on IMPALA-8816:
---------------------------------------

I'll leave this open still since the runtime is still pretty bad. Here's the table of test durations (from https://jenkins.impala.io/job/ubuntu-16.04-from-scratch/6995/testReport/).

||Package||Duration||Fail||(diff)||Skip||(diff)||Pass||(diff)||Total||(diff)||
|[authorization.test_authorization|https://jenkins.impala.io/job/ubuntu-16.04-from-scratch/6995/testReport/authorization.test_authorization/]|2 min 32 sec|0| |0| |8|+8|8|+8|
|[authorization.test_authorized_proxy|https://jenkins.impala.io/job/ubuntu-16.04-from-scratch/6995/testReport/authorization.test_authorized_proxy/]|1 min 29 sec|0| |0| |6|+6|6|+6|
|[authorization.test_grant_revoke|https://jenkins.impala.io/job/ubuntu-16.04-from-scratch/6995/testReport/authorization.test_grant_revoke/]|5 min 13 sec|0| |0| |9|+9|9|+9|
|[authorization.test_owner_privileges|https://jenkins.impala.io/job/ubuntu-16.04-from-scratch/6995/testReport/authorization.test_owner_privileges/]|2 min 5 sec|0| |0| |4|+4|4|+4|
|[authorization.test_provider|https://jenkins.impala.io/job/ubuntu-16.04-from-scratch/6995/testReport/authorization.test_provider/]|23 sec|0| |0| |1|+1|1|+1|
|[authorization.test_ranger|https://jenkins.impala.io/job/ubuntu-16.04-from-scratch/6995/testReport/authorization.test_ranger/]|2 min 37 sec|0| |0| |8|+8|8|+8|
|[authorization.test_sentry|https://jenkins.impala.io/job/ubuntu-16.04-from-scratch/6995/testReport/authorization.test_sentry/]|45 sec|0| |0| |3|+3|3|+3|
|[authorization.test_show_grant|https://jenkins.impala.io/job/ubuntu-16.04-from-scratch/6995/testReport/authorization.test_show_grant/]|1 min 2 sec|0| |0| |2|+2|2|+2|
|[catalog_service.test_catalog_service_client|https://jenkins.impala.io/job/ubuntu-16.04-from-scratch/6995/testReport/catalog_service.test_catalog_service_client/]|2.3 sec|0| |0| |1|+1|1|+1|
|[catalog_service.test_hms_failure|https://jenkins.impala.io/job/ubuntu-16.04-from-scratch/6995/testReport/catalog_service.test_hms_failure/]|1 min 3 sec|0| |0| |1|+1|1|+1|
|[catalog_service.test_large_num_partitions|https://jenkins.impala.io/job/ubuntu-16.04-from-scratch/6995/testReport/catalog_service.test_large_num_partitions/]|10 sec|0| |0| |2|+2|2|+2|
|[custom_cluster.test_admission_controller|https://jenkins.impala.io/job/ubuntu-16.04-from-scratch/6995/testReport/custom_cluster.test_admission_controller/]|8 min 41 sec|0| |3|+3|21|+21|24|+24|
|[custom_cluster.test_alloc_fail|https://jenkins.impala.io/job/ubuntu-16.04-from-scratch/6995/testReport/custom_cluster.test_alloc_fail/]|50 sec|0| |1|+1|1|+1|2|+2|
|[custom_cluster.test_always_false_filter|https://jenkins.impala.io/job/ubuntu-16.04-from-scratch/6995/testReport/custom_cluster.test_always_false_filter/]|0 ms|0| |1|+1|0| |1|+1|
|[custom_cluster.test_auto_scaling|https://jenkins.impala.io/job/ubuntu-16.04-from-scratch/6995/testReport/custom_cluster.test_auto_scaling/]|0 ms|0| |3|+3|0| |3|+3|
|[custom_cluster.test_automatic_invalidation|https://jenkins.impala.io/job/ubuntu-16.04-from-scratch/6995/testReport/custom_cluster.test_automatic_invalidation/]|1 min 31 sec|0| |0| |4|+4|4|+4|
|[custom_cluster.test_blacklist|https://jenkins.impala.io/job/ubuntu-16.04-from-scratch/6995/testReport/custom_cluster.test_blacklist/]|0 ms|0| |2|+2|0| |2|+2|
|[custom_cluster.test_breakpad|https://jenkins.impala.io/job/ubuntu-16.04-from-scratch/6995/testReport/custom_cluster.test_breakpad/]|7.3 sec|0| |10|+10|1|+1|11|+11|
|[custom_cluster.test_catalog_wait|https://jenkins.impala.io/job/ubuntu-16.04-from-scratch/6995/testReport/custom_cluster.test_catalog_wait/]|18 sec|0| |0| |1|+1|1|+1|
|[custom_cluster.test_client_ssl|https://jenkins.impala.io/job/ubuntu-16.04-from-scratch/6995/testReport/custom_cluster.test_client_ssl/]|2 min 53 sec|0| |6|+6|10|+10|16|+16|
|[custom_cluster.test_compact_catalog_updates|https://jenkins.impala.io/job/ubuntu-16.04-from-scratch/6995/testReport/custom_cluster.test_compact_catalog_updates/]|0 ms|0| |1|+1|0| |1|+1|
|[custom_cluster.test_coordinators|https://jenkins.impala.io/job/ubuntu-16.04-from-scratch/6995/testReport/custom_cluster.test_coordinators/]|3
[jira] [Created] (IMPALA-8839) Impala writing data to tables should not lead to incorrect results in Hive
Yongzhi Chen created IMPALA-8839:
---------------------------------

Summary: Impala writing data to tables should not lead to incorrect results in Hive
Key: IMPALA-8839
URL: https://issues.apache.org/jira/browse/IMPALA-8839
Project: IMPALA
Issue Type: Bug
Affects Versions: Impala 3.3.0
Reporter: Yongzhi Chen
Assignee: Yongzhi Chen

This includes both partitioned and unpartitioned tables.

For an unpartitioned table, the proposed solution is that when Impala writes data to the table, it should update the 'COLUMN_STATS_ACCURATE' JSON structure in the table properties by removing its 'COLUMN_STATS' nested field (this ends up in the TABLE_PARAMS table in HMS).

For a partitioned table, the proposed solution is that when Impala writes data to the table, it should update the 'COLUMN_STATS_ACCURATE' JSON structure by removing its 'COLUMN_STATS' nested field in the properties of the partitions where data was inserted (the PARTITION_PARAMS table in HMS).
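
The property update proposed in IMPALA-8839 can be sketched as a small Python function operating on a dictionary of table or partition parameters. This is not Impala's implementation: the function name is hypothetical, and the example JSON shape (a `BASIC_STATS` key next to `COLUMN_STATS`) is an assumption; the issue itself only names the 'COLUMN_STATS_ACCURATE' property and its 'COLUMN_STATS' nested field.

```python
import json
from typing import Dict


def drop_column_stats_flag(params: Dict[str, str]) -> Dict[str, str]:
    """Return a copy of params with 'COLUMN_STATS' removed from the
    'COLUMN_STATS_ACCURATE' JSON value, if present.

    The input dictionary is left unmodified.
    """
    updated = dict(params)
    raw = updated.get("COLUMN_STATS_ACCURATE")
    if raw is None:
        return updated
    try:
        accurate = json.loads(raw)
    except ValueError:
        # Not JSON; leave the value untouched rather than guess.
        return updated
    if isinstance(accurate, dict) and "COLUMN_STATS" in accurate:
        del accurate["COLUMN_STATS"]
        updated["COLUMN_STATS_ACCURATE"] = json.dumps(
            accurate, separators=(",", ":"))
    return updated
```

Under the proposal, the same transformation would be applied to the table properties for an unpartitioned table, or to the properties of each partition that received data for a partitioned table.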
[jira] [Commented] (IMPALA-8816) custom cluster tests in precommit are taking close to 2 hours
[ https://issues.apache.org/jira/browse/IMPALA-8816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16901493#comment-16901493 ]

ASF subversion and git services commented on IMPALA-8816:
---------------------------------------------------------

Commit 4fb8e8e324ad3258d24d0ae40946c954b6c21a8d in impala's branch refs/heads/master from Tim Armstrong
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=4fb8e8e ]

IMPALA-8816: reduce custom cluster test runtime in core

This includes some optimisations and a bulk move of tests to exhaustive.

Move a bunch of custom cluster tests to exhaustive. I selected these partially based on runtime (i.e. I looked most carefully at the tests that ran for over a minute) and the likelihood of them catching a precommit bug. Regression tests for specific edge cases and tests for parts of the code that are very stable were prime candidates.

Remove an unnecessary cluster restart in test_breakpad.

Merge test_scheduler_error into test_failpoints to avoid an unnecessary cluster restart.

Speed up cluster starts by ensuring that the default statestore args are applied even when _start_impala_cluster() is called directly. This shaves a couple of seconds off each restart. We made the default args use a faster update frequency - see IMPALA-7185 - but they did not take effect in all tests.

Change-Id: Ib2e3e7ebc9695baec4d69183387259958df10f62
Reviewed-on: http://gerrit.cloudera.org:8080/13967
Reviewed-by: Tim Armstrong
Tested-by: Impala Public Jenkins

> custom cluster tests in precommit are taking close to 2 hours
> -------------------------------------------------------------
>
> Key: IMPALA-8816
> URL: https://issues.apache.org/jira/browse/IMPALA-8816
> Project: IMPALA
> Issue Type: Bug
> Components: Infrastructure
> Affects Versions: Impala 3.3.0
> Reporter: Tim Armstrong
> Assignee: Tim Armstrong
> Priority: Major
>
> This is affecting precommit times substantially. We should either speed up the tests or, more likely, move some to exhaustive.
[jira] [Commented] (IMPALA-7185) Reduce statestore frequency for custom cluster tests by default
[ https://issues.apache.org/jira/browse/IMPALA-7185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16901494#comment-16901494 ] ASF subversion and git services commented on IMPALA-7185: - Commit 4fb8e8e324ad3258d24d0ae40946c954b6c21a8d in impala's branch refs/heads/master from Tim Armstrong [ https://gitbox.apache.org/repos/asf?p=impala.git;h=4fb8e8e ] IMPALA-8816: reduce custom cluster test runtime in core This includes some optimisations and a bulk move of tests to exhaustive. Move a bunch of custom cluster tests to exhaustive. I selected these partially based on runtime (i.e. I looked most carefully at the tests that ran for over a minute) and the likelihood of them catching a precommit bug. Regression tests for specific edge cases and tests for parts of the code that are very stable were prime candidates. Remove an unnecessary cluster restart in test_breakpad. Merge test_scheduler_error into test_failpoints to avoid an unnecessary cluster restart. Speed up cluster starts by ensuring that the default statestore args are applied even when _start_impala_cluster() is called directly. This shaves a couple of seconds off each restart. We made the default args use a faster update frequency - see IMPALA-7185 - but they did not take effect in all tests. Change-Id: Ib2e3e7ebc9695baec4d69183387259958df10f62 Reviewed-on: http://gerrit.cloudera.org:8080/13967 Reviewed-by: Tim Armstrong Tested-by: Impala Public Jenkins > Reduce statestore frequency for custom cluster tests by default > --- > > Key: IMPALA-7185 > URL: https://issues.apache.org/jira/browse/IMPALA-7185 > Project: IMPALA > Issue Type: Improvement > Components: Infrastructure >Reporter: Tim Armstrong >Assignee: Tim Armstrong >Priority: Major > Fix For: Impala 2.13.0, Impala 3.1.0 > > > It takes several seconds to run the first query after cluster startup because > of the statestore propagation delay for the catalog, which adds some real > time to custom cluster tests. 
We should think about lowering the default > update interval for those tests to make them start up faster. > We could just prefix the statestored_args with lower values, allowing > individual tests to override if needed.
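The "prefix the defaults, let tests override" idea from IMPALA-7185 can be sketched roughly as follows. This is only an illustration, not the test framework's code: the helper names, the default value of 50 ms, and the assumption that a later occurrence of a flag overrides an earlier one are all hypothetical.

```python
import shlex

# Hypothetical default; the flag value actually used by the custom cluster
# test framework may differ.
DEFAULT_STATESTORED_ARGS = "--statestore_update_frequency_ms=50"


def build_statestored_args(test_args=""):
    """Put the defaults first so that flags supplied by a test come later."""
    return (DEFAULT_STATESTORED_ARGS + " " + test_args).strip()


def effective_flags(arg_string):
    """Resolve a flag string, assuming the last occurrence of a flag wins."""
    flags = {}
    for token in shlex.split(arg_string):
        name, _, value = token.lstrip("-").partition("=")
        flags[name] = value
    return flags
```

With no test-specific arguments the default applies. A test that passes its own value for the same flag sees that value instead, because it appears later in the string.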
[jira] [Commented] (IMPALA-8600) Reload partition does not work for transactional tables
[ https://issues.apache.org/jira/browse/IMPALA-8600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16901492#comment-16901492 ] ASF subversion and git services commented on IMPALA-8600: - Commit 972104b6d6611ba0c1667671f9c25061fbe19b55 in impala's branch refs/heads/master from Gabor Kaszab [ https://gitbox.apache.org/repos/asf?p=impala.git;h=972104b ] IMPALA-8600: AnalyzerTest.TestAnalyzeTransactional() test fix Adjusts expected error message in AnalyzerTest.TestAnalyzeTransactional() after rewriting the message. Change-Id: I7f1ed5da8cd3511eae4db12fb5ce1235aee50fd6 Reviewed-on: http://gerrit.cloudera.org:8080/14017 Reviewed-by: Impala Public Jenkins Tested-by: Impala Public Jenkins > Reload partition does not work for transactional tables > --- > > Key: IMPALA-8600 > URL: https://issues.apache.org/jira/browse/IMPALA-8600 > Project: IMPALA > Issue Type: Bug >Reporter: Vihang Karajgaonkar >Assignee: Gabor Kaszab >Priority: Major > Labels: impala-acid > Fix For: Impala 3.3.0 > > > If a table is transactional, a reload partition call should fetch the valid > writeIds. Without doing this, the reload will skip adding all the newly > created delta files of the transactional table pertaining to the new writeIds. -- This message was sent by Atlassian JIRA (v7.6.14#76016) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Assigned] (IMPALA-8534) Enable data cache by default for end-to-end containerised tests
[ https://issues.apache.org/jira/browse/IMPALA-8534?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Armstrong reassigned IMPALA-8534: - Assignee: Tim Armstrong > Enable data cache by default for end-to-end containerised tests > --- > > Key: IMPALA-8534 > URL: https://issues.apache.org/jira/browse/IMPALA-8534 > Project: IMPALA > Issue Type: Sub-task >Reporter: Tim Armstrong >Assignee: Tim Armstrong >Priority: Major > > Following on from IMPALA-8121, I don't think we can enable the data cache by > default, since it depends on what volumes are available to the container at > runtime. But we should definitely enable it for tests. > [~kwho] said > {quote}When I tested with the data cache enabled in a mini-cluster with 3 > nodes using the default scale of workload, I ran with 500 MB with 1 partition > by running > start-impala-cluster.py --data_cache_dir=/tmp --data_cache_size=500MB > You can also specify a pre-existing directory as the startup flag of Impala like > --data_cache=/tmp/data-cache-0:500MB > {quote} > start-impala-cluster.py already mounts some host directories into the > container, so we could either do the same for the data cache, or just depend > on the container root filesystem (which is likely to be slow, unfortunately).
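To make the `--data_cache=/tmp/data-cache-0:500MB` form quoted above concrete, here is a small sketch that splits such a value into a directory and a byte count. It is not Impala's parser: the function name, the accepted unit suffixes, and the 1024-based multipliers are assumptions made for illustration.

```python
import re

# Assumed unit multipliers (1024-based); Impala's own parsing may differ.
_UNITS = {"B": 1, "KB": 1024, "MB": 1024 ** 2, "GB": 1024 ** 3}


def parse_data_cache(spec):
    """Split a '<directory>:<size>' string into (directory, size_in_bytes)."""
    path, sep, size = spec.rpartition(":")
    if not sep or not path or not size:
        raise ValueError("expected '<directory>:<size>', got %r" % spec)
    match = re.fullmatch(r"(\d+)\s*([KMG]?B)", size.strip(), re.IGNORECASE)
    if match is None:
        raise ValueError("unrecognised size %r" % size)
    return path, int(match.group(1)) * _UNITS[match.group(2).upper()]
```

A test harness could use something like this to check that the chosen directory exists, and is mounted into the container, before starting the cluster.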
[jira] [Updated] (IMPALA-8451) Default configs for admission control
[ https://issues.apache.org/jira/browse/IMPALA-8451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Armstrong updated IMPALA-8451: -- Labels: docker (was: ) > Default configs for admission control > - > > Key: IMPALA-8451 > URL: https://issues.apache.org/jira/browse/IMPALA-8451 > Project: IMPALA > Issue Type: Sub-task > Components: Infrastructure >Reporter: Tim Armstrong >Assignee: Tim Armstrong >Priority: Major > Labels: docker > > We probably want to have some basic admission control enabled for the > dockerised containers.
[jira] [Work started] (IMPALA-8451) Default configs for admission control
[ https://issues.apache.org/jira/browse/IMPALA-8451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on IMPALA-8451 started by Tim Armstrong. - > Default configs for admission control > - > > Key: IMPALA-8451 > URL: https://issues.apache.org/jira/browse/IMPALA-8451 > Project: IMPALA > Issue Type: Sub-task > Components: Infrastructure >Reporter: Tim Armstrong >Assignee: Tim Armstrong >Priority: Major > > We probably want to have some basic admission control enabled for the > dockerised containers.
[jira] [Resolved] (IMPALA-8534) Enable data cache by default for end-to-end containerised tests
[ https://issues.apache.org/jira/browse/IMPALA-8534?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Armstrong resolved IMPALA-8534. --- Resolution: Fixed Fix Version/s: Impala 3.3.0 > Enable data cache by default for end-to-end containerised tests > --- > > Key: IMPALA-8534 > URL: https://issues.apache.org/jira/browse/IMPALA-8534 > Project: IMPALA > Issue Type: Sub-task >Reporter: Tim Armstrong >Assignee: Tim Armstrong >Priority: Major > Fix For: Impala 3.3.0 > > > Following on from IMPALA-8121, I don't think we can enable the data cache by > default, since it depends on what volumes are available to the container at > runtime. But we should definitely enable it for tests. > [~kwho] said > {quote}When I tested with the data cache enabled in a mini-cluster with 3 > nodes using the default scale of workload, I ran with 500 MB with 1 partition > by running > start-impala-cluster.py --data_cache_dir=/tmp --data_cache_size=500MB > You can also specify a pre-existing directory as the startup flag of Impala like > --data_cache=/tmp/data-cache-0:500MB > {quote} > start-impala-cluster.py already mounts some host directories into the > container, so we could either do the same for the data cache, or just depend > on the container root filesystem (which is likely to be slow, unfortunately).
[jira] [Updated] (IMPALA-8425) Optimize size of Impala docker containers
[ https://issues.apache.org/jira/browse/IMPALA-8425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Armstrong updated IMPALA-8425: -- Issue Type: Improvement (was: Bug) > Optimize size of Impala docker containers > - > > Key: IMPALA-8425 > URL: https://issues.apache.org/jira/browse/IMPALA-8425 > Project: IMPALA > Issue Type: Improvement > Components: Infrastructure >Reporter: Tim Armstrong >Assignee: Joe McDonnell >Priority: Major > > We should take a look at the size of the containers produced by the build and > see if it is worth optimising. I think the main contributor to size right now > is likely the impala binary, which we could shrink by stripping debug > symbols. We could also look at using a different base image from Ubuntu. -- This message was sent by Atlassian JIRA (v7.6.14#76016) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Updated] (IMPALA-8462) Get exhaustive tests passing with dockerised minicluster
[ https://issues.apache.org/jira/browse/IMPALA-8462?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Armstrong updated IMPALA-8462: -- Issue Type: Improvement (was: Sub-task) Parent: (was: IMPALA-7947) > Get exhaustive tests passing with dockerised minicluster > > > Key: IMPALA-8462 > URL: https://issues.apache.org/jira/browse/IMPALA-8462 > Project: IMPALA > Issue Type: Improvement > Components: Infrastructure >Reporter: Tim Armstrong >Assignee: Tim Armstrong >Priority: Major >
[jira] [Updated] (IMPALA-8425) Optimize size of Impala docker containers
[ https://issues.apache.org/jira/browse/IMPALA-8425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Armstrong updated IMPALA-8425: -- Issue Type: Bug (was: Sub-task) Parent: (was: IMPALA-7947) > Optimize size of Impala docker containers > - > > Key: IMPALA-8425 > URL: https://issues.apache.org/jira/browse/IMPALA-8425 > Project: IMPALA > Issue Type: Bug > Components: Infrastructure >Reporter: Tim Armstrong >Assignee: Joe McDonnell >Priority: Major > > We should take a look at the size of the containers produced by the build and > see if it is worth optimising. I think the main contributor to size right now > is likely the impala binary, which we could shrink by stripping debug > symbols. We could also look at using a different base image from Ubuntu. -- This message was sent by Atlassian JIRA (v7.6.14#76016) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Updated] (IMPALA-8425) Optimize size of Impala docker containers
[ https://issues.apache.org/jira/browse/IMPALA-8425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Armstrong updated IMPALA-8425: -- Summary: Optimize size of Impala docker containers (was: Evaluate size of Impala docker containers) > Optimize size of Impala docker containers > - > > Key: IMPALA-8425 > URL: https://issues.apache.org/jira/browse/IMPALA-8425 > Project: IMPALA > Issue Type: Sub-task > Components: Infrastructure >Reporter: Tim Armstrong >Assignee: Joe McDonnell >Priority: Major > > We should take a look at the size of the containers produced by the build and > see if it is worth optimising. I think the main contributor to size right now > is likely the impala binary, which we could shrink by stripping debug > symbols. We could also look at using a different base image from Ubuntu. -- This message was sent by Atlassian JIRA (v7.6.14#76016) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Updated] (IMPALA-8824) Impala Doc: Document DROP TABLE for Insert-only ACID Tables
[ https://issues.apache.org/jira/browse/IMPALA-8824?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alex Rodoni updated IMPALA-8824: Labels: future_release_doc (was: future_release_doc in_33) > Impala Doc: Document DROP TABLE for Insert-only ACID Tables > --- > > Key: IMPALA-8824 > URL: https://issues.apache.org/jira/browse/IMPALA-8824 > Project: IMPALA > Issue Type: Sub-task > Components: Docs >Reporter: Alex Rodoni >Assignee: Alex Rodoni >Priority: Major > Labels: future_release_doc >
[jira] [Created] (IMPALA-8838) Impala wrote audit log with missing statement_type
Tim Armstrong created IMPALA-8838: - Summary: Impala wrote audit log with missing statement_type Key: IMPALA-8838 URL: https://issues.apache.org/jira/browse/IMPALA-8838 Project: IMPALA Issue Type: Bug Affects Versions: Impala 2.9.0 Reporter: Tim Armstrong We saw an audit log with a missing statement_type, where it should have been QUERY. Filing a bug to see if this reoccurs and if there is a pattern to it (we don't have a way to reproduce or debug now). {noformat} { "serviceType": "IMPALA", "serviceName": "impala", "extraValues": { "12345678912345": { "status": "", "impersonator": null, "start_time": "2019-01-01 00:00:00.0", "network_address": "123.123.123.123:12345", "authorization_failure": false, "sql_statement": "SELECT NDV_NO_FINALIZE(col) AS col, CAST(-1 as BIGINT), 8, CAST(8 as DOUBLE), COUNT(col), ... FROM table WHERE (day='2019-01-01') GROUP BY day", "session_id": "xx:xx", "query_id": "xxx:xx", "catalog_objects": [ { "privilege": "VIEW_METADATA", "object_type": "", "name": "_impala_builtins" }, { "privilege": "SELECT", "object_type": "", "name": "table" } ], "statement_type": "", "user": "u...@realm.net" } } } {noformat} statement_type is printed here: https://github.com/cloudera/Impala/blob/cdh5-2.9.0_5.12.2/be/src/service/impala-server.cc#L474 It calls out to the function which prints an enum here: https://github.com/cloudera/Impala/blob/cdh5-2.9.0_5.12.2/be/src/util/debug-util.cc#L68. The only way it can produce an empty string is if the enum value is out-of-range, which shouldn't be possible unless we're reading an uninitialised value or the memory is somehow corrupted. However, all the surrounding fields in the TExecRequest object look like they were written out to the audit log OK. The code has changed a bit in master because of the thrift version upgrade, but it is still equivalent as far as I can see.
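The failure mode described above, an enum printer that yields an empty string for an out-of-range value, can be illustrated with a small Python analogue. This is not the C++ code linked in the report: the table contents and the numeric values are hypothetical and serve only to show the lookup-with-empty-fallback pattern.

```python
# Hypothetical value-to-name table; the real statement types and their
# numeric values come from Impala's Thrift definitions and are not
# reproduced here.
STMT_TYPE_NAMES = {0: "QUERY", 1: "DDL", 2: "DML"}


def print_stmt_type(value):
    """Return the enum name, or an empty string if the value is not in the table."""
    return STMT_TYPE_NAMES.get(value, "")
```

Under this model a valid value always prints a name. An empty `statement_type` therefore points at a value outside the table, such as an uninitialised or corrupted field, which matches the reasoning in the report.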
[jira] [Commented] (IMPALA-8826) Impala Doc: Add docs for PLAN_ROOT_SINK and result spooling
[ https://issues.apache.org/jira/browse/IMPALA-8826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16901361#comment-16901361 ] Sahil Takiar commented on IMPALA-8826: -- It is likely going to be for the 3.3 release, although I'm not exactly sure when that will be. > Impala Doc: Add docs for PLAN_ROOT_SINK and result spooling > --- > > Key: IMPALA-8826 > URL: https://issues.apache.org/jira/browse/IMPALA-8826 > Project: IMPALA > Issue Type: Sub-task > Components: Docs >Reporter: Sahil Takiar >Assignee: Alex Rodoni >Priority: Major > Labels: future_release_doc > > Currently, I don't see many docs explaining what a {{PLAN_ROOT_SINK}} is, > even though it shows up in explain plans and runtime profiles. After more of > the changes in IMPALA-8656 are merged, understanding what {{PLAN_ROOT_SINK}} > is will be more important, because it will start taking up a memory > reservation and possibly spilling to disk. > I don't see any docs on data sinks in general, so perhaps it would be useful > to create a dedicated page for explaining data sinks and how they work. We > can start by documenting the {{PLAN_ROOT_SINK}} as that may be the most > commonly used one. > We should document all the changes being made in IMPALA-8656 as well. -- This message was sent by Atlassian JIRA (v7.6.14#76016) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Resolved] (IMPALA-8781) Add additional tests in test_result_spooling.py and validate cancellation logic
[ https://issues.apache.org/jira/browse/IMPALA-8781?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sahil Takiar resolved IMPALA-8781. -- Resolution: Fixed Fix Version/s: Impala 3.3.0 Commit Hash: bbec8fa74961755269298706302477780019e7d5 IMPALA-8781: Result spooling tests to cover edge cases and cancellation Adds additional tests to test_result_spooling.py to cover various edge cases when fetching query results (ensure all Impala types are returned properly, UDFs are evaluated correctly, etc.). A new QueryTest file result-spooling.test is added to encapsulate all these tests. Tests with a decreased ROW_BATCH_SIZE are added as well to validate that BufferedPlanRootSink buffers row batches correctly. BufferedPlanRootSink requires careful synchronization of the producer and consumer threads, especially when queries are cancelled. The TestResultSpoolingCancellation class is dedicated to running cancellation tests with SPOOL_QUERY_RESULTS = true. The implementation is heavily borrowed from test_cancellation.py and some of the logic is re-factored into a new utility class called cancel_utils.py to avoid code duplication between test_cancellation.py and test_result_spooling.py. Testing: * Looped test_result_spooling.py overnight with no failures * Core tests passed Change-Id: Ib3b3a1539c4a5fa9b43c8ca315cea16c9701e283 Reviewed-on: http://gerrit.cloudera.org:8080/13907 Reviewed-by: Impala Public Jenkins Tested-by: Impala Public Jenkins > Add additional tests in test_result_spooling.py and validate cancellation > logic > --- > > Key: IMPALA-8781 > URL: https://issues.apache.org/jira/browse/IMPALA-8781 > Project: IMPALA > Issue Type: Sub-task > Components: Backend >Reporter: Sahil Takiar >Assignee: Sahil Takiar >Priority: Major > Fix For: Impala 3.3.0 > > > {{test_result_spooling.py}} currently runs a few basic tests with result > spooling enabled. 
We should add some more to cover all necessary edge cases > (ensure all Impala types are returned correctly, UDFs are evaluated > correctly, etc.) and add tests to validate the cancellation logic in > {{PlanRootSink}}.
[jira] [Created] (IMPALA-8837) Impala Doc: Document impersonalization via HTTP
Alex Rodoni created IMPALA-8837: --- Summary: Impala Doc: Document impersonalization via HTTP Key: IMPALA-8837 URL: https://issues.apache.org/jira/browse/IMPALA-8837 Project: IMPALA Issue Type: Sub-task Components: Docs Reporter: Alex Rodoni Assignee: Alex Rodoni
[jira] [Commented] (IMPALA-8828) Support impersonation via http paths
[ https://issues.apache.org/jira/browse/IMPALA-8828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16901286#comment-16901286 ] Thomas Tauber-Marshall commented on IMPALA-8828: Yes, in the places where it says that the delegated user can be set using the hiveserver2 property impala.doas.user we should add that clients that connect over the http interface can specify the 'doAs' parameter in the http path. We should probably also document the Knox integration work in general. I think for now it's fine to just mention in impala_authentication.html that we support proxying connections to Impala through Knox. We may eventually want to give it a full page explaining the whole process, but that's not very high priority at the moment (it doesn't even technically work yet, and once it does it should be pretty straightforward to set up once users follow Knox's own docs) > Support impersonation via http paths > > > Key: IMPALA-8828 > URL: https://issues.apache.org/jira/browse/IMPALA-8828 > Project: IMPALA > Issue Type: Improvement > Components: Clients >Affects Versions: Impala 3.3.0 >Reporter: Thomas Tauber-Marshall >Assignee: Thomas Tauber-Marshall >Priority: Major > Labels: security > Fix For: Impala 3.3.0 > > > When clients connect over http, we should allow them to perform impersonation > via the 'doAs' parameter, e.g. by specifying a path of the form > '/?doAs=' > This is useful for example for Apache Knox, which proxies connections to > Impala and authenticates as itself via Kerberos but runs queries as other > users. > We can leverage the existing support for impersonation, e.g. knox would have > to be included in 'authorized_proxy_user_config' to be able to do the > impersonation
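As a rough illustration of the 'doAs' path form discussed in IMPALA-8828, a client or proxy could build the HTTP path as below. The helper name and the user names are hypothetical. Only the 'doAs' parameter name and the '/?doAs=' shape come from the issue text.

```python
from urllib.parse import urlencode


def build_doas_path(delegated_user, base_path="/"):
    """Return an HTTP path carrying the user to impersonate in 'doAs'."""
    # urlencode percent-escapes characters that are not safe in a query
    # string, for example the '@' in a Kerberos-style principal.
    return base_path + "?" + urlencode({"doAs": delegated_user})
```

For example, `build_doas_path("alice")` gives `/?doAs=alice`. Per the issue description, the connecting user (for instance Knox) would still have to be listed in 'authorized_proxy_user_config' for the impersonation to be allowed.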
[jira] [Resolved] (IMPALA-8832) Queries fail to run when connecting to Impala over Knox
[ https://issues.apache.org/jira/browse/IMPALA-8832?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Tauber-Marshall resolved IMPALA-8832. Resolution: Fixed Fix Version/s: Impala 3.3.0 > Queries fail to run when connecting to Impala over Knox > --- > > Key: IMPALA-8832 > URL: https://issues.apache.org/jira/browse/IMPALA-8832 > Project: IMPALA > Issue Type: Bug > Components: Clients >Affects Versions: Impala 3.3.0 >Reporter: Thomas Tauber-Marshall >Assignee: Thomas Tauber-Marshall >Priority: Blocker > Labels: security > Fix For: Impala 3.3.0 > > > Impala recently added support for HTTP clients over HS2. One of the > motivations for this work was to allow proxying of connections to Impala > through other services such as Apache Knox. > However, in testing with Knox, it seems that it's possible to connect > to Impala successfully, but then queries fail to run or results aren't > retrieved.
[jira] [Commented] (IMPALA-8828) Support impersonation via http paths
[ https://issues.apache.org/jira/browse/IMPALA-8828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16901263#comment-16901263 ] Alex Rodoni commented on IMPALA-8828: - [~twmarshall] Will this require doc updates in impala_delegation.html? > Support impersonation via http paths > > > Key: IMPALA-8828 > URL: https://issues.apache.org/jira/browse/IMPALA-8828 > Project: IMPALA > Issue Type: Improvement > Components: Clients >Affects Versions: Impala 3.3.0 >Reporter: Thomas Tauber-Marshall >Assignee: Thomas Tauber-Marshall >Priority: Major > Labels: security > Fix For: Impala 3.3.0 > > > When clients connect over http, we should allow them to perform impersonation > via the 'doAs' parameter, eg. by specifying a path of the form > '/?doAs=' > This is useful for example for Apache Knox, which proxies connections to > Impala and authenticates as itself via Kerberos but runs queries as other > users. > We can leverage the existing support for impersonation, eg. knox would have > to be included in 'authorized_proxy_user_config' to be able to do the > impersonation -- This message was sent by Atlassian JIRA (v7.6.14#76016) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Commented] (IMPALA-8771) Missing stats warning for complex type columns
[ https://issues.apache.org/jira/browse/IMPALA-8771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16901252#comment-16901252 ] ASF subversion and git services commented on IMPALA-8771: - Commit 227b839e4e71778b74b045331682317e29014c7c in impala's branch refs/heads/master from Tamas Mate [ https://gitbox.apache.org/repos/asf?p=impala.git;h=227b839 ] IMPALA-8771: Missing stats warning for complex type columns An extra condition is added to the table stats checking, so that the complex type columns are skipped and cannot trigger the missing stats warning. Change-Id: Ia1b5c14da0c7f6eab373d80b2dbf7c974b2eb567 Reviewed-on: http://gerrit.cloudera.org:8080/13965 Reviewed-by: Tim Armstrong Tested-by: Tim Armstrong > Missing stats warning for complex type columns > -- > > Key: IMPALA-8771 > URL: https://issues.apache.org/jira/browse/IMPALA-8771 > Project: IMPALA > Issue Type: Improvement > Components: Frontend >Affects Versions: Impala 2.13.0, Impala 3.3.0 >Reporter: bharath v >Assignee: Tamas Mate >Priority: Minor > Labels: observability > > We currently don't support column stats for complex typed columns (ignored in > `compute stats` statements). However, running queries against those columns > throws the missing col stats warning which is confusing. > > {noformat} > select count(*) from > customers c, > c.orders o;{noformat} > {noformat} > Max Per-Host Resource Reservation: Memory=16.00KB Threads=3 > Per-Host Resource Estimates: Memory=36MB > WARNING: The following tables are missing relevant table and/or column > statistics. > default.customers > Analyzed query: SELECT count(*) FROM `default`.customers c, c.orders > o{noformat} > > We could probably skip the warnings if we detect the missing stats are for > complex typed columns, until we support them.
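The extra condition described in the IMPALA-8771 commit message can be sketched as a filter over a table's columns. This is an illustrative Python model, not the frontend code: the `Column` class, its fields, and the function name are hypothetical. ARRAY, MAP and STRUCT are Impala's complex types.

```python
from dataclasses import dataclass

COMPLEX_TYPES = {"ARRAY", "MAP", "STRUCT"}


@dataclass
class Column:
    name: str
    type_name: str   # top-level type only, e.g. "BIGINT" or "ARRAY"
    has_stats: bool


def columns_missing_stats(columns):
    """Names of columns that should trigger the missing-stats warning."""
    return [
        c.name
        for c in columns
        # Complex-typed columns never have stats, so they are skipped here
        # rather than reported as missing.
        if not c.has_stats and c.type_name not in COMPLEX_TYPES
    ]
```

In the `customers` example from the issue, a table whose only column without stats is the `orders` array would produce an empty list and hence no warning.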
[jira] [Commented] (IMPALA-8832) Queries fail to run when connecting to Impala over Knox
[ https://issues.apache.org/jira/browse/IMPALA-8832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16901251#comment-16901251 ] ASF subversion and git services commented on IMPALA-8832: - Commit bcedd1572ee69bf5f5551af08f8fcb0ae0c48aea in impala's branch refs/heads/master from Thomas Tauber-Marshall [ https://gitbox.apache.org/repos/asf?p=impala.git;h=bcedd15 ] IMPALA-8832: Fix HTTP client protocol to work with Apache Knox This patch fixes two bugs with Impala's HTTP client protocol: - THttpServer transports are no longer wrapped in TBufferedTransports. THttpServer already has its own support for buffering to process one HTTP request at a time, and wrapping it in a TBufferedTransport interferes with this, in some cases causing client requests to either not be processed or to receive multiple responses. - Fixes a bug in THttpTransport where when a chunked HTTP request is finished being processed, the 'readHeaders_' variable is never reset and further requests over the connection are not processed. Testing: - Tested by proxying beeline connections to Impala through Apache Knox Change-Id: I5c9d934a654a9e6aaf9207fa5856f956baaacf55 Reviewed-on: http://gerrit.cloudera.org:8080/14008 Reviewed-by: Thomas Tauber-Marshall Tested-by: Impala Public Jenkins > Queries fail to run when connecting to Impala over Knox > --- > > Key: IMPALA-8832 > URL: https://issues.apache.org/jira/browse/IMPALA-8832 > Project: IMPALA > Issue Type: Bug > Components: Clients >Affects Versions: Impala 3.3.0 >Reporter: Thomas Tauber-Marshall >Assignee: Thomas Tauber-Marshall >Priority: Blocker > Labels: security > > Impala recently added support for HTTP clients over HS2. One of the > motivations for this work was to allow proxying of connections to Impala > through other services such as Apache Knox. > However, in testing with Knox, it seems that it's possible to connect > to Impala successfully, but then queries fail to run or results aren't > retrieved.
[jira] [Commented] (IMPALA-8778) Support read/write Apache Hudi tables
[ https://issues.apache.org/jira/browse/IMPALA-8778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16901249#comment-16901249 ] Vinoth Chandar commented on IMPALA-8778: >Another small question about how to determine that the Hoodie specific path, >it seems that I can use HoodiePartitionMetadata to check whether it is a valid >dataset if invalid or dataset not found, I can treat it as a no hoodie path, >am I correct? The HoodieTableMetaClient already does those things for you. We can follow up on the HUDI ticket more (to keep this about Impala/Hudi integration alone). Also, I'd suggest that we land this once we have renamed the packages in Hudi to org.apache.hudi and made the first release. Rough ETA, end of month. So you can keep working on the patch as is, test and finally we can just pick up the new artifact. > Support read/write Apache Hudi tables > - > > Key: IMPALA-8778 > URL: https://issues.apache.org/jira/browse/IMPALA-8778 > Project: IMPALA > Issue Type: New Feature >Reporter: Yuanbin Cheng >Assignee: Yuanbin Cheng >Priority: Major > > Apache Impala does not currently support Apache Hudi and cannot even pull metadata > from Hive. > Related issue: > [https://github.com/apache/incubator-hudi/issues/179] > [https://issues.apache.org/jira/projects/HUDI/issues/HUDI-146|https://issues.apache.org/jira/projects/HUDI/issues/HUDI-146?filter=allopenissues] >
[jira] [Resolved] (IMPALA-7486) Admit less memory on dedicated coordinator for admission control purposes
[ https://issues.apache.org/jira/browse/IMPALA-7486?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bikramjeet Vig resolved IMPALA-7486. Resolution: Fixed Fix Version/s: Impala 3.3.0 > Admit less memory on dedicated coordinator for admission control purposes > - > > Key: IMPALA-7486 > URL: https://issues.apache.org/jira/browse/IMPALA-7486 > Project: IMPALA > Issue Type: Improvement > Components: Backend >Reporter: Tim Armstrong >Assignee: Bikramjeet Vig >Priority: Major > Labels: resource-management, scalability > Fix For: Impala 3.3.0 > > > Following on from IMPALA-7349, we should consider handling dedicated > coordinators specially rather than admitting a uniform amount of memory on > all backends. > The specific scenario I'm interested in targeting is the case where we have a > coordinator that is executing many "lightweight" coordinator fragments, e.g. > just an ExchangeNode and PlanRootSink, plus maybe other lightweight operators > like UnionNode that don't use much memory or CPU. With the current behaviour > it's possible for a coordinator to reach capacity from the point-of-view of > admission control when at runtime it is actually very lightly loaded. > This is particularly true if coordinators and executors have different > process mem limits. This will be somewhat common since they're often deployed > on different hardware or the coordinator will have more memory dedicated to > its embedded JVM for the catalog cache. > More generally we could admit different amounts per backend depending on how > many fragments are running, but I think this incremental step would address > the most important cases and be a little easier to understand. > We may want to defer this work until we've implemented distributed runtime > filter aggregation, which will significantly reduce coordinator memory > pressure, and until we've improved distributed overadmission (since the > coordinator behaviour may help throttle overadmission).
-- This message was sent by Atlassian JIRA (v7.6.14#76016) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
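The scenario in IMPALA-7486 can be illustrated with a small sketch that compares charging every backend the same amount against charging a dedicated coordinator less. This is only an illustration under stated assumptions: the function names, the backend names, and all memory figures are hypothetical and are not taken from Impala's admission control code.

```python
def admitted_mem_uniform(per_backend_mb, backends):
    """Uniform scheme: every backend, coordinator included, is charged the same."""
    return {b: per_backend_mb for b in backends}


def admitted_mem_split(executor_mb, coordinator_mb, coordinator, backends):
    """Hypothetical split scheme: the dedicated coordinator is charged less."""
    return {b: (coordinator_mb if b == coordinator else executor_mb)
            for b in backends}


def max_admittable_queries(mem_limit_mb, charge_mb):
    """How many identical queries fit under one backend's process mem limit."""
    return mem_limit_mb // charge_mb


# All names and numbers below are made up for the example.
backends = ["coord", "exec-1", "exec-2"]
uniform = admitted_mem_uniform(2048, backends)
split = admitted_mem_split(2048, 128, "coord", backends)

coord_limit_mb = 8192
print(max_admittable_queries(coord_limit_mb, uniform["coord"]))  # 4
print(max_admittable_queries(coord_limit_mb, split["coord"]))    # 64
```

With the uniform charge the coordinator looks full after four queries even though its lightweight fragments use little memory at runtime; with the smaller charge it stops being the admission bottleneck.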
[jira] [Resolved] (IMPALA-8806) Add metrics to improve observability of executor groups
[ https://issues.apache.org/jira/browse/IMPALA-8806?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bikramjeet Vig resolved IMPALA-8806. Resolution: Fixed Fix Version/s: Impala 3.3.0 > Add metrics to improve observability of executor groups > --- > > Key: IMPALA-8806 > URL: https://issues.apache.org/jira/browse/IMPALA-8806 > Project: IMPALA > Issue Type: Improvement >Affects Versions: Impala 3.3.0 >Reporter: Bikramjeet Vig >Assignee: Bikramjeet Vig >Priority: Major > Labels: observability > Fix For: Impala 3.3.0 > > > As a follow-on to IMPALA-8484, it makes sense to add some metrics to provide > better observability into the state of executor groups. > Some metrics can be: > - number of executor groups with any impalas in them > - number of healthy executor groups > - number of backends. Currently we have a Python helper that calculates this > - get_num_known_live_backends, but it really should be a metric (we could > replace the test code with a metric if we did this).
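The three metrics proposed in IMPALA-8806 could be derived from a snapshot of group membership roughly as follows. This is a sketch under assumptions: the snapshot layout, the metric key names, and the definition of "healthy" (at least `min_size` live backends) are hypothetical and are not Impala's actual metric names or health rule.

```python
def executor_group_metrics(groups, min_size=1):
    """Compute the three proposed metrics from a membership snapshot.

    groups: dict mapping group name -> list of live backend addresses.
    min_size: assumed health threshold; a group is "healthy" when it has
              at least this many live backends.
    """
    non_empty = [members for members in groups.values() if members]
    healthy = [members for members in non_empty if len(members) >= min_size]
    backends = set()
    for members in groups.values():
        backends.update(members)
    return {
        "num_groups_with_backends": len(non_empty),
        "num_healthy_groups": len(healthy),
        "num_backends": len(backends),
    }


# Hypothetical snapshot: two populated groups and one empty group.
snapshot = {
    "group-1": ["host-a:22000", "host-b:22000"],
    "group-2": ["host-c:22000"],
    "group-3": [],
}
metrics = executor_group_metrics(snapshot, min_size=2)
print(metrics["num_groups_with_backends"])  # 2
print(metrics["num_healthy_groups"])        # 1
print(metrics["num_backends"])              # 3
```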
[jira] [Commented] (IMPALA-8796) Add unit tests to UnpackAndDecodeValues
[ https://issues.apache.org/jira/browse/IMPALA-8796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16901197#comment-16901197 ] Daniel Becker commented on IMPALA-8796: --- [https://gerrit.cloudera.org/#/c/14004/] > Add unit tests to UnpackAndDecodeValues > --- > > Key: IMPALA-8796 > URL: https://issues.apache.org/jira/browse/IMPALA-8796 > Project: IMPALA > Issue Type: Test > Components: Backend >Reporter: Daniel Becker >Assignee: Daniel Becker >Priority: Minor > > BitPacking::UnpackAndDecodeValues has no unit tests in bit-packing-test.cc.
[jira] [Updated] (IMPALA-8836) Support COMPUTE STATS on insert only ACID tables
[ https://issues.apache.org/jira/browse/IMPALA-8836?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Csaba Ringhofer updated IMPALA-8836: Labels: impala-acid (was: ) > Support COMPUTE STATS on insert only ACID tables > > > Key: IMPALA-8836 > URL: https://issues.apache.org/jira/browse/IMPALA-8836 > Project: IMPALA > Issue Type: New Feature > Components: Backend, Frontend >Affects Versions: Impala 3.3.0 >Reporter: Csaba Ringhofer >Assignee: Csaba Ringhofer >Priority: Critical > Labels: impala-acid >
[jira] [Created] (IMPALA-8836) Support COMPUTE STATS on insert only ACID tables
Csaba Ringhofer created IMPALA-8836: --- Summary: Support COMPUTE STATS on insert only ACID tables Key: IMPALA-8836 URL: https://issues.apache.org/jira/browse/IMPALA-8836 Project: IMPALA Issue Type: New Feature Components: Backend, Frontend Affects Versions: Impala 3.3.0 Reporter: Csaba Ringhofer Assignee: Csaba Ringhofer
[jira] [Commented] (IMPALA-8833) Check failed: bit_width <= sizeof(T) * 8 (40 vs. 32) in BatchedBitReader::UnpackBatch()
[ https://issues.apache.org/jira/browse/IMPALA-8833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16901092#comment-16901092 ] Daniel Becker commented on IMPALA-8833: --- [https://gerrit.cloudera.org/#/c/14019/] > Check failed: bit_width <= sizeof(T) * 8 (40 vs. 32) in > BatchedBitReader::UnpackBatch() > > > Key: IMPALA-8833 > URL: https://issues.apache.org/jira/browse/IMPALA-8833 > Project: IMPALA > Issue Type: Bug > Components: Backend >Affects Versions: Impala 3.3.0 >Reporter: Tim Armstrong >Assignee: Daniel Becker >Priority: Blocker > Labels: broken-build, crash, flaky > > {noformat} > F0801 21:24:10.571285 15993 bit-stream-utils.inline.h:126] > d04ba69d5da8ffd1:a9045b820001] Check failed: bit_width <= sizeof(T) * 8 > (40 vs. 32) > *** Check failure stack trace: *** > @ 0x52f63ac google::LogMessage::Fail() > @ 0x52f7c51 google::LogMessage::SendToLog() > @ 0x52f5d86 google::LogMessage::Flush() > @ 0x52f934d google::LogMessageFatal::~LogMessageFatal() > @ 0x2b265b5 impala::BatchedBitReader::UnpackBatch<>() > @ 0x2ae8623 impala::RleBatchDecoder<>::FillLiteralBuffer() > @ 0x2b2cadb impala::RleBatchDecoder<>::DecodeLiteralValues<>() > @ 0x2b27bfb impala::DictDecoder<>::DecodeNextValue() > @ 0x2b16fed > impala::ScalarColumnReader<>::ReadSlotsNoConversion() > @ 0x2ac7252 impala::ScalarColumnReader<>::ReadSlots() > @ 0x2a76cef > impala::ScalarColumnReader<>::MaterializeValueBatchRepeatedDefLevel() > @ 0x2a58faa impala::ScalarColumnReader<>::ReadValueBatch<>() > @ 0x2a20e8e > impala::ScalarColumnReader<>::ReadNonRepeatedValueBatch() > @ 0x29b189c impala::HdfsParquetScanner::AssembleRows() > @ 0x29ac6de impala::HdfsParquetScanner::GetNextInternal() > @ 0x29aa656 impala::HdfsParquetScanner::ProcessSplit() > @ 0x249172d impala::HdfsScanNode::ProcessSplit() > @ 0x2490902 impala::HdfsScanNode::ScannerThread() > @ 0x248fc8b > _ZZN6impala12HdfsScanNode22ThreadTokenAvailableCbEPNS_18ThreadResourcePoolEENKUlvE_clEv > @ 0x2492253 > 
{noformat} > https://jenkins.impala.io/job/ubuntu-16.04-from-scratch/6915 > Log lines around the failure: > {noformat} > [gw5] PASSED > query_test/test_scanners.py::TestParquet::test_bad_compression_codec[protocol: > beeswax | exec_option: {'batch_size': 0, 'num_nodes': 0, > 'disable_codegen_rows_threshold': 0, 'disable_codegen': False, > 'abort_on_error': 1, 'debug_action': None, 'exec_single_node_rows_threshold': > 0} | table_format: parquet/none] > query_test/test_nested_types.py::TestMaxNestingDepth::test_load_hive_table[protocol: > beeswax | exec_option: {'batch_size': 0, 'num_nodes': 0, > 'disable_codegen_rows_threshold': 0, 'disable_codegen': False, > 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | table_format: > parquet/none] > query_test/test_scanners.py::TestParquet::test_bad_compression_codec[protocol: > beeswax | exec_option: {'batch_size': 0, 'num_nodes': 0, > 'disable_codegen_rows_threshold': 0, 'disable_codegen': True, > 'abort_on_error': 1, 'debug_action': > '-1:OPEN:SET_DENY_RESERVATION_PROBABILITY@0.5', > 'exec_single_node_rows_threshold': 0} | table_format: parquet/none] > [gw1] PASSED > query_test/test_tpcds_queries.py::TestTpcdsQuery::test_tpcds_q7[protocol: > beeswax | exec_option: {'decimal_v2': 0, 'batch_size': 0, 'num_nodes': 0, > 'disable_codegen_rows_threshold': 0, 'disable_codegen': False, > 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | table_format: > parquet/none] > query_test/test_tpcds_queries.py::TestTpcdsQuery::test_tpcds_q8[protocol: > beeswax | exec_option: {'decimal_v2': 0, 'batch_size': 0, 'num_nodes': 0, > 'disable_codegen_rows_threshold': 0, 'disable_codegen': False, > 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | table_format: > parquet/none] > [gw1] PASSED > query_test/test_tpcds_queries.py::TestTpcdsQuery::test_tpcds_q8[protocol: > beeswax | exec_option: {'decimal_v2': 0, 'batch_size': 0, 'num_nodes': 0, > 'disable_codegen_rows_threshold': 0, 'disable_codegen': False, > 
'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | table_format: > parquet/none] > query_test/test_tpcds_queries.py::TestTpcdsQuery::test_tpcds_q10a[protocol: > beeswax | exec_option: {'decimal_v2': 0, 'batch_size': 0, 'num_nodes': 0, > 'disable_codegen_rows_threshold': 0, 'disable_codegen': False, > 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | table_format: > parquet/none] > [gw10] PASSED > query_test/test_sc
[jira] [Commented] (IMPALA-8833) Check failed: bit_width <= sizeof(T) * 8 (40 vs. 32) in BatchedBitReader::UnpackBatch()
[ https://issues.apache.org/jira/browse/IMPALA-8833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16900972#comment-16900972 ] Daniel Becker commented on IMPALA-8833: --- The problem seems to be here: [https://github.com/apache/impala/blob/bbe064ec194aff4ecf1e794bd4071df4ea4be166/be/src/util/dict-encoding.h#L256] The dict decoder checks the bit width. Until now the maximum bit width for bit packing was 32, so bit widths like 40 were caught here. Now that the maximum is 64, a width of 40 passes this check, but the DCHECK fires when the decoder tries to write the value to a uint32_t.
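The gap described in the comment on IMPALA-8833 can be modelled in a few lines. This is a simplified sketch, not the code in dict-encoding.h: the constants and function names are hypothetical, and `strict_check` shows only one possible way to close the gap, not necessarily the fix that was submitted.

```python
MAX_BITWIDTH_OLD = 32   # former global maximum for bit packing
MAX_BITWIDTH_NEW = 64   # maximum after bit packing was extended
INDEX_TYPE_BITS = 32    # dictionary indices are unpacked into a uint32_t


def loose_check(bit_width, max_bitwidth):
    """Validate only against the global bit-packing maximum."""
    return 0 <= bit_width <= max_bitwidth


def strict_check(bit_width, max_bitwidth, out_type_bits=INDEX_TYPE_BITS):
    """Assumed fix: also reject widths that cannot fit the output type."""
    return 0 <= bit_width <= min(max_bitwidth, out_type_bits)


def unpack_batch_check(bit_width, out_type_bits=INDEX_TYPE_BITS):
    """Stand-in for the check `bit_width <= sizeof(T) * 8`."""
    return bit_width <= out_type_bits


corrupt_width = 40  # width reported in the failed check (40 vs. 32)

# Old maximum: the bad width is rejected before any unpacking happens.
print(loose_check(corrupt_width, MAX_BITWIDTH_OLD))   # False

# New maximum: the bad width is accepted, then the unpack check fails.
print(loose_check(corrupt_width, MAX_BITWIDTH_NEW))   # True
print(unpack_batch_check(corrupt_width))              # False

# Bounding by the output type restores the early rejection.
print(strict_check(corrupt_width, MAX_BITWIDTH_NEW))  # False
```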
[jira] [Work started] (IMPALA-8833) Check failed: bit_width <= sizeof(T) * 8 (40 vs. 32) in BatchedBitReader::UnpackBatch()
[ https://issues.apache.org/jira/browse/IMPALA-8833?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on IMPALA-8833 started by Daniel Becker.
[jira] [Commented] (IMPALA-8823) Implement DROP TABLE for insert-only ACID tables
[ https://issues.apache.org/jira/browse/IMPALA-8823?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16900705#comment-16900705 ] Gabor Kaszab commented on IMPALA-8823: -- Won't be able to finish this until locking of ACID tables is submitted. > Implement DROP TABLE for insert-only ACID tables > > > Key: IMPALA-8823 > URL: https://issues.apache.org/jira/browse/IMPALA-8823 > Project: IMPALA > Issue Type: Improvement >Reporter: Zoltán Borók-Nagy >Assignee: Gabor Kaszab >Priority: Major > Labels: impala-acid > > Impala currently cannot drop insert-only ACID tables. > To implement DROP TABLE for insert-only tables, we first need to acquire an > exclusive lock from HMS, then proceed with the usual DROP TABLE process. > Heartbeating the lock might also be needed.
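The sequence outlined in IMPALA-8823 (acquire an exclusive lock, keep it alive with heartbeats, drop the table, release the lock) might look like the sketch below. Everything here is hypothetical: `FakeMetastore` is an in-memory stand-in and its methods are invented for illustration; they are not the Hive Metastore client API or Impala's catalog code.

```python
import threading


class FakeMetastore:
    """In-memory stand-in for HMS; every method here is hypothetical."""

    def __init__(self):
        self.tables = {"db.t"}
        self.locks = {}          # lock id -> table name
        self.heartbeats = 0
        self._next_id = 1
        self._mutex = threading.Lock()

    def acquire_exclusive_lock(self, table):
        with self._mutex:
            if table in self.locks.values():
                raise RuntimeError("table is already locked: " + table)
            lock_id = self._next_id
            self._next_id += 1
            self.locks[lock_id] = table
            return lock_id

    def heartbeat(self, lock_id):
        with self._mutex:
            if lock_id in self.locks:
                self.heartbeats += 1

    def release_lock(self, lock_id):
        with self._mutex:
            self.locks.pop(lock_id, None)

    def drop_table(self, table):
        with self._mutex:
            self.tables.discard(table)


def drop_insert_only_table(hms, table, heartbeat_interval_s=0.01):
    """Lock, heartbeat in the background, drop, then always release."""
    lock_id = hms.acquire_exclusive_lock(table)
    stop = threading.Event()

    def beat():
        # Event.wait returns False on timeout and True once stop is set.
        while not stop.wait(heartbeat_interval_s):
            hms.heartbeat(lock_id)

    beater = threading.Thread(target=beat, daemon=True)
    beater.start()
    try:
        hms.drop_table(table)    # the usual DROP TABLE work goes here
    finally:
        stop.set()
        beater.join()
        hms.release_lock(lock_id)


hms = FakeMetastore()
drop_insert_only_table(hms, "db.t")
print("db.t" in hms.tables, hms.locks)  # False {}
```

Releasing the lock in a `finally` block means a failed drop does not leave the table locked, and stopping the heartbeat thread first avoids heartbeating a lock that has already been released.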
[jira] [Work started] (IMPALA-8823) Implement DROP TABLE for insert-only ACID tables
[ https://issues.apache.org/jira/browse/IMPALA-8823?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on IMPALA-8823 started by Gabor Kaszab.